PHP SilverStripe ORM: Duplicate key value violates unique constraint for DataObject write
That note in the documentation is very out of date (even SilverStripe 2.1 from 2007 has the correct behaviour) and the approach described by the docs would lead to race conditions.
The complexity is that SilverStripe uses multi-table inheritance and what SilverStripe does is such cases is this:
- Insert into the SiteTree table
- Fetch the ID generated
- Insert into the Page table (and other tables) using the same ID
It may also do subsequent UPDATE writes to the SiteTree table with the same ID.
Unfortunately this doesn't necessarily help you solve your issue, but it can at least close off one possible source of the issue.
Edit: The documented SilverStripe's generated key support is broken, and uses a method of identifier generation that won't work properly. However, one of the devs has confirmed that this is a documentation bug, and the framework's true behaviour is no longer to use max() queries. So the problem is not there.
For anyone wondering why it's wrong to use max(...)
for key generation: it's totally concurrency unsafe. Even in subqueries. If you do:
INSERT INTO my_table(id, ...)VALUES( (SELECT max(id) + 1 FROM my_table), ...);
then both SELECT
s can run at the same time. They'll get the same result, and then both insert
s will try to insert the same value. Even if one insert
completes before the other select
runs, if it hasn't committed yet, the other select
won't see the new value.
It's only safe to do that if you LOCK TABLE
first, or SELECT ... FOR UPDATE
in the subquery. A SELECT ... FOR UPDATE
is much slower in this case.
So if you can modify SilverStripe, change it to at least send a LOCK TABLE mytable IN EXCLUSIVE MODE;
before the SELECT max(...)
so it's slow and clumsy but not also broken.
Or, better, fix it to just use a database sequence.
If there's a real business need for gap-less numbering, use a counter table maintained by UPDATE ... RETURNING ...
instead. (If you need portability you must use SELECT ... FOR UPDATE
then UPDATE
, in the same transaction).
Update: The framework no longer uses that approach in recent versions.
(Removed grumpy rant about the approach as non-constructive)
As @CraigRinger pointed out, the way the inserted ID was being accessed was across all sessions, rather than per session. I've updated the PostgreSQL module to instead use currval() which is session based.
So far I haven't replicated the issue again however I'm not entirely convinced this is the core of the issue. I'll update this if it faults again.