Core Data Multithreading Import (Duplicate Objects)


So the problem is:

  • contexts are a scratchpad — unless and until you save, changes you make in them are not pushed to the persistent store;
  • you want one context to be aware of changes made on another that hasn't yet been pushed.

To me it doesn't sound like merging between contexts is going to work, because contexts are not thread safe: for a merge to occur, nothing else can be in progress on the thread/queue of the other context. So you're never going to be able to eliminate the risk that a new object is inserted while another context is partway through its own insertion process.

Additional observations:

  • SQLite is not thread safe in any practical sense;
  • hence all trips to the persistent store are serialised regardless of how you issue them.

Bearing in mind the problem and the SQLite limitations, in my app we've adopted a framework whereby the web calls are naturally concurrent as per NSURLConnection, subsequent parsing of the results (JSON parsing plus some fishing into the result) occurs concurrently and then the find-or-create step is channeled into a serial queue.

Very little processing time is lost by the serialisation because the SQLite trips would be serialised anyway, and they're the overwhelming majority of the serialised stuff.
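To make the shape of that pipeline concrete, here is a minimal sketch, not our actual code. It assumes each response is a JSON array of objects with a unique "id" field, and it uses a plain dictionary as a stand-in for the persistent store. The function name, queue labels and JSON layout are all made up for illustration.

```objective-c
#import <Foundation/Foundation.h>

// Parses each JSON response concurrently, then funnels the find-or-create
// step through one serial queue. Returns the number of unique records.
// ImportResponses and the "id" field are hypothetical names.
NSUInteger ImportResponses(NSArray<NSString *> *responses) {
    // Stand-in for the persistent store: unique key -> record.
    NSMutableDictionary<NSString *, NSMutableDictionary *> *store =
        [NSMutableDictionary dictionary];

    dispatch_queue_t parseQueue =
        dispatch_queue_create("example.parse", DISPATCH_QUEUE_CONCURRENT);
    dispatch_queue_t importQueue =
        dispatch_queue_create("example.import", DISPATCH_QUEUE_SERIAL);
    dispatch_group_t group = dispatch_group_create();

    for (NSString *json in responses) {
        dispatch_group_enter(group);
        dispatch_async(parseQueue, ^{
            // Concurrent part: parsing does not touch the store.
            NSData *data = [json dataUsingEncoding:NSUTF8StringEncoding];
            NSArray *items = [NSJSONSerialization JSONObjectWithData:data
                                                             options:0
                                                               error:NULL];
            dispatch_async(importQueue, ^{
                // Serial part: only one block at a time reads and writes the
                // store, so two responses cannot both "create" the same id.
                for (NSDictionary *item in items) {
                    NSString *key = item[@"id"];
                    if (store[key] == nil) {
                        store[key] = [NSMutableDictionary dictionary];
                    }
                    [store[key] addEntriesFromDictionary:item];
                }
                dispatch_group_leave(group);
            });
        });
    }

    dispatch_group_wait(group, DISPATCH_TIME_FOREVER);
    return store.count;
}
```

Calling it with two responses that both mention id "2", for example `ImportResponses(@[@"[{\"id\":\"1\"},{\"id\":\"2\"}]", @"[{\"id\":\"2\"},{\"id\":\"3\"}]"])`, leaves three records rather than four. In a real app the serial block would do its work on a managed object context instead of a dictionary.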


Start by creating dependencies between your operations, so that one can't start until the operation it depends on has finished.

Check out http://developer.apple.com/library/mac/documentation/Cocoa/Reference/NSOperation_class/Reference/Reference.html#//apple_ref/occ/instm/NSOperation/addDependency:
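As a rough sketch of what that looks like, the snippet below chains three block operations with addDependency:. The step names ("authors", "books", "reviews") and the function name are invented for the example; in a real import each block would do its work on its own context and save before finishing.

```objective-c
#import <Foundation/Foundation.h>

// Runs three hypothetical import steps on a concurrent queue and returns
// the order in which they actually ran.
NSArray<NSString *> *RunDependentImports(void) {
    NSMutableArray<NSString *> *order = [NSMutableArray array];
    NSOperationQueue *queue = [[NSOperationQueue alloc] init];
    // The queue may run several operations at once; the dependencies
    // are what impose the ordering.
    queue.maxConcurrentOperationCount = 4;

    NSBlockOperation *authors = [NSBlockOperation blockOperationWithBlock:^{
        @synchronized (order) { [order addObject:@"authors"]; }
    }];
    NSBlockOperation *books = [NSBlockOperation blockOperationWithBlock:^{
        @synchronized (order) { [order addObject:@"books"]; }
    }];
    NSBlockOperation *reviews = [NSBlockOperation blockOperationWithBlock:^{
        @synchronized (order) { [order addObject:@"reviews"]; }
    }];

    // books waits for authors, reviews waits for books.
    [books addDependency:authors];
    [reviews addDependency:books];

    // Deliberately enqueued in the "wrong" order.
    [queue addOperations:@[reviews, books, authors] waitUntilFinished:YES];
    return [order copy];
}
```

Even though the operations are enqueued in reverse, the returned order is authors, books, reviews, because an operation is not started until everything it depends on has finished.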

Each operation should call save when it finishes. Next, I would try the find-or-create methodology suggested here:

https://developer.apple.com/library/ios/documentation/Cocoa/Conceptual/CoreData/Articles/cdImporting.html

It'll solve your duplicates problem, and will probably result in fewer fetches (which are expensive and slow, and so drain the battery quickly).
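Here is a minimal sketch of the general idea as I understand it: one fetch with an IN predicate for the whole batch, instead of one fetch per incoming record. It is not the guide's exact code. The "Employee" entity, its "employeeID" attribute, the programmatic model and both function names are assumptions made up so the example is self-contained.

```objective-c
#import <CoreData/CoreData.h>

// Builds a context on an in-memory store with one hypothetical entity,
// "Employee", that has a single string attribute "employeeID".
NSManagedObjectContext *MakeExampleContext(void) {
    NSAttributeDescription *idAttr = [[NSAttributeDescription alloc] init];
    idAttr.name = @"employeeID";
    idAttr.attributeType = NSStringAttributeType;

    NSEntityDescription *entity = [[NSEntityDescription alloc] init];
    entity.name = @"Employee";
    entity.properties = @[idAttr];

    NSManagedObjectModel *model = [[NSManagedObjectModel alloc] init];
    model.entities = @[entity];

    NSPersistentStoreCoordinator *psc =
        [[NSPersistentStoreCoordinator alloc] initWithManagedObjectModel:model];
    [psc addPersistentStoreWithType:NSInMemoryStoreType
                      configuration:nil URL:nil options:nil error:NULL];

    NSManagedObjectContext *context = [[NSManagedObjectContext alloc]
        initWithConcurrencyType:NSPrivateQueueConcurrencyType];
    context.persistentStoreCoordinator = psc;
    return context;
}

// Find-or-create for a batch of IDs using a single fetch.
// Returns how many new objects were created.
NSUInteger FindOrCreateEmployees(NSArray<NSString *> *incomingIDs,
                                 NSManagedObjectContext *context) {
    __block NSUInteger created = 0;
    [context performBlockAndWait:^{
        // One round trip: fetch every existing object in this batch.
        NSFetchRequest *request =
            [NSFetchRequest fetchRequestWithEntityName:@"Employee"];
        request.predicate =
            [NSPredicate predicateWithFormat:@"employeeID IN %@", incomingIDs];
        NSArray *existing =
            [context executeFetchRequest:request error:NULL] ?: @[];
        NSMutableSet *known =
            [NSMutableSet setWithArray:[existing valueForKey:@"employeeID"]];

        for (NSString *employeeID in incomingIDs) {
            if ([known containsObject:employeeID]) continue; // found
            NSManagedObject *employee = [NSEntityDescription
                insertNewObjectForEntityForName:@"Employee"
                         inManagedObjectContext:context];
            [employee setValue:employeeID forKey:@"employeeID"];
            [known addObject:employeeID]; // guards duplicates within the batch
            created++;
        }
        [context save:NULL]; // error handling omitted for brevity
    }];
    return created;
}
```

Importing `@[@"a", @"b"]` and then `@[@"b", @"c", @"c"]` into the same context creates two objects the first time and only one the second. Note that this only prevents duplicates if no other context is inserting the same IDs at the same moment, which is why the serialisation discussed in this thread still matters.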

You could also create a global child context to handle all of your imports, then merge the whole huge thing at the end, but it really comes down to how big the data set is and your memory considerations.


I've been struggling with the same issue for a while now. The discussion on this question so far has given me a few ideas, which I will share now.

Please note that this is essentially untested since in my case I only see this duplicate issue very rarely during testing and there's no obvious way for me to reproduce it easily.

I have the same Core Data stack setup: a master MOC on a private queue, which has a child on the main queue that is used as the app's main context. Finally, bulk import operations (find-or-create) are passed off to a third MOC using a background queue. Once the operation is complete, saves are propagated up to the PSC.

I've moved all my Core Data stack from the AppDelegate to a separate class (AppModel) that provides the app with access to the aggregate root object of the domain (the Player) and also a helper function for performing background operations on the model (performBlock:onSuccess:onError:).

Luckily for me, all the major Core Data operations are funnelled through this method, so if I can ensure that these operations are run serially then the duplicate problem should be solved.

    - (void)performBlock:(void (^)(Player *player, NSManagedObjectContext *managedObjectContext))operation
               onSuccess:(void (^)())successCallback
                 onError:(void (^)(id error))errorCallback
    {
        // Add this operation to the NSOperationQueue to ensure that
        // duplicate records are not created in a multi-threaded environment
        [self.operationQueue addOperationWithBlock:^{
            NSManagedObjectContext *managedObjectContext = [[NSManagedObjectContext alloc] initWithConcurrencyType:NSPrivateQueueConcurrencyType];
            [managedObjectContext setUndoManager:nil];
            [managedObjectContext setParentContext:self.mainManagedObjectContext];

            [managedObjectContext performBlockAndWait:^{
                // Retrieve a copy of the Player object attached to the new context
                id player = [managedObjectContext objectWithID:[self.player objectID]];

                // Execute the block operation
                operation(player, managedObjectContext);

                NSError *error = nil;
                if (![managedObjectContext save:&error])
                {
                    // Call the error handler
                    dispatch_async(dispatch_get_main_queue(), ^{
                        NSLog(@"%@", error);
                        if (errorCallback) errorCallback(error);
                    });
                    return;
                }

                // Save the parent MOC (mainManagedObjectContext) - WILL BLOCK MAIN THREAD BRIEFLY
                __block BOOL parentSaved = YES;
                [managedObjectContext.parentContext performBlockAndWait:^{
                    NSError *parentError = nil;
                    if (![managedObjectContext.parentContext save:&parentError])
                    {
                        parentSaved = NO;
                        // Call the error handler
                        dispatch_async(dispatch_get_main_queue(), ^{
                            NSLog(@"%@", parentError);
                            if (errorCallback) errorCallback(parentError);
                        });
                    }
                }];

                // Attempt to clear any retain cycles created during operation
                [managedObjectContext reset];

                // A return inside the parent's block only leaves that block, so
                // check the flag here to avoid reporting success after a failed save
                if (!parentSaved) return;

                // Call the success handler
                dispatch_async(dispatch_get_main_queue(), ^{
                    if (successCallback) successCallback();
                });
            }];
        }];
    }

What I've added here that I hope is going to resolve the issue for me is wrapping the whole thing in addOperationWithBlock. My operation queue is simply configured as follows:

    single.operationQueue = [[NSOperationQueue alloc] init];
    [single.operationQueue setMaxConcurrentOperationCount:1];
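Since I can't reproduce the duplicates on demand, one way to at least check that the queue configuration behaves as expected is a small helper like the one below. It is only a sketch: the function name, the eight dummy operations and the 10 ms sleep are arbitrary choices for illustration, and it measures the queue, not Core Data itself.

```objective-c
#import <Foundation/Foundation.h>

// Enqueues a handful of dummy operations on a queue with the given
// concurrency limit and returns the highest number that were ever
// running at the same time.
NSInteger MaxObservedConcurrency(NSInteger limit) {
    NSOperationQueue *queue = [[NSOperationQueue alloc] init];
    queue.maxConcurrentOperationCount = limit;

    NSLock *lock = [[NSLock alloc] init];
    __block NSInteger running = 0;
    __block NSInteger peak = 0;

    for (int i = 0; i < 8; i++) {
        [queue addOperationWithBlock:^{
            [lock lock];
            running++;
            if (running > peak) peak = running;
            [lock unlock];

            // Pretend to do some import work.
            [NSThread sleepForTimeInterval:0.01];

            [lock lock];
            running--;
            [lock unlock];
        }];
    }

    [queue waitUntilAllOperationsAreFinished];
    return peak;
}
```

With a limit of 1, as in the configuration above, the peak should never exceed one, which is the property the import code relies on.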

In my API class, I might perform an import as follows:

    - (void)importUpdates:(id)methodResult
                onSuccess:(void (^)())successCallback
                  onError:(void (^)(id error))errorCallback
    {
        [_model performBlock:^(Player *player, NSManagedObjectContext *managedObjectContext) {
            // Perform bulk import for data in methodResult using the provided managedObjectContext
        } onSuccess:^{
            // Call the success handler
            dispatch_async(dispatch_get_main_queue(), ^{
                if (successCallback) successCallback();
            });
        } onError:errorCallback];
    }

Now with the NSOperationQueue in place it should no longer be possible for more than one batch operation to take place at the same time.