Improve process of mirroring server database to a client database via JSON?


I have experience in a very similar project. The Core Data insertions take some time, so we set the user's expectation that this will take a while, but only the first time. The best performance tweak was, of course, getting the batch size between saves right, but I am sure you are aware of that.
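
For reference, here is roughly that save-in-batches pattern as a minimal sketch. It assumes the JSON has already been parsed into a records array, and insertRecord:intoContext: stands in for whatever mapping code you have:

    NSUInteger batchSize = 500; // tune by measuring; too small and too large both hurt
    NSUInteger total = records.count;
    for (NSUInteger start = 0; start < total; start += batchSize) {
        @autoreleasepool {
            NSRange range = NSMakeRange(start, MIN(batchSize, total - start));
            for (NSDictionary *record in [records subarrayWithRange:range]) {
                [self insertRecord:record intoContext:context]; // hypothetical mapping helper
            }
            NSError *error = nil;
            if (![context save:&error]) {
                NSLog(@"Batch save failed: %@", error);
            }
            [context reset]; // let go of the objects just saved so memory stays flat
        }
    }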

One performance suggestion: I tried a few things and found that spinning up many download threads hurts performance, I suppose because each request incurs some latency from the server.

Instead, I discovered that downloading all the JSON in one go was much faster. I do not know how much data you have, but I tested with more than 100,000 records and a 40MB+ JSON string, and it works really fast, so the bottleneck is just the Core Data insertions. With an @autoreleasepool block this even performed acceptably on a first-generation iPad.
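
The single-request download itself is nothing special. A sketch using NSURLConnection, with a made-up endpoint URL:

    NSURL *url = [NSURL URLWithString:@"https://example.com/api/full-dump"]; // made-up endpoint
    NSURLRequest *request = [NSURLRequest requestWithURL:url];
    [NSURLConnection sendAsynchronousRequest:request
                                       queue:[[NSOperationQueue alloc] init]
                           completionHandler:^(NSURLResponse *response, NSData *data, NSError *error) {
        if (!data) { return; } // handle the error properly in real code
        NSError *jsonError = nil;
        NSArray *records = [NSJSONSerialization JSONObjectWithData:data
                                                           options:0
                                                             error:&jsonError];
        // hand "records" off to the batched Core Data import
    }];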

Stay away from the raw SQLite API - it would take you more than a man-year (even assuming high productivity) to replicate the performance optimizations you get out of the box with Core Data.


First off, you're doing a lot of work, and it will take some time no matter how you slice it, but there are ways to improve things.

I'd recommend doing your fetches in batches, with a batch size matching your batch size for processing new objects. For example, when creating new Agency records, do something like:

  1. Make sure the current Agency batch is sorted by city_id. (I'll explain why later).

  2. Get the City ID for each Agency in the batch. Depending on how your JSON is structured, this is probably a one-liner (since valueForKey: works on arrays):

    NSArray *cityIDs = [myAgencyBatch valueForKey:@"city_id"];
  3. Get all the City instances for the current pass in one fetch by using the IDs you found in the previous step. Sort the results by city_id. Something like:

    NSFetchRequest *request = [NSFetchRequest fetchRequestWithEntityName:@"City"];
    NSPredicate *predicate = [NSPredicate predicateWithFormat:@"city_id in %@", cityIDs];
    [request setPredicate:predicate];
    [request setSortDescriptors:@[ [NSSortDescriptor sortDescriptorWithKey:@"city_id" ascending:YES] ]];
    NSArray *cities = [context executeFetchRequest:request error:nil];

Now, you have one array of Agency and another one of City, both sorted by city_id. Match them up to set up the relationships (check city_id in case things don't match). Save changes, and go on to the next batch.
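
The match-up itself is just a merge walk over the two sorted arrays. A rough sketch, assuming Agency and City are your NSManagedObject subclasses and that Agency keeps the raw foreign key in a cityID attribute:

    NSUInteger cityIndex = 0;
    for (Agency *agency in myAgencyBatch) {
        // both arrays are sorted by city_id, so advance past cities with smaller IDs
        while (cityIndex < cities.count &&
               [[cities[cityIndex] valueForKey:@"city_id"] compare:agency.cityID] == NSOrderedAscending) {
            cityIndex++;
        }
        City *candidate = (cityIndex < cities.count) ? cities[cityIndex] : nil;
        if (candidate && [[candidate valueForKey:@"city_id"] isEqual:agency.cityID]) {
            agency.city = candidate; // the IDs match, so wire up the relationship
        } else {
            // no matching City came back from the fetch; create one or skip, per your rules
        }
    }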

This will dramatically reduce the number of fetches you need to do, which should speed things up. For more on this technique, see "Implementing Find-or-Create Efficiently" in Apple's Core Data documentation.

Another thing that may help is to "warm up" Core Data's internal cache with the objects you need before you start fetching them. This will save time later on because getting property values won't require a trip to the data store. For this you'd do something like:

    NSFetchRequest *request = [NSFetchRequest fetchRequestWithEntityName:@"City"];
    // no predicate, get everything
    [request setResultType:NSManagedObjectIDResultType];
    NSArray *notUsed = [context executeFetchRequest:request error:nil];

...and then just forget about the results. This is superficially useless, but it will alter Core Data's internal state so that City instances are faster to access later on.

Now as for your other questions,

  • Using SQLite directly instead of Core Data might not be a terrible choice for your situation. The benefit would be that you'd have no need to set up the relationships, since you could use fields like city_id as foreign keys. So: fast importing. The downside, of course, is that you'd have to do your own work converting your model objects to/from SQL records, and probably rewrite quite a lot of existing code that assumes Core Data; for example, every time you follow a relationship you now need to look up records by that foreign key (see the sketch after this list). This change might fix your import performance issues, but the side effects could be significant.

  • JSON is generally a very good format if you're transmitting data as text. If you could prepare a Core Data store on the server, and if you would use that file as-is instead of trying to merge it into an existing data store, then that would almost certainly speed things up. Your import process would run once on the server and then never again. But those are big "if"s, especially the second one. If you get to where you need to merge a new server data store with existing data, you're right back to where you are now.
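
To make that first bullet concrete, here is roughly what following the Agency-to-City relationship looks like against the raw SQLite C API. The table and column names are assumptions, and agencyCityID stands in for the foreign key you read off the agency row:

    #import <sqlite3.h>

    sqlite3 *db = NULL;
    if (sqlite3_open([dbPath UTF8String], &db) == SQLITE_OK) {
        sqlite3_stmt *stmt = NULL;
        const char *sql = "SELECT name FROM city WHERE city_id = ?"; // assumed schema
        if (sqlite3_prepare_v2(db, sql, -1, &stmt, NULL) == SQLITE_OK) {
            sqlite3_bind_int(stmt, 1, agencyCityID);
            if (sqlite3_step(stmt) == SQLITE_ROW) {
                NSString *cityName =
                    [NSString stringWithUTF8String:(const char *)sqlite3_column_text(stmt, 0)];
                // ... use cityName; every relationship traversal becomes a query like this
            }
            sqlite3_finalize(stmt);
        }
        sqlite3_close(db);
    }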


Do you have control of the server? I ask because it sounds like you do, based on the following paragraph:

"For the first time a complete synchronization is performed (app's first launch time) - perform the fetch of the whole database data in, say, one archived file (something like database dump) and then somehow import it as a whole to the CoreData land".

If sending a dump is possible, why not send the Core Data file itself? Core Data is (by default) backed by a SQLite database, so why not generate that database on the server, zip it, and send it across the wire?

This would mean you could eliminate all the JSON parsing, network requests, and so on, and replace them with a simple file download and archive extraction. We did this on a project and it improved performance enormously.
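
The client-side swap is only a few lines. A sketch, assuming the archive has already been downloaded and extracted to a storeURL (unzipDownloadedArchive is a placeholder for whatever extraction step you use):

    NSURL *storeURL = [self unzipDownloadedArchive]; // placeholder for your extraction step

    NSManagedObjectModel *model = [NSManagedObjectModel mergedModelFromBundles:nil];
    NSPersistentStoreCoordinator *coordinator =
        [[NSPersistentStoreCoordinator alloc] initWithManagedObjectModel:model];

    NSError *error = nil;
    if (![coordinator addPersistentStoreWithType:NSSQLiteStoreType
                                   configuration:nil
                                             URL:storeURL
                                         options:nil
                                           error:&error]) {
        NSLog(@"Could not open the downloaded store: %@", error);
    }

    NSManagedObjectContext *context =
        [[NSManagedObjectContext alloc] initWithConcurrencyType:NSMainQueueConcurrencyType];
    context.persistentStoreCoordinator = coordinator;

The one thing to keep in sync is the managed object model: the store you generate on the server has to match the model version compiled into the app, or the add-store call will fail.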