Exporting large amounts of data from coredata to json


[UPDATED TO IMPLEMENT LOW MEMORY SERIAL OUTPUT OF NESTED FOLDER HIERARCHY AS NESTED JSON OBJECT FILE]

Now that you have provided more detail, it's clear the original problem statement lacked enough detail for anyone to answer it. Your issue is actually the age-old problem of how to traverse hierarchies in a memory-efficient way, combined with the fact that the iOS JSON library is quite light and doesn't easily support streamed writing of deep hierarchies.

The best approach is to use a technique known as the visitor pattern. Have each of your NSManagedObject types shown above adopt a protocol called Visitable, e.g. the interface line for each class should look something like this:

    @interface Folder : NSManagedObject <Visitable>
    @interface Word : NSManagedObject <Visitable>

The Visitable protocol should declare one method that every conforming object implements.

    @protocol Visitor; // forward declaration; the protocol itself is defined below

    @protocol Visitable <NSObject>
    - (void)acceptVisitor:(id<Visitor>)visitor;
    @end

You are going to define a visitor object, which itself implements a visitor protocol.

    @class Folder, Word;

    @protocol Visitor <NSObject>
    - (void)visitFolder:(Folder *)folder;
    - (void)visitWord:(Word *)word;
    @end

    @interface JSONVisitor : NSObject <Visitor>
    @property (nonatomic, strong) NSURL *streamURL;
    - (void)startVisiting:(id<Visitable>)visitableObject;
    @end

    // Private state goes in a class extension; @property is not allowed
    // inside @implementation.
    @interface JSONVisitor ()
    @property (nonatomic, strong) NSOutputStream *outputStream;
    @end

    @implementation JSONVisitor

    // Helper added for this listing: writes a raw JSON text fragment to the stream.
    // (Production code should loop until all bytes are written and check for errors.)
    - (void)writeFragment:(NSString *)fragment
    {
        NSData *data = [fragment dataUsingEncoding:NSUTF8StringEncoding];
        [_outputStream write:data.bytes maxLength:data.length];
    }

    - (void)startVisiting:(id<Visitable>)visitableObject
    {
        if ([visitableObject respondsToSelector:@selector(acceptVisitor:)])
        {
            if (_outputStream == nil)
            {
                // more code required set up your output stream
                // specifically as a JSON output stream.
                // add code to either set the stream URL here,
                // or set it when the visitor object is instantiated.
                _outputStream = [NSOutputStream outputStreamWithURL:_streamURL append:YES];
            }
            [_outputStream open];

            // Note 1a Bypass Apple JSON API which doesn't support
            // writing of partial objects (doing so is very easy anyway).
            // Write opening root object fragment text string to stream
            // such as:
            // {
            //     "$schema" : "http://myschema.com/draft-01/schema#Folder1",
            //     "name" : "Folder export",
            //     "created" : "2013-07-16T19:20:30.45+01:00",
            //     "Folders" : [

            [visitableObject acceptVisitor:self];

            // Note 1b write closing JSON root object
            // e.g.
            //     ]
            // }
            [_outputStream close];
        }
    }

    - (void)visitFolder:(Folder *)folder
    {
        // Note 2a Bypass Apple JSON API which doesn't appear to support
        // writing of partial objects (Writing JSON is very easy anyway).
        // This next step would be best done with a proper templating system,
        // but for simplicity of illustration I'm suggesting writing out raw
        // JSON object text fragments.

        // Write opening JSON Folder object fragment text string to stream
        // e.g.
        // {
        // (a folder written as an element of a "Folders" array takes no key)

        if ([folder.folders count] > 0) {
            // Write opening folder array fragment to stream e.g.
            // "Folders" : [

            // loop through folder member NSManagedObjects here
            // (note defensive checks for nulls not included).
            NSUInteger count = 0;
            for (Folder *nestedFolder in folder.folders)
            {
                if (count > 0) {
                    [self writeFragment:@","]; // comma between array elements
                }
                [nestedFolder acceptVisitor:self];
                count++;
            }

            // write closing folders array to stream
            // ]
        }

        if ([folder.words count] > 0) {
            // Write opening words array fragment to stream e.g.
            // "Words" : [
            // (preceded by a comma if a "Folders" array was written above)

            // loop through Word member NSManagedObjects here
            // (note defensive checks for nulls not included).
            NSUInteger count = 0;
            for (Word *nestedWord in folder.words)
            {
                if (count > 0) {
                    [self writeFragment:@","]; // comma between array elements
                }
                [nestedWord acceptVisitor:self];
                count++;
            }

            // write closing Words array to stream
            // ]
        }

        // Print closing Folder object brace to stream
        // e.g.
        // }
        // (the comma separating this folder from its next sibling, if any,
        // is written by the loop in the containing folder)

        // Note 2b Next object determination code here.
    }

    - (void)visitWord:(Word *)word
    {
        // Write to JSON stream.
        // NSJSONSerialization only accepts an NSArray or NSDictionary as the
        // top-level object, so pass the word's attribute values rather than the
        // managed object itself. This assumes every attribute value is
        // JSON-compatible (NSString, NSNumber, NSNull); NSDate values, for
        // example, would need converting first.
        NSDictionary *attributes =
            [word dictionaryWithValuesForKeys:word.entity.attributesByName.allKeys];
        [NSJSONSerialization writeJSONObject:attributes toStream:_outputStream options:NSJSONWritingPrettyPrinted error:nil];
    }

    @end

This object is able to "visit" each object in your hierarchy and do some work with it (in your case write it to a JSON stream). Note you don't need to extract the whole hierarchy to a dictionary first. You just work directly with the Core Data objects, making them visitable. Core Data contains its own memory management, with faulting, so you don't have to worry about excessive memory usage.

This is the process. You instantiate the visitor object and then call its startVisiting: method, passing in the root Folder object of your hierarchy above. In that method, the visitor object "knocks on the door" of the first object to be visited by calling - (void)acceptVisitor:(id<Visitor>)visitor on the object to be visited. The root Folder then "welcomes the visitor in" by calling a method back on the visitor object matching its own object type, e.g.:

    - (void)acceptVisitor:(id<Visitor>)visitor
    {
        if ([visitor respondsToSelector:@selector(visitFolder:)]) {
            [visitor visitFolder:self];
        }
    }

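This in turn calls the visitFolder: method on the visitor object, which writes the folder to the stream as JSON and then visits the folder's children in the same way; the stream itself is opened and closed once, in startVisiting:. That is the important thing: each object decides which visit method to call, so the traversal and output logic stays in one place. This pattern may appear complex at first, but I guarantee, if you are working with hierarchies, once you have implemented it you will find it powerful and easy to manage.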
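To see the double dispatch end to end without a Core Data model, here is a minimal sketch using plain NSObject stand-ins. Everything prefixed Demo, the name and text properties, and the quoted: and visitAll: helpers are assumptions made for illustration; the output is collected in a string where a real exporter would write each fragment to its stream.

```objective-c
#import <Foundation/Foundation.h>

@class DemoFolder, DemoWord;

// Hypothetical stand-ins for the Core Data classes; plain NSObjects here
// so the dispatch can be tried without a managed object model.
@protocol DemoVisitor <NSObject>
- (void)visitFolder:(DemoFolder *)folder;
- (void)visitWord:(DemoWord *)word;
@end

@protocol DemoVisitable <NSObject>
- (void)acceptVisitor:(id<DemoVisitor>)visitor;
@end

@interface DemoWord : NSObject <DemoVisitable>
@property (nonatomic, copy) NSString *text;
@end

@interface DemoFolder : NSObject <DemoVisitable>
@property (nonatomic, copy) NSString *name;
@property (nonatomic, copy) NSArray<DemoFolder *> *folders;
@property (nonatomic, copy) NSArray<DemoWord *> *words;
@end

@implementation DemoWord
- (void)acceptVisitor:(id<DemoVisitor>)visitor { [visitor visitWord:self]; }
@end

@implementation DemoFolder
- (void)acceptVisitor:(id<DemoVisitor>)visitor { [visitor visitFolder:self]; }
@end

// Collects JSON text in memory; a real exporter would write each
// fragment to an NSOutputStream instead.
@interface DemoJSONVisitor : NSObject <DemoVisitor>
@property (nonatomic, strong) NSMutableString *output;
@end

@implementation DemoJSONVisitor

- (instancetype)init {
    if ((self = [super init])) { _output = [NSMutableString string]; }
    return self;
}

// Quote and escape a string by serialising a one-element array
// and dropping the surrounding brackets.
- (NSString *)quoted:(NSString *)string {
    NSData *data = [NSJSONSerialization dataWithJSONObject:@[string ?: @""] options:0 error:NULL];
    NSString *json = [[NSString alloc] initWithData:data encoding:NSUTF8StringEncoding];
    return [json substringWithRange:NSMakeRange(1, json.length - 2)];
}

// Visit each child, writing a comma before every element except the first.
- (void)visitAll:(NSArray<id<DemoVisitable>> *)children {
    NSUInteger count = 0;
    for (id<DemoVisitable> child in children) {
        if (count++ > 0) { [self.output appendString:@","]; }
        [child acceptVisitor:self];
    }
}

- (void)visitFolder:(DemoFolder *)folder {
    [self.output appendFormat:@"{\"name\":%@,\"Folders\":[", [self quoted:folder.name]];
    [self visitAll:folder.folders];
    [self.output appendString:@"],\"Words\":["];
    [self visitAll:folder.words];
    [self.output appendString:@"]}"];
}

- (void)visitWord:(DemoWord *)word {
    [self.output appendString:[self quoted:word.text]];
}

@end
```

Calling [rootFolder acceptVisitor:visitor] on a DemoFolder fills visitor.output with one nested JSON object. Unlike the listing above, this sketch always writes both arrays, even when empty, which keeps the comma handling trivial.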

To support low memory serial output of a deep hierarchy, I'm suggesting you write your own JSON Folder object to the output stream. Since JSON is so simple, this is much easier than it might at first appear. The alternative is to look for a JSON Library which supports low memory serialised writing of nested objects (I haven't used JSON much so don't know if such exists and is easy to use on iOS). The visitor pattern ensures you need have no more than one NSManagedObject instantiated to work on for each level of the hierarchy (though of course more will inevitably need to be instantiated as you implement hierarchy traversal logic) so this is light on memory usage.

I have given examples of the text string that needs to be written to the output stream. Best practice would dictate using a templating system for this rather than directly writing statically allocated strings. But personally I wouldn't worry about adopting the quick and dirty approach if your deadline is tight.
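For the quick and dirty approach, writing a text fragment to the stream could look roughly like this. WriteJSONFragment is a hypothetical helper name, not an Apple API; the loop is there because write:maxLength: may accept fewer bytes than it was offered. It assumes the stream has already been opened.

```objective-c
#import <Foundation/Foundation.h>

// Hypothetical helper (not part of any Apple API): writes a UTF-8 text
// fragment to an already-open stream, looping because write:maxLength:
// may accept fewer bytes than requested. Returns NO on a stream error.
static BOOL WriteJSONFragment(NSOutputStream *stream, NSString *fragment) {
    NSData *data = [fragment dataUsingEncoding:NSUTF8StringEncoding];
    const uint8_t *bytes = data.bytes;
    NSUInteger remaining = data.length;
    while (remaining > 0) {
        NSInteger written = [stream write:bytes maxLength:remaining];
        if (written <= 0) { return NO; } // error, or the stream is at capacity
        bytes += written;
        remaining -= (NSUInteger)written;
    }
    return YES;
}
```

Any string that comes from your data (folder names, words) still needs JSON escaping before it goes into a fragment; only the fixed punctuation such as braces, brackets and commas can safely be written as literal strings.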

I've assumed your Folder objects contain a folders property providing a set of additional folders. I have also assumed your Folder NSManagedObject class contains a words property containing a set of Word NSManagedObjects. Remember, if you stay working in Core Data it will look after keeping the memory footprint low.

At the end of the visitFolder: method, you can use the following logic.

  1. Check if the Folder contains any folders, and visit each in turn if it does.

  2. Once there are no more folders to visit, check if it contains any Words, and visit each in turn if it does.

Note the above code is the simplest construct for minimising the memory footprint. You may want to optimise it for performance by e.g. only draining an autorelease pool when a certain batch size is exceeded. However, given the problem you have described, it will be best to implement the most memory-efficient method first.
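As a rough sketch of that batching idea: the helper below (VisitInBatches is a made-up name, and the block parameter is an assumption about how you would hook it into your visitor) drains one autorelease pool per batch rather than one per object.

```objective-c
#import <Foundation/Foundation.h>

// Hypothetical helper: calls `visit` for every object, draining an
// autorelease pool once per batch instead of once per object.
static void VisitInBatches(NSArray *objects, NSUInteger batchSize, void (^visit)(id object)) {
    if (batchSize == 0) { batchSize = 1; } // guard against an endless loop
    for (NSUInteger start = 0; start < objects.count; start += batchSize) {
        @autoreleasepool {
            NSUInteger length = MIN(batchSize, objects.count - start);
            for (id object in [objects subarrayWithRange:NSMakeRange(start, length)]) {
                visit(object);
            }
        }
    }
}
```

Since a Core Data to-many relationship is normally an NSSet, you would pass something like folder.words.allObjects here. A batch size of 1 gives the most memory-efficient behaviour; larger values trade memory for fewer pool drains.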

If you have polymorphic hierarchies, you're on your own :) Get a book out and do some study; managing them is a grad degree in itself.

Clearly this code is untested!


Check the NSFetchRequest documentation. You will see these properties:

    fetchOffset
    fetchLimit
    fetchBatchSize

With fetchLimit and fetchOffset you can restrict the number of NSManagedObjects a fetch returns to a given batch size and step through the results one batch at a time. (fetchBatchSize on its own does not cap the number of results; it controls how many objects are faulted in at a time.)

Open a stream you can write to. Set up a loop to execute a fetch request, with the fetch limit set to a batch size (x), and then update the fetch offset of the fetch request at the end of the loop code for the next iteration of the loop.

myFetchRequestObject.fetchOffset += x;

Process the batch of data objects writing the JSON data to your open stream before starting the next iteration of the loop.

When either no more objects are returned or the number of objects returned by the fetch is less than the batch size, exit your loop.

Close your stream.
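Put together, the loop could be sketched like this. It is untested against a real store; ProcessInBatches, the handler block and the sortKey parameter are assumptions for illustration, and it uses fetchLimit together with fetchOffset to cap each fetch at the batch size. A sort descriptor is included because paging by offset only makes sense with a stable order.

```objective-c
#import <CoreData/CoreData.h>

// Hypothetical helper: pages through `entityName` in batches of `batchSize`,
// handing each batch to `handler` (e.g. to write it to an open stream).
// Returns NO and sets `error` if a fetch fails.
static BOOL ProcessInBatches(NSManagedObjectContext *context, NSString *entityName,
                             NSString *sortKey, NSUInteger batchSize,
                             void (^handler)(NSArray *batch), NSError **error) {
    if (batchSize == 0) { batchSize = 1; }
    NSFetchRequest *request = [NSFetchRequest fetchRequestWithEntityName:entityName];
    request.sortDescriptors = @[[NSSortDescriptor sortDescriptorWithKey:sortKey ascending:YES]];
    request.fetchLimit = batchSize;

    NSError *fetchError = nil; // strong local, so it survives the pool drain
    BOOL succeeded = YES;
    BOOL finished = NO;
    while (!finished) {
        @autoreleasepool {
            NSArray *batch = [context executeFetchRequest:request error:&fetchError];
            if (batch == nil) {
                succeeded = NO;  // the fetch itself failed
                finished = YES;
            } else {
                if (batch.count > 0) { handler(batch); }
                // A short (or empty) batch means the last page was reached.
                finished = (batch.count < batchSize);
                request.fetchOffset += batchSize;
            }
        }
    }
    if (!succeeded && error != NULL) { *error = fetchError; }
    return succeeded;
}
```

Open the stream before calling it, write each batch inside the handler, and close the stream once it returns.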


The problem was that I had Enable Zombie Objects turned on in the project scheme. For some reason this also carried through to the release build.

Turning it off fixed all my problems.

I ended up also using TheBasicMind's design pattern, because it's a cool design pattern...