Querying MongoDB GridFS?


GridFS works by splitting each file into a number of chunks and storing each chunk as a separate document. This way, you can store and deliver very large files without having to hold the entire file in RAM. It also lets you store files that are larger than the maximum BSON document size (16 MB). The default chunk size was 256 KB (262144 bytes) when this was written; current drivers default to 255 KB.

The file metadata field can be used to store additional file-specific metadata, which can be more efficient than storing the metadata in a separate document. This greatly depends on your exact requirements, but the metadata field, in general, offers a lot of flexibility. Keep in mind that some of the more obvious metadata is already part of the fs.files document, by default:

> db.fs.files.findOne();
{
    "_id" : ObjectId("4f9d4172b2ceac15506445e1"),
    "filename" : "2e117dc7f5ba434c90be29c767426c29",
    "length" : 486912,
    "chunkSize" : 262144,
    "uploadDate" : ISODate("2011-10-18T09:05:54.851Z"),
    "md5" : "4f31970165766913fdece5417f7fa4a8",
    "contentType" : "application/pdf"
}
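The length and chunkSize fields are enough to work out how many chunks a file occupies. The following is a minimal sketch of that arithmetic, using the values from the sample document; chunkLayout is a hypothetical helper written for illustration, not part of any MongoDB driver.

```javascript
// Compute how a file of `length` bytes is split into GridFS chunks.
// chunkLayout is a hypothetical helper, not a driver API.
function chunkLayout(length, chunkSize) {
  if (!Number.isInteger(length) || length < 0) {
    throw new RangeError("length must be a non-negative integer");
  }
  if (!Number.isInteger(chunkSize) || chunkSize <= 0) {
    throw new RangeError("chunkSize must be a positive integer");
  }
  const count = Math.ceil(length / chunkSize);
  // Every chunk is full except possibly the last one.
  const lastChunkSize = count === 0 ? 0 : length - (count - 1) * chunkSize;
  return { count, lastChunkSize };
}

// Values taken from the sample fs.files document.
const layout = chunkLayout(486912, 262144);
console.log(layout); // { count: 2, lastChunkSize: 224768 }
```

So the sample PDF is stored as one full 262144-byte chunk (n = 0) followed by a shorter final chunk (n = 1).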

To actually read the file from GridFS you'll have to fetch the file document from fs.files and the chunks from fs.chunks. The most efficient way to do that is to stream this to the client chunk-by-chunk, so you don't have to load the entire file in RAM. The chunks collection has the following structure:

> db.fs.chunks.findOne({}, {"data" : 0});
{
    "_id" : ObjectId("4e9d4172b2ceac15506445e1"),
    "files_id" : ObjectId("4f9d4172b2ceac15506445e1"),
    "n" : 0 // this is the 0th chunk of the file
    // "data" : /* loads of binary data */ is excluded here by the {"data" : 0} projection
}
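To make the chunk-by-chunk idea concrete, here is a rough sketch that walks the chunks of one file in order of n. It runs against an in-memory array standing in for fs.chunks; the sample data, the string ids, and the chunksInOrder helper are all assumptions made for illustration. In a real application the Node.js driver's GridFSBucket (for example openDownloadStream) does this work for you.

```javascript
// In-memory stand-ins for documents from fs.chunks (hypothetical sample data).
// Real files_id values are ObjectIds; plain strings keep the sketch simple.
const sampleChunks = [
  { files_id: "file-a", n: 1, data: Buffer.from("world") },
  { files_id: "file-b", n: 0, data: Buffer.from("other") },
  { files_id: "file-a", n: 0, data: Buffer.from("hello ") },
];

// Yield the chunks of one file in order of n, one at a time.
function* chunksInOrder(chunks, filesId) {
  const own = chunks
    .filter((c) => c.files_id === filesId)
    .sort((a, b) => a.n - b.n);
  for (let i = 0; i < own.length; i++) {
    // Chunk numbers must be contiguous, starting at 0.
    if (own[i].n !== i) throw new Error(`missing chunk ${i}`);
    yield own[i].data;
  }
}

// A server would write each chunk to the response as it is produced;
// the chunks are concatenated here only to show the result.
const parts = [...chunksInOrder(sampleChunks, "file-a")];
console.log(Buffer.concat(parts).toString()); // hello world
```

Because the generator hands out one chunk at a time, a consumer never needs more than a single chunk in memory.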

If you want to use the metadata field of fs.files for your queries, make sure you understand the dot notation, e.g.

> db.fs.files.find({
      "metadata.OwnerId" : new ObjectId("..."),
      "metadata.ImageWidth" : 280
  });

Also, use explain() to verify that your queries can actually use an index.
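If dot notation is new to you, the sketch below imitates what an equality query on dotted paths does, using plain JavaScript objects instead of a real collection. getPath and matchesQuery are hypothetical helpers, and the sample documents are made up; real MongoDB matching also handles arrays, query operators, and ObjectId comparison, all of which this ignores.

```javascript
// Resolve a dotted path such as "metadata.ImageWidth" against a document.
function getPath(doc, path) {
  return path.split(".").reduce(
    (value, key) =>
      value !== null && typeof value === "object" ? value[key] : undefined,
    doc
  );
}

// Simplified equality-only matcher: every dotted path must equal its expected value.
function matchesQuery(doc, query) {
  return Object.entries(query).every(
    ([path, expected]) => getPath(doc, path) === expected
  );
}

// Made-up stand-ins for fs.files documents.
const files = [
  { filename: "a.png", metadata: { OwnerId: "u1", ImageWidth: 280 } },
  { filename: "b.png", metadata: { OwnerId: "u1", ImageWidth: 640 } },
  { filename: "c.png" }, // no metadata at all
];

const hits = files.filter((f) =>
  matchesQuery(f, { "metadata.OwnerId": "u1", "metadata.ImageWidth": 280 })
);
console.log(hits.map((f) => f.filename)); // [ 'a.png' ]
```

Note that each dotted path is checked on its own, so a document matches even if its metadata contains other fields as well.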


As the specification says, you can store whatever you want in the metadata field.

Here's what a document from the files collection looks like:

Required fields

{
  "_id" : <unspecified>,                  // unique ID for this file
  "length" : data_number,                 // size of the file in bytes
  "chunkSize" : data_number,              // size of each of the chunks.  Default is 256k
  "uploadDate" : data_date,               // date when object first stored
  "md5" : data_string                     // result of running the "filemd5" command on this file's chunks
}

Optional fields

{
  "filename" : data_string,               // human name for the file
  "contentType" : data_string,            // valid mime type for the object
  "aliases" : data_array of data_string,  // optional array of alias strings
  "metadata" : data_object                // anything the user wants to store
}
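As a small illustration of the required/optional split, the sketch below checks a plain object against the required field names listed above. missingRequiredFields and the sample object are hypothetical, and the check looks only at field presence, not at types.

```javascript
// Required field names, copied from the list above.
const REQUIRED_FIELDS = ["_id", "length", "chunkSize", "uploadDate", "md5"];

// Return the names of required fields that are absent from a files document.
function missingRequiredFields(fileDoc) {
  return REQUIRED_FIELDS.filter((field) => !(field in fileDoc));
}

// Made-up document: it has optional metadata but lacks uploadDate and md5.
const candidate = {
  _id: "some-id",
  length: 486912,
  chunkSize: 262144,
  metadata: { some_info: "sample" },
};

console.log(missingRequiredFields(candidate)); // [ 'uploadDate', 'md5' ]
```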

So store anything you want in the metadata and query it normally like you would in MongoDB:

db.fs.files.find({"metadata.some_info" : "sample"});


I know the question doesn't ask about the Java way of querying for metadata, but here it is, assuming you add gender as a metadata field:

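import com.mongodb.DB;
import com.mongodb.DBObject;
import com.mongodb.MongoClient;
import com.mongodb.gridfs.GridFS;
import com.mongodb.gridfs.GridFSDBFile;
import com.mongodb.util.JSON;
import java.util.List;

// Get your database's GridFS (legacy driver API: GridFS takes a DB object, not a name)
DB db = new MongoClient().getDB("myDatabase");
GridFS gfs = new GridFS(db);
// Write out your JSON query within JSON.parse() and cast it as a DBObject.
// Dot notation matches on the gender field even if metadata holds other fields too.
DBObject dbObject = (DBObject) JSON.parse("{'metadata.gender': 'Male'}");
// Querying action (find)
List<GridFSDBFile> gridFSDBFiles = gfs.find(dbObject);
// Loop through the results
for (GridFSDBFile gridFSDBFile : gridFSDBFiles) {
    System.out.println(gridFSDBFile.getFilename());
}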