Mongoose, document state using String vs Booleans
Beyond search performance, there are a few other things to consider: conditional logic, setting values, validating documents, document size, and indexing.
Let's review the two proposed schemas and name them proposalA, using an enum, and proposalB, using multiple fields to emulate an enum:
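
    const { Schema } = require('mongoose');

    // proposalA: a single string field restricted to three values
    const proposalA = new Schema({
      state: { type: String, enum: ['pending', 'approved', 'denied'], default: 'pending' }
    });

    // proposalB: two boolean flags emulating the enum
    const proposalB = new Schema({
      approved: { type: Boolean, default: false },
      denied: { type: Boolean, default: false }
    });
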
Assumptions
proposalA assumes that the document state can only be one of three values: 'pending', 'approved', or 'denied'.
proposalB, to support the same assumption, represents the 'pending' state by having both approved and denied set to false.
Concerns
Querying, Indexing & Modification
While proposalA does use a string value, matching in either proposal is an equality check: { state: 'approved' } or { approved: true } for the search query. The big difference is the query for pending:
proposalA: { state: 'pending' }
proposalB: { approved: false, denied: false }
Assuming there are no other query parameters, proposalA would need a single index on state, while proposalB would need either two indexes, one each for approved and denied, combined through MongoDB's index intersection, or a compound index on approved and denied.
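As a rough sketch of what those index definitions might look like, the key specifications can be written as plain objects. The field names come from the two schemas; the variable names are only illustrative, and the commented-out registration calls assume Mongoose's `schema.index()`.

```javascript
// Index key specifications as plain objects; 1 means ascending order.
const proposalAIndex = { state: 1 };

// Option 1 for proposalB: two single-field indexes, relying on index intersection.
const proposalBSingleIndexes = [{ approved: 1 }, { denied: 1 }];

// Option 2 for proposalB: one compound index covering both flags.
const proposalBCompoundIndex = { approved: 1, denied: 1 };

// Assumed usage with the schemas defined earlier (not executed here):
//   proposalA.index(proposalAIndex);
//   proposalB.index(proposalBCompoundIndex);
```

Either way, proposalB has more index surface to maintain than the single-field index for proposalA.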
The issue with leaving it up to index intersection is that, as the query becomes more involved, it gets hard to predict which intersection the query planner will use, if any. It also scales poorly: there are only three states at the moment, but each new state would add a field and therefore another index (or a wider compound index) to keep the query efficient.
That leads to another issue: each index occupies memory on the MongoDB server. While a compound index reduces the number of indexes for this query to one, it would likely still be larger in memory than the single index for proposalA.
Speaking of size, { state: 'pending' } is about half the length of { approved: false, denied: false } as written, though once encoded as BSON the two are nearly the same (roughly 24 versus 25 bytes, not counting _id). The gap opens up as the pattern spreads: every new state adds another boolean field to every proposalB document, and if the same approach is used for other fields, document size grows quickly.
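How big the difference is depends on how size is measured. The following is a rough sketch that counts BSON bytes by hand for flat documents containing only strings and booleans. The helper `bsonSize` and the extra state 'archived' are hypothetical, and `_id` is left out. It suggests the two shapes start out close in encoded size and that the flag-based shape grows faster as states are added.

```javascript
// Hand-rolled BSON size estimate for flat documents whose values are only
// strings or booleans. bsonSize is a hypothetical helper, not a library call.
function bsonSize(doc) {
  let size = 4 + 1; // int32 document length prefix + trailing 0x00 terminator
  for (const [key, value] of Object.entries(doc)) {
    size += 1 + Buffer.byteLength(key, 'utf8') + 1; // type byte + key + 0x00
    if (typeof value === 'string') {
      size += 4 + Buffer.byteLength(value, 'utf8') + 1; // int32 length + bytes + 0x00
    } else if (typeof value === 'boolean') {
      size += 1; // booleans are stored as a single byte
    } else {
      throw new TypeError(`unsupported value type for key "${key}"`);
    }
  }
  return size;
}

console.log(bsonSize({ state: 'pending' }));               // 24
console.log(bsonSize({ approved: false, denied: false })); // 25

// Adding a hypothetical fourth state, 'archived':
console.log(bsonSize({ state: 'archived' }));                               // 25
console.log(bsonSize({ approved: false, denied: false, archived: false })); // 36
```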
Returning to the search query from a programmatic viewpoint, proposalA is pretty straightforward:

    const mongoose = require('mongoose');

    function getDocsFromState(state) {
      const Foo = mongoose.model('foo'); // look up the already-registered model
      // assuming state is a string of 'pending', 'approved', or 'denied'
      const query = { state };
      return Foo.find(query).exec(); // Promise
    }

proposalB, on the other hand, needs some conditional code to construct the query (one possible variant of this logic):
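
    const mongoose = require('mongoose');

    function getDocsFromState(state) {
      const Foo = mongoose.model('foo'); // look up the already-registered model
      // 'pending' maps to both flags being false
      const query = {
        approved: state === 'approved',
        denied: state === 'denied'
      };
      return Foo.find(query).exec(); // Promise
    }
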
Aside from proposalA having more concise code, its query code would not need updating to support new states, whereas proposalB's would.
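To make that maintenance cost concrete, here is a hedged sketch of a proposalB query builder that works from a list of flags. `FLAG_STATES` and `buildFlagQuery` are hypothetical names. The point is that this list is one more thing to keep in sync with the schema whenever a state is added, whereas the proposalA query stays `{ state }`.

```javascript
// Hypothetical list of flag fields; it must be extended for every new state.
const FLAG_STATES = ['approved', 'denied'];

// Builds the equality query for a given state. 'pending' means no flag is set.
function buildFlagQuery(state, flags = FLAG_STATES) {
  if (state !== 'pending' && !flags.includes(state)) {
    throw new RangeError(`unknown state: ${state}`);
  }
  const query = {};
  for (const flag of flags) {
    query[flag] = flag === state; // true only for the requested state
  }
  return query;
}

console.log(buildFlagQuery('pending'));  // { approved: false, denied: false }
console.log(buildFlagQuery('approved')); // { approved: true, denied: false }
```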
The same issue applies to updating the value for the state. proposalA remains concise:
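
    const mongoose = require('mongoose');

    function updateDocState(_id, state) {
      const Foo = mongoose.model('foo'); // look up the already-registered model
      // assuming state is a string of 'pending', 'approved', or 'denied'
      const update = { state };
      // updateOne replaces the older Model.update, which newer Mongoose versions no longer provide
      return Foo.updateOne({ _id }, update).exec(); // Promise
    }
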
While proposalB still requires additional logic:
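
    const mongoose = require('mongoose');

    function updateDocState(_id, state) {
      const Foo = mongoose.model('foo'); // look up the already-registered model
      // 'pending' resets both flags to false
      const update = {
        approved: state === 'approved',
        denied: state === 'denied'
      };
      // updateOne replaces the older Model.update, which newer Mongoose versions no longer provide
      return Foo.updateOne({ _id }, update).exec(); // Promise
    }
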
Conditional Logic & Validation
Validation becomes more cumbersome when an enum is emulated with one field per value. An enum by definition prevents more than one value from being stored at a time, so proposalB would need extra validation to prevent approved and denied from both being true. Enforcing this could limit how documents are updated (partial updates versus loading the full document into memory, modifying it, and saving), depending on whether validation is done natively in MongoDB or in a third-party library like Mongoose.
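A minimal sketch of the extra check proposalB would need, written as a plain function so it can be reused wherever documents are validated. `hasSingleState` is a hypothetical name, and the commented-out hook shows one assumed way to wire it into Mongoose. As far as I know, a document hook like this runs when a document is saved or validated, not for query-style updates such as `Model.updateOne()`, which is the limitation on partial updates mentioned above.

```javascript
// Returns true when at most one of the state flags is set on the document.
function hasSingleState(doc, flags = ['approved', 'denied']) {
  const setFlags = flags.filter((flag) => doc[flag] === true);
  return setFlags.length <= 1;
}

console.log(hasSingleState({ approved: false, denied: false })); // true (pending)
console.log(hasSingleState({ approved: true, denied: false }));  // true
console.log(hasSingleState({ approved: true, denied: true }));   // false

// Assumed wiring into the proposalB schema (not executed here):
//   proposalB.pre('validate', function () {
//     if (!hasSingleState(this)) {
//       this.invalidate('denied', 'approved and denied cannot both be true');
//     }
//   });
```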
Finally, we have already seen that conditional logic is necessary for querying and updating a document, but it may be required elsewhere in the code too. Any time an in-memory proposalB document needs to be checked for its current state, similar conditional logic is required, while proposalA simply checks the enum value.
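For illustration, reading the current state back from an in-memory proposalB document might look like the sketch below (`stateOf` is a hypothetical helper). With proposalA the equivalent is just `doc.state`.

```javascript
// Derives the logical state from the two boolean flags of a proposalB document.
function stateOf(doc) {
  if (doc.approved && doc.denied) {
    throw new Error('invalid document: approved and denied are both true');
  }
  if (doc.approved) return 'approved';
  if (doc.denied) return 'denied';
  return 'pending'; // neither flag set
}

console.log(stateOf({ approved: false, denied: false })); // pending
console.log(stateOf({ approved: true, denied: false }));  // approved
console.log(stateOf({ approved: false, denied: true }));  // denied
```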
TL;DR
We have seen how enums provide built-in document validation, keep document and index sizes down as states are added, simplify the indexing strategy, simplify current and possible future implementation code, and pose little performance concern in querying, since both approaches use equality checks.
Hope this helps!