Concat String by Group


You can do this with the aggregation framework as a two-step operation: first accumulate the items into an array via $push within a $group pipeline stage, then use $concat with $reduce on the resulting array in a final projection:

    db.collection.aggregate([
      { "$group": {
        "_id": "$tag_id",
        "client_id": { "$push": "$client_id" }
      }},
      { "$addFields": {
        "client_id": {
          "$reduce": {
            "input": "$client_id",
            "initialValue": "",
            "in": {
              "$cond": {
                "if": { "$eq": [ "$$value", "" ] },
                "then": "$$this",
                "else": {
                  "$concat": ["$$value", ",", "$$this"]
                }
              }
            }
          }
        }
      }}
    ])

We also apply $cond here to avoid concatenating an empty string with a comma in the results, so it looks more like a delimited list.
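To make the role of the conditional concrete, here is a minimal plain-JavaScript analogue of the $reduce expression. It is only a sketch: `joinIds` is a hypothetical helper name, and `Array.prototype.reduce` stands in for the aggregation operator, so nothing here runs on the server.

```javascript
// Plain-JavaScript analogue of the $reduce / $cond expression (illustrative only).
// joinIds is a hypothetical helper, not a MongoDB API.
function joinIds(ids) {
  return ids.reduce(
    // "value" mirrors $$value, "current" mirrors $$this
    (value, current) => (value === "" ? current : value + "," + current),
    "" // plays the role of "initialValue"
  );
}

const joined = joinIds(["10001", "10002"]);

// Without the conditional, the first element is also prefixed with the delimiter.
const naive = ["10001", "10002"].reduce(
  (value, current) => value + "," + current,
  ""
);

console.log(joined); // "10001,10002"
console.log(naive);  // ",10001,10002"
```

The leading comma in the second result is exactly what the $cond branch avoids.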

FYI: there is a JIRA issue, SERVER-29339, which asks for $reduce to be implemented as an accumulator expression to allow its use directly in a $group pipeline stage. It is not likely to happen any time soon, but in theory it would replace $push in the above and make the operation a single pipeline stage. Sample proposed syntax is on the JIRA issue.

If you don't have $reduce (it requires MongoDB 3.4), then just post-process the cursor:

    db.collection.aggregate([
      { "$group": {
        "_id": "$tag_id",
        "client_id": { "$push": "$client_id" }
      }}
    ]).map( doc =>
      Object.assign(
        doc,
        { "client_id": doc.client_id.join(",") }
      )
    )

This leads to the other alternative, mapReduce, if you really must:

    db.collection.mapReduce(
      function() {
        emit(this.tag_id, this.client_id);
      },
      function(key, values) {
        return [].concat.apply([], values.map(v => v.split(","))).join(",");
      },
      { "out": { "inline": 1 } }
    )

This of course produces output in the mapReduce-specific form, where each result document has only the keys _id and value, but it is basically the same result.

We use [].concat.apply([],values.map(...)) because the output of the reducer can itself be a delimited string: mapReduce works incrementally with large results, so the output of the reducer can become input on another pass. We need to expect this and split each value back apart before joining.
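The re-reduce behaviour can be simulated outside the database. The sketch below lifts the reducer into a standalone function and calls it by hand in two passes; the split into passes and the extra id "10003" are assumptions made for illustration, since MongoDB decides for itself when a reducer is invoked again.

```javascript
// The reducer from the mapReduce call, as a standalone function.
function reducer(key, values) {
  // Split every value (it may already be a delimited string), flatten, then re-join.
  return [].concat.apply([], values.map(v => v.split(","))).join(",");
}

// First pass: the reducer sees only part of the values emitted for key 1.
const partial = reducer(1, ["10001", "10002"]);

// Second pass: the earlier output comes back as an input next to a new value.
const full = reducer(1, [partial, "10003"]);

// A single pass over all the values gives the same result.
const singlePass = reducer(1, ["10001", "10002", "10003"]);

console.log(partial);             // "10001,10002"
console.log(full);                // "10001,10002,10003"
console.log(full === singlePass); // true
```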


Starting in MongoDB 4.4, the $group stage has a new aggregation operator, $accumulator, which allows custom accumulation of documents as they are grouped:

    // { "tag_id" : 1, "client_id" : "10001" }
    // { "tag_id" : 1, "client_id" : "10002" }
    // { "tag_id" : 2, "client_id" : "9999"  }
    db.collection.aggregate([
      { $group: {
        _id: "$tag_id",
        client_id: {
          $accumulator: {
            accumulateArgs: ["$client_id"],
            init: function() { return [] },
            accumulate: function(ids, id) { return ids.concat(id) },
            merge: function(ids1, ids2) { return ids1.concat(ids2) },
            finalize: function(ids) { return ids.join(",") },
            lang: "js"
          }
        }
      }}
    ])
    // { "_id" : 2, "client_id" : "9999" }
    // { "_id" : 1, "client_id" : "10001,10002" }

The accumulator:

  • accumulates on the field client_id (accumulateArgs)
  • is initialised to an empty array (init)
  • accumulates by concatenating new ids onto the ids already seen (accumulate and merge)
  • and finally joins all ids as a string (finalize)
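The lifecycle described in the list above can be traced by hand in plain JavaScript. This is a minimal sketch: the four functions are copied from the $accumulator definition, while the `acc` object and the split into two partial states are assumptions made for illustration. In a real pipeline the server decides when, and whether, merge is called.

```javascript
// The four functions from the $accumulator definition, gathered in a plain object.
const acc = {
  init: function() { return [] },
  accumulate: function(ids, id) { return ids.concat(id) },
  merge: function(ids1, ids2) { return ids1.concat(ids2) },
  finalize: function(ids) { return ids.join(",") }
};

// Hypothetical scenario: the two documents with tag_id 1 are accumulated
// into separate partial states, which are then merged.
let stateA = acc.init();
stateA = acc.accumulate(stateA, "10001");

let stateB = acc.init();
stateB = acc.accumulate(stateB, "10002");

const result = acc.finalize(acc.merge(stateA, stateB));
console.log(result); // "10001,10002"
```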