
Calculated group-by fields in MongoDB


You can do something like this with a "project" stage first, but to me it's a little counter-intuitive to require a $project stage beforehand:

    Aggregation agg = newAggregation(
        project("quantity")
            .andExpression("dayOfMonth(date)").as("day")
            .andExpression("month(date)").as("month")
            .andExpression("year(date)").as("year")
            .andExpression("price * quantity").as("totalAmount"),
        group(fields().and("day").and("month").and("year"))
            .avg("quantity").as("averageQuantity")
            .sum("totalAmount").as("totalAmount")
            .count().as("count")
    );

Like I said, this is counter-intuitive, as you should be able to declare all of this under the $group stage, but the helpers don't seem to work that way. The serialization comes out a bit funny (it wraps the date operator arguments in arrays), but it does seem to work. Still, this is two pipeline stages rather than one.

What is the problem with this? Well, by separating the stages, the "project" portion forces the processing of all of the documents in the pipeline in order to get the calculated fields. That means it passes through everything before moving on to the group stage.

The difference in processing time can be clearly seen by running the queries in both forms. With a separate project stage, the query takes three times longer to execute on my hardware than the query where all fields are calculated during the "group" operation.

So it seems the only present way to construct this properly is by building the pipeline object yourself:

    ApplicationContext ctx =
            new AnnotationConfigApplicationContext(SpringMongoConfig.class);
    MongoOperations mongoOperation = (MongoOperations) ctx.getBean("mongoTemplate");

    BasicDBList pipeline = new BasicDBList();
    String[] multiplier = { "$price", "$quantity" };

    pipeline.add(
        new BasicDBObject("$group",
            new BasicDBObject("_id",
                new BasicDBObject("month", new BasicDBObject("$month", "$date"))
                    .append("day", new BasicDBObject("$dayOfMonth", "$date"))
                    .append("year", new BasicDBObject("$year", "$date"))
            )
            .append("totalPrice", new BasicDBObject(
                "$sum", new BasicDBObject(
                    "$multiply", multiplier
                )
            ))
            .append("averageQuantity", new BasicDBObject("$avg", "$quantity"))
            .append("count",new BasicDBObject("$sum",1))
        )
    );

    BasicDBObject aggregation = new BasicDBObject("aggregate","collection")
        .append("pipeline",pipeline);

    System.out.println(aggregation);

    CommandResult commandResult = mongoOperation.executeCommand(aggregation);

Or if all of that seems too terse to you, then you can always work with the JSON source and parse that. But of course, it has to be valid JSON:

    String json = "[" +
        "{ \"$group\": { "+
            "\"_id\": { " +
                "\"month\": { \"$month\": \"$date\" }, " +
                "\"day\": { \"$dayOfMonth\":\"$date\" }, " +
                "\"year\": { \"$year\": \"$date\" } " +
            "}, " +
            "\"totalPrice\": { \"$sum\": { \"$multiply\": [ \"$price\", \"$quantity\" ] } }, " +
            "\"averageQuantity\": { \"$avg\": \"$quantity\" }, " +
            "\"count\": { \"$sum\": 1 } " +
        "}}" +
    "]";

    BasicDBList pipeline = (BasicDBList)com.mongodb.util.JSON.parse(json);
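
If the escaped quotes in the concatenated string are hard to read, one option is a Java text block. This is only a sketch and assumes Java 15 or later; the class name `GroupPipelineJson` and its `pipeline()` method are made up for illustration. The JSON itself is the same pipeline as in the concatenated string.

```java
// Sketch only: requires Java 15+ for text blocks.
// GroupPipelineJson is a hypothetical helper name, not part of any library.
class GroupPipelineJson {

    // Returns the single-stage $group pipeline as a JSON string,
    // written without escaped quotes or string concatenation.
    static String pipeline() {
        return """
            [
              { "$group": {
                  "_id": {
                    "month": { "$month": "$date" },
                    "day": { "$dayOfMonth": "$date" },
                    "year": { "$year": "$date" }
                  },
                  "totalPrice": { "$sum": { "$multiply": [ "$price", "$quantity" ] } },
                  "averageQuantity": { "$avg": "$quantity" },
                  "count": { "$sum": 1 }
              } }
            ]
            """;
    }

    public static void main(String[] args) {
        // Print the JSON so it can be checked by eye or pasted into the shell.
        System.out.println(pipeline());
    }
}
```

The returned string can then be handed to `com.mongodb.util.JSON.parse` in place of the concatenated `json` variable.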


Another alternative is to use a custom aggregation operation class defined like so:

    public class CustomAggregationOperation implements AggregationOperation {
        private DBObject operation;

        public CustomAggregationOperation (DBObject operation) {
            this.operation = operation;
        }

        @Override
        public DBObject toDBObject(AggregationOperationContext context) {
            return context.getMappedObject(operation);
        }
    }

Then implement it in the pipeline like so:

    Aggregation aggregation = newAggregation(
        new CustomAggregationOperation(
            new BasicDBObject(
                "$group",
                new BasicDBObject("_id",
                    new BasicDBObject("day", new BasicDBObject("$dayOfMonth", "$date" ))
                    .append("month", new BasicDBObject("$month", "$date"))
                    .append("year", new BasicDBObject("$year", "$date"))
                )
                .append("totalPrice", new BasicDBObject(
                    "$sum", new BasicDBObject(
                        "$multiply", Arrays.asList("$price","$quantity")
                    )
                ))
                .append("averageQuantity", new BasicDBObject("$avg", "$quantity"))
                .append("count",new BasicDBObject("$sum",1))
            )
        )
    );

So it's basically just an interface that is consistent with that used by the existing pipeline helpers, but instead of using other helper methods it directly takes a DBObject definition to return in pipeline construction.
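
To make the target of all these variants concrete, here is a rough plain-Java sketch of what the single $group stage computes: one bucket per calendar day, with the summed `price * quantity`, the average quantity, and a count. It runs purely in memory and says nothing about how MongoDB executes the pipeline or about the timing difference mentioned earlier. `Sale`, `DayStats` and `GroupSketch` are names invented for this illustration, a `LocalDate` key stands in for the `{ day, month, year }` `_id`, and the record syntax assumes Java 16 or later.

```java
import java.time.LocalDate;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical in-memory stand-in for a document with date, price and quantity.
record Sale(LocalDate date, double price, int quantity) {}

// Accumulator mirroring the three computed fields of the $group stage.
class DayStats {
    double totalPrice;   // $sum of $multiply [ $price, $quantity ]
    long quantitySum;    // running sum used to derive the $avg of $quantity
    long count;          // $sum: 1

    void add(Sale s) {
        totalPrice += s.price() * s.quantity();
        quantitySum += s.quantity();
        count++;
    }

    double averageQuantity() {
        return (double) quantitySum / count;
    }
}

class GroupSketch {
    // Single pass over the input: the grouping key is derived from the date
    // while accumulating, with no separate projection step beforehand.
    static Map<LocalDate, DayStats> groupByDay(List<Sale> sales) {
        Map<LocalDate, DayStats> groups = new TreeMap<>();
        for (Sale s : sales) {
            groups.computeIfAbsent(s.date(), d -> new DayStats()).add(s);
        }
        return groups;
    }

    public static void main(String[] args) {
        List<Sale> sales = List.of(
            new Sale(LocalDate.of(2014, 1, 1), 10.0, 2),
            new Sale(LocalDate.of(2014, 1, 1), 5.0, 4),
            new Sale(LocalDate.of(2014, 1, 2), 3.0, 1));

        // Print one line per day with the three aggregated values.
        groupByDay(sales).forEach((day, stats) ->
            System.out.println(day + " totalPrice=" + stats.totalPrice
                + " averageQuantity=" + stats.averageQuantity()
                + " count=" + stats.count));
    }
}
```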