Using jq to count


[UPDATE: If the input is not an array, see the last section below.]

count/1

I'd recommend defining a count filter (and maybe putting it in your ~/.jq), perhaps as follows:

 def count(s): reduce s as $_ (0;.+1);

With this, assuming the input is an array, you'd write:

 count(.[] | select(.sapm_score > 40))

or slightly more efficiently:

 count(.[] | (.sapm_score > 40) // empty)

This approach (counting items in a stream) is usually preferable to using length as it avoids the costs associated with constructing an array.
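As a quick check of the claim above, here is a minimal sketch that runs the stream-based count next to the array-based length on the same data. The sample array is made up for illustration, and the script assumes jq is on your PATH.

```shell
#!/bin/sh
# Hypothetical sample data: three objects, two of which score above 40.
data='[{"sapm_score":41.5},{"sapm_score":12},{"sapm_score":73}]'

# Stream-based count: no intermediate array is built.
streamed=$(printf '%s' "$data" | jq 'def count(s): reduce s as $_ (0; .+1);
  count(.[] | select(.sapm_score > 40))')

# Array-based equivalent: builds the filtered array, then takes its length.
arrayed=$(printf '%s' "$data" | jq 'map(select(.sapm_score > 40)) | length')

echo "count: $streamed, length: $arrayed"   # prints "count: 2, length: 2"
```

Both give the same answer; the difference only matters for large inputs, where the filtered array would otherwise have to be held in memory.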

count/2

Here's another definition of count that you might like to use (and perhaps add to ~/.jq as well):

def count(stream; cond): count(stream | cond // empty);

This counts the elements of the stream for which cond is neither false nor null.

Now, assuming the input consists of an array, you can simply write:

count(.[]; .sapm_score > 40)
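For illustration, a small sketch of count/2 on made-up data (the sample array is an assumption, not taken from the question). Note that count/2 is defined in terms of count/1, so both definitions must be present.

```shell
#!/bin/sh
# Hypothetical sample data; the last object lacks the key entirely,
# so its condition (null > 40) is false and it is not counted.
data='[{"sapm_score":41.5},{"sapm_score":12},{"sapm_score":73},{"other":1}]'

result=$(printf '%s' "$data" | jq '
  def count(s): reduce s as $_ (0; .+1);
  def count(stream; cond): count(stream | cond // empty);
  count(.[]; .sapm_score > 40)')

echo "$result"   # prints 2
```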

"sapm_score" vs "spam_score"

If the point is that "sapm_score" is a misspelling of "spam_score" and you want to count items under either key, then (for example) you could use count/2 as defined above, like so:

 count(.[]; .spam_score > 40 or .sapm_score > 40)

This assumes all the items in the array are JSON objects. If that is not the case, then you might want to try adding "?" after the key names:

count(.[]; .spam_score? > 40 or .sapm_score? > 40)
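To see what the "?" buys you, here is a hedged sketch on a made-up array that mixes objects with a string and a number. With "?", the non-objects are silently skipped; without it, jq stops with an error as soon as it tries to index the string.

```shell
#!/bin/sh
# Both definitions, kept in a shell variable so they can be reused below.
defs='def count(s): reduce s as $_ (0; .+1);
def count(stream; cond): count(stream | cond // empty);'

# Hypothetical sample data: three objects plus two non-objects.
data='[{"spam_score":50},{"sapm_score":41},{"spam_score":10},"oops",7]'

# With "?": indexing errors are suppressed, so "oops" and 7 contribute nothing.
tolerant=$(printf '%s' "$data" | jq "$defs"' count(.[]; .spam_score? > 40 or .sapm_score? > 40)')

# Without "?": jq exits with a non-zero status when it reaches "oops".
if printf '%s' "$data" | jq "$defs"' count(.[]; .spam_score > 40 or .sapm_score > 40)' >/dev/null 2>&1
then strict=ok
else strict=failed
fi

echo "with ?: $tolerant, without ?: $strict"   # prints "with ?: 2, without ?: failed"
```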

Of course, all of the above assumes the input is valid JSON. If that is not the case, then please see https://github.com/stedolan/jq/wiki/FAQ#processing-not-quite-valid-json

If the input is a stream of JSON objects ...

The revised question indicates the input consists of a stream of JSON objects (whereas originally the input was said to be an array of JSON objects). If the input consists of a stream of JSON objects, then the above solutions can easily be adapted, depending on the version of jq that you have. If your version of jq has inputs then (2) is recommended.

(1) All versions: use the -s command-line option.

(2) If your jq has inputs: use the -n command-line option, and change .[] above to inputs, e.g.

count(inputs; .spam_score? > 40 or .sapm_score? > 40)
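Here is a sketch of both options side by side, using a made-up three-object stream (the data is an assumption for illustration). The two invocations should agree.

```shell
#!/bin/sh
defs='def count(s): reduce s as $_ (0; .+1);
def count(stream; cond): count(stream | cond // empty);'

# Hypothetical input: a stream of objects, one per line, not wrapped in an array.
stream='{"spam_score":40.776}
{"spam_score":17.376}
{"sapm_score":55}'

# (1) -s slurps the whole stream into a single array, so .[] works unchanged.
slurped=$(printf '%s\n' "$stream" | jq -s "$defs"' count(.[]; .spam_score? > 40 or .sapm_score? > 40)')

# (2) -n keeps jq from reading the first object as ".", so inputs yields every object.
streamed=$(printf '%s\n' "$stream" | jq -n "$defs"' count(inputs; .spam_score? > 40 or .sapm_score? > 40)')

echo "slurped: $slurped, streamed: $streamed"   # prints "slurped: 2, streamed: 2"
```

Option (2) never holds the whole input in memory, which is why it is the better fit for large streams.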


Filter the items that satisfy the condition then get the length.

map(select(.sapm_score > 40)) | length


Here is one way:

reduce .[] as $s(0; if $s.spam_score > 40 then .+1 else . end)

Try it online at jqplay.org

If, instead of an array, the input is a sequence of newline-delimited objects (JSON Lines), then

reduce inputs as $s(0; if $s.spam_score > 40 then .+1 else . end)

will work if jq is invoked with the -n flag. Here is an example:

$ cat data.json
{ "spam_score":40.776 }
{ "spam_score":17.376 }
$ jq -Mn 'reduce inputs as $s(0; if $s.spam_score > 40 then .+1 else . end)' data.json
1

Try it online at tio.run