Using jq to count
[UPDATE: If the input is not an array, see the last section below.]
count/1
I'd recommend defining a count filter (and maybe putting it in your ~/.jq), perhaps as follows:
def count(s): reduce s as $_ (0;.+1);
With this, assuming the input is an array, you'd write:
count(.[] | select(.sapm_score > 40))
or slightly more efficiently:
count(.[] | (.sapm_score > 40) // empty)
This approach (counting items in a stream) is usually preferable to using length
as it avoids the costs associated with constructing an array.
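As a quick check, here is a minimal sketch that runs both approaches on the same input. The sample array and its scores are made up for illustration; only the count definition and the two filters come from this answer.

```shell
# Hypothetical sample data (not from the question); scores are invented.
data='[{"sapm_score": 55}, {"sapm_score": 12}, {"sapm_score": 41}]'

# Stream-based count: items are tallied one by one, no intermediate array.
stream_count=$(printf '%s' "$data" | jq '
  def count(s): reduce s as $_ (0; .+1);
  count(.[] | select(.sapm_score > 40))')

# Array-based count: map builds a filtered array first, then length measures it.
array_count=$(printf '%s' "$data" | jq 'map(select(.sapm_score > 40)) | length')

echo "stream: $stream_count, array: $array_count"
```

Both give the same number; the difference only matters for memory use on large inputs.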
count/2
Here's another definition of count that you might like to use (and perhaps add to ~/.jq as well):
def count(stream; cond): count(stream | cond // empty);
This counts the elements of the stream for which cond is neither false nor null.
Now, assuming the input consists of an array, you can simply write:
count(.[]; .sapm_score > 40)
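To see how count/2 treats false and null conditions, here is a small sketch. The input is hypothetical: one object lacks the key entirely, so .sapm_score is null for it and the comparison evaluates to false.

```shell
# Hypothetical sample data; the third object has no sapm_score key at all.
data='[{"sapm_score": 55}, {"sapm_score": 12}, {"other": 1}]'

# count/2 is defined in terms of count/1, so both definitions are needed.
n=$(printf '%s' "$data" | jq '
  def count(s): reduce s as $_ (0; .+1);
  def count(stream; cond): count(stream | cond // empty);
  count(.[]; .sapm_score > 40)')

echo "$n"
```

Only the first object is counted: for the other two, the condition is false, and "// empty" drops them from the stream.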
"sapm_score" vs "spam_score"
If the point is that you want to normalize "sapm_score" to "spam_score", then (for example) you could use count/2 as defined above, like so:
count(.[]; .spam_score > 40 or .sapm_score > 40)
This assumes all the items in the array are JSON objects. If that is not the case, then you might want to try adding "?" after the key names:
count(.[]; .spam_score? > 40 or .sapm_score? > 40)
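Here is a rough illustration of what the "?" buys you. The mixed array below is invented for the example; it contains a string and a number, which cannot be indexed by key.

```shell
# Hypothetical input mixing objects with non-object items.
data='[{"spam_score": 50}, {"sapm_score": 45}, {"spam_score": 10}, "junk", 7, null]'

defs='def count(s): reduce s as $_ (0; .+1);
      def count(stream; cond): count(stream | cond // empty);'

# With "?": indexing errors on "junk" and 7 are suppressed, so those items are skipped.
lenient=$(printf '%s' "$data" | jq "$defs"' count(.[]; .spam_score? > 40 or .sapm_score? > 40)')

# Without "?": jq stops with an error when it tries to index the string.
if printf '%s' "$data" | jq "$defs"' count(.[]; .spam_score > 40 or .sapm_score > 40)' >/dev/null 2>&1
then strict=ok
else strict=failed
fi

echo "lenient: $lenient, strict: $strict"
```

Note that null is harmless even without the "?", since indexing null by key just yields null.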
Of course all the above assumes the input is valid JSON. If that is not the case, then please see https://github.com/stedolan/jq/wiki/FAQ#processing-not-quite-valid-json
If the input is a stream of JSON objects ...
The revised question indicates that the input is a stream of JSON objects (originally it was said to be an array of JSON objects). In that case the above solutions are easily adapted, depending on your version of jq. If your version of jq has inputs, then (2) is recommended.
(1) All versions: use the -s (--slurp) command-line option, which reads the whole stream into a single array so that the filters above work unchanged.
(2) If your jq has inputs: use the -n command-line option, and change .[] above to inputs, e.g.
count(inputs; .spam_score? > 40 or .sapm_score? > 40)
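The two options can be compared side by side with a sketch like the following. The three-object stream is made up for illustration, and option (2) assumes a jq that provides inputs.

```shell
# Hypothetical stream of JSON objects, one per line.
stream='{"spam_score": 40.776}
{"spam_score": 17.376}
{"sapm_score": 99}'

defs='def count(s): reduce s as $_ (0; .+1);
      def count(stream; cond): count(stream | cond // empty);'

# Option (1): -s slurps the stream into one array, so .[] works as before.
slurped=$(printf '%s\n' "$stream" | jq -s "$defs"' count(.[]; .spam_score? > 40 or .sapm_score? > 40)')

# Option (2): -n reads nothing up front; inputs then yields every object in turn.
streamed=$(printf '%s\n' "$stream" | jq -n "$defs"' count(inputs; .spam_score? > 40 or .sapm_score? > 40)')

echo "slurped: $slurped, streamed: $streamed"
```

The -n matters here: without it, jq reads the first object as "." before the program runs, so inputs would only see the remaining ones.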
Filter the items that satisfy the condition then get the length.
map(select(.sapm_score > 40)) | length
Here is one way:
reduce .[] as $s(0; if $s.spam_score > 40 then .+1 else . end)
If, instead of an array, the input is a sequence of newline-delimited objects (JSON Lines), then
reduce inputs as $s(0; if $s.spam_score > 40 then .+1 else . end)
will work if jq is invoked with the -n flag. Here is an example:
$ cat data.json
{ "spam_score":40.776 }
{ "spam_score":17.376 }
$ jq -Mn 'reduce inputs as $s(0; if $s.spam_score > 40 then .+1 else . end)' data.json
1