Converting CSV to JSON in bash

The right tool for this job is jq.

jq -Rsn '
  {"occurrences":
    [inputs
     | . / "\n"
     | (.[] | select(length > 0) | . / ";") as $input
     | {"position": [$input[0], $input[1]], "taxo": {"espece": $input[2]}}]}
' <se.csv

emits, given your input:

{
  "occurrences": [
    {
      "position": [
        "-21.3214077",
        "55.4851413"
      ],
      "taxo": {
        "espece": "Ruizia cordata"
      }
    },
    {
      "position": [
        "-21.3213078",
        "55.4849803"
      ],
      "taxo": {
        "espece": "Cossinia pinnata"
      }
    }
  ]
}

By the way, a less-buggy version of your original script might look like:

#!/usr/bin/env bash
items=( )
while IFS=';' read -r lat long pos _; do
  printf -v item '{ "position": [%s, %s], "taxo": {"espece": "%s"}}' "$lat" "$long" "$pos"
  items+=( "$item" )
done <se.csv

IFS=','
printf '{"occurrences": [%s]}\n' "${items[*]}"

Note:

There's absolutely no point using cat to pipe into a loop (and good reasons not to); thus, we're using a redirection (<) to open the file directly as the loop's stdin.
read can be passed a list of destination variables; there's thus no need to read into an array (or first to read into a string, then generate a herestring, and then read from that into an array). The _ at the end ensures that extra columns are discarded (by putting them into the dummy variable named _) rather than appended to pos.
"${array[*]}" generates a string by joining the elements of array with the first character of IFS; we can thus use this to ensure that commas are present in the output only when they're needed.
printf is used in preference to echo, as advised in the APPLICATION USAGE section of the specification for echo itself.
This is still inherently buggy since it's generating JSON via string concatenation. Don't use it.
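
To make that last point concrete, here is a minimal Python sketch. The helper names (concat_item, dumps_item, is_valid_json) and the species name containing a double quote are made up for illustration; the only claim is that pasting values into a template breaks as soon as a value needs escaping, while a JSON serializer handles it.

```python
import json

def concat_item(lat, lon, species):
    # Mirrors the printf approach: values are pasted into a template unescaped.
    return '{ "position": [%s, %s], "taxo": {"espece": "%s"}}' % (lat, lon, species)

def dumps_item(lat, lon, species):
    # A serializer escapes quotes, backslashes and control characters.
    return json.dumps({"position": [lat, lon], "taxo": {"espece": species}})

def is_valid_json(text):
    try:
        json.loads(text)
        return True
    except ValueError:  # json.JSONDecodeError is a subclass of ValueError
        return False

# Hypothetical row: the species name contains a double quote.
row = ("-21.3214077", "55.4851413", 'Ruizia "cordata"')

print(is_valid_json(concat_item(*row)))  # False
print(is_valid_json(dumps_item(*row)))   # True
```

The same failure applies to the bash version above whenever a field contains a double quote, a backslash, or a newline.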


Here's a Python one-liner/script that'll do the trick:

cat my.csv | python -c 'import csv, json, sys; print(json.dumps([dict(r) for r in csv.DictReader(sys.stdin)]))'
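
Spelled out as a small script, the one-liner looks roughly like the sketch below. The function name csv_to_json, its fieldnames and delimiter parameters, and the sample data are assumptions added here, not part of the answer. Note that csv.DictReader takes its keys from the first row by default, so the one-liner expects a comma-delimited file with a header; for a headerless, semicolon-delimited file like the one in the question, the keys and the delimiter have to be passed explicitly.

```python
import csv
import io
import json

def csv_to_json(text, fieldnames=None, delimiter=","):
    # With fieldnames=None, DictReader uses the first row as the keys.
    reader = csv.DictReader(io.StringIO(text), fieldnames=fieldnames, delimiter=delimiter)
    return json.dumps([dict(row) for row in reader])

# Hypothetical comma-delimited input with a header row.
with_header = "lat,long,espece\n-21.3214077,55.4851413,Ruizia cordata\n"
print(csv_to_json(with_header))

# Headerless, semicolon-delimited input: supply the keys yourself.
no_header = "-21.3214077;55.4851413;Ruizia cordata\n"
print(csv_to_json(no_header, fieldnames=["lat", "long", "espece"], delimiter=";"))
```

Both calls print a JSON array holding one flat object per row; neither produces the nested "occurrences" structure from the question, which still needs a separate reshaping step.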


The accepted answer uses jq to parse the input. This works, but jq does not understand CSV quoting. A CSV produced by Excel or similar tools quotes any field that contains the delimiter, so a line like this:

foo,"bar,baz",gaz

will produce incorrect output, because splitting on the delimiter in jq yields 4 fields, not 3.
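
The difference is easy to reproduce. The following sketch uses Python's csv module purely as an example of a quote-aware parser; it is not what jq does internally, only a comparison between naive splitting and real CSV parsing on the line shown above.

```python
import csv
import io

line = 'foo,"bar,baz",gaz'

# Naive splitting on the delimiter ignores the quotes.
naive = line.split(",")

# A CSV parser treats the quoted section as a single field.
parsed = next(csv.reader(io.StringIO(line)))

print(naive)   # ['foo', '"bar', 'baz"', 'gaz']
print(parsed)  # ['foo', 'bar,baz', 'gaz']
```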

One option is to use tab-separated values instead of commas (as long as your input data doesn't contain tabs!), together with the approach from the accepted answer.

Another option is to combine your tools, and use the best tool for each part: a CSV parser for reading the input and turning it into JSON, and jq for transforming the JSON into the target format.

The python-based csvkit will intelligently parse the CSV, and comes with a tool csvjson which will do a much better job of turning the CSV into JSON. This can then be piped through jq to convert the flat JSON output by csvkit into the target form.

With the data provided by the OP, for the desired output, this is as simple as:

csvjson --no-header-row | jq '{occurrences: [.[] | {position: [.a, .b], taxo: {espece: .c}}]}'

Note that csvjson automatically detects ; as the delimiter, and without a header row in the input, assigns the json keys as a, b, and c.
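
For readers without csvkit installed, the two-stage idea (a CSV parser first, a reshaping step second) can be imitated with the Python standard library. This is only a sketch: parse_stage and transform_stage are hypothetical names, the delimiter is passed explicitly rather than detected, and every value is kept as a string, which may differ from what csvjson emits.

```python
import csv
import io
import json
import string

def parse_stage(text, delimiter=";"):
    # Stage 1: a CSV parser turns headerless rows into flat objects
    # keyed a, b, c, ... (the key naming described above).
    rows = csv.reader(io.StringIO(text), delimiter=delimiter)
    return [dict(zip(string.ascii_lowercase, row)) for row in rows if row]

def transform_stage(flat):
    # Stage 2: reshape the flat objects into the target structure
    # (the role jq plays in the pipeline).
    return {"occurrences": [
        {"position": [r["a"], r["b"]], "taxo": {"espece": r["c"]}}
        for r in flat
    ]}

sample = (
    "-21.3214077;55.4851413;Ruizia cordata\n"
    "-21.3213078;55.4849803;Cossinia pinnata\n"
)
print(json.dumps(transform_stage(parse_stage(sample)), indent=2))
```

Keeping the two stages separate means the quoting rules live entirely in the parser, and the reshaping code never has to look at raw text.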

The same also applies to writing to CSV files -- csvkit can read a JSON array or new-line delimited JSON, and intelligently output a CSV via in2csv.
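
The reverse direction can be sketched the same way with the standard library. This is not in2csv, just an illustration of why a real CSV writer matters: it quotes fields that contain the delimiter. The function name json_array_to_csv and the sample record are made up, and the sketch assumes a non-empty JSON array of flat objects that all share the same keys.

```python
import csv
import io
import json

def json_array_to_csv(text):
    # Assumes a non-empty JSON array of flat objects with identical keys.
    records = json.loads(text)
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=list(records[0]), lineterminator="\n")
    writer.writeheader()
    writer.writerows(records)
    return out.getvalue()

sample_json = '[{"a": "foo", "b": "bar,baz", "c": "gaz"}]'
print(json_array_to_csv(sample_json))
# a,b,c
# foo,"bar,baz",gaz
```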
