Remove escape sequence characters like newline, tab and carriage return from JSON file


A pure jq solution:

$ jq -r '.content.message | gsub("[\\n\\t]"; "")' file.json
ERROR LALALLAERROR INFO NANANANSOME MORE ERROR INFOBABABABABABBA BABABABA ABABBABAA BABABABAB

If you want to keep the enclosing " characters, omit -r.
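To try the command without the original file, here is a minimal sketch. The sample document is hypothetical (its message text is made up and merely stands in for file.json); only the jq filter comes from the answer above.

```shell
# Hypothetical sample input. Inside the single quotes, \n and \t reach jq
# as JSON escape sequences, not as literal control characters.
sample='{"content":{"message":"ERROR A\nERROR INFO B\n\tMORE INFO"}}'

# With -r: the raw string, without the enclosing double quotes.
raw=$(printf '%s\n' "$sample" | jq -r '.content.message | gsub("[\\n\\t]"; "")')
printf '%s\n' "$raw"

# Without -r: the result is still printed as a JSON string, quotes included.
quoted=$(printf '%s\n' "$sample" | jq '.content.message | gsub("[\\n\\t]"; "")')
printf '%s\n' "$quoted"
```

The doubled backslashes are needed because the pattern is itself a jq string literal: jq turns "\\n" into the two-character regex escape \n, which the regex engine then matches against an actual newline.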

Note: peak's helpful answer contains a generalized regular expression that matches all control characters in the ASCII and Latin-1 Unicode range by way of a Unicode category specifier, \p{Cc}. jq uses the Oniguruma regex engine.


Other solutions use an additional utility, such as sed or tr.

Using sed to unconditionally remove the escape sequences \n and \t:

$ jq '.content.message' file.json | sed 's/\\[tn]//g'
"ERROR LALALLAERROR INFO NANANANSOME MORE ERROR INFOBABABABABABBA BABABABA ABABBABAA BABABABAB"

Note that the enclosing " are still there, however. To remove them, add another substitution to the sed command:

$ jq '.content.message' file.json | sed 's/\\[tn]//g; s/"\(.*\)"/\1/'
ERROR LALALLAERROR INFO NANANANSOME MORE ERROR INFOBABABABABABBA BABABABA ABABBABAA BABABABAB

A simpler option that also removes the enclosing " (note: output has no trailing \n):

$ jq -r '.content.message' file.json | tr -d '\n\t'
ERROR LALALLAERROR INFO NANANANSOME MORE ERROR INFOBABABABABABBA BABABABA ABABBABAA BABABABAB

Note how -r makes jq print the raw string, expanding the \n and \t escape sequences into actual newline and tab characters, which tr then removes as literals.
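The difference between the two approaches can be sketched without jq at all. The strings below are hypothetical stand-ins for what jq prints without and with -r, respectively:

```shell
# Stand-in for jq output without -r: a JSON string literal whose
# escape sequences are still backslash-letter pairs.
escaped='"line one\nline two\tend"'
from_sed=$(printf '%s\n' "$escaped" | sed 's/\\[tn]//g; s/"\(.*\)"/\1/')

# Stand-in for jq -r output: printf expands \n and \t in its format
# string into real newline and tab characters, which tr deletes.
from_tr=$(printf 'line one\nline two\tend\n' | tr -d '\n\t')

printf '%s\n' "$from_sed"
printf '%s\n' "$from_tr"
```

Both commands end up with the same text. One caveat with the sed variant: because it works on the escaped form, it also rewrites a JSON-escaped backslash followed by t or n (for example, the output "C:\\new" would lose its \n pair), which the tr variant does not.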


With your input, the following incantation:

$ jq 'walk(if type == "string" then gsub("\\p{Cc}"; "<>") else . end)' 

produces:

{
  "HOSTNAME": "server1.example",
  "content": {
    "message": "ERROR LALALLA<>ERROR INFO NANANAN<>SOME MORE ERROR INFO<>BABABABABABBA<> BABABABA<> ABABBABAA<><> BABABABAB<><>"
  },
  "level": "WARN",
  "level_value": 30000,
  "logger_name": "server1.example.adapter"
}

Of course, the above invocation is just illustrative:

  • you might not need to use walk/1 at all. (walk/1 walks the input JSON.)
  • you might want to use a different character class, or specify a pipeline of gsub/2 invocations.
  • if you simply want to excise the control characters, specify "" as the second argument of gsub/2.
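The variations in the list above might look like this. This is a sketch against a hypothetical sample document (the message text is made up), targeting a single field instead of using walk/1:

```shell
# Hypothetical sample input with JSON-escaped newlines and a tab.
sample='{"content":{"message":"ERROR A\nERROR INFO B\n\tMORE INFO"}}'

# No walk/1: address one field and excise every control character
# by passing "" as the replacement.
excised=$(printf '%s\n' "$sample" | jq -r '.content.message | gsub("\\p{Cc}"; "")')
printf '%s\n' "$excised"

# A pipeline of gsub/2 calls: newlines become a space, tabs are dropped.
piped=$(printf '%s\n' "$sample" | jq -r '.content.message | gsub("\\n"; " ") | gsub("\\t"; "")')
printf '%s\n' "$piped"
```

The pipeline form is handy when different characters need different replacements, at the cost of one pass over the string per gsub/2 call.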

If you do want to use walk/1 but your jq does not have it, then simply add its definition (easily available on the web) before its invocation.
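For illustration, this is one widely circulated definition of walk/1, placed in front of the filter. It is a sketch and not necessarily identical to the builtin shipped with any particular jq release; the sample input is hypothetical. On a jq that already has walk/1, the local definition simply takes precedence.

```shell
# Hypothetical sample input.
sample='{"content":{"message":"ERROR A\nERROR INFO B"},"level":"WARN"}'

out=$(printf '%s\n' "$sample" | jq -c '
  # Apply f bottom-up to every value in the input.
  def walk(f):
    . as $in
    | if type == "object" then
        reduce keys[] as $key ({}; . + {($key): ($in[$key] | walk(f))}) | f
      elif type == "array" then map(walk(f)) | f
      else f
      end;

  walk(if type == "string" then gsub("\\p{Cc}"; "<>") else . end)
')
printf '%s\n' "$out"
```

Because this definition iterates over keys, object keys come out in sorted order; only string values are changed, so "level" passes through untouched.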