
Replace multiple consecutive white spaces with one comma in Unix


If you want to use sed, you can use this one:

$ sed 's/ \{1,\}/,/g' file
SNP,A1,A2,FRQ,INFO,OR,SE,P
10:33367054,C,T,0.9275,0.9434,1.1685,0.1281,0.1843
10:33367707,G,A,0.9476,0.9436,1.0292,0.1530,0.8244
10:33367804,G,C,0.4193,1.0443,0.9734,0.0988,0.6443
10:33368119,C,A,0.9742,0.9343,1.0201,0.1822,0.9156

It is based on glenn jackman's answer to "How to strip multipe spaces to one using sed?".

To match any whitespace (tabs as well as spaces), use the [[:space:]] character class instead:

sed 's/[[:space:]]\{1,\}/,/g' file

Note that sed -i.bak '...' file edits the file in place: the original is backed up as file.bak, and file receives the edited content.
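As a minimal sketch of the in-place edit, the commands below work on a throwaway copy. The sample data and the temporary directory are made up for illustration and are not the original file:

```shell
# Hypothetical sample data with uneven spacing (not the original file).
tmpdir=$(mktemp -d)
printf 'SNP    A1  A2\n10:33367054   C T\n' > "$tmpdir/file"

# Preview the result on standard output; the file itself is untouched.
sed 's/ \{1,\}/,/g' "$tmpdir/file"

# Edit in place, keeping the original as file.bak.
sed -i.bak 's/ \{1,\}/,/g' "$tmpdir/file"

cat "$tmpdir/file"       # comma-separated content
cat "$tmpdir/file.bak"   # original, space-separated content
```

Running the preview first is a cheap way to check the expression before letting sed overwrite anything.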


But I think it is clearer with tr: squeeze each run of spaces into a single space, then replace each remaining space with a comma:

$ tr -s ' ' < file | tr ' ' ','
SNP,A1,A2,FRQ,INFO,OR,SE,P
10:33367054,C,T,0.9275,0.9434,1.1685,0.1281,0.1843
10:33367707,G,A,0.9476,0.9436,1.0292,0.1530,0.8244
10:33367804,G,C,0.4193,1.0443,0.9734,0.0988,0.6443
10:33368119,C,A,0.9742,0.9343,1.0201,0.1822,0.9156

Step by step, the first tr on its own gives:

$ tr -s ' ' < file
SNP A1 A2 FRQ INFO OR SE P
10:33367054 C T 0.9275 0.9434 1.1685 0.1281 0.1843
10:33367707 G A 0.9476 0.9436 1.0292 0.1530 0.8244
10:33367804 G C 0.4193 1.0443 0.9734 0.0988 0.6443
10:33368119 C A 0.9742 0.9343 1.0201 0.1822 0.9156

From man tr:

tr [OPTION]... SET1 [SET2]

Translate, squeeze, and/or delete characters from standard input, writing to standard output.

-s, --squeeze-repeats

replace each input sequence of a repeated character that is listed in SET1 with a single occurrence of that character
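The two tr calls can probably be folded into one. When -s is given together with two sets, tr translates first and then squeezes repeats of the characters in the second set. The sketch below compares both forms on a made-up line (the variable names are only for illustration):

```shell
# Hypothetical one-line sample with runs of spaces.
line='10:33367054   C    T  0.9275'

# Two steps: squeeze the spaces, then translate them to commas.
two_step=$(printf '%s\n' "$line" | tr -s ' ' | tr ' ' ',')

# One step: translate spaces to commas, then squeeze repeated commas.
one_step=$(printf '%s\n' "$line" | tr -s ' ' ',')

printf '%s\n%s\n' "$two_step" "$one_step"
```

One caveat with the single-command form: it also collapses runs of commas that were already present in the input, so the two-step pipeline is the safer choice if the data may contain commas.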


If you enable extended regular expressions with -r (GNU sed), you can write \s+, where \s matches a whitespace character and + means one or more:

$ sed -r 's/\s+/,/g' file.txt
SNP,A1,A2,FRQ,INFO,OR,SE,P
10:33367054,C,T,0.9275,0.9434,1.1685,0.1281,0.1843
10:33367707,G,A,0.9476,0.9436,1.0292,0.1530,0.8244
10:33367804,G,C,0.4193,1.0443,0.9734,0.0988,0.6443
10:33368119,C,A,0.9742,0.9343,1.0201,0.1822,0.9156

For reference:

-r, --regexp-extended    use extended regular expressions in the script.

Note: On Mac OS X, sed is the BSD version, which lacks the GNU extensions, so you have to use the -E flag instead of -r:

-E    Interpret regular expressions as extended (modern) regular expressions rather than basic regular expressions (BRE's). The re_format(7) manual page fully describes both formats.
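Since \s is itself a GNU extension, a form that should behave the same on both systems is -E combined with the POSIX class [[:space:]]. This is a sketch on made-up input, assuming your sed accepts -E (BSD sed does, and to my knowledge recent GNU sed releases do too):

```shell
# Hypothetical sample mixing spaces and a tab between fields.
result=$(printf 'SNP  \tA1   A2\n' | sed -E 's/[[:space:]]+/,/g')

printf '%s\n' "$result"   # prints SNP,A1,A2
```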


Here is a very simple solution with awk:

$ awk '{$1=$1}1' OFS=, file
SNP,A1,A2,FRQ,INFO,OR,SE,P
10:33367054,C,T,0.9275,0.9434,1.1685,0.1281,0.1843
10:33367707,G,A,0.9476,0.9436,1.0292,0.1530,0.8244
10:33367804,G,C,0.4193,1.0443,0.9734,0.0988,0.6443
10:33368119,C,A,0.9742,0.9343,1.0201,0.1822,0.9156

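Assigning $1=$1 makes awk rebuild each line, joining its fields with the output field separator (OFS, here a comma). Because awk's default field splitting treats any run of spaces or tabs as a single separator and ignores leading and trailing whitespace, every run collapses into one comma. The trailing 1 is an always-true pattern that prints the rebuilt line.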
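One practical difference from the sed version shows up when a line starts or ends with spaces. The comparison below uses a made-up line, and the variable names are only illustrative:

```shell
# Hypothetical line with leading and trailing spaces.
line='  10:33367054   C    T  '

# awk drops the surrounding whitespace when it splits the fields.
with_awk=$(printf '%s\n' "$line" | awk -v OFS=, '{$1=$1}1')

# sed turns every run of spaces into a comma, including the outer ones.
with_sed=$(printf '%s\n' "$line" | sed 's/ \{1,\}/,/g')

printf 'awk: %s\nsed: %s\n' "$with_awk" "$with_sed"
# awk: 10:33367054,C,T
# sed: ,10:33367054,C,T,
```

If the input may carry stray leading or trailing whitespace, the awk form avoids the empty first and last columns that sed would produce.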