rearrange data on a file (NOT a direct transpose)

Using GNU awk for true multi-dimensional arrays:

$ cat tst.awkNR==1 { split($0,hdr); next }{    idx = (NR-2)%4+1    val[idx][0]    split($0,val[idx])}NR==5 {    printf "%s", hdr[1]    for (j=3; j in hdr; j++) {        for (i=1; i<=idx; i++) {            printf "%s%s", OFS, hdr[j]"/"val[i][2]        }    }    print ""}idx==4 {    printf "%s", $1    for (j=3; j<=NF; j++) {        for (i=1; i<=idx; i++) {            printf "%s%s", OFS, val[i][j]        }    }    print ""}$ awk -f tst.awk fileNAME A/Y1 A/Y2 A/Y3 A/Y4 B/Y1 B/Y2 B/Y3 B/Y4 C/Y1 C/Y2 C/Y3 C/Y4JOHN 10,00 11,00 12,00 13,00 19,00 23,00 33,00 34,00 65,00 64,00 34,00 32,00PAUL 14,00 15,00 16,00 20,00 43,00 90,00 32,00 45,00 23,00 34,00 56,00 65,00RINGO 25,00 24,00 31,00 75,00 60,00 30,00 20,00 10,00 87,00 23,00 54,00 12,00

bash shell awk sed gawk

$ cat foo.awkNR==1{next}                                              # skip the header{    printf "%s", (b!=$1?(b==""?"":ORS) $1:"") OFS; b=$1  # print name or OFS} {    printf "%s", $3 OFS $4 OFS $5                        # print fields} END {print ""}                                           # finish with ORS

Spin it:

$ awk -f foo.awk foo.txtJOHN 10,00 19,00 65,00 11,00 23,00 64,00 12,00 33,00 34,00 13,00 34,00 32,00PAUL 14,00 43,00 23,00 15,00 90,00 34,00 16,00 32,00 56,00 20,00 45,00 65,00RINGO 25,00 60,00 87,00 24,00 30,00 23,00 31,00 20,00 54,00 75,00 10,00 12,00

bash shell awk sed gawk

My proposal in python (sed is clearly not up to the task, maybe awk, but that's a challenge). I have hardcoded the 4*3 aspect of the "matrix". Maybe something more elegant could have been done:

import collectionsnb_year = 4d = collections.defaultdict(lambda: [None]*nb_year*3)    with open("input_file") as infile:    next(infile)  # skip title    for l in infile:  # read line by line        fields = l.strip().split()  # extract blank-separated fields        if len(fields)<3: continue  # protection against "accidental" blank lines        target = d[fields[0]]       # name        offset = int(fields[1][1])-1    # extract year index 1 to 4        for i,f in enumerate(fields[2:]):  # interleaved matrix fill            target[offset+i*nb_year] = f      # fill "matrix"    print("NAME A/Y1 A/Y2 A/Y3 A/Y4 B/Y1 B/Y2 B/Y3 B/Y4 C/Y1 C/Y2 C/Y3 C/Y4")    for k,v in sorted(d.items()):        print("{} {}".format(k," ".join(v)))

output

NAME A/Y1 A/Y2 A/Y3 A/Y4 B/Y1 B/Y2 B/Y3 B/Y4 C/Y1 C/Y2 C/Y3 C/Y4JOHN 10,00 11,00 12,00 13,00 19,00 23,00 33,00 34,00 65,00 64,00 34,00 32,00PAUL 14,00 15,00 16,00 20,00 43,00 90,00 32,00 45,00 23,00 34,00 56,00 65,00RINGO 25,00 24,00 31,00 75,00 60,00 30,00 20,00 10,00 87,00 23,00 54,00 12,00

CodeHunter

rearrange data on a file (NOT a direct transpose)

Recent Posts

How can I color dots in a xy scatterplot according to column value?

How to update a claim in ASP.NET Identity?

What does {0} mean when initializing an object?

Accessing members of items in a JSONArray with Java

How to log SQL statements in Spring Boot?

Powershell Get-WebSite name parameter is ignored

How to detect scroll to bottom of html element

Java synchronized method

How to test controllers with CodeIgniter?

Detect Visual Composer

Matplotlib: Specify format of floats for tick labels

Rails join a list of strings with commas and "and" before the last