Easiest way to join two files from the unix command line, inserting zero entries for missing keys


join -o 0,1.2,2.2 -e 0 -a1 -a2 a.txt b.txt
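  • -o 0,1.2,2.2 → output the join field, then the 2nd field of the 1st file, then the 2nd field of the 2nd file.
  • -e 0 → output 0 in place of empty or missing fields.
  • -a1 -a2 → also print unpairable lines from file 1 and file 2, so keys that appear in only one file are kept.

Both files must already be sorted on the join field.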
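To see what the options do together, here is a small sketch. The file contents are made up for illustration, and the temporary directory is only there so the example does not touch real files.

```shell
# Hypothetical sample inputs, each already sorted on the key in column 1.
tmp=$(mktemp -d)
printf '%s\n' 'apple 3' 'cherry 5' > "$tmp/a.txt"
printf '%s\n' 'apple 1' 'banana 2' > "$tmp/b.txt"

# LC_ALL=C pins the collation order so it matches how the sample files are sorted.
joined=$(LC_ALL=C join -o 0,1.2,2.2 -e 0 -a1 -a2 "$tmp/a.txt" "$tmp/b.txt")
printf '%s\n' "$joined"
rm -rf "$tmp"
```

With these inputs it prints `apple 3 1`, `banana 0 2` and `cherry 5 0`, one per line: banana is missing from the first file and cherry from the second, so each gets a 0 in the corresponding column.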


Write a script in whatever language you prefer. Parse both files into a map/hashtable/dictionary data structure (let's just say dictionary). Each dictionary has the first word as the key and the count (or even a string of counts) as the value. Here is some pseudocode of the algorithm:

    Dict fileA, fileB; // Already parsed

    while (!fileA.isEmpty()) {
        string check = fileA.top().key();
        int val1 = fileA.top().value();
        if (fileB.contains(check)) {
            printToFile(check + " " + val1 + " " + fileB.getValue(check));
            fileB.remove(check);
        }
        else {
            printToFile(check + " " + val1 + " 0");
        }
        fileA.pop();
    }

    while (!fileB.isEmpty()) {
        // Know key does not exist in fileA
        string check = fileB.top().key();
        int val1 = fileB.top().value();
        printToFile(check + " 0 " + val1);
        fileB.pop();
    }

You can use any kind of iterator to walk the data structure instead of pop and top. How you access the data will depend on the language and data structure you use.
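As one possible take on the dictionary approach, here is a rough sketch using awk's associative arrays from the shell. The function name `merge_counts` is made up for this example, and it assumes two whitespace-separated columns per line (key, then value). Unlike join, the inputs do not need to be sorted; if a key repeats within a file, the last value wins.

```shell
# merge_counts FILE_A FILE_B   (hypothetical helper name)
# Prints "key valueA valueB", using 0 where a key is missing from one file.
merge_counts() {
  awk '
    # Lines from the first file go into dictionary a.
    FILENAME == ARGV[1] { a[$1] = $2; next }
    # Lines from the second file go into dictionary b.
    { b[$1] = $2 }
    END {
      # Keys in a, with the value from b if present, otherwise 0.
      for (k in a) print k, a[k], ((k in b) ? b[k] : 0)
      # Keys that exist only in b.
      for (k in b) if (!(k in a)) print k, 0, b[k]
    }
  ' "$1" "$2" | LC_ALL=C sort   # awk iterates keys in unspecified order, so sort the result
}
```

Call it as `merge_counts a.txt b.txt`. Both files are held in memory, so for very large sorted inputs the join approach is probably the better fit.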


@ninjalj's answer is much saner, but here's a shell script implementation just for fun:

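    #!/bin/bash
    # Merge-join two files sorted on their first field; missing values become 0.
    exec 8< a.txt
    exec 9< b.txt
    k1= k2=
    while true; do
      # Refill whichever side was consumed on the previous iteration.
      if [ -z "$k1" ]; then
        read -r k1 v1 <&8
      fi
      if [ -z "$k2" ]; then
        read -r k2 v2 <&9
      fi
      # Both files are exhausted.
      if [ -z "$k1$k2" ]; then break; fi
      if [ "$k1" = "$k2" ]; then
        echo "$k1 $v1 $v2"
        k1=
        k2=
      elif [ -z "$k2" ] || { [ -n "$k1" ] && [[ "$k1" < "$k2" ]]; }; then
        # Key only in a.txt, or b.txt has run out (without the -z "$k2"
        # check the loop never terminates when a.txt is the longer file).
        echo "$k1 $v1 0"
        k1=
      else
        # Key only in b.txt, or a.txt has run out.
        echo "$k2 0 $v2"
        k2=
      fi
    done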