Unix: extracting data from a .dat file and inserting into SQL database?


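This is not a good problem for grep and sed. I recommend awk. An untested first cut, assuming each line holds a tag followed by its value (for example <Name> Name_1), so that the value is the second field:

awk '/<Name>/ {name = $2} /<Email>/ {emails[name] = $2} END {for (n in emails) {print n, emails[n]}}' *.dat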

To print insert commands instead of the plain pairs, you could also try this as the END block:

END {for (n in emails) {print "sqlite db.sql INSERT INTO users VALUES (\"" n "\", \"" emails[n] "\");"}}
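As a slightly fuller sketch of the same awk idea, the snippet below writes the INSERT statements to a file rather than printing shell commands. The sample input, the file names sample.dat and users.sql, and the two-column users table are assumptions made for illustration; the real .dat layout may differ. It keeps the whole value after the tag (so names with spaces survive) and doubles single quotes so the values are valid SQL string literals.

```shell
# Hypothetical sample input; the real .dat layout is an assumption.
cat > sample.dat <<'EOF'
<Name> Alice Smith
<Email> alice@example.com
<Name> Bob O'Neil
<Email> bob@example.com
EOF

# For every <Email> line, emit one INSERT using the most recent <Name>.
# q holds a single quote so the awk program itself can stay single-quoted.
awk -v q="'" '
  function val(line) {
    sub(/^[ \t]*<[^>]*>[ \t]*/, "", line)   # drop the leading "<Tag> " prefix
    gsub(q, q q, line)                      # double single quotes for SQL
    return line
  }
  /<Name>/  { name = val($0) }
  /<Email>/ { printf "INSERT INTO users VALUES (%s%s%s, %s%s%s);\n", q, name, q, q, val($0), q }
' sample.dat > users.sql

cat users.sql
```

If the sqlite3 command-line shell is available and the table already exists, the generated file could then be loaded in a single run with something like sqlite3 db.sqlite < users.sql, which avoids starting one process per row.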


Seems like you are a great fan of grep. Give it a try:

grep -Po '(?<=(Name|mail)>[\t\s])(.*)$' file | `xargs -n2 printf "sqlite db.sql INSERT INTO users VALUES (%s, %s)\n"`

The first part uses a positive lookbehind to fetch the relevant info. Lookbehind doesn't support variable lengths, which is why mail is used instead of Email. It outputs:

Name_1
Email_1
Name_2
Email_2

The xargs -n2 combines each name and email as follows:

Name_1 Email_1
Name_2 Email_2

printf then formats each pair into a command line, which the backticks execute. Hope it helps.

Now please don't tell me your grep doesn't support -P ;-)
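If your grep really does lack -P, roughly the same pairing can be sketched with sed and paste, which are available on most systems. This is only an illustration: the sample input, the file names sample2.dat and pairs.sql, and the users table are assumptions, and values containing single quotes or tabs are not handled here.

```shell
# Hypothetical sample input; tag names and layout are assumptions.
cat > sample2.dat <<'EOF'
<Name> Name_1
<Email> Email_1
<Name> Name_2
<Email> Email_2
EOF

tab=$(printf '\t')

# sed -n prints only lines where a substitution succeeded, leaving the bare
# values; paste - - joins every two lines with a tab; the loop formats them.
sed -n -e 's/^[[:space:]]*<Name>[[:space:]]*//p' \
       -e 's/^[[:space:]]*<Email>[[:space:]]*//p' sample2.dat |
  paste - - |
  while IFS="$tab" read -r name email; do
    printf "INSERT INTO users VALUES ('%s', '%s');\n" "$name" "$email"
  done > pairs.sql

cat pairs.sql
```

Using paste - - instead of xargs -n2 keeps values that contain spaces together, since it pairs whole lines rather than whitespace-separated words.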


You can do it in (GNU) sed, although the awk script is much simpler.

dat2sql.sed:

/<NAME>/I H             # store name
/<EMAIL>/I {
  H;                    # store email
  g                     # get stored strings
  s/<[^>]+>\s+//gI;     # remove <NAME> and <EMAIL>
  s/^\n/sqlite db.sql INSERT INTO users VALUES ("/;
  s/\n/", "/;
  s/$/" );/;
  p                     # print results
  s/.*//g;
  x;                    # clear hold space
}

Use it like this: sed -rn -f dat2sql.sed your_file.

The prerequisite is that Name is before Email for each record in the file.
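Since that ordering is assumed rather than checked, it may be worth verifying it before generating any SQL. A minimal sketch in awk, assuming the same <Name>/<Email> line layout; the sample input and the file names records.dat and problems.txt are made up for illustration. Unlike the sed script, the patterns here are case-sensitive.

```shell
# Hypothetical input with two deliberate problems.
cat > records.dat <<'EOF'
<Name> Name_1
<Email> Email_1
<Email> Email_2
<Name> Name_3
EOF

# pending is 1 while a <Name> is still waiting for its <Email>.
awk '
  /<Name>/ {
    if (pending) print "line " pending_line ": Name without Email"
    pending = 1; pending_line = NR
    next
  }
  /<Email>/ {
    if (!pending) print "line " NR ": Email without preceding Name"
    pending = 0
  }
  END { if (pending) print "line " pending_line ": Name without Email" }
' records.dat > problems.txt

cat problems.txt
```

An empty problems.txt would suggest that every record has its Name immediately before its Email, which is what the sed script relies on.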