Print lines in one file matching patterns in another file

Try grep -Fwf file2 file1 > out

The -F option specifies plain string matching, so it should be faster because it doesn't have to engage the regex engine. The -w option restricts matches to whole words, and -f reads the patterns from file2.
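To make the effect of those options concrete, here is a small sketch on made-up data (the file contents below are hypothetical, not the asker's real files). It shows that with -F the pattern "a.c" is taken literally, and that with -w "scign000003" does not match inside "scign0000031".

```shell
# Hypothetical sample data, written to a scratch directory.
tmp=$(mktemp -d)
printf '%s\n' 'scign000003' 'a.c' > "$tmp/file2"
printf '%s\n' 'x scign000003 1' 'x scign0000031 2' 'y abc 3' 'y a.c 4' > "$tmp/file1"

# -F: patterns are literal strings, so "a.c" does not match "abc".
# -w: a match must be a whole word, so "scign000003" does not match "scign0000031".
# -f: read the patterns, one per line, from file2.
grep_out=$(grep -Fwf "$tmp/file2" "$tmp/file1")
printf '%s\n' "$grep_out"

rm -rf "$tmp"
```

On this data only the lines "x scign000003 1" and "y a.c 4" are printed.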

Here's how to do it in awk:

awk 'NR==FNR{pats[$0]; next} $2 in pats' File2 File1

Using a 60,000 line File1 (your File1 repeated 8000 times) and a 6,000 line File2 (yours repeated 1200 times):

$ time grep -Fwf File2 File1 > ou2

real    0m0.094s
user    0m0.031s
sys     0m0.062s

$ time awk 'NR==FNR{pats[$0]; next} $2 in pats' File2 File1 > ou1

real    0m0.094s
user    0m0.015s
sys     0m0.077s

$ diff ou1 ou2

i.e. the awk is about as fast as the grep. One thing to note, though, is that the awk solution lets you pick a specific field to match on, so if anything from File2 shows up anywhere else in File1 you won't get a false match. It also matches on a whole field at a time, so if your target strings were of various lengths, "scign000003" would not match "scign0000031", for example (though the -w for grep gives similar protection for that).
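A rough illustration of that difference, on invented data (the three-column layout of File1 here is an assumption made for the example, with the key in field 2):

```shell
# Hypothetical sample data: the key is expected in field 2 of File1.
tmp=$(mktemp -d)
printf '%s\n' 'scign000003' > "$tmp/File2"
printf '%s\n' 'id1 scign000003 foo' 'id2 scign000009 scign000003' > "$tmp/File1"

# grep matches the string anywhere on the line, so both lines are selected.
grep_out=$(grep -Fwf "$tmp/File2" "$tmp/File1")

# awk tests only field 2 against the stored patterns, so the second line
# (where the string appears in field 3) is not selected.
awk_out=$(awk 'NR==FNR{pats[$0]; next} $2 in pats' "$tmp/File2" "$tmp/File1")

printf 'grep:\n%s\nawk:\n%s\n' "$grep_out" "$awk_out"
rm -rf "$tmp"
```

Here grep prints both lines while awk prints only "id1 scign000003 foo". Whether that matters depends on whether the strings in File2 can legitimately occur in other columns of your data.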

For completeness, here's the timing for the other awk solution posted elsethread:

$ time awk 'BEGIN{i=0}FNR==NR{a[i++]=$1;next}{for(j=0;j<i;j++)if(index($0,a[j]))print $0}' File2 File1 > ou3

real    3m34.110s
user    3m30.850s
sys     0m1.263s

and here's the timing I get for the perl script Mark posted:

$ time ./go.pl > out2

real    0m0.203s
user    0m0.124s
sys     0m0.062s


You could try with this awk:

awk 'BEGIN{i=0}
FNR==NR { a[i++]=$1; next }
{ for(j=0;j<i;j++)
    if(index($0,a[j]))
        {print $0;break}}' file2 file1

The FNR==NR part specifies that the stuff following it in curly braces is only to be applied when processing the first input file (file2). And it says to save all the words you are looking for in an array a[]. The bit in the second set of curly braces applies to the processing of the second file... as each line is read in, it is compared with all elements of a[] and if any are found, the line is printed. That's all folks!
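If the FNR==NR test is unfamiliar, this small sketch on throwaway files (names and contents are made up) prints both counters for every line. NR counts lines across all input files, while FNR restarts at 1 for each file, so the two are equal only while the first file is being read.

```shell
# Hypothetical two-line and three-line inputs.
tmp=$(mktemp -d)
printf '%s\n' 'a' 'b' > "$tmp/file2"
printf '%s\n' 'x' 'y' 'z' > "$tmp/file1"

# Label each line by whether FNR==NR holds, then show NR, FNR and the line.
nr_out=$(awk '{ which = (FNR==NR) ? "first" : "second"
               print which, NR, FNR, $0 }' "$tmp/file2" "$tmp/file1")
printf '%s\n' "$nr_out"

rm -rf "$tmp"
```

The lines from file2 are labelled "first" (NR and FNR both 1, then 2), and the lines from file1 are labelled "second" (NR runs 3 to 5 while FNR runs 1 to 3). One caveat: if the first file is empty, FNR==NR is also true while reading the second file, so the idiom assumes a non-empty pattern file.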
