
Regular expression - Ruby vs Perl


regex = Regexp.new(/(.*?) \|.*?SENDING REQUEST.*?TID=(.*?),/)
f = File.open( ARGV.shift ).each do |line|
    if regex.match(line)
        puts "#{$1}: #{$2}"
    end
end

Or

regex = Regexp.new(/(.*?) \|.*?SENDING REQUEST.*?TID=(.*?),/)
f = File.open( ARGV.shift )
f.each_line do |line|
  if regex.match(line)
    puts "#{$1}: #{$2}"
  end
end


One possible difference is the amount of backtracking being performed. Perl might do a better job of pruning the search tree when backtracking (i.e. noticing when part of a pattern can't possibly match). Its regex engine is highly optimised.

First, adding a leading «^» could make a huge difference. If the pattern doesn't match starting at position 0, it's not going to match at starting position 1 either! So don't try to match at position 1.
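A minimal sketch of the «^» point in Ruby. The log lines are invented for illustration (the real log format isn't shown here), and the constant names are mine. Note that in Ruby «^» anchors at the start of any line, while «\A» anchors at the start of the string; for a single line read from a file the two behave alike here.

```ruby
# Hypothetical log lines; the real format is not shown in the thread.
MATCHING_LINE     = "2011-03-01 10:15:02 | DEBUG SENDING REQUEST to host TID=12345, size=10"
NON_MATCHING_LINE = "2011-03-01 10:15:03 | DEBUG nothing interesting here"

UNANCHORED = /(.*?) \|.*?SENDING REQUEST.*?TID=(.*?),/
ANCHORED   = /^(.*?) \|.*?SENDING REQUEST.*?TID=(.*?),/

# Both patterns capture the same text on a matching line.
p UNANCHORED.match(MATCHING_LINE).captures  # ["2011-03-01 10:15:02", "12345"]
p ANCHORED.match(MATCHING_LINE).captures    # ["2011-03-01 10:15:02", "12345"]

# Both reject a non-matching line. The anchored pattern only has to attempt
# a match at the start of the line, while the unanchored one may be retried
# from every later position before failing.
p UNANCHORED.match(NON_MATCHING_LINE)  # nil
p ANCHORED.match(NON_MATCHING_LINE)    # nil
```

The result is the same either way; the anchor only changes how much work a failed match can cost.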

Along the same lines, «.*?» isn't as limiting as you might think, and replacing each instance of it with a more limiting pattern could prevent a lot of backtracking.

Why don't you try:

/
    ^
    (.*?)                       [ ]\|
    (?:(?!SENDING[ ]REQUEST).)* SENDING[ ]REQUEST
    (?:(?!TID=).)*              TID=
    ([^,]*)                     ,
/x
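For reference, here is a sketch of how that pattern could be dropped into Ruby. The method name `extract_tid` and the sample line are hypothetical, made up for the example; only the pattern comes from the answer.

```ruby
# The suggested pattern, written as a Ruby literal with the /x modifier.
TID_PATTERN = /
    ^
    (.*?)                       [ ]\|
    (?:(?!SENDING[ ]REQUEST).)* SENDING[ ]REQUEST
    (?:(?!TID=).)*              TID=
    ([^,]*)                     ,
/x

# Hypothetical helper: returns "prefix: tid" for a matching line, nil otherwise.
def extract_tid(line)
  m = TID_PATTERN.match(line)
  m && "#{m[1]}: #{m[2]}"
end

# Invented sample line; the real log format is not shown in the thread.
puts extract_tid("2011-03-01 10:15:02 | DEBUG SENDING REQUEST to host TID=12345, size=10")
# prints: 2011-03-01 10:15:02: 12345
```

Using the MatchData object instead of `$1` and `$2` is just a style choice; it avoids depending on global match variables.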

(Not sure if it was safe to replace the first «.*?» with «[^|]*», so I didn't.)

(At least for patterns that match a single string, (?:(?!PAT).) is to PAT as [^CHAR] is to CHAR.)
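A small illustration of that analogy in Ruby (the text and constant names are made up for the example): «[^,]*» stops just before the first «,», and «(?:(?!TID=).)*» stops just before the first «TID=».

```ruby
ANALOGY_TEXT = "alpha TID=1 beta TID=2, gamma"

# [^,]* consumes characters up to (not including) the first ",".
UP_TO_COMMA = /\A[^,]*/

# (?:(?!TID=).)* consumes characters one at a time, but only while the
# text at the current position does not start with "TID=".
UP_TO_TID = /\A(?:(?!TID=).)*/

p ANALOGY_TEXT[UP_TO_COMMA]  # "alpha TID=1 beta TID=2"
p ANALOGY_TEXT[UP_TO_TID]    # "alpha "
```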

If it's acceptable for «.» to match newlines, using /s (the Ruby equivalent is /m) could possibly speed things up, but I think the gain is pretty minor.
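One thing to watch when porting this to Ruby: the modifier that lets «.» match a newline is /m there, not Perl's /s. A quick check, with an invented two-line string:

```ruby
TWO_LINES = "first\nsecond"

# Without /m, "." refuses to match the newline between the two words.
p(/first.second/.match?(TWO_LINES))   # false

# With /m, "." matches any character, including the newline.
p(/first.second/m.match?(TWO_LINES))  # true
```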

Using «\space» instead of «[space]» to match a space under /x might be slightly faster in Ruby. (They're the same in recent versions of Perl.) I used the latter because it's far more readable.
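To make the two spellings concrete, here is a tiny Ruby check (the constant names are mine). It only shows that both forms match a literal space under /x; it says nothing about which one is faster.

```ruby
# Under /x an unescaped space in the pattern is ignored,
# so this pattern is really just "ab".
IGNORED_SPACE = /a b/x

# Two ways to match a literal space under /x.
ESCAPED_SPACE = /a\ b/x
CLASS_SPACE   = /a[ ]b/x

p IGNORED_SPACE.match?("ab")   # true
p IGNORED_SPACE.match?("a b")  # false
p ESCAPED_SPACE.match?("a b")  # true
p CLASS_SPACE.match?("a b")    # true
```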


From perlretut, chapter "Using regular expressions in Perl", section "Search and replace":

(Even though the regular expression appears in a loop, Perl is smart enough to compile it only once.)

I don't know Ruby very well, but I suspect that it compiles the regex on each iteration.
(Try the code from LaGrandMere's answer to verify it.)
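One way to check this without timing anything is to look at object identity. The sketch below assumes the standard MRI interpreter, and the method names are made up for the example. It counts how many distinct Regexp objects come out of evaluating the same literal three times. (In the code posted above the regex is built once before the loop in any case.)

```ruby
# A literal with no interpolation: on MRI the same Regexp object is reused.
def distinct_static
  Array.new(3) { /SENDING REQUEST/ }.map(&:object_id).uniq.size
end

# A literal with interpolation is rebuilt on every evaluation.
def distinct_interpolated(word)
  Array.new(3) { /SENDING #{word}/ }.map(&:object_id).uniq.size
end

# The /o modifier performs the interpolation only once and caches the result.
def distinct_interpolated_once(word)
  Array.new(3) { /SENDING #{word}/o }.map(&:object_id).uniq.size
end

p distinct_static                        # 1 on MRI
p distinct_interpolated("REQUEST")       # 3
p distinct_interpolated_once("REQUEST")  # 1
```

So, at least on MRI, a plain literal inside a loop is not recompiled on each pass; only interpolated literals without /o are.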