Regular expression for a string literal in flex/lex


A string consists of a quote mark

"

followed by zero or more of either an escaped character (a backslash followed by any character other than a newline)

\\.

or a character that is neither a quote nor a backslash

[^"\\]

and finally a terminating quote

"

Put it all together, and you've got

\"(\\.|[^"\\])*\"

The delimiting quotes are escaped because they are Flex meta-characters.
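To make the pattern's logic concrete, here is a minimal hand-written C sketch of what it accepts. The function name `match_string_literal` is made up for illustration, and this is not how a flex-generated scanner works internally (flex builds a table-driven automaton); it only mirrors the three pieces of the pattern.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: returns the length of the string literal that
 * starts at s (both quotes included), or 0 if there is none.
 * Mirrors the pattern  \"(\\.|[^"\\])*\"  */
size_t match_string_literal(const char *s)
{
    assert(s != NULL);
    if (s[0] != '"')
        return 0;                      /* must open with a quote */

    size_t i = 1;
    while (s[i] != '\0') {
        if (s[i] == '\\') {
            /* \\. : a backslash followed by any character except newline */
            if (s[i + 1] == '\0' || s[i + 1] == '\n')
                return 0;
            i += 2;
        } else if (s[i] == '"') {
            return i + 1;              /* terminating quote */
        } else {
            i++;                       /* [^"\\] : any other character */
        }
    }
    return 0;                          /* ran out of input: unterminated */
}
```

Note that the "any other character" branch also accepts a newline, just as the negated character class does in flex.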


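To keep the match on a single line, add a newline to the negated class; in flex a negated character class otherwise matches newlines too:

\"([^\\\"\n]|\\.)*\"  { /* matches a string literal on a single line */ ; }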


How about using a start state...

%{
int enter_dblquotes = 0;
%}
%x DBLQUOTES
%%
\"  { BEGIN(DBLQUOTES); enter_dblquotes++; }
<DBLQUOTES>(\\.|[^"\\])*\" {
    /* yytext holds the string body plus the closing quote */
    if (enter_dblquotes) {
        handle_this_dblquotes(yytext);
        BEGIN(INITIAL); /* revert back to normal */
        enter_dblquotes--;
    }
}
         ...more rules follow...

Something to that effect. Flex uses %s or %x to declare start states (inclusive and exclusive, respectively). When the lexer sees an opening quote it switches to the DBLQUOTES state, keeps lexing until it reaches the closing quote, and then reverts to the normal (INITIAL) state.
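The state switching described above can be mimicked in plain C, which may help when reasoning about what the start state buys you. This is only a sketch under assumptions: `count_quoted_strings` and `enum lex_state` are hypothetical names, and the function merely counts complete strings where a real lexer would pass the text to a handler.

```c
#include <assert.h>
#include <stddef.h>

enum lex_state { INITIAL, DBLQUOTES };   /* names follow the flex example */

/* Hypothetical helper: counts the complete double-quoted strings in input
 * by switching state on each unescaped quote. */
size_t count_quoted_strings(const char *input)
{
    assert(input != NULL);
    enum lex_state state = INITIAL;
    size_t count = 0;

    for (size_t i = 0; input[i] != '\0'; i++) {
        char c = input[i];
        if (state == INITIAL) {
            if (c == '"')
                state = DBLQUOTES;        /* like BEGIN(DBLQUOTES) */
        } else {                          /* state == DBLQUOTES */
            if (c == '\\' && input[i + 1] != '\0') {
                i++;                      /* skip the escaped character */
            } else if (c == '"') {
                count++;                  /* a real lexer would handle the text here */
                state = INITIAL;          /* like BEGIN(INITIAL) */
            }
        }
    }
    return count;                         /* an unterminated string is not counted */
}
```

Because the state is exclusive in spirit, characters seen inside the quotes are never interpreted by the INITIAL branch, which is the same isolation that %x gives the flex rules.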