Regular expression for a string literal in flex/lex
A string consists of a quote mark
"
followed by zero or more of either an escaped anything
\\.
or a non-quote, non-backslash character
[^"\\]
and finally a terminating quote
"
Put it all together, and you've got
\"(\\.|[^"\\])*\"
The delimiting quotes are escaped because they are Flex meta-characters.
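To make the pattern's behaviour concrete, here is a minimal hand-written C sketch of what the regular expression accepts. The function name `match_string_literal` is hypothetical and is not part of flex; it only mirrors the three pieces described above (opening quote, repeated body, closing quote).

```c
#include <stddef.h>

/* Hypothetical helper: returns the length of the string literal at the
 * start of s, or 0 if s does not begin with a complete literal.
 * Mirrors the flex pattern \"(\\.|[^"\\])*\" by hand. */
size_t match_string_literal(const char *s)
{
    size_t i = 0;

    if (s[i] != '"')                /* opening quote: \" */
        return 0;
    i++;

    while (s[i] != '\0') {
        if (s[i] == '\\') {         /* escaped anything: \\. */
            /* flex's '.' never matches a newline, and the backslash
             * must be followed by some character to consume. */
            if (s[i + 1] == '\0' || s[i + 1] == '\n')
                return 0;
            i += 2;
        } else if (s[i] == '"') {   /* terminating quote: \" */
            return i + 1;
        } else {                    /* ordinary character: [^"\\] */
            i++;
        }
    }
    return 0;                       /* input ended before the closing quote */
}
```

For example, given the input `"a\"b" rest`, the function returns 6, the length of the literal including both delimiting quotes. Because `[^"\\]` also matches a newline, this pattern lets a literal span several lines.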
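For a string literal confined to a single line, you can use this (the \n inside the negated character class is needed because a negated class in flex otherwise matches a newline):
\"([^\\\"\n]|\\.)*\" {/*matches string-literal on a single line*/;}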
How about using a start state...
%{
int enter_dblquotes = 0;
%}
%x DBLQUOTES
%%
\" { BEGIN(DBLQUOTES); enter_dblquotes++; }
<DBLQUOTES>[^"]*\" {
    if (enter_dblquotes) {
        /* yytext holds the string body plus the closing quote */
        handle_this_dblquotes(yytext);
        BEGIN(INITIAL); /* revert back to normal */
        enter_dblquotes--;
    }
}
...more rules follow...
The effect is roughly this: flex uses %s (inclusive) or %x (exclusive) to declare start states. When the scanner detects a quote, it switches to another state and continues lexing until it reaches another quote, at which point it reverts to the normal state.
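The state switching described above can be sketched in plain C as a two-state loop. This is only an illustration of the idea, not what flex generates: the names `scan_state` and `count_quoted_strings` are hypothetical, and the backslash handling is an added assumption that the start-state rules above do not include.

```c
#include <stddef.h>

/* Hypothetical states, standing in for flex's INITIAL and DBLQUOTES. */
enum scan_state { STATE_INITIAL, STATE_DBLQUOTES };

/* Counts the double-quoted strings in input by toggling between two
 * states. Returns -1 if the input ends inside an unterminated string. */
int count_quoted_strings(const char *input)
{
    enum scan_state state = STATE_INITIAL;
    int count = 0;

    for (size_t i = 0; input[i] != '\0'; i++) {
        char c = input[i];

        switch (state) {
        case STATE_INITIAL:
            if (c == '"')
                state = STATE_DBLQUOTES;    /* like BEGIN(DBLQUOTES) */
            break;
        case STATE_DBLQUOTES:
            if (c == '\\' && input[i + 1] != '\0') {
                i++;                        /* assumption: skip an escaped character */
            } else if (c == '"') {
                count++;
                state = STATE_INITIAL;      /* like BEGIN(INITIAL) */
            }
            break;
        }
    }
    return state == STATE_INITIAL ? count : -1;
}
```

The trade-off against the single regular expression is that a start state lets you attach separate rules (and separate error handling, such as for an unterminated string) to the inside of the literal, at the cost of more rules to maintain.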