wtd
Posted: Sun Nov 14, 2004 12:03 am Post subject: [Regex-tut] Non-Greedy Matches
Recap
In the previous installments, dealing with negative sets, I used:
code: | / ( [ ' " ] ) ( [^ \1 ]* ) \1 /x |
To match the contents of a quoted string, where those contents didn't include the quote character itself. I used the above rather than:
code: | / ( [ ' " ] ) ( .* ) \1 /x |
Because .* matches any character zero or more times, and would have matched the quote as well.
Why was that necessary?
After all, I had specified that the string to be matched should end with a quote. The match should have been complete when it found a quote to match the first one.
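But it didn't stop there, because * is "greedy" (+ behaves the same way): it matches as much as it possibly can while still letting the rest of the pattern succeed. If there's another matching quote later on, it'll fly right past the one where we would have expected it to stop.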
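To see the overshoot concretely, here is a minimal sketch in Python. Python is my choice for illustration only, and the sample text is made up. The pattern is the greedy one from above, written without the spaces because Python's verbose mode does not ignore whitespace inside a character class.

```python
import re

# Greedy version of the pattern from the post, written compactly.
# Group 1 is the opening quote, group 2 is whatever .* grabs,
# and \1 demands the same quote character again at the end.
greedy = re.compile(r"""(['"])(.*)\1""")

# Hypothetical sample text containing two separate quoted strings.
text = 'say "hello" and then "goodbye" twice'

match = greedy.search(text)
greedy_contents = match.group(2)

# .* runs to the end of the text, then backs up only as far as the
# LAST double quote, so both quoted strings are swallowed together.
print(greedy_contents)  # hello" and then "goodbye
```

The match starts at the first quote as expected, but ends at the last one rather than the nearest one.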
A quick fix
To fix this, we have two options. We can specify that the string being matched cannot contain a matching quote, which we did, quite successfully, but at the cost of making the regex more complex. Or we can simply change the behavior of the * and make it "non-greedy".
Following the * (or +) with a question mark will do the trick. In keeping with regex tradition, it's very short.
code: | / ( [ ' " ] ) ( .*? ) \1 /x |
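As a quick check of the non-greedy form, here is a small sketch in Python (again just an illustration language, with made-up sample text). The pattern is written without the spaces, since Python's verbose mode keeps whitespace inside a character class.

```python
import re

# Non-greedy version: .*? takes as little as possible, so the match
# ends at the first quote that equals the opening one (\1).
lazy = re.compile(r"""(['"])(.*?)\1""")

# Hypothetical sample text containing two separate quoted strings.
text = 'say "hello" and then "goodbye" twice'

# Each quoted string is now found on its own.
lazy_contents = [m.group(2) for m in lazy.finditer(text)]
print(lazy_contents)  # ['hello', 'goodbye']

# The backreference still insists on the same kind of quote, so an
# apostrophe inside a double-quoted string does not end the match.
mixed = lazy.search('''she said "it's fine"''').group(2)
print(mixed)  # it's fine
```

The only change from the greedy pattern is the single ? after the *, which is what makes this the shorter of the two fixes.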