
-----------------------------------
wtd
Sat Nov 13, 2004 10:50 pm

[Regex-tut] Capturing Groups: Inside the Regex
-----------------------------------
Recap

Groups surrounded by parentheses remember what they matched and store it in special variables such as $1 and $2.
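As a quick refresher, here is a minimal sketch of that behaviour. The date string and the variable names are made up for illustration and are not from the earlier tutorials.

```ruby
# Hypothetical sample data, used only to illustrate $1, $2, $3.
date = "2004-11-13"

if date =~ /(\d+)-(\d+)-(\d+)/
  # Each pair of parentheses fills the next numbered variable.
  year, month, day = $1, $2, $3
  puts "Year #{year}, month #{month}, day #{day}"
end
```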

This is great, but...

What if I want to figure out what a group matched before I'm done matching?

Variables like $1, $2, $3, etc. are only filled in once the whole match has finished, and they reflect whatever the expression actually matched.  It shouldn't be too surprising that someone figured out a way to refer to a group from inside the regex itself.

Let's say we want to match a string between two quotes.  We want to allow either single or double quotes, but the key is that they have to match.  

So, we create an expression to match a single or double quote, and then a bunch of characters.  

Note: the "." character matches any single character except a newline, so ".*" means "zero or more of anything on this line".

/ ['"] ( .* ) /x

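But how do I figure out which of the two quotes was matched?  Simple: put parentheses around that too.  (Keep spaces out of the character class.  Even with the x flag, whitespace inside [ ] is matched literally, so a spaced-out class would also accept a space as a "quote".)

/ ( ['"] ) ( .* ) /x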

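Now, I need to match that same character at the end of the expression.  A backslash followed by a group's number refers back to whatever that group matched.  Groups are numbered by their opening parenthesis, counting from the left, so \1 means "the same quote character the first group matched".

/ ( ['"] ) ( .* ) \1 /x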
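To see the backreference at work, here is a small sketch that tries the pattern against a few strings. The sample strings and the variable names are my own, chosen only for illustration.

```ruby
# The pattern from above, kept in a variable so it can be reused.
quoted_pattern = / (['"]) (.*) \1 /x

# Hypothetical sample inputs: two with matching quotes, one mismatched.
samples = [%q{'hello'}, %q{"hello"}, %q{'hello"}]

samples.each do |text|
  if text =~ quoted_pattern
    puts "#{text} matches, quote character was #{$1}"
  else
    puts "#{text} does not match"
  end
end
```

The last sample fails because \1 demands the same character that opened the string, and no second copy of either quote appears.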

And putting it all together:

input = gets.chomp

# $1 holds the quote character, $2 the text between the quotes
if input =~ / ( ['"] ) ( .* ) \1 /x
  puts "Matched the string: #{$2}."
end
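If you'd rather experiment without typing input each time, the same check can be wrapped in a method. This is only a sketch: quoted_text is a hypothetical helper name, and the sample strings are invented.

```ruby
# Hypothetical helper: returns the text between a pair of matching
# quotes, or nil when the input contains no such pair.
def quoted_text(input)
  if input =~ / (['"]) (.*) \1 /x
    $2
  else
    nil
  end
end

puts quoted_text(%q{say "hello" now})   # prints: hello
p quoted_text(%q{'hello"})              # prints: nil
puts quoted_text(%q{"a" and "b"})       # prints: a" and "b
```

The last line is worth a look: ".*" grabs as much as it can, so when the same quote appears more than twice, the match runs to the last one rather than the nearest one.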
