Computer Science Canada

[Regex-tut] Cleaning it up

Author:  wtd [ Sat Nov 13, 2004 10:07 pm ]
Post subject:  [Regex-tut] Cleaning it up

Quick recap

Last time, we ended up with the following regular expression that does a decent job of matching a greeting or parting message for world.

code:
/^\s*([hH]ello|[tT]oodles),?\s+world\s*$/


It looks like gobbledy-gook

Go ahead, say it. It's not pretty, is it?

Part of the appeal of regexes is that they allow programmers to express powerful patterns without having to write a lot of code. If you know what you're doing, they're reasonably easy to read. Still, it's nice that Ruby (and Perl) gives us a way to clean them up.

Spit it out already man!

The "x" modifier makes whitespace inside a regular expression insignificant: the regex engine skips it, so we can use it to separate the pieces of the expression. Whitespace we actually want to match is still written as \s.

code:
/^ \s* ( [hH]ello | [tT]oodles ) ,? \s+ world \s* $/x
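If you want to convince yourself the spaced-out version behaves the same, here's a quick sketch. The sample strings and the same_verdicts? helper are made up for this example, and it assumes Ruby 2.4 or later for match?. The last few lines show the one catch: under x, a plain space in the pattern no longer matches a space, so it has to be escaped (or written as \s).

```ruby
# The compact pattern from the recap and its spaced-out twin.
compact = /^\s*([hH]ello|[tT]oodles),?\s+world\s*$/
spaced  = /^ \s* ( [hH]ello | [tT]oodles ) ,? \s+ world \s* $/x

# Made-up sample inputs for this sketch.
samples = ["Hello, world", "  toodles world  ", "hello,world", "Goodbye world"]

# True when both patterns give the same verdict on every input.
# (same_verdicts? is a hypothetical helper, not part of Ruby.)
def same_verdicts?(a, b, inputs)
  inputs.all? { |s| a.match?(s) == b.match?(s) }
end

samples.each do |s|
  puts "#{s.inspect}: compact=#{compact.match?(s)} spaced=#{spaced.match?(s)}"
end
puts "same verdicts: #{same_verdicts?(compact, spaced, samples)}"

# The catch: with x, an unescaped space in the pattern is ignored.
ignored_space = /hello world/x    # behaves like /helloworld/
escaped_space = /hello\ world/x   # the escaped space matches a real space
puts ignored_space.match?("hello world")
puts escaped_space.match?("hello world")
```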


It gets better

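Since whitespace means nothing to these regular expressions, we can break the pattern across several lines. The "x" modifier also treats everything from a # to the end of the line as a comment, so we can annotate each piece.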

code:
/
   ^         # beginning of the line
   \s*       # zero or more whitespace characters
   (         # start group
      [hH]   # either h or H
      ello
   |
      [tT]   # either t or T
      oodles
   )         # end group
   ,?        # zero or one comma
   \s+       # one or more whitespace characters
   world
   \s*       # zero or more whitespace characters
   $         # end of the line
/x
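Here's a rough sketch of a commented pattern like this one in use, with the x modifier after the closing slash. The greeting_word method and the test strings are made up for this example; the pattern itself is the one from above.

```ruby
# The commented pattern, stored in a variable so we can reuse it.
greeting = /
   ^         # beginning of the line
   \s*       # zero or more whitespace characters
   (         # start group
      [hH]   # either h or H
      ello
   |
      [tT]   # either t or T
      oodles
   )         # end group
   ,?        # zero or one comma
   \s+       # one or more whitespace characters
   world
   \s*       # zero or more whitespace characters
   $         # end of the line
/x

# Returns the word captured by the group, or nil when there is no match.
# (greeting_word is a hypothetical helper for this sketch.)
def greeting_word(line, pattern)
  m = pattern.match(line)
  m && m[1]
end

["Hello, world", "toodles world", "Hi world"].each do |line|
  puts "#{line.inspect} -> #{greeting_word(line, greeting).inspect}"
end
```

Leave the x off and the regex engine will try to match all that whitespace and the comment text literally, so the pattern stops matching anything useful.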

