Computer Science Canada

How to "reuse" regular expressions

Author:  BigBear [ Wed Jan 02, 2013 8:38 pm ]
Post subject:  How to "reuse" regular expressions

Given a paragraph with 3 dates, how can I use a regular expression to get all 3 dates?

In between the dates there are numbers and letters.


The dates are in the format

10-jan-2012

or, as a regular expression:
code:

\d{2}-\D{3}-\d{4}\b

I can output the first date with

code:

import re

m = re.search(r'\d{2}-\D{3}-\d{4}\b', s)  # s holds the paragraph text

print(m.group(0))  # first match only


But how can I output all three dates?

I can copy the regular expression and paste it again with a \D+ in between, but that only works if there are only non-digit characters between the dates.

Also, I think this is a silly thing to do; there has to be a way to reuse the same regular expression, or to get all instances of it in some text.

Author:  Zren [ Wed Jan 02, 2013 8:53 pm ]
Post subject:  RE:How to "reuse" regular expressions

re.findall() or re.finditer() should probably work.

http://docs.python.org/2/library/re.html#re.findall
Examples: http://docs.python.org/2/library/re.html#finding-all-adverbs
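
As a minimal sketch of the findall()/finditer() suggestion, applied to the date pattern from the first post: the sample text in `s` is made up for illustration (the original paragraph was not posted), and `[A-Za-z]{3}` is swapped in for `\D{3}` on the assumption that the month is always three letters.

```python
import re

# Hypothetical sample text; the original paragraph was not posted.
s = "Opened 10-jan-2012, ref 4471b, revised 03-feb-2012 and closed 28-mar-2013."

# Same shape as the pattern in the question, compiled once so it can be reused.
# [A-Za-z]{3} replaces \D{3} here (an assumption: the month is three letters).
date_re = re.compile(r'\b\d{2}-[A-Za-z]{3}-\d{4}\b')

# findall() returns every non-overlapping match as a list of strings.
dates = date_re.findall(s)
print(dates)

# finditer() yields match objects, which also carry the position of each match.
positions = [(m.group(0), m.start()) for m in date_re.finditer(s)]
print(positions)
```

Because the text between the dates is simply skipped by the search, it does not matter whether it contains digits, letters, or both.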

Author:  BigBear [ Fri Jan 04, 2013 12:53 pm ]
Post subject:  RE:How to "reuse" regular expressions

Thank you very much.

How would you use findall to get the pieces of text between commas that contain letters, numbers, and slashes?

/filepath/morepath/file.txt, /file3path/path/file3.txt

How would you make a list of all the paths?

Author:  Tony [ Fri Jan 04, 2013 3:39 pm ]
Post subject:  RE:How to "reuse" regular expressions

The dot (.) is a wildcard that matches any character except a newline (unless re.DOTALL is set).

Although if you know that the comma is the delimiter of a list, then you are looking for "split":
code:

>>> [x.strip() for x in "foo, bar, bazz".split(",")]
['foo', 'bar', 'bazz']


This assumes, though, that commas will not appear in the path or filename.
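
Applied to the sample line from the question, both approaches might look like the sketch below. The variable names are made up for illustration, and the regex variant additionally assumes that paths contain no whitespace.

```python
import re

# The sample line from the question.
line = "/filepath/morepath/file.txt, /file3path/path/file3.txt"

# Option 1: split on the delimiter and trim the surrounding whitespace.
paths_split = [p.strip() for p in line.split(",")]

# Option 2: findall() with a character class that excludes commas and
# whitespace, so each match is one run of path characters.
paths_regex = re.findall(r'[^,\s]+', line)

print(paths_split)
print(paths_regex)
```

Both give the same list for this input; split is probably the simpler choice when the comma is known to be the only delimiter.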

