
Python Programming 101

Regular Expressions

Brylie Oxley
Tue, 2011-05-31 05:46

In computing, a regular expression, also referred to as regex or regexp, provides a concise and flexible means for matching strings of text, such as particular characters, words, or patterns of characters.

The following examples illustrate a few specifications that could be expressed in a regular expression:

  • The sequence of characters "car" appearing consecutively in any context, such as in "car", "cartoon", or "bicarbonate"
  • The sequence of characters "car" occurring in that order with other characters between them, such as in "Icelander" or "chandler"
  • The word "car" when it appears as an isolated word
  • The word "car" when preceded by the word "blue" or "red"
  • The word "car" when not preceded by the word "motor"
  • A dollar sign immediately followed by one or more digits, and then optionally a period and exactly two more digits (for example, "$100" or "$245.99").
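As a rough illustration, each specification above could be written with Python's re module roughly as follows. The pattern names and the matches helper are made up for this sketch, and other spellings of each pattern would work just as well.

```python
import re

# Hypothetical patterns, one per specification in the list above.
anywhere = re.compile(r"car")                        # "car" in any context
spread = re.compile(r"c.*a.*r")                      # c, a, r in order, gaps allowed
whole_word = re.compile(r"\bcar\b")                  # "car" as an isolated word
after_colour = re.compile(r"\b(?:blue|red) car\b")   # preceded by "blue" or "red"
not_after_motor = re.compile(r"(?<!motor )\bcar\b")  # not preceded by "motor"
dollars = re.compile(r"\$\d+(?:\.\d{2})?")           # "$100" or "$245.99"


def matches(pattern, text):
    """Return True if the pattern matches anywhere in text."""
    return pattern.search(text) is not None


print(matches(anywhere, "bicarbonate"))
print(matches(spread, "Icelander"))
print(matches(whole_word, "cartoon"))
print(matches(not_after_motor, "a motor car"))
print(dollars.fullmatch("$245.99") is not None)
```

The dollar pattern is checked with fullmatch here, because search would also accept the "$245" prefix of a string like "$245.9".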

Source: Wikipedia

Python's regular expression syntax follows the Perl lineage, and the standard-library module re provides the regular expression functionality. Regular expressions form a small sub-language embedded within the larger Python language.
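A minimal sketch of how the re module is typically used, assuming a made-up sample line. The addresses and variable names are invented for illustration.

```python
import re

# Made-up sample text; the addresses are placeholders.
line = "From: alice@example.com, bob@example.org"

# re.search returns a match object for the first match, or None if there is none.
first = re.search(r"[\w.]+@[\w.]+", line)

# re.findall returns every non-overlapping match as a list of strings.
addresses = re.findall(r"[\w.]+@[\w.]+", line)

if first:
    print(first.group())
print(addresses)
```

The patterns are written as raw strings (the r prefix) so that backslashes reach the regular expression engine unchanged.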
 

Educational Resources

Python for Informatics

Dive Into Python 3

Python.org

Practice

Please post regex exercises and questions below. We can help each other learn and explore this robust and slightly difficult aspect of Python.

Comments

Exercises chapter 11:

Johan Mares
Sun, 2011-06-05 11:25

Exercises chapter 11: http://www.pastie.org/2021643
The average I found for exercise 11.2 using mbox.txt was 38549.7949721

Chapter 11 Exercises:

Jimmy Moore
Wed, 2011-06-08 04:28

Chapter 11 Exercises: https://github.com/jwmnatl/jwmnatl.github.com/blob/master/python101/chap...

For Ex. 11.2, my average for mbox.txt was 39607.

I got a smaller number of matching lines than those referenced in Ex. 11.1, e.g.:
Enter a regular expression: ^Author
mbox.txt had 339 lines that matched ^Author

Enter a regular expression: ^X-
mbox.txt had 2715 lines that matched ^X-

Enter a regular expression: java$
mbox.txt had 510 lines that matched java$
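
For anyone comparing counts, here is a minimal sketch of the kind of line-counting program that produces output like the above. It is not any poster's actual solution. The function names are made up, and using re.search on each line with its newline stripped is an assumption; a program that uses re.match, or that handles whitespace differently, could report different counts.

```python
import re


def count_matching_lines(lines, pattern):
    """Count the lines in which the pattern matches somewhere (re.search)."""
    regex = re.compile(pattern)
    return sum(1 for line in lines if regex.search(line.rstrip("\n")))


def report(path, pattern):
    """Build a report line in the same shape as the output shown above."""
    with open(path) as handle:
        count = count_matching_lines(handle, pattern)
    return "%s had %d lines that matched %s" % (path, count, pattern)


# Small in-memory check with made-up lines, so no file is needed.
sample = ["Author: someone\n", "X-Sieve: demo\n", "Foo.java\n", "java here\n"]
print(count_matching_lines(sample, "^Author"))
print(count_matching_lines(sample, "java$"))
```

With a local copy of the file, report("mbox.txt", "^X-") would give a line to compare against the counts posted here.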