From: A.M. K. <aku...@us...> - 2003-04-18 13:45:47
Update of /cvsroot/py-howto/pyhowto
In directory sc8-pr-cvs1:/tmp/cvs-serv20033

Modified Files:
    regex.tex
Log Message:
Various rewrites and tweaks suggested by Jeffrey Elkner

Index: regex.tex
===================================================================
RCS file: /cvsroot/py-howto/pyhowto/regex.tex,v
retrieving revision 1.18
retrieving revision 1.19
diff -C2 -r1.18 -r1.19
*** regex.tex 10 Apr 2003 14:18:34 -0000 1.18
--- regex.tex 18 Apr 2003 13:45:41 -0000 1.19
***************
*** 94,107 ****

  \begin{verbatim}
! . ^ $ * + ? { [ \ | ( )
  \end{verbatim}
  % $

! The first metacharacter we'll look at is \samp{[}; it's used for
! specifying a character class, which is a set of characters that you
! wish to match. Characters can be listed individually, or a range of
! characters can be indicated by giving two characters and separating
! them by a \character{-}. For example, \regexp{[abc]} will match any
! of the characters \samp{a}, \samp{b}, or \samp{c}; this is the same as
  \regexp{[a-c]}, which uses a range to express the same set of
  characters. If you wanted to match only lowercase letters, your
--- 94,108 ----

  \begin{verbatim}
! . ^ $ * + ? { [ ] \ | ( )
  \end{verbatim}
  % $

! The first metacharacters we'll look at are \samp{[} and \samp{]}.
! They're used for specifying a character class, which is a set of
! characters that you wish to match. Characters can be listed
! individually, or a range of characters can be indicated by giving two
! characters and separating them by a \character{-}. For example,
! \regexp{[abc]} will match any of the characters \samp{a}, \samp{b}, or
! \samp{c}; this is the same as
  \regexp{[a-c]}, which uses a range to express the same set of
  characters. If you wanted to match only lowercase letters, your
***************
*** 165,173 ****

  Being able to match varying sets of characters is the first thing
! regular expressions can do that isn't already possible with Python's
! \module{string} module. However, if that was the only additional
! capability of regexes, they wouldn't be much of an advance. Another
! capability is that you can specify that portions of the RE must be
! repeated a certain number of times.

  The first metacharacter for repeating things that we'll look at is
--- 166,174 ----

  Being able to match varying sets of characters is the first thing
! regular expressions can do that isn't already possible with the
! methods available on strings. However, if that was the only
! additional capability of regexes, they wouldn't be much of an advance.
! Another capability is that you can specify that portions of the RE
! must be repeated a certain number of times.

  The first metacharacter for repeating things that we'll look at is
***************
*** 221,225 ****
  Another repeating metacharacter is \regexp{+}, which matches one or
  more times. Pay careful attention to the difference between
! \regexp{*} and \\regexp{+}; \regexp{*} matches \emph{zero} or more
  times, so whatever's being repeated may not be present at all, while
  \regexp{+} requires at least \emph{one} occurrence. To use a similar
--- 222,226 ----
  Another repeating metacharacter is \regexp{+}, which matches one or
  more times. Pay careful attention to the difference between
! \regexp{*} and \regexp{+}; \regexp{*} matches \emph{zero} or more
  times, so whatever's being repeated may not be present at all, while
  \regexp{+} requires at least \emph{one} occurrence. To use a similar
***************
*** 365,372 ****
  You can learn about this by interactively experimenting with the
  \module{re} module. If you have Tkinter available, you may also want
! to look at \file{redemo.py}, a demonstration program included with the
! Python distribution. It allows you to enter REs and strings, and
! displays whether the RE matches or fails. \file{redemo.py} can be
! quite useful when trying to debug a complicated RE. Phil Schwartz's
  \ulink{Kodos}{http://kodos.sourceforge.net} is also an interactive
  tool for developing and testing RE patterns. This HOWTO will use the
--- 366,374 ----
  You can learn about this by interactively experimenting with the
  \module{re} module. If you have Tkinter available, you may also want
! to look at \file{Tools/scripts/redemo.py}, a demonstration program
! included with the Python distribution. It allows you to enter REs and
! strings, and displays whether the RE matches or fails.
! \file{redemo.py} can be quite useful when trying to debug a
! complicated RE. Phil Schwartz's
  \ulink{Kodos}{http://kodos.sourceforge.net} is also an interactive
  tool for developing and testing RE patterns. This HOWTO will use the
***************
*** 381,385 ****
  >>> p = re.compile('[a-z]+')
  >>> p
! <re.RegexObject instance at 80c3c28>
  \end{verbatim}

--- 383,387 ----
  >>> p = re.compile('[a-z]+')
  >>> p
! <_sre.SRE_Pattern object at 80c3c28>
  \end{verbatim}

***************
*** 405,409 ****
  >>> m = p.match( 'tempo')
  >>> print m
! <re.MatchObject instance at 80c4f68>
  \end{verbatim}

--- 407,411 ----
  >>> m = p.match( 'tempo')
  >>> print m
! <_sre.SRE_Match object at 80c4f68>
  \end{verbatim}

***************
*** 629,634 ****
  \begin{verbatim}
  charref = re.compile(r"""
! &\# # Start of a numeric entity reference
! (?P<char>
  [0-9]+[^0-9] # Decimal form
  | 0[0-7]+[^0-7] # Octal form
--- 631,636 ----
  \begin{verbatim}
  charref = re.compile(r"""
! &[#] # Start of a numeric entity reference
! (
  [0-9]+[^0-9] # Decimal form
  | 0[0-7]+[^0-7] # Octal form
***************
*** 640,644 ****
  Without the verbose setting, the RE would look like this:
  \begin{verbatim}
! charref = re.compile("&#(?P<char>[0-9]+[^0-9]"
  "|0[0-7]+[^0-7]"
  "|x[0-9a-fA-F]+[^0-9a-fA-F])")
--- 642,646 ----
  Without the verbose setting, the RE would look like this:
  \begin{verbatim}
! charref = re.compile("&#([0-9]+[^0-9]"
  "|0[0-7]+[^0-7]"
  "|x[0-9a-fA-F]+[^0-9a-fA-F])")
***************
*** 903,907 ****
  Python adds an extension syntax to Perl's extension syntax. If the
  first character after the question mark is a \samp{P}, you know that
! it's a extension that's specific to Python. Currently there are two
  such extensions: \regexp{(?P<\var{name}>...)} defines a named group,
  and \regexp{(?P=\var{name})} is a backreference to a named group. If
--- 905,909 ----
  Python adds an extension syntax to Perl's extension syntax. If the
  first character after the question mark is a \samp{P}, you know that
! it's an extension that's specific to Python. Currently there are two
  such extensions: \regexp{(?P<\var{name}>...)} defines a named group,
  and \regexp{(?P=\var{name})} is a backreference to a named group. If
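The regex behaviour described in the revised text can be checked with a short script. What follows is only a sketch, not part of the commit: the names CHARREF_VERBOSE, CHARREF_COMPACT and entity_body are hypothetical, and the verbose pattern is assembled on the assumption that its hexadecimal branch and closing parenthesis match the non-verbose form quoted in the diff, since the hunk does not show those lines.

```python
import re

# Character classes: [abc] and [a-c] describe the same set of characters.
listed = re.findall(r'[abc]', 'a1b2c3d4')
ranged = re.findall(r'[a-c]', 'a1b2c3d4')

# Repetition: * accepts zero occurrences, + needs at least one.
star_on_ct = re.match(r'ca*t', 'ct')   # matches, with zero 'a' characters
plus_on_ct = re.match(r'ca+t', 'ct')   # None, because no 'a' is present

# Hypothetical names. The verbose form follows revision 1.19 of the diff;
# [#] keeps the '#' from being read as the start of a comment.
CHARREF_VERBOSE = re.compile(r"""
 &[#]                          # Start of a numeric entity reference
 (
     [0-9]+[^0-9]              # Decimal form
   | 0[0-7]+[^0-7]             # Octal form
   | x[0-9a-fA-F]+[^0-9a-fA-F] # Hexadecimal form (assumed from the compact RE)
 )
""", re.VERBOSE)

# The non-verbose form, as quoted in the diff.
CHARREF_COMPACT = re.compile("&#([0-9]+[^0-9]"
                             "|0[0-7]+[^0-7]"
                             "|x[0-9a-fA-F]+[^0-9a-fA-F])")


def entity_body(pattern, text):
    """Return group 1 of a match at the start of text, or None (hypothetical helper)."""
    m = pattern.match(text)
    return m.group(1) if m else None


print(listed == ranged)
print(star_on_ct is not None, plus_on_ct is None)
print(entity_body(CHARREF_VERBOSE, '&#65;'), entity_body(CHARREF_COMPACT, '&#x41;'))
```

Both forms of the pattern use a plain numbered group, as in revision 1.19, so the captured text is read with group(1) rather than by name.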