From: A.M. K. <aku...@us...> - 2003-04-18 13:45:47
Update of /cvsroot/py-howto/pyhowto
In directory sc8-pr-cvs1:/tmp/cvs-serv20033
Modified Files:
regex.tex
Log Message:
Various rewrites and tweaks suggested by Jeffrey Elkner
Index: regex.tex
===================================================================
RCS file: /cvsroot/py-howto/pyhowto/regex.tex,v
retrieving revision 1.18
retrieving revision 1.19
diff -C2 -r1.18 -r1.19
*** regex.tex 10 Apr 2003 14:18:34 -0000 1.18
--- regex.tex 18 Apr 2003 13:45:41 -0000 1.19
***************
*** 94,107 ****
\begin{verbatim}
! . ^ $ * + ? { [ \ | ( )
\end{verbatim}
% $
! The first metacharacter we'll look at is \samp{[}; it's used for
! specifying a character class, which is a set of characters that you
! wish to match. Characters can be listed individually, or a range of
! characters can be indicated by giving two characters and separating
! them by a \character{-}. For example, \regexp{[abc]} will match any
! of the characters \samp{a}, \samp{b}, or \samp{c}; this is the same as
\regexp{[a-c]}, which uses a range to express the same set of
characters. If you wanted to match only lowercase letters, your
--- 94,108 ----
\begin{verbatim}
! . ^ $ * + ? { [ ] \ | ( )
\end{verbatim}
% $
! The first metacharacters we'll look at are \samp{[} and \samp{]}.
! They're used for specifying a character class, which is a set of
! characters that you wish to match. Characters can be listed
! individually, or a range of characters can be indicated by giving two
! characters and separating them by a \character{-}. For example,
! \regexp{[abc]} will match any of the characters \samp{a}, \samp{b}, or
! \samp{c}; this is the same as
\regexp{[a-c]}, which uses a range to express the same set of
characters. If you wanted to match only lowercase letters, your
***************
*** 165,173 ****
Being able to match varying sets of characters is the first thing
! regular expressions can do that isn't already possible with Python's
! \module{string} module. However, if that was the only additional
! capability of regexes, they wouldn't be much of an advance. Another
! capability is that you can specify that portions of the RE must be
! repeated a certain number of times.
The first metacharacter for repeating things that we'll look at is
--- 166,174 ----
Being able to match varying sets of characters is the first thing
! regular expressions can do that isn't already possible with the
! methods available on strings. However, if that was the only
! additional capability of regexes, they wouldn't be much of an advance.
! Another capability is that you can specify that portions of the RE
! must be repeated a certain number of times.
The first metacharacter for repeating things that we'll look at is
***************
*** 221,225 ****
Another repeating metacharacter is \regexp{+}, which matches one or
more times. Pay careful attention to the difference between
! \regexp{*} and \\regexp{+}; \regexp{*} matches \emph{zero} or more
times, so whatever's being repeated may not be present at all, while
\regexp{+} requires at least \emph{one} occurrence. To use a similar
--- 222,226 ----
Another repeating metacharacter is \regexp{+}, which matches one or
more times. Pay careful attention to the difference between
! \regexp{*} and \regexp{+}; \regexp{*} matches \emph{zero} or more
times, so whatever's being repeated may not be present at all, while
\regexp{+} requires at least \emph{one} occurrence. To use a similar
***************
*** 365,372 ****
You can learn about this by interactively experimenting with the
\module{re} module. If you have Tkinter available, you may also want
! to look at \file{redemo.py}, a demonstration program included with the
! Python distribution. It allows you to enter REs and strings, and
! displays whether the RE matches or fails. \file{redemo.py} can be
! quite useful when trying to debug a complicated RE. Phil Schwartz's
\ulink{Kodos}{http://kodos.sourceforge.net} is also an interactive
tool for developing and testing RE patterns. This HOWTO will use the
--- 366,374 ----
You can learn about this by interactively experimenting with the
\module{re} module. If you have Tkinter available, you may also want
! to look at \file{Tools/scripts/redemo.py}, a demonstration program
! included with the Python distribution. It allows you to enter REs and
! strings, and displays whether the RE matches or fails.
! \file{redemo.py} can be quite useful when trying to debug a
! complicated RE. Phil Schwartz's
\ulink{Kodos}{http://kodos.sourceforge.net} is also an interactive
tool for developing and testing RE patterns. This HOWTO will use the
***************
*** 381,385 ****
>>> p = re.compile('[a-z]+')
>>> p
! <re.RegexObject instance at 80c3c28>
\end{verbatim}
--- 383,387 ----
>>> p = re.compile('[a-z]+')
>>> p
! <_sre.SRE_Pattern object at 80c3c28>
\end{verbatim}
***************
*** 405,409 ****
>>> m = p.match( 'tempo')
>>> print m
! <re.MatchObject instance at 80c4f68>
\end{verbatim}
--- 407,411 ----
>>> m = p.match( 'tempo')
>>> print m
! <_sre.SRE_Match object at 80c4f68>
\end{verbatim}
***************
*** 629,634 ****
\begin{verbatim}
charref = re.compile(r"""
! &\# # Start of a numeric entity reference
! (?P<char>
[0-9]+[^0-9] # Decimal form
| 0[0-7]+[^0-7] # Octal form
--- 631,636 ----
\begin{verbatim}
charref = re.compile(r"""
! &[#] # Start of a numeric entity reference
! (
[0-9]+[^0-9] # Decimal form
| 0[0-7]+[^0-7] # Octal form
***************
*** 640,644 ****
Without the verbose setting, the RE would look like this:
\begin{verbatim}
! charref = re.compile("&#(?P<char>[0-9]+[^0-9]"
"|0[0-7]+[^0-7]"
"|x[0-9a-fA-F]+[^0-9a-fA-F])")
--- 642,646 ----
Without the verbose setting, the RE would look like this:
\begin{verbatim}
! charref = re.compile("&#([0-9]+[^0-9]"
"|0[0-7]+[^0-7]"
"|x[0-9a-fA-F]+[^0-9a-fA-F])")
***************
*** 903,907 ****
Python adds an extension syntax to Perl's extension syntax. If the
first character after the question mark is a \samp{P}, you know that
! it's a extension that's specific to Python. Currently there are two
such extensions: \regexp{(?P<\var{name}>...)} defines a named group,
and \regexp{(?P=\var{name})} is a backreference to a named group. If
--- 905,909 ----
Python adds an extension syntax to Perl's extension syntax. If the
first character after the question mark is a \samp{P}, you know that
! it's an extension that's specific to Python. Currently there are two
such extensions: \regexp{(?P<\var{name}>...)} defines a named group,
and \regexp{(?P=\var{name})} is a backreference to a named group. If