[Py-howto-checkins] CVS: pyhowto regex.tex,1.18,1.19

Update of /cvsroot/py-howto/pyhowto
In directory sc8-pr-cvs1:/tmp/cvs-serv20033

Modified Files:
	regex.tex 
Log Message:
Various rewrites and tweaks suggested by Jeffrey Elkner

Index: regex.tex
===================================================================
RCS file: /cvsroot/py-howto/pyhowto/regex.tex,v
retrieving revision 1.18
retrieving revision 1.19
diff -C2 -r1.18 -r1.19
*** regex.tex	10 Apr 2003 14:18:34 -0000	1.18
--- regex.tex	18 Apr 2003 13:45:41 -0000	1.19
***************
*** 94,107 ****

  \begin{verbatim}
! . ^ $ * + ? { [ \ | ( )
  \end{verbatim}
  % $

! The first metacharacter we'll look at is \samp{[}; it's used for
! specifying a character class, which is a set of characters that you
! wish to match.  Characters can be listed individually, or a range of
! characters can be indicated by giving two characters and separating
! them by a \character{-}.  For example, \regexp{[abc]} will match any
! of the characters \samp{a}, \samp{b}, or \samp{c}; this is the same as
  \regexp{[a-c]}, which uses a range to express the same set of
  characters.  If you wanted to match only lowercase letters, your
--- 94,108 ----

  \begin{verbatim}
! . ^ $ * + ? { [ ] \ | ( )
  \end{verbatim}
  % $

! The first metacharacters we'll look at are \samp{[} and \samp{]}.
! They're used for specifying a character class, which is a set of
! characters that you wish to match.  Characters can be listed
! individually, or a range of characters can be indicated by giving two
! characters and separating them by a \character{-}.  For example,
! \regexp{[abc]} will match any of the characters \samp{a}, \samp{b}, or
! \samp{c}; this is the same as
  \regexp{[a-c]}, which uses a range to express the same set of
  characters.  If you wanted to match only lowercase letters, your
***************
*** 165,173 ****

  Being able to match varying sets of characters is the first thing
! regular expressions can do that isn't already possible with Python's
! \module{string} module.  However, if that was the only additional
! capability of regexes, they wouldn't be much of an advance.  Another
! capability is that you can specify that portions of the RE must be
! repeated a certain number of times.

  The first metacharacter for repeating things that we'll look at is
--- 166,174 ----

  Being able to match varying sets of characters is the first thing
! regular expressions can do that isn't already possible with the
! methods available on strings.  However, if that was the only
! additional capability of regexes, they wouldn't be much of an advance.
! Another capability is that you can specify that portions of the RE
! must be repeated a certain number of times.

  The first metacharacter for repeating things that we'll look at is
***************
*** 221,225 ****
  Another repeating metacharacter is \regexp{+}, which matches one or
  more times.  Pay careful attention to the difference between
! \regexp{*} and \\regexp{+}; \regexp{*} matches \emph{zero} or more
  times, so whatever's being repeated may not be present at all, while
  \regexp{+} requires at least \emph{one} occurrence.  To use a similar
--- 222,226 ----
  Another repeating metacharacter is \regexp{+}, which matches one or
  more times.  Pay careful attention to the difference between
! \regexp{*} and \regexp{+}; \regexp{*} matches \emph{zero} or more
  times, so whatever's being repeated may not be present at all, while
  \regexp{+} requires at least \emph{one} occurrence.  To use a similar
***************
*** 365,372 ****
  You can learn about this by interactively experimenting with the
  \module{re} module.  If you have Tkinter available, you may also want
! to look at \file{redemo.py}, a demonstration program included with the
! Python distribution.  It allows you to enter REs and strings, and
! displays whether the RE matches or fails.  \file{redemo.py} can be
! quite useful when trying to debug a complicated RE.  Phil Schwartz's
  \ulink{Kodos}{http://kodos.sourceforge.net} is also an interactive
  tool for developing and testing RE patterns.  This HOWTO will use the
--- 366,374 ----
  You can learn about this by interactively experimenting with the
  \module{re} module.  If you have Tkinter available, you may also want
! to look at \file{Tools/scripts/redemo.py}, a demonstration program
! included with the Python distribution.  It allows you to enter REs and
! strings, and displays whether the RE matches or fails.
! \file{redemo.py} can be quite useful when trying to debug a
! complicated RE.  Phil Schwartz's
  \ulink{Kodos}{http://kodos.sourceforge.net} is also an interactive
  tool for developing and testing RE patterns.  This HOWTO will use the
***************
*** 381,385 ****
  >>> p = re.compile('[a-z]+')
  >>> p
! <re.RegexObject instance at 80c3c28>
  \end{verbatim}

--- 383,387 ----
  >>> p = re.compile('[a-z]+')
  >>> p
! <_sre.SRE_Pattern object at 80c3c28>
  \end{verbatim}

***************
*** 405,409 ****
  >>> m = p.match( 'tempo')
  >>> print m
! <re.MatchObject instance at 80c4f68>
  \end{verbatim}

--- 407,411 ----
  >>> m = p.match( 'tempo')
  >>> print m
! <_sre.SRE_Match object at 80c4f68>
  \end{verbatim}

***************
*** 629,634 ****
  \begin{verbatim}
  charref = re.compile(r"""
!  &\#		     # Start of a numeric entity reference
!  (?P<char>      
     [0-9]+[^0-9]      # Decimal form
     | 0[0-7]+[^0-7]   # Octal form
--- 631,636 ----
  \begin{verbatim}
  charref = re.compile(r"""
!  &[#]		     # Start of a numeric entity reference
!  (
     [0-9]+[^0-9]      # Decimal form
     | 0[0-7]+[^0-7]   # Octal form
***************
*** 640,644 ****
  Without the verbose setting, the RE would look like this:
  \begin{verbatim}
! charref = re.compile("&#(?P<char>[0-9]+[^0-9]"
                       "|0[0-7]+[^0-7]"
                       "|x[0-9a-fA-F]+[^0-9a-fA-F])")
--- 642,646 ----
  Without the verbose setting, the RE would look like this:
  \begin{verbatim}
! charref = re.compile("&#([0-9]+[^0-9]"
                       "|0[0-7]+[^0-7]"
                       "|x[0-9a-fA-F]+[^0-9a-fA-F])")
***************
*** 903,907 ****
  Python adds an extension syntax to Perl's extension syntax.  If the
  first character after the question mark is a \samp{P}, you know that
! it's a extension that's specific to Python.  Currently there are two
  such extensions: \regexp{(?P<\var{name}>...)} defines a named group,
  and \regexp{(?P=\var{name})} is a backreference to a named group.  If
--- 905,909 ----
  Python adds an extension syntax to Perl's extension syntax.  If the
  first character after the question mark is a \samp{P}, you know that
! it's an extension that's specific to Python.  Currently there are two
  such extensions: \regexp{(?P<\var{name}>...)} defines a named group,
  and \regexp{(?P=\var{name})} is a backreference to a named group.  If