From: James A. T. <tr...@de...> - 2006-04-20 21:41:55
On Thu, Apr 20, 2006 at 11:32:13PM +0300, Eero Tamminen wrote:
> > See, the problems with these are that in the localized handler
> > self._bce_re must have the exact same groups defined.
> > If we just compose _bce_re from the list of 'B\.C\.E\.', 'B C E',
> > etc, then localized handlers can just re-define bce list and be done.
> > If the number of groups varies then every handler needs to
> > rewrite a lot of the parser, or suffer the breakage.
>
> Ah, I hadn't realized that subgroups produce separate matches.

This isn't necessarily a problem as subgroups can be non-capturing.
Just start the group you don't want to see in the result with '?:',
e.g. (?:abc)

If there is concern with changing the group numbering, named groups
can be used. E.g.

>>> self.bce_re = re.compile("(?P<pre>.*)\s+(?P<bce>B[.]C[.](E[.])?)(?P<post> ?.*)")
>>> m = self.bce_re.search('before B.C.E. after')
>>> m.group('bce')
'B.C.E.'
>>> m.group('pre')
'before'

I'm not suggesting the code be switched to use this. I just wanted
you to be aware of alternatives which might be useful.

-- 
James Treacy
tr...@de...
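A minimal sketch of how the two alternatives discussed in this message could be combined: the era pattern is composed from a replaceable list of variants, optional parts inside a variant are non-capturing, and the surrounding pieces are named groups, so the number of capturing groups stays fixed whatever list is supplied. The names BCE_VARIANTS and build_bce_re, and the localized variant list in the usage note, are hypothetical and not taken from the project's code.

```python
import re

# Hypothetical default list of era markers. Optional parts use (?:...)
# so they do not add capturing groups of their own.
BCE_VARIANTS = [r"B\.C\.(?:E\.)?", r"B C E"]


def build_bce_re(variants):
    """Compose the era regex from a list of variant patterns.

    Only the three named groups (pre, bce, post) capture, provided the
    variants themselves contain no capturing groups.
    """
    alternation = "|".join(variants)
    return re.compile(r"(?P<pre>.*)\s+(?P<bce>%s)(?P<post> ?.*)" % alternation)


bce_re = build_bce_re(BCE_VARIANTS)
m = bce_re.search("before B.C.E. after")
print(m.group("pre"), "|", m.group("bce"), "|", m.group("post"))
print(bce_re.groups)  # number of capturing groups in the compiled pattern
```

Under these assumptions a localized handler would only pass a different list, e.g. build_bce_re([r"eaa\.", r"eKr\."]), and code that reads m.group('bce') would not need to change.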