From: skaller <sk...@us...> - 2004-09-11 00:24:41
On Sat, 2004-09-11 at 02:54, Richard Jones wrote:
> (Repost: original was apparently tagged as sp*m).
>
> Not really sure I understand it from a user's point of view.
>
> For me what really matters is that I could write something like:
>
>   if string =~ /(a+)(b*)/ then ...

This can be done down the track. That's under the heading
'parsers'. One writes a parser for 'Perl Re' or 'Emacs'
or 'Glob' or 'Posix' that translates the string to a REGEXP,
which is then run through the engine.

String based Re's are convenient for 'Mickey Mouse' jobs.
They're totally unsuitable as fundamental components.
Reasons:

(1) For complex regexps, regular *expressions* are untenable.
Regular *definitions* are mandatory. That's when you use
a sequence of named regexps, as in Lex.

(2) Encoding both the regexp operators and data in the
same string is untenable for complex regexps. All that escaping
is a problem. Numbered groups are a very bad idea for complex
regexps like a lexer specification for a programming language.

And there are further reasons too:

(3) Encoding character data and regexp operators in a string
is not i18n compatible. It isn't possible to deem certain
characters as the operators, because you don't know what
the character set is.

(4) Strings are 8 bit. The engine must support 32 bit.

(5) String based Re's aren't typesafe.

-- 
John Skaller, mailto:sk...@us...
voice: 061-2-9660-0850,
snail: PO BOX 401 Glebe NSW 2037 Australia
Checkout the Felix programming language http://felix.sf.net
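
To make the points in the message concrete, here is a minimal sketch
of a regexp represented as a data structure rather than a string. It
is not the engine under discussion: every name in it (Re, Char, Seq,
Alt, Star, matches and the helpers) is hypothetical, Python is used
only for illustration, and the matcher uses Brzozowski derivatives
because they are short to write, not because the real engine works
that way. Operators are constructors and data are integer code
points, so nothing needs escaping and nothing is limited to 8 bits.

```python
from dataclasses import dataclass
from functools import reduce


class Re:
    """Base class of the hypothetical regexp syntax tree."""


@dataclass(frozen=True)
class Empty(Re):      # matches no string at all
    pass


@dataclass(frozen=True)
class Eps(Re):        # matches only the empty string
    pass


@dataclass(frozen=True)
class Char(Re):
    code: int         # a code point; not restricted to 8 bits


@dataclass(frozen=True)
class Seq(Re):
    left: Re
    right: Re


@dataclass(frozen=True)
class Alt(Re):
    left: Re
    right: Re


@dataclass(frozen=True)
class Star(Re):
    inner: Re


def nullable(r):
    """True if r matches the empty string."""
    if isinstance(r, (Eps, Star)):
        return True
    if isinstance(r, (Empty, Char)):
        return False
    if isinstance(r, Seq):
        return nullable(r.left) and nullable(r.right)
    if isinstance(r, Alt):
        return nullable(r.left) or nullable(r.right)
    raise TypeError(f"not a regexp: {r!r}")


def deriv(r, c):
    """The regexp matching what may follow code point c in a match of r."""
    if isinstance(r, (Empty, Eps)):
        return Empty()
    if isinstance(r, Char):
        return Eps() if r.code == c else Empty()
    if isinstance(r, Alt):
        return Alt(deriv(r.left, c), deriv(r.right, c))
    if isinstance(r, Star):
        return Seq(deriv(r.inner, c), r)
    if isinstance(r, Seq):
        d = Seq(deriv(r.left, c), r.right)
        return Alt(d, deriv(r.right, c)) if nullable(r.left) else d
    raise TypeError(f"not a regexp: {r!r}")


def matches(r, codes):
    """Whole-input match of a sequence of integer code points."""
    for c in codes:
        r = deriv(r, c)
    return nullable(r)


def codes(s):
    return [ord(ch) for ch in s]


def any_of(s):
    """Alternation over the characters of a non-empty string."""
    return reduce(Alt, [Char(ord(ch)) for ch in s])


def plus(r):
    return Seq(r, Star(r))


# Regular *definitions*: named pieces composed into larger ones, as in Lex.
digit = any_of("0123456789")
letter = any_of("abcdefghijklmnopqrstuvwxyz_")
ident = Seq(letter, Star(Alt(letter, digit)))

# What a 'Perl Re' front end might build from the string /(a+)(b*)/.
a_plus = plus(Char(ord("a")))
b_star = Star(Char(ord("b")))
pattern = Seq(a_plus, b_star)

# A literal '*' is plain data here, so it needs no escaping.
two_stars = Seq(Char(ord("*")), Char(ord("*")))

# A code point above 255 is handled like any other.
wide = plus(Char(0x1F600))
```

With these definitions, matches(pattern, codes("aaabb")) holds and
matches(pattern, codes("bb")) does not. A string-syntax front end of
the kind described above would be a separate function that parses
its input and returns a value of this tree type. Python itself does
not check the constructor arguments statically; in a statically
typed language the same tree would reject ill-formed regexps at
compile time, which is the typesafety argument in point (5).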