From: Colin P. A. <co...@co...> - 2007-02-10 14:57:20
>>>>> "Colin" == Colin Paul Adams <co...@co...> writes:

    Colin> For XML 1.1, the equivalent to \c is 3830417 bytes long.
    Colin> This is definitely too big, so something is wrong with the
    Colin> test program.

No. What is wrong is that this figure includes all as yet unallocated
code-points (barring a few excluded ones).

For XML 1.0, the figure is 104080.

Either way, the strings are too long, so a proper Unicode-aware
regular expression engine is needed.
-- 
Colin Adams
Preston Lancashire
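
For anyone wanting to check the XML 1.1 figure, here is a rough Python sketch. It assumes that \c corresponds to the NameChar production of XML 1.1 and that the string is measured as UTF-8; both are assumptions about how the test program works, not taken from it. The range table is copied from the NameStartChar and NameChar productions; the function names are made up for illustration.

```python
# Inclusive code-point ranges from the XML 1.1 NameStartChar production.
NAME_START_CHAR = [
    (0x3A, 0x3A), (0x41, 0x5A), (0x5F, 0x5F), (0x61, 0x7A),
    (0xC0, 0xD6), (0xD8, 0xF6), (0xF8, 0x2FF), (0x370, 0x37D),
    (0x37F, 0x1FFF), (0x200C, 0x200D), (0x2070, 0x218F),
    (0x2C00, 0x2FEF), (0x3001, 0xD7FF), (0xF900, 0xFDCF),
    (0xFDF0, 0xFFFD), (0x10000, 0xEFFFF),
]

# NameChar adds "-", ".", the digits, U+00B7 and two further ranges.
NAME_CHAR = NAME_START_CHAR + [
    (0x2D, 0x2E), (0x30, 0x39), (0xB7, 0xB7),
    (0x300, 0x36F), (0x203F, 0x2040),
]


def enumerate_class(ranges):
    """Build the string that lists every code point in the class."""
    return "".join(chr(cp) for lo, hi in ranges for cp in range(lo, hi + 1))


def is_name_char(ch):
    """Range-based membership test: no enumerated string needed."""
    cp = ord(ch)
    return any(lo <= cp <= hi for lo, hi in NAME_CHAR)


if __name__ == "__main__":
    s = enumerate_class(NAME_CHAR)
    print(len(s))                  # code points: 971633
    print(len(s.encode("utf-8")))  # UTF-8 bytes: 3830417
```

Under those assumptions the byte count matches the 3830417 quoted above. Most of it comes from the single range #x10000-#xEFFFF, which is 917504 code points at 4 bytes each and is largely unallocated. The range-based test needs only the 21-entry table, which is the kind of thing a Unicode-aware engine would do internally.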