From: Geoff H. <ghu...@ws...> - 2001-12-01 23:57:45
At 12:06 PM -0600 11/30/01, Gilles Detillieux wrote:
>This is the part I find a bit troubling, but I don't know what we
>can do about it. I don't know why Armstrong's patch, which uses rx
>instead of regex, causes htdig to run 2-3 times faster, unless there
>are other changes between 092301 and 112501 that account for much of
>this, but it could well be just implementation efficiencies in one
>library and not in the other.

Ages and ages ago, I remember a small war over the most efficient regex implementation in other forums. At the time, I believe rx was considered the fastest for most things. So when I was working on htdig and saw rx, I wasn't surprised. Then we had the wonderful discovery that using the system regex instead of rx improved life for building the endings db for almost everyone.

I'd be interested if a modification to the 3.2 code to use rx like the original Armstrong patch would give Joe a similar speed boost--this might be an interesting experiment. I haven't done extensive timing tests and don't have the time required to do them on my Linux box. But I can't believe there's a 3x difference on Linux.

I'll see if I can dig up some autoconf tricks for switching between various regex implementations. If it's buried in HtRegex.* we can hide the changes from the rest of the code.

--
--
-Geoff Hutchison
Williams Students Online
http://wso.williams.edu/