From: Joe R. J. <jj...@cl...> - 2001-12-01 10:39:15
On Fri, 30 Nov 2001, Gilles Detillieux wrote:
> Date: Fri, 30 Nov 2001 12:06:04 -0600 (CST)
> From: Gilles Detillieux <gr...@sc...>
> To: Joe R. Jah <jj...@cl...>
> Cc: "ht://Dig developers list" <htd...@li...>
> Subject: Re: [htdig-dev] to-do list for 3.1.6
>
> > ___________________ 112501 + parsedate.0 + ssl.6 ___________________
> > rm htlib/regex.*
> > remove reference to regex.o in htlib/Makefile
> > #undef HAVE_BROKEN_REGEX in include/htconfig.h
> >
> > htdig: Start digging: Thu Nov 29 22:22:32 PST 2001
> > htmerge: Start merging: Thu Nov 29 22:24:14 PST 2001 104 seconds
> ...
> > ___________________ 112501 + parsedate.0 + ssl.6 ___________________
> > rm htlib/regex.*
> > remove reference to regex.o in htlib/Makefile
> > #define HAVE_BROKEN_REGEX in include/htconfig.h
> >
> > htdig: Start digging: Thu Nov 29 22:25:33 PST 2001
> > htmerge: Start merging: Thu Nov 29 22:27:12 PST 2001 99 seconds
> ...
>
> I don't think the difference between 99 and 104 seconds is significant.
> This confirms my suspicion that the HAVE_BROKEN_REGEX doesn't do a
> whole lot. To be sure, though, I think we'd need timings for 112501 +
> parsedate.0 + ssl.6, remove reference to regex.o in htlib/Makefile, #undef
> AND #define HAVE_BROKEN_REGEX (i.e. two tests) in include/htconfig.h
> (but don't remove htlib/regex.h). I suspect the timings for both will
> be like the 2nd test above, around 143 sec.
___________________ 112501 + parsedate.0 + ssl.6 ___________________
remove reference to regex.o in htlib/Makefile
#define HAVE_BROKEN_REGEX in include/htconfig.h
htdig: Start digging: Sat Dec 1 00:10:58 PST 2001
htmerge: Start merging: Sat Dec 1 00:12:44 PST 2001 106 seconds
htmerge: Total word count: 13159
htmerge: Total documents: 163
htmerge: Total size of documents (in K): 1904
___________________ 112501 + parsedate.0 + ssl.6 ___________________
remove reference to regex.o in htlib/Makefile
#undef HAVE_BROKEN_REGEX in include/htconfig.h
htdig: Start digging: Sat Dec 1 00:18:55 PST 2001
htmerge: Start merging: Sat Dec 1 00:20:38 PST 2001 103 seconds
htmerge: Total word count: 13159
htmerge: Total documents: 163
htmerge: Total size of documents (in K): 1904
____________________________________________________________________
> I suspect the difference between the 143 and the 99-104 sec is due
> to the inclusion of the bundled regex.h even though you're using
> the C library regex.o code. It's a wonder this works at all, but
> there does seem to be some impact on performance.
I am not sure how that 143-second figure came about last time; I can't
reproduce it anymore ;-/
> > ____________________ 092301 + Armstrong + ssl.4 ____________________
> > htdig: Start digging: Fri Nov 30 00:18:06 PST 2001
> > htmerge: Start merging: Fri Nov 30 00:18:44 PST 2001 38 seconds
> ...
>
> This is the part I find a bit troubling, but I don't know what we
> can do about it. I don't know why Armstrong's patch, which uses rx
> instead of regex, causes htdig to run 2-3 times faster, unless there
> are other changes between 092301 and 112501 that account for much of
> this, but it could well be just implementation efficiencies in one
> library and not in the other.
I reported the difference in indexing time to the list the very first time
url_rewrite_rules was integrated into the code. I don't believe anything
else had changed in the code at that time.
> In your tests above, do you make use of url_rewrite_rules? If so,
> how do the timings change if you don't use it?
___________________ 112501 + parsedate.0 + ssl.6 ___________________
remove reference to regex.o in htlib/Makefile
#define HAVE_BROKEN_REGEX in include/htconfig.h
no url_rewrite_rules
htdig: Start digging: Sat Dec 1 00:40:09 PST 2001
htmerge: Start merging: Sat Dec 1 00:40:34 PST 2001 25 seconds
htmerge: Total word count: 13159
htmerge: Total documents: 163
htmerge: Total size of documents (in K): 1904
rundig: end rundig: Sat Dec 1 00:40:39 PST 2001
___________________ 112501 + parsedate.0 + ssl.6 ___________________
remove reference to regex.o in htlib/Makefile
#undef HAVE_BROKEN_REGEX in include/htconfig.h
no url_rewrite_rules
htdig: Start digging: Sat Dec 1 00:28:50 PST 2001
htmerge: Start merging: Sat Dec 1 00:29:10 PST 2001 20 seconds
htmerge: Total word count: 13159
htmerge: Total documents: 163
htmerge: Total size of documents (in K): 1904
____________________________________________________________________
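In case anyone wants to double-check the elapsed figures quoted in these
runs, here is a minimal Python sketch. It is not part of ht://Dig or rundig;
the helper names parse_stamp and elapsed_seconds are made up for
illustration. It assumes both log lines carry date(1)-style timestamps in
the same time zone, so the zone name is simply ignored.

```python
import re
from datetime import datetime

# Matches e.g. "Dec 1 00:40:09 PST 2001"; the weekday and zone name are skipped.
STAMP = re.compile(
    r"([A-Z][a-z]{2}) +(\d{1,2}) (\d{2}:\d{2}:\d{2}) [A-Z]{3} (\d{4})"
)


def parse_stamp(line):
    """Extract the timestamp from a 'Start digging' / 'Start merging' line."""
    match = STAMP.search(line)
    if match is None:
        raise ValueError("no timestamp found in: " + line)
    month, day, clock, year = match.groups()
    return datetime.strptime(
        "%s %s %s %s" % (month, day, clock, year), "%b %d %H:%M:%S %Y"
    )


def elapsed_seconds(dig_line, merge_line):
    """Seconds between the start of digging and the start of merging."""
    delta = parse_stamp(merge_line) - parse_stamp(dig_line)
    return int(delta.total_seconds())


if __name__ == "__main__":
    dig = "htdig: Start digging: Sat Dec 1 00:40:09 PST 2001"
    merge = "htmerge: Start merging: Sat Dec 1 00:40:34 PST 2001"
    print(elapsed_seconds(dig, merge), "seconds")
```

Fed the two "no url_rewrite_rules" runs, this gives 25 and 20 seconds,
against 106 and 103 seconds for the same builds with the rules in place.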
Regards,
Joe
--
_/ _/_/_/ _/ ____________ __o
_/ _/ _/ _/ ______________ _-\<,_
_/ _/ _/_/_/ _/ _/ ......(_)/ (_)
_/_/ oe _/ _/. _/_/ ah jj...@cl...