From: Gilles D. <gr...@sc...> - 2003-06-08 13:40:38
|
According to Lachlan Andrew:
> What docs need to be updated? The www.htdig.org FAQ is much bigger
> than the distribution FAQ. Should it be copied over (even though a
> lot of it relates to 3.1)?

Geoff has a pre-release check list he uses to determine which docs are forward ported or back ported from the maindocs tree. As this is still a beta release, I don't expect a lot of forward porting to maindocs, but there will be a need to upgrade some maindocs files like where.html, and the FAQ. If the FAQ is lacking or outdated in any 3.2 items, it should be updated, but the FAQ in the 3.2.0b5 release should be a snapshot of the FAQ in maindocs at release time.

--
Gilles R. Detillieux              E-mail: <gr...@sc...>
Spinal Cord Research Centre       WWW:    http://www.scrc.umanitoba.ca/
Dept. Physiology, U. of Manitoba  Winnipeg, MB R3E 3J7 (Canada)
|
|
From: Gilles D. <gr...@sc...> - 2003-06-08 13:24:37
|
According to Lachlan Andrew:
> Gilles: How is the Latin encodings going? If it is simply a forward
> port, I can try it. If you want new features, I'll leave it to you.

It's not a forward port, as the SGML decoding is quite different in 3.2. I was busy the past couple weeks with a server upgrade, and couldn't spare a few hours to code these changes and test them. I really will make the effort to get that done early this week, as I'm leaving next week for a much-needed two week vacation.

--
Gilles R. Detillieux              E-mail: <gr...@sc...>
Spinal Cord Research Centre       WWW:    http://www.scrc.umanitoba.ca/
Dept. Physiology, U. of Manitoba  Winnipeg, MB R3E 3J7 (Canada)
|
|
From: Lachlan A. <lh...@us...> - 2003-06-08 10:33:20
|
Greetings all,

I've noticed that MATCH_MESSAGE is not "internationalised". It is set to "all" or "some", rather than the values in method_names. Moreover, if method_names doesn't use names "and" and "or", then it isn't set at all.

Does anyone mind if I change the behaviour (after 3.2.0b5 is out) to use method_names? The incompatibilities would be:
1. The default values would be "all" and "any", not "all" and "some"
2. If Boolean queries are used, it will be "Boolean", not empty.

Cheers,
Lachlan
|
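A rough sketch of the proposed lookup, for illustration only. It is not ht://Dig code: the helper names `parse_method_names` and `match_message` are hypothetical, and the sample string used in the usage note is an assumed method_names value made of alternating "method label" pairs.

```cpp
#include <cassert>
#include <map>
#include <sstream>
#include <string>

// Hypothetical helper: split a method_names-style string of alternating
// "method label" tokens into a lookup table.
std::map<std::string, std::string> parse_method_names(const std::string& spec) {
    std::map<std::string, std::string> labels;
    std::istringstream in(spec);
    std::string method, label;
    while (in >> method >> label) {
        labels[method] = label;
    }
    return labels;
}

// Hypothetical helper: the label to show for the active match method,
// or an empty string when the method is not listed.
std::string match_message(const std::map<std::string, std::string>& labels,
                          const std::string& method) {
    std::map<std::string, std::string>::const_iterator it = labels.find(method);
    return it == labels.end() ? std::string() : it->second;
}
```

With an assumed value such as "and All or Any boolean Boolean", looking up "boolean" yields "Boolean" rather than an empty string, which is the second incompatibility listed above.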
|
From: Lachlan A. <lh...@us...> - 2003-06-08 10:27:02
|
On Sun, 8 Jun 2003 18:10, J. op den Brouw wrote:
> I'm tracking down the problem.

Excellent :) It sounds like very clever detective work!

> I will make a file containing the original 2.57 select test
> and the 2.13 select test. Maybe someone with knowledge
> on autoconf can change it.

Could you also send config.log and the error messages produced by the compiler?

Thanks,
Lachlan
|
|
From: J. op d. B. <ht...@op...> - 2003-06-08 08:10:57
|
I'm tracking down the problem. The original configure script (built by autoconf 2.57) creates a C program that HP-UX gcc chokes on, screaming that there is more than one prototype definition for select. Autoconfing the configure.in script using autoconf 2.13 produces a different C program and .. tada(.wav)! The select problem is gone.

Actually, I took the select part from the 2.13 version and placed it over the 2.57 part. But now it breaks on the compile part, saying something about ambiguous declarations. More on that later.

I will make a file containing the original 2.57 select test and the 2.13 select test. Maybe someone with knowledge on autoconf can change it.

----- Original Message -----
From: "Lachlan Andrew" <lh...@us...>
To: <htd...@li...>
Sent: Sunday, June 08, 2003 7:20 AM
Subject: [htdig-dev] 3.2.0b4 Progress check :)

> Greetings all,
>
> Thank you to everybody for your work towards getting 3.2.0b5 out.
>
> Could people doing specific tasks give an update on how they're going,
> and what others of us can do to help?
>
> Geoff: Did you manage the copyright update? If you have written the
> sed script, but haven't had time to apply/commit, could you please
> post it? I should have some time this week to do that.
>
> Gilles: How is the Latin encodings going? If it is simply a forward
> port, I can try it. If you want new features, I'll leave it to you.
>
> Neal: It's good to hear that you're coming along with the Win32 port.
> From where I'm standing, the highest priority is applying the
> memory-leak patch, so that the rest of us can test it thoroughly.
> Could you please do that some time this week?
>
> Jess: Any luck with the HP-UX configuration? If you post the
> config.log then one of the rest of us might have an idea why it is
> failing.
>
> Does the following timeline sound feasible?
> This time, I think the showstoppers are gone, so we can afford to set
> a firm timeline (although not necessarily this one :)
>
> Sun 8 - Fri 13: Finalise Unix code / configuration
>                 Update copyright
>                 Other documentation updates?
> Weekend 14-15:  Unix code freeze, and re-test all available platforms
>                 Ask "non-team" volunteers to "install/make check" the snapshot
> Mon 16-Fri 20:  Finish Win32, or decide to postpone until 3.2.0b6
>                 Further documentation updates?
>                 Testing by non-team volunteers
> Weekend 21-22:  Code freeze
>                 Test Win32 (if finished)
>                 Rename 3.2.0b4->3.2.0b5
> Mon 23:         BETA RELEASE
>
> What docs need to be updated? The www.htdig.org FAQ is much bigger
> than the distribution FAQ. Should it be copied over (even though a
> lot of it relates to 3.1)?
|
|
From: Geoff H. <ghu...@us...> - 2003-06-08 07:14:28
|
STATUS of ht://Dig branch 3-2-x

CHECKLIST FOR 3.2.0b5:
* Latin-encodings (Gilles)
* Apply memory leak patches (Neal)
* Must be able to (a) make check and (b) index www.htdig.org using "robotstxt_name: master-htdig" on all systems listed as "supported".
  Systems tested so far:
  - Mandrake 8.2, gcc 3.2 (lha, 21 May)
  - FreeBSD 4.6, gcc 2.95.3 (lha, 23 May)
  - Debian, Linux kernel 2.2.19, gcc 2.95.4 (lha, 23 May)
  - SunOS 5.8 = Solaris 2.8, gcc 3.1 (lha, 25 May)
  - SunOS 5.8 = Solaris 2.8, Sun cc with g++ 3.1 (lha, 29 May)
  - OS X (Jim, 30 May)
  Partly tested:
  - RedHat 8 (Jim, 1 June. make check requires tweaking for apache)
  - SunOS 5.8 = Solaris 2.8, gcc 2.95.2 (lha. Makes check minus apache, Digs small htdig.org. 27 May)
  - SunOS 5.8 = Solaris 2.8, Sun cc with g++ 2.95.2 (lha. Makes check minus apache, Digs small htdig.org. 2 June)
  - RedHat 7.3 (lha. Makes check minus apache. Digs small htdig.org. 25 May)
  - Alpha Debian (lha. Makes check minus apache. Digs small htdig.org. 25 May)
  To be tested:
  - HP-UX 10.20, gcc 2.8.1 (Jesse)
  - RedHat, other versions anyone?
  - RedHat AdvanceServer Itanium II (David Bannon)
* Check bugs listed in bug-tracker...
* Polish release docs (Geoff)

RELEASES:
  3.2.0b5: Next release, June 2003???
  3.2.0b4: "In progress" -- snapshots called "3.2.0b4" until prerelease.
  3.2.0b3: Released: 22 Feb 2001.
  3.2.0b2: Released: 11 Apr 2000.
  3.2.0b1: Released: 4 Feb 2000.

(Please note that everything added here should have a tracker PR# so we can be sure they're fixed. Geoff is currently trying to add PR#s for what's currently here.)

SHOWSTOPPERS:
* Mifluz database errors are a severe problem (PR#428295) -- Does Neal's new zlib patch solve this for now?

KNOWN BUGS:
* Odd behavior with $(MODIFIED) and scores not working with wordlist_compress set but work fine without wordlist_compress. (the date is definitely stored correctly, even with compression on so this must be some sort of weird htsearch bug) PR#618737.
* META descriptions are somehow added to the database as FLAG_TITLE, not FLAG_DESCRIPTION. (PR#618738) Can anyone reproduce this? I can't! -- Lachlan

PENDING PATCHES (available but need work):
* Additional support for Win32.
* Memory improvements to htmerge. (Backed out b/c htword API changed.)
* Mifluz merge.

NEEDED FEATURES:
* Quim's new htsearch/qtest query parser framework.
* File/Database locking. PR#405764.

TESTING:
* httools programs: (htload a test file, check a few characteristics, htdump and compare)
* Tests for new config file parser
* Duplicate document detection while indexing
* Major revisions to ExternalParser.cc, including fork/exec instead of popen, argument handling for parser/converter, allowing binary output from an external converter.
* ExternalTransport needs testing of changes similar to ExternalParser.

DOCUMENTATION:
* List of supported platforms/compilers is ancient. (PR#405279)
* Document all of htsearch's mappings of input parameters to config attributes to template variables. (Relates to PR#405278.) Should we make sure these config attributes are all documented in defaults.cc, even if they're only set by input parameters and never in the config file?
* Split attrs.html into categories for faster loading.
* Turn defaults.cc into an XML file for generating documentation and defaults.cc.
* require.html is not updated to list new features and disk space requirements of 3.2.x (e.g. regex matching, database compression.) PRs# 405280 #405281.
* TODO.html has not been updated for current TODO list and completions. I've tried. Someone "official" please check and remove this -- Lachlan
* Htfuzzy could use more documentation on what each fuzzy algorithm does. PR#405714.
* Document the list of all installed files and default locations. PR#405715.

OTHER ISSUES:
* Can htsearch actually search while an index is being created?
* The code needs a security audit, esp. htsearch. PR#405765.
|
|
From: Lachlan A. <lh...@us...> - 2003-06-08 05:20:32
|
Greetings all,

Thank you to everybody for your work towards getting 3.2.0b5 out.

Could people doing specific tasks give an update on how they're going, and what others of us can do to help?

Geoff: Did you manage the copyright update? If you have written the sed script, but haven't had time to apply/commit, could you please post it? I should have some time this week to do that.

Gilles: How is the Latin encodings going? If it is simply a forward port, I can try it. If you want new features, I'll leave it to you.

Neal: It's good to hear that you're coming along with the Win32 port. From where I'm standing, the highest priority is applying the memory-leak patch, so that the rest of us can test it thoroughly. Could you please do that some time this week?

Jess: Any luck with the HP-UX configuration? If you post the config.log then one of the rest of us might have an idea why it is failing.

Does the following timeline sound feasible? This time, I think the showstoppers are gone, so we can afford to set a firm timeline (although not necessarily this one :)

Sun 8 - Fri 13: Finalise Unix code / configuration
                Update copyright
                Other documentation updates?
Weekend 14-15:  Unix code freeze, and re-test all available platforms
                Ask "non-team" volunteers to "install/make check" the snapshot
Mon 16-Fri 20:  Finish Win32, or decide to postpone until 3.2.0b6
                Further documentation updates?
                Testing by non-team volunteers
Weekend 21-22:  Code freeze
                Test Win32 (if finished)
                Rename 3.2.0b4->3.2.0b5
Mon 23:         BETA RELEASE

What docs need to be updated? The www.htdig.org FAQ is much bigger than the distribution FAQ. Should it be copied over (even though a lot of it relates to 3.1)?
|
|
From: Jonathan B. <ba...@ps...> - 2003-06-07 02:26:20
|
On 06/06/03 23:30, Lachlan Andrew wrote:
>Regarding the time taken to create your htdig.conf file, I
>realise that this is a problem. Can you suggest what would
>have speeded up the process?

Not really. But here's the story. I maintain what seems to be a very large site: http://finzi.psych.upenn.edu. The search function there searches about 4 years of a VERY active mailing list, plus other stuff.

I started off using rundig. It would crash, sometimes after 3 hours of running. Then I would remove some attachments from the mail archives and start over, and then usually it worked. (It might have also worked if I had recited an incantation and just started over.) But finally I gave up this procedure, and for each monthly update I now use htdig instead of rundig. No more crashes, even with all the attachments. But this takes 5 hours.

I just looked at the manual again and realized that I can probably make separate databases for new stuff and old stuff. I had seen that before but I was afraid to try it. But that was before I gave up rundig. So now I will try that next month and see if it works. If it does, I'm in good shape.

One thing that isn't clear to me is whether I can keep using htmerge repeatedly, merging each new month's mail with the result of the last merge. This is not the whole thing. The site also deals with "functions," and I have to re-do these every month because many of the files are updated. So I would merge the mail into a mail file, then merge that with the new functions file. But probably this will work. So I will try it.

Aside from this, it took me forever to figure out how to get my index to stop including stuff outside of my local site. I think I've finally succeeded in that, but it was a lot of trial and error. I don't have time now to read through the manual to figure out why it took me so long.

This is all just for your information. Probably if I had just RTFM, over and over, I would have found things faster.

I must say that this site I run is much used and much appreciated, so I intend to maintain it. And I think it is some of the features of htdig that make it so useful.

Jon
|
|
From: Jonathan B. <ba...@ps...> - 2003-06-06 18:46:49
|
>I'm sorry, I misread your bug report. I take it that common_dir
>*wasn't* /usr/share/htdig in the RedHat RPM, but that it should have
>been. If the packagers configured it with --prefix=/usr (which the
>package information says they did), then the default location for
>common_dir would be /usr/share/htdig. If that isn't what it is,
>they must also have overridden either --datadir=... or
>--with-common-dir=...
>
>The point remains that the way the packagers choose to configure the
>package is beyond our control. We just provide the flexibility for
>them to put things wherever they want.
>
>Out of interest, do you know where they *did* point common_dir to?
>I've tried to find out, but haven't been able to :(

I am now totally baffled. I have spent too long looking at the source RPMs for the old version of htdig (which worked) and the new version (which was the one I was complaining about), specifically:

old:
  htdig-3.2.0-1.b4.0.71.i386.rpm
  htdig-web-3.2.0-1.b4.0.71.i386.rpm
new:
  htdig-3.2.0-16.20021103.i386.rpm
  htdig-web-3.2.0-16.20021103.i386.rpm

Eventually, I realized that most of what I needed was not in the source RPMs but just in these binary RPMs. If you say
  rpm -qpl [file.rpm]
you get a complete list of all the files. They were essentially identical except for two irrelevant changes (one the position of htdig.conf, the other a specification of the /var/www/html/htdig directory in the new one). Importantly, both RPMs put the file footer.html (the source of my original problem) in /usr/share/htdig/ and that is the value of "common_dir".

I did notice that, in the source RPM for the new one, but not the old one, there was a single file called htdig.conf, consisting of one line:
  Alias /htdig /usr/share/htdig
But this should not affect anything, right? This file was just sitting there. It was not part of any directory, but it was in the root directory when you unpack the source RPM. I don't know enough about RPMs to say.

The bug is that, in the old version, you do not need to specify the location of common_dir in htdig.conf, but, in the new version you do.

YOU COULD FIX THE BUG by adding
  common_dir: /usr/share/htdig
to the default htdig.conf. If this is where common_dir is supposed to be, then it would do no harm. It would be redundant. But this would make it work with the new RPM, which DOES NOT CHANGE the location of common_dir. The problem is that it does not recognize this location unless it is specified in htdig.conf.

Jon
|
|
From: Lachlan A. <lh...@us...> - 2003-06-06 13:46:46
|
Greetings again, Jon,

I'm sorry, I misread your bug report. I take it that common_dir *wasn't* /usr/share/htdig in the RedHat RPM, but that it should have been. If the packagers configured it with --prefix=/usr (which the package information says they did), then the default location for common_dir would be /usr/share/htdig. If that isn't what it is, they must also have overridden either --datadir=... or --with-common-dir=...

The point remains that the way the packagers choose to configure the package is beyond our control. We just provide the flexibility for them to put things wherever they want.

Out of interest, do you know where they *did* point common_dir to? I've tried to find out, but haven't been able to :(

Cheers,
Lachlan

On Fri, 6 Jun 2003 23:30, Lachlan Andrew wrote:
> Regarding the value of common_dir,...
> The actual ht://Dig default configuration points
> common_dir to /opt/www/share/htdig, so the decision to make it
> /usr/share/htdig must have been made by the packagers.

[deleted]

> The header and footer of the search results did not display. I was
> able to fix the problem by adding the line
> common_dir: /usr/share/htdig
> to /etc/htdig.conf (Don't ask how I arrived at this solution. I
> have no idea.)
|
|
From: Lachlan A. <lh...@us...> - 2003-06-06 13:31:20
|
Greetings Jon,

Thanks for your offer to advise us on how to simplify configuring ht://Dig, and you are right that it doesn't belong in the bug report. You can reply to me (lh...@us...) or to htd...@li...

Regarding the value of common_dir, thank you for contacting the RedHat packagers. The point I was trying to make was not that the RPM should have located old .conf files, but that when the packagers compile their package, they can set it to whatever they consider fits best with the other packages for their distribution. (That is why there are RedHat RPMs which are distinct from, say, Mandrake RPMs.) The actual ht://Dig default configuration points common_dir to /opt/www/share/htdig, so the decision to make it /usr/share/htdig must have been made by the packagers.

Cheers,
Lachlan

> Date: 2003-05-17 21:16
> Sender: jonbaron
> Logged In: YES
> user_id=770474
>
> Concerning the second part of your question, what to do to make
> construction of htdig.conf easier, I'd rather discuss that
> by email.
> It doesn't belong in this bug report.
>
> ba...@ps...

Date: 2003-05-17 21:13
Sender: jonbaron
Logged In: YES
user_id=770474

It wouldn't be a problem with the RPM if htdig kept common_dir in /usr/share/htdig. The fact that it changed is what caused the problem, no? I guess it is too late now. I suppose it is my job to alert RedHat to this problem. I will try to figure out how to do that. Should an RPM be expected to find all the various htdig.conf files and fix them? I don't think so. (I have several, with different names.)

Date: 2003-05-17 16:40
Sender: lha
Logged In: YES
user_id=663373

Thanks for your feedback. Yes, it does sound like a bug in the RPM. The default value of common_dir is set at compile time, and it would be reasonable to expect the RPM maintainer to put the default header and footer files in the default common directory. Can you suggest what the ht://Dig team should do?

Regarding the time taken to create your htdig.conf file, I realise that this is a problem. Can you suggest what would have speeded up the process?

Thanks again,
Lachlan

--- Original Message ---

I upgraded to RedHat Linux 9, which has htdig-3.2.0-16.20021103. I do not know what version of htdig I had before, but I was using RH 7.3 (but I don't think the version I was using came with that).

The header and footer of the search results did not display. I was able to fix the problem by adding the line
  common_dir: /usr/share/htdig
to /etc/htdig.conf (Don't ask how I arrived at this solution. I have no idea.)

This may be a problem with the RPM. But, on the other hand, it did a reasonable thing, which was to leave my htdig.conf file alone (which took me hours and hours to make). That file did not contain any definition of common_dir.
|
|
From: Neal R. <ne...@ri...> - 2003-06-05 14:31:26
|
Dooh! Good one. Need to read more carefully.. I looked at hundreds of
'diffs' that day....
You've done a lot of work recently as I crawl through diffs... EXCELLENT
JOB!
The port will change the main code-branch and will be invisible to Unix...
the differences in code are surrounded by '#ifdefs'.
It's 90% done from a compilation point of view. Next I'll test the
functionality. Native WIN32 has no 'alarm' and associated Unix functions,
so I have to revisit various code sections and implement a parallel
functionality in native WIN32 APIs.
There is also a separate Makefile set and db_config.h & htconfig.h for
WIN32. I am using cygwin and GNU-make environments, but use the msvc
compiler and linker for all code. I will probably create an msvc project
file as well.
Thanks.
On Thu, 5 Jun 2003, Lachlan Andrew wrote:
> Greetings Neal,
>
> These conditions were subsumed into the earlier tests:
> if (boolean && mystrcasecmp(word.get(), "and") == 0)
> became
> if (boolean && (mystrcasecmp(word.get(), "+") == 0
> || mystrcasecmp(word.get(), boolean_keywords[AND]) == 0))
> and similarly for "-".
>
> To my mind this is clearer, and it also should produce (marginally)
> smaller code.
>
> I hope the porting is going well. Is it going to change the main Unix
> source, be a patch relative to the main source, or be a stand-alone
> port?
>
> Cheers,
> Lachlan
>
> On Thu, 5 Jun 2003 07:54, Neal Richter wrote:
>
> > On 2002/12/30 12:42:59
> > These lines were removed new line 550:
> >
> > else if (boolean && mystrcasecmp(word.get(), "+") == 0)
> > tempWords.Add(new WeightWord("&", -1.0));
> > else if (boolean && mystrcasecmp(word.get(), "-") == 0)
> > tempWords.Add(new WeightWord("!", -1.0));
> >
> > Any explanation on this particular change? This looks to
> > implement + (required word) and - (exclude word).
>
Neal Richter
Knowledgebase Developer
RightNow Technologies, Inc.
Customer Service for Every Web Site
Office: 406-522-1485
|
|
From: Lachlan A. <lh...@us...> - 2003-06-05 12:46:26
|
Currently the install docs say to ./configure with --disable-bigfile. Are you suggesting putting an explicit test in configure.in, like for OS X? I suppose that would be the cleanest...

Cheers,
Lachlan

On Tue, 3 Jun 2003 05:44, Geoff Hutchison wrote:
> > 1. It works fine with --disable-bigfile and I'd be inclined to
> > leave it at that for 3.2.0b5. (If people have indexes over 4GB,
> > then I say eliminating redundancy from the database structure is a
> > higher priority than enabling big file support...)
>
> So are we saying that for SunOS native cc, we're defaulting to
> --disable-bigfile?
|
|
From: Lachlan A. <lh...@us...> - 2003-06-05 12:43:52
|
Greetings Neal,
These conditions were subsumed into the earlier tests:
if (boolean && mystrcasecmp(word.get(), "and") == 0)
became
if (boolean && (mystrcasecmp(word.get(), "+") == 0
|| mystrcasecmp(word.get(), boolean_keywords[AND]) == 0))
and similarly for "-".
To my mind this is clearer, and it also should produce (marginally)
smaller code.
I hope the porting is going well. Is it going to change the main Unix
source, be a patch relative to the main source, or be a stand-alone
port?
Cheers,
Lachlan
On Thu, 5 Jun 2003 07:54, Neal Richter wrote:
> On 2002/12/30 12:42:59
> These lines were removed new line 550:
>
> else if (boolean && mystrcasecmp(word.get(), "+") == 0)
> tempWords.Add(new WeightWord("&", -1.0));
> else if (boolean && mystrcasecmp(word.get(), "-") == 0)
> tempWords.Add(new WeightWord("!", -1.0));
>
> Any explanation on this particular change? This looks to
> implement + (required word) and - (exclude word).
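The merged test described in this message can be sketched in isolation as below. This is only an illustration under stated assumptions: `equal_nocase` is a stand-in for ht://Dig's mystrcasecmp(), and `is_and_operator` and its `and_keyword` parameter (standing in for boolean_keywords[AND]) are hypothetical names, not functions from the ht://Dig source.

```cpp
#include <cassert>
#include <cctype>
#include <string>

// Stand-in for mystrcasecmp(a, b) == 0: true when the two strings are
// equal ignoring ASCII case.
bool equal_nocase(const std::string& a, const std::string& b) {
    if (a.size() != b.size()) return false;
    for (std::string::size_type i = 0; i < a.size(); ++i) {
        if (std::tolower(static_cast<unsigned char>(a[i])) !=
            std::tolower(static_cast<unsigned char>(b[i]))) {
            return false;
        }
    }
    return true;
}

// Hypothetical helper: one test covers both the "+" shorthand and the
// configured AND keyword, so no separate "+" branch is needed later.
bool is_and_operator(bool boolean_mode, const std::string& word,
                     const std::string& and_keyword) {
    return boolean_mode && (word == "+" || equal_nocase(word, and_keyword));
}
```

The "-" case would follow the same pattern with the NOT keyword.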
|
|
From: Neal R. <ne...@ri...> - 2003-06-04 21:54:20
|
Lachlan,
I'm working on the WIN32 port and reviewing changes since my last port..
On 2002/12/30 12:42:59 you were hacking on with a commit message of:
"Forward-port of many 3.1.6 features"
These lines were removed new line 550:
else if (boolean && mystrcasecmp(word.get(), "+") == 0)
tempWords.Add(new WeightWord("&", -1.0));
else if (boolean && mystrcasecmp(word.get(), "-") == 0)
tempWords.Add(new WeightWord("!", -1.0));
Any explanation on this particular change? This looks to implement
+ (required word) and - (exclude word).
Thanks!
Neal Richter
Knowledgebase Developer
RightNow Technologies, Inc.
Customer Service for Every Web Site
Office: 406-522-1485
|
|
From: Geoff H. <ghu...@ws...> - 2003-06-02 19:44:11
|
> 1. It works fine with --disable-bigfile and I'd be inclined to leave
> it at that for 3.2.0b5. (If people have indexes over 4GB, then I say
> eliminating redundancy from the database structure is a higher
> priority than enabling big file support...)

So are we saying that for SunOS native cc, we're defaulting to --disable-bigfile?

-Geoff
|
|
From: Lachlan A. <lh...@us...> - 2003-06-02 13:24:16
|
Greetings David,

Thanks for agreeing to do some testing. I'll get back to you with the tests once they pass on all platforms we have direct access to...

Most systems break with w_c_d_l = 1000, and many with 40, hence the current default of 4. Everything in htcommon/defaults.cc can be overridden by a parameter of the same name in the file htdig.conf (except for the default name of the configure file, of course :) Did you try? If you did and it didn't work, that is a bug!

Thanks again,
Lachlan

On Mon, 2 Jun 2003 08:26, David Bannon wrote:
> Did a search for wordlist_cache_dirty_level and found it in
> htcommon/defaults.cc. It appears to be set to '1000', I changed
> that to 40 and all went very well!
> Maybe this needs to be a config parameter ?
>
> If you have any specific tests you would like me to run, I'll do so
|
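The override rule described here (a value in htdig.conf wins over the compiled-in default of the same name) can be pictured with a small sketch. It is an illustration only, not how ht://Dig's configuration classes are written: the `Settings` type and `find_setting` function are hypothetical names.

```cpp
#include <cassert>
#include <map>
#include <string>

typedef std::map<std::string, std::string> Settings;

// Hypothetical lookup: prefer the value read from the config file, fall
// back to the compiled-in default, and return "" if neither defines it.
std::string find_setting(const Settings& defaults, const Settings& config,
                         const std::string& name) {
    Settings::const_iterator it = config.find(name);
    if (it != config.end()) return it->second;
    it = defaults.find(name);
    return it != defaults.end() ? it->second : std::string();
}
```

Under this model, a compiled-in wordlist_cache_dirty_level of 4 is used until the config file supplies its own value (for example 40), at which point the config value is returned instead.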
|
From: Lachlan A. <lh...@us...> - 2003-06-02 13:09:12
|
Excellent work, Jim!

Could you please mail a list of the lines you commented out? It would be really nice to have make check work "out of the box", and I think most of the lines in *.conf are not necessary for any platform.

The first warning was due to a (trivial) bug, and I'll also get rid of the last. The other two seem harmless and platform-dependent, so I'll leave them for now.

Thanks for your detective work :)
Lachlan

On Sun, 1 Jun 2003 12:04, Jim Cole wrote:
> I tested Friday's CVS on a Red Hat 8 system.
>
> I ran rundig against www.htdig.org using robotstxt_name:
> master-htdig and did not encounter any noticeable problems. The
> database sizes seemed reasonable and were searchable.
>
> Four of the tests fail outright and numerous errors are reported.
> Most of the problems are due to the fact that the httpd.conf file
> used is not entirely compatible with Apache 2. By merging
> httpd.conf, access.conf, and srm.conf into a single file and then
> commenting out everything that 'make check' complained about, I was
> able to get three of the four problematic tests to pass.

> WARNINGS
> ---
> mp_cmpr.c:120: warning: initialization makes integer from pointer
> without a cast
> ---
> os_rw.c: In function `CDB___os_io':
> os_rw.c:42: warning: implicit declaration of function `pread'
> os_rw.c:48: warning: implicit declaration of function `pwrite'
> ---
> In file included from regex.c:212:
> gregex.h:530:1: warning: "__restrict_arr" redefined
> In file included from /usr/include/features.h:291,
> from /usr/include/stdlib.h:25,
> from regex.c:147:
> /usr/include/sys/cdefs.h:246:1: warning: this is the location of
> the previous definition
> ---
> conf_lexer.cxx:244: warning: `void* yy_flex_realloc(void*, unsigned int)'
> declared `static' but never defined
|
|
From: Lachlan A. <lh...@us...> - 2003-06-02 10:25:42
|
Greetings Geoff,

1. It works fine with --disable-bigfile and I'd be inclined to leave it at that for 3.2.0b5. (If people have indexes over 4GB, then I say eliminating redundancy from the database structure is a higher priority than enabling big file support...)

2. The problem is that the arguments to pwrite in db/os_rw.cc (of type 'off_t') seem to be treated as 32 bit instead of 64 bit. This causes the page size to be treated as 0, and all pages to be written over page 0. (This problem would not manifest itself on a little-endian architecture.)

3. The configure script assumes a lot of variables are the same for both the C and C++ compilers, which is not the case when using cc and g++. That may or may not be the problem here.

4. Again, this may not be relevant, but the SunOS configurations I've tested
  #define open open64
and call a separate set of libraries to support big files. This interferes with C++, and is #undef'd in stream.h.

Good luck with the grant review :)
Lachlan

On Sun, 1 Jun 2003 13:42, Geoff Hutchison wrote:
> > The problem with make check on SunOS with native cc is the size
> > of off_t, the size of an offset in a file. This seems to be
> > related to the --enable-bigfile configure option.
>
> This is new.
>
> I'm trying to hunt for the exact message in the developer list, but
> can't find it. What is the exact problem with native Sun CC? Many
> type problems can be fixed with configure results. (e.g. the
> various hacks in Connection.cc)
|
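Point 2 can be illustrated with a simplified byte-level model of why the mismatch shows up only on big-endian machines. This is a sketch under assumptions, not the actual Sun cc calling convention or the code in os_rw: it just stores a 64-bit value in each byte order and then reads back only the first four bytes, as a reader expecting a 32-bit value would. All function names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Lay out a 64-bit value most significant byte first (big-endian).
void store_be64(std::uint64_t value, unsigned char out[8]) {
    for (int i = 0; i < 8; ++i) {
        out[i] = static_cast<unsigned char>(value >> (8 * (7 - i)));
    }
}

// Lay out a 64-bit value least significant byte first (little-endian).
void store_le64(std::uint64_t value, unsigned char out[8]) {
    for (int i = 0; i < 8; ++i) {
        out[i] = static_cast<unsigned char>(value >> (8 * i));
    }
}

// Interpret only the first four bytes as a big-endian 32-bit value.
std::uint32_t load_first32_be(const unsigned char in[8]) {
    std::uint32_t value = 0;
    for (int i = 0; i < 4; ++i) {
        value = (value << 8) | in[i];
    }
    return value;
}

// Interpret only the first four bytes as a little-endian 32-bit value.
std::uint32_t load_first32_le(const unsigned char in[8]) {
    std::uint32_t value = 0;
    for (int i = 3; i >= 0; --i) {
        value = (value << 8) | in[i];
    }
    return value;
}
```

For a small value such as 8192, the big-endian layout puts the non-zero bytes in the second half, so the truncated read sees 0; the little-endian layout puts them first, so the truncated read still sees 8192 and the bug stays hidden.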
|
From: David B. <D.B...@vp...> - 2003-06-01 22:29:53
|
Thanks for input. I think I worked things out over the weekend.

Did a search for wordlist_cache_dirty_level and found it in htcommon/defaults.cc. It appears to be set to '1000', I changed that to 40 and all went very well!

So I don't know if the need to go from 1000->40 is due to some wrinkle in our web site or because we are running it on a 64-bit Itanium rather than the 32-bit system Linux usually appears on. But either way it works. Maybe this needs to be a config parameter ?

If you have any specific tests you would like me to run, I'll do so and you can add RedHat AdvancedServer on Itanium II to your supported platforms. We are quite happy to undertake tests as requested or before a release.

Thanks for your help.

-------------------------------------------------------------
David Bannon D.B...@vp...
Phone 61 03 9925 4733 Fax 61 03 9925 4647 Mobile 0418 525687
http://www.vpac.org
Systems Manager, Victorian Partnership for Advanced Computing
-------------------------------------------------------------
..... Humpty Dumpty was pushed !

> -----Original Message-----
> From: Lachlan Andrew [mailto:lh...@us...]
> Sent: Saturday, May 31, 2003 5:31 AM
> To: D.B...@vp...
> Subject: Re: HTDig on IA64 Linux
>
>
> Greetings again,
>
> On second thoughts, there was a problem with that variable not being
> set properly. To be safe, try this week's snapshot, due out on
> Sunday or Monday (I don't remember which...).
>
> Cheers,
> Lachlan
>
> On Fri, 30 May 2003 22:10, Lachlan Andrew wrote:
>
> > The snapshot would be about a week old. Try setting
> > wordlist_cache_dirty_level=4 (the new default value)
>
> > On Fri, 30 May 2003 14:40, David Bannon wrote:
> > > 3.2.04 CVS snapshots - indexes for far too long
> > > and then produces an endless list of error
> > > messages like this :
> > >
> > > ..........
> > > Try setting wordlist_cache_dirty_level=40 in configuration file
|
|
From: Geoff H. <ghu...@us...> - 2003-06-01 07:44:45
|
STATUS of ht://Dig branch 3-2-x CHECKLIST FOR 3.2.0b5: * Add more items to checklist :-) * Must be able to (a) make check and (b) index www.htdig.org using "robotstxt_name: master-htdig" on all systems listed as "supported". Systems tested so far: - Mandrake 8.2, gcc 3.2 (lha, 21 May) - FreeBSD 4.6, gcc 2.95.3 (lha, 23 May) - Debian, Linux kernel 2.2.19, gcc 2.95.4 (lha, 23 May) - SunOS 5.8 = Solaris 2.8, gcc 3.1 (lha, 25 May) - SunOS 5.8 = Solaris 2.8, Sun cc with g++ 3.1 (lha, 29 May) Partly tested: - SunOS 5.8 = Solaris 2.8, gcc 2.95.2 (lha. Makes check minus apache, Digs small htdig.org. 27 May) - RedHat 7.3 (lha. Makes check minus apache. Digs small htdig.org. 25 May) - Alpha Debian (lha. Makes check minus apache. Digs small htdig.org. 25 May) To be tested: - SunOS 5.8 = Solaris 2.8, Sun cc with g++ 2.95.2 (lha) - HP-UX 10.20, gcc 2.8.1 (Jesse) - RedHat anyone? - OS X anyone? * Latin-encodings (Gilles) * Check bugs listed in bug-tracker... * Polish release docs (Geoff) RELEASES: 3.2.0b5: Next release, June 2003??? 3.2.0b4: "In progress" -- snapshots called "3.2.0b4" until prerelease. 3.2.0b3: Released: 22 Feb 2001. 3.2.0b2: Released: 11 Apr 2000. 3.2.0b1: Released: 4 Feb 2000. (Please note that everything added here should have a tracker PR# so we can be sure they're fixed. Geoff is currently trying to add PR#s for what's currently here.) SHOWSTOPPERS: * Mifluz database errors are a severe problem (PR#428295) -- Does Neal's new zlib patch solve this for now? KNOWN BUGS: * Odd behavior with $(MODIFIED) and scores not working with wordlist_compress set but work fine without wordlist_compress. (the date is definitely stored correctly, even with compression on so this must be some sort of weird htsearch bug) PR#618737. * META descriptions are somehow added to the database as FLAG_TITLE, not FLAG_DESCRIPTION. (PR#618738) Can anyone reproduce this? I can't! -- Lachlan PENDING PATCHES (available but need work): * Additional support for Win32. 
* Memory improvements to htmerge. (Backed out b/c htword API changed.) * Mifluz merge. NEEDED FEATURES: * Quim's new htsearch/qtest query parser framework. * File/Database locking. PR#405764. TESTING: * httools programs: (htload a test file, check a few characteristics, htdump and compare) * Tests for new config file parser * Duplicate document detection while indexing * Major revisions to ExternalParser.cc, including fork/exec instead of popen, argument handling for parser/converter, allowing binary output from an external converter. * ExternalTransport needs testing of changes similar to ExternalParser. DOCUMENTATION: * List of supported platforms/compilers is ancient. (PR#405279) * Document all of htsearch's mappings of input parameters to config attributes to template variables. (Relates to PR#405278.) Should we make sure these config attributes are all documented in defaults.cc, even if they're only set by input parameters and never in the config file? * Split attrs.html into categories for faster loading. * Turn defaults.cc into an XML file for generating documentation and defaults.cc. * require.html is not updated to list new features and disk space requirements of 3.2.x (e.g. regex matching, database compression.) PRs# 405280 #405281. * TODO.html has not been updated for current TODO list and completions. I've tried. Someone "official" please check and remove this -- Lachlan * Htfuzzy could use more documentation on what each fuzzy algorithm does. PR#405714. * Document the list of all installed files and default locations. PR#405715. OTHER ISSUES: * Can htsearch actually search while an index is being created? * The code needs a security audit, esp. htsearch. PR#405765. |
|
From: Geoff H. <ghu...@ws...> - 2003-06-01 03:49:16
|
Hmm. We need to update copyright information before releasing 3.2.0b5. 1) Files need to have current copyright, especially if they have been touched since 2001. 2) As per the ht://Dig group decision, the source is now available under the LGPL. Thus the COPYING file, as well as the per-file copyright information, needs to be updated. I can do this with a sed script, but because the patch will be big and I'd like to be sure no one else is about to commit at the same time, I'd like to give advance warning. So if there aren't objections, I'm going to set aside Tues. evening (Chicago US-CDT) to do this. -Geoff |
|
From: Geoff H. <ghu...@ws...> - 2003-06-01 03:42:55
|
Sorry I've been AWOL. There was a big grant review this last week and lots of things were dumped on me. > The problem with make check on SunOS with native cc is the size of > off_t, the size of an offset in a file. This seems to be related to > the --enable-bigfile configure option. Does anyone remember > dealing with this issue before? This is new. I'm trying to hunt for the exact message in the developer list, but can't find it. What is the exact problem with native Sun CC? Many type problems can be fixed with configure results. (e.g. the various hacks in Connection.cc) -Geoff |
|
From: Geoff H. <ghu...@ws...> - 2003-06-01 03:36:10
|
On Saturday, May 31, 2003, at 12:00 AM, Jim Cole wrote: > the time being. We can always add a FAQ telling people to use > --disable-shared if it comes up a lot. Or we can continue the hack that I put into the configure scripts to set --disable-shared as the default on powerpc-*-* targets. Peter O'Gorman <pe...@po...> wrote: > libtool issues totally suck :( Yes. It would be nice if there were also some easier ways of controlling the libtool macros in configure.in for these purposes. :-( If that sounds like an acceptable solution, let's get the configure, configure.in and various *.m4 files set for release and then I'll go hack away. Cheers, -Geoff |
|
From: Jim C. <li...@yg...> - 2003-06-01 02:37:16
|
Hi - I tested Friday's CVS on a Red Hat 8 system (technically KRUD 8, but for all practical purposes it should be the same thing). Everything built with only a few warnings; the warnings are listed at the end of this message. I ran rundig against www.htdig.org using robotstxt_name: master-htdig and did not encounter any noticeable problems. The database sizes seemed reasonable and were searchable. I ran 'make check' and found it to be rather broken under Red Hat. Four of the tests fail outright and numerous errors are reported. Most of the problems are due to the fact that the httpd.conf file used is not entirely compatible with Apache 2. By merging httpd.conf, access.conf, and srm.conf into a single file and then commenting out everything that 'make check' complained about, I was able to get three of the four problematic tests to pass. After hacking the httpd.conf file, one test still failed (t_htdig_local -- output below). The problem appears to be due to the manner in which sockets are recycled. If I either tweaked things so that I could run t_htdig_local directly or added a sleep to test_functions immediately before it starts httpd, then the test passed. I guess the short version is that while 'make check' appears to have a number of problems under Red Hat 8, the ht://Dig suite itself builds, can be coerced into passing all 14 tests, and seems to behave when indexing/searching real sites. 
Jim OUTPUT FROM make check --- PASS: t_htnet (98)Address already in use: make_sock: could not bind to address 0.0.0.0:7400 no listening sockets available, shutting down running htdig: expected http://localhost:7400/set1/ http://localhost:7400/set1/bad_local.htm http://localhost:7400/set1/script.html http://localhost:7400/set1/site%201.html http://localhost:7400/set1/site2.html http://localhost:7400/set1/site3.html http://localhost:7400/set1/site4.html http://localhost:7400/set1/sub%2520dir/ http://localhost:7400/set1/sub%2520dir/empty%20file.html http://localhost:7400/set1/title.html but got http://localhost:7400/set1/ http://localhost:7400/set1/script.html http://localhost:7400/set1/site%201.html http://localhost:7400/set1/site2.html http://localhost:7400/set1/site3.html http://localhost:7400/set1/site4.html http://localhost:7400/set1/title.html FAIL: t_htdig_local ==================== 1 of 14 tests failed ==================== WARNINGS --- gcc -DHAVE_CONFIG_H -I. -I. -I. -I./../htlib -D_REENTRANT -g -O2 -Wall -c mp_cmpr.c -fPIC -DPIC -o .libs/mp_cmpr.lo mp_cmpr.c:120: warning: initialization makes integer from pointer without a cast --- gcc -DHAVE_CONFIG_H -I. -I. -I. -I./../htlib -D_REENTRANT -g -O2 -Wall -c os_rw.c -fPIC -DPIC -o .libs/os_rw.lo os_rw.c: In function `CDB___os_io': os_rw.c:42: warning: implicit declaration of function `pread' os_rw.c:48: warning: implicit declaration of function `pwrite' --- gcc -DHAVE_CONFIG_H -I. -I. -I../include -DDEFAULT_CONFIG_FILE=\"/home/greyleaf/local/htdig_cvs/conf/ htdig.conf\" -I../include -I../htlib -I../htnet -I../htcommon -I../htword -I../db -I../db -g -O2 -Wall -c regex.c -fPIC -DPIC -o .libs/regex.lo In file included from regex.c:212: gregex.h:530:1: warning: "__restrict_arr" redefined In file included from /usr/include/features.h:291, from /usr/include/stdlib.h:25, from regex.c:147: /usr/include/sys/cdefs.h:246:1: warning: this is the location of the previous definition --- g++ -DHAVE_CONFIG_H -I. -I. 
-I../include -DDEFAULT_CONFIG_FILE=\"/home/greyleaf/local/htdig_cvs/conf/ htdig.conf\" -I../include -I../htlib -I../htnet -I../htcommon -I../htword -I../db -I../db -DBIN_DIR=\"/home/greyleaf/local/htdig_cvs/bin\" -DCOMMON_DIR=\"/home/greyleaf/local/htdig_cvs/share/htdig\" -DCONFIG_DIR=\"/home/greyleaf/local/htdig_cvs/conf\" -DDATABASE_DIR=\"/home/greyleaf/local/htdig_cvs/var/htdig\" -DIMAGE_URL_PREFIX=\"/htdig\" -g -O2 -Wall -fno-rtti -fno-exceptions -Wno-deprecated -c conf_lexer.cxx -fPIC -DPIC -o .libs/conf_lexer.lo conf_lexer.cxx:244: warning: `void* yy_flex_realloc(void*, unsigned int)' declared `static' but never defined |
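The two os_rw.c warnings above ("implicit declaration of function `pread'" and `pwrite') mean the compiler saw no prototype for those calls, so it could not convert the offset argument to off_t. One common way to get the prototypes from <unistd.h> is to define a feature-test macro before any include. The sketch below assumes a glibc-style system. The helper roundtrip_at and the scratch-file name are hypothetical and not part of ht://Dig, and the thread does not establish whether this is connected to the off_t problem reported on SunOS.

```c
/* Request the X/Open 5 interfaces so that <unistd.h> declares
 * pread() and pwrite(); this must come before every #include. */
#define _XOPEN_SOURCE 500

#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical demo: write len bytes at the given offset of a scratch
 * file, read them back from the same offset, and return 0 if the data
 * survived the round trip (-1 otherwise). */
static int roundtrip_at(off_t offset, const char *data, size_t len)
{
    char path[] = "/tmp/prw_demoXXXXXX";
    char back[64];
    int fd;
    int ok = -1;

    assert(len <= sizeof back);

    fd = mkstemp(path);
    if (fd < 0)
        return -1;
    unlink(path); /* the file disappears once fd is closed */

    /* With prototypes in scope, offset is passed as a real off_t. */
    if (pwrite(fd, data, len, offset) == (ssize_t)len &&
        pread(fd, back, len, offset) == (ssize_t)len &&
        memcmp(data, back, len) == 0)
        ok = 0;

    close(fd);
    return ok;
}
```

Compiling this with -Wall should produce no implicit-declaration warning for either call. Removing the #define on an older toolchain is a quick way to reproduce the warning.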