From: Lachlan A. <lh...@us...> - 2003-02-26 12:45:00
|
Greetings all,
Just for the record:
1) The -i option doesn't remove the _weakcmpr file.
Neal, what effect will that have?
2) I've just run htdig on an existing database *without* -i and
it also complained about weakcmpr problems.
(I've forgotten whether I ran htpurge after the first run, so
I'm running it again without it.)
3) There is still a (different) problem with pagesize 32k. The
htdig ran OK, but the second htpurge complained near the end.
Cheers,
Lachlan
On Wednesday 26 February 2003 10:30, Neal Richter wrote:
> Does the error happen when you run htdig -i twice NOT using rundig?
|
|
From: Neal R. <ne...@ri...> - 2003-02-25 23:28:45
|
Jim, Does the error happen when you run htdig -i twice (NOT using rundig)?

Thanks.

On Mon, 24 Feb 2003, Jim Cole wrote:
> Hi - I was able to repeat the problem again. The second time around I
> made a point of catching the page numbers. They were the same as those
> listed in your log file.
>
> Jim
>
> On Sunday, February 23, 2003, at 06:21 AM, Lachlan Andrew wrote:
> > OK, now try this on for size...
> >
> > If I run the attached rundig script, with -v and the attached
> > .conf script on the attached directory (51 copies of the attached
> > file hash) with an empty .../var/htdig-crash1 directory, then all
> > is well. However, if I run it a *second* time, it gives the attached
> > log file.
> >
> > This is odd since the script uses -i which is supposed to ignore the
> > contents of the directory. (On another note, should -i also ignore
> > the db.log file? It currently doesn't.)
> >
> > Neal, can you (or anyone else) replicate this behaviour?
> >
> > Thanks!
> > Lachlan
> >
> > On Sunday 23 February 2003 16:50, Lachlan Andrew wrote:
> >> Whoops! I didn't make clean after installing the new libraries.
> >> Now that I have, I haven't been able to reproduce the
> >> problem.<rundig><valid_punct.conf><directory><hash><log.first-200-
> >> lines>
>
> -------------------------------------------------------
> This sf.net email is sponsored by:ThinkGeek
> Welcome to geek heaven.
> http://thinkgeek.com/sf
> _______________________________________________
> htdig-dev mailing list
> htd...@li...
> https://lists.sourceforge.net/lists/listinfo/htdig-dev

Neal Richter
Knowledgebase Developer
RightNow Technologies, Inc.
Customer Service for Every Web Site
Office: 406-522-1485
|
From: Lachlan A. <lh...@us...> - 2003-02-25 12:55:25
|
Thanks for that, Jim! I am glad it isn't just my system.

The OS X crash is probably a good thing to look at, if it occurs early
in the dig. (a) Could you please post (or mail me) the complete
backtrace? (b) Does it still occur without compression? (c) Are you
using zlib-1.1.4? (I have had core dumps with earlier versions, but
not since upgrading.)

Thanks again,
Lachlan

On Tuesday 25 February 2003 15:32, Jim Cole wrote:
> Hi - I was able to repeat the problem again. The second time around
> I made a point of catching the page numbers. They were the same as
> those listed in your log file.
|
From: Jim C. <li...@yg...> - 2003-02-25 04:32:29
|
Hi - I was able to repeat the problem again. The second time around I
made a point of catching the page numbers. They were the same as those
listed in your log file.

Jim

On Sunday, February 23, 2003, at 06:21 AM, Lachlan Andrew wrote:
> OK, now try this on for size...
>
> If I run the attached rundig script, with -v and the attached
> .conf script on the attached directory (51 copies of the attached
> file hash) with an empty .../var/htdig-crash1 directory, then all
> is well. However, if I run it a *second* time, it gives the attached
> log file.
>
> This is odd since the script uses -i which is supposed to ignore the
> contents of the directory. (On another note, should -i also ignore
> the db.log file? It currently doesn't.)
>
> Neal, can you (or anyone else) replicate this behaviour?
>
> Thanks!
> Lachlan
>
> On Sunday 23 February 2003 16:50, Lachlan Andrew wrote:
>> Whoops! I didn't make clean after installing the new libraries.
>> Now that I have, I haven't been able to reproduce the
>> problem.<rundig><valid_punct.conf><directory><hash><log.first-200-
>> lines>
|
From: Ted Stresen-R. <ted...@ma...> - 2003-02-24 22:20:11
|
Hi,

Thanks to everyone for the feedback. I've been taking notes of everything
that's been suggested. I'm also determined to come up with a "cheerier"
presentation or two for all to choose from, but that's low on the list of
priorities as far as I can tell...

Current workload aside (I'm a little backed up during the day gig right
now...) the plan is to use XSL to create defaults.cc, a split version of
attrs.html and a version in a single file, as well as versions sorted by /
grouped by "program" or "category".

Additionally, and the htdig developers should chime in if they feel the
need to, it is my understanding that in order for this to be a truly
valuable contribution, it needs to be able to do its thing in Perl and not
rely on the users having Xalan or Saxon or any other XSL processor
installed. I'm more than happy to take a whack at reproducing all of this
in Perl, but I'd like to know if there is any objection to including one
of the Perl XSL processors in the htdig distribution. This would _greatly_
reduce the amount of work I have ahead of me.

As always, task status is here: http://www.tedmasterweb.com/htdig/

Comments?

Ted Stresen-Reuter

On Monday, February 24, 2003, at 03:47 PM, Lachlan Andrew wrote:
> On Monday 24 February 2003 21:01, Budd, Sinclair E wrote:
>> strange as it might seem, I sometimes print of the full
>> Documentation. Could there be a way of presenting a printable
>> version.
>
> Greetings,
>
> I had assumed the plan was to keep attrs.html in addition to the
> dynamic pages. However, being able to generate attrs.html and
> cf_*.html on the fly would reduce the redundancy in the
> documentation.
>
> (Also note that there is currently a bug in the generation of
> attrs.html, where the #defines in defaults.cc aren't interpreted.
> I'm not sure if this affects defaults.xml. See
> <https://sourceforge.net/tracker/index.php?func=detail&aid=692125&group_id=4593&atid=104593>
> for the bug report.)
>
> Cheers,
> Lachlan
|
From: Phillip P. <pp...@my...> - 2003-02-24 22:10:31
|
Hi,

Just wondering - is anyone actively working on Python bindings to
ht://Dig? I've been working on getting it to build as a Python module so
I can hook it in to my Python Community Server project and use it to
search hosted weblogs. (See http://www.pycs.net/ for more info.) I've
got it so "./configure --with-python && make" will build htsearch as a
Python module as well as making htsearch and qtest as usual.

Policy question: Once this is all working, would it be possible for the
Python interface to make it into the default version of ht://Dig, or
would it be better to keep it as a separate branch? What do you guys
usually do about this sort of thing (language-specific bindings)?

Sorry for asking a question that should be answerable by browsing the
list archives ... but they're not working for me :(

Cheers,
Phil
|
From: Lachlan A. <lh...@us...> - 2003-02-24 21:48:13
|
On Monday 24 February 2003 21:01, Budd, Sinclair E wrote:
> strange as it might seem, I sometimes print of the full
> Documentation. Could there be a way of presenting a printable
> version.

Greetings,

I had assumed the plan was to keep attrs.html in addition to the
dynamic pages. However, being able to generate attrs.html and
cf_*.html on the fly would reduce the redundancy in the
documentation.

(Also note that there is currently a bug in the generation of
attrs.html, where the #defines in defaults.cc aren't interpreted.
I'm not sure if this affects defaults.xml. See
<https://sourceforge.net/tracker/index.php?func=detail&aid=692125&group_id=4593&atid=104593>
for the bug report.)

Cheers,
Lachlan
|
From: Budd, S. E <s....@im...> - 2003-02-24 10:02:34
|
Strange as it might seem, I sometimes print off the full Documentation.
Could there be a way of presenting a printable version? (Isn't that what
XML and stylesheets are about?)

-----Original Message-----
From: Geoff Hutchison [mailto:ghu...@ws...]
Sent: Friday, February 21, 2003 5:44 PM
To: Ted Stresen-Reuter
Cc: htd...@li...
Subject: Re: [htdig-dev] update to Documentation via defaults.xml

> http://www.tedmasterweb.com/htdig/
>
> Always appreciate the feedback...

I think it's looking pretty good overall. It'd be nice to have lists by
category, programs, etc. and I'm sure that's on your TODO list.

Minor nit-picky things. It might not be a bad idea to have letters (with
no links) even when there are no attributes. ABCDE ... H looks a bit
strange. :-) I'm also not thrilled about the really dark sidebar since
the visited link text ends up disappearing. But that's something that
can be tweaked later.

The basics look pretty good, and it's definitely a giant step in the
right direction, IMHO.

Thanks!
-Geoff
|
From: Jim C. <li...@yg...> - 2003-02-24 07:32:28
|
On Sunday, February 23, 2003, at 06:21 AM, Lachlan Andrew wrote:
> If I run the attached rundig script, with -v and the attached
> .conf script on the attached directory (51 copies of the attached
> file hash) with an empty .../var/htdig-crash1 directory, then all
> is well. However, if I run it a *second* time, it gives the attached
> log file.
>
> This is odd since the script uses -i which is supposed to ignore the
> contents of the directory. (On another note, should -i also ignore
> the db.log file? It currently doesn't.)
>
> Neal, can you (or anyone else) replicate this behaviour?

Hi - I was able to duplicate the problem on a machine running Red Hat
8.0. The results of the second run were almost identical to what your
log file shows. I was neither redirecting stderr nor paying attention
when the errors started, so I didn't catch the page number of the first
failure. I am in the process of repeating the experiment and will catch
the page number this time around.

I tried the same thing on an OS X box, and htdig core dumps (segfault)
early on during the first pass with rundig. The core file indicates
failure in a vm_allocate call, however the backtrace shows the call to
be 3000+ CDB___* calls deep. The entry point from the backtrace is shown
below. I can provide the full backtrace if anyone wants to see it.

Jim

#3249 0x00054bc4 in CDB___bam_c_put (dbc_orig=0x19e9660, key=0xbfffdd30, data=0xbfffdd50, flags=15) at bt_cursor.c:925
#3250 0x0003c7b0 in CDB___db_put (dbp=0x19e9904, txn=0x19e9660, key=0xa99640, data=0xbfffdd30, flags=27170944) at db_am.c:508
#3251 0x00025298 in WordList::Put(WordReference const&, int) (this=0xbffff1f0, arg=@0xbfffdd30, flags=1) at WordDB.h:126
#3252 0x0001eeac in HtWordList::Flush() (this=0xbffff1f0) at ../htword/WordList.h:118
#3253 0x00020c9c in DocumentRef::AddDescription(char const*, HtWordList&) (this=0x548ab90, d=0x11bd98 "", words=@0xbffff1f0) at DocumentRef.cc:512
#3254 0x0000bac8 in Retriever::got_href(URL&, char const*, int) (this=0xbffff140, url=@0x1ae8d30, description=0x5446880 "QPainter", hops=1) at Retriever.cc:1496
#3255 0x00005c84 in HTML::do_tag(Retriever&, String&) (this=0xbba9b0, retriever=@0xbffff140, tag=@0xbbaa24) at ../htlib/htString.h:45
#3256 0x00005120 in HTML::parse(Retriever&, URL&) (this=0xbba9b0, retriever=@0xbffff140, baseURL=@0x10000) at HTML.cc:414
#3257 0x00009a24 in Retriever::RetrievedDocument(Document&, String const&, DocumentRef*) (this=0xbffff140, doc=@0xa99950, url=@0x10000, ref=0xbb9ad0) at Retriever.cc:818
#3258 0x000094d0 in Retriever::parse_url(URLRef&) (this=0xbffff140, urlRef=@0x1b730c0) at ../htcommon/URL.h:51
#3259 0x00008b28 in Retriever::Start() (this=0xbffff140) at Retriever.cc:432
#3260 0x0000f988 in main (ac=5, av=0xbffff9c4) at htdig.cc:338
#3261 0x0000266c in _start (argc=5, argv=0xbffff9c4, envp=0xbffff9dc) at /SourceCache/Csu/Csu-45/crt.c:267
#3262 0x000024ec in start () at /usr/include/gcc/darwin/3.1/g++-v3/streambuf:129
|
From: Lachlan A. <lh...@us...> - 2003-02-23 13:23:48
|
OK, now try this on for size...

If I run the attached rundig script, with -v and the attached
.conf script on the attached directory (51 copies of the attached
file hash) with an empty .../var/htdig-crash1 directory, then all
is well. However, if I run it a *second* time, it gives the attached
log file.

This is odd since the script uses -i which is supposed to ignore the
contents of the directory. (On another note, should -i also ignore
the db.log file? It currently doesn't.)

Neal, can you (or anyone else) replicate this behaviour?

Thanks!
Lachlan

On Sunday 23 February 2003 16:50, Lachlan Andrew wrote:
> Whoops! I didn't make clean after installing the new libraries.
> Now that I have, I haven't been able to reproduce the problem.
|
From: Gabriele B. <bar...@in...> - 2003-02-23 08:33:24
|
>I don't see any reason we shouldn't stick to a feature freeze. Is there
>any feature we're somehow leaving out right now?

There's my cookies input file feature. I am ready now to commit it (after
I have reviewed it according to Gilles' suggestion).

Ciao
-Gabriele

--
Gabriele Bartolini - Web Programmer - ht://Dig & IWA Member - ht://Check maintainer
Current Location: Prato, Tuscany, Italia
bar...@in... | http://www.prato.linux.it/~gbartolini | ICQ#129221447
|
From: Geoff H. <ghu...@us...> - 2003-02-23 08:16:55
|
STATUS of ht://Dig branch 3-2-x
RELEASES:
3.2.0b5: Next release, First quarter 2003???
3.2.0b4: "In progress" -- snapshots called "3.2.0b4" until prerelease.
3.2.0b3: Released: 22 Feb 2001.
3.2.0b2: Released: 11 Apr 2000.
3.2.0b1: Released: 4 Feb 2000.
(Please note that everything added here should have a tracker PR# so
we can be sure they're fixed. Geoff is currently trying to add PR#s for
what's currently here.)
SHOWSTOPPERS:
* Mifluz database errors are a severe problem (PR#428295)
-- Does Neal's new zlib patch solve this for now?
KNOWN BUGS:
* Odd behavior with $(MODIFIED) and scores not working with
wordlist_compress set but work fine without wordlist_compress.
(the date is definitely stored correctly, even with compression on
so this must be some sort of weird htsearch bug) PR#618737.
* META descriptions are somehow added to the database as FLAG_TITLE,
not FLAG_DESCRIPTION. (PR#618738)
Can anyone reproduce this? I can't! -- Lachlan
PENDING PATCHES (available but need work):
* Additional support for Win32.
* Memory improvements to htmerge. (Backed out b/c htword API changed.)
* Mifluz merge.
NEEDED FEATURES:
* Quim's new htsearch/qtest query parser framework.
* File/Database locking. PR#405764.
TESTING:
* httools programs:
(htload a test file, check a few characteristics, htdump and compare)
* Tests for new config file parser
* Duplicate document detection while indexing
* Major revisions to ExternalParser.cc, including fork/exec instead of popen,
argument handling for parser/converter, allowing binary output from an
external converter.
* ExternalTransport needs testing of changes similar to ExternalParser.
DOCUMENTATION:
* List of supported platforms/compilers is ancient. (PR#405279)
* Document all of htsearch's mappings of input parameters to config attributes
to template variables. (Relates to PR#405278.)
Should we make sure these config attributes are all documented in
defaults.cc, even if they're only set by input parameters and never
in the config file?
* Split attrs.html into categories for faster loading.
* Turn defaults.cc into an XML file for generating documentation and
defaults.cc.
* require.html is not updated to list new features and disk space
requirements of 3.2.x (e.g. regex matching, database compression.)
PRs# 405280 #405281.
* TODO.html has not been updated for current TODO list and
completions.
I've tried. Someone "official" please check and remove this -- Lachlan
* Htfuzzy could use more documentation on what each fuzzy algorithm
does. PR#405714.
* Document the list of all installed files and default
locations. PR#405715.
OTHER ISSUES:
* Can htsearch actually search while an index is being created?
* The code needs a security audit, esp. htsearch. PR#405765.
|
|
From: Lachlan A. <lh...@us...> - 2003-02-23 05:50:52
|
On Wednesday 19 February 2003 23:44, Lachlan Andrew wrote:
> (BTW, zlib 1.1.4 is still giving errors, albeit for a slightly
> different data set.)

Whoops! I didn't make clean after installing the new libraries.
Now that I have, I haven't been able to reproduce the problem. I'll
keep trying, but sorry for leading people on a wild goose chase...

Cheers,
Lachlan
|
From: Geoff H. <ghu...@ws...> - 2003-02-22 22:34:26
|
Begin forwarded message:

> From: Dave Stevens <ge...@un...>
> Date: Sun Feb 9, 2003 9:47:57 PM US/Central
> To: Geoff Hutchison <ghu...@ws...>
> Subject: Re: Formation of the ht://Dig Group
>
> Dear Mr. Hutchison,
>
> My apologies for not replying sooner to your invitation. I got your
> mail because I was able once to review a document related to mifluz
> which was written in English by a non-native speaker and render it
> into idiomatically correct English (or more nearly so). This was for
> me just about exactly the kind of contribution I would like to make. I
> have a strong computer background with 28 years in the field, but I am
> not able to take on any technical development work. Nonetheless if in
> future you would like me to undertake this kind of polishing work I
> would be happy to try to do so. In particular, here's what I will
> undertake to do:
>
> if you send me an English language document for editing I will try to
> make sure it is correct with respect to grammar, spelling and idiom
>
> if I see obvious technical errors or impenetrable jargon I will point
> them out without necessarily correcting them
>
> if you have editorial standards for such documents I will try to
> consistently apply them
>
> and if for any reason work comes to me which I am not able to handle,
> which is occasionally likely to be the case, I will say so
> immediately, as well as giving an estimate of my availability in the
> near future.
>
> I am not able to undertake a technical review of the document for
> correctness as a description of the programs or their use, I can't
> take the time to stay current to that degree.
>
> If you would like to do a test, please send me an excerpt that needs
> work and I will turn it around to you ASAP
>
> Dave Stevens
|
From: Lachlan A. <lh...@us...> - 2003-02-22 08:50:37
|
On Thursday 20 February 2003 02:19, Gilles Detillieux wrote:
> According to Lachlan Andrew:
> > I hadn't realised it, but the
> > valid_punctuation attribute seems to be treated as an *optional*
> > word break. (The docs say it is *not* a word break)
>
> I guess the docs haven't kept up with what the code does.
> this functionality was extended to also index each word part,
> so that something like "post-doctoral" gets indexed as
> postdoctoral, post and doctoral. This greatly enhances searches for
> compound words, or parts thereof, but it tends to break down when
> you're indexing something that's not really words...

Thanks for that clarification Gilles.

Would it be better to convert queries for post-doctoral into the
phrase "post doctoral" in queries, and simply the words post and
doctoral in the database? As it stands, a search for "the
non-smoker" will match "the smoker", since all the words are given
the same position in the database. It also reduces the size of the
database (marginally in most cases, but significantly for
pathological documents). Now that there is phrase searching, is
there any benefit of the current approach? If not, we could do away
with valid_punctuation entirely (after 3.2.0b5).

> if you're going to feed a bunch of C code into htdig, you
> should probably do so with a severely stripped down setting of
> valid_punctuation.... However,
> if the underlying word database is solid, then it shouldn't fall
> apart no matter how much junk you throw at it. the root
> cause of the trouble seems to be a bug somewhere in the code.

My thoughts exactly. I'm only using this page for debugging...

Cheers,
Lachlan
|
From: Geoff H. <ghu...@ws...> - 2003-02-22 05:19:25
|
Well, I had an oral exam on Thursday, so I've been quite busy the last
few weeks and fortunately things should settle down a bit. (I can't say
I followed much e-mail unless my filters threw it into the "Family"
mailbox, sorry.)

>> SHOWSTOPPER:
>> * Still need thorough testing of the database, with Neal's zlib
>> patch

Any ideas how we can test this thoroughly? Seems like you've got a
pretty good start so far. I'd be glad to run some larger-scale digs
under Mac OS X. Is there another platform we can throw into the mix,
Neal? I can offer RH8.0 as well, but I don't know that this would shed
any new light on things. As far as a large-scale test, I'd be glad to
tar up the htdig.org website including mail archives. >:-)

>> TESTING:
>> * httools programs: (htload a test file, check a few
>> characteristics, htdump and compare)

I haven't written an actual test-suite script for this, but I do run
these tests by hand. If anyone has any suggestions on what all I should
be testing, please let me know.

>> * Test field restricted searching

Lachlan, does this fall into your area?

>> DOCUMENTATION:
>> * Split attrs.html into categories for faster loading.
>> (i.e., commit Ted's scripts)

Hmm. Let's see how this comes along. It's nice right now, but I don't
know that we'd want to put it in a release within a few weeks. But any
decision on that can wait.

>> Fri 14 Feb: Last additions to the above TO-DO list
>> Fri 21 Feb: Feature freeze.
>> No new features added. All available development
>> effort for testing, bug fixes, documentation and
>> configure/install issues

I don't see any reason we shouldn't stick to a feature freeze. Is there
any feature we're somehow leaving out right now?

>> Fri 28 Feb: Code freeze. Testing and documenting only.
>> Update version number throughout.
>> Fri 7 Mar: Release.

This may be ambitious. I've never knowingly sent out a release that had
a database bug. So we obviously want to kill that SHOWSTOPPER soon.
Neal, how much time do you have to devote to the bug? Would it help for
me to set up valgrind on my RH box?

-Geoff
|
From: Geoff H. <ghu...@ws...> - 2003-02-22 05:03:31
|
> I'm now working with this Snapshot, but have the same truble with lex
> like jesse on AIX.
> -
> lex -L `test -f conf_lexer.lxx || echo './'`conf_lexer.lxx

It shouldn't need to be running flex/lex. The code does have the
appropriately-generated file. Make sure that the conf_lexer.cxx hasn't
been overwritten by lex, and do the following:

touch conf_lexer.cxx
touch conf_parser.cxx

See if that helps prevent lex from trying to parse a flex file.

If you want to try installing flex, check the GNU site:
http://www.gnu.org/software/flex/

-Geoff
|
From: Gilles D. <gr...@sc...> - 2003-02-22 03:55:11
|
According to Gabriele Bartolini:
> Here is the description for cookies_input_file:
>
> "Set the input file to be used when importing cookies for the crawl;
> cookies must be specified according to Netscape's format (tab separated
> fields).
> For more information, give a look at the example cookies file (cookies.txt)
> distributed with ht://Dig. By default, no input file is read."
Here's my edited version:
{ "cookies_input_file", "", \
"string", "htdig", "", "3.2.0b4", "Indexing:Connection", "cookies_input_file: ${common_dir}/cookies.txt", " \
Specifies the location of the file used for importing cookies \
for the crawl. These cookies will be preloaded into htdig's \
in-memory cookie jar, but aren't written back to the file. \
Cookies are specified according to Netscape's format \
(tab-separated fields). If this attribute is left blank, \
no cookie file will be read. \
For more information, see the sample cookies.txt file in the \
ht://Dig source distribution. \
" }, \
--
Gilles R. Detillieux E-mail: <gr...@sc...>
Spinal Cord Research Centre WWW: http://www.scrc.umanitoba.ca/
Dept. Physiology, U. of Manitoba Winnipeg, MB R3E 3J7 (Canada)
|
|
From: Gabriele B. <bar...@in...> - 2003-02-22 02:34:41
|
Cheers Gilles,
I'll give a look at the code for the cookie jar tomorrow morning.
>Anyway, let me know when you've written something, and I'll have a look.
>(Maybe post to htdig-dev for others to comment too.)
Here is the description for cookies_input_file:
"Set the input file to be used when importing cookies for the crawl;
cookies must be specified according to Netscape's format (tab separated
fields).
For more information, give a look at the example cookies file (cookies.txt)
distributed with ht://Dig. By default, no input file is read."
>I have to ask, was it for practical or ideological reasons that they
>convinced you to switch? ;-)
I guess both. I am starting to love Debian packages system. :-)
Ciao ciao
-Gabriele
--
Gabriele Bartolini - Web Programmer - ht://Dig & IWA Member - ht://Check
maintainer
Current Location: Prato, Tuscany, Italia
bar...@in... | http://www.prato.linux.it/~gbartolini | ICQ#129221447
|
|
From: Gilles D. <gr...@sc...> - 2003-02-22 00:17:35
|
According to Gabriele Bartolini:
> >been set to a new HtCookieInFileJar object, which doesn't get deleted.
> >Shouldn't the delete cookie_file statement be moved outside of the
> >innermost if clause, and past the end of the else clause?
>
> It is deleted later by the main function, through the base class
> HtCookieJar pointer.

OK, but that still doesn't explain the inconsistency. The
HtCookieInFileJar object is allocated whether or not the result is
non-zero, so why do you delete the object after using it when the
result is zero, but leave it up to a completely different function
otherwise? It just doesn't make sense to me, and it's clearly not
"defensive programming". If you don't need the object anymore, you
should delete it unconditionally. If you do need the HtCookieInFileJar
object for later when the result code is non-zero, that hasn't been
made clear.

> >expert on every piece of code added to 3.2. If you add a description to
> >defaults.cc yourself, doing your best to describe it, I'll gladly fix any
> >grammatical errors or ask you about ambiguities I find in the description,
> >but I don't want to have to document things I don't understand or use.
> >Ditto for testing - I can't test cookie support in htdig, because I
> >don't use them on my system.
>
> Sorry Gilles, I didn't explain myself correctly. I just wanted you to
> eventually correct my entry in the defaults file, as you know my english
> has a marked spaghetti accent.

Well, on Feb. 1 you wrote: "Then, as always, Gilles (I know I am
terrible but you should know me already!) please find me a suitable
description for the 'cookies_input_file' attribute to be put in the
defaults.cc file (and defaults.xml too!)." The words "find me a
suitable description" certainly seem to imply you were expecting me to
write it, not just proofread it, especially since you never gave me
anything to proofread. Anyway, let me know when you've written
something, and I'll have a look. (Maybe post to htdig-dev for others to
comment too.)

> I did not mean to ask you to test the cookies code, sorry about it! I was
> referring just to the description stuff (or maybe the distributed
> cookies.txt file).

I misinterpreted what you wrote on Feb. 1: "Please test it and let me
know if I can commit in the next week." I thought you were asking me,
but that e-mail was addressed to all the developers.

> I have switched to Debian now (my friends at my local LUG convinced my
> after years of RedHat) and hopefully tomorrow I'll be able to commit the code.

I have to ask, was it for practical or ideological reasons that they
convinced you to switch? ;-)

--
Gilles R. Detillieux E-mail: <gr...@sc...>
Spinal Cord Research Centre WWW: http://www.scrc.umanitoba.ca/
Dept. Physiology, U. of Manitoba Winnipeg, MB R3E 3J7 (Canada)
|
From: Gabriele B. <bar...@in...> - 2003-02-21 20:15:04
|
Ciao Gilles!

>been set to a new HtCookieInFileJar object, which doesn't get deleted.
>Shouldn't the delete cookie_file statement be moved outside of the
>innermost if clause, and past the end of the else clause?

It is deleted later by the main function, through the base class
HtCookieJar pointer.

>expert on every piece of code added to 3.2. If you add a description to
>defaults.cc yourself, doing your best to describe it, I'll gladly fix any
>grammatical errors or ask you about ambiguities I find in the description,
>but I don't want to have to document things I don't understand or use.
>Ditto for testing - I can't test cookie support in htdig, because I
>don't use them on my system.

Sorry Gilles, I didn't explain myself correctly. I just wanted you to
eventually correct my entry in the defaults file, as you know my English
has a marked spaghetti accent.

I did not mean to ask you to test the cookies code, sorry about it! I was
referring just to the description stuff (or maybe the distributed
cookies.txt file).

I have switched to Debian now (my friends at my local LUG convinced me
after years of RedHat) and hopefully tomorrow I'll be able to commit the
code.

Ciao and thanks!
-Gabriele

--
Gabriele Bartolini - Web Programmer - ht://Dig & IWA Member - ht://Check maintainer
Current Location: Prato, Tuscany, Italia
bar...@in... | http://www.prato.linux.it/~gbartolini | ICQ#129221447
|
From: Geoff H. <ghu...@ws...> - 2003-02-21 17:51:45
|
> http://www.tedmasterweb.com/htdig/
>
> Always appreciate the feedback...

I think it's looking pretty good overall. It'd be nice to have lists by
category, programs, etc. and I'm sure that's on your TODO list.

Minor nit-picky things. It might not be a bad idea to have letters (with
no links) even when there are no attributes. ABCDE ... H looks a bit
strange. :-) I'm also not thrilled about the really dark sidebar since
the visited link text ends up disappearing. But that's something that
can be tweaked later.

The basics look pretty good, and it's definitely a giant step in the
right direction, IMHO.

Thanks!
-Geoff
|
From: Geoff H. <ghu...@ws...> - 2003-02-21 17:46:53
|
On Fri, 14 Feb 2003, Neal Richter wrote:

> I'm currently working on an extension to libhtdig that will allow
> validation/testing of a given URL against these config vars:
>
> limit_urls_to
> limit_normalized
> exclude_urls
> max_hop_count
> restrict
...
> If I am missing any config vars tell me!

I'd guess they're less widely used, but:

bad_querystr
bad_extensions
valid_extensions
server_max_docs

can also limit things (as would the robots.txt and meta-robots tag).

--
-Geoff Hutchison
Williams Students Online
http://wso.williams.edu/ |
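Just to make the accept/reject semantics of these attributes concrete, here is a minimal shell sketch of the kind of filtering a URL validator along these lines might apply. The function name and the patterns are invented for illustration; this is not ht://Dig's actual code, and the real attributes take lists of patterns rather than a single regex.

```shell
#!/bin/sh
# Rough sketch of the filtering implied by the config variables above --
# NOT ht://Dig's implementation. The patterns are made-up examples.
check_url() {
    url="$1"
    limit='htdig\.org'    # cf. limit_urls_to: URL must match to be crawled
    exclude='cgi-bin'     # cf. exclude_urls / bad_querystr: URL must not match
    if ! printf '%s\n' "$url" | grep -Eq "$limit"; then
        echo "rejected: outside limit_urls_to"
    elif printf '%s\n' "$url" | grep -Eq "$exclude"; then
        echo "rejected: matches exclude_urls"
    else
        echo "accepted"
    fi
}

check_url "http://www.htdig.org/files/snapshots/"   # accepted
check_url "http://www.htdig.org/cgi-bin/test"       # rejected: matches exclude_urls
check_url "http://example.com/page.html"            # rejected: outside limit_urls_to
```

The real validator would of course also consult max_hop_count, robots.txt and the other attributes Geoff lists, but the accept-if-limited-and-not-excluded shape is the same.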
|
From: Martin L. <mar...@fi...> - 2003-02-21 11:15:38
|
Hi Gilles,
thanks for your notice about the snapshot 3.2.0b4 from Sunday night.
I'm now working with this snapshot, but have the same trouble with lex
as Jesse did, on AIX.
-
lex -L `test -f conf_lexer.lxx || echo './'`conf_lexer.lxx
0: (Warning) Unknown option L
159: (Warning) Undefined start condition <EOF
(Error) Too little core for final packing
-
Did anybody find out anything about the POSIX lex <<EOF>> problem?
I've searched some newsgroups, but without any result.
Could having flex instead of lex on the system solve this, and if so,
what do I have to configure?
Cheers
Martin
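[For what it's worth: the -L option and <<EOF>> rules are flex extensions that a plain POSIX lex doesn't understand, so installing flex and telling configure to use it should help. A hedged sketch, assuming the usual autoconf behaviour of honouring the LEX variable, which ht://Dig's configure script should follow:]

```shell
# Workaround sketch: use flex instead of the system's POSIX lex.
# Assumes an autoconf-based tree like ht://Dig's, where configure
# honours the LEX environment variable (AC_PROG_LEX).
LEX=flex
export LEX
./configure
make
```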
---------------------
FIDUCIA AG Karlsruhe/Stuttgart
Wachhausstraße 4
76227 Karlsruhe
Martin Laarz
Wissensmanager Technik
Tel.: 07 21 / 4004 - 2861 mailto: mar...@fi...
Fax: 07 21 / 4004 - 1176 Web: http://www.fiducia.de
Gilles Detillieux <gr...@sc...>@lists.sourceforge.net on
19.02.2003 19:00:32
Gesendet von: htd...@li...
An: mar...@fi... (Martin Laarz)
Kopie: htd...@li...
Thema: Re: [htdig-dev] Problem with illegal instruction with Version 3.2.0.b3 on
AIX 5.1L (RS6000)
According to Martin Laarz:
> I try to compile the 3.2.0.b3 version on an AIX 5.1L partition using gcc
> 2.9-aix51-020209
> on a RS6000 (IBM Regatta). What I need from the newer version is the phrases
> feature.
>
> During the compilation phase I got these warnings:
...
> At least I got my binaries, linking -lstdc++ and -lm against it.
>
> But when I try to use them, e.g. htfuzzy to begin to build up the first .db
> for the endings, I got an exception like "illegal instruction" and a core,
> but was not able to backtrace.
Well, at the very least, I recommend you dump 3.2.0b3, and try the latest
3.2.0b4 development snapshot from http://www.htdig.org/files/snapshots/
--
Gilles R. Detillieux E-mail: <gr...@sc...>
Spinal Cord Research Centre WWW: http://www.scrc.umanitoba.ca/
Dept. Physiology, U. of Manitoba Winnipeg, MB R3E 3J7 (Canada)
|
|
From: Lachlan A. <lh...@us...> - 2003-02-20 22:57:51
|
Greetings Geoff,

Do you have any comments on my previous post about the to-do list and
release schedule for 3.2.0b5? It would be nice to have an official
list and schedule (or to know that a schedule is premature).

Thanks :)
Lachlan

On Monday 10 February 2003 22:58, Lachlan Andrew wrote:
> This is my list of *bare essentials* for 3.2.0b5 and a tentative
> release schedule to get it out in about three weeks. Comments on
> both are invited...
>
> TO-DO LIST
> ~~~~~~~~~~
> SHOWSTOPPER:
> * Still need thorough testing of the database, with Neal's zlib
>   patch. Any ideas how we can test this thoroughly?
> TESTING:
> * httools programs: (htload a test file, check a few
>   characteristics, htdump and compare)
> * Test field-restricted searching
> DOCUMENTATION:
> * Split attrs.html into categories for faster loading.
>   (i.e., commit Ted's scripts)
> * require.html is not updated to list new features and disk space
>   requirements of 3.2.x (e.g. regex matching, database
>   compression). PRs #405280, #405281.
> * Document the list of all installed files and default
>   locations. PR #405715.
> * (Update version number from 3.2.0b4 to 3.2.0b5)
>
> TENTATIVE SCHEDULE
> ~~~~~~~~~~~~~~~~~~
> Fri 14 Feb: Last additions to the above TO-DO list
> Fri 21 Feb: Feature freeze.
>       No new features added. All available development
>       effort for testing, bug fixes, documentation and
>       configure/install issues
> Fri 28 Feb: Code freeze. Testing and documenting only.
>       Update version number throughout.
> Fri 7 Mar: Release.
>
> On Tuesday 04 February 2003 01:55, Geoff Hutchison wrote:
> > > Is there a list of tasks which *must* be completed before the
> > > release of 3.2.0b4/5?
> >
> > The STATUS file is the list.
> > I would definitely say that this zlib compression issue is a
> > "showstopper" at the moment.
> >
> > Why don't you propose a list of what you think is essential for
> > 3.2.0b5.
On Friday 21 February 2003 05:12, Gilles Detillieux wrote:
> You don't need anyone's permission to commit to cvs,
> as long as there isn't a feature freeze in place, which there isn't
> for 3.2.0b4/b5.
|