From: Joe R. J. <jj...@cl...> - 2002-08-28 05:02:01
On Tue, 27 Aug 2002, Geoff Hutchison wrote:
> Date: Tue, 27 Aug 2002 23:04:04 -0500
> From: Geoff Hutchison <ghu...@ws...>
> To: Joe R. Jah <jj...@cl...>
> Cc: htdig3-dev <htd...@li...>
> Subject: Re: [htdig-dev] Re: mifluz merge snapshot 2002-08-27
>
> Do you get this with htstat or htdump? Can you also run something like
Here they go:
--------------------------------8<--------------------------------
$ htdump
WordKeyInfo::WordKeyInfo: didn't find key description in config
WordKey::Pack: malloc returned 0
WordKey::Pack: malloc returned 0
WordKey::Pack: malloc returned 0
WordDBEncoded::ShiftValue: what = 9, (idx = 1) >= (length = 1)
Abort (core dumped)
$ gdb htdump htdump.core
GNU gdb
This GDB was configured as "i386-unknown-bsdi4.3"...
Core was generated by `htdump'.
Program terminated with signal 6, Aborted.
Reading symbols from /usr/local/htdig/3.2/lib/htdig/libhtnet-...so...done.
Reading symbols from /usr/local/htdig/3.2/lib/htdig/libcommon-...so...done.
Reading symbols from /usr/local/htdig/3.2/lib/htdig/libhtword-...so...done.
Reading symbols from /usr/local/htdig/3.2/lib/htdig/libht-...so...done.
Reading symbols from /usr/lib/libz.so...done.
Reading symbols from /usr/local/lib/libiconv.so.2...done.
Reading symbols from /usr/lib/libstdc++.so.1...done.
Reading symbols from /shlib/libm.so.0.0...done.
Reading symbols from /shlib/libgcc.so.1...done.
Reading symbols from /shlib/libc.so.2...done.
Reading symbols from /shlib/ld-bsdi.so...done.
#0 0x482c548d in kill () from /shlib/libc.so.2
(gdb) bt
#0 0x482c548d in kill () from /shlib/libc.so.2
#1 0x483509b3 in abort () from /shlib/libc.so.2
#2 0x48116544 in WordDBCompress::UncompressIBtree (this=0x8110840, inbuff=0x8116000 "#",
inbuff_length=2032, outbuff=0x81a4a28 "", outbuff_length=8192) at WordDBCompress.cc:219
#3 0x48115d93 in WordDBCompress::UncompressBtree (this=0x8110840, inbuff=0x8116000 "#",
inbuff_length=2032, outbuff=0x81a4a28 "", outbuff_length=8192) at WordDBCompress.cc:726
#4 0x48114aa2 in WordDBCompress::Uncompress (this=0x8110840, inbuff=0x8116000 "#", inbuff_length=2032,
outbuff=0x81a4a28 "", outbuff_length=8192) at WordDBCompress.cc:351
#5 0x481146af in WordDBCompress_uncompress_c (inbuff=0x8116000 "#", inbuff_length=2032,
outbuff=0x81a4a28 "", outbuff_length=8192, user_data=0x8110840) at WordDBCompress.cc:75
#6 0x8089491 in CDB___memp_cmpr_read (dbmfp=0x80aa6c0, bhp=0x81a49f0, db_io=0x8047800, niop=0x80477fc)
at mp_cmpr.c:353
#7 0x80890f2 in CDB___memp_cmpr (dbmfp=0x80aa6c0, bhp=0x81a49f0, db_io=0x8047800, flag=1, niop=0x80477fc)
at mp_cmpr.c:134
#8 0x8088717 in CDB___memp_pgread (dbmfp=0x80aa6c0, bhp=0x81a49f0, can_create=0) at mp_bh.c:214
#9 0x8062db7 in CDB_memp_fget (dbmfp=0x80aa6c0, pgnoaddr=0x8047904, flags=0, addrp=0x8047908)
at mp_fget.c:370
#10 0x8097aed in CDB___bam_search (dbc=0x80cbe00, key=0x8047aec, flags=257, stop=1, recnop=0x0,
exactp=0x80479e4) at bt_search.c:302
#11 0x809039d in __bam_c_search (dbc=0x80cbe00, key=0x8047aec, flags=30, exactp=0x80479e4)
at bt_cursor.c:1828
#12 0x808eabb in __bam_c_get (dbc=0x80cbe00, key=0x8047aec, data=0x8047ad0, flags=30, pgnop=0x8047a3c)
at bt_cursor.c:938
#13 0x80794dc in CDB___db_c_get (dbc_arg=0x80cbf00, key=0x8047aec, data=0x8047ad0, flags=30) at db_cam.c:569
#14 0x4810fe6a in WordCursorOne::WalkNextStep (this=0x80cbd00) at WordDB.h:226
#15 0x4810fd74 in WordCursorOne::WalkNext (this=0x80cbd00) at WordCursorOne.cc:269
#16 0x4810f707 in WordCursorOne::Walk (this=0x80cbd00) at WordCursorOne.cc:158
#17 0x480ce04f in HtWordList::Dump (this=0x8047c38, filename=@0x8047c28) at HtWordList.cc:173
#18 0x804beb4 in main (ac=1, av=0x8047d2c) at htdump.cc:149
#19 0x804b723 in __start ()
(gdb) q
$ htstat
htstat: Total documents: 130
WordKeyInfo::WordKeyInfo: didn't find key description in config
WordList::NotImplemented
Abort (core dumped)
$ gdb htstat htstat.core
GNU gdb
This GDB was configured as "i386-unknown-bsdi4.3"...
Core was generated by `htstat'.
Program terminated with signal 6, Aborted.
Reading symbols from /usr/local/htdig/3.2/lib/htdig/libhtnet-...so...done.
Reading symbols from /usr/local/htdig/3.2/lib/htdig/libcommon-...so...done.
Reading symbols from /usr/local/htdig/3.2/lib/htdig/libhtword-...so...done.
Reading symbols from /usr/local/htdig/3.2/lib/htdig/libht-...so...done.
Reading symbols from /usr/lib/libz.so...done.
Reading symbols from /usr/local/lib/libiconv.so.2...done.
Reading symbols from /usr/lib/libstdc++.so.1...done.
Reading symbols from /shlib/libm.so.0.0...done.
Reading symbols from /shlib/libgcc.so.1...done.
Reading symbols from /shlib/libc.so.2...done.
Reading symbols from /shlib/ld-bsdi.so...done.
#0 0x482c548d in kill () from /shlib/libc.so.2
(gdb) bt
#0 0x482c548d in kill () from /shlib/libc.so.2
#1 0x483509b3 in abort () from /shlib/libc.so.2
#2 0x481239ed in WordList::NotImplemented () at WordList.h:427
#3 0x804c03f in main (ac=1, av=0x8047d2c) at ../htword/WordList.h:202
#4 0x804b813 in __start ()
(gdb) q
--------------------------------8<--------------------------------
> "htfuzzy metaphone" to see if programs can read the resulting database?
No apparent problem:
--------------------------------8<--------------------------------
$ htfuzzy metaphone
$
--------------------------------8<--------------------------------
> (i.e. is htdig writing, but everyone else crashes?)
htdig took 4 minutes and 20 seconds to index ~300 documents:
--------------------------------8<--------------------------------
$ ll ../db
-rw-r--r-- 1 jjah www 98304 Aug 27 21:27 db.docdb
-rw-r--r-- 1 jjah www 1648618 Aug 27 21:31 db.docs
-rw-r--r-- 1 jjah www 32768 Aug 27 21:27 db.docs.index
-rw-r--r-- 1 jjah www 860160 Aug 27 21:27 db.excerpts
-rw-r--r-- 1 jjah www 0 Aug 27 21:31 db.worddump
-rw-r--r-- 1 jjah www 1609728 Aug 27 21:27 db.words.db
--------------------------------8<--------------------------------
> Unfortunately, the key features to improve indexing performance are also
> really buggy. I'm not sure if it's in mifluz yet, or the interface to
> htdig.
You are right about indexing performance; htdig-3.1.6 takes ~10 minutes on
my system to index ~10500 documents ;)
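To put the two timings side by side, here is a rough back-of-the-envelope calculation. It is only a sketch: the inputs are the approximate figures quoted in this message (4 min 20 s for ~300 documents with the snapshot, ~10 min for ~10500 documents with htdig-3.1.6), and the function name is made up for illustration.

```python
def seconds_per_doc(total_seconds, docs):
    """Average indexing time per document, in seconds."""
    return total_seconds / docs

# Approximate figures from this message, not precise measurements.
snapshot = seconds_per_doc(4 * 60 + 20, 300)   # mifluz merge snapshot
release = seconds_per_doc(10 * 60, 10500)      # htdig-3.1.6

# How many times slower the snapshot is per document.
slowdown = snapshot / release

print(f"snapshot: {snapshot:.3f} s/doc")
print(f"3.1.6:    {release:.3f} s/doc")
print(f"slowdown: {slowdown:.1f}x")
```

With these numbers the snapshot comes out at roughly 15 times slower per document than 3.1.6, bearing in mind that the two runs did not index the same document set.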
Regards,
Joe
--
_/ _/_/_/ _/ ____________ __o
_/ _/ _/ _/ ______________ _-\<,_
_/ _/ _/_/_/ _/ _/ ......(_)/ (_)
_/_/ oe _/ _/. _/_/ ah jj...@cl...