From: Bryan W. <br...@ar...> - 2005-12-10 23:04:20
|
I have been running CLucene 0.8.11 for about 18 months. Every night I create a new index of my entire database. I am now experimenting with upgrading to 0.9.10. The test box is running Fedora Core 4. I built CLucene with ./configure --enable-ascii.

The old version would produce a directory that looked like this:
-rw-rw-rw- 1 bryan arcamax        4 Dec 10 06:38 deletable
-rw-rw-rw- 1 bryan arcamax   679432 Dec 10 06:38 _g6i5.f1
-rw-rw-rw- 1 bryan arcamax 35636466 Dec 10 06:37 _g6i5.fdt
-rw-rw-rw- 1 bryan arcamax  5435456 Dec 10 06:37 _g6i5.fdx
-rw-rw-rw- 1 bryan arcamax       33 Dec 10 06:37 _g6i5.fnm
-rw-rw-rw- 1 bryan arcamax 33663041 Dec 10 06:38 _g6i5.frq
-rw-rw-rw- 1 bryan arcamax 40747370 Dec 10 06:38 _g6i5.prx
-rw-rw-rw- 1 bryan arcamax   101416 Dec 10 06:38 _g6i5.tii
-rw-rw-rw- 1 bryan arcamax  7115198 Dec 10 06:38 _g6i5.tis
-rw-rw-rw- 1 bryan arcamax       18 Dec 10 06:38 segments
About 120MB in total size.

The new version is producing a directory with:
-rwxrwxr-x 1 bryan bryan  4 Dec 10 17:10 deletable
-rwxrwxr-x 1 bryan bryan 30 Dec 10 17:10 segments

and 75494 files that look roughly like this:
-rwxrwxr-x 1 bryan bryan   2108 Dec 10 16:38 _1007.cfs
-rwxrwxr-x 1 bryan bryan  15890 Dec 10 16:38 _1008.cfs
-rwxrwxr-x 1 bryan bryan 137531 Dec 10 16:38 _1009.cfs
-rwxrwxr-x 1 bryan bryan   2117 Dec 10 16:38 _100k.cfs
-rwxrwxr-x 1 bryan bryan   2272 Dec 10 16:38 _100v.cfs
-rwxrwxr-x 1 bryan bryan   2621 Dec 10 16:38 _1016.cfs
-rwxrwxr-x 1 bryan bryan   2631 Dec 10 16:38 _101h.cfs
-rwxrwxr-x 1 bryan bryan   2641 Dec 10 16:38 _101s.cfs
-rwxrwxr-x 1 bryan bryan   2458 Dec 10 16:38 _1023.cfs
-rwxrwxr-x 1 bryan bryan   2381 Dec 10 16:38 _102.cfs
The total size is about 1196MB.

I have not actually tried using the new index yet.

Either something is very wrong or the new version puts a much larger load on the file system. On Linux the ext3 file system does not handle very large directories (over 1000 entries) very efficiently.

Most of the changes I made to my source code were changing pointers to references and vice versa. I suspect I may have a problem with not freeing memory objects or something like that.

I am operating with these using directives:
using namespace lucene::analysis;
using namespace lucene::util;
using namespace lucene::queryParser;
using namespace lucene::document;
using namespace lucene::search;
using namespace lucene::index;
using namespace lucene::store;

At startup I do:
Analyzer = new lucene::analysis::WhitespaceAnalyzer();
Writer = new IndexWriter(
    FSDirectory::getDirectory(NewDir.c_str(),true),
    Analyzer, true);

The core of my indexing loop looks like this:
Document doc;
doc.add(*Field::UnIndexed(_T("doctype"),ts.TableName.c_str()));
doc.add(*Field::UnIndexed(_T("dockey"), key));
doc.add(*Field::UnIndexed(_T("label"), label.c_str()));
doc.add(*Field::UnStored (_T("body" ), body.c_str()));
Writer->addDocument(&doc);

Upon completion I do:
Writer->optimize();
Writer->close();
delete Writer;
delete Analyzer;

--
Bryan White
Mal: "Well, my days of not taking you seriously are certainly coming to a middle."
Simon: "This must be what going mad feels like."
|
From: Ben v. K. <bva...@gm...> - 2005-12-11 20:04:59
|
From the look of it, the old segments aren't being deleted. There seem to be lots of old segment files that aren't deleted.

Each cfs is a compound file containing all the .f1, .fdt, .fdx, etc. files. Is there any reason why they would not have been deleted? Is there something about the directory security? Or was something accessing the index directory while it was being written?

ben
|
From: Bryan W. <br...@ar...> - 2005-12-12 01:49:37
|
Ben van Klinken wrote:
> From the look of it, the old segments aren't being deleted. There
> seem to be lots of old segment files that aren't deleted.
>
> Each cfs is a compound file containing all the .f1, .fdt, .fdx, etc.
> files. Is there any reason why they would not have been deleted? Is
> there something about the directory security? Or was something
> accessing the index directory while it was being written?

This is a new index creation. The directory is created by the IndexWriter constructor. No other process is accessing the index. The directory is created with a temp name and after completion (Writer->close()) the directory is renamed. Furthermore this is a test box and it has nothing running that would be accessing the index.

I have verified that the indexes are readable. Also a valgrind run of the index creation process did not show much of interest:
==2151== Memcheck, a memory error detector for x86-linux.
==2151== Copyright (C) 2002-2005, and GNU GPL'd, by Julian Seward et al.
==2151== Using valgrind-2.4.0, a program supervision framework for x86-linux.
==2151== Copyright (C) 2000-2005, and GNU GPL'd, by Julian Seward et al.
==2151== For more details, rerun with: -v
==2151==
==2151== Conditional jump or move depends on uninitialised value(s)
==2151==    at 0x1BBEBBE0: strstr (in /lib/libc-2.3.5.so)
==2151==    by 0x1BCAF6E7: __pthread_initialize_minimal (in /lib/libpthread-2.3.5.so)
==2151==    by 0x1BCAF297: (within /lib/libpthread-2.3.5.so)
==2151==    by 0x1BCAEE7F: (within /lib/libpthread-2.3.5.so)
==2151==    by 0x1B8F1DCA: call_init (in /lib/ld-2.3.5.so)
==2151==    by 0x1B8F1EEC: _dl_init (in /lib/ld-2.3.5.so)
==2151==    by 0x1B8E47CE: (within /lib/ld-2.3.5.so)
==2151==
==2151== Conditional jump or move depends on uninitialised value(s)
==2151==    at 0x1BBEBBE4: strstr (in /lib/libc-2.3.5.so)
==2151==    by 0x1BCAF6E7: __pthread_initialize_minimal (in /lib/libpthread-2.3.5.so)
==2151==    by 0x1BCAF297: (within /lib/libpthread-2.3.5.so)
==2151==    by 0x1BCAEE7F: (within /lib/libpthread-2.3.5.so)
==2151==    by 0x1B8F1DCA: call_init (in /lib/ld-2.3.5.so)
==2151==    by 0x1B8F1EEC: _dl_init (in /lib/ld-2.3.5.so)
==2151==    by 0x1B8E47CE: (within /lib/ld-2.3.5.so)
sh: /bin/chmod: Argument list too long
searchengine.ProcessDump: newsstory count:35591 maxid:37292
==2151==
==2151== ERROR SUMMARY: 2 errors from 2 contexts (suppressed: 49 from 1)
==2151== malloc/free: in use at exit: 603200 bytes in 81 blocks.
==2151== malloc/free: 43026643 allocs, 43026562 frees, 2968442847 bytes allocated.
==2151== For counts of detected errors, rerun with: -v
==2151== searching for pointers to 81 not-freed blocks.
==2151== checked 1082428 bytes.
==2151==
==2151== LEAK SUMMARY:
==2151==    definitely lost: 0 bytes in 0 blocks.
==2151==      possibly lost: 0 bytes in 0 blocks.
==2151==    still reachable: 603200 bytes in 81 blocks.
==2151==         suppressed: 0 bytes in 0 blocks.
==2151== Reachable blocks (those to which a pointer was found) are not shown.
==2151== To see them, rerun with: --show-reachable=yes

The chmod error occurs when the process attempts to chmod 666 the contents of the directory and fails because there are too many files.

I don't know what the two strstr errors are. They seem to have something to do with pthreads, but this is not a multithreaded program. I doubt they are related to the problem.

The valgrind output is from a much smaller run. In this case the resulting index directory was 163MB in 3957 files. The old working version produces a 19MB index directory.

--
Bryan White
|
From: Ben v. K. <bva...@gm...> - 2005-12-12 11:24:43
|
I'm sure that the index sizes should be the same. It seems as though the old segments aren't being deleted. Have a look at the deletable file. It normally contains the segments which should be deleted but for some reason couldn't be. I suspect that this will have a list of all the segments except the latest one. Also look at the segments file, which contains the currently used segments. Since you optimized the index, there should only be one segment name in it. Can you verify these two points for me?

thanks
ben
|
From: Bryan W. <br...@ar...> - 2005-12-12 11:52:37
|
Ben van Klinken wrote:
> I'm sure that the index sizes should be the same. It seems as though
> the old segments aren't being deleted. Have a look at the deletable
> file, it normally contains the segments which should be deleted but
> for some reason couldn't be - I suspect that this will have a list of
> all the segments except the latest one. Also look at the segments
> file, it contains the currently used segments. Since you optimized the
> index, there should only be one segment name in this. Can you verify
> these two points for me?

deletable contains 4 bytes, all zero.
segments:
0000000 ff ff ff ff 00 00 00 00 00 01 26 e7 00 0b 84 ee
0000020 00 00 00 01 05 5f 67 36 69 35 00 0a 5e 08

In the shorter run the deletable file is the same.
segments:
0000000 ff ff ff ff 00 00 00 00 00 00 0f 74 00 00 9a 79
0000020 00 00 00 01 04 5f 75 69 67 00 00 8b 06

--
Bryan White
|
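The two segments dumps can be decoded by hand. What follows is only a minimal sketch. It assumes the file uses the layout documented for Java Lucene 1.4 indexes (a 32-bit format marker of -1, a 64-bit version, a 32-bit name counter, a 32-bit segment count, then for each segment a length-prefixed name and a 32-bit document count, all big-endian) and that CLucene 0.9 writes the same layout. The struct and function names are made up for illustration and are not CLucene API.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// One entry of the segments file (illustrative type, not CLucene API).
struct SegmentEntry {
    std::string name;           // e.g. "_uig"
    std::int32_t docCount = 0;  // documents held by that segment
};

// Decoded header plus entries (illustrative type, not CLucene API).
struct SegmentsFile {
    std::int32_t format = 0;       // assumed to be -1 for this layout
    std::int64_t version = 0;      // assumed to grow with each rewrite
    std::int32_t nameCounter = 0;  // next segment number to hand out
    std::vector<SegmentEntry> segments;
};

// Read `width` bytes (at most 8) as a big-endian value, advancing `pos`.
static std::uint64_t readBigEndian(const std::vector<std::uint8_t>& bytes,
                                   std::size_t& pos, std::size_t width) {
    assert(width <= 8 && pos + width <= bytes.size());
    std::uint64_t value = 0;
    for (std::size_t i = 0; i < width; ++i) {
        value = (value << 8) | bytes[pos++];
    }
    return value;
}

static std::int32_t readInt32(const std::vector<std::uint8_t>& bytes,
                              std::size_t& pos) {
    return static_cast<std::int32_t>(
        static_cast<std::uint32_t>(readBigEndian(bytes, pos, 4)));
}

// Variable-length integer: 7 data bits per byte, low-order group first,
// high bit set on every byte except the last.
static std::uint32_t readVInt(const std::vector<std::uint8_t>& bytes,
                              std::size_t& pos) {
    std::uint32_t value = 0;
    int shift = 0;
    while (true) {
        assert(pos < bytes.size() && shift < 32);
        const std::uint8_t b = bytes[pos++];
        value |= static_cast<std::uint32_t>(b & 0x7f) << shift;
        if ((b & 0x80) == 0) return value;
        shift += 7;
    }
}

// Decode a whole segments file under the layout assumption stated above.
SegmentsFile decodeSegments(const std::vector<std::uint8_t>& bytes) {
    std::size_t pos = 0;
    SegmentsFile out;
    out.format = readInt32(bytes, pos);
    out.version = static_cast<std::int64_t>(readBigEndian(bytes, pos, 8));
    out.nameCounter = readInt32(bytes, pos);
    const std::int32_t count = readInt32(bytes, pos);
    for (std::int32_t i = 0; i < count; ++i) {
        SegmentEntry entry;
        const std::uint32_t length = readVInt(bytes, pos);
        assert(pos + length <= bytes.size());
        // Assumes ASCII segment names, so one byte per character.
        for (std::uint32_t c = 0; c < length; ++c) {
            entry.name.push_back(static_cast<char>(bytes[pos++]));
        }
        entry.docCount = readInt32(bytes, pos);
        out.segments.push_back(entry);
    }
    return out;
}
```

Fed the 29 bytes of the shorter run, this gives a single segment _uig with 35590 documents and a name counter of 39545. The 30-byte dump decodes the same way to a single segment _g6i5 with 679432 documents. Under these assumptions each index references exactly one live segment, which is what an optimized index should contain.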
From: Ben v. K. <bva...@gm...> - 2005-12-15 17:39:15
|
From the looks of the segments file, only one segment is being used... all the others in the directory shouldn't be there... So we have to find out why they are not being deleted.

I'm just checking in some changes to indexwriter.* and clconfig.h relating to _CL_DEBUG_INFO. I have just modified the info streams to be more usable. Can you try this and see what kind of information you get? Define _CL_DEBUG_INFO at around line 179 of clconfig.h. If you see something like '... Will re-try later', then it is having trouble deleting old segments.

ben
|
From: Bryan W. <br...@ar...> - 2005-12-16 11:45:51
|
Ben van Klinken wrote:
> From the looks of the segments file, only one segment is being used...
> all the others in the directory shouldn't be there... So we have to
> find out why they are not being deleted.
>
> I'm just checking in some changes to indexwriter.* and clconfig.h
> relating to _CL_DEBUG_INFO. I have just modified the info streams to
> be more usable. Can you try and use this and see what kind of
> information you get. Define the _CL_DEBUG_INFO on ~line 179 of
> clconfig.h. If you see something like '... Will re-try later', then it
> is having trouble deleting old segments.

Sorry, I am not well versed in using CVS. I have been trying to check things out and build it from the info on:
http://sourceforge.net/cvs/?group_id=80013

Following those directions I am not certain I got a current version. After checking out all the individual files I ran: bootstrap, configure, and make. All completed without error but at no time in that process did I find a clconfig.h file anywhere in the tree.

--
Bryan White
|
From: Ben v. K. <bva...@gm...> - 2005-12-16 12:03:45
|
You have to check out using the tag version0_9branch

I think you use

-r tagname

when using the cvs command line

ben
|
From: Ben v. K. <bva...@gm...> - 2005-12-16 12:06:18
|
You have to check out using the version0_9branch tag

I think you use

-r version0_9branch

with the cvs command line

ben
|
From: Bryan W. <br...@ar...> - 2005-12-16 14:31:05
|
Ben van Klinken wrote:
> You have to check out using the version0_9branch tag
>
> I think you use
>
> -r version0_9branch
>
> with the cvs command line

I believe I successfully checked out the version0_9branch.
I commented out the ifdefs around the line:
#define _CL__CND_DEBUG
Then did:
autogen.sh
configure --enable-ascii --enable-cnddebug --enable-debug=yes
make

Finally I rebuilt my program against it and ran it.

I am not seeing any debug output. Where should I look for it?

--
Bryan White, ArcaMax Publishing Inc.

I never look back, darling. it distracts from the now. - Edna Mode
|
From: Ben v. K. <bva...@gm...> - 2005-12-16 14:35:31
|
See clconfig.h line 179

//define this to print out lots of information about merges, etc
//requires __CL__CND_DEBUG to be defined
#define _CL_DEBUG_INFO stdout

This should print debug info straight onto the console (probably lots of it).
|
From: Bryan W. <br...@ar...> - 2005-12-16 14:45:39
|
Ben van Klinken wrote:
> See clconfig.h line 179
>
> //define this to print out lots of information about merges, etc
> //requires __CL__CND_DEBUG to be defined
> #define _CL_DEBUG_INFO stdout
>
> should print out debug info straight onto the console (probably lots of it)

FYI: I don't see a clconfig.h file anywhere in the tree. I do see a CLConfig.h file and that is where I altered the ifdefs around the _CL__CND_DEBUG.

I don't see the string _CL_DEBUG_INFO anywhere in that file or even anywhere in the tree. This leads me to believe I still don't have the correct version of things.

--
Bryan White, ArcaMax Publishing Inc.

I never look back, darling. it distracts from the now. - Edna Mode
|
From: Ben v. K. <bva...@gm...> - 2005-12-16 14:48:53
|
I think you do have the latest, because the __CL prefix only occurred in 0.9

I'll send you the affected files directly to your email

ben
|
From: Bryan W. <br...@ar...> - 2005-12-16 15:02:50
|
Ben van Klinken wrote:
> I think you do have the latest, because the __CL prefix only occurred in 0.9
>
> I'll send you the affected files directly to your email

I received the files, copied them into place and ran make. I got these errors:
../src/CLucene/index/IndexWriter.cpp: In member function 'void lucene::index::IndexWriter::_IndexWriter(bool)':
../src/CLucene/index/IndexWriter.cpp:110: error: 'class lucene::store::Directory' has no member named 'THIS_LOCK'
../src/CLucene/index/IndexWriter.cpp: In member function 'void lucene::index::IndexWriter::mergeSegments(uint32_t)':
../src/CLucene/index/IndexWriter.cpp:415: error: 'class lucene::store::Directory' has no member named 'THIS_LOCK'
../src/CLucene/index/IndexWriter.cpp: In member function 'void lucene::index::IndexWriter::addIndexes(lucene::index::IndexReader**)':
../src/CLucene/index/IndexWriter.cpp:578: error: 'class lucene::store::Directory' has no member named 'THIS_LOCK'

--
Bryan White, ArcaMax Publishing Inc.

I never look back, darling. it distracts from the now. - Edna Mode
|
From: Ben v. K. <bva...@gm...> - 2005-12-16 15:10:16
|
Hmm... some new changes to the code.

Try configuring with multi-threading disabled, or define _CL_DISABLE_MULTITHREADING in CLConfig.h

ben
|
From: Bryan W. <br...@ar...> - 2005-12-16 15:31:21
|
Ben van Klinken wrote:
> Hmm... some new changes to the code.
>
> try configuring with multi-threading disabled, or define
> _CL_DISABLE_MULTITHREADING in CLConfig.h

Ok, I am getting debug output now. The debug output file is 750KB.

It starts out like:
------------
merging segments
_0 (1 docs)
_1 (1 docs)
_2 (1 docs)
_3 (1 docs)
_4 (1 docs)
_5 (1 docs)
_6 (1 docs)
_7 (1 docs)
_8 (1 docs)
_9 (1 docs)

into _a (10 docs)
merging segments
_b (1 docs)
_c (1 docs)
_d (1 docs)
_e (1 docs)
_f (1 docs)
_g (1 docs)
_h (1 docs)
_i (1 docs)
_j (1 docs)
_k (1 docs)

into _l (10 docs)
merging segments
_m (1 docs)
_n (1 docs)
_o (1 docs)
_p (1 docs)
_q (1 docs)
_r (1 docs)
_s (1 docs)
_t (1 docs)
_u (1 docs)
_v (1 docs)
---------

And finishes like:
---------
into _ui2 (10 docs)
merging segments
_ui3 (1 docs)
_ui4 (1 docs)
_ui5 (1 docs)
_ui6 (1 docs)
_ui7 (1 docs)
_ui8 (1 docs)
_ui9 (1 docs)
_uia (1 docs)
_uib (1 docs)
_uic (1 docs)

into _uid (10 docs)
merging segments
_ufm (100 docs)
_ufx (10 docs)
_ug8 (10 docs)
_ugj (10 docs)
_ugu (10 docs)
_uh5 (10 docs)
_uhg (10 docs)
_uhr (10 docs)
_ui2 (10 docs)
_uid (10 docs)

into _uie (190 docs)
merging segments
_qkr (1000 docs)
_rfm (1000 docs)
_sah (1000 docs)
_t5c (1000 docs)
_u07 (1000 docs)
_u3a (100 docs)
_u6d (100 docs)
_u9g (100 docs)
_ucj (100 docs)
_uie (190 docs)

into _uif (5590 docs)
merging segments
_8km (10000 docs)
_h59 (10000 docs)
_ppw (10000 docs)
_uif (5590 docs)

into _uig (35590 docs)
---------

--
Bryan White, ArcaMax Publishing Inc.

I never look back, darling. it distracts from the now. - Edna Mode
|
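The segment names in this log carry enough information to cross-check the file counts reported earlier in the thread. The following is a small sketch, not a statement about CLucene internals. It assumes that a name is an underscore followed by a base-36 counter (which fits the log: _a, the eleventh name, follows _0 to _9) and that every added document and every merge consumes exactly one name. Both functions are hypothetical helpers, not CLucene API.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

// Convert a segment name such as "_uig" to its number, assuming the part
// after the underscore is a lowercase base-36 counter.
std::int64_t segmentNumber(const std::string& name) {
    assert(name.size() > 1 && name[0] == '_');
    std::int64_t number = 0;
    for (std::size_t i = 1; i < name.size(); ++i) {
        const char c = name[i];
        int digit = 0;
        if (c >= '0' && c <= '9') {
            digit = c - '0';
        } else {
            assert(c >= 'a' && c <= 'z');
            digit = c - 'a' + 10;
        }
        number = number * 36 + digit;
    }
    return number;
}

// Number of merged segments created during a run, assuming every document
// first became its own one-document segment and every merge produced one
// new name. `lastName` is the final (highest-numbered) segment.
std::int64_t mergedSegmentCount(const std::string& lastName,
                                std::int64_t docCount) {
    const std::int64_t namesHandedOut = segmentNumber(lastName) + 1;
    return namesHandedOut - docCount;
}
```

For the shorter run the last name is _uig and the final segment holds 35590 documents, which gives 3955 merges. Adding the deletable and segments files makes 3957, the number of files reported for that run. For the full run the segments dump names _g6i5 and its last four bytes (00 0a 5e 08) read as 679432 documents, which gives 75494, the number of .cfs files listed in the first message. This suggests, without proving it, that each merge wrote a .cfs file that was never removed.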
From: Ben v. K. <bva...@gm...> - 2005-12-17 01:39:01
|
Do you see anywhere in the file which mentions 'will re-try later'?

ben
|
From: Bryan W. <br...@ar...> - 2005-12-17 13:11:23
|
Ben van Klinken wrote:
> Do you see anywhere in the file which mentions 'will re-try later'?

No. In fact there is nothing in the file that does not match the basic
pattern of:
----
merging segments
 _0 (1 docs)
 _1 (1 docs)
 _2 (1 docs)
 _3 (1 docs)
 _4 (1 docs)
 _5 (1 docs)
 _6 (1 docs)
 _7 (1 docs)
 _8 (1 docs)
 _9 (1 docs)

into _a (10 docs)
----

All the occurrences of the above pattern except the last are the same size.

Is it more likely that this is a recently introduced problem, or some problem
in my configuration / usage? I could load up an earlier version than
0.9.10 and try it.

BTW: I appreciate the time you have spent on this. Debugging via email
is hard.

--
Bryan White
|
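With a 750KB log, the check described above is easier to do mechanically than by eye. A minimal sketch, assuming the log consists only of the three line shapes shown in the excerpt ("merging segments", a segment line, and an "into" line) plus blank lines. The phrase to look for is passed in as a parameter, since its exact spelling in the debug output is not confirmed here; `LogScan` and `scanMergeLog` are hypothetical names.

```cpp
#include <cstddef>
#include <istream>
#include <regex>
#include <string>
#include <vector>

struct LogScan {
    std::size_t needleHits = 0;           // lines containing the phrase
    std::vector<std::string> unexpected;  // lines outside the merge pattern
};

// Reads the debug log line by line, counts lines containing `needle`, and
// collects every line that does not fit the expected merge pattern.
inline LogScan scanMergeLog(std::istream& in, const std::string& needle) {
    // Accepts blank lines, "merging segments", " _0 (1 docs)" and
    // "into _a (10 docs)". These shapes are taken from the excerpt only.
    static const std::regex expected(
        R"(^\s*(merging segments|(into )?_[0-9a-z]+ \(\d+ docs\))?\s*$)");
    LogScan result;
    std::string line;
    while (std::getline(in, line)) {
        if (!needle.empty() && line.find(needle) != std::string::npos) {
            ++result.needleHits;
        }
        if (!std::regex_match(line, expected)) {
            result.unexpected.push_back(line);
        }
    }
    return result;
}
```

Opening the log with `std::ifstream` and passing a short fragment such as "try later" as the needle avoids depending on the exact wording; any line in `unexpected` would be worth reading in full.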
From: Ben v. K. <bva...@gm...> - 2005-12-17 22:59:15
|
Sorry I can't help more...

What happens when you run the demo? Does it have a maximum of about 20
files in the index directory at any time?

ben
|
From: Bryan W. <br...@ar...> - 2005-12-18 13:56:44
|
Ben van Klinken wrote:
> Sorry I can't help more...
> what happens when you run the demo? does it have a maximum of about 20
> files in the index directory at any time?

Running cl_demo does produce an index with a large number of files. It
was over 1200 when I killed it. I am going to try installing some older
versions to see if I can identify when this behavior started.

--
Bryan White
|
From: Ben v. K. <bva...@gm...> - 2005-12-18 18:32:05
|
That's strange, because I have never had this problem.

What are the security settings of the index folder? Could delete access
somehow be denied?

ben
|
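Whether deletes are being refused can be tested directly, independent of CLucene. On Linux, removing a file depends mainly on write and search permission on the containing directory, not on the file's own mode bits. The sketch below, again assuming C++17 `std::filesystem`, creates a probe file in the directory and tries to remove it; the function name and the probe file name are arbitrary.

```cpp
#include <filesystem>
#include <fstream>
#include <system_error>

// Tries to create and then delete a small probe file inside `dir`.
// Returns true only if both steps succeed.
inline bool canCreateAndDelete(const std::filesystem::path& dir) {
    const std::filesystem::path probe = dir / "_delete_probe.tmp";
    {
        std::ofstream out(probe);
        if (!out) return false;  // could not create a file at all
        out << "probe";
    }  // stream is closed here, before the delete attempt
    std::error_code ec;
    const bool removed = std::filesystem::remove(probe, ec);
    return removed && !ec;
}
```

It should be run as the same user that runs the indexer. A `true` result would suggest that plain permissions are not the cause, and that the leftover files come from the library not attempting, or not completing, the deletes.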