From: Itamar Syn-H. <it...@co...> - 2010-12-24 11:24:59
|
This is now merged to master. Thanks for reporting!

Itamar.

On 5/12/2010 7:43 PM, Matt Ronge wrote:
> Shoot, I wish I had noticed this earlier:
> http://clucene.git.sourceforge.net/git/gitweb.cgi?p=clucene/clucene;a=commit;h=de5695332badddc264c3e187350463d9d6ee4a8a
>
> Looks like someone else had already found and fixed the bug. I wish I had
> found that; it would have saved me a lot of time. Oh well.
>
> On Dec 4, 2010, at 1:40 PM, Matt Ronge wrote:
>
>> I've been working off the head of CLucene (which has been great!) and I
>> ran into a memory smasher.
>>
>> I had strange issues where, after some queries, the search results would
>> get more and more incorrect until the app crashed. After much debugging
>> I was able to confirm that this only occurred for queries whose lengths
>> were multiples of 8.
>>
>> After even more debugging I found that the KeywordTokenizer (which I was
>> using for my queries) was allocating term buffers whose sizes were
>> multiples of 8 (suspicious!). It turns out that if the token length is a
>> multiple of 8 and the KeywordTokenizer attempts to null-terminate the
>> string, it writes off the end of the array, causing memory corruption.
>> Normally you don't notice, because the corruption is silent and only
>> happens when the token length is a multiple of 8.
>>
>> To fix this I make sure to add room for the null terminator if the
>> buffer is already full. Here is my patch:
>>
>> diff --git a/src/core/CLucene/analysis/Analyzers.cpp b/src/core/CLucene/analysis/Analyzers.cpp
>> index 0c34a60..39fec43 100644
>> --- a/src/core/CLucene/analysis/Analyzers.cpp
>> +++ b/src/core/CLucene/analysis/Analyzers.cpp
>> @@ -556,6 +556,9 @@ Token* KeywordTokenizer::next(Token* token){
>>      if ( termBuffer == NULL ){
>>        termBuffer=token->resizeTermBuffer(token->bufferLength() + 8);
>>      }
>> +    if (upto == token->bufferLength()) {
>> +      termBuffer = token->resizeTermBuffer(token->bufferLength() + 1);
>> +    }
>>      termBuffer[upto]=0;
>>      token->setTermLength(upto);
>>      return token;
>>
>> Let me know if I should open a bug for this.
>>
>> Thanks again for CLucene!
>>
>> --
>> Matt Ronge
>> Central Atomics
>> Makers of Rocketbox
>> http://www.getrocketbox.com
>> mr...@ce...
>>
>> _______________________________________________
>> CLucene-developers mailing list
>> CLu...@li...
>> https://lists.sourceforge.net/lists/listinfo/clucene-developers |
From: Itamar Syn-H. <it...@co...> - 2010-12-24 11:23:14
|
Hi Bill,

What kind of sort object are you passing? If it's your own brew, perhaps it is buggy?

Itamar.

On 10/12/2010 10:53 PM, Miller, Bill (QuickWire) wrote:
> Hi all, I've been implementing MultiSearcher and have a problem that may
> be more of a 'Lucene conceptual' thing than a bug.
>
> I'm running a fairly new 2_3_2 git under Windows (VS 2010).
>
> My problem is that when I pass a sort to MultiSearcher::search() I
> receive many duplicate hits (note that all doc ids are unique across
> both indexes I'm testing with...).
>
> However, if I only ask for 100 hits max, there are no duplicates.
>
> Also, if I do not pass a sort, there are no duplicates.
>
> I find it rather hard to follow, but I'm guessing that the 'good' 100
> docs come from the initial search, and the duplicates are caused by
> Hits::getMoreDocs() eventually calling MultiSearcher::search() again and
> for some reason adding the same hits each time.
>
> Should I be expecting this behavior?
>
> Thanks,
>
> Bill
>
> _______________________________________________
> CLucene-developers mailing list
> CLu...@li...
> https://lists.sourceforge.net/lists/listinfo/clucene-developers |
From: Itamar Syn-H. <it...@co...> - 2010-12-24 11:08:12
|
Hi Pini,

Thanks for the fix. I checked in a slightly different fix taken from our bug tracker. It also had a few tests. Let us know if there's anything else left to fix, and feel free to add your own tests.

Itamar.

On 23/12/2010 7:41 PM, pini shamgar wrote:
> Hi,
>
> I found a bug in the StandardAnalyzer. Because of it I could not search
> for and find an IP address.
>
> The bug is in the reuse of the StandardTokenizer object.
>
> Below is a fix. It touches 3 files:
>
> 1) In core\CLucene\analysis\standard\StandardTokenizer.cpp:
>
> void StandardTokenizer::reset(Reader* _input) {
>   rdPos = -1;
>   tokenStart = -1;
>   this->input = _input;
>   if (rd->input==NULL) {
>     rd->reset(_input->__asBufferedReader());
>   }
> }
>
> 2) In core\CLucene\util\_FastCharStream.h, add the method:
>
> void reset(BufferedReader* reader);
>
> 3) In core\CLucene\util\FastCharStream.cpp:
>
> FastCharStream::FastCharStream(BufferedReader* reader)
> {
>   reset(reader);
>   input->setMinBufSize(maxRewindSize);
> }
>
> FastCharStream::~FastCharStream(){
> }
>
> void FastCharStream::reset(BufferedReader* reader)
> {
>   pos = 0;
>   rewindPos = 0;
>   resetPos = 0;
>   col = 1;
>   line = 1;
>   input = reader;
> }
>
> Cheers
>
> Pini Shamgar
>
> _______________________________________________
> CLucene-developers mailing list
> CLu...@li...
> https://lists.sourceforge.net/lists/listinfo/clucene-developers |
From: pini s. <pi...@ya...> - 2010-12-23 17:41:16
|
Hi,

I found a bug in the StandardAnalyzer. Because of it I could not search for and find an IP address.

The bug is in the reuse of the StandardTokenizer object.

Below is a fix. It touches 3 files:

1) In core\CLucene\analysis\standard\StandardTokenizer.cpp:

void StandardTokenizer::reset(Reader* _input) {
  rdPos = -1;
  tokenStart = -1;
  this->input = _input;
  if (rd->input==NULL) {
    rd->reset(_input->__asBufferedReader());
  }
}

2) In core\CLucene\util\_FastCharStream.h, add the method:

void reset(BufferedReader* reader);

3) In core\CLucene\util\FastCharStream.cpp:

FastCharStream::FastCharStream(BufferedReader* reader)
{
  reset(reader);
  input->setMinBufSize(maxRewindSize);
}

FastCharStream::~FastCharStream(){
}

void FastCharStream::reset(BufferedReader* reader)
{
  pos = 0;
  rewindPos = 0;
  resetPos = 0;
  col = 1;
  line = 1;
  input = reader;
}

Cheers

Pini Shamgar |
From: Ben v. K. <bva...@gm...> - 2010-12-22 05:19:15
|
Can you rebuild and recreate with cmake -DCMAKE_BUILD_TYPE="Debug"? Then the stack trace will be more complete.

ben

On Tue, Dec 21, 2010 at 4:09 PM, Pankaj Jangid <pan...@gm...> wrote:
> Guys,
>
> Any clue on this? I am interested in fixing this issue. Can someone
> guide me on this?
>
> [...]
>
> --
> Regards
> Pankaj |
From: Veit J. <nun...@go...> - 2010-12-21 09:51:17
|
2010/12/21 Veit Jahns <nun...@go...>:
> [...] (Sun, I guess).

Scratch that! It was clearly stated in the subject.

Veit |
From: Veit J. <nun...@go...> - 2010-12-21 09:48:30
|
Just an idea regarding:

>> make cl_demo gives this linking error.
>>
>> Linking CXX executable ../../bin/cl_demo
>> Undefined                       first referenced
>>  symbol                             in file
>> lucene::document::Field::Field(const wchar_t*,lucene::util::ValueArray<unsigned char>*,int,const bool)
>> ../../bin/libclucene-core.so.0.9.23.0

There is a difference between the header file and the implementation file. The header says (line 142):

Field(const TCHAR* name, CL_NS(util)::ValueArray<uint8_t>* data, int _config, const bool duplicateValue = true)

And in the implementation the "const" before duplicateValue is missing (line 61):

Field(const TCHAR* Name, ValueArray<uint8_t>* Value, int config, bool duplicateValue)

Maybe this makes a difference on your platform (Sun, I guess).

>> std::string lucene::util::Misc::toString(const unsigned)
>> ../../bin/libclucene-core.so.0.9.23.0

The same.

>> lucene::index::IndexWriter::IndexWriter(lucene::store::Directory*,lucene::analysis::Analyzer*,const bool,const bool)
>> ../../bin/libclucene-core.so.0.9.23.0

The same.

>> lucene::store::FSDirectory* lucene::store::FSDirectory::getDirectory(const char*,const bool,lucene::store::LockFactory*)
>> ../../bin/libclucene-core.so.0.9.23.0

The same.

>> lucene::index::IndexWriter::IndexWriter(const char*,lucene::analysis::Analyzer*,const bool)

The same.

Veit |
From: Pankaj J. <pan...@gm...> - 2010-12-21 06:10:09
|
Guys, Any clue on this. I am interested in fixing this issue. Can someone guide me on this? -- Regards Pankaj On Tue, Nov 16, 2010 at 4:00 PM, Pankaj Jangid <pan...@gm...>wrote: > Here is the stack trace. > > clucene built with these options > ---------------------------------------------- > cmake -G "Unix Makefiles" .. > -DCMAKE_INSTALL_PREFIX="/lan/ops/cdnshelp/cops/tools/clucene/sun4v/32bit/release" > -DCMAKE_CXX_FLAGS:STRING="-library=stlport4" > -DCMAKE_EXE_LINKER_FLAGS:STRING="-library=stlport4" > > > > StackTrace of cl_demo crash > ----------------------------------------- > dftsol03s% ./bin/cl_demo > Location of text files to be indexed: . > Location to store the clucene index: ./idx > adding file 1: ./cmake_uninstall.cmake > Segmentation Fault (core dumped) > dftsol03s% dbx ./bin/cl_demo > For information about new features see `help changes' > To remove this message, put `dbxenv suppress_startup_message 7.6' in your > .dbxrc > Reading cl_demo > Reading ld.so.1 > Reading libclucene-core.so.0.9.23.0 > Reading libclucene-shared.so.0.9.23.0 > Reading libthread.so.1 > Reading libz.so.1 > Reading libstlport.so.1 > Reading librt.so.1 > Reading libCrun.so.1 > Reading libm.so.2 > Reading libc.so.1 > Reading libm.so.1 > Reading libaio.so.1 > Reading libmd.so.1 > (dbx) run > Running: cl_demo > (process id 9352) > Reading libc_psr.so.1 > Location of text files to be indexed: . > Location to store the clucene index: ./idx > adding file 1: ./cmake_uninstall.cmake > t@1 (l@1) signal SEGV (no mapping at the fault address) in > lucene::index::DocumentsWriter::ThreadState::init at line 222 in file > "DocumentsWriterThreadState.cpp" > 222 while(fp != NULL && _tcscmp(fp->fieldInfo->name, fi->name) != 0 > ) > (dbx) where > current thread: t@1 > =>[1] lucene::index::DocumentsWriter::ThreadState::init(this = ???, doc = > ???, docID = ???) 
(optimized), at 0x7fa24e74 (line ~222) in > "DocumentsWriterThreadState.cpp" > [2] lucene::index::DocumentsWriter::getThreadState(this = ???, doc = ???, > delTerm = ???) (optimized), at 0x7fa189a4 (line ~892) in > "DocumentsWriter.cpp" > [3] lucene::index::DocumentsWriter::updateDocument(this = ???, doc = ???, > analyzer = ???, delTerm = ???) (optimized), at 0x7fa18be4 (line ~940) in > "DocumentsWriter.cpp" > [4] lucene::index::DocumentsWriter::addDocument(this = ???, doc = ???, > analyzer = ???) (optimized), at 0x7fa18ba0 (line ~930) in > "DocumentsWriter.cpp" > [5] lucene::index::IndexWriter::addDocument(this = ???, doc = ???, > analyzer = ???) (optimized), at 0x7fa51bd4 (line ~672) in "IndexWriter.cpp" > [6] indexDocs(writer = ???, directory = ???) (optimized), at 0x14940 > (line ~84) in "IndexFiles.cpp" > [7] IndexFiles(path = ???, target = ???, clearIndex = ???) (optimized), > at 0x14a78 (line ~117) in "IndexFiles.cpp" > [8] main(argc = ???, argv = ???) (optimized), at 0x15c9c (line ~58) in > "Main.cpp" > > ----------------------------------------- > > > When configured with -DCMAKE_BUILD_TYPE="Release" option, cl_demo doesn't > build. > > cmake -G "Unix Makefiles" .. > -DCMAKE_INSTALL_PREFIX="/lan/ops/cdnshelp/cops/tools/clucene/sun4v/32bit/release" > -DCMAKE_CXX_FLAGS:STRING="-library=stlport4" > -DCMAKE_EXE_LINKER_FLAGS:STRING="-library=stlport4" > -DCMAKE_BUILD_TYPE="Release" > > make cl_demo gives this linking error. 
> > Linking CXX executable ../../bin/cl_demo > Undefined first referenced > symbol in file > lucene::document::Field::Field(const > wchar_t*,lucene::util::ValueArray<unsigned char>*,int,const bool) > ../../bin/libclucene-core.so.0.9.23.0 > std::string lucene::util::Misc::toString(const unsigned) > ../../bin/libclucene-core.so.0.9.23.0 > lucene::index::IndexWriter::IndexWriter(lucene::store::Directory*,lucene::analysis::Analyzer*,const > bool,const bool) ../../bin/libclucene-core.so.0.9.23.0 > lucene::store::FSDirectory*lucene::store::FSDirectory::getDirectory(const > char*,const bool,lucene::store::LockFactory*) > ../../bin/libclucene-core.so.0.9.23.0 > lucene::index::IndexWriter::IndexWriter(const > char*,lucene::analysis::Analyzer*,const bool) > CMakeFiles/cl_demo.dir/IndexFiles.o > ld: fatal: Symbol referencing errors. No output written to > ../../bin/cl_demo > make[3]: *** [bin/cl_demo] Error 1 > make[2]: *** [src/demo/CMakeFiles/cl_demo.dir/all] Error 2 > make[1]: *** [src/demo/CMakeFiles/cl_demo.dir/rule] Error 2 > make: *** [cl_demo] Error 2 > > ----------------------------------------- > > Debug mode doesn't build. I have a fix for that I'll be submitting a patch > for that. > > cmake -G "Unix Makefiles" .. > -DCMAKE_INSTALL_PREFIX="/lan/ops/cdnshelp/cops/tools/clucene/sun4v/32bit/debug" > -DCMAKE_CXX_FLAGS:STRING="-library=stlport4" > -DCMAKE_EXE_LINKER_FLAGS:STRING="-library=stlport4" > -DCMAKE_BUILD_TYPE="Debug" > > This is the error > > [ 61%] Building CXX object > src/core/CMakeFiles/clucene-core.dir/CLucene/index/TermInfosReader.o > "/lan/ops/cdnshelp/cops/users/pankajj/work/foss/clucene/clucene-HEAD-8484291/src/core/CLucene/store/IndexInput.h", > line 194: Warning: lucene::store::BufferedIndexInput::getObjectName hides > the virtual function lucene::store::IndexInput::getObjectName() const. 
> "/lan/ops/cdnshelp/cops/users/pankajj/work/foss/clucene/clucene-HEAD-8484291/src/core/CLucene/index/TermInfosReader.cpp", > line 114: Error: The operation "-- lucene::util::__LUCENE_ATOMIC_INT" is > illegal. > 1 Error(s) and 1 Warning(s) detected. > make[2]: *** > [src/core/CMakeFiles/clucene-core.dir/CLucene/index/TermInfosReader.o] Error > 1 > make[1]: *** [src/core/CMakeFiles/clucene-core.dir/all] Error 2 > make: *** [all] Error 2 > > > ----------------------------------------- > > > On Tue, Nov 16, 2010 at 8:14 AM, Pankaj Jangid <pan...@gm...>wrote: > >> We are using .9.21 and it works but the HEAD is quite different from it. >> I'll send a stack-trace today. >> >> -- >> Regards >> Pankaj >> >> On Tue, Nov 16, 2010 at 7:39 AM, Ben van Klinken <bva...@gm...>wrote: >> >>> I did have it going at some stage... was a while back so probably >>> something new has broken it. >>> >>> ben >>> >>> >>> On Sun, Nov 14, 2010 at 2:26 AM, Itamar Syn-Hershko <it...@co...>wrote: >>> >>>> Hi, >>>> >>>> >>>> I assume the same segfault is given in both cases. Can you send a >>>> stacktrace? Also, please sync your copy with the latest git master HEAD. >>>> >>>> >>>> Itamar. >>>> >>>> >>>> On 12/11/2010 3:57 PM, Pankaj Jangid wrote: >>>> >>>> I have built clucene with this configuration on Solaris Sparc, >>>> >>>> cmake -G "Unix Makefiles" .. -DCMAKE_INSTALL_PREFIX="<cl_install_path>" >>>> -DCMAKE_CXX_FLAGS:STRING="-m32 -library=stlport4" >>>> -DCMAKE_EXE_LINKER_FLAGS:STRING="-library=stlport4" >>>> -DCMAKE_BUILD_TYPE="Debug|Release" >>>> >>>> I have tried various combinations - Debug, Release, -m32. With -m64 it >>>> doesn't build at all. >>>> >>>> cl_demo gives segfault during indexing. >>>> >>>> cl_test fails. Gives segfault. >>>> >>>> Have anybody tried clucene HEAD on Solaris/Sparc? 
>>>> >>>> -- >>>> Regards >>>> Pankaj >>>> >>>> >>>> ------------------------------------------------------------------------------ >>>> Centralized Desktop Delivery: Dell and VMware Reference Architecture >>>> Simplifying enterprise desktop deployment and management using >>>> Dell EqualLogic storage and VMware View: A highly scalable, end-to-end >>>> client virtualization framework. Read more!http://p.sf.net/sfu/dell-eql-dev2dev >>>> >>>> >>>> _______________________________________________ >>>> CLucene-developers mailing lis...@li...https://lists.sourceforge.net/lists/listinfo/clucene-developers >>>> >>>> >>>> >>>> ------------------------------------------------------------------------------ >>>> Centralized Desktop Delivery: Dell and VMware Reference Architecture >>>> Simplifying enterprise desktop deployment and management using >>>> Dell EqualLogic storage and VMware View: A highly scalable, end-to-end >>>> client virtualization framework. Read more! >>>> http://p.sf.net/sfu/dell-eql-dev2dev >>>> _______________________________________________ >>>> CLucene-developers mailing list >>>> CLu...@li... >>>> https://lists.sourceforge.net/lists/listinfo/clucene-developers >>>> >>>> >>> >>> >>> ------------------------------------------------------------------------------ >>> Beautiful is writing same markup. Internet Explorer 9 supports >>> standards for HTML5, CSS3, SVG 1.1, ECMAScript5, and DOM L2 & L3. >>> Spend less time writing and rewriting code and more time creating great >>> experiences on the web. Be a part of the beta today >>> http://p.sf.net/sfu/msIE9-sfdev2dev >>> >>> _______________________________________________ >>> CLucene-developers mailing list >>> CLu...@li... >>> https://lists.sourceforge.net/lists/listinfo/clucene-developers >>> >>> >> > |
From: Alexander E. <ego...@gm...> - 2010-12-21 06:02:28
|
I reproduced the problem (on the latest git branch). Can anyone fix this bug?

2010/11/12 <ps...@zo...>:
> In this test we add 3 docs and then update them. Then we search for the
> docs with sorting by doc_id (desc); after some updates, search() stops
> sorting the docs.
>
> search 1: // all ok
> id = 3, score = 0.999667
> id = 2, score = 0.999667
> id = 1, score = 0.999667
>
> search 2: // all ok
> id = 3, score = 0.999833
> id = 2, score = 0.999833
> id = 1, score = 0.999833
>
> search 3: // not sorted!
> id = 1, score = 0.999889
> id = 2, score = 0.999889
> id = 3, score = 0.999889
>
> search final: // not sorted!
> id = 3, score = 0.712318
> id = 1, score = 0.712318
> id = 2, score = 0.712318 |
From: Šplíchal J. <spl...@to...> - 2010-12-14 08:36:48
|
Hi Andrew,

I also found a way to fix it, but I don't like it because it causes additional unnecessary seeks in the file. Could you please send me your version of the BufferedIndexInput class (just the files you have changed)? I would test it and publish it to the GIT repository together with some extended tests.

Thanks,
Jiri

-----Original Message-----
From: Andrew McCann [mailto:mc...@de...]
Sent: Tuesday, December 14, 2010 12:07 AM
To: clu...@li...
Subject: Re: [CLucene-dev] read past EOF ERROR while searching

Hi guys,

I've never responded to this list, but I have encountered this bug myself and made a fix (it took a while to track down). Jiri is on the right track; that was the problem I had. I refactored the BufferedIndexInput class slightly to prevent it from happening; it was a minimal change. I'm not sure what the procedure is for submitting fixes, though.

-Andrew

2010/12/13 Šplíchal Jiří <spl...@to...>:
> [earlier messages and test code quoted in full; trimmed here]
From: Andrew M. <mc...@de...> - 2010-12-14 00:08:35
|
Hi guys,

I've never responded to this list, but I have encountered this bug myself and made a fix (it took a while to track down). Jiri is on the right track; that was the problem I had. I refactored the BufferedIndexInput class slightly to prevent it from happening; it was a minimal change. I'm not sure what the procedure is for submitting fixes, though.

-Andrew

2010/12/13 Šplíchal Jiří <spl...@to...>:
> Hi,
>
> some more details:
>
> the exception comes from the method
>     void FSDirectory::FSIndexInput::readInternal(uint8_t* b, const int32_t len)
> which is called from
>     void BufferedIndexInput::refill()
>
> The strange thing is: in the refill buffer there is a member start with
> the value 1024, but when calling the readInternal method, both members
> handle->_fpos and _pos have the value 1025 (one more), and so it tries
> to read different data from the file.
>
> Hope it helps.
>
> Jiri
>
> [original report and test code quoted in full; trimmed here]
From: Šplíchal J. <spl...@to...> - 2010-12-13 20:44:36
|
Hi,

some more details:

the exception comes from the method

    void FSDirectory::FSIndexInput::readInternal(uint8_t* b, const int32_t len)

which is called from

    void BufferedIndexInput::refill()

The strange thing is: in the refill buffer there is a member start with the value 1024, but when calling the readInternal method, both members handle->_fpos and _pos have the value 1025 (one more), and so it tries to read different data from the file.

Hope it helps.

Jiri

From: Šplíchal Jiří [mailto:spl...@to...]
Sent: Monday, December 13, 2010 9:28 PM
To: clu...@li...
Subject: [CLucene-dev] read past EOF ERROR while searching

[original report and test code quoted in full; trimmed here]
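The invariant Jiri's numbers point at (buffer start 1024 vs. file position 1025) is the heart of any buffered reader: before the low-level read runs, the position it reads from must equal the offset just past the data already buffered. A toy sketch of a refill loop that maintains this invariant (plain C++ over an in-memory "file"; none of these names are the actual CLucene classes):

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Toy buffered input illustrating the invariant from the bug report:
// each refill must read from exactly bufferStart_ + buffer_.size(),
// i.e. from the current logical position. If a stale cached object
// advances the underlying file position by one extra byte (as in the
// reported handle->_fpos == 1025 vs. start == 1024), every later
// refill returns shifted data and the reader eventually runs past EOF.
class ToyBufferedInput {
public:
    ToyBufferedInput(const std::string& file, std::size_t bufSize)
        : file_(file), bufSize_(bufSize) {}

    // Returns the next byte, refilling the buffer when it is exhausted.
    // Returns -1 at end of file instead of reading past it.
    int readByte() {
        if (pos_ >= bufferStart_ + buffer_.size()) {
            if (!refill()) return -1;  // clean EOF, no "read past EOF"
        }
        return static_cast<unsigned char>(buffer_[pos_++ - bufferStart_]);
    }

private:
    bool refill() {
        bufferStart_ = pos_;  // invariant: refill reads from pos_ exactly
        buffer_.clear();
        if (bufferStart_ >= file_.size()) return false;
        buffer_ = file_.substr(bufferStart_, bufSize_);
        return true;
    }

    std::string file_;
    std::size_t bufSize_;
    std::string buffer_;
    std::size_t bufferStart_ = 0;  // file offset of buffer_[0]
    std::size_t pos_ = 0;          // logical read position in the file
};
```

If refill() instead read from bufferStart_ + 1, the first byte of every buffer would be skipped and the final read would reach one byte beyond the file, which is exactly the failure mode described above.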
From: Šplíchal J. <spl...@to...> - 2010-12-13 20:28:12
|
Hi,

we found a serious problem while searching. In some situations (probably depending on the index size), repeating a search does not return correct results or even ends with the CLuceneError "read past EOF".

To reproduce the error, the following sequence must be executed:

1) run a query
2) delete an instance of Analyzer (it can be instantiated earlier, but in the same thread) - this causes the ThreadLocals objects to be freed
3) run the same query - THIS FAILS!!

In all the cases where the search failed, we had switched off the compound files. Could someone help us with this issue? It seems that reading the index files is not working correctly.

Jiri

PS: The following code is a test that demonstrates the problem:

/**
 * Create index
 */
Directory* prepareDirectory1()
{
    const TCHAR * tszDocText = _T( "a b c d e f g h i j k l m n o p q r s t u v w x y z ab bb cb db eb fb gb hb ib jb kb lb mb nb ob pb qb rb sb tb ub vb wb xb yb zb ac bc cc dc ec fc gc hc ic jc kc lc mc nc oc pc qc rc sc tc uc vc wc xc yc zc ad bd cd dd ed fd gd hd id jd kd ld md nd od pd qd rd sd td ud vd wd xd yd zd ae be ce de ee fe ge he ie je ke le me ne oe pe qe re se te ue ve we xe ye ze af bf cf df ef ff gf hf if jf kf lf mf" );

    char fsdir[CL_MAX_PATH];
    _snprintf( fsdir, CL_MAX_PATH, "%s/%s", cl_tempDir, "test.search" );

    WhitespaceAnalyzer analyzer;
    Directory* pDirectory = (Directory*)FSDirectory::getDirectory( fsdir );
    IndexWriter writer( pDirectory, &analyzer, true );

    writer.setUseCompoundFile( false );

    Document* d = _CLNEW Document();
    d->add( *_CLNEW Field( _T("_content"), tszDocText, Field::STORE_NO | Field::INDEX_TOKENIZED ));
    writer.addDocument( d );
    _CLDELETE( d );
    writer.close();

    return pDirectory;
}

/**
 * Run test
 */
void testReadPastEOF(CuTest *tc)
{
    Directory* pDirectory = prepareDirectory1();
    Analyzer * pAnalyzer = NULL;
    Hits * pHits = NULL;
    IndexReader* pReader = IndexReader::open( pDirectory );
    IndexSearcher searcher( pReader );

    CLUCENE_ASSERT( pReader->numDocs() == 1 );

    Term * t1 = new Term( _T( "_content" ), _T( "ze" ) );
    TermQuery * pQry1 = new TermQuery( t1 );
    _CLDECDELETE( t1 );

    pAnalyzer = new SimpleAnalyzer();
    pHits = searcher.search( pQry1 );
    _ASSERT( pHits->length() == 1 );
    CLUCENE_ASSERT( pHits->length() == 1 );
    _CLDELETE( pHits );

    // Removing the analyzer causes removal of the ThreadLocals - including the cached SegmentTermEnum
    _CLDELETE( pAnalyzer );

    // THE NEXT CALL WILL FAIL
    pHits = searcher.search( pQry1 );
    _ASSERT( pHits->length() == 1 );
    CLUCENE_ASSERT( pHits->length() == 1 );
    _CLDELETE( pHits );

    _CLDELETE( pQry1 );

    searcher.close();
    _CLDELETE( pReader );

    pDirectory->close();
    _CLDECDELETE( pDirectory );
}
From: Miller, B. (QuickWire) <bm...@qu...> - 2010-12-10 21:27:51
|
Hi all,

I've been implementing MultiSearcher and have a problem that may be more of a 'Lucene conceptual' thing than a bug. I'm running a fairly new 2_3_2 git under Windows (VS 2010).

My problem is that when I pass a sort to MultiSearcher::search() I receive many duplicate hits (note that all doc ids are unique across both indexes I'm testing with). However, if I only ask for 100 hits max, there are no duplicates. Also, if I do not pass a sort, there are no duplicates.

I find it rather hard to follow, but I'm guessing that the 'good' 100 docs come from the initial search, and the duplicates are caused by Hits::getMoreDocs() eventually calling MultiSearcher::search() again and for some reason adding the same hits each time.

Should I be expecting this behavior?

Thanks,
Bill
From: Matt R. <mr...@mr...> - 2010-12-05 17:43:46
|
Shoot, I wish I had noticed this earlier:

http://clucene.git.sourceforge.net/git/gitweb.cgi?p=clucene/clucene;a=commit;h=de5695332badddc264c3e187350463d9d6ee4a8a

It looks like someone else had already found and fixed the bug. I wish I had found that; it would have saved me a lot of time. Oh well.

On Dec 4, 2010, at 1:40 PM, Matt Ronge wrote:
> I've been working off the head of CLucene (which is great!) and I ran into a memory smasher.
>
> [original report and patch quoted in full; trimmed here]

--
Matt Ronge
Central Atomics
Makers of Rocketbox
http://www.getrocketbox.com
mr...@ce...
From: Matt R. <mr...@mr...> - 2010-12-04 19:56:09
|
I've been working off the head of CLucene (which is great!) and I ran into a memory smasher.

I had strange issues where, after some queries, the search results would start to get more and more incorrect until the app would crash. After much debugging I was able to confirm that this only occurred for queries whose lengths were multiples of 8.

After even more debugging I found that the KeywordTokenizer (which I was using for my queries) was allocating term buffers whose sizes were multiples of 8 (suspicious!). It turns out that if the token length is a multiple of 8 and the KeywordTokenizer attempts to null-terminate the string, it writes off the end of the array, causing memory corruption. Normally you don't see this, because the corruption is silent and the token length must be an exact multiple of 8.

To fix this, I make sure to add room for the null terminator if the buffer is already full. Here is my patch:

diff --git a/src/core/CLucene/analysis/Analyzers.cpp b/src/core/CLucene/analysis/Analyzers.cpp
index 0c34a60..39fec43 100644
--- a/src/core/CLucene/analysis/Analyzers.cpp
+++ b/src/core/CLucene/analysis/Analyzers.cpp
@@ -556,6 +556,9 @@ Token* KeywordTokenizer::next(Token* token){
     if ( termBuffer == NULL ){
         termBuffer=token->resizeTermBuffer(token->bufferLength() + 8);
     }
+    if (upto == token->bufferLength()) {
+        termBuffer = token->resizeTermBuffer(token->bufferLength() + 1);
+    }
     termBuffer[upto]=0;
     token->setTermLength(upto);
     return token;

Let me know if I should open a bug for this.

Thanks again for clucene!

--
Matt Ronge
Central Atomics
Makers of Rocketbox
http://www.getrocketbox.com
mr...@ce...
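The overflow pattern and the guard from the patch can be reproduced standalone. This sketch (plain C++ with hypothetical names, not the actual Token class) grows a buffer in 8-byte steps the way the tokenizer does, and applies the same "make room for the terminator" guard:

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Mimics the KeywordTokenizer pattern: characters are appended into a
// buffer that only ever grows in steps of 8, and the token is then
// null-terminated. Without the guard below, a token whose length is an
// exact multiple of 8 fills the buffer completely and the terminator
// is written one byte past the end.
std::string copyToken(const char* src) {
    std::vector<char> buf;
    std::size_t upto = 0;
    for (const char* p = src; *p != '\0'; ++p) {
        if (upto == buf.size())
            buf.resize(buf.size() + 8);   // grow in multiples of 8
        buf[upto++] = *p;
    }
    // The guard from the patch: if the buffer is exactly full, make
    // room for the terminator instead of writing off the end.
    if (upto == buf.size())
        buf.resize(buf.size() + 1);
    buf[upto] = '\0';
    return std::string(buf.data());
}
```

With std::vector the unguarded write would be caught by a checked build; with a raw heap array, as in the tokenizer, it silently corrupts the adjacent allocation, which matches the crash pattern described above.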
From: Kostka B. <ko...@to...> - 2010-11-28 12:38:41
|
I'll look into why it crashes. But in any case, you should call Query::rewrite and pass the resulting query to the highlighter instead of the original query (see Query.h for a description of rewrite). Otherwise the highlighter won't highlight words related to the wildcard term.

Borek

From: muhammad ismael [mailto:m.i...@gm...]
Sent: Saturday, November 27, 2010 3:19 PM
To: CLu...@li...
Subject: [CLucene-dev] crash when using highlighter with the wildcardQuery!!!

Hello everybody,

When I use WildcardQuery to search for a word, the search completes successfully, but when I try to highlight using a WildcardQuery the application crashes. I think this is because of the function Query::extractTerms. Can anybody help solve this problem?

--
Sincerely
--------------------
Mohammad Ismael
Software Developer
Mobile: +20114753575
From: Kostka B. <ko...@to...> - 2010-11-28 12:29:20
|
Hi,

this is a known problem. Unfortunately, the behavior is caused by the way the highlighter is implemented. Doing exact highlighting is a much more complex task (especially for span queries). I remember there was some discussion about this a few months ago; try searching the mailing list archive.

Borek

From: muhammad ismael [mailto:m.i...@gm...]
Sent: Saturday, November 27, 2010 9:30 PM
To: CLu...@li...
Subject: [CLucene-dev] how can i use the highlighter to highlight the exact term?

Hi,

I am searching for the term "hello world", for example, and I use the highlighter to highlight it. But the highlighter does not highlight only that phrase; it also highlights the word "hello" even when it is not followed by "world". How can I make the highlighter highlight the exact term only?

--
Sincerely
--------------------
Mohammad Ismael
Software Developer
Mobile: +20114753575
From: muhammad i. <m.i...@gm...> - 2010-11-27 20:30:13
|
Hi,

I am searching for the term "hello world", for example, and I use the highlighter to highlight it. But the highlighter does not highlight only that phrase; it also highlights the word "hello" even when it is not followed by "world". How can I make the highlighter highlight the exact term only?

--
Sincerely
--------------------
Mohammad Ismael
Software Developer
Mobile: +20114753575
From: muhammad i. <m.i...@gm...> - 2010-11-27 14:18:44
|
Hello everybody,

When I use WildcardQuery to search for a word, the search completes successfully, but when I try to highlight using a WildcardQuery the application crashes. I think this is because of the function Query::extractTerms. Can anybody help solve this problem?

--
Sincerely
--------------------
Mohammad Ismael
Software Developer
Mobile: +20114753575
From: Mark A. <ash...@ca...> - 2010-11-24 14:51:25
|
Hi Ben,

Thank you for the pointers. I will have a look at these options.

--
Mark.
Mark Ashworth
IBM Informix Extensibility Architect
Office phone: +1 (905) 413-5033
Alternate: +1 (905) 697-8094
Email: ash...@ca...

From: Ben van Klinken <bva...@gm...>
To: "clu...@li..." <clu...@li...>
Date: 11/09/2010 01:44 PM
Subject: Re: [CLucene-dev] wildcard search - unexpected results

Hi Mark,

this is the expected behaviour. A leading wildcard would require a full term-list scan; therefore, by default, there is a required prefix length. I think it's set on the query parser. Let me know if you can't find it and I'll try to find it.

If you are often doing prefixed wildcards, some people create a reversed field and use that field instead when appropriate. For example, a search for filename:*.doc can be transformed into searching for reverse_filename:cod.*. This can be achieved by overriding the query parser. A nasty but effective trick.

Ben

On Wednesday, November 10, 2010, Mark Ashworth <ash...@ca...> wrote:
> Hoping someone can explain this wildcard searching behaviour:
>
> I have an index of email addresses, including my own (ash...@ca...).
>
> I can search for:
>
> Searching for: a?hw...@ca...
> 1. ash...@ca... - 100.00
>
> Searching for: ashworth@ca.ibm*
> 1. ash...@ca... - 100.00
>
> Searching for: ?sh...@ca...
> 1. ash...@ca... - 100.00
>
> Searching for: a?hworth@ca.ibm*
> 1. ash...@ca... - 100.00
>
> Searching for: ?shworth@ca.ibm*
> Does not find any rows. This is when there is a leading single-character
> wildcard and a multi-character wildcard at the end.
>
> Is this a bug or expected behaviour?
>
> thanks in advance...
>
> From: Veit Jahns <nun...@go...>
> Subject: Re: [CLucene-dev] NearSpansUnordered bug fix
>
> 2010/11/6 Itamar Syn-Hershko <it...@co...>:
> > Where are we with the smart_pointers branch?
>
> You mean this activity. I thought you meant another one. Actually, I
> think about this every day ;). But in the last months I had not so
> much time to work on it. I hope I can change this in the next weeks.
>
> > And which of all the fix branches is ready to be merged to master?
>
> german_analyzer and tracker_3094661_fix
>
> Veit
>
> [remaining quoted text trimmed]
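Ben's reversed-field trick is independent of CLucene and easy to sketch: store the reversed term in an extra field at index time, and rewrite a leading-wildcard pattern into a trailing-wildcard pattern against that field. The sketch below is plain C++; the field names and helper functions are illustrative only, not part of any CLucene API:

```cpp
#include <cassert>
#include <string>

// Reverses a term for indexing into a hypothetical "reverse_filename"
// field, so that a leading-wildcard query can become a cheap
// trailing-wildcard query on the reversed field.
std::string reverseTerm(const std::string& s) {
    return std::string(s.rbegin(), s.rend());
}

// Rewrites a pattern with a single leading '*' (e.g. "*.doc") into the
// equivalent pattern on the reversed field (e.g. "cod.*"). Patterns
// without a leading '*' are returned unchanged.
std::string rewriteLeadingWildcard(const std::string& pattern) {
    if (pattern.empty() || pattern[0] != '*')
        return pattern;
    return reverseTerm(pattern.substr(1)) + "*";
}
```

A query parser override would apply reverseTerm to each value at index time and rewriteLeadingWildcard at query time, redirecting the rewritten pattern to the reversed field.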
From: Freiholz M. <M.F...@ca...> - 2010-11-23 08:01:08
|
Hello,

I just wanted to mention that I uploaded a few contrib lib files to enable spell-checking. I uploaded the 3 patch files to your bug-tracking system:

https://sourceforge.net/tracker/?func=detail&aid=3113462&group_id=80013&atid=558446

Sorry for the 3 patch files; I'm not very familiar with git (Subversion ftw^^), but I'm working on it ;)

Greetings,
Manuel Freiholz.
From: muhammad i. <m.i...@gm...> - 2010-11-20 10:54:04
|
Dear Veit,

I want to thank you for your response. Everything works fine now: the highlighter returns the correct string. I had to index the text after encoding it in UTF-8.

On Sat, Nov 20, 2010 at 12:29 AM, Veit Jahns <nun...@go...> wrote:
> 2010/11/17 muhammad ismael <m.i...@gm...>:
> > Dear Veit,
> > I want to thank you for the patch you sent me.
> > I patched my Highlighter with your patch and it compiled successfully.
>
> Thanks! Then I will commit the patch to the CLucene repository.
>
> > But I have another question: I am using CLucene to index and search
> > Arabic text, and so far everything works fine. But when I use the
> > highlighter to return the text fragment, it returns the text as
> > wchar_t. It is supposed to be Unicode, but when I display it, it
> > appears as strange characters, not Arabic. Can you help me solve
> > this problem?
>
> I will try. Encoding is sometimes very confusing to me, too. On which
> platform do you work? wchar_t has different sizes on different
> platforms, as one of my colleagues explained to me: 2 bytes on Windows
> and 4 bytes on Linux. Maybe this is the reason.
>
> Also, can you provide a small test case? Then I can play a little bit
> with the highlighter. Some months ago I also made some experiments
> with Arabic characters, but as I am not familiar with the Arabic
> language it was just moving symbols around, without knowing whether
> the characters formed a useful word.
>
> Regards,
> Veit
>
> PS: Do you mind if we continue the discussion on the mailing list?
> The discussion may be useful for others, or someone else who knows
> more about these encoding issues can provide further input.

--
Sincerely
--------------------
Mohammad Ismael
Software Developer
Mobile: +20114753575
From: Veit J. <nun...@go...> - 2010-11-19 22:31:52
|
2010/11/16 Šplíchal Jiří <spl...@to...>:
> We are using the merge of the two branches without problems = all tests
> pass in debug and also in release on Win7 64-bit. But there are still
> some memory leaks left.

This branch also contains the fix for the missing include of limits.h (see tracker 3083768 [1]).

Veit

[1] http://sourceforge.net/tracker/?func=detail&aid=3083768&group_id=80013&atid=558446
From: Šplíchal J. <spl...@to...> - 2010-11-18 08:38:15
|
Hello,

there is a bug in the constructor of the wildcard query when setting the termContainsWildcard member variable. The existing test checked whether at least one of the characters *? was NOT contained in the string, instead of checking that at least one IS contained. I pushed the fix to the wildcardquery_fix branch. Plus, I added one more fix to the memory-leaks branch.

Jiri

--
Jiří Šplíchal
TOVEK, spol. s r.o.
spl...@to...
+420 606671930
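For reference, the correct form of the check is a one-liner; the buggy version effectively asked whether one of the wildcard characters was absent, which holds for nearly every term. A minimal sketch (plain C++, illustrative only; the real code lives in the WildcardQuery constructor):

```cpp
#include <cassert>
#include <string>

// A term contains a wildcard if at least one of '*' or '?' appears
// anywhere in it. The inverted check (testing that one of them is
// missing from the term) is true for almost any term, which is the
// reported bug.
bool termContainsWildcard(const std::wstring& term) {
    return term.find_first_of(L"*?") != std::wstring::npos;
}
```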