From: Itamar Syn-H. <it...@co...> - 2010-12-24 11:24:59
|
This is now merged to master. Thanks for reporting!

Itamar.

On 5/12/2010 7:43 PM, Matt Ronge wrote:
> Shoot, I wish I had noticed this earlier:
> http://clucene.git.sourceforge.net/git/gitweb.cgi?p=clucene/clucene;a=commit;h=de5695332badddc264c3e187350463d9d6ee4a8a
>
> Looks like someone else had already found and fixed the bug. I wish I had
> found that; it would have saved me a lot of time. Oh well.
>
> On Dec 4, 2010, at 1:40 PM, Matt Ronge wrote:
>
>> I've been working off the head of CLucene (which has been great!) and I
>> ran into a memory smasher.
>>
>> I had strange issues where, after some queries, the search results would
>> get more and more incorrect until the app crashed. After much debugging
>> I was able to confirm that this only occurred for queries whose lengths
>> were multiples of 8.
>>
>> After even more debugging I found that the KeywordTokenizer (which I was
>> using for my queries) was allocating term buffers whose sizes were
>> multiples of 8 (suspicious!). It turns out that if the token length is a
>> multiple of 8 and the KeywordTokenizer attempts to null-terminate the
>> string, it writes off the end of the array, causing memory corruption.
>> Normally you don't notice, because the corruption is silent and only
>> happens when the token length is a multiple of 8.
>>
>> To fix this I make sure to add room for the null terminator if the
>> buffer is already full. Here is my patch:
>>
>> diff --git a/src/core/CLucene/analysis/Analyzers.cpp b/src/core/CLucene/analysis/Analyzers.cpp
>> index 0c34a60..39fec43 100644
>> --- a/src/core/CLucene/analysis/Analyzers.cpp
>> +++ b/src/core/CLucene/analysis/Analyzers.cpp
>> @@ -556,6 +556,9 @@ Token* KeywordTokenizer::next(Token* token){
>>      if ( termBuffer == NULL ){
>>        termBuffer=token->resizeTermBuffer(token->bufferLength() + 8);
>>      }
>> +    if (upto == token->bufferLength()) {
>> +      termBuffer = token->resizeTermBuffer(token->bufferLength() + 1);
>> +    }
>>      termBuffer[upto]=0;
>>      token->setTermLength(upto);
>>      return token;
>>
>> Let me know if I should open a bug for this.
>>
>> Thanks again for CLucene!
>>
>> --
>> Matt Ronge
>> Central Atomics
>> Makers of Rocketbox
>> http://www.getrocketbox.com
>> mr...@ce...
>>
>> _______________________________________________
>> CLucene-developers mailing list
>> CLu...@li...
>> https://lists.sourceforge.net/lists/listinfo/clucene-developers |
From: Itamar Syn-H. <it...@co...> - 2010-12-24 11:23:14
|
Hi Bill,

What kind of sort object are you passing? If it's your own brew, perhaps it is buggy?

Itamar.

On 10/12/2010 10:53 PM, Miller, Bill (QuickWire) wrote:
> Hi all, I've been implementing MultiSearcher and have a problem that may
> be more of a 'Lucene conceptual' thing than a bug.
>
> I'm running a fairly new 2_3_2 git under Windows (VS 2010).
>
> My problem is that when I pass a sort to MultiSearcher::search() I
> receive many duplicate hits (note that all doc ids are unique across
> both indexes I'm testing with...).
>
> However, if I only ask for 100 hits max, there are no duplicates.
>
> Also, if I do not pass a sort, there are no duplicates.
>
> I find it rather hard to follow, but I'm guessing that the 'good' 100
> docs come from the initial search, and the duplicates are caused by
> Hits::getMoreDocs() eventually calling MultiSearcher::search() again and
> for some reason adding the same hits each time.
>
> Should I be expecting this behavior?
>
> Thanks,
>
> Bill
>
> _______________________________________________
> CLucene-developers mailing list
> CLu...@li...
> https://lists.sourceforge.net/lists/listinfo/clucene-developers |
From: Itamar Syn-H. <it...@co...> - 2010-12-24 11:08:12
|
Hi Pini,

Thanks for the fix. I checked in a slightly different fix taken from our bug tracker. It also had a few tests. Let us know if there's anything else left to fix, and feel free to add your own tests.

Itamar.

On 23/12/2010 7:41 PM, pini shamgar wrote:
> Hi,
>
> I found a bug in the StandardAnalyzer. Because of it I could not search
> for and find an IP address.
>
> The bug is in the reuse of the StandardTokenizer object.
>
> Below is a fix. It touches 3 files:
>
> 1) In core\CLucene\analysis\standard\StandardTokenizer.cpp:
>
> void StandardTokenizer::reset(Reader* _input) {
>   rdPos = -1;
>   tokenStart = -1;
>   this->input = _input;
>   if (rd->input==NULL) {
>     rd->reset(_input->__asBufferedReader());
>   }
> }
>
> 2) In core\CLucene\util\_FastCharStream.h, add the method:
>
> void reset(BufferedReader* reader);
>
> 3) In core\CLucene\util\FastCharStream.cpp:
>
> FastCharStream::FastCharStream(BufferedReader* reader)
> {
>   reset(reader);
>   input->setMinBufSize(maxRewindSize);
> }
>
> FastCharStream::~FastCharStream(){
> }
>
> void FastCharStream::reset(BufferedReader* reader)
> {
>   pos = 0;
>   rewindPos = 0;
>   resetPos = 0;
>   col = 1;
>   line = 1;
>   input = reader;
> }
>
> Cheers
>
> Pini Shamgar
>
> _______________________________________________
> CLucene-developers mailing list
> CLu...@li...
> https://lists.sourceforge.net/lists/listinfo/clucene-developers |
From: pini s. <pi...@ya...> - 2010-12-23 17:41:16
|
Hi,

I found a bug in the StandardAnalyzer. Because of it I could not search for and find an IP address.

The bug is in the reuse of the StandardTokenizer object.

Below is a fix. It touches 3 files:

1) In core\CLucene\analysis\standard\StandardTokenizer.cpp:

void StandardTokenizer::reset(Reader* _input) {
  rdPos = -1;
  tokenStart = -1;
  this->input = _input;
  if (rd->input==NULL) {
    rd->reset(_input->__asBufferedReader());
  }
}

2) In core\CLucene\util\_FastCharStream.h, add the method:

void reset(BufferedReader* reader);

3) In core\CLucene\util\FastCharStream.cpp:

FastCharStream::FastCharStream(BufferedReader* reader)
{
  reset(reader);
  input->setMinBufSize(maxRewindSize);
}

FastCharStream::~FastCharStream(){
}

void FastCharStream::reset(BufferedReader* reader)
{
  pos = 0;
  rewindPos = 0;
  resetPos = 0;
  col = 1;
  line = 1;
  input = reader;
}

Cheers

Pini Shamgar |
From: Ben v. K. <bva...@gm...> - 2010-12-22 05:19:15
|
Can you rebuild and recreate with cmake -DCMAKE_BUILD_TYPE="Debug"? Then the stack trace will be more complete.

ben

On Tue, Dec 21, 2010 at 4:09 PM, Pankaj Jangid <pan...@gm...> wrote:
> Guys,
>
> Any clue on this? I am interested in fixing this issue. Can someone
> guide me on this?
>
> [...]
>
> --
> Regards
> Pankaj |
From: Veit J. <nun...@go...> - 2010-12-21 09:51:17
|
2010/12/21 Veit Jahns <nun...@go...>:
> [...] (Sun, I guess).

Scratch that! It was clearly stated in the subject.

Veit |
From: Veit J. <nun...@go...> - 2010-12-21 09:48:30
|
Just an idea regarding:

>> make cl_demo gives this linking error.
>>
>> Linking CXX executable ../../bin/cl_demo
>> Undefined                       first referenced
>>  symbol                             in file
>> lucene::document::Field::Field(const wchar_t*,lucene::util::ValueArray<unsigned char>*,int,const bool)
>> ../../bin/libclucene-core.so.0.9.23.0

There is a difference between the header file and the implementation file. The header says (line 142):

Field(const TCHAR* name, CL_NS(util)::ValueArray<uint8_t>* data, int _config, const bool duplicateValue = true)

And in the implementation the "const" before duplicateValue is missing (line 61):

Field(const TCHAR* Name, ValueArray<uint8_t>* Value, int config, bool duplicateValue)

Maybe this makes a difference on your platform (Sun, I guess).

>> std::string lucene::util::Misc::toString(const unsigned)
>> ../../bin/libclucene-core.so.0.9.23.0

The same.

>> lucene::index::IndexWriter::IndexWriter(lucene::store::Directory*,lucene::analysis::Analyzer*,const bool,const bool)
>> ../../bin/libclucene-core.so.0.9.23.0

The same.

>> lucene::store::FSDirectory* lucene::store::FSDirectory::getDirectory(const char*,const bool,lucene::store::LockFactory*)
>> ../../bin/libclucene-core.so.0.9.23.0

The same.

>> lucene::index::IndexWriter::IndexWriter(const char*,lucene::analysis::Analyzer*,const bool)

The same.

Veit |
From: Pankaj J. <pan...@gm...> - 2010-12-21 06:10:09
|
Guys, Any clue on this. I am interested in fixing this issue. Can someone guide me on this? -- Regards Pankaj On Tue, Nov 16, 2010 at 4:00 PM, Pankaj Jangid <pan...@gm...>wrote: > Here is the stack trace. > > clucene built with these options > ---------------------------------------------- > cmake -G "Unix Makefiles" .. > -DCMAKE_INSTALL_PREFIX="/lan/ops/cdnshelp/cops/tools/clucene/sun4v/32bit/release" > -DCMAKE_CXX_FLAGS:STRING="-library=stlport4" > -DCMAKE_EXE_LINKER_FLAGS:STRING="-library=stlport4" > > > > StackTrace of cl_demo crash > ----------------------------------------- > dftsol03s% ./bin/cl_demo > Location of text files to be indexed: . > Location to store the clucene index: ./idx > adding file 1: ./cmake_uninstall.cmake > Segmentation Fault (core dumped) > dftsol03s% dbx ./bin/cl_demo > For information about new features see `help changes' > To remove this message, put `dbxenv suppress_startup_message 7.6' in your > .dbxrc > Reading cl_demo > Reading ld.so.1 > Reading libclucene-core.so.0.9.23.0 > Reading libclucene-shared.so.0.9.23.0 > Reading libthread.so.1 > Reading libz.so.1 > Reading libstlport.so.1 > Reading librt.so.1 > Reading libCrun.so.1 > Reading libm.so.2 > Reading libc.so.1 > Reading libm.so.1 > Reading libaio.so.1 > Reading libmd.so.1 > (dbx) run > Running: cl_demo > (process id 9352) > Reading libc_psr.so.1 > Location of text files to be indexed: . > Location to store the clucene index: ./idx > adding file 1: ./cmake_uninstall.cmake > t@1 (l@1) signal SEGV (no mapping at the fault address) in > lucene::index::DocumentsWriter::ThreadState::init at line 222 in file > "DocumentsWriterThreadState.cpp" > 222 while(fp != NULL && _tcscmp(fp->fieldInfo->name, fi->name) != 0 > ) > (dbx) where > current thread: t@1 > =>[1] lucene::index::DocumentsWriter::ThreadState::init(this = ???, doc = > ???, docID = ???) 
(optimized), at 0x7fa24e74 (line ~222) in > "DocumentsWriterThreadState.cpp" > [2] lucene::index::DocumentsWriter::getThreadState(this = ???, doc = ???, > delTerm = ???) (optimized), at 0x7fa189a4 (line ~892) in > "DocumentsWriter.cpp" > [3] lucene::index::DocumentsWriter::updateDocument(this = ???, doc = ???, > analyzer = ???, delTerm = ???) (optimized), at 0x7fa18be4 (line ~940) in > "DocumentsWriter.cpp" > [4] lucene::index::DocumentsWriter::addDocument(this = ???, doc = ???, > analyzer = ???) (optimized), at 0x7fa18ba0 (line ~930) in > "DocumentsWriter.cpp" > [5] lucene::index::IndexWriter::addDocument(this = ???, doc = ???, > analyzer = ???) (optimized), at 0x7fa51bd4 (line ~672) in "IndexWriter.cpp" > [6] indexDocs(writer = ???, directory = ???) (optimized), at 0x14940 > (line ~84) in "IndexFiles.cpp" > [7] IndexFiles(path = ???, target = ???, clearIndex = ???) (optimized), > at 0x14a78 (line ~117) in "IndexFiles.cpp" > [8] main(argc = ???, argv = ???) (optimized), at 0x15c9c (line ~58) in > "Main.cpp" > > ----------------------------------------- > > > When configured with -DCMAKE_BUILD_TYPE="Release" option, cl_demo doesn't > build. > > cmake -G "Unix Makefiles" .. > -DCMAKE_INSTALL_PREFIX="/lan/ops/cdnshelp/cops/tools/clucene/sun4v/32bit/release" > -DCMAKE_CXX_FLAGS:STRING="-library=stlport4" > -DCMAKE_EXE_LINKER_FLAGS:STRING="-library=stlport4" > -DCMAKE_BUILD_TYPE="Release" > > make cl_demo gives this linking error. 
> > Linking CXX executable ../../bin/cl_demo > Undefined first referenced > symbol in file > lucene::document::Field::Field(const > wchar_t*,lucene::util::ValueArray<unsigned char>*,int,const bool) > ../../bin/libclucene-core.so.0.9.23.0 > std::string lucene::util::Misc::toString(const unsigned) > ../../bin/libclucene-core.so.0.9.23.0 > lucene::index::IndexWriter::IndexWriter(lucene::store::Directory*,lucene::analysis::Analyzer*,const > bool,const bool) ../../bin/libclucene-core.so.0.9.23.0 > lucene::store::FSDirectory*lucene::store::FSDirectory::getDirectory(const > char*,const bool,lucene::store::LockFactory*) > ../../bin/libclucene-core.so.0.9.23.0 > lucene::index::IndexWriter::IndexWriter(const > char*,lucene::analysis::Analyzer*,const bool) > CMakeFiles/cl_demo.dir/IndexFiles.o > ld: fatal: Symbol referencing errors. No output written to > ../../bin/cl_demo > make[3]: *** [bin/cl_demo] Error 1 > make[2]: *** [src/demo/CMakeFiles/cl_demo.dir/all] Error 2 > make[1]: *** [src/demo/CMakeFiles/cl_demo.dir/rule] Error 2 > make: *** [cl_demo] Error 2 > > ----------------------------------------- > > Debug mode doesn't build. I have a fix for that I'll be submitting a patch > for that. > > cmake -G "Unix Makefiles" .. > -DCMAKE_INSTALL_PREFIX="/lan/ops/cdnshelp/cops/tools/clucene/sun4v/32bit/debug" > -DCMAKE_CXX_FLAGS:STRING="-library=stlport4" > -DCMAKE_EXE_LINKER_FLAGS:STRING="-library=stlport4" > -DCMAKE_BUILD_TYPE="Debug" > > This is the error > > [ 61%] Building CXX object > src/core/CMakeFiles/clucene-core.dir/CLucene/index/TermInfosReader.o > "/lan/ops/cdnshelp/cops/users/pankajj/work/foss/clucene/clucene-HEAD-8484291/src/core/CLucene/store/IndexInput.h", > line 194: Warning: lucene::store::BufferedIndexInput::getObjectName hides > the virtual function lucene::store::IndexInput::getObjectName() const. 
> "/lan/ops/cdnshelp/cops/users/pankajj/work/foss/clucene/clucene-HEAD-8484291/src/core/CLucene/index/TermInfosReader.cpp", > line 114: Error: The operation "-- lucene::util::__LUCENE_ATOMIC_INT" is > illegal. > 1 Error(s) and 1 Warning(s) detected. > make[2]: *** > [src/core/CMakeFiles/clucene-core.dir/CLucene/index/TermInfosReader.o] Error > 1 > make[1]: *** [src/core/CMakeFiles/clucene-core.dir/all] Error 2 > make: *** [all] Error 2 > > > ----------------------------------------- > > > On Tue, Nov 16, 2010 at 8:14 AM, Pankaj Jangid <pan...@gm...>wrote: > >> We are using .9.21 and it works but the HEAD is quite different from it. >> I'll send a stack-trace today. >> >> -- >> Regards >> Pankaj >> >> On Tue, Nov 16, 2010 at 7:39 AM, Ben van Klinken <bva...@gm...>wrote: >> >>> I did have it going at some stage... was a while back so probably >>> something new has broken it. >>> >>> ben >>> >>> >>> On Sun, Nov 14, 2010 at 2:26 AM, Itamar Syn-Hershko <it...@co...>wrote: >>> >>>> Hi, >>>> >>>> >>>> I assume the same segfault is given in both cases. Can you send a >>>> stacktrace? Also, please sync your copy with the latest git master HEAD. >>>> >>>> >>>> Itamar. >>>> >>>> >>>> On 12/11/2010 3:57 PM, Pankaj Jangid wrote: >>>> >>>> I have built clucene with this configuration on Solaris Sparc, >>>> >>>> cmake -G "Unix Makefiles" .. -DCMAKE_INSTALL_PREFIX="<cl_install_path>" >>>> -DCMAKE_CXX_FLAGS:STRING="-m32 -library=stlport4" >>>> -DCMAKE_EXE_LINKER_FLAGS:STRING="-library=stlport4" >>>> -DCMAKE_BUILD_TYPE="Debug|Release" >>>> >>>> I have tried various combinations - Debug, Release, -m32. With -m64 it >>>> doesn't build at all. >>>> >>>> cl_demo gives segfault during indexing. >>>> >>>> cl_test fails. Gives segfault. >>>> >>>> Have anybody tried clucene HEAD on Solaris/Sparc? 
>>>> >>>> -- >>>> Regards >>>> Pankaj >>>> >>>> >>>> ------------------------------------------------------------------------------ >>>> Centralized Desktop Delivery: Dell and VMware Reference Architecture >>>> Simplifying enterprise desktop deployment and management using >>>> Dell EqualLogic storage and VMware View: A highly scalable, end-to-end >>>> client virtualization framework. Read more!http://p.sf.net/sfu/dell-eql-dev2dev >>>> >>>> >>>> _______________________________________________ >>>> CLucene-developers mailing lis...@li...https://lists.sourceforge.net/lists/listinfo/clucene-developers >>>> >>>> >>>> >>>> ------------------------------------------------------------------------------ >>>> Centralized Desktop Delivery: Dell and VMware Reference Architecture >>>> Simplifying enterprise desktop deployment and management using >>>> Dell EqualLogic storage and VMware View: A highly scalable, end-to-end >>>> client virtualization framework. Read more! >>>> http://p.sf.net/sfu/dell-eql-dev2dev >>>> _______________________________________________ >>>> CLucene-developers mailing list >>>> CLu...@li... >>>> https://lists.sourceforge.net/lists/listinfo/clucene-developers >>>> >>>> >>> >>> >>> ------------------------------------------------------------------------------ >>> Beautiful is writing same markup. Internet Explorer 9 supports >>> standards for HTML5, CSS3, SVG 1.1, ECMAScript5, and DOM L2 & L3. >>> Spend less time writing and rewriting code and more time creating great >>> experiences on the web. Be a part of the beta today >>> http://p.sf.net/sfu/msIE9-sfdev2dev >>> >>> _______________________________________________ >>> CLucene-developers mailing list >>> CLu...@li... >>> https://lists.sourceforge.net/lists/listinfo/clucene-developers >>> >>> >> > |
From: Alexander E. <ego...@gm...> - 2010-12-21 06:02:28
|
I reproduced the problem (on the latest git branch). Can anyone fix this bug?

2010/11/12 <ps...@zo...>:
> In this test we add 3 docs and then update them. Then we search for the
> docs with sorting by doc_id (desc); after some updates, search() stops
> sorting the docs.
>
> search 1: // all ok
> id = 3, score = 0.999667
> id = 2, score = 0.999667
> id = 1, score = 0.999667
>
> search 2: // all ok
> id = 3, score = 0.999833
> id = 2, score = 0.999833
> id = 1, score = 0.999833
>
> search 3: // not sorted!
> id = 1, score = 0.999889
> id = 2, score = 0.999889
> id = 3, score = 0.999889
>
> search final: // not sorted!
> id = 3, score = 0.712318
> id = 1, score = 0.712318
> id = 2, score = 0.712318 |
From: Šplíchal J. <spl...@to...> - 2010-12-14 08:36:48
|
Hi Andrew,

I also found a way to fix it, but I don't like it because it causes additional unnecessary seeks in the file. Could you please send me your version of the BufferedIndexInput class (just the files you have changed)? I would test it and publish it to the GIT repository together with some extended tests.

Thanks,
Jiri

-----Original Message-----
From: Andrew McCann [mailto:mc...@de...]
Sent: Tuesday, December 14, 2010 12:07 AM
To: clu...@li...
Subject: Re: [CLucene-dev] read past EOF ERROR while searching

Hi guys,

I've never responded to this list, but I have encountered this bug myself and made a fix (it took a while to track down). Jiri is on the right track; that was the problem I had. I refactored the BufferedIndexInput class slightly to prevent it from happening; it was a minimal change. I'm not sure what the procedure is for submitting fixes, though.

-Andrew

2010/12/13 Šplíchal Jiří <spl...@to...>:
> [earlier messages and test code quoted in full; trimmed here]
From: Andrew M. <mc...@de...> - 2010-12-14 00:08:35
|
Hi guys,

I've never responded to this list, but I have encountered this bug myself and made a fix (it took a while to track down). Jiri is on the right track; that was the problem I had. I refactored the BufferedIndexInput class slightly to prevent it from happening; it was a minimal change. I'm not sure what the procedure is for submitting fixes, though.

-Andrew

2010/12/13 Šplíchal Jiří <spl...@to...>:
> Hi,
>
> some more details:
>
> the exception comes from the method
>     void FSDirectory::FSIndexInput::readInternal(uint8_t* b, const int32_t len)
> which is called from
>     void BufferedIndexInput::refill()
>
> The strange thing is: in the refill buffer there is a member start with
> the value 1024, but when calling the readInternal method, both members
> handle->_fpos and _pos have the value 1025 (one more), and so it tries
> to read different data from the file.
>
> Hope it helps.
>
> Jiri
>
> [original report and test code quoted in full; trimmed here]
From: Šplíchal J. <spl...@to...> - 2010-12-13 20:44:36
|
Hi,

some more details:

the exception comes from the method

    void FSDirectory::FSIndexInput::readInternal(uint8_t* b, const int32_t len)

which is called from

    void BufferedIndexInput::refill()

The strange thing is: in the refill buffer there is a member start with the value 1024, but when calling the readInternal method, both members handle->_fpos and _pos have the value 1025 (one more), and so it tries to read different data from the file.

Hope it helps.

Jiri

From: Šplíchal Jiří [mailto:spl...@to...]
Sent: Monday, December 13, 2010 9:28 PM
To: clu...@li...
Subject: [CLucene-dev] read past EOF ERROR while searching

[original report and test code quoted in full; trimmed here]
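The invariant Jiri's numbers point at (buffer start 1024 vs. file position 1025) is the heart of any buffered reader: before the low-level read runs, the position it reads from must equal the offset just past the data already buffered. A toy sketch of a refill loop that maintains this invariant (plain C++ over an in-memory "file"; none of these names are the actual CLucene classes):

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Toy buffered input illustrating the invariant from the bug report:
// each refill must read from exactly bufferStart_ + buffer_.size(),
// i.e. from the current logical position. If a stale cached object
// advances the underlying file position by one extra byte (as in the
// reported handle->_fpos == 1025 vs. start == 1024), every later
// refill returns shifted data and the reader eventually runs past EOF.
class ToyBufferedInput {
public:
    ToyBufferedInput(const std::string& file, std::size_t bufSize)
        : file_(file), bufSize_(bufSize) {}

    // Returns the next byte, refilling the buffer when it is exhausted.
    // Returns -1 at end of file instead of reading past it.
    int readByte() {
        if (pos_ >= bufferStart_ + buffer_.size()) {
            if (!refill()) return -1;  // clean EOF, no "read past EOF"
        }
        return static_cast<unsigned char>(buffer_[pos_++ - bufferStart_]);
    }

private:
    bool refill() {
        bufferStart_ = pos_;  // invariant: refill reads from pos_ exactly
        buffer_.clear();
        if (bufferStart_ >= file_.size()) return false;
        buffer_ = file_.substr(bufferStart_, bufSize_);
        return true;
    }

    std::string file_;
    std::size_t bufSize_;
    std::string buffer_;
    std::size_t bufferStart_ = 0;  // file offset of buffer_[0]
    std::size_t pos_ = 0;          // logical read position in the file
};
```

If refill() instead read from bufferStart_ + 1, the first byte of every buffer would be skipped and the final read would reach one byte beyond the file, which is exactly the failure mode described above.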
From: Šplíchal J. <spl...@to...> - 2010-12-13 20:28:12
|
Hi,

we found a serious problem while searching. In some situations (probably depending on the index size), repeating a search does not return correct results or even ends with the CLuceneError "read past EOF".

To reproduce the error, the following sequence must be executed:

1) run a query
2) delete an instance of Analyzer (it can be instantiated earlier, but in the same thread) - this causes the ThreadLocals objects to be freed
3) run the same query - THIS FAILS!!

In all the cases where the search failed, we had switched off the compound files. Could someone help us with this issue? It seems that reading the index files is not working correctly.

Jiri

PS: The following code is a test that demonstrates the problem:

/**
 * Create index
 */
Directory* prepareDirectory1()
{
    const TCHAR * tszDocText = _T( "a b c d e f g h i j k l m n o p q r s t u v w x y z ab bb cb db eb fb gb hb ib jb kb lb mb nb ob pb qb rb sb tb ub vb wb xb yb zb ac bc cc dc ec fc gc hc ic jc kc lc mc nc oc pc qc rc sc tc uc vc wc xc yc zc ad bd cd dd ed fd gd hd id jd kd ld md nd od pd qd rd sd td ud vd wd xd yd zd ae be ce de ee fe ge he ie je ke le me ne oe pe qe re se te ue ve we xe ye ze af bf cf df ef ff gf hf if jf kf lf mf" );

    char fsdir[CL_MAX_PATH];
    _snprintf( fsdir, CL_MAX_PATH, "%s/%s", cl_tempDir, "test.search" );

    WhitespaceAnalyzer analyzer;
    Directory* pDirectory = (Directory*)FSDirectory::getDirectory( fsdir );
    IndexWriter writer( pDirectory, &analyzer, true );

    writer.setUseCompoundFile( false );

    Document* d = _CLNEW Document();
    d->add( *_CLNEW Field( _T("_content"), tszDocText, Field::STORE_NO | Field::INDEX_TOKENIZED ));
    writer.addDocument( d );
    _CLDELETE( d );
    writer.close();

    return pDirectory;
}

/**
 * Run test
 */
void testReadPastEOF(CuTest *tc)
{
    Directory* pDirectory = prepareDirectory1();
    Analyzer * pAnalyzer = NULL;
    Hits * pHits = NULL;
    IndexReader* pReader = IndexReader::open( pDirectory );
    IndexSearcher searcher( pReader );

    CLUCENE_ASSERT( pReader->numDocs() == 1 );

    Term * t1 = new Term( _T( "_content" ), _T( "ze" ) );
    TermQuery * pQry1 = new TermQuery( t1 );
    _CLDECDELETE( t1 );

    pAnalyzer = new SimpleAnalyzer();
    pHits = searcher.search( pQry1 );
    _ASSERT( pHits->length() == 1 );
    CLUCENE_ASSERT( pHits->length() == 1 );
    _CLDELETE( pHits );

    // Removing the analyzer causes removal of the ThreadLocals - including the cached SegmentTermEnum
    _CLDELETE( pAnalyzer );

    // THE NEXT CALL WILL FAIL
    pHits = searcher.search( pQry1 );
    _ASSERT( pHits->length() == 1 );
    CLUCENE_ASSERT( pHits->length() == 1 );
    _CLDELETE( pHits );

    _CLDELETE( pQry1 );

    searcher.close();
    _CLDELETE( pReader );

    pDirectory->close();
    _CLDECDELETE( pDirectory );
}
From: Miller, B. (QuickWire) <bm...@qu...> - 2010-12-10 21:27:51
|
Hi all,

I've been implementing MultiSearcher and have a problem that may be more of a 'Lucene conceptual' thing than a bug. I'm running a fairly new 2_3_2 git under Windows (VS 2010).

My problem is that when I pass a sort to MultiSearcher::search() I receive many duplicate hits (note that all doc ids are unique across both indexes I'm testing with). However, if I only ask for 100 hits max, there are no duplicates. Also, if I do not pass a sort, there are no duplicates.

I find it rather hard to follow, but I'm guessing that the 'good' 100 docs come from the initial search, and the duplicates are caused by Hits::getMoreDocs() eventually calling MultiSearcher::search() again and for some reason adding the same hits each time.

Should I be expecting this behavior?

Thanks,
Bill
From: Matt R. <mr...@mr...> - 2010-12-05 17:43:46
|
Shoot, I wish I had noticed this earlier:

http://clucene.git.sourceforge.net/git/gitweb.cgi?p=clucene/clucene;a=commit;h=de5695332badddc264c3e187350463d9d6ee4a8a

It looks like someone else had already found and fixed the bug. I wish I had found that; it would have saved me a lot of time. Oh well.

On Dec 4, 2010, at 1:40 PM, Matt Ronge wrote:
> I've been working off the head of CLucene (which is great!) and I ran into a memory smasher.
>
> [original report and patch quoted in full; trimmed here]

--
Matt Ronge
Central Atomics
Makers of Rocketbox
http://www.getrocketbox.com
mr...@ce...
From: Matt R. <mr...@mr...> - 2010-12-04 19:56:09
|
I've been working off the head of CLucene (which is great!) and I ran into a memory smasher.

I had strange issues where, after some queries, the search results would start to get more and more incorrect until the app would crash. After much debugging I was able to confirm that this only occurred for queries whose lengths were multiples of 8.

After even more debugging I found that the KeywordTokenizer (which I was using for my queries) was allocating term buffers whose sizes were multiples of 8 (suspicious!). It turns out that if the token length is a multiple of 8 and the KeywordTokenizer attempts to null-terminate the string, it writes off the end of the array, causing memory corruption. Normally you don't see this, because the corruption is silent and the token length must be an exact multiple of 8.

To fix this, I make sure to add room for the null terminator if the buffer is already full. Here is my patch:

diff --git a/src/core/CLucene/analysis/Analyzers.cpp b/src/core/CLucene/analysis/Analyzers.cpp
index 0c34a60..39fec43 100644
--- a/src/core/CLucene/analysis/Analyzers.cpp
+++ b/src/core/CLucene/analysis/Analyzers.cpp
@@ -556,6 +556,9 @@ Token* KeywordTokenizer::next(Token* token){
     if ( termBuffer == NULL ){
         termBuffer=token->resizeTermBuffer(token->bufferLength() + 8);
     }
+    if (upto == token->bufferLength()) {
+        termBuffer = token->resizeTermBuffer(token->bufferLength() + 1);
+    }
     termBuffer[upto]=0;
     token->setTermLength(upto);
     return token;

Let me know if I should open a bug for this.

Thanks again for clucene!

--
Matt Ronge
Central Atomics
Makers of Rocketbox
http://www.getrocketbox.com
mr...@ce...
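The overflow pattern and the guard from the patch can be reproduced standalone. This sketch (plain C++ with hypothetical names, not the actual Token class) grows a buffer in 8-byte steps the way the tokenizer does, and applies the same "make room for the terminator" guard:

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Mimics the KeywordTokenizer pattern: characters are appended into a
// buffer that only ever grows in steps of 8, and the token is then
// null-terminated. Without the guard below, a token whose length is an
// exact multiple of 8 fills the buffer completely and the terminator
// is written one byte past the end.
std::string copyToken(const char* src) {
    std::vector<char> buf;
    std::size_t upto = 0;
    for (const char* p = src; *p != '\0'; ++p) {
        if (upto == buf.size())
            buf.resize(buf.size() + 8);   // grow in multiples of 8
        buf[upto++] = *p;
    }
    // The guard from the patch: if the buffer is exactly full, make
    // room for the terminator instead of writing off the end.
    if (upto == buf.size())
        buf.resize(buf.size() + 1);
    buf[upto] = '\0';
    return std::string(buf.data());
}
```

With std::vector the unguarded write would be caught by a checked build; with a raw heap array, as in the tokenizer, it silently corrupts the adjacent allocation, which matches the crash pattern described above.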
From: Kostka B. <ko...@to...> - 2010-11-28 12:38:41
|
I'll look into why it crashes. But in any case, you should call Query::rewrite and pass the resulting query to the highlighter instead of the original query (see Query.h for a description of rewrite). Otherwise the highlighter won't highlight words related to the wildcard term.

Borek

From: muhammad ismael [mailto:m.i...@gm...]
Sent: Saturday, November 27, 2010 3:19 PM
To: CLu...@li...
Subject: [CLucene-dev] crash when using highlighter with the wildcardQuery!!!

Hello everybody,

When I use WildcardQuery to search for a word, the search completes successfully, but when I try to highlight using a WildcardQuery the application crashes. I think this is because of the function Query::extractTerms. Can anybody help solve this problem?

--
Sincerely
--------------------
Mohammad Ismael
Software Developer
Mobile: +20114753575
From: Kostka B. <ko...@to...> - 2010-11-28 12:29:20
|
Hi,

this is a known problem. Unfortunately, the behavior is caused by the way the highlighter is implemented. Doing exact highlighting is a much more complex task (especially for span queries). I remember there was some discussion about this a few months ago; try searching the mailing list archive.

Borek

From: muhammad ismael [mailto:m.i...@gm...]
Sent: Saturday, November 27, 2010 9:30 PM
To: CLu...@li...
Subject: [CLucene-dev] how can i use the highlighter to highlight the exact term?

Hi,

I am searching for the term "hello world", for example, and I use the highlighter to highlight it. But the highlighter does not highlight only that phrase; it also highlights the word "hello" even when it is not followed by "world". How can I make the highlighter highlight the exact term only?

--
Sincerely
--------------------
Mohammad Ismael
Software Developer
Mobile: +20114753575
From: muhammad i. <m.i...@gm...> - 2010-11-27 20:30:13
|
Hi,

I am searching for the term "hello world", for example, and I use the highlighter to highlight it. But the highlighter does not highlight only that phrase; it also highlights the word "hello" even when it is not followed by "world". How can I make the highlighter highlight the exact term only?

--
Sincerely
--------------------
Mohammad Ismael
Software Developer
Mobile: +20114753575
From: muhammad i. <m.i...@gm...> - 2010-11-27 14:18:44
|
Hello everybody,

When I use WildcardQuery to search for a word, the search completes successfully, but when I try to highlight using a WildcardQuery the application crashes. I think this is because of the function Query::extractTerms. Can anybody help solve this problem?

--
Sincerely
--------------------
Mohammad Ismael
Software Developer
Mobile: +20114753575
From: Mark A. <ash...@ca...> - 2010-11-24 14:51:25
|
Hi Ben,

Thank you for the pointers. I will have a look at these options.

--
Mark.
Mark Ashworth
IBM Informix Extensibility Architect
Office phone: +1 (905) 413-5033
Alternate: +1 (905) 697-8094
Email: ash...@ca...

From: Ben van Klinken <bva...@gm...>
To: "clu...@li..." <clu...@li...>
Date: 11/09/2010 01:44 PM
Subject: Re: [CLucene-dev] wildcard search - unexpected results

Hi Mark,

this is the expected behaviour. A leading wildcard would require a full term-list scan; therefore, by default, there is a required prefix length. I think it's set on the query parser. Let me know if you can't find it and I'll try to find it.

If you are often doing prefixed wildcards, some people create a reversed field and use that field instead when appropriate. For example, a search for filename:*.doc can be transformed into searching for reverse_filename:cod.*. This can be achieved by overriding the query parser. A nasty but effective trick.

Ben

On Wednesday, November 10, 2010, Mark Ashworth <ash...@ca...> wrote:
> Hoping someone can explain this wildcard searching behaviour:
>
> I have an index of email addresses, including my own (ash...@ca...).
>
> I can search for:
>
> Searching for: a?hw...@ca...
> 1. ash...@ca... - 100.00
>
> Searching for: ashworth@ca.ibm*
> 1. ash...@ca... - 100.00
>
> Searching for: ?sh...@ca...
> 1. ash...@ca... - 100.00
>
> Searching for: a?hworth@ca.ibm*
> 1. ash...@ca... - 100.00
>
> Searching for: ?shworth@ca.ibm*
> Does not find any rows. This is when there is a leading single-character
> wildcard and a multi-character wildcard at the end.
>
> Is this a bug or expected behaviour?
>
> thanks in advance...
>
> From: Veit Jahns <nun...@go...>
> Subject: Re: [CLucene-dev] NearSpansUnordered bug fix
>
> 2010/11/6 Itamar Syn-Hershko <it...@co...>:
> > Where are we with the smart_pointers branch?
>
> You mean this activity. I thought you meant another one. Actually, I
> think about this every day ;). But in the last months I had not so
> much time to work on it. I hope I can change this in the next weeks.
>
> > And which of all the fix branches is ready to be merged to master?
>
> german_analyzer and tracker_3094661_fix
>
> Veit
>
> [remaining quoted text trimmed]
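Ben's reversed-field trick is independent of CLucene and easy to sketch: store the reversed term in an extra field at index time, and rewrite a leading-wildcard pattern into a trailing-wildcard pattern against that field. The sketch below is plain C++; the field names and helper functions are illustrative only, not part of any CLucene API:

```cpp
#include <cassert>
#include <string>

// Reverses a term for indexing into a hypothetical "reverse_filename"
// field, so that a leading-wildcard query can become a cheap
// trailing-wildcard query on the reversed field.
std::string reverseTerm(const std::string& s) {
    return std::string(s.rbegin(), s.rend());
}

// Rewrites a pattern with a single leading '*' (e.g. "*.doc") into the
// equivalent pattern on the reversed field (e.g. "cod.*"). Patterns
// without a leading '*' are returned unchanged.
std::string rewriteLeadingWildcard(const std::string& pattern) {
    if (pattern.empty() || pattern[0] != '*')
        return pattern;
    return reverseTerm(pattern.substr(1)) + "*";
}
```

A query parser override would apply reverseTerm to each value at index time and rewriteLeadingWildcard at query time, redirecting the rewritten pattern to the reversed field.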
From: Freiholz M. <M.F...@ca...> - 2010-11-23 08:01:08
|
Hello,

I just wanted to mention that I uploaded a few contrib lib files to enable spell-checking. I uploaded the 3 patch files to your bug-tracking system:

https://sourceforge.net/tracker/?func=detail&aid=3113462&group_id=80013&atid=558446

Sorry for the 3 patch files; I'm not very familiar with git (Subversion ftw^^), but I'm working on it ;)

Greetings,
Manuel Freiholz.
From: muhammad i. <m.i...@gm...> - 2010-11-20 10:54:04
|
Dear Veit,

I want to thank you for your response. Everything works fine now: the highlighter returns the correct string. I had to index the text after encoding it in UTF-8.

On Sat, Nov 20, 2010 at 12:29 AM, Veit Jahns <nun...@go...> wrote:
> 2010/11/17 muhammad ismael <m.i...@gm...>:
> > Dear Veit,
> > I want to thank you for the patch you sent me.
> > I patched my Highlighter with your patch and it compiled successfully.
>
> Thanks! Then I will commit the patch to the CLucene repository.
>
> > But I have another question: I am using CLucene to index and search
> > Arabic text, and so far everything works fine. But when I use the
> > highlighter to return the text fragment, it returns the text as
> > wchar_t. It is supposed to be Unicode, but when I display it, it
> > appears as strange characters, not Arabic. Can you help me solve
> > this problem?
>
> I will try. Encoding is sometimes very confusing to me, too. On which
> platform do you work? wchar_t has different sizes on different
> platforms, as one of my colleagues explained to me: 2 bytes on Windows
> and 4 bytes on Linux. Maybe this is the reason.
>
> Also, can you provide a small test case? Then I can play a little bit
> with the highlighter. Some months ago I also made some experiments
> with Arabic characters, but as I am not familiar with the Arabic
> language it was just moving symbols around, without knowing whether
> the characters formed a useful word.
>
> Regards,
> Veit
>
> PS: Do you mind if we continue the discussion on the mailing list?
> The discussion may be useful for others, or someone else who knows
> more about these encoding issues can provide further input.

--
Sincerely
--------------------
Mohammad Ismael
Software Developer
Mobile: +20114753575
From: Veit J. <nun...@go...> - 2010-11-19 22:31:52
|
2010/11/16 Šplíchal Jiří <spl...@to...>:
> We are using the merge of the two branches without problems = all tests
> pass in debug and also in release on Win7 64-bit. But there are still
> some memory leaks left.

This branch also contains the fix for the missing include of limits.h (see tracker 3083768 [1]).

Veit

[1] http://sourceforge.net/tracker/?func=detail&aid=3083768&group_id=80013&atid=558446
From: Šplíchal J. <spl...@to...> - 2010-11-18 08:38:15
|
Hello,

there is a bug in the constructor of the wildcard query when setting the termContainsWildcard member variable. The existing test checked whether at least one of the characters *? was NOT contained in the string, instead of checking that at least one IS contained. I pushed the fix to the wildcardquery_fix branch. Plus, I added one more fix to the memory-leaks branch.

Jiri

--
Jiří Šplíchal
TOVEK, spol. s r.o.
spl...@to...
+420 606671930
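For reference, the correct form of the check is a one-liner; the buggy version effectively asked whether one of the wildcard characters was absent, which holds for nearly every term. A minimal sketch (plain C++, illustrative only; the real code lives in the WildcardQuery constructor):

```cpp
#include <cassert>
#include <string>

// A term contains a wildcard if at least one of '*' or '?' appears
// anywhere in it. The inverted check (testing that one of them is
// missing from the term) is true for almost any term, which is the
// reported bug.
bool termContainsWildcard(const std::wstring& term) {
    return term.find_first_of(L"*?") != std::wstring::npos;
}
```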