From: Rzepa, Henry <h.rzepa@im...> - 2004-11-22 11:38:02
We are indexing and searching for long strings that contain separators. htdig breaks these up into tokens
and then does a boolean AND on them.
We were testing how many tokens are tolerated before htdig is unable to distinguish
between two strings. Tests show that if the string is broken into 14 tokens, it
gives a unique result, but if two strings differ only in the 15th token, htdig does
not recognise the difference.
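
To make the test concrete, here is a minimal Python sketch of the behaviour we observe. It is a model of the symptom only, not htdig's code: the function names, the separator rule and the MAX_TOKENS cap are all assumptions made for illustration, with the value 14 taken from our tests rather than from any documented htdig setting.

```python
import re

# Hypothetical cap, taken from the behaviour observed in our tests;
# this is NOT a documented htdig setting.
MAX_TOKENS = 14

def tokenize(text):
    """Split on any run of non-alphanumeric characters and lowercase the pieces."""
    return [t.lower() for t in re.split(r"[^A-Za-z0-9]+", text) if t]

def matches(query, document, max_tokens=MAX_TOKENS):
    """Boolean AND over at most max_tokens query tokens (assumed truncation)."""
    doc_tokens = set(tokenize(document))
    return all(tok in doc_tokens for tok in tokenize(query)[:max_tokens])

# Two 15-token strings that differ only in the last token.
prefix = "-".join(f"t{i}" for i in range(1, 15))
doc_a = prefix + "-alpha"
doc_b = prefix + "-beta"

# With the assumed cap, a search for doc_a also matches doc_b.
print(matches(doc_a, doc_b))
# Without the cap, the 15th token tells the two strings apart.
print(matches(doc_a, doc_b, max_tokens=15))
```

If something like this truncation is happening, the two strings become indistinguishable as soon as they agree on the first 14 tokens, which is what we see.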
Can anyone comment on the hard-coded limit on the number of tokens in htdig (this is for V 3.1.6)?
Does perchance 3.2 differ in this regard?
Apologies if this is a developer question!
+44 (020) 7594 5774 (Voice); +44 (0870) 132 3747 (eFax); rzepahs@... (iChat)
http://www.ch.ic.ac.uk/rzepa/ Dept. Chemistry, Imperial College London, SW7 2AZ, UK.
(Voracious anti-spam filter in operation for received email.
If expected reply not received, please phone/fax).