I just tried synonyms.txt with a translated word list of English and
This could give more and better results, because users sometimes (often)
forget to use search terms in other important languages.
The dictionary I used came from http://www.linguee.com/ and could be
turned into a comma-separated list after some replacements.
The entries and translations are not ideal for our purpose, but it
was a good start for checking how a synonyms.txt with 50,000 lines is handled.
The good news is: there was no significant delay for searches.
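The "replacements" step described above could be sketched roughly as follows. This is only an illustration, not the conversion that was actually used: the tab-separated input layout, the function name to_synonym_lines and the sample word pairs (with an arbitrarily chosen language pair) are all assumptions. It relies only on the fact that a line of comma-separated terms in synonyms.txt declares those terms as equivalent.

```python
import csv
import io

def to_synonym_lines(rows):
    """Turn (term, translation) rows into comma-separated synonym lines."""
    lines = []
    for row in rows:
        # Assumption: each input row has exactly two columns.
        if len(row) != 2:
            continue
        term, translation = row[0].strip(), row[1].strip()
        # Skip empty entries and pairs that differ only in case.
        if not term or not translation or term.lower() == translation.lower():
            continue
        # Skip entries containing "," or "=>", which have a special
        # meaning in synonyms.txt and would need escaping.
        if any(sep in text for text in (term, translation) for sep in (",", "=>")):
            continue
        lines.append(f"{term}, {translation}")
    return lines

# Invented sample data standing in for the exported dictionary.
sample = "library\tBibliothek\nbook\tBuch\nTaxi\ttaxi\n"
rows = list(csv.reader(io.StringIO(sample), delimiter="\t"))
synonym_lines = to_synonym_lines(rows)
print("\n".join(synonym_lines))
```

Entries that contain commas are simply dropped here to keep the sketch short; a real conversion would probably want to keep and escape them instead.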
Now my questions:
Is anybody using synonyms.txt with a thesaurus or dictionary?
How big could synonyms.txt grow?
Heiko Weier Technische Universität Hamburg-Harburg
Tel.: 040-42878-3449 Denickestr. 22
Fax.: 040-42878-2527 21073 Hamburg
WWW : http://www.tub.tu-harburg.de
D8DF 3FFD 3910 AB1C 12D2 940B 65B6 2A73 524A 2C1E
From: Till Kinstler <kinstler@gm...> - 2010-09-30 13:15:28
On 30.09.2010 14:52, Heiko Weier wrote:
> How big could synonyms.txt grow?
I experimented (we don't use this in production) with synonyms.txt files of
about 1 million lines, with two or three terms on each line. We considered
using synonym files to look up terms in authority files (e.g. SWD), but
stopped doing that, because the complex structure of e.g. SWD cannot
easily be mapped to a flat Solr synonyms file.
The synonyms file is loaded into RAM when Solr starts, which is why
synonym lookups are pretty fast. But startup time and RAM usage of course
grow considerably when you use synonym files of that size.
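
To illustrate the trade-off described above, here is a minimal sketch of an in-memory synonym table. It is not Solr's actual data structure or code; the names load_synonyms and expand and the sample lines are invented for this example. It only shows the general pattern: the whole file is read once at startup, which costs time and memory in proportion to its size, and each lookup afterwards is a single hash-table access.

```python
from collections import defaultdict

def load_synonyms(lines):
    """Build a term -> set-of-equivalents table from comma-separated lines."""
    table = defaultdict(set)
    for line in lines:
        line = line.strip()
        # Blank lines and "#" comment lines carry no synonyms.
        if not line or line.startswith("#"):
            continue
        terms = [t.strip().lower() for t in line.split(",") if t.strip()]
        # Every term on a line is treated as equivalent to all the others.
        for term in terms:
            table[term].update(terms)
    return table

def expand(table, term):
    """Return the term plus its known equivalents, sorted."""
    key = term.strip().lower()
    return sorted(table.get(key, {key}))

# Invented sample lines; a real file would be read from disk at startup.
table = load_synonyms(["library, Bibliothek", "book, Buch, volume", "# comment"])
print(expand(table, "Buch"))
print(expand(table, "unknown"))
```

With a million lines, a table like this holds every term in memory at once, which matches the growth in startup time and RAM usage mentioned above.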