Hi there,
as you know, LogicalDOC has an interesting feature that other DMSs don't
have: search by similarity.
We want to preserve this feature, but at present all the 'search by
similarity' machinery is based on the Term business entity and its
table LD_TERM.
This approach *kills* performance on very large document archives,
since 20 records are also created in LD_TERM for each document.
Imagine a population of 1M documents: in this case we will get 20M
records in LD_TERM!
So our intention is to drop the Term object and all its associated logic
and reimplement the feature following one of these two approaches:
1. Use the Lucene index: similar documents are those documents that
have a similar value in the 'content' field of the Lucene index.
2. Use keywords (table LD_KEYWORD): similar documents are those
documents that share at least one keyword.
I think the first approach is the better one, since the second ignores
documents without keywords.
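
To make the difference between the two approaches concrete, here is a
minimal in-memory sketch. It is not LogicalDOC code and it does not use
the Lucene API: the class name, the method names, the Jaccard word
overlap used as a stand-in for 'similar content', and the 0.5 threshold
are all assumptions made for illustration. A real implementation of
approach 1 would query the Lucene index, and approach 2 would query
LD_KEYWORD.

```java
import java.util.*;

// Hypothetical sketch: compares the two proposed similarity approaches
// on plain in-memory maps instead of the Lucene index and LD_KEYWORD.
public class SimilaritySketch {

    // Approach 2: documents sharing at least one keyword with docId.
    static Set<Long> byKeywords(long docId, Map<Long, Set<String>> keywords) {
        Set<String> ref = keywords.getOrDefault(docId, Collections.emptySet());
        Set<Long> result = new TreeSet<>();
        for (Map.Entry<Long, Set<String>> e : keywords.entrySet()) {
            // disjoint() is true when the two sets have no element in common
            if (e.getKey() != docId && !Collections.disjoint(ref, e.getValue())) {
                result.add(e.getKey());
            }
        }
        return result;
    }

    // Splits content into a set of lower-case words.
    static Set<String> tokens(String content) {
        Set<String> out = new HashSet<>();
        for (String t : content.toLowerCase(Locale.ROOT).split("\\W+")) {
            if (!t.isEmpty()) {
                out.add(t);
            }
        }
        return out;
    }

    // Jaccard overlap: |A intersect B| / |A union B|, 0.0 if both are empty.
    static double jaccard(Set<String> a, Set<String> b) {
        if (a.isEmpty() && b.isEmpty()) {
            return 0.0;
        }
        Set<String> inter = new HashSet<>(a);
        inter.retainAll(b);
        Set<String> union = new HashSet<>(a);
        union.addAll(b);
        return (double) inter.size() / union.size();
    }

    // Approach 1 (stand-in): documents whose content overlaps enough with docId.
    static Set<Long> byContent(long docId, Map<Long, String> contents, double threshold) {
        Set<String> ref = tokens(contents.getOrDefault(docId, ""));
        Set<Long> result = new TreeSet<>();
        for (Map.Entry<Long, String> e : contents.entrySet()) {
            if (e.getKey() != docId && jaccard(ref, tokens(e.getValue())) >= threshold) {
                result.add(e.getKey());
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Map<Long, String> contents = new HashMap<>();
        contents.put(1L, "Invoice for office supplies, March");
        contents.put(2L, "Invoice for office supplies, April");
        contents.put(3L, "Holiday photos from the beach");

        Map<Long, Set<String>> keywords = new HashMap<>();
        keywords.put(1L, new HashSet<>(Arrays.asList("invoice", "supplies")));
        keywords.put(2L, new HashSet<>());  // document 2 has no keywords
        keywords.put(3L, new HashSet<>(Arrays.asList("holiday")));

        System.out.println(byContent(1L, contents, 0.5));  // prints [2]
        System.out.println(byKeywords(1L, keywords));      // prints []
    }
}
```

In this toy data set document 2 is almost identical to document 1, yet
the keyword approach misses it because it was never tagged. That is the
limitation of approach 2 mentioned above.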
So in the next few days we will proceed with this refactoring, if you don't
have objections. The final purpose is always to achieve higher performance.
PS
The new tool TestBench is giving us a lot of satisfaction; we will
publish our benchmarks ASAP.
Best Regards
---------------------------------------------------------------
ing. Marco Meschieri
e-mail: m.m...@lo... <mailto:m.m...@lo...>
---------------------------------------------------------------
Logical Objects snc
Via Bonasi, 2/A 41012 Carpi (MO) Italy
Tel./Fax. 059 688969
web: http://www.logicalobjects.it
--
This e-mail and any file transmitted with it is intended only for the
person or entity to which it is addressed and may contain information that
is privileged, confidential or otherwise protected from disclosure.
Copying, dissemination or use of this e-mail or the information herein
by anyone other than the intended recipient is prohibited. If you have
received this e-mail by mistake, please notify us immediately by
telephone or fax. Proprietor of treatment is Logical Objects SNC Via
Bonasi 2/A 41012 CARPI (MO) Tel. 059/688969 Fax 059/688969