From: Marco M. - L. O. s. <m.m...@lo...> - 2008-10-31 18:02:35
Hi there,

as you know, LogicalDOC has an interesting feature that other DMSs don't have: search by similarity. We want to preserve this feature, but at present all the 'search by similarity' machinery is based on the Term business entity and its table LD_TERM. This approach *kills* performance on very large document archives, since for each document 20 records are also created in LD_TERM. Imagine a population of 1M documents: in this case we would get 20M records in LD_TERM!

So our intention is to drop the Term object and all its associated logic and reimplement the feature following one of these two approaches:

1. Use the Lucene index: similar documents are those that have a similar value in the 'content' field of the Lucene index.
2. Use keywords (table LD_KEYWORD): similar documents are those that share at least one keyword.

I think the first approach is the better one, since the second ignores documents without keywords.

So in the next few days we will proceed with this refactoring, if you have no objections. The final goal is always to achieve higher performance.

PS: The new TestBench tool is giving us a lot of satisfaction; we will publish our benchmarks as soon as possible.

Best regards

---------------------------------------------------------------
ing. Marco Meschieri
e-mail: m.m...@lo... <mailto:m.m...@lo...>
---------------------------------------------------------------
Logical Objects snc
Via Bonasi, 2/A 41012 Carpi (MO) Italy
Tel./Fax. 059 688969
web: http://www.logicalobjects.it

--
NOTICE PURSUANT TO LEGISLATIVE DECREE (DLGS) 196/2003: The information contained in this e-mail message and/or in the attached file(s) is to be considered strictly confidential. Its use is permitted exclusively to the addressee of the message, for the purposes indicated in the message itself. Should you receive this message without being the addressee, we kindly ask you to notify us by e-mail and to destroy the message, deleting it from your system; retaining the message, disclosing it even in part, distributing it to other parties, copying it, or using it for different purposes constitutes conduct contrary to the principles laid down by Dlgs 196/2003. The data controller is Logical Objects SNC Via Bonasi 2/A 41012 CARPI (MO) Tel. 059/688969 Fax 059/688969

This e-mail and any file transmitted with it is intended only for the person or entity to which it is addressed and may contain information that is privileged, confidential or otherwise protected from disclosure. Copying, dissemination or use of this e-mail or the information herein by anyone other than the intended recipient is prohibited. If you have received this e-mail by mistake, please notify us immediately by telephone or fax. The data controller is Logical Objects SNC Via Bonasi 2/A 41012 CARPI (MO) Tel. 059/688969 Fax 059/688969
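
To make the second approach from the message above more concrete (similar documents are those that share at least one keyword), here is a minimal sketch. It is not LogicalDOC code: the function names, the in-memory dictionary standing in for the rows of LD_KEYWORD, and the ranking by number of shared keywords are all assumptions made for illustration, and Python is used only for brevity.

```python
from collections import defaultdict


def build_keyword_index(doc_keywords):
    """Invert {doc_id: set_of_keywords} into {keyword: set_of_doc_ids}."""
    index = defaultdict(set)
    for doc_id, keywords in doc_keywords.items():
        for keyword in keywords:
            index[keyword].add(doc_id)
    return index


def similar_by_keywords(doc_id, doc_keywords, index):
    """Return ids of documents sharing at least one keyword with doc_id.

    Results are ordered by number of shared keywords (most first),
    with ties broken by document id. The ranking is an assumption;
    the message only requires "at least one keyword" in common.
    """
    shared = defaultdict(int)
    for keyword in doc_keywords.get(doc_id, ()):
        for other in index.get(keyword, ()):
            if other != doc_id:
                shared[other] += 1
    return sorted(shared, key=lambda d: (-shared[d], d))


# Hypothetical sample data standing in for the contents of LD_KEYWORD.
docs = {
    1: {"invoice", "2008", "acme"},
    2: {"invoice", "acme"},
    3: {"contract", "acme"},
    4: set(),  # a document without keywords
}
index = build_keyword_index(docs)

print(similar_by_keywords(1, docs, index))  # [2, 3]
print(similar_by_keywords(4, docs, index))  # []
```

Document 4 shows the limitation named in the message: a document without keywords can never be matched by this approach, which is one reason to prefer the Lucene-based option.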