From: Mohammad N. <naj...@go...> - 2011-01-10 07:43:00
Dear users and developers of Strigi,

I need some information about Strigi. I hope you can help me or give me some links where I can find answers to my questions.

System architecture:
- I need a general overview: which components belong to the system, and how do they work together?

Query engine:
- I need some information about the extent of the query language and what it is influenced by. Is there a relational DB as a backend in the architecture, so that the complete SQL language can be used? Or does it use Lucene, with the query language based on Lucene's primitives? If so, which fields are saved, which indexes are built, etc.?

Queries:
- Are there special (complex) queries that show Strigi's strengths (with good performance)?
- Which kinds of queries are less well supported?

I would really appreciate a detailed answer; it would be a great help for me. I have already searched the web for information, but I couldn't find a detailed description. I found some presentations, but they are not self-explanatory. I look forward to hearing from you.

Regards,
Julia
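For the Lucene part of the question above: a field-based inverted index of the kind CLucene provides can be sketched in a few lines. This is a toy model for orientation only, not Strigi's actual schema; the field names and documents here are invented:

```python
from collections import defaultdict

class FieldIndex:
    """Toy inverted index keyed by (field, term) -> set of document ids,
    mimicking the Lucene idea that every document is a bag of named fields."""
    def __init__(self):
        self.postings = defaultdict(set)
        self.docs = {}

    def add(self, doc_id, fields):
        # Store the document and index every whitespace-separated term
        # of every field under its field name.
        self.docs[doc_id] = fields
        for field, text in fields.items():
            for term in text.lower().split():
                self.postings[(field, term)].add(doc_id)

    def query(self, field, term):
        # A field-scoped term query, the basic primitive everything
        # else (boolean, phrase, range queries) is built from.
        return sorted(self.postings.get((field, term.lower()), set()))

idx = FieldIndex()
idx.add("a.txt", {"content": "foo bar", "name": "a.txt"})
idx.add("b.txt", {"content": "foo baz", "name": "b.txt"})
print(idx.query("content", "bar"))  # ['a.txt']
```

Real Lucene-style engines are strong at exactly this kind of term and boolean lookup, and comparatively weak at full relational joins, which is the practical difference from an SQL backend.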
From: <ki...@gm...> - 2010-05-21 07:53:00
I installed Strigi version 0.6.4-0ubuntu2 on Ubuntu 9.04 and also have libxapian15 (version 1.0.7-4) installed. I do not care about semantic search; I would just like to use it as a command-line fulltext search for plain text files, to compare the different backends.

1) Creating an index on Xapian does not work:

$ strigicmd query -t xapian -d xapian_index txts
n backends: 2
Invalid index type. Choose one from 'clucene', 'sopranobackend'.

Do I need to install another Xapian package, or is Xapian not supported anymore?

2) Searching an index built on CLucene does not give any hits. CLucene created these three files in the index dir: _3.cfs, deletable, segments, but it fails to retrieve a hit:

$ file `ls txts`
aaa.txt: ASCII text
bbb.txt: ASCII text
$ cat txts/*
foo bar
mit ohne
$ strigicmd query -t clucene -d clucene_index foo
n backends: 2
No results for search "foo"

This is the index creation output:

$ strigicmd create -t clucene -d clucene_index txts
n backends: 2
WARNING: field 'http://strigi.sf.net/ontologies/0.9#debugParseError' is not defined in any rdfs ontology database.
/usr/lib/strigi/strigita_font.so
/usr/lib/strigi/strigiea_jpeg.so
/usr/lib/strigi/strigita_wav.so
/usr/lib/strigi/strigila_namespaceharvester.so
/usr/lib/strigi/strigita_pcx.so
/usr/lib/strigi/strigila_xpm.so
/usr/lib/strigi/strigita_avi.so
/usr/lib/strigi/strigita_ico.so
/usr/lib/strigi/strigila_cpp.so
/usr/lib/strigi/strigita_dds.so
/usr/lib/strigi/strigila_deb.so
/usr/lib/strigi/strigita_xbm.so
/usr/lib/strigi/strigita_dvi.so
/usr/lib/strigi/strigita_au.so
/usr/lib/strigi/strigiea_ics.so
/usr/lib/strigi/strigita_gif.so
/usr/lib/strigi/strigita_sid.so
/usr/lib/strigi/strigila_txt.so
/usr/lib/strigi/strigita_rgb.so
/usr/lib/strigi/strigiea_vcf.so
WARNING: field 'maxLineLength' is not defined in any rdfs ontology database.
WARNING: field 'line ending format' is not defined in any rdfs ontology database.
WARNING: field 'dds volume depth' is not defined in any rdfs ontology database.
WARNING: field 'dds mipmap count' is not defined in any rdfs ontology database.
WARNING: field 'dds image type' is not defined in any rdfs ontology database.
WARNING: field 'font.family' is not defined in any rdfs ontology database.
WARNING: field 'font.weight' is not defined in any rdfs ontology database.
WARNING: field 'font.slant' is not defined in any rdfs ontology database.
WARNING: field 'font.width' is not defined in any rdfs ontology database.
WARNING: field 'font.spacing' is not defined in any rdfs ontology database.
WARNING: field 'font.foundry' is not defined in any rdfs ontology database.
WARNING: field 'content.version' is not defined in any rdfs ontology database.
WARNING: field 'document.stats.image_count' is not defined in any rdfs ontology database.
WARNING: field 'document.stats.image_name' is not defined in any rdfs ontology database.
WARNING: field 'document.stats.image_shared_rows' is not defined in any rdfs ontology database.
WARNING: field 'Product Id' is not defined in any rdfs ontology database.
WARNING: field 'Events' is not defined in any rdfs ontology database.
WARNING: field 'Journals' is not defined in any rdfs ontology database.
WARNING: field 'Todos' is not defined in any rdfs ontology database.
WARNING: field 'Todos Completed' is not defined in any rdfs ontology database.
WARNING: field 'Todos Overdue' is not defined in any rdfs ontology database.
WARNING: field 'content.thumbnail' is not defined in any rdfs ontology database.
WARNING: field 'ole.category' is not defined in any rdfs ontology database.
WARNING: field 'ole.presentationtarget' is not defined in any rdfs ontology database.
WARNING: field 'ole.manager' is not defined in any rdfs ontology database.
WARNING: field 'ole.company' is not defined in any rdfs ontology database.
WARNING: field 'document.stats.table_count' is not defined in any rdfs ontology database.
WARNING: field 'document.stats.object_count' is not defined in any rdfs ontology database.
WARNING: field 'http://rdf.openmolecules.net/0.9#moleculeCount' is not defined in any rdfs ontology database.

Any ideas?

Thanks, Kiki
From: Jos v. d. O. <jvd...@gm...> - 2009-07-20 11:30:27
2009/7/20 Michele Tameni <mic...@it...>:
> On Sun, Jul 19, 2009 at 8:02 PM, Jos van den Oever <jvd...@gm...> wrote:
>> You can have a look at the kio code in KDE or the xmlindexer code in Strigi [1].
>> The important thing is to have an indexwriter instance. That will
>> receive the data that is extracted. E.g. xmlindexer has an
>> xmlindexwriter.cpp.
>
> I think I don't need to use an indexwriter. I only need to have metadata
> extracted. This is where I'm getting lost. :) Any hint on how to do it?
> I'm going to look at the code now. :)
> Thanks, Michele

Hello Michele,

You do need an indexwriter. The IndexWriter is the class that receives the metadata. A common use case is to write this data to an index; in your case you may want to use it for something else. The name 'IndexWriter' is not the best name, as shown by your confusion. The metadata is extracted by analyzers and reported to an IndexWriter. Do have a look at the xmlindexer code. It is not much code and should clarify this issue for you.

Cheers,
Jos
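The analyzer/IndexWriter split described above can be sketched language-independently: analyzers push (field, value) pairs into whatever object implements the writer interface, and that object decides what to do with them. A minimal Python model (the class and function names here are invented, not Strigi's C++ API):

```python
class MetadataCollector:
    """Plays the role Strigi's IndexWriter plays: it receives field/value
    pairs reported by analyzers. Instead of writing an on-disk index,
    this one just collects the metadata in a dict, which is exactly the
    'use it for something else' case from the email above."""
    def __init__(self):
        self.data = {}

    def add_value(self, uri, field, value):
        self.data.setdefault(uri, {})[field] = value

def txt_analyzer(uri, text, writer):
    # A stand-in for a stream analyzer: derive trivial "metadata"
    # from the content and report it to the writer.
    writer.add_value(uri, "line.count", text.count("\n") + 1)
    writer.add_value(uri, "char.count", len(text))

collector = MetadataCollector()
txt_analyzer("file:///tmp/a.txt", "hello\nworld", collector)
print(collector.data)
# {'file:///tmp/a.txt': {'line.count': 2, 'char.count': 11}}
```

The design point is that analyzers never know where the metadata ends up; swapping the collector for a real index writer changes nothing on the analyzer side.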
From: Michele T. <mic...@it...> - 2009-07-20 09:51:03
On Sun, Jul 19, 2009 at 8:02 PM, Jos van den Oever <jvd...@gm...> wrote:
> You can have a look at the kio code in KDE or the xmlindexer code in Strigi [1].
> The important thing is to have an indexwriter instance. That will
> receive the data that is extracted. E.g. xmlindexer has an
> xmlindexwriter.cpp.

I think I don't need to use an indexwriter. I only need to have metadata extracted. This is where I'm getting lost. :) Any hint on how to do it? I'm going to look at the code now. :)

Thanks,
Michele

> Cheers,
> Jos
>
> [1] http://websvn.kde.org/trunk/kdesupport/strigi/src/xmlindexer/xmlindexer.cpp?view=markup

--
michele tameni
http://www.amdplanet.it
From: Jos v. d. O. <jvd...@gm...> - 2009-07-19 20:02:27
2009/7/16 Michele Tameni <mic...@it...>:
> Hi, I'm looking for some preliminary info on how to use an already-written
> streamanalyzer to extract metadata in a C++ program.
> Are there any docs or examples on how to do that?
> Or can someone tell me which source file I can look at for inspiration?
> Thanks a lot.

Hello Michele,

You can have a look at the kio code in KDE or the xmlindexer code in Strigi [1]. The important thing is to have an indexwriter instance. That will receive the data that is extracted. E.g. xmlindexer has an xmlindexwriter.cpp.

Cheers,
Jos

[1] http://websvn.kde.org/trunk/kdesupport/strigi/src/xmlindexer/xmlindexer.cpp?view=markup
From: Michele T. <mic...@it...> - 2009-07-16 15:59:55
Hi, I'm looking for some preliminary info on how to use an already-written streamanalyzer to extract metadata in a C++ program. Are there any docs or examples on how to do that? Or can someone tell me which source file I can look at for inspiration? Thanks a lot.

Michele

--
michele tameni
From: Evgeny E. <phr...@gm...> - 2009-03-09 18:29:37
On 6 March 2009 17:28:41 Andreas Demmer wrote:
> I discussed the following issue on the nepomuk-kde mailing list, but this
> mailing list should most probably be aware of the issue too.
>
> I reported that Strigi indexes my PDF filenames but none of the contents.

PDF indexing is broken at the moment. Hopefully it will be fixed soon...

--
Evgeny
From: Andreas D. <ma...@an...> - 2009-03-06 15:29:08
I discussed the following issue on the nepomuk-kde mailing list, but this mailing list should most probably be aware of the issue too. I reported that Strigi indexes my PDF filenames but none of the contents. Here is the transcript of my conversation with Egon Willighagen:

------
Egon Willighagen:
> Strigi uses pdftotext, but it is reported to not work on Ubuntu 8.10.

pdftotext is installed on my system and reads all text from the PDF file as expected.

> You can test with the xmlindexer helper application:
> $ xmlindexer your.pdf
> If the full text shows up in the output, then your Strigi
> installation does PDF full text indexing.

The XML does not contain the PDF text, so something seems broken on my system. My Strigi version is 0.6.4. "rpm -ql strigi" lists all Strigi modules, but no PDF module:

/usr/lib/strigi/strigiea_jpeg.so
/usr/lib/strigi/strigiindex_clucene.so
/usr/lib/strigi/strigila_cpp.so
/usr/lib/strigi/strigila_deb.so
/usr/lib/strigi/strigila_namespaceharvester.so
/usr/lib/strigi/strigila_txt.so
/usr/lib/strigi/strigila_xpm.so
/usr/lib/strigi/strigita_au.so
/usr/lib/strigi/strigita_avi.so
/usr/lib/strigi/strigita_dds.so
/usr/lib/strigi/strigita_gif.so
/usr/lib/strigi/strigita_ico.so
/usr/lib/strigi/strigita_pcx.so
/usr/lib/strigi/strigita_rgb.so
/usr/lib/strigi/strigita_sid.so
/usr/lib/strigi/strigita_wav.so
/usr/lib/strigi/strigita_xbm.so

Am I missing anything?

Greets,
Andreas

--
Skype: andreas.demmer
ICQ: 103 924 771
http://www.andreas-demmer.de
From: Enrique V. <ev...@gm...> - 2009-02-18 15:59:52
Hi, I'd like to know whether building on Windows is more or less up to date, and which filetypes are supported. Do I need anything else besides CLucene? Is MinGW enough to build? I'd like to use it instead of proprietary engines. Thanks
From: Antti M. <za...@cs...> - 2008-10-15 11:21:18
Hi,

It is my understanding that Strigi can index JPEG comments and EXIF info. I have a lot of photos, and all are commented with a JPEG comment. I hate Picasa and similar services that separate metadata from the real data (I'm OK with having separate metadata for indexing purposes etc., but I don't like the idea that if my metadata DB gets hosed, I lose all information).

Anyway, testing with Strigi 0.5.10, I told it to index a folder full of JPEGs. All are commented, and all have EXIF information in place. E.g., trying to index my trip to Australia+NZ, sample picture:

$ jhead 071016-112348.jpg
File name    : 071016-112348.jpg
File size    : 2951625 bytes
File date    : 2007:10:16 11:23:48
Camera make  : Canon
Camera model : Canon EOS 20D
Date/Time    : 2007:10:16 11:23:48
Resolution   : 3504 x 2336
Flash used   : No
Focal length : 26.0mm (35mm equivalent: 42mm)
CCD width    : 22.48mm
Exposure time: 0.0025 s (1/400)
Aperture     : f/8.0
ISO equiv.   : 100
Whitebalance : Manual
Light Source : Daylight
Metering Mode: matrix
Exposure     : aperture priority (semi-auto)
Comment      : Darling Harbour.

The config is right; I've set it up to index that directory:

$ cat .strigi/daemon.conf
<strigiDaemonConfiguration useDBus='1'>
 <repository name='localhost' writeable='1' pollingInterval='180' urlbase='' indexdir='/home/zarhan/.strigi/clucene' type='clucene'>
  <path path='/home/zarhan/071014-29_AusNZ'>
  </path>
 </repository>
 <filters>
  <filter pattern='.*/' include='0'>
  </filter>
  <filter pattern='.*' include='0'>
  </filter>
 </filters>
</strigiDaemonConfiguration>

Strigiclient says "1027 documents indexed, 4084 unique words indexed". The problems start here: I can't search for my files, even with the full name, via strigiclient. With strigicmd I get:

$ strigicmd query -t clucene -d ~/.strigi/clucene/ "name=071016-112348.jpg"
n backends: 1
http://freedesktop.org/standards/xesam/1.0/core#name
Results for search "name=071016-112348.jpg"
"/home/zarhan/071014-29_AusNZ/071016-111948.jpg" matched
- mimetype: image/jpeg
- sha1:
- size: 3205785
- mtime: Tue Oct 16 11:19:48 2007
- fragment:
- http://freedesktop.org/standards/xesam/1.0/core#fileExtension: jpg
- http://freedesktop.org/standards/xesam/1.0/core#name: 071016-111948.jpg
- http://strigi.sf.net/ontologies/0.9#depth: 0
- http://strigi.sf.net/ontologies/0.9#parentUrl:
"/home/zarhan/071014-29_AusNZ/071016-111951.jpg" matched
- mimetype: image/jpeg
- sha1:
- size: 2421644
- mtime: Tue Oct 16 11:19:51 2007
- fragment:
- http://freedesktop.org/standards/xesam/1.0/core#fileExtension: jpg
- http://freedesktop.org/standards/xesam/1.0/core#name: 071016-111951.jpg
- http://strigi.sf.net/ontologies/0.9#depth: 0
- http://strigi.sf.net/ontologies/0.9#parentUrl:
"/home/zarhan/071014-29_AusNZ/071016-112039.jpg" matched
- mimetype: image/jpeg
- sha1:
(...)
Query "name=071016-112348.jpg" returned 33 results

What is this? 33 results for an exact name via strigicmd? Yet no answers when I put that same name into strigiclient (or the strigi:/ kioslave; I'm running KDE 3.5.9).

However, the primary purpose is not to do name-based matches; I want to search by JPEG comments. So:

$ strigicmd listFields -t clucene -d ~/.strigi/clucene/
n backends: 1
http://freedesktop.org/standards/xesam/1.0/core#fileExtension
http://freedesktop.org/standards/xesam/1.0/core#mimeType
http://freedesktop.org/standards/xesam/1.0/core#name
http://freedesktop.org/standards/xesam/1.0/core#size
http://freedesktop.org/standards/xesam/1.0/core#sourceModified
http://freedesktop.org/standards/xesam/1.0/core#url
http://strigi.sf.net/ontologies/0.9#depth
http://strigi.sf.net/ontologies/0.9#parentUrl

Are those the ONLY attributes? Where is the support for EXIF data and JPEG comments? I know it's in the source code: what would src/streamanalyzer/endplugins/jpegendanalyzer.cpp be doing otherwise? Why am I not getting http://freedesktop.org/standards/xesam/1.0/core#contentComment or http://freedesktop.org/standards/xesam/1.0/core#userComment as a field? (As a sideline, I'm using JPEG comments, not EXIF comments, in my files.) Or any of the other EXIF attributes?

I'd appreciate any help...

--
- Antti Mäkelä - http://www.cs.tut.fi/~zarhan - za...@cs... -
There is a theory which states that if ever anyone discovers exactly what the Universe is for and why it is here, it will instantly disappear and be replaced by something even more bizarre and inexplicable.
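One possible explanation for the 33 hits on an "exact" name (an assumption about the tokenizer, not verified against Strigi's source) is that the file name is split on non-alphanumeric characters before indexing, so a term query for any one of the resulting tokens matches every file sharing that token. A sketch of that tokenization behaviour:

```python
import re

def tokenize(text):
    # Roughly StandardAnalyzer-style behaviour in Lucene-family engines:
    # split on runs of non-alphanumeric characters and lowercase.
    return [t for t in re.split(r"[^0-9a-zA-Z]+", text.lower()) if t]

# Every file named YYMMDD-HHMMSS.jpg shares the token "jpg", and files
# from the same day share the date token, so a query built from the
# tokens of one name can match dozens of files.
print(tokenize("071016-112348.jpg"))  # ['071016', '112348', 'jpg']
```

If that is what happens, an exact-name lookup would need an untokenized (keyword) field rather than the analyzed one.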
From: Stef B. <st...@bo...> - 2008-09-10 06:49:11
Hello,

I've created a construction which adds an "autofs-managed network mountpoint" to the home directory. It looks like:

/home/sbon/Global Network/
  Windows Network/
    Workgroup/
      HostA/ share1/ share2/ share3/
      HostB/ share4/ share5/
  FTP/
    192.168.0.1/
    ftp.kernel.org/

This is completely automatic: the tree is built automatically, and the mountpoint is added automatically when a session starts. autofs allows per-share mounting, which means that a share is only mounted when it is entered. Now, when I have Strigi enabled, it should automatically recognize that it is crossing a filesystem border (local -> autofs first, and then autofs -> cifs). It should have a switch to not cross filesystem borders. Some weeks ago it started indexing the data on the ftp.kernel.org host, which is not a good thing.

Stef Bon
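A crawler can refuse to cross filesystem borders the way `find -xdev` does: record the device number of the starting directory and prune any subdirectory whose `st_dev` differs. A minimal sketch of that check (an illustration of the technique, not Strigi's implementation):

```python
import os

def walk_one_filesystem(root):
    """Yield file paths under root without crossing mount points,
    by comparing each directory's st_dev with the starting point's."""
    root_dev = os.stat(root).st_dev
    for dirpath, dirnames, filenames in os.walk(root):
        # Prune subdirectories that live on a different device
        # (autofs/cifs mounts show up with a different st_dev).
        dirnames[:] = [d for d in dirnames
                       if os.stat(os.path.join(dirpath, d)).st_dev == root_dev]
        for name in filenames:
            yield os.path.join(dirpath, name)

# Demo on a freshly created directory tree:
import tempfile
root = tempfile.mkdtemp()
os.mkdir(os.path.join(root, "sub"))
open(os.path.join(root, "sub", "a.txt"), "w").close()
found = list(walk_one_filesystem(root))
print(found)
```

Note that with autofs this check alone is not enough: calling `os.stat` on an automount trigger directory may itself cause the mount, so a careful crawler additionally consults the mount table before descending.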
From: Joshua M. H. <auh...@gm...> - 2008-08-31 02:52:25
On Fedora Core 9 my strigidaemon was using 1.2 GB of RAM. Is there something wrong with it, or is my home folder too big?
From: sharad s. <sha...@ya...> - 2008-07-25 14:21:23
I have two files (first and second).

The contents of first are:

foo bar left center right

The contents of second are:

foo left right

a) I want to search for foo but not bar:

strigicmd query -t clucene -d db '+foo -bar'

But this returns first (instead of second).

b) I want to search for the phrase "left right":

strigicmd query -t clucene -d db '"left right"'

But this returns both first and second (instead of only second).

Any suggestions?

--
sharad
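The semantics sharad expects can be written out as a toy evaluator (this is a model of the intended '+term -term "phrase"' behaviour, not strigicmd's actual query parser):

```python
def matches(text, required=(), forbidden=(), phrases=()):
    """Return True if the document satisfies all +terms, no -terms,
    and contains every quoted phrase as adjacent words."""
    words = text.split()
    if any(t not in words for t in required):
        return False
    if any(t in words for t in forbidden):
        return False
    for phrase in phrases:
        ptoks = phrase.split()
        n = len(ptoks)
        # Phrase match requires the tokens to appear consecutively.
        if not any(words[i:i + n] == ptoks
                   for i in range(len(words) - n + 1)):
            return False
    return True

first = "foo bar left center right"
second = "foo left right"

# '+foo -bar' should match only second (first contains bar):
print([matches(d, required=["foo"], forbidden=["bar"]) for d in (first, second)])
# [False, True]

# The exact phrase "left right" should also match only second
# (first has "left center right", so the words are not adjacent):
print([matches(d, phrases=["left right"]) for d in (first, second)])
# [False, True]
```

If strigicmd returns first for '+foo -bar', the likely explanations are that the '+'/'-' operators are not being parsed as boolean modifiers, or the shell is mangling the quoting; comparing against this expected behaviour narrows down which.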
From: Jos v. d. O. <jvd...@gm...> - 2008-06-02 06:57:18
2008/5/31 Erwin Mueller <fun...@go...>:
> I have Strigi v0.5.9 on 2.6.25-4.slh.3-sidux-686 #1 SMP PREEMPT Fri May 23
> 21:58:49 UTC 2008 i686 GNU/Linux and KDE 3.5.9.

Hello Erwin,

This weekend I made a new release. In version 0.5.9 we had a regression which caused tar.gz and zip files not to be recursively indexed. It still worked for tar.bz2 files, though. Version 0.5.10 is available from www.vandenoever.info/software/strigi.

Cheers,
Jos
From: Erwin M. <fun...@go...> - 2008-05-31 18:05:58
I just started the daemon with the strigi-applet (in the Konqueror window) and clicked on "indexing". Now I have three "strigidaemon clucene" processes that have been running for over 20 minutes at 100% (all three). The status is:

Documents in queue: 0
Documents indexed: 33480
Index size: 471 MB
Status: indexing
Unique words indexed: 273801

but it's not updating, i.e. it stays at "Documents indexed 33480" and "Index size 471 MB" (no numbers changing). "Stop daemon" and "Stop Indexing" are not working, so I just killed the daemon.

I have Strigi v0.5.9 on 2.6.25-4.slh.3-sidux-686 #1 SMP PREEMPT Fri May 23 21:58:49 UTC 2008 i686 GNU/Linux and KDE 3.5.9. I tried xmlindexer; I have the output in the txt file. My daemon.conf is also attached. If you need any more information I would gladly provide it. I will delete the .strigi/ folder and try to index again, but later; right now I'm hungry. :)

Erwin.
From: Egon W. <ego...@gm...> - 2008-05-30 07:16:10
On Fri, May 30, 2008 at 7:53 AM, Erwin Mueller <fun...@go...> wrote:
> But I wonder, what about gzipped tar files? I have lots of HTML pages that I
> store and compress once a month. The Strigi homepage mentions that
> Strigi can also index archives, but when I search it only finds files that
> are not in the tar.gz archives.

What Strigi version and OS are you using? I just verified it to be working on Ubuntu Hardy, strigi-utils package 0.5.7-2.

Egon

--
http://chem-bla-ics.blogspot.com/
From: Jos v. d. O. <jvd...@gm...> - 2008-05-30 07:12:39
Hi Erwin,

This _should_ work out of the box. Reading from .tar.bz2, zip and tar.gz is core functionality of Strigi. Can you check whether xmlindexer works on your files with 'xmlindexer myfile.tar.gz'? You can also try searching for 'depth:1', which looks for files that are embedded in other files.

Cheers,
Jos

2008/5/30 Erwin Mueller <fun...@go...>:
> Hello, I just installed Strigi with strigi-applet and it works well for
> indexing and searching my directories.
>
> But I wonder, what about gzipped tar files? I have lots of HTML pages that I
> store and compress once a month. The Strigi homepage mentions that
> Strigi can also index archives, but when I search it only finds files that
> are not in the tar.gz archives.
>
> I have lots of gzipped tar files, and that's the only reason to try
> index-and-search software.
>
> Erwin.
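What a recursive archive indexer has to do for the tar.gz case can be sketched with Python's tarfile module (an illustration of the idea, not Strigi's C++ stream-based code, which additionally recurses into archives nested inside archives):

```python
import io
import os
import tarfile
import tempfile

def iter_archive_texts(path):
    """Walk a .tar.gz and yield (member_name, text) for each regular
    file, i.e. the per-member extraction step an indexer performs."""
    with tarfile.open(path, "r:gz") as tar:
        for member in tar.getmembers():
            if member.isfile():
                data = tar.extractfile(member).read()
                yield member.name, data.decode("utf-8", errors="replace")

# Build a sample archive and "index" it.
tmp = tempfile.mkdtemp()
archive = os.path.join(tmp, "pages.tar.gz")
with tarfile.open(archive, "w:gz") as tar:
    payload = b"hello strigi"
    info = tarfile.TarInfo("page1.html")
    info.size = len(payload)
    tar.addfile(info, io.BytesIO(payload))

print(list(iter_archive_texts(archive)))
# [('page1.html', 'hello strigi')]
```

Each yielded member would then be fed to the analyzers like an ordinary file, with its depth inside the archive recorded, which is what makes the 'depth:1' query mentioned above possible.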
From: Erwin M. <fun...@go...> - 2008-05-30 05:53:45
Hello, I just installed Strigi with strigi-applet and it works well for indexing and searching my directories.

But I wonder, what about gzipped tar files? I have lots of HTML pages that I store and compress once a month. The Strigi homepage mentions that Strigi can also index archives, but when I search it only finds files that are not in the tar.gz archives.

I have lots of gzipped tar files, and that's the only reason to try index-and-search software.

Erwin.
From: Egon W. <ego...@gm...> - 2008-04-01 09:26:53
On Tue, Apr 1, 2008 at 11:15 AM, Oka Kurniawan <oka...@gm...> wrote:
> I am using Strigi from Kubuntu 7.10. I have indexed my computer, but
> Strigi cannot find certain PDF files. I have folders with PDF files
> written using LaTeX. Strangely, Strigi can find the .tex file but not
> the PDF.

PDF indexing is not enabled in Kubuntu 7.10. Proper PDF support is indeed important for a desktop search engine, and a GSoC project has been described [1], but no student has applied for it yet, AFAIK.

Kind regards,
Egon

[1] http://techbase.kde.org/index.php?title=Projects/Summer_of_Code/2008/Ideas#Strigi-PDF

--
http://chem-bla-ics.blogspot.com/
From: Oka K. <oka...@gm...> - 2008-04-01 09:15:55
Hi,

I am using Strigi from Kubuntu 7.10. I have indexed my computer, but Strigi cannot find certain PDF files. I have folders with PDF files written using LaTeX. Strangely, Strigi can find the .tex file but not the PDF. Please advise. Thanks :)

Regards,
Oka
From: Monkey D. L. <the...@gm...> - 2008-03-17 00:15:18
Some of the people reading this may know what a drama it was adding this information to Wikipedia, only to have it deleted due to Wikipedia's rigid rules... *g* The good news is that there is a big wiki (from what I saw it has more than 30K articles) which endorses so-called "original research". That means the comparison table is welcome there!

http://www.wikinfo.org/index.php/Comparison_of_desktop_search_software

I have also linked the Wikipedia pages of Beagle, Tracker, Strigi and two other pages relevant to desktop search to this article on Wikinfo. I hope that won't be "regulated" as well. -_-'

I am now asking for your help in replacing the question marks in the tables (in the Wikinfo article) with more useful content. Can you give me an answer regarding those open issues? (To edit an article you have to register; no email is needed, though.) I will then update the article with your replies. Thank you!
From: Monkey D. L. <the...@gm...> - 2008-03-08 16:55:29
I have finally published the first version of the comparison article. You can find it at http://en.wikipedia.org/wiki/Comparision_of_desktop_search_software

**Please** take a look at it and fix any mistakes I may have committed. I would kindly ask you (users and developers) to keep it updated and make any additions you see fit. Another important thing: I have also clarified (at least I hope so) some of the subjects I asked you about where you didn't understand what I meant. I would ask you to reply there (in the form of an edit) if you can. :)

I'll try to make a reference from your software's Wikipedia entry to this comparison article. There is still so much to be added (in particular, supported file types and application data)... but this one has taken me faaaaaaaar mooooooore hours than I had ever expected. It's just unfortunate that only users from two projects have provided me with updated information. =(

Best regards,
Monkey D. Luffy

BTW, if you think it's worth it, add a link from your site's "features or FAQ" section to this entry.
From: John S. <the...@gm...> - 2008-03-06 01:44:46
Hi, I am trying to make an updated comparison table (the documentation or webpages are sometimes outdated) of the features of the following desktop search tools: Beagle, Tracker, Recoll, Strigi and Jindex, which would then be added to Wikipedia. I would ask for your help in telling me which features are implemented in your tool (1) or are foreseen for the future. These are just a couple of yes-or-no questions, so it's brief.

(1) Note that I'm sending this email (using BCC) to all the corresponding tools' mailing lists or developers.

I think having this information would be good for users and developers, since there are already several desktop crawlers available. It would be nice if your website maintainer added this information (maybe in the form of a table) to your Features or FAQ section. This list can also be seen as ideas for possible features to be added. Thank you for your consideration.

PS: I'm aware that the data crawler uses different backends for different file types; in that case, please mention the backend where appropriate. For example: "PDF indexing capability is limited by xpdf. It does not recognize words with hyphens."

01) Regular expressions (e.g.: com*on st?ff [A-F] (this | that))
02) Boolean operators (+and -not)
03) Searching non-alphanumeric characters, maybe through the use of a backslash (e.g. := + ? { ] &)
04) Exact sentences using double quotes (support for line breaks? hyphenation? text in columns?)
05) tex, pdf and ps (index sentences correctly even when the text is organized in columns or uses hyphens; this is common in scientific articles in PDF format)
06) Different encodings and languages (ASCII, UTF-8, Japanese, etc.)
07) Index archive files (tar, bz2, rar, 7zp, etc.) recursively
08) Index simultaneously with and without stemming (for example, flooring, floors and floored would all be reduced to floor)
09) Use of tags to better organize data (allows the user to have collections)
10) Restrict a search to specific directories or tags
11) Provide thumbnails for images and video (allow specifying the number of thumbnails for a video and the time interval between thumbs)
12) Image and video content search (something like imgSeek... maybe better, or maybe it could be used as a backend)
13) Index removable media (making it possible to index and organize data on DVDs or external hard drives)
14) Databases supported
15) Allow having different database catalogs (useful for searching collections on external devices)
16) Checksums (allows finding duplicate files)
17) Other aspects worthy of mention