User Activity

  • Posted a comment on discussion General Discussion on DocFetcher

    Quang, are you the only person working on the project? Has anyone working on the project looked into www.javaxt.com, or github.com/drewnoakes/metadata-extractor (or any other JPEG metadata extractor--there are at least half a dozen packages in openSUSE)? I suspect that the metadata specifications could be presented to the user as a form with either a drop-down of reasonable values, or some way to indicate a range (e.g. for GPS data you might want to keep it simple and give either the option of specifying...

  • Posted a comment on discussion General Discussion on DocFetcher

    Is there a location where I can find the Javadoc, and/or some description of the general design (UML2 models would be helpful to me)?

  • Posted a comment on discussion General Discussion on DocFetcher

    Has anyone done a comparison (features, performance, etc.) between DocFetcher and Xapian/Recoll (or other Xapian-based index/retrieval tools)?

  • Posted a comment on discussion General Discussion on DocFetcher

    Have you given any thought to using Drools, NLP, and/or some neural network library to provide more sophisticated sorting and indexing? Drools would give you the ability to write rules that are pretty much arbitrarily complex, separate from source code. There is a reasonably robust infrastructure for editing rules, checking for conflicts, etc. I don't know what you use internally for your indexing, but there are some Apache projects (e.g. UIMA, and the medical document NLP work done based upon it),...

  • Posted a comment on discussion General Discussion on DocFetcher

    Any update to this issue? There are some well-known regular expressions for digging into mbox and maildir email messages (O'Reilly's Mastering Regular Expressions uses these as examples, or at least did in the 2nd edition). This would let you pull out the headers, metadata, and email text, and you should be able to link each attachment to its message. Linking conversations, sifting through listserv traffic, etc. might take a bit more.
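
    The regex approach suggested in the last comment can be sketched roughly as follows. This is only an illustration in Python, not DocFetcher code and not the expressions from the book: the function names `split_messages` and `parse_message` and both patterns are assumptions made for this example. It assumes mbox messages are separated by a line starting with "From " and that any body line starting with "From " has already been escaped (as ">From "), which not every mbox variant guarantees.

```python
import re

# Assumption: each message begins with a "From " separator line.
# A "From:" header does not match because a space must follow "From".
MBOX_SEPARATOR = re.compile(r"^From .*$", re.MULTILINE)

# A header is "Name: value"; folded continuation lines start with
# a space or tab. Field names are printable ASCII without a colon.
HEADER_LINE = re.compile(
    r"^([!-9;-~]+):[ \t]*(.*(?:\n[ \t]+.*)*)", re.MULTILINE
)


def split_messages(mbox_text):
    """Split raw mbox text into one string per message."""
    parts = MBOX_SEPARATOR.split(mbox_text)
    return [p.lstrip("\n") for p in parts if p.strip()]


def parse_message(raw):
    """Return (headers, body); header names are lower-cased."""
    # Headers end at the first blank line.
    head, _, body = raw.partition("\n\n")
    headers = {}
    for match in HEADER_LINE.finditer(head):
        name = match.group(1).lower()
        # Unfold continuation lines into a single line.
        value = re.sub(r"\n[ \t]+", " ", match.group(2)).strip()
        # Keep the first occurrence of a repeated header.
        headers.setdefault(name, value)
    return headers, body.strip()
```

    For anything beyond a quick experiment, a dedicated mail parser (for example Python's standard `mailbox` and `email` modules) is likely to be more robust than hand-written patterns, particularly for MIME attachments and encoded headers.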


Personal Data

Username: emresponse
Joined: 2005-08-16 19:13:57

Projects

This is a list of open source software projects that Kevin Coonan, MD is associated with:
