From: <ant...@gm...> - 2007-10-23 23:09:57
Keith R. Bennett wrote:
> Hello.
>
> I'm Keith Bennett, a member of the Tika team
> (http://incubator.apache.org/projects/tika.html), and am interested in
> learning more about Aperture. Aperture's scope is much broader than
> Tika's, including far more than document parsing.
>
> I'd like to see how Aperture would do the same task as Tika, even
> though that would be using just part of its functionality.
>
> To do this, I looked at the sample class at
> http://aperture.sourceforge.net/tutorial/extractors.html, but could
> not get it to compile successfully. Could the example be updated (for
> example, to include the correct import statements) so that it is in a
> compilable state? (Or am I doing something wrong?) I am getting
> two errors:
>
> * DATA cannot be found.
> * new RepositoryModel(false); fails because there is apparently no
>   (boolean) constructor.

I've updated the wiki page example and added it to SVN. It now compiles. See [1].

> Or is there a better way to test Aperture's document parsing
> functionality? I want to be able to pass it a URL or InputStream and
> get the parse results.

You may also try out the fileinspector example application. The easiest way to do so is to:

1. check out the latest trunk
2. type 'ant testbuild'
3. go to the bin folder
4. type 'fileinspector'

Then you'll be able to see the output Aperture produces for various files. The full text extracted from a file is stored in the RDFContainer as the value of the nie:plainTextContent property.

Antoni Myłka
ant...@gm...

[1] <http://aperture.svn.sourceforge.net/viewvc/aperture/trunk/aperture/src/examples/org/semanticdesktop/aperture/examples/tutorials/ExtractorExample.java?view=markup>
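
To illustrate where the extracted text ends up, here is a rough sketch that pulls the nie:plainTextContent value out of an RDF dump. It deliberately does not use the Aperture API. It assumes the RDF has been serialized as N-Triples and that the NIE namespace is http://www.semanticdesktop.org/ontologies/2007/01/19/nie#. The class name PlainTextContentFinder, the method findPlainText and the sample data are hypothetical, made up for this illustration.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class PlainTextContentFinder {
    // Assumed full URI of nie:plainTextContent (NIE namespace + local name).
    static final String NIE_PLAIN_TEXT =
        "http://www.semanticdesktop.org/ontologies/2007/01/19/nie#plainTextContent";

    // One N-Triples statement with a literal object: <subject> <predicate> "literal" .
    private static final Pattern TRIPLE = Pattern.compile(
        "^<([^>]*)>\\s+<([^>]*)>\\s+\"((?:[^\"\\\\]|\\\\.)*)\".*\\.\\s*$");

    // Returns the first nie:plainTextContent literal found, if any.
    static Optional<String> findPlainText(String nTriples) {
        for (String line : nTriples.split("\\R")) {
            Matcher m = TRIPLE.matcher(line.trim());
            if (m.matches() && NIE_PLAIN_TEXT.equals(m.group(2))) {
                return Optional.of(unescape(m.group(3)));
            }
        }
        return Optional.empty();
    }

    // Resolves the common backslash escapes; \\uXXXX escapes are not handled here.
    private static String unescape(String s) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c == '\\' && i + 1 < s.length()) {
                char next = s.charAt(++i);
                switch (next) {
                    case 'n': out.append('\n'); break;
                    case 't': out.append('\t'); break;
                    case 'r': out.append('\r'); break;
                    default:  out.append(next); // covers \" and \\
                }
            } else {
                out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Hypothetical sample data, not real Aperture output.
        String sample =
            "<file:/tmp/doc.txt> <http://example.org/title> \"A title\" .\n"
          + "<file:/tmp/doc.txt> <" + NIE_PLAIN_TEXT + "> \"Hello\\nAperture\" .\n";
        System.out.println(findPlainText(sample).orElse("(no plain text found)"));
    }
}
```

With the real library you would normally ask the RDFContainer for the property value directly; the updated ExtractorExample at [1] is the authoritative reference for that.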