[Ikvm-developers] IKVM / Tika - DLL vs Executable
From: Trevor W. <tw...@da...> - 2011-06-28 19:01:49
Howdy folks,

I've recently found a Java library called "Tika" which extracts metadata and content from various file types. A great little library. I used ikvmc.exe to convert it from the jar file to a DLL usable in C#, and after some struggles with which libraries and which types to use, got a test executable up and running. It worked great! It extracted metadata just like I needed it to.

I then copied the functions from the executable and put them into a DLL that will be used as a plug-in library for one of the projects I'm currently working on. However (there's always a however, isn't there?), Tika instantly started reporting only 3 items (the default properties of file length, file name and MIME type) instead of the full metadata that the executable returns.

I'm not sure if this is an IKVM thing or a Tika thing, so I am posting it here as well as on the Tika mailing lists. Any assistance would be greatly appreciated.

I'm using C# in Visual Studio 2005 with IKVM 0.46.0.1 and Tika v0.9.

The code for the extraction is:

    private void button1_Click(object sender, EventArgs e)
    {
        AutoDetectParser parser = new AutoDetectParser();
        Metadata metadata = new Metadata();
        ParseContext parserContext = new ParseContext();

        // Register the parser in the context so it can be looked up while parsing.
        java.lang.Class parserClass = typeof(AutoDetectParser);
        parserContext.set(parserClass, parser);

        java.io.File file = new java.io.File(@"E:\temp\Docs for Demo SM\E-documents\011NewCaseDONE.doc");
        java.net.URL url = file.toURI().toURL();

        using (java.io.InputStream inputStream = MetadataHelper.getInputStream(url, metadata))
        {
            parser.parse(inputStream, getTransformerHandler(), metadata, parserContext);
            inputStream.close();
        }

        // Walk the extracted metadata names (body left empty in this test).
        foreach (string name in metadata.names())
        {
        }
    }

    private TransformerHandler getTransformerHandler()
    {
        SAXTransformerFactory factory = TransformerFactory.newInstance() as SAXTransformerFactory;
        TransformerHandler handler = factory.newTransformerHandler();
        handler.getTransformer().setOutputProperty(OutputKeys.METHOD, "text");
        handler.getTransformer().setOutputProperty(OutputKeys.INDENT, "yes");

        // _outputWriter is a field declared elsewhere in the class (declaration not shown).
        _outputWriter = new StringWriter();
        handler.setResult(new StreamResult(_outputWriter));
        return handler;
    }

and here are the using lines to make it work if you create a project:

    using System;
    using System.Windows.Forms;
    using java.io;
    using java.lang;
    using javax.xml.transform;
    using javax.xml.transform.sax;
    using javax.xml.transform.stream;
    using org.apache.tika.io;
    using org.apache.tika.metadata;
    using org.apache.tika.parser;

------------------------------------------------------------

The DLL takes the code from button1_Click and moves it into a simple function that currently just tries to read the file (no return values or anything at this time).

When run from the executable, I get the following information (for an MP3 file):

    xmpDM:releaseDate=2009
    Content-Length=4136960
    xmpDM:audioChannelType=Stereo
    xmpDM:album=
    Author=The B52's
    xmpDM:artist=The B52's
    channels=2
    xmpDM:audioSampleRate=44100
    xmpDM:logComment=
    xmpDM:trackNumber=1
    version=MPEG 3 Layer III Version 1
    xmpDM:composer=null
    xmpDM:audioCompressor=MP3
    title=Rock Lobster
    samplerate=44100
    xmpDM:genre=Blues
    Content-Type=audio/mpeg
    resourceName=The B52's - Rock Lobster.mp3

When run from the DLL, I get the following information:

    Content-Length    4136960
    Content-Type      audio/mpeg
    resourceName      The B52's - Rock Lobster.mp3

Thanks in advance,
Trevor Watson