From: Joe W. <jo...@gm...> - 2011-09-08 12:47:23
Hi all,

One thing worth highlighting about Peter's work is how he has used eXist-db as a hub for his research data: he is importing data from SQL databases (MediaWiki), RDF data from Zotero, and MS Access databases. By importing this data into eXist-db, he is able to enrich what he can do with his TEI data. And he is able to "export" his combined data into useful forms: HTML for web views, PDF for dissertation publishing, and NodeXL for data visualization.

eXist's ability to import and house data from a variety of sources is really remarkable. You can ignore this ability and use it for querying just your TEI, of course, but there are many hooks for integrating data from lots of other sources. A lot of this flexibility is due to eXist's openness to incorporating libraries from other open source projects that facilitate import and export:

- eXist can generate charts - using the built-in JFreeChart library. See http://atomic.exist-db.org/blogs/dizzzz/JFreeChart
- eXist can generate PDFs - using the built-in Apache FOP framework (commercial XSL-FO libraries can be supplied instead of FOP)
- eXist can scale images and get metadata about them - see http://demo.exist-db.org/exist/functions/image
- eXist can make SQL queries and can connect to FTP and SVN servers
- eXist is adding the Apache Tika framework for importing PDF and Microsoft Office files (and the formats listed at http://tika.apache.org/0.9/formats.html) -- making all of these formats full-text searchable and queryable with XQuery

Of course, with the growth in RESTful API services such as Google Charts, Captcha, and OpenCalais, not all features need to come out of the box in eXist -- you can just use eXist's HTTP client to send data off to the API and get your results back.

A word on Jens's mention of eXist version 1.5: The current release version of eXist is 1.4.1, but eXist is adding new features to its "bleeding edge" development version, 1.5.
Such features include the ability to install "app"-like packages such as Jens's Tamboti project, the Apache Tika module, more sophisticated Lucene indexers, etc. This new version isn't available as a convenient installer yet, but it can be downloaded and used -- if you're willing to tinker with Subversion. If you haven't heard of Subversion, then it's probably worth waiting for an official release version, but you're welcome to dive in. See http://exist.sourceforge.net/building.html#svn.

I, for one, am eager for the release of the current development version, because I have already adapted the Punch demo from the Oxford Summer School eXist workshop into an easily installable "app" package. It'll make the installation step a one-click process, instead of the manual process it is now. (For more info on eXist's package and repository format, see http://demo.exist-db.org/repo/repo.xml.)

So there's exciting stuff coming down the pike! Again, you can just stick with querying your TEI -- but it's nice to know that when you need to reach out to integrate other data sources or generate new forms of output, lots of solutions are there for you to make use of.

Cheers,
Joe