eXtensible Text Framework (XTF)
The eXtensible Text Framework (XTF) is a flexible indexing, display, and query tool that supports searching across collections of heterogeneous data and presents results in a highly configurable manner. XTF is an open-source project of the eScholarship Publishing Group of the California Digital Library, and is deployed in academic settings worldwide. Highlights of the XTF system are described in an online brochure (PDF).
Downloads and Documentation
- Download XTF: XTF can be downloaded from the SourceForge.net site.
- Documentation:
- Change log: List of the bug fixes and new features for each release.
- Deployment guide: Installation and basic configuration.
- Programming guide: In-depth technical overview of XTF and how to program it.
- Tag reference: Complete reference for all XTF elements and attributes.
- Tips & Tricks: Hints to help you get results quickly with XTF.
- Under the hood: Details about the actual operation of XTF and how it performs tasks.
- Experimental features: Information on features being developed for XTF.
- Resources: Presentations, papers, and tutorials on XTF.
Technical Overview
The system is divided into four components:
- crossQuery: The front-end to the collection search system.
- dynaXML: Interface to individual documents.
- Text Engine: Used by crossQuery and dynaXML to perform text searches.
- Indexer: Full-text indexer based on Lucene.
The following diagrams, though somewhat outdated, give a general overview of how documents are indexed, stored, queried, retrieved, and displayed using XTF.
- System architecture diagram: A general illustration showing the roles the XTF components play in the user experience. (GIF)
- Collection searching diagram: A more detailed view of the collection searching process, covering query parsing and results formatting. (GIF)
- Individual object display diagram: A more detailed view of the object display and internal search mechanisms, covering request parsing, authentication, and document formatting. (GIF)
- Text indexing diagram: An illustration of the workflow for the creation of collection indexes. (GIF)
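The division of labor among the four components can be illustrated with a toy sketch in Python. This is not XTF code: the class and function names (`ToyIndexer`, `ToyTextEngine`, `cross_query`, `dyna_view`) are hypothetical, the in-memory inverted index merely stands in for the Lucene-based index, and the real system's query parsing, authentication, and stylesheet-driven formatting are left out entirely.

```python
import re
from collections import defaultdict

# Toy stand-ins for the four roles described above. None of these names
# come from XTF itself; they are assumptions made for illustration only.


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"\w+", text.lower())


class ToyIndexer:
    """Indexer role: builds an inverted index (term -> set of document ids)."""

    def __init__(self):
        self.postings = defaultdict(set)
        self.documents = {}

    def add(self, doc_id, text):
        self.documents[doc_id] = text
        for term in tokenize(text):
            self.postings[term].add(doc_id)


class ToyTextEngine:
    """Text Engine role: answers term queries for both front ends."""

    def __init__(self, indexer):
        self.indexer = indexer

    def search(self, query):
        """Return the ids of documents that contain every query term."""
        terms = tokenize(query)
        if not terms:
            return set()
        hits = set(self.indexer.postings.get(terms[0], set()))
        for term in terms[1:]:
            hits &= self.indexer.postings.get(term, set())
        return hits


def cross_query(engine, query):
    """crossQuery role: search the whole collection and list matching ids."""
    return sorted(engine.search(query))


def dyna_view(engine, doc_id, query):
    """dynaXML role: display one document with query terms marked in brackets."""
    terms = set(tokenize(query))
    words = engine.indexer.documents[doc_id].split()
    return " ".join(
        f"[{word}]" if any(t in terms for t in tokenize(word)) else word
        for word in words
    )


indexer = ToyIndexer()
indexer.add("doc1", "Mark Twain wrote letters")
indexer.add("doc2", "Letters from Chicago")
engine = ToyTextEngine(indexer)

collection_hits = cross_query(engine, "letters")    # ['doc1', 'doc2']
document_view = dyna_view(engine, "doc2", "chicago")  # 'Letters from [Chicago]'
```

The point of the sketch is the dependency direction: both front-end roles call the same search layer, which in turn reads only what the indexer has built.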
Who uses XTF?
The CDL uses XTF as a building block for new services and has used it to replace a number of systems previously used for text searching (i.e., DLXS, Greenstone, DynaWeb). As of 2008, CDL has deployed XTF in the following ways:
- OAC texts and eScholarship Editions search and display (January 2005).
- OAC finding aids search and display (January 2005).
- OAC images search and display (September 2005).
- Calisphere search and display (2006).
- Mark Twain Project Online search and display (2007).
XTF is also extensively used outside the CDL:
- The Encyclopedia of Chicago, a collaboration between the Chicago Historical Society, Northwestern University, and Newberry Library, was the first non-CDL project to deploy XTF in production.
- Indiana University Board of Trustees Minutes.
- Visual Arkiv (Swedish). Primary developer: Jakob Saternus.
- Biblioteca Italiana (Italian). Primary developer: Fabio Ciotti.
- Coleção de Direito Regulatório das Telecomunicações (Brazilian). Primary developer: Joao Lima.
- LexML Portal (Brazilian). Try searching for "idoso". Primary developer: Joao Lima.
- The Chymistry of Isaac Newton. Indiana University. Primary developer: Tamara Lopez. Here's an article about it.
- The Swinburne Project. Indiana University. Primary developer: John Walsh.
- EECS Tech Reports. University of California at Berkeley. Primary developer: Giulia Hill.
- De Humani Corporis Fabrica. Northwestern University.
- Richard B. Russell Library for Political Research and Studies. University of Georgia Libraries.
- A Companion to Digital Humanities (digital book). Ed. Susan Schreibman, Ray Siemens, John Unsworth. Oxford: Blackwell, 2004; and A Companion to Digital Literary Studies (digital book). Ed. Ray Siemens and Susan Schreibman. Oxford: Blackwell, 2008.
- Frontiers of Science. University of Sydney Library. The full digitised collection of comic strips from 1961-1982, presented using XTF and other technologies such as DSpace, Thickbox, Zoomify, and jQuery. Primary developer: Gary Browne.
Other institutions exploring XTF include: University of Sydney; OhioLink; University of Texas at Austin; University of Virginia; University of Denver; and University of Kansas Digital Initiatives.
Support
Implementers
While CDL does not directly support XTF implementers, we do make a good-faith effort to address the needs of the XTF community through the following resources on SourceForge:
- The xtf-user email list is for those trying to set up and use XTF. It is monitored by the principal developers of the application.
- The Bug Tracker is the place to submit bug reports.
- The Support Request Tracker and Feature Request Tracker are alternative ways to bring documentation errors and potential new features, respectively, to our attention. However, the xtf-user list is probably a better place to start.
Developers
SourceForge resources for XTF developers and others who are interested in contributing to the architecture:
- The xtf-devel email list is where developers share their ideas. It also logs all CVS commits.
- The Patch Tracker allows developers to submit XTF patches for our approval.
- Access to the CVS repository is also available.