California Digital Library Announces Self-Guided Tutorial for the eXtensible Text Framework (XTF)

March 17, 2009

By Lisa Schiff, eScholarship Publishing Program Technical Lead

The California Digital Library (CDL) is pleased to announce the availability of an extensive self-guided tutorial for its eXtensible Text Framework (XTF) application. XTF is an open source, highly customizable piece of software supporting the search, browse, and display of heterogeneous digital content and offering efficient and practical methods for creating customized end-user interfaces for distinct digital collections. The tutorial provides guidance for implementing and customizing XTF, from core functionality to overall look and feel. Downloads for the Mac and Windows operating systems are available from the XTF Project page on SourceForge, along with the complete distribution and documentation.

The tutorial comes with a complete XTF package that is ready to run when uncompressed; no other installation is required. It contains nine modules spanning the most powerful and popular features, including how to:

Add new content
Change metadata
Change logo and colors
Increase significance of titles in ranking hits
Customize and enable default status of advanced search
Change fields displayed in search results
Enable structural searching
Create a hierarchical facet
Change footnote behavior

XTF Background and Overview
Since first developing and deploying this indexing and display technology in 2005, the CDL has worked to build and maintain XTF as a highly customizable application built upon tested components already in use by the digital library and search communities – in particular the Lucene text search engine, Java, XML, and XSLT. By coordinating these pieces in a single platform that can be used to create multiple unique applications, the CDL has succeeded in dramatically reducing the investment in infrastructure, staff training, and development for new digital content projects.

XTF offers the following core features out of the box:

Easy to deploy: Drops directly into a Java application server such as Tomcat or Resin; has been tested on Solaris, Mac, Linux, and Windows operating systems
Easy to configure: Can create indexes on any XML element or attribute; entire presentation layer is customizable via XSLT
Robust: Optimized to perform well on large documents (e.g., documents exceeding 10 MB of encoded text); scales to perform well on collections of millions of documents; provides full Unicode support
Extensible:
- Works well with a variety of authentication systems (e.g., IP address lists, LDAP, Shibboleth)
- Provides an interface for external data lookups to support thesaurus-based term expansion, recommender systems, etc.
- Can power other digital library services (e.g., XTF contains an OAI-PMH data provider that allows others to harvest metadata, and an SRU interface that exposes searches to federated search engines)
- Can be deployed as separate, modular pieces of a third-party system
Powerful for the end user:
- Spell checking of queries
- Faceted displays for browsing
- Dynamically updated browse lists
- Session-based bookbags
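
As a rough illustration of the OAI-PMH harvesting mentioned above, the following minimal Python sketch builds a standard ListRecords request URL. The verb and metadataPrefix parameters come from the OAI-PMH 2.0 protocol itself; the base URL and the function name are hypothetical placeholders, since the actual endpoint path depends on how a given XTF installation is configured.

```python
from urllib.parse import urlencode

# Hypothetical endpoint: a real XTF deployment defines its own OAI base URL.
OAI_BASE_URL = "http://example.org/xtf/oai"


def build_list_records_url(base_url, metadata_prefix="oai_dc", from_date=None):
    """Build an OAI-PMH ListRecords request URL (illustrative helper only)."""
    # "verb" and "metadataPrefix" are required by the OAI-PMH 2.0 protocol;
    # "oai_dc" (unqualified Dublin Core) must be supported by every repository.
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if from_date is not None:
        # Optional selective harvesting: only records changed since this date.
        params["from"] = from_date
    return base_url + "?" + urlencode(params)


if __name__ == "__main__":
    print(build_list_records_url(OAI_BASE_URL, from_date="2009-01-01"))
```

A harvester would fetch this URL and parse the XML response; that step is omitted here because it depends on the repository being harvested.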

A sampling of XTF-based applications includes:

Mark Twain Project Online (http://www.marktwainproject.org), developed by the Mark Twain Papers Project, the CDL and the University of California Press.
Calisphere (http://www.calisphere.org/), developed by the CDL.
The Encyclopedia of Chicago (http://www.encyclopedia.chicagohistory.org/)
The Chymistry of Isaac Newton (http://webapp1.dlib.indiana.edu/newton/) and The Swinburne Project (http://webapp1.dlib.indiana.edu/swinburne/www/swinburne/)
Finding Aids at the New York Public Library (http://labs.nypl.org/2007/10/30/extensible-text-framework-xtf/)
EECS Technical Reports (http://sunsite2.berkeley.edu:8088/xtf/servlet/org.cdlib.xtf.crossQuery.CrossQuery?rmode=btr)

For more information, visit http://xtf.cdlib.org/ .