Name Modified Size InfoDownloads / Week
Parent folder
README_v3.1.1.txt 2010-11-30 22.1 kB
CHANGES_v3.1.1.txt 2010-09-09 22.1 kB
joai_v3.1.1.zip 2010-09-09 13.8 MB
Totals: 3 Items   13.8 MB 0
Release notes and version documentation for the 
jOAI software version 3.1.1

jOAI is a Java-based Open Archives Initiative Protocol for Metadata Harvesting (OAI-PMH)
data provider and harvester tool that runs in Apache Tomcat. 

jOAI is developed by Digital Learning Sciences (DLS) and
National Science Digital Library - Technical Network Services (NSDL-TNS),
University Corporation for Atmospheric Research (UCAR)
with support by the National Science Foundation (NSF).
http://dlsciences.org/  http://www.nsdl.org/  http://www.ucar.edu/  http://nsf.gov/
-----------------------------------------------------------------------------------

v3.1.1 (Released to SourceForge September 9, 2010)

-Fixed bug in sort order for the 'Harvest History and Progress' display in the UI 
(bug was introduced in v3.1.0). Display is now properly sorted showing the newest
harvest first.

-Removed the maxlength limit in the form fields for entering the 
setSpec and setName of an OAI set in the administrative UI 
(there is no reason to limit these).


v3.1.0 (Released to SourceForge August 30, 2010)

-Upgraded to Lucene 3.0.2. No functional changes were made to jOAI as a result,
but the underlying repository code was modified significantly to use the 
new Lucene classes.

-Entity references in the <identifier> element are now propely escaped in the XML responses for 
GetRecord, ListRecords, ListIdentifiers (matters only if entities such as '&' exist in your
identifiers).

-Added documentation in the 'Data Provider Documentation' page under 'Preparing files for serving' 
about escaping reserved characters in file names for the OAI records on disk


v3.0.19

-Added more info to the install instructions


v3.0.18 (Released to SourceForge May 25, 2010)

-badVerb error code is now returned instead of badArgument if no verb argument exists in 
the request (conforms to the protocol).

-Added additional protocol validation checks if arguments are repeated (badArgument) or 
if identifier does not exist for ListMetadataFormats (idDoesNotExist)

-Verified that jOAI passes all validation checks at the OAI Data Provider registration site 
(http://www.openarchives.org/Register/ValidateSite)

-Changed 'Provider' to 'Data Provider' throughout the UI and documentation

-Updated documentation


v3.0.17 (Released to SourceForge May 8, 2010)

-Fixed incompatibility in ODL search when slashes (/) appear in the search query. Updated
ODL specification to require that all slashes (/) in [search query] and [set] be escaped 
with %2F. This enables XPath search fields to be used in ODL search queries.

-Created a jOAI User Forum at SourceForge. Updated links and documentation to
point users to the forum for support:
http://sourceforge.net/projects/dlsciences/forums/forum/1138932

-Provided additional details about available search fields and query syntax in the
documentation... points to DDS Search documentation:
http://www.dlese.org/dds/services/ddsws1-1/service_specification.jsp#availableSearchFields

-Added ncs_collect to dlese_collect format converter

-Clarified build instructions


v3.0.16 (Released to SourceForge February 11, 2010)

-Fixed OAI protocol compliance for badArgument error condition. OAI response no longer displays
any attributes on the <request> element, as required for badArgument or badVerb error conditions.

-Fixed issue where files that have a byte-order mark (BOM) character (\uFEFF) at the beginning could not be indexed. 
Fixes problem for users of the Windows Notepad editor, which writes a BOM in files that are saved as UTF-8 encoding. 

-Added a config option to import and insert a page at the bottom of each HTML page in the interface.
Used for inserting tracking code such as Google Analytics or other snippets at the bottom of each page.

-Improved css styles for buttons in the interface.
			

v3.0.15 (Released to SourceForge December 2, 2009)

-Converted all the action links (Add, Edit, Delete, etc.) in the software settings pages from hrefs to form buttons.
This protects them from inadvertently being activated by a web crawler. Changes were made
in the Metadata Files Configuration, Sets Configuration, and Harvester Setup and Status 
pages.

-Fixed issue with empty collection ID value that caused indexing to fail for null value

-Added alternate nsdl_dc to oai_dc transform that is compatible with nsdl_dc v1.00
(default configuration is to use nsdl_dc to oai_dc transform for v1.02)

-Added more robust badArgument checking of OAI requests. Returns badArgument if a requested
argument is not valid in OAI-PMH.

-Updated install and system requirements, FAQ and other documentation for the software

-Default Java XSL transform engine is now used, to be compatible when running in Tomcat with Fedora 
and other XSL processors running in the same JVM that require the default engine.
Removed the following explicit System.setProperty call: 
System.setProperty("javax.xml.transform.TransformerFactory", "com.sun.org.apache.xalan.internal.xsltc.trax.TransformerFactoryImpl");

-Updated dlese_anno to nsdl_dc transform to be compatible with the above change (fixed typo)

-Placed the contentType="text/xml; charset=UTF-8" page directive at the top of each OAI 
request JSP page to ensure proper contentType is suppled in all, not just ListRecords.

-Tested and added build support for Tomcat 6, Java 6

-To do: Add ability to configre which days of the week, not just time of day, to run the indexing cron for 
the RepositoryManager data sources - **need to add to jOAI config and test!**


v3.0.14 (Released to SourceForge February 2, 2009)

-Fixed issue in ListRecords response where UTF-8 encoding was occasionally not applied
properly in the response.

-Fixed issue in the public search page related to the above where UTF-8 characters
would occasionally not display properly.

-Fixed issue in the admin search page to ensure UTF-8 characters are properly displayed

-Added documentation on how to configure Tomcat to use UTF-8 encoding when reading URIs 
(e.g. forms in Web pages, http parameters, etc.).

-Added the harvest, validate, and transform scripts to build and deploy with the project,
and provided instructions on how to install them in the jOAI documentation.


v3.0.13 (Released to SourceForge December 17, 2008)

-Fixed an issue found on Windows systems whereby the file processor was 
improperly updating the OAI datestamps for all records each time the indexer 
ran regardless of whether the file had changed, which made it impossible 
to effectively perform incremental harvests.

Changes inherited from repository code base:

-Added default search fields for all xml formats derived from the xPath to each 
element and attribute in the XML instance document. These are encoded as text, 
stemmed text and keywords for each field. Can be used in admin search (not suppored in ODL search yet).
	Example search fields for oai_dc records:
		/text//dc/title:(my search) - Search for terms 'my search' in the title field
		/stems//dc/title:(my search) - Search for terms 'my search' in the title field using word stemming
		/key//dc/title:"my search" - Search for terms 'my search' in the title field as an exact keyword (whitespace and other non-word characters included in search)

-Configured indexer to use SnowballAnalyzer for stems fields, which provides improvements over PorterStemmerAnalyzer

-Updated to Lucene v2.4.0


v3.0.12 (Released to SourceForge June 20, 2008) 

-Added configurations for dif and ncs_collect XML formats 

-Added xsl styelsheets for XML format conversion
--dif to oai_dc
--dif to fgdc
--ncs_collect to nsdl_dc

-Updated the indexer for Lucene v2.2.0. After updating to this 
version of jOAI, existing indexes will need to be deleted and rebuilt to accomodate 
the new Lucene format for Date fields. Do the following after re-starting Tomcat:
From the 'Metadata Files Configuration' page, choose 'Show advanced index operations'
then click the 'Reset the index' button. This upgrade will reset the OAI date stamps
and remove the history of deleted records from the data provider. The history
of harvests will also be lost.

-Added nsdl_dc transforms for dlese_anno, dlese_collect and news_opps formats

-Added link to WHOIS lookup via NetworkSolutions in the reports area

-Encoding for resumption tokens now uses just the reserved
characters listed in the OAI 2.0 spec instead of using java.net.URLEncoder

-Fixed display of number of deleted and error records in a given 
set/collection on Win file systems. Was problem in query parser with colon in file path


v3.0.12-rc1

-Added documentation in the FAQ for configuring Tomcat for use with
the mod_proxy Apache module

-Added the ability to configure a fixed server URL (scheme, hostname and port,
e.g. http://www.example.org) used to display the baseUrl in OAI 
responses and elsewhere in the application

-Created better error reporting in the harvest when harvests fail due to write permissions
when creating file or zip archive directories


v3.0.11 (Released June 21, 2007)

-Added more documentation in the FAQ

-Updated copyright notice for Digital Learning Sciences (DLS)

-OAI project copied from the dlese SourceForge area to its new home in the
dlsciences SourceForge area and the CVS module has been renamed from 
oai-project to joai-project.



v3.0.10 (Released March 20, 2007)

-Files are now explicitly read into the index using UTF-8 encoding instead of the 
native system encoding type (rt ticket 6937). LC_CTYPE no longer needs to be set to UTF-8 in unix
environments and encoding should now work properly on Windows. Index Writers expect
that the files reside in UTF-8 encoding.

-Select lists in the UI are now disabled only for IE 6 and lower. IE 7 
properly supports z-index layers with select lists.

-Added more documentation in the FAQ


v3.0.9 (Released Feb 23, 2007)

-Added nsdl_dc to oai_dc xsl crosswalk

-Fixed issue where corrupt XML data was being seen in the output from XSL 
format converters, which can propagate as data corruption in the OAI 
ListRecords and GetRecord responses. Fixed by applying synchronization 
in the XMLConversionService to address this concurrency-related issue.


v3.0.8 (Released Jan 19, 2007)

-Fixed bug in the data provider where sometimes empty content was returned in 
ListRecords or GetRecord responses. Crosswalk transformer no longer writes the cached 
file if the transformed content is empty. This ensures that the transform will be 
attemped again the next time the given record/format is requested by a harvester.

-Made the new adn to nsdl_dc crosswalk with ASN identifiers the default configuration


v3.0.7

-Improved performace of the Harvester. Fixed bug that caused the harvester
to slow increasingly as the number of harvested records grew

-Added more documentation in the FAQ, ODL search client (odl_search_client.jsp),
and elsewhere

-Included a version of the adn to nsdl_dc XSL crosswalk that inserts ASN
identifiers for content standards

-Added ability to configure in web.xml how long the harvester waits for a response
before timing out

-Distribution now built using Java 1.5 (specifically Sun jdk1.5.0_09), which has seen 
improved stability over previous versions of Java


v3.0.6 (Released Oct 3, 2006)

-Links in the UI to external pages now open in the same window


v3.0.5 Changes

-Summary: This release provides updates primarily only to the 
documentation pages. No other significant changes have been made
since version 3.0.4

-The ODL 'Search' page now displays parse errors from malformed ODL
service responses as HTML comments rather than to the user

-Added additional FAQ and configuration documentation

-Fixed the link in the 'Indexing Errors' page that brings up the file
for an ID used in another file


v3.0.4 Changes (Released Sept 12, 2006)

-Added display of the harvest from date and duration of the harvest in 
the harvest history

-File directories are now sorted alphabetically in the edit sets 
definitions page

-Added display-name to web.xml (used for display in the Tomcat manager)

-Fixed 'No write permissions' error that occured when harvesting all 
records for the first time

-Fixed problem that made it impossible to delete a configured harvest 


v3.0.3 Changes

-Implemented deletions in the data provider, used to support
incremental or selective harvesting

-Automatic harvests now check if the data provider supports deletions.
If deletions are supported, an incremental harvest is performed
by requesting and synchronizing only those records that have been added,
modified or deleted since the previous harvest. If deletions are not 
supported, a full harvest is performed by deleting all previously 
harvested records and harvesting all records from scratch

-Automatic harvests that occur daily now run at a given
time of day, which is entered in the UI when setting up the harvest

-Added additional user instructions and software documentation

-Sets are now sorted by set name in the sets configuration page

-Zipped harvests are now saved to a single directory for each
provider/set, and the directory is now in the admin area so access 
can be controlled

-Added ability to configure when to log the requests 
that are made to the data provider and ability to disable 
logging altogether. This eliminates or reduces writes that are 
made to the log index to save disc space.

-The OAI datestamps for items in the data provider are now 
set to three minutes ahead of the time the change was recorded 
in the index to provide for overlap in indexing time

-Added more harvest status messages during harvests

-Addressed some inconsistencies with the index mirroring the 
metadata files

-Fixed issue that was causing the server to shut down 
when a harvest attemped to save to a directory 
that did not have write permissions

-Fixed zip exception thrown when zero files were harvested


v3.0.2 Changes (Released June 7, 2006)

-Added statement about how to provide attribution when
using, modifying or redistributing the software

-Fixed problem where namespace and schema information
was not being saved when a new metadata format was entered
in the data provider files configuration page


v3.0.1 Changes

-Automatic harvests now perform a clean harvest of all records
from the data provider instead of only those that have changed
since the previous harvest

-Moved the data provider Explorer page to the top level menu
in the UI

-Fixed problem on Windows where the paths to download files from the
harvester were broken


v3.0.0 Changes (Released April 2006)

-The software has been renamed from 'DLESE OAI Software' to 'jOAI'

-A completely redesigned User Interface

-Ability to define arbitraty OAI sets in the data provider. 
Sets are now configured separately from the file directories via the UI

-Ability to define the data provider namespaces and schemas that are 
exposed in the ListMetadataFormats response via the UI

-The harvester now packages each set of harvested records into
zip files, which are made available for download from the
harvester UI

-Improved/added XSL transforms in the data provider to oai_dc 
from these formats: adn, dlese_anno, dlese_collect, news_opps

-Documentation and help pages updated for better clarity

-Deletions no longer supported in the data provider

-If the same ID is found in more than one metadata file, this is now detected
and an error is displayed in the data provider's administration UI

-Numerous updates and bug fixes for the underlying repository, indexing and other
core libraries and routines

-Updated to use Struts v1.2.7 and Lucene v1.4.3

-Compatible in Tomcat 5.x and Java 1.4 or higher


v2.0.12 Changes

-Added a searh page as the main index.jsp. This search page uses the ODL interface
to search the repository and display results to users.

-Harvester now ignores it if a given repository returns a record that has an empty
metadata element. Previously the presence of an empty metadata element within the
ListRecords response would result in a harvest error and halting of the harvest.

-Harvester now properly reports HTTP error (type 500, 403, etc.). Previously
these were being reported as connection time-outs (since the time-out functionality
was added).

-The Provider search and discovery page can now display dlese_ims records. Previously 
an exception was thrown.

-The dlese_ims indexer can now index files with the % character in their path. 

-Harvester now indents the XML it outputs with nice formatting. Previously the XML was 
written to file as a single line. Output is now also explicitly set to UTF-8.

-Provider JSPs now explicitly set the charset response header to UTF-8.

-Gzip filter now outputs using the given charset, which is UTF-8.


v2.0.11 Changes

-Harvester now times-out if a connection to the remote provider can not be established within 
180 seconds. Previously, the harvester would hang indefinately waiting for the harvest to 
complete.

v2.0.10 Changes

-Provider search and discovery now defaults to search on everything
when nothing is typed in the search box. Previously the user had to
type * to search on everything.

-Rearranged the reindexing button in the provider admin UI to place it
in a more logical spot.

-Changed parameter settings to address Mac OSX limitation on the number of 
files that can be open at any given time. For OSX compatibility, the Max 
number of files to index per block is set to 25 and the max number of 
files to harvest per resumption token is set to 50.


v2.0.9.3 Changes (Released Oct 2005)

-Made compatible for running in Tomcat 5.5.x


v2.0.9 Changes (Released Sept 2003)

-Added user feedback about the indexing process in the provider administration page.

-Updated the ADN to NSDL-DC and ADN to OAI-DC format converters.

-Improved display of ADN records in the provider Search and Discovery interface.

-Users can now only configure a directory of metadata files once.
This fixes some rare bugs that occur if a given directory is configured
for more than one set.

-Harvester now encodes resumptionTokens properly. Previously, resumptionTokens
were being encoded twice, which resulted in inproper requests to data providers.

-A bit more documentation and user instructions added.

-Binary and source distributions now automaticlly built and zipped by Ant. 

-New button in the provider admin interface allows administrators to rebuild the
metadata index from scratch, resetting all OAI datestamps and deleted records

Known issues:

- Files that have been created or harvested on Windows platforms sometimes
cannot be indexed properly in the data provider.

- Some UTF character encoding problems when writing metadata files.

- Problems on Windows reading some XML files that have had their file names encoded.

v2.0.8 Changes

-Added more documentation.

v2.0.7 Changes

- Fixed bug in displaying the Provider explorer ODL search pages in Netscape 4.

- Fixed IndexOutOfBounds error when selecting multiple formats in Provider search box.

v2.0.6 Changes

- Additions to documentation.

- XSL format converters in the data provider should now work properly on Windows systems.

v2.0.5 Changes

- Encoded colons and other chars in file names are now properly unencoded
to form a proper item ID.

- Now able to view XML files in the provider data search and reports interfaces that contain file 
name encoding, such as escaped colons. 


v2.0.4 Changes

- Added a bit more documentation.

- Performace enhancements made in the ODL search and the admin keyword search 
engines. 

- Fixed a bug in the creation of file directory path names to escape
characters that are illegal in Windows and other platforms file names.

- Harvests that were terminated by a server shut-down or crash are now
logged appropriately. Previously the harvest log would show these as 
"in progress."

- Removed the 'b' from the version number. 

v2.0.3b Changes (Released June 2003)

- Added a harvester. Features include:
  - Supports protocol versions 2.0 and 1.1. 
  - Automatic updating and sychronizing of the local data files with the remote data provider. 
  - Supports gzip compression. 
  - Ability to monitor harvest activity through the web interface. 
  - Saves files to disk in folders by baseURL path, set and format. 
  
- Added new metadata converters to convert from ADN to oai_dc, ADN to nsdl_dc, ADN to the briefmata 
format, DLESE IMS to ADN and DLESE IMS to ADN with cataloger information suppressed.



v2.0.2b Changes

- Added parsers for adn and dlese_anno metadata types, allowing search and display by field.

- Updated the dlese_ims to adn converters.

- Added a demo Java based XML converter.

- Improved the navigational menus in the UI. 

- Added a bit more documentation.


v2.0.1b features

-Keyword search over items in the repository. 

-UI to explore the repository using OAI-PMH requests. 

-Implements an ODL search extension of the OAI-PMH protocol that supports content-based queries. 

-Serves metadata directly from XML files cached on disc and requires no database for operation. 

-Allows users to validate and view each OAI response that is being returned by the provider. 

-Allows users to validate and view individual XML metadata records within the repository. 

-Supports configuration of arbitrary metadata sets. 

-Supports indexing and serving of any schema-based XML metadata format, plus DLESE-IMS (contains a DTD). 

-Provides extensible, plug-and-play configuration of XML metadata format converters, with support for XSL or Java. 

-Includes an XSL format converter to convert from DLESE IMS to ADN. 

-Allows a number of administrative options to control the behavior of the provider and the data that is returned. 

-Provides searchable reports that detail validation or critical errors found in metadata files. 

-Provides searchable reports of harvesters that have accessed the data provider. 

-Supports data transmission in compressed gzip format. 

-Supports deletions. 

-Supports modification date granularity down to seconds. 


v1.01 (Released to SourceForge March 2002)
Source: README_v3.1.1.txt, updated 2010-11-30