#10 Validation error when namespace for in-line XML datastreams is in root element.

closed
nobody
None
5
2008-09-15
2007-02-19
Sam
No

Discussion

  • Sam
    Sam
    2007-03-12

    Logged In: YES
    user_id=1709161
    Originator: YES

    I get the following error when I try batch ingest with the Fedora 2.1 client
    (to a Fedora 2.1 repository), but I am able to ingest the same file without
    any problems using a Fedora 2.0 client and repository.

    Ingesting Batch . . .
    ingest failed for: object specific.xml
    org.apache.axis.AxisFault
    fedora.server.errors.ObjectIntegrityException: Parse error parsing DC XML
    Metadata: The prefix "dc" for element "dc:creator" is not bound.
    ingest format specified was: "metslikefedora1"
    ===BATCH HAS FAILED===
    consider manually backing out any objects which were already successfully
    ingested in this batch

    The mets file (with the datastreams section omitted) is listed below:

    <?xml version="1.0" encoding="UTF-8"?>
    <!-- METS format: four images plus Dublin Core for a bulk TIFF ingest -->
    <METS:mets xmlns:METS="http://www.loc.gov/METS/"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xmlns:fedoraAudit="http://fedora.comm.nsdlib.org/audit"
    xmlns:xlink="http://www.w3.org/TR/xlink"
    xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
    xmlns:dc="http://purl.org/dc/elements/1.1/"
    xsi:schemaLocation="http://www.fedora.info/definitions/1/0/mets-fedora-ext.xsd"
    OBJID="fedorawillreplace:0814" TYPE="FedoraObject"
    LABEL="template: multi-image plus metadata" PROFILE="image/tiff">
    <!-- This is the Dublin core metadata -->
    <METS:dmdSecFedora ID="DC" STATUS="A">
    <METS:descMD ID="DC.0" CREATED="2006-03-01T10:28:34">
    <METS:mdWrap MIMETYPE="text/xml" MDTYPE="OTHER" LABEL="Dublin Core
    descriptive metadata">
    <METS:xmlData>
    <oai_dc:dc>
    <dc:title>Engelmann spruce 15 years after planting.</dc:title>
    <dc:creator>Ohio Agricultural Experiment Station. Dept. of
    Forestry.</dc:creator>
    <dc:subject>Forestry engineering </dc:subject>
    <dc:description>Engelmann spruce 15 years after planting. Item
    #814</dc:description>
    <dc:publisher>Ohio Agricultural Research and Development
    Center.</dc:publisher>
    <dc:contributor>Ohio Agricultural Research and Development
    Center.</dc:contributor>
    <dc:date>1918</dc:date>
    <dc:type>Image.photographic</dc:type>
    <dc:format>image/jpeg</dc:format>
    <dc:identifier>0814_gs_a.jpg</dc:identifier>
    <dc:source>Taken from 2 x 3 glass slide #814.</dc:source>
    <dc:relation>Ohio Agricultural Experiment Station Forestry Image
    Collection</dc:relation>
    <dc:coverage>Ohio</dc:coverage>
    <dc:rights>http://library.osu.edu/sites/dlib/terms.html</dc:rights>
    </oai_dc:dc>
    </METS:xmlData>
    </METS:mdWrap>
    </METS:descMD>
    </METS:dmdSecFedora>

    ------- Additional Comment #1 From Ross Wayland 2006-03-03 13:09 [reply] -------
    This bug is not specific to the Batch Ingest utility; it applies to the Ingest
    method in general whenever in-line XML datastreams have their namespace
    information declared in the root element.

    A temporary workaround is to declare the namespace information locally in the
    in-line XML, e.g. for the DC datastream:

    <oai_dc:dc xmlns:dc="http://purl.org/dc/elements/1.1/"
    xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/">
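
The effect of the workaround can be reproduced outside Fedora. The following is a minimal Python sketch, not the repository's own Java code: it parses the DC fragment on its own, once without and once with local namespace declarations. The helper name `parses_standalone` and the shortened creator text are made up for illustration.

```python
import xml.etree.ElementTree as ET

DC_NS = "http://purl.org/dc/elements/1.1/"
OAI_DC_NS = "http://www.openarchives.org/OAI/2.0/oai_dc/"

# Inline XML that relies on declarations made on the METS root element.
without_local = "<oai_dc:dc><dc:creator>Dept. of Forestry.</dc:creator></oai_dc:dc>"

# Same content with the workaround applied: declarations on the fragment's own root.
with_local = (
    f'<oai_dc:dc xmlns:dc="{DC_NS}" xmlns:oai_dc="{OAI_DC_NS}">'
    "<dc:creator>Dept. of Forestry.</dc:creator></oai_dc:dc>"
)


def parses_standalone(fragment: str) -> bool:
    """Return True if the fragment parses without any outside namespace context."""
    try:
        ET.fromstring(fragment)
        return True
    except ET.ParseError:
        # A namespace-aware parser rejects prefixes that are not bound in the fragment.
        return False


print(parses_standalone(without_local))
print(parses_standalone(with_local))
```

Only the second fragment parses in isolation, which is consistent with the "prefix is not bound" message in the ingest log.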

    ------- Additional Comment #2 From Chris Wilper 2006-03-07 15:50 [reply] -------
    Just a note: I checked the revision history of METSDODeserializer.java and
    nothing significant changed since I "fixed" this last time. Curious.

     
  • Chris Wilper
    Chris Wilper
    2007-10-05

    Logged In: YES
    user_id=189298
    Originator: NO

    I can think of three approaches to handling this:

    If FOXML or METS is submitted with inline XML ("X")
    datastream namespaces declared at the root of the
    document (and not inside the datastream itself):

    [Disallow]
    Throw an error indicating this (in a more clear
    manner than currently given) at ingest time.
    - I think this is the easiest to implement and
    rationalize.

    [Normalize]
    Automatically move namespace declarations to
    the inline xml sections at ingest time.
    - This would require that we essentially
    change the datastream content as submitted.
    It's more effort than disallowing, and
    harder to rationalize. Consider the questions:
    A) Should the namespace declaration be considered
    part of the datastream content? I think the answer
    is yes. This is certainly the case for managed content
    that happens to be XML.
    B) What do we do if a checksum is submitted with the
    inline XML content, and the automatic change would make
    it invalid?

    [Allow]
    Allow namespaces to be declared at the root,
    so that inline XML doesn't have to be valid
    in itself, but is only valid in the context
    of the FOXML or METS document that declares the
    namespace(s) at the root.
    - I think this approach would be unwise.
    One obvious reason is that it would make it
    impossible to validate inline XML outside
    the context of the FOXML or METS in which
    it resides (consider the admin client, or
    other utilities that work with datastreams
    without awareness of their "wrapping format").
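
As an illustration of the [Disallow] option, here is a rough Python sketch of an ingest-time check that names the unbound prefixes and says how to fix them. It is only a sketch under stated assumptions: `UnboundPrefixError`, `unbound_prefixes` and `check_inline_xml` are hypothetical names, not Fedora APIs, and the check only looks at element and attribute names (prefixes used inside attribute values or text are not examined).

```python
import xml.parsers.expat


class UnboundPrefixError(ValueError):
    """Hypothetical error type for inline XML that is not self-contained."""


def unbound_prefixes(fragment: str) -> list:
    """List prefixes used in the fragment but never declared inside it."""
    # No namespace separator is passed, so expat reports raw qualified names
    # and xmlns attributes, and does not reject unbound prefixes itself.
    parser = xml.parsers.expat.ParserCreate()
    scopes = [{"xml"}]  # the "xml" prefix is always bound
    missing = []

    def start(name, attrs):
        declared = set(scopes[-1])
        for attr in attrs:
            if attr.startswith("xmlns:"):
                declared.add(attr[len("xmlns:"):])
        scopes.append(declared)
        qnames = [name] + [
            a for a in attrs if a != "xmlns" and not a.startswith("xmlns:")
        ]
        for qname in qnames:
            prefix, sep, _ = qname.partition(":")
            if sep and prefix not in declared and prefix not in missing:
                missing.append(prefix)

    def end(name):
        scopes.pop()

    parser.StartElementHandler = start
    parser.EndElementHandler = end
    parser.Parse(fragment, True)
    return missing


def check_inline_xml(datastream_id: str, fragment: str) -> None:
    """Raise a descriptive error if the inline XML depends on outer declarations."""
    missing = unbound_prefixes(fragment)
    if missing:
        raise UnboundPrefixError(
            f"Inline XML datastream {datastream_id!r} uses prefix(es) "
            f"{', '.join(missing)} without declaring them inside the datastream. "
            "Declare them on the datastream's own root element rather than on "
            "the root of the enclosing FOXML or METS document."
        )
```

Called as `check_inline_xml("DC", fragment)` on the `<oai_dc:dc>` fragment from the original report, this would name both `oai_dc` and `dc` instead of stopping at the first unbound element.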

     
  • Chris Wilper
    Chris Wilper
    2007-10-05

    • priority: 7 --> 5
     
  • Chris Wilper
    Chris Wilper
    2007-10-05

    Logged In: YES
    user_id=189298
    Originator: NO

    Changed priority to 5 -- I don't think this is causing too many headaches out there.

     
  • Chris Wilper
    Chris Wilper
    2007-10-11

    Logged In: YES
    user_id=189298
    Originator: NO

    Ross and I talked about this and generally agreed that [Disallow] makes sense here.

    So the scope of this should be:

    - Make the validation give more meaningful error messages (instructing how to fix) in this case
    - Make sure all the demo objects and examples don't declare namespaces at the root that are a) unused, or b) only used in the xml content
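
For the second item, a small audit script could flag root-level declarations in the demo objects. The Python sketch below is one possible approach, not an existing Fedora tool: `audit_root_namespaces` is a hypothetical name, inline XML is assumed to sit inside `METS:xmlData` or `foxml:xmlContent`, and only element and attribute names are inspected (prefixes inside attribute values, or prefixes redeclared locally in the inline XML, are not handled).

```python
import xml.parsers.expat

# Local names of the elements that wrap inline XML (METS:xmlData, foxml:xmlContent).
INLINE_WRAPPERS = {"xmlData", "xmlContent"}


def audit_root_namespaces(document: str) -> dict:
    """Classify each prefix declared on the root as 'ok', 'unused' or 'inline-only'."""
    parser = xml.parsers.expat.ParserCreate()  # raw names, no namespace processing
    root_prefixes = []
    used_outside = set()
    used_inside = set()
    state = {"depth": 0, "inline_depth": None}

    def start(name, attrs):
        state["depth"] += 1
        if state["depth"] == 1:
            root_prefixes.extend(
                a[len("xmlns:"):] for a in attrs if a.startswith("xmlns:")
            )
        inside = state["inline_depth"] is not None
        target = used_inside if inside else used_outside
        qnames = [name] + [
            a for a in attrs if a != "xmlns" and not a.startswith("xmlns:")
        ]
        for qname in qnames:
            prefix, sep, _ = qname.partition(":")
            if sep:
                target.add(prefix)
        # The wrapper element itself counts as "outside"; its children are inline.
        if not inside and name.rpartition(":")[2] in INLINE_WRAPPERS:
            state["inline_depth"] = state["depth"]

    def end(name):
        if state["inline_depth"] == state["depth"]:
            state["inline_depth"] = None
        state["depth"] -= 1

    parser.StartElementHandler = start
    parser.EndElementHandler = end
    parser.Parse(document, True)

    report = {}
    for prefix in root_prefixes:
        if prefix in used_outside:
            report[prefix] = "ok"
        elif prefix in used_inside:
            report[prefix] = "inline-only"
        else:
            report[prefix] = "unused"
    return report
```

Prefixes reported as `unused` or `inline-only` correspond to cases a) and b) above and would be candidates for removal from the root or for moving into the inline XML.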

     
  • Daniel Davis
    Daniel Davis
    2008-09-15

    • status: open --> closed