#5 Feature to strip XML namespaces from tags

open
nobody
None
5
2016-10-11
2010-04-05
Dan Posluns
No

The attached file patches RapidXML v1.13 with a new optional parsing flag to strip namespaces from element and attribute tags.

To use the feature, pass the flag parse_strip_xml_namespaces when parsing an XML document, e.g.

xml_document<> doc;

// my_xml_string must be a modifiable, zero-terminated char buffer
doc.parse<parse_strip_xml_namespaces>(my_xml_string);

This strips all XML namespace prefixes from the element and attribute names in the resulting DOM. For example, the following XML:

<?xml version="1.0" encoding="ISO-8859-1" ?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema" xmlns:ns1="http://www.example.org">
<xs:element name="example" type="xs:string"/>
<ns1:example ns1:attr="hello world"/>
</xs:schema>

would have a stripped representation of:

<schema xs="http://www.w3.org/2001/XMLSchema" ns1="http://www.example.org">
<element name="example" type="xs:string"/>
<example attr="hello world"/>
</schema>

(Note that the value of the type attribute on the second line keeps its "xs:" prefix, because it is attribute content rather than part of an element or attribute name.)

The cost of stripping XML namespaces is fairly small but non-negligible: one additional pass over every element and attribute name, looking for the namespace separator (a colon, ":"), plus the increase in code size to do this. The code should be optimized away entirely by the compiler when the flag is not used.
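To make the cost concrete, here is a minimal sketch of the kind of scan described above. It is not the code from the patch: the flag constant sketch_strip_namespaces, its value, and both helper functions are hypothetical names chosen for illustration, and the choice to split at the first colon is an assumption. The flag is a template parameter, in the same style as RapidXML's parse flags, so the test on it is a compile-time constant and the loop can be discarded when the flag is not set.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical flag value for illustration only; the value used by the
// actual patch is not shown in this post.
const int sketch_strip_namespaces = 0x1;

// Returns a pointer to the start of the local part of the name stored in
// [begin, end). The buffer itself is never written to.
template<int Flags>
const char *local_name_start(const char *begin, const char *end)
{
    assert(begin <= end);
    if (Flags & sketch_strip_namespaces)    // constant expression, so the
    {                                       // branch can be dropped when unset
        for (const char *p = begin; p != end; ++p)
            if (*p == ':')
                return p + 1;               // skip the prefix and the colon
    }
    return begin;                           // no prefix, or stripping disabled
}

// Length of the local part, for callers that store names as pointer + size.
template<int Flags>
std::size_t local_name_size(const char *begin, const char *end)
{
    return static_cast<std::size_t>(end - local_name_start<Flags>(begin, end));
}
```

Because the result is only a moved start pointer and a shorter length, the source text is left untouched, which matches the non-destructive behaviour described for the flag.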

The flag is non-destructive; i.e. it makes no changes of its own to the original string, only to the XML nodes that are generated from the string.

Short of actual XML namespace support, eliminating the namespaces entirely was the most convenient way I found to drill into XML that had non-trivial or unpredictable namespace usage. A more robust solution may be required (of course) if those namespaces are actually needed to provide context to conflicting tag names.
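One way to judge whether stripping is safe for a given document is to check in advance whether any two differently prefixed names collapse to the same local name. The sketch below is independent of RapidXML and of the patch; local_part and find_conflicts are hypothetical helpers, and splitting at the first colon is an assumption.

```cpp
#include <cstddef>
#include <map>
#include <set>
#include <string>
#include <vector>

typedef std::map<std::string, std::set<std::string> > ConflictMap;

// Local part of a qualified name: everything after the first colon,
// or the whole name when there is no colon.
std::string local_part(const std::string &qname)
{
    const std::string::size_type colon = qname.find(':');
    return colon == std::string::npos ? qname : qname.substr(colon + 1);
}

// Maps each local name to the distinct qualified names that reduce to it,
// keeping only local names shared by more than one qualified name.
ConflictMap find_conflicts(const std::vector<std::string> &qnames)
{
    ConflictMap groups;
    for (std::size_t i = 0; i < qnames.size(); ++i)
        groups[local_part(qnames[i])].insert(qnames[i]);

    ConflictMap conflicts;
    for (ConflictMap::const_iterator it = groups.begin(); it != groups.end(); ++it)
        if (it->second.size() > 1)
            conflicts.insert(*it);
    return conflicts;
}
```

For instance, feeding it the made-up names "a:item", "b:item" and "title" reports "item" as a conflict, since both prefixed forms would become indistinguishable after stripping. An empty result suggests the prefixes carry no disambiguating information for that set of names.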

Discussion

  • Dan Posluns - 2010-04-05

    Patch file to add XML namespace stripping to RapidXML v1.13

     
  • pedro quide - 2016-10-11

    What patch format is this? I can't manage to apply it!

    Here is another go on the problem: https://github.com/dwd/rapidxml/blob/master/rapidxml.hpp

     
