The attached file patches RapidXML v1.13 with a new optional parsing flag that strips namespace prefixes from element and attribute names.
To use the feature, pass the flag parse_strip_xml_namespaces when parsing an XML document.
This causes all XML namespace prefixes to be stripped from the resulting DOM. For example, the following XML:
<?xml version="1.0" encoding="ISO-8859-1" ?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema" xmlns:ns1="http://www.example.org">
  <xs:element name="example" type="xs:string"/>
  <ns1:example ns1:attr="hello world"/>
</xs:schema>
would have the following stripped representation:
<schema xs="http://www.w3.org/2001/XMLSchema" ns1="http://www.example.org">
  <element name="example" type="xs:string"/>
  <example attr="hello world"/>
</schema>
(Note that the value of the type attribute in the second line keeps its "xs:" prefix, because it is part of the attribute's value and not part of an element or attribute name.)
The cost of stripping XML namespaces is fairly small but non-negligible: an additional pass over every element and attribute name, looking for the namespace separator (a colon, ":"), plus the increase in code size to do this. Because RapidXML parse flags are compile-time template parameters, the compiler should optimize the extra code away entirely when the flag is not used.
The flag is non-destructive; i.e., it does not modify the original string, only the XML nodes that are generated from the string.
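To make the per-name cost and the non-destructive behaviour concrete, here is a minimal standalone sketch of the kind of scan described above. It is not the code from the patch: the helper name strip_namespace_prefix is hypothetical, it uses std::string_view (C++17) rather than RapidXML's own pointer and size pairs, and it assumes the prefix ends at the first colon.

```cpp
#include <string_view>

// Hypothetical helper, not part of RapidXML or of the attached patch.
// Returns the part of `name` that follows the first ':' (assumption: the
// namespace prefix ends at the first colon). Names without a colon are
// returned unchanged.
//
// The result is a view into the same buffer as `name`; the underlying
// characters are never modified, which mirrors the non-destructive
// behaviour described in the post.
std::string_view strip_namespace_prefix(std::string_view name) {
    const std::string_view::size_type colon = name.find(':');
    if (colon == std::string_view::npos) {
        return name;                 // no prefix present
    }
    return name.substr(colon + 1);   // skip "prefix:"
}
```

Applied to the names in the sample document, "xs:schema" maps to "schema" and "xmlns:xs" maps to "xs", while an unprefixed name such as "name" comes back unchanged. The scan is linear in the length of each name, which is where the small extra cost comes from.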
Short of actual XML namespace support, eliminating the namespaces entirely was the most convenient way I found to drill into XML that had non-trivial or unpredictable namespace usage. A more robust solution may be required (of course) if those namespaces are actually needed to provide context to conflicting tag names.
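As an illustration of the caveat about conflicting tag names, the sketch below splits a qualified name into its prefix and local part and checks whether two different qualified names would become indistinguishable once the prefix is dropped. Both helpers (split_qualified_name and collide_when_stripped) are hypothetical names introduced for this example, and the code again assumes the prefix ends at the first colon.

```cpp
#include <string_view>
#include <utility>

// Hypothetical helper: splits "prefix:local" into {prefix, local}.
// A name without a colon yields an empty prefix.
std::pair<std::string_view, std::string_view>
split_qualified_name(std::string_view name) {
    const std::string_view::size_type colon = name.find(':');
    if (colon == std::string_view::npos) {
        return {std::string_view{}, name};
    }
    return {name.substr(0, colon), name.substr(colon + 1)};
}

// Hypothetical helper: true when two distinct qualified names share the
// same local part, i.e. they can no longer be told apart after stripping.
bool collide_when_stripped(std::string_view a, std::string_view b) {
    return a != b &&
           split_qualified_name(a).second == split_qualified_name(b).second;
}
```

For instance, two made-up names such as "a:item" and "b:item" collide, because both reduce to "item". Documents that rely on the prefix to tell such elements apart are the case where stripping is not enough.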