I think the best way to do this is to put the document through a transformation that changes the namespace:

<xsl:template match="*">
  <xsl:element name="{local-name()}" namespace="{$correct-namespace}">
    <xsl:copy-of select="@*"/>
    <xsl:apply-templates/>
  </xsl:element>
</xsl:template>

and then do the XPath query.
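
Since you asked for something Java-side, here is a rough sketch of those two steps. To keep it self-contained it uses the JDK's built-in JAXP transformer and XPath engine rather than s9api. The class name NamespaceNormalizer, the stylesheet parameter handling, and the canonical URI in the usage note are all illustrative assumptions, not part of any Saxon API.

```java
import java.io.StringReader;
import java.util.Collections;
import java.util.Iterator;
import javax.xml.XMLConstants;
import javax.xml.namespace.NamespaceContext;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMResult;
import javax.xml.transform.stream.StreamSource;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Node;

public class NamespaceNormalizer {

    // The template from above, wrapped in a stylesheet that takes the
    // target namespace as a parameter.
    static final String STYLESHEET =
        "<xsl:stylesheet version='1.0'"
      + " xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
      + "<xsl:param name='correct-namespace'/>"
      + "<xsl:template match='*'>"
      + "<xsl:element name='{local-name()}' namespace='{$correct-namespace}'>"
      + "<xsl:copy-of select='@*'/>"
      + "<xsl:apply-templates/>"
      + "</xsl:element>"
      + "</xsl:template>"
      + "</xsl:stylesheet>";

    // Step 1: rewrite every element into the one namespace we want to query.
    public static Node normalize(String xml, String correctNamespace) {
        try {
            Transformer t = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(STYLESHEET)));
            t.setParameter("correct-namespace", correctNamespace);
            DOMResult result = new DOMResult();
            t.transform(new StreamSource(new StringReader(xml)), result);
            return result.getNode();
        } catch (Exception e) {
            throw new IllegalStateException("normalization failed", e);
        }
    }

    // Step 2: run an XPath query with a single prefix bound to that namespace.
    public static String query(Node doc, final String prefix,
                               final String ns, String xpath) {
        try {
            XPath xp = XPathFactory.newInstance().newXPath();
            xp.setNamespaceContext(new NamespaceContext() {
                public String getNamespaceURI(String p) {
                    return prefix.equals(p) ? ns : XMLConstants.NULL_NS_URI;
                }
                public String getPrefix(String uri) {
                    return ns.equals(uri) ? prefix : null;
                }
                public Iterator<String> getPrefixes(String uri) {
                    return ns.equals(uri)
                        ? Collections.singletonList(prefix).iterator()
                        : Collections.<String>emptyIterator();
                }
            });
            return xp.evaluate(xpath, doc);
        } catch (Exception e) {
            throw new IllegalStateException("query failed", e);
        }
    }
}
```

You would call normalize(xml, "urn:example:canonical") on each crawled document, then run the same query, say /n:foo/n:bar with n bound to that URI, against every result. Note that this flattens every element into one namespace, which is only safe if the documents use a single vocabulary.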

The alternative is to rely on XPath wildcard matching (paths of the form *:foo/*:bar/*:baz), which needs XPath 2.0 or later.
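
As a hedged illustration of the wildcard approach: the sketch below runs on the JDK's built-in XPath 1.0 engine, which does not accept *:foo, so it substitutes the XPath 1.0 equivalent *[local-name()='foo']. With Saxon you would write *:foo/*:bar directly. The class and method names (WildcardQuery, wildcardPath, evaluate) are made up for the example.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

public class WildcardQuery {

    // Build an absolute path that matches each step by local name only,
    // ignoring whatever namespace the document happens to use.
    // Local names are assumed not to contain quote characters.
    public static String wildcardPath(String... localNames) {
        StringBuilder sb = new StringBuilder();
        for (String name : localNames) {
            sb.append("/*[local-name()='").append(name).append("']");
        }
        return sb.toString();
    }

    // Parse the document namespace-aware (required for local-name() to
    // behave as expected) and return the string value of the first match.
    public static String evaluate(String xml, String xpath) {
        try {
            DocumentBuilderFactory f = DocumentBuilderFactory.newInstance();
            f.setNamespaceAware(true);
            Document doc = f.newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
            return XPathFactory.newInstance().newXPath().evaluate(xpath, doc);
        } catch (Exception e) {
            throw new IllegalStateException("query failed", e);
        }
    }
}
```

The same path string can then be evaluated against documents that declare different namespace URIs, at the cost of also matching unrelated elements that happen to share a local name.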

I'm afraid that when people misuse XML in this way, tools like XPath and XSLT become much less effective.

I have to say I haven't seen this particular problem of people choosing random namespaces for the same vocabulary: what particular community is doing this?

Michael Kay
Saxonica


On 4 Dec 2013, at 21:14, Ryan McKinley <ryantxu@gmail.com> wrote:

I am trying to use Saxon s9api to run XPath queries against XML gathered from a bunch of locations (a focused web crawl).

I understand the namespace URI is *supposed* to be unique and consistent, but this does not appear to be the case in practice.

I am finding lots of examples where people reference:
 prefix="http://somethign/abc/"
 prefix="http://somethign/ABC/v2"
 prefix="url:a:b:c"
 prefix="url:a:b:c:v2"

It would be great to be able to normalize this so that the same xpath query could be used across multiple documents.

Any suggestions for how to approach this? Java code would be easier for me to deal with than XSLT... but I am open to anything :)

Thanks for any pointers!

ryan



_______________________________________________
saxon-help mailing list archived at http://saxon.markmail.org/
saxon-help@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/saxon-help