xmlstarlet - sel with namespaces?

Help
barni
2012-01-12
2013-03-13
  • barni
    2012-01-12

    Hello community!

    I'm new to XML and xmlstarlet, and I'd like to use xmlstarlet to get some data out of an XML file.

    file.xml (input)

    <?xml version="1.0" encoding="UTF-8"?><?xml-stylesheet type="text/xsl" href="/2.6.2/style/numbers.xsl"?>
    <ops:world-patent-data xmlns:ops="http://ops.epo.org" xmlns="http://www.epo.org/exchange" xmlns:ccd="http://www.epo.org/ccd" xmlns:xlink="http://www.w3.org/1999/xlink">
        <ops:meta name="status" value="BRW002 BRW015 BRW023"/>
        <ops:meta name="info" value=""/>
        <ops:meta name="version" value="11.03.63"/>
        <ops:meta name="elapsed-time" value="122"/>
        <ops:standardization inputFormat="original" outputFormat="docdb">
            <ops:input>
                <ops:publication-reference>
                    <document-id document-id-type="original">
                        <country>WO</country>
                        <doc-number>9933165</doc-number>
                        <kind>A1</kind>
                    </document-id>
                </ops:publication-reference>
            </ops:input>
            <ops:output>
                <ops:publication-reference>
                    <document-id document-id-type="docdb">
                        <country>WO</country>
                        <doc-number>9933165</doc-number>
                        <kind>A1</kind>
                        <date>19990701</date>
                    </document-id>
                </ops:publication-reference>
            </ops:output>
        </ops:standardization>
    </ops:world-patent-data>

    I need the values of <country>, <doc-number> and <kind> from <ops:output>.
    In this example the result should be "WO9933165A1".

    To understand how xmlstarlet works, I first tried to get the values of all <country> nodes with the following command line:

    xml.exe sel -t -m "//document-id" -v "country" file.xml

    But I got no result. With a simpler XML file that has no namespaces, the same command works, so I guess the namespaces are the problem.

    My next try was:

    xml.exe sel -N ops="http://ops.epo.org" -t -m "//document-id" -v "country" file.xml

    But that didn't work either.

    I would be grateful for a short hint on how to modify the command line to get the country code, and on what the command line would look like to get the complete result "WO9933165A1" (<country>, <doc-number> and <kind>).

    Many thanks and greetings from Germany,

    barni

     
  • Noam Postavsky
    2012-01-13

    xmlns="http://www.epo.org/exchange"

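    This is the default namespace: any tag without a prefix in the document belongs to it. Unfortunately, XPath doesn't support a default namespace, so you have to declare it explicitly to xmlstarlet under a prefix of your choice (ex in the command below) and use that prefix in the XPath expressions: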

    xml sel -N ops=http://ops.epo.org -N ex=http://www.epo.org/exchange -T -t -m //ops:output//ex:document-id -v ex:country -v ex:doc-number -v ex:kind -n file.xml
    

    Note that in recent versions (1.2.1 and above), -N ops=http://ops.epo.org is not needed: xmlstarlet will automatically use the namespace declarations from the document (but not the default one).
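
    The same rule (unprefixed names in an XPath expression never match elements that sit in a default namespace) can be checked outside xmlstarlet. The following is only a minimal sketch using Python's standard xml.etree.ElementTree, not part of xmlstarlet; the trimmed-down XML string, the prefix "ex" and the helper name publication_number are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# Trimmed-down copy of file.xml from the question (only the <ops:output> branch).
XML = """<ops:world-patent-data xmlns:ops="http://ops.epo.org" xmlns="http://www.epo.org/exchange">
  <ops:standardization inputFormat="original" outputFormat="docdb">
    <ops:output>
      <ops:publication-reference>
        <document-id document-id-type="docdb">
          <country>WO</country>
          <doc-number>9933165</doc-number>
          <kind>A1</kind>
          <date>19990701</date>
        </document-id>
      </ops:publication-reference>
    </ops:output>
  </ops:standardization>
</ops:world-patent-data>"""

# Prefix-to-URI bindings used by the path expressions below.
# "ex" is an arbitrary prefix chosen here for the document's default namespace.
NS = {
    "ops": "http://ops.epo.org",
    "ex": "http://www.epo.org/exchange",
}


def publication_number(xml_text):
    """Concatenate country, doc-number and kind from the first document-id under ops:output."""
    root = ET.fromstring(xml_text)
    doc = root.find(".//ops:output//ex:document-id", NS)
    if doc is None:
        return None
    parts = ("country", "doc-number", "kind")
    return "".join(doc.findtext("ex:" + tag, default="", namespaces=NS) for tag in parts)


if __name__ == "__main__":
    # Without a prefix nothing matches, just like the first attempt in the question.
    print(ET.fromstring(XML).findall(".//document-id"))
    # With the default namespace bound to "ex", the lookup succeeds.
    print(publication_number(XML))
```

    The unprefixed lookup returns an empty list, while the prefixed one finds the element, which mirrors why the -N ex=... declaration is needed in the xmlstarlet command.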

     
  • barni
    2012-01-13

    @npostavs

    That's great, it worked - thank you very much!

    barni