Reading from file works, but not stdin?

David
2011-05-25
2013-03-13
  • David

    David - 2011-05-25

    Just trying to get my feet wet playing with epubs; trying to use XMLStarlet to update the metadata file inside an epub.  Somehow, when I extract the metadata file to disk, then use that, it works.  But if I extract to stdout and pipe to XMLStarlet, it doesn't.  Clueless here …

    xml sel  -N opf="http://www.idpf.org/2007/opf"    \
             -N dc="http://purl.org/dc/elements/1.1/" \
             -t -c '/opf:package/opf:metadata' content.opf
    

    works and returns

    <metadata xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:opf="http://www.idpf.org/2007/opf" xmlns:calibre="http://calibre.kovidgoyal.net/2009/metadata" xmlns="http://www.idpf.org/2007/opf">
            <dc:title>March to the Sea</dc:title>
            <dc:creator opf:role="aut" opf:file-as="Weber, David">David Weber</dc:creator><dc:creator opf:role="aut">John Ringo</dc:creator>
            <dc:contributor opf:role="bkp" opf:file-as="calibre">calibre (0.5.4) [http://calibre.kovidgoyal.net]</dc:contributor>
            <dc:identifier opf:scheme="calibre" id="calibre_id">f7ad94e9-85d2-4528-a8e8-5087bc03f3ca</dc:identifier>
            <dc:date>2001-08-01T00:00:00</dc:date>
            <dc:language>en-us</dc:language>
            <dc:publisher>Baen Books</dc:publisher>
            <dc:identifier opf:scheme="ISBN">0-671-31826-8</dc:identifier>
            <meta name="calibre:series_index" content="1"/>
            <dc:subject>Science Fiction</dc:subject>
        </metadata>
    

    but

    unzip -c -q test.epub |                              \
       xml sel  -N opf="http://www.idpf.org/2007/opf"    \
                -N dc="http://purl.org/dc/elements/1.1/" \
                -t -c '/opf:package/opf:metadata'
    

    does not.  Instead, I just get the usage text.

    - It's not a problem with the unzip output; if I replace the unzip command with 'cat content.opf' I get the same results. 
    - If I have only 1 namespace declaration, it does work.
    - Changing the order of the namespace declarations doesn't affect the result.

     
  • David

    David - 2011-05-25

    Correction: the second listing should read

    unzip -c -q test.epub content.opf |                  \
       xml sel  -N opf="http://www.idpf.org/2007/opf"    \
                -N dc="http://purl.org/dc/elements/1.1/" \
                -t -c '/opf:package/opf:metadata'
    

    to extract only the metadata file from the epub.

     
  • Noam Postavsky

    Noam Postavsky - 2011-05-25

    I can't reproduce your problem. Can you post the output from --version and the exact command that fails? Does it work if you use "-" (indicating stdin) as the file name?

     
  • David

    David - 2011-05-30

    OK, I don't know where those semicolons crept in.  The version is 1.0.1, as packaged for openSUSE 11.3.

    The following command works:

    xml sel -N opf="http://www.idpf.org/2007/opf"      \
            -N dc="http://purl.org/dc/elements/1.1/"   \
            -t -c '/opf:package/opf:metadata/dc:title' metadata.opf
    

    But

    cat metadata.opf | xml sel -N opf="http://www.idpf.org/2007/opf"      \
            -N dc="http://purl.org/dc/elements/1.1/"   \
            -t -c '/opf:package/opf:metadata/dc:title'
    

    does not.  If I only define a single namespace, it works.  But then I can't select individual elements within /package/metadata:

    cat metadata.opf | xml sel -N opf="http://www.idpf.org/2007/opf"      \
            -t -c '/opf:package/opf:metadata'
    

    However, it turns out the multiple namespaces work when I explicitly set stdin as the file using "-" as suggested, thanks!  My amateur skills showing …
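For reference, a minimal sketch of the workaround described above: the trailing "-" names stdin explicitly. The sample document and the temporary file are made up for illustration, and the lookup of the binary name (`xml` as in this thread, `xmlstarlet` as some distributions package it) is an assumption.

```shell
#!/bin/sh
# Assumption: XMLStarlet is installed as "xml" or "xmlstarlet".
XML=$(command -v xml || command -v xmlstarlet || true)

# Hypothetical minimal OPF document standing in for the real metadata file.
sample=$(mktemp)
cat > "$sample" <<'EOF'
<?xml version="1.0"?>
<package xmlns="http://www.idpf.org/2007/opf" version="2.0">
  <metadata xmlns:dc="http://purl.org/dc/elements/1.1/">
    <dc:title>March to the Sea</dc:title>
  </metadata>
</package>
EOF

title=""
if [ -n "$XML" ]; then
    # -v prints the text value of the selected node.
    # The final "-" tells XMLStarlet to read the document from stdin.
    title=$(cat "$sample" | "$XML" sel \
        -N opf="http://www.idpf.org/2007/opf" \
        -N dc="http://purl.org/dc/elements/1.1/" \
        -t -v '/opf:package/opf:metadata/dc:title' -)
    echo "$title"
else
    echo "XMLStarlet not found; install it to run this sketch" >&2
fi
```

Naming stdin explicitly avoids relying on how a given version parses a command line that ends without a file argument.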

     
  • David

    David - 2011-05-30

    Why does it keep putting those semicolons in there?

     
  • Noam Postavsky

    Noam Postavsky - 2011-05-31

    1.0.1 is pretty old; you might be hitting bug #1722425 ("many -N options on command line bug").

    It seems like the forum adds semicolons every time it finds a URL inside quotes; you can block it by putting the URL inside a url bbcode tag.
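Applied to the original epub case, the explicit "-" might be wrapped up as follows. This is only a sketch: `epub_metadata` is a hypothetical helper name, `content.opf` as the default member path is an assumption, and `unzip -p` is swapped in for `unzip -c -q` as another way to write a single archive member to stdout.

```shell
#!/bin/sh
# Hypothetical helper: print the <metadata> element of the OPF file in an epub.
# Usage: epub_metadata book.epub [path/inside/archive.opf]
epub_metadata() {
    epub=$1
    opf=${2:-content.opf}   # assumption: the OPF sits at the archive root
    # unzip -p writes the member to stdout without extra banner lines.
    # The final "-" makes XMLStarlet read that stream as its input document.
    # Assumption: the binary is called "xml"; some systems name it "xmlstarlet".
    unzip -p "$epub" "$opf" |
        xml sel -N opf="http://www.idpf.org/2007/opf" \
                -N dc="http://purl.org/dc/elements/1.1/" \
                -t -c '/opf:package/opf:metadata' -
}
```

A call such as `epub_metadata test.epub content.opf` would then correspond to the corrected listing earlier in the thread, with stdin named explicitly.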

     
