Avoid output of 'temporary' variables

Help
Anonymous
2022-08-28
2022-08-29
  • Anonymous

    Anonymous - 2022-08-28

    I'm working on this command:

    xidel --silent --output-format=cmd ^
            --extract "x86:=//a[contains(text(), 'x86')]" ^
            --extract "x64:=//a[contains(text(), 'x64')]" ^
            --extract "x86.id:=$x86/extract(@onclick, '.\"(.*)\"', 1)" ^
            --extract "x86.txt:=$x86/normalize-space(text())" ^
            --extract "x64.id:=$x64/extract(@onclick, '.\"(.*)\"', 1)" ^
            --extract "x64.txt:=$x64/normalize-space(text())" ^
        "%fil%"
    
    This works so far.

    How can I keep the 'temporary' variables x86 and x64 out of the output?

    Thanks for your attention and your time.
    
     
  • Reino

    Reino - 2022-08-28

    You could use --extract-exclude:

    xidel -s "%fil%" --extract-exclude=x86,x64 ^
          -e "x86:=//a[contains(text(), 'x86')]" ^
          -e "x64:=//a[contains(text(), 'x64')]" ^
          -e "x86.id:=$x86/extract(@onclick, '.\"(.*)\"', 1)" ^
          -e "x86.txt:=$x86/normalize-space(text())" ^
          -e "x64.id:=$x64/extract(@onclick, '.\"(.*)\"', 1)" ^
          -e "x64.txt:=$x64/normalize-space(text())" ^
          --output-format=cmd
    

    There is however no need for so many extraction-queries. One is enough:

    xidel -s "%fil%" -e "//a[contains(text(), 'x86')]/(x86.id:=extract(@onclick, '.\"(.*)\"', 1),x86.txt:=normalize-space(text())),//a[contains(text(), 'x64')]/(x64.id:=extract(@onclick, '.\"(.*)\"', 1),x64.txt:=normalize-space(text()))" --output-format=cmd
    
    xidel -s "%fil%" -e ^"^
      //a[contains(text(), 'x86')]/(^
        x86.id:=extract(@onclick, '.\"(.*)\"', 1),^
        x86.txt:=normalize-space(text())^
      ),^
      //a[contains(text(), 'x64')]/(^
        x64.id:=extract(@onclick, '.\"(.*)\"', 1),^
        x64.txt:=normalize-space(text())^
      )^
    " --output-format=cmd
    

    Or alternatively with let-variables:

    xidel -s "%fil%" -e "let $x86:=//a[contains(text(), 'x86')],$x64:=//a[contains(text(), 'x64')] return (x86.id:=$x86/extract(@onclick, '.\"(.*)\"', 1),x86.txt:=$x86/normalize-space(text()),x64.id:=$x64/extract(@onclick, '.\"(.*)\"', 1),x64.txt:=$x64/normalize-space(text()))" --output-format=cmd
    
    xidel -s "%fil%" -e ^"^
      let $x86:=//a[contains(text(), 'x86')],^
          $x64:=//a[contains(text(), 'x64')]^
      return (^
        x86.id:=$x86/extract(@onclick, '.\"(.*)\"', 1),^
        x86.txt:=$x86/normalize-space(text()),^
        x64.id:=$x64/extract(@onclick, '.\"(.*)\"', 1),^
        x64.txt:=$x64/normalize-space(text())^
      )^
    " --output-format=cmd
    

    As you didn't provide the input you're using, I couldn't test this.

     
  • Anonymous

    Anonymous - 2022-08-28

    Thanks, Reino, for your instant response on a Sunday afternoon.
    May I ask another question?
    I'm trying to use this in a FOR loop, and I can't find a way to get
    ... extract(@onclick, '.\"(.*)\"', 1), ... e.g. extract(@onclick, '.\^"(.*)\^"', 1),
    working because of (DOS) syntax errors.
    Is there a way to let XIDEL read its commands from a file?

    Thanks again.

     
  • Reino

    Reino - 2022-08-28

    Could you share the input/source you're processing as well as the exact cmd command and the error-message?

     
  • Anonymous

    Anonymous - 2022-08-28

    The input is:
    https://www.catalog.update.microsoft.com/Search.aspx?q=KB5016676%20windows%207%20-embedded

    The command is:

    for /f "tokens=*" %%l in (
            'xidel --output-format=cmd ^
            --extract "x86:=//a[contains(text(), 'x86')]" ^
            --extract "x64:=//a[contains(text(), 'x64')]" ^
            --extract "x86.id:=$x86/extract(@onclick, '\"(.*)\"', 1)" ^
            --extract "x86.txt:=$x86/normalize-space(text())" ^
            --extract "x64.id:=$x64/extract(@onclick, '\((.*)\)', 1)" ^
            --extract "x64.txt:=$x64/normalize-space(text())" ^
        "%fil%"') do %%l
    
     
  • Reino

    Reino - 2022-08-28

    Multiple things are going wrong here.

    • The only way to correctly export xidel variables with --output-format=cmd in a for-loop is by using "delims=" instead of "tokens=*". See https://stackoverflow.com/a/61972422. https://sourceforge.net/p/xidel/discussion/help/thread/3797db1dc8/ could also be informative.
    • Within a for-loop you shouldn't end a line with a ^.
    • Escape-character-hell! It's not about escaping the \", it's about (.*) between the second pair of double-quotes.
      In a for-loop the following characters need to be escaped: % & < > ^ | , ; = " ( ). Except when they're between a single pair of double-quotes.
    • And please put the input before the extraction-query.
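
    REM In a batch file each percent sign in the URL must be doubled, or cmd expands it as a script argument.
    SET "fil=https://www.catalog.update.microsoft.com/Search.aspx?q=KB5016676%%20windows%%207%%20-embedded"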
    
    FOR /F "delims=" %%A IN ('
      xidel -s "%fil%" --extract-exclude^=x86^,x64
      -e "x86:=//a[contains(text(),'x86')]"
      -e "x64:=//a[contains(text(),'x64')]"
      -e "x86.id:=$x86/extract(@onclick,'\"^(.*^)\"',1)"
      -e "x86.txt:=$x86/normalize-space(text())"
      -e "x64.id:=$x64/extract(@onclick,'\"^(.*^)\"',1)"
      -e "x64.txt:=$x64/normalize-space(text())"
      --output-format^=cmd
    ') DO %%A
    

    So --extract-exclude^=x86^,x64, --output-format^=cmd and ^(.*^) (between a second pair of double-quotes; -e "...\"(.*)\"...") needed to be escaped.
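
    To see why exactly those spots need a caret, the rule can be modelled outside cmd. The following Python sketch is only an approximation of cmd's parser, not a reimplementation of it: it assumes that every double-quote toggles a "quoted" state (cmd does not treat \" as an escape) and that the characters & < > ^ | , ; = ( ) get a caret when they are outside a quoted region. Percent signs are left out on purpose, and the function name is made up for this illustration.

```python
# Simplified model of the escaping rule for a FOR /F command string.
# Assumption: cmd flips its "inside quotes" state on every double-quote,
# including the ones in \" (the backslash means nothing to cmd).
SPECIALS = set("&<>^|,;=()")


def caret_escape_outside_quotes(command: str) -> str:
    """Prefix special characters with ^ when they are outside double-quotes."""
    out = []
    in_quotes = False
    for ch in command:
        if ch == '"':
            in_quotes = not in_quotes
            out.append(ch)
        elif not in_quotes and ch in SPECIALS:
            out.append("^" + ch)
        else:
            out.append(ch)
    return "".join(out)


# The command as it would be typed outside a for-loop, on one line.
plain = r'''xidel -s "%fil%" --extract-exclude=x86,x64 -e "x86.id:=$x86/extract(@onclick,'\"(.*)\"',1)" --output-format=cmd'''
escaped = caret_escape_outside_quotes(plain)
print(escaped)
```

    The \" after the opening single-quote closes the quoted region from cmd's point of view, which is why (.*) ends up outside the quotes and picks up carets, while the rest of the query stays untouched.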

    You can also use &quot;, the XQuery notation for a double-quote. Then there's no second pair of double-quotes and thus no need to escape (.*):

    FOR /F "delims=" %%A IN ('
      xidel -s "%fil%" --extract-exclude^=x86^,x64
      -e "x86:=//a[contains(text(),'x86')]"
      -e "x64:=//a[contains(text(),'x64')]"
      -e "x86.id:=$x86/extract(@onclick,'&quot;(.*)&quot;',1)"
      -e "x86.txt:=$x86/normalize-space(text())"
      -e "x64.id:=$x64/extract(@onclick,'&quot;(.*)&quot;',1)"
      -e "x64.txt:=$x64/normalize-space(text())"
      --output-format^=cmd
    ') DO %%A
    

    For the other command I posted, where the extraction-query is 'prettified' and thus spans multiple lines, all of the above-mentioned characters need to be escaped:

    FOR /F "delims=" %%A IN ('
      xidel -s "%fil%" -e ^"
        //a[contains^(text^(^)^,'x86'^)]/^(
          x86.id:^=extract^(@onclick^,'\^"^(.*^)\^"'^,1^)^,
          x86.txt:^=normalize-space^(text^(^)^)
        ^)^,
        //a[contains^(text^(^)^,'x64'^)]/^(
          x64.id:^=extract^(@onclick^,'\^"^(.*^)\^"'^,1^)^,
          x64.txt:^=normalize-space^(text^(^)^)
        ^)
      ^" --output-format^=cmd
    ') DO %%A
    

    A minified command is a lot easier though:

    FOR /F "delims=" %%A IN ('xidel -s "%fil%" -e "//a[contains(text(),'x86')]/(x86.id:=extract(@onclick,'\"^(.*^)\"',1),x86.txt:=normalize-space(text())),//a[contains(text(),'x64')]/(x64.id:=extract(@onclick,'\"^(.*^)\"',1),x64.txt:=normalize-space(text()))" --output-format^=cmd') DO %%A
    

    Is there a way to let XIDEL read its commands from a file?

    Yes. 'query.xq':

    //a[contains(text(),'x86')]/(
      x86.id:=extract(@onclick,'"(.*)"',1),
      x86.txt:=normalize-space(text())
    ),
    //a[contains(text(),'x64')]/(
      x64.id:=extract(@onclick,'"(.*)"',1),
      x64.txt:=normalize-space(text())
    )
    

    or

    //a[contains(text(),"x86")]/(
      x86.id:=extract(@onclick,"""(.*)""",1),
      x86.txt:=normalize-space(text())
    ),
    //a[contains(text(),"x64")]/(
      x64.id:=extract(@onclick,"""(.*)""",1),
      x64.txt:=normalize-space(text())
    )
    

    or

    //a[contains(text(),"x86")]/(
      x86.id:=extract(@onclick,"&quot;(.*)&quot;",1),
      x86.txt:=normalize-space(text())
    ),
    //a[contains(text(),"x64")]/(
      x64.id:=extract(@onclick,"&quot;(.*)&quot;",1),
      x64.txt:=normalize-space(text())
    )
    

    And then:

    FOR /F "delims=" %%A IN ('xidel -s "%fil%" --extract-file^='query.xq' --output-format^=cmd') DO %%A
    

    or

    FOR /F "delims=" %%A IN ('xidel -s "%fil%" -e @query.xq --output-format^=cmd') DO %%A
    

    And as you can see, in this case it doesn't matter whether you use Windows- or Linux-style quoting in the extraction file.
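
    The three query.xq variants differ only in how the double-quote is written inside the string literal. As a rough check, assuming the usual XQuery string-literal rules (a doubled delimiter stands for one delimiter, and &quot; is a predefined entity reference), this small Python sketch decodes all three forms to the same regex. The decoder is a hypothetical helper written for this illustration and is not part of xidel.

```python
# Minimal decoder for the string-literal forms used in the query.xq variants.
# &amp; is replaced last so that "&amp;quot;" stays a literal "&quot;".
ENTITIES = [("&quot;", '"'), ("&apos;", "'"), ("&lt;", "<"), ("&gt;", ">"), ("&amp;", "&")]


def decode_xquery_literal(literal: str) -> str:
    """Return the string value of a quoted literal such as '"(.*)"'."""
    if len(literal) < 2 or literal[0] not in "'\"" or literal[-1] != literal[0]:
        raise ValueError("not a quoted string literal")
    delim = literal[0]
    # A doubled delimiter inside the literal stands for a single one.
    body = literal[1:-1].replace(delim * 2, delim)
    for entity, char in ENTITIES:
        body = body.replace(entity, char)
    return body


variants = [
    "'\"(.*)\"'",           # single-quoted literal:  '"(.*)"'
    '"""(.*)"""',           # doubled double-quotes:  """(.*)"""
    '"&quot;(.*)&quot;"',   # entity references:      "&quot;(.*)&quot;"
]
decoded = [decode_xquery_literal(v) for v in variants]
print(decoded)
```

    All three literals come out as the same five-character-plus-quotes pattern, which is why the choice between them is purely a matter of which one is least painful to quote.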

     
  • Anonymous

    Anonymous - 2022-08-29

    What an enlightenment! Every line! ... and so many things I should have known.
    Reino, thank you for this late-night lesson. I won't forget this one.

     
  • Reino

    Reino - 2022-08-29

    No problem.
    One last thing I forgot to mention: substring-before(@id,'_link') is another option instead of extract(@onclick,'\"(.*)\"',1). Probably easier because there are no double-quotes to deal with.
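
    The two approaches can be compared outside xidel. The Python sketch below uses made-up attribute values (the page's actual markup isn't quoted in this thread, so the onclick text and the id are assumptions) and two small helpers that imitate extract() with a capture group and XPath's substring-before().

```python
import re

# Hypothetical attribute values, shaped like the ones discussed above.
onclick = 'return goToDetails("1234-abcd");'
link_id = "1234-abcd_link"


def extract_group(text: str, pattern: str, group: int) -> str:
    """Return the given capture group of the first match, or '' if none."""
    match = re.search(pattern, text)
    return match.group(group) if match else ""


def substring_before(text: str, separator: str) -> str:
    """Like XPath substring-before(): '' when the separator is absent."""
    head, found, _ = text.partition(separator)
    return head if found else ""


by_regex = extract_group(onclick, r'"(.*)"', 1)
by_prefix = substring_before(link_id, "_link")
print(by_regex, by_prefix)
```

    With these sample values both routes yield the same identifier, and only the regex route drags double-quotes into the command line.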

     
  • Anonymous

    Anonymous - 2022-08-29

    I used @onclick because I thought @onclick references the 'true' UUID.
    When I realized that the UUID is used consistently all over the site, I switched to @id in the way you mentioned.
    ... since " double-quotes are no problem anymore :-)

    Thank you very much, Reino

     
  • Anonymous

    Anonymous - 2022-08-29

    ... since &quot; double-quotes are no problem anymore :-)

     
