Avoid output of 'temporary' variables

Help
Anonymous
2022-08-28
2022-08-29
  • Anonymous

    Anonymous - 2022-08-28

    I'm working on

    xidel --silent --output-format=cmd ^
            --extract "x86:=//a[contains(text(), 'x86')]" ^
            --extract "x64:=//a[contains(text(), 'x64')]" ^
            --extract "x86.id:=$x86/extract(@onclick, '.\"(.*)\"', 1)" ^
            --extract "x86.txt:=$x86/normalize-space(text())" ^
            --extract "x64.id:=$x64/extract(@onclick, '.\"(.*)\"', 1)" ^
            --extract "x64.txt:=$x64/normalize-space(text())" ^
        "%fil%"
    
    which works so far ....
    
    How can I keep the 'temporary' variables x86 and x64 out of the output?
    
    thx for your attention & thx for your time
    
     
  • Reino

    Reino - 2022-08-28

    You could use --extract-exclude:

    xidel -s "%fil%" --extract-exclude=x86,x64 ^
          -e "x86:=//a[contains(text(), 'x86')]" ^
          -e "x64:=//a[contains(text(), 'x64')]" ^
          -e "x86.id:=$x86/extract(@onclick, '.\"(.*)\"', 1)" ^
          -e "x86.txt:=$x86/normalize-space(text())" ^
          -e "x64.id:=$x64/extract(@onclick, '.\"(.*)\"', 1)" ^
          -e "x64.txt:=$x64/normalize-space(text())" ^
          --output-format=cmd
    

    There is however no need for so many extraction-queries. One is enough:

    xidel -s "%fil%" -e "//a[contains(text(), 'x86')]/(x86.id:=extract(@onclick, '.\"(.*)\"', 1),x86.txt:=normalize-space(text())),//a[contains(text(), 'x64')]/(x64.id:=extract(@onclick, '.\"(.*)\"', 1),x64.txt:=normalize-space(text()))" --output-format=cmd
    
    xidel -s "%fil%" -e ^"^
      //a[contains(text(), 'x86')]/(^
        x86.id:=extract(@onclick, '.\"(.*)\"', 1),^
        x86.txt:=normalize-space(text())^
      ),^
      //a[contains(text(), 'x64')]/(^
        x64.id:=extract(@onclick, '.\"(.*)\"', 1),^
        x64.txt:=normalize-space(text())^
      )^
    " --output-format=cmd
    

    Or alternatively with let-variables:

    xidel -s "%fil%" -e "let $x86:=//a[contains(text(), 'x86')],$x64:=//a[contains(text(), 'x64')] return (x86.id:=$x86/extract(@onclick, '.\"(.*)\"', 1),x86.txt:=$x86/normalize-space(text()),x64.id:=$x64/extract(@onclick, '.\"(.*)\"', 1),x64.txt:=$x64/normalize-space(text()))" --output-format=cmd
    
    xidel -s "%fil%" -e ^"^
      let $x86:=//a[contains(text(), 'x86')],^
          $x64:=//a[contains(text(), 'x64')]^
      return (^
        x86.id:=$x86/extract(@onclick, '.\"(.*)\"', 1),^
        x86.txt:=$x86/normalize-space(text()),^
        x64.id:=$x64/extract(@onclick, '.\"(.*)\"', 1),^
        x64.txt:=$x64/normalize-space(text())^
      )^
    " --output-format=cmd
    

    As you didn't provide the input you're using, I couldn't test this.

     
  • Anonymous

    Anonymous - 2022-08-28

    thx Reino for your instant response on a Sunday afternoon.
    May I ask another question?
    I'm trying to use this in a FOR-loop, and I can't find a way to get
    ... extract(@onclick, '.\"(.*)\"', 1), ... e.g. extract(@onclick, '.\^"(.*)\^"', 1),
    working because of (DOS) syntax errors.
    Is there a way to let XIDEL read its commands from a file?

    thx again

     
  • Reino

    Reino - 2022-08-28

    Could you share the input/source you're processing as well as the exact cmd command and the error-message?

     
  • Anonymous

    Anonymous - 2022-08-28

    The input is:
    https://www.catalog.update.microsoft.com/Search.aspx?q=KB5016676%20windows%207%20-embedded

    The command is:

    for /f "tokens=*" %%l in (
            'xidel --output-format=cmd ^
            --extract "x86:=//a[contains(text(), 'x86')]" ^
            --extract "x64:=//a[contains(text(), 'x64')]" ^
            --extract "x86.id:=$x86/extract(@onclick, '\"(.*)\"', 1)" ^
            --extract "x86.txt:=$x86/normalize-space(text())" ^
            --extract "x64.id:=$x64/extract(@onclick, '\((.*)\)', 1)" ^
            --extract "x64.txt:=$x64/normalize-space(text())" ^
        "%fil%"') do %%l
    
     
  • Reino

    Reino - 2022-08-28

    Multiple things are going wrong here.

    • The only way to correctly export xidel variables with --output-format=cmd in a for-loop is by using "delims=" instead of "tokens=*". See https://stackoverflow.com/a/61972422. https://sourceforge.net/p/xidel/discussion/help/thread/3797db1dc8/ could also be informative.
    • Within a for-loop you shouldn't end a line with a ^.
    • Escape-character-hell! It's not about escaping the \", it's about (.*) between the second pair of double-quotes.
      In a for-loop the following characters need to be escaped: % & < > ^ | , ; = " ( ). Except when they're between a single pair of double-quotes.
    • And please put the input before the extraction-query.
    SET "fil=https://www.catalog.update.microsoft.com/Search.aspx?q=KB5016676%20windows%207%20-embedded"
    
    FOR /F "delims=" %%A IN ('
      xidel -s "%fil%" --extract-exclude^=x86^,x64
      -e "x86:=//a[contains(text(),'x86')]"
      -e "x64:=//a[contains(text(),'x64')]"
      -e "x86.id:=$x86/extract(@onclick,'\"^(.*^)\"',1)"
      -e "x86.txt:=$x86/normalize-space(text())"
      -e "x64.id:=$x64/extract(@onclick,'\"^(.*^)\"',1)"
      -e "x64.txt:=$x64/normalize-space(text())"
      --output-format^=cmd
    ') DO %%A
    

    So --extract-exclude^=x86^,x64, --output-format^=cmd and ^(.*^) (between a second pair of double-quotes; -e "...\"(.*)\"...") needed to be escaped.

    You can also use &quot;, the XQuery notation for a double-quote. Then there's no second pair of double-quotes and thus no need to escape (.*):

    FOR /F "delims=" %%A IN ('
      xidel -s "%fil%" --extract-exclude^=x86^,x64
      -e "x86:=//a[contains(text(),'x86')]"
      -e "x64:=//a[contains(text(),'x64')]"
      -e "x86.id:=$x86/extract(@onclick,'&quot;(.*)&quot;',1)"
      -e "x86.txt:=$x86/normalize-space(text())"
      -e "x64.id:=$x64/extract(@onclick,'&quot;(.*)&quot;',1)"
      -e "x64.txt:=$x64/normalize-space(text())"
      --output-format^=cmd
    ') DO %%A
    

    For the other command I posted, where the extraction-query is 'prettified' and thus spans multiple lines, all of the above-mentioned characters need to be escaped:

    FOR /F "delims=" %%A IN ('
      xidel -s "%fil%" -e ^"
        //a[contains^(text^(^)^,'x86'^)]/^(
          x86.id:^=extract^(@onclick^,'\^"^(.*^)\^"'^,1^)^,
          x86.txt:^=normalize-space^(text^(^)^)
        ^)^,
        //a[contains^(text^(^)^,'x64'^)]/^(
          x64.id:^=extract^(@onclick^,'\^"^(.*^)\^"'^,1^)^,
          x64.txt:^=normalize-space^(text^(^)^)
        ^)
      ^" --output-format^=cmd
    ') DO %%A
    

    A minified command is a lot easier though:

    FOR /F "delims=" %%A IN ('xidel -s "%fil%" -e "//a[contains(text(),'x86')]/(x86.id:=extract(@onclick,'\"^(.*^)\"',1),x86.txt:=normalize-space(text())),//a[contains(text(),'x64')]/(x64.id:=extract(@onclick,'\"^(.*^)\"',1),x64.txt:=normalize-space(text()))" --output-format^=cmd') DO %%A
    

    Is there a way to let XIDEL read its commands from a file?

    Yes. 'query.xq':

    //a[contains(text(),'x86')]/(
      x86.id:=extract(@onclick,'"(.*)"',1),
      x86.txt:=normalize-space(text())
    ),
    //a[contains(text(),'x64')]/(
      x64.id:=extract(@onclick,'"(.*)"',1),
      x64.txt:=normalize-space(text())
    )
    

    or

    //a[contains(text(),"x86")]/(
      x86.id:=extract(@onclick,"""(.*)""",1),
      x86.txt:=normalize-space(text())
    ),
    //a[contains(text(),"x64")]/(
      x64.id:=extract(@onclick,"""(.*)""",1),
      x64.txt:=normalize-space(text())
    )
    

    or

    //a[contains(text(),"x86")]/(
      x86.id:=extract(@onclick,"&quot;(.*)&quot;",1),
      x86.txt:=normalize-space(text())
    ),
    //a[contains(text(),"x64")]/(
      x64.id:=extract(@onclick,"&quot;(.*)&quot;",1),
      x64.txt:=normalize-space(text())
    )
    

    And then:

    FOR /F "delims=" %%A IN ('xidel -s "%fil%" --extract-file^=query.xq --output-format^=cmd') DO %%A
    

    or

    FOR /F "delims=" %%A IN ('xidel -s "%fil%" -e @query.xq --output-format^=cmd') DO %%A
    

    And as you can see, in this case it doesn't matter whether you use Windows- or Linux-style quoting in the extraction-file.

     
  • Anonymous

    Anonymous - 2022-08-29

    What an enlightenment! Every line! ... and so many things I should have known.
    Reino, thank you for this late-night lesson. I won't forget this one.

     
  • Reino

    Reino - 2022-08-29

    No problem.
    One last thing I forgot to mention: substring-before(@id,'_link') is another option instead of extract(@onclick,'\"(.*)\"',1). Probably easier because there are no double-quotes to deal with.
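
    As a minimal, untested sketch of that alternative, 'query.xq' could look like the following. It assumes that every matching link carries an @id made up of the UUID followed by '_link':

    ```xquery
    (: Assumption: @id has the form <UUID>_link, so everything before '_link' is the id. :)
    //a[contains(text(),'x86')]/(
      x86.id:=substring-before(@id,'_link'),
      x86.txt:=normalize-space(text())
    ),
    //a[contains(text(),'x64')]/(
      x64.id:=substring-before(@id,'_link'),
      x64.txt:=normalize-space(text())
    )
    ```

    Because this query contains no double-quotes, it should also fit inside a single pair of double-quotes when passed inline with -e in the for-loop, so the parentheses would not need escaping there.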

     
  • Anonymous

    Anonymous - 2022-08-29

    I used @onclick because I thought @onclick references the 'true' UUID.
    When I realized that the UUID is used consistently all over the site, I switched to @id in the way you mentioned.
    ... since " double-quotes are no problem anymore :-)

    Thank you very much, Reino

     
  • Anonymous

    Anonymous - 2022-08-29

    ... since &quot; double-quotes are no problem anymore :-)

     
