Hello, I'm new to Xidel and I'm learning how to use it through a lot of trial and error.
So far I've managed to script update downloads from SourceForge, GitHub and "normal" websites, but now I'm facing a new challenge and I could use some advice from expert users.
As an example I took the latest TeamViewer APK.
Package base url is: https://www.apkmirror.com/apk/teamviewer/teamviewer-host/
To get to the download link I needed to call xidel three times:
To get latest release url: https://www.apkmirror.com/apk/teamviewer/teamviewer-host/teamviewer-host-15-43-203-release/
To get download url: https://www.apkmirror.com/apk/teamviewer/teamviewer-host/teamviewer-host-15-43-203-release/teamviewer-host-15-43-203-android-apk-download/
To get download link: https://www.apkmirror.com/apk/teamviewer/teamviewer-host/teamviewer-host-15-43-203-release/teamviewer-host-15-43-203-android-apk-download/download/?key=1d371836842c115a39fe79d1413e29a5ea3619bb
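The three URLs above nest inside each other, so each step can be seen as resolving a relative href against the page it was found on. The following is only a sketch in Python (not Xidel) of that resolution; the relative hrefs are assumptions derived from the URLs listed above, not values read from the actual pages.

```python
from urllib.parse import urljoin

# Package base URL, as given above.
BASE = "https://www.apkmirror.com/apk/teamviewer/teamviewer-host/"

# Assumed hrefs (hypothetical): a site-relative one and a page-relative one.
release_href = "/apk/teamviewer/teamviewer-host/teamviewer-host-15-43-203-release/"
download_href = "teamviewer-host-15-43-203-android-apk-download/"

# A href starting with "/" replaces the whole path of the base URL.
release_url = urljoin(BASE, release_href)

# A href without a leading "/" is appended to the directory of the base URL.
download_url = urljoin(release_url, download_href)

print(release_url)
print(download_url)
```

Resolving against the page URL rather than gluing a host string in front also handles hrefs that are already absolute, since urljoin then returns them unchanged.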
I managed to get the download link, but when I then try to use wget (as I usually do) the answer is "ERROR 403: Forbidden". I know there's a way to download a file from within xidel, but I've always had a hard time using it.
Could anybody please point me to the correct way of doing all this? Besides the download issue, are there shorter/cleaner ways to get to the result? I'd also like to know if there are tutorials for inexperienced users like me; I couldn't find any, except for some examples here and there...
Thanks and have a nice day,
V.
Here is the script I'm currently using:
Last edit: Virgus 2023-07-25
Anonymous - 2023-07-26
I finally found a way: wget was just missing the --user-agent parameter, and further steps were necessary to get both the final URL and a meaningful file name.
I didn't manage to avoid the first wget, which is used to get the file download "proxy" page. I wish I could do it via Xidel and avoid creating a temporary file, but I failed to get xidel to fetch the content of "https://www.apkmirror.com/apk/teamviewer/teamviewer-host/teamviewer-host-15-43-203-release/teamviewer-host-15-43-203-android-apk-download/download/?key=1d371836842c115a39fe79d1413e29a5ea3619bb"
What would be the correct syntax for xidel to parse this page?
Here is the corrected and completed script; please advise if it could be simplified in any way.
Thanks.
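To illustrate the User-Agent point in general terms: some servers answer 403 when a request does not identify itself as a browser. The sketch below is Python, not wget or Xidel, and only builds the request without sending it; the URL and the User-Agent string are placeholders, not values from the script.

```python
import urllib.request

# Hypothetical download URL and User-Agent string, for illustration only.
URL = "https://www.example.com/files/app.apk"
USER_AGENT = "Mozilla/5.0 (Windows NT 10.0; Win64; x64)"


def build_request(url: str, user_agent: str) -> urllib.request.Request:
    """Return a request that announces itself with the given User-Agent."""
    return urllib.request.Request(url, headers={"User-Agent": user_agent})


request = build_request(URL, USER_AGENT)

# urllib stores header names capitalized as "User-agent".
print(request.get_header("User-agent"))

# Sending it would look like this (not executed here):
# with urllib.request.urlopen(request) as response, open("app.apk", "wb") as out:
#     out.write(response.read())
```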
Hello Virgus,
Have you actually removed (or commented) @ECHO OFF to debug your script? The first extraction...
...doesn't only extract the first "apk-release-url". It extracts...
...of them. And because of the FOR-loop only the last one will be assigned to %DLHREF%.
If you only want the first one, use -e "(//a[@class='downloadLink'])[1]/@href".
The same goes for the "apk-dl-url".
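The difference between "all matches, last one wins" and "first match only" can be sketched outside of Xidel. The Python example below uses a made-up, well-formed snippet (not the real page) and the standard library's limited XPath support, which has no equivalent of the parenthesized (...)[1] form, so the first match is taken by indexing the result list instead.

```python
import xml.etree.ElementTree as ET

# Hypothetical, well-formed stand-in for a page listing several releases.
PAGE = """
<html><body>
  <div><a class="downloadLink" href="/apk/example/release-3/">3</a></div>
  <div><a class="downloadLink" href="/apk/example/release-2/">2</a></div>
  <div><a class="downloadLink" href="/apk/example/release-1/">1</a></div>
</body></html>
"""

root = ET.fromstring(PAGE.strip())

# Counterpart of //a[@class='downloadLink']/@href: every match, in document order.
all_hrefs = [a.get("href") for a in root.iterfind(".//a[@class='downloadLink']")]

# What a loop over all results does: each iteration overwrites the variable,
# so only the last value is left at the end.
dlhref = None
for href in all_hrefs:
    dlhref = href

# Counterpart of (//a[@class='downloadLink'])[1]/@href: only the first match.
first_href = all_hrefs[0]

print(dlhref)
print(first_href)
```

The parentheses matter in XPath: without them, [1] is applied per parent element, and in markup like the above every link is the first one inside its own div.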
"Could anybody please point me to the correct way of doing all this ? Besides the download issue are there shorter/cleaner ways to get to the result ?"
Yes, there is. In fact, all this can be done with just 1 xidel call:
There's no need for multiple xidel calls, because with -f/--follow you can open, download, or "follow", other urls. %DLHOST% isn't necessary either, because "following" relative urls (with -f/--follow)...
...xidel automatically puts $host in front.
With xidel <url> -f ... -e ... you're opening/downloading/"following" an url in memory and extracting something to stdout. With xidel <url> -f ... --download . you're actually downloading/writing it to disk (to the current dir in this case). So there's no need for wget.
And without -s, as you can see above, you can see some interesting information about what's happening (status information).
A final advice. Prettify an HTML-source first before examining it to come up with a suitable XPath-query:
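As a rough illustration of why prettifying helps, here is a sketch in Python rather than Xidel. It re-indents a made-up, well-formed one-line snippet so the nesting of the link becomes visible; real-world HTML is often not well-formed XML, in which case this particular parser would reject it.

```python
from xml.dom import minidom

# Made-up, well-formed one-line markup standing in for a minified page source.
RAW = '<html><body><div><a class="downloadLink" href="/x/">Download</a></div></body></html>'


def prettify(markup: str) -> str:
    """Re-indent well-formed markup so that each element starts on its own line."""
    return minidom.parseString(markup).toprettyxml(indent="  ")


pretty = prettify(RAW)
print(pretty)
```

With the structure laid out line by line, it is much easier to see which element and attribute an XPath query has to target.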
For Xidel stuff, see https://github.com/benibela/xidel/issues/67#issuecomment-770084663.
For XPath/XQuery stuff, see https://github.com/benibela/xidel/issues/106#issuecomment-1627386429.
An extensive wiki is still on my to-do-list.
Last edit: Reino 2023-07-27
Thank you so much for your detailed replies!
I was just passing by, and I'm looking forward to reading all your info attentively as soon as possible.
Thanks for taking the time to reply to me, and talk to you soon, V.