#37 wget doesn't support incapsula protected feeds

v1.0_(example)
open
None
5
2015-09-27
2015-09-19
No

Background:
I've been using podget for a couple of years; it's the best application I can find to match my requirements :)

I subscribe to a premium paid-for podcast feed whose host server has recently been put behind Incapsula, and this has broken podget. The feed is served from an http:// server that requires username/password authentication.
Incapsula unfortunately breaks podget because the data downloaded by wget is corrupted. I believe this is because wget does not support the Incapsula JavaScript.

My podget-0.7.6 installation runs on an armv7l system under Debian wheezy.

Trial solution:
"aria2c" (from the aria2 package) is a tool similar to wget that appears to support downloads protected by Incapsula.
I've written a simple wrapper script "wget2ariac2" which converts calls to wget, and their parameters, to the equivalent aria2c parameters.
When wget option "-O -" output to stdout is used, the wrapper script writes to a temporary file, and then uses "cat" to send the temporary file to stdout.
The following wget parameters are supported and converted to aria2c parameters:
-O destination
--user=username
--password=password
http://feed.url
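
For readers without the attachments, the translation described above could look roughly like the sketch below. This is not the attached "wget2ariac2" script: the function names translate_wget_args and wget_via_aria2c, the variables ARIA_ARGS and WGET_OUT, and the temporary file name "feed" are all made up for illustration. It assumes bash and that aria2c is on the PATH, and it handles only the four parameters listed above.

```shell
#!/usr/bin/env bash
# Hypothetical sketch only; NOT the attached wget2ariac2 script.

# Translate the supported wget arguments into aria2c arguments.
# Results: ARIA_ARGS (array of aria2c arguments) and WGET_OUT
# ("-" for stdout, a file path, or empty if -O was not given).
translate_wget_args() {
    ARIA_ARGS=()
    WGET_OUT=""
    while [ "$#" -gt 0 ]; do
        case "$1" in
            -O)
                [ "$#" -ge 2 ] || { echo "-O needs a destination" >&2; return 1; }
                WGET_OUT="$2"
                shift
                ;;
            --user=*)     ARIA_ARGS+=("--http-user=${1#--user=}") ;;
            --password=*) ARIA_ARGS+=("--http-passwd=${1#--password=}") ;;
            http://*|https://*) ARIA_ARGS+=("$1") ;;
            *)
                echo "unsupported wget option: $1" >&2
                return 1
                ;;
        esac
        shift
    done
}

# Run aria2c in place of wget for the supported arguments.
wget_via_aria2c() {
    local tmpdir status
    translate_wget_args "$@" || return 1
    if [ "$WGET_OUT" = "-" ]; then
        # aria2c writes to a file, so download into a temporary
        # directory and then cat the result to stdout.
        tmpdir=$(mktemp -d) || return 1
        aria2c --quiet=true --dir="$tmpdir" --out=feed "${ARIA_ARGS[@]}" \
            && cat "$tmpdir/feed"
        status=$?
        rm -rf "$tmpdir"
        return "$status"
    elif [ -n "$WGET_OUT" ]; then
        # --out is relative to --dir, so split the destination path.
        # --allow-overwrite mimics wget -O replacing an existing file.
        aria2c --quiet=true --allow-overwrite=true \
            --dir="$(dirname "$WGET_OUT")" --out="$(basename "$WGET_OUT")" \
            "${ARIA_ARGS[@]}"
    else
        aria2c --quiet=true "${ARIA_ARGS[@]}"
    fi
}
```

To use it as a drop-in script, one would add a final line calling wget_via_aria2c "$@". Any wget option outside the list above is rejected rather than silently ignored, so an unsupported call in podget shows up as an error instead of a bad download.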

I've then replaced all calls to wget in the podget script with calls to my wrapper script.
Modified "podget" and wrapper "wget2ariac2" are attached for your reference.

There's probably a much better method to achieve this, and I'm sure my coding isn't perfect. ;) I've tried this solution just to get my downloads working again.

Cheers,
Chris.

2 Attachments

Discussion

  • Chris Cartwright

    attached is an example of the incapsula protected feed data returned by "wget"

     
  • Chris Cartwright

    modified podget script

     
  • Dave Vehrs

    Dave Vehrs - 2015-09-27

    First, let me start by saying nothing flatters me more than seeing something I started being used by someone else to achieve a goal I never imagined. Thank you and kudos to you for finding a solution to a new problem using an old tool.

    Now, while your wrapper script and podget_modified prove that the modifications can work, I'm hesitant to integrate them into podget for a couple of reasons:
    1. This was literally the first time someone suggested Aria. Honestly, I'd never heard of it before. On the other hand, Wget is well established and commonly installed. Using Wget just makes Podget a little simpler. In the future we will probably have to change, but I prefer 'well tested' over 'new' for most core tools.
    2. I'm concerned about how Aria's multiple threads could hammer some podcast providers' servers. While a faster download could be advantageous to one user, what impact will it have on popular feeds? And given that Podget was designed to be run as a scheduled job rather than directly by the user, do faster download speeds matter much?

    On the other side of the case, you've pointed out that Aria can successfully download some things that Wget can't.

    You've given me a lot to think about. Perhaps it is approaching the time when we should consider a "Podget Next Generation". That would allow us to keep the current version as a stable alternative for some users and create a newer version that lets others see the benefits of using Aria. I'm a little hesitant to do so for the simple reason that when I started podget, I never imagined I would still have people using a little script that I created. It can be a little intimidating to consider that you may still be supporting something you release today a decade from now.

     
