Fooby - 2014-07-07

I originally posted this script in my review but markdown formatting is apparently not supported in a review so I am posting it here.

As I mentioned in my review, I have been looking for a CLI alternative to DownThemAll for Firefox because DTA cripples Firefox and slows the entire system down on my pretty fast 8-core Mac Pro. DTA also struggles to maintain a difficult connection, and I have plenty of those. aria2 handles these with ease and does not in any way slow my system down. It just sits there in the background downloading away at max speed. I tried wget and curl, but they are single-threaded and very slow on the files I am downloading. I tried puf, but it didn't work at all with HTTPS. I tried axel, which worked with HTTPS, BUT tcpdump showed that it had somehow switched over to HTTP when I wasn't looking, which really pissed me off (can I say that?). Most people would probably not have even checked and been lulled into a false sense of security--like what happened with Heartbleed. Bad, bad axel, misleading us like that… aria2 not only worked the first time, right out of the box, genuinely handling HTTPS traffic flawlessly (believe me, I checked…), but it also downloaded files at record speeds--my full allotted bandwidth. Needless to say, "Goodbye DTA…", "Sayonara axel…".

I used the following syntax to get these speeds:

aria2c --file-allocation=none -c -x 10 -s 10 -d "mydir" URL


Note that aria2c is the name of the binary Homebrew installs for the aria2 package--don't ask me why, I don't know…
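
If you don't already have it, installing aria2 through Homebrew is a one-liner (assuming Homebrew itself is set up):

brew install aria2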

--file-allocation=none speeds up the initialization of the download, which can otherwise take quite a long time for a multi-GB file because aria2 preallocates the full file size on disk by default.

-c allows continuation of a download if it was incomplete the first time. This came in really handy when, for some reason, the speed started flagging and I Ctrl-C'd out of the download and restarted it. It resumed right where it left off, at max speed. Nice.
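
In other words, if a transfer dies or you kill it, you can simply rerun the exact same command; aria2 keeps a small .aria2 control file next to the partial download and -c resumes from it (the URL below is just a placeholder):

# first run, interrupted part-way through with Ctrl-C
aria2c --file-allocation=none -c -x 10 -s 10 -d "mydir" "https://example.com/big-file.iso"

# identical second run resumes from the .aria2 control file instead of starting over
aria2c --file-allocation=none -c -x 10 -s 10 -d "mydir" "https://example.com/big-file.iso"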

-x 10 and -s 10 speed things along: -x 10 allows up to 10 connections per server and -s 10 splits the download into up to 10 segments. I suspect the -s 10 is unnecessary but I prefer to err on the side of overkill.

-d sets the directory the files are downloaded to.


I wrote a script that reads directory names and URLs from a text file, automatically creates the directories, and downloads the files into them--similar to the way DTA works, only much faster/better ;).

aria2files.sh:

#!/bin/bash

filename="$1" # get filename from command line argument

currdir="." # default to the PWD until a directory line is encountered

while read -r line
do
    if [ "$line" ] # skip blank lines
    then
        if [[ "$line" =~ (https?|ftp):// ]] # line contains a URL, download file
        then
            echo "URL: '$line'"
            aria2c --file-allocation=none -c -x 10 -s 10 -d "$currdir" "$line"
        else # line contains a directory name, create directory if not already present
            echo "Directory: '$line'"
            currdir="$line"
            if [ ! -d "$currdir" ]
            then
                mkdir -p "$currdir" # '-p' enables creation of nested directories in one command
            fi
        fi
    fi
done < "$filename"


The regex will detect HTTP(S) and FTP URLs. Note that if [ "$line" ] tests for a non-empty line, which is what skips the blanks; it is the same as the more explicit if [ -n "$line" ], and both work in OS X bash as well as on other *NIX systems ([ -z "$line" ] tests the opposite, an empty line, so it is not a drop-in replacement here).
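
If you want to sanity-check the pattern on its own, a quick throwaway test at the prompt (with made-up sample lines) looks like this:

for t in "https://example.com/a.zip" "ftp://example.org/b.tar.gz" "some/target directory"
do
    if [[ "$t" =~ (https?|ftp):// ]]
    then
        echo "URL: $t"
    else
        echo "Directory: $t"
    fi
done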

The text file has the format:

files.txt:

directory 1
url1
url2
…
directory 2/subdirectory/sub-subdirectory/…
url3
url4
…
…
…
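
For example, a completely made-up files.txt might read:

isos
https://example.com/distro-a.iso
https://example.com/distro-b.iso
docs/manuals/pdf
https://example.com/guide.pdf
ftp://example.org/legacy-guide.pdf


which would put the first two files in ./isos and the last two in ./docs/manuals/pdf.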


The script reads the filename from the command line:

aria2files.sh files.txt


files.txt is in the PWD, and listed directories are created as subdirectories of the PWD. Notice that you can list nested directories on one line and the entire hierarchy will be created. There is no checking done, so if, for example, the first non-empty line of files.txt is not a directory name but a URL, that file will be saved to the PWD, and subsequent URLs will do the same until a directory name is encountered. If the script hasn't finished yet, you can keep adding directories/URLs to the bottom of the text file and saving it.

I put the script in

/usr/local/bin


so it is in my PATH.
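
If you do the same, remember the script also has to be executable; something along these lines works (assuming you saved it as aria2files.sh in the current directory and /usr/local/bin is writable by your user, as it usually is on a Homebrew setup):

chmod +x aria2files.sh
cp aria2files.sh /usr/local/bin/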

 

Last edit: Fooby 2014-07-08