#87 wget option --ignore-length

Status: open
Owner: nobody
Labels: None
Priority: 5
Updated: 2012-09-28
Created: 2008-03-12
Creator: Anonymous
Private: No

I came across a web server which returns a wrong "Content-Length" when I connect in multi-thread mode. It seems that an option like wget's "--ignore-length" would solve the problem. I am not sure if it is the best way to do it, but anyway, here is my patch: http://rapidshare.com/files/99038922/aria2c-0.13.0_1.ign_len.diff

Discussion

  • tujikawa

    tujikawa - 2008-03-13

    Logged In: YES
    user_id=1450148
    Originator: NO

    Many thanks for the patch! I'll review it.
    Many servers that don't support large files (>2GB) seem to send a wrong Content-Length.
    I think this option is good for single-source downloads.
    The problem is multi-source downloads, where I think strict range and Content-Length checking are needed.
    BTW, did the web server also return a "bogus" Content-Range header?

     
  • Nobody/Anonymous

    Logged In: NO

    This patch actually works quite well in multi-thread mode (with one source). I am not sure about multi-source, but it probably should not make any difference? That server returns zeros for all threads except the first one (which starts at byte 0). And the size of the file in question was well below 2GB.

    I actually have another question: is it true that the only thing that restricts the number of download threads to 5 is "handlers.push_back(new NumberOptionHandler(PREF_SPLIT, 1, 5))" in OptionHandlerFactory.cc? I changed it to 20, and it seemed to work.

     
  • tujikawa

    tujikawa - 2008-03-14

    Logged In: YES
    user_id=1450148
    Originator: NO

    If multi-thread works well, multi-source should work too. There is no difference other than multiple connections to one host vs. a single connection to each of multiple hosts.
    Could you give me a sample of the HTTP response the bogus server returned? I want to know the actual response headers.

    As for the number of threads, yes.

     
  • Nobody/Anonymous

    Logged In: NO

    Hm... I have now sniffed the actual traffic with tcpdump, and the remote server's response seems to look OK:


    HTTP/1.1 206 Partial Content
    Date: Fri, 14 Mar 2008 19:13:33 GMT
    Server: Apache/1.3.39 (Unix) PHP/4.4.8
    X-Powered-By: PHP/4.4.8
    Content-Range: 29360128-116762500/116762501
    Content-Length: 87402373
    Content-Disposition: attachment; filename=
    Content-Transfer-Encoding: binary
    Connection: close
    Content-Type: application/force-download


    So something must be wrong with the way aria2 handles it. The above download thread was dropped by aria2 (when used without --ignore-length), but this one


    HTTP/1.1 206 Partial Content
    Date: Fri, 14 Mar 2008 19:27:59 GMT
    Content-Type: application/octet-stream
    ETag: "92e1979857e7c31:8037"
    Last-Modified: Fri, 30 Jan 2004 17:36:30 GMT
    Accept-Ranges: bytes
    Server: Microsoft-IIS/6.0
    X-Powered-By: ASP.NET
    Content-Length: 44609568
    Content-Range: bytes 15728640-60338207/60338208
    Connection: close


    worked fine. I am not sure whether it is relevant, but the only difference between the two is that in the first case I had to feed some cookies to the remote server (the content is password-protected).
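
The claim that both responses "look OK" as far as lengths go can be checked with a little arithmetic: for an inclusive byte range, Content-Length should equal last - first + 1. The following is only a sketch of that check; the function names `rangeLength` and `lengthMatchesRange` are made up for illustration and are not taken from aria2.

```cpp
#include <cstdint>

// Number of bytes in the inclusive range [first, last].
constexpr std::int64_t rangeLength(std::int64_t first, std::int64_t last) {
  return last - first + 1;
}

// True when the Content-Length value agrees with the Content-Range bounds.
// (Hypothetical helper for this sketch, not an aria2 function.)
constexpr bool lengthMatchesRange(std::int64_t first, std::int64_t last,
                                  std::int64_t contentLength) {
  return first <= last && rangeLength(first, last) == contentLength;
}

// Values copied from the two responses captured above.
static_assert(lengthMatchesRange(29360128, 116762500, 87402373),
              "Apache/PHP response: length matches range");
static_assert(lengthMatchesRange(15728640, 60338207, 44609568),
              "IIS response: length matches range");
```

Both assertions hold at compile time, so in these two captures the Content-Length values agree with the ranges, and the difference between the responses has to lie elsewhere in the headers.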

     
  • tujikawa

    tujikawa - 2008-03-15

    Logged In: YES
    user_id=1450148
    Originator: NO

    The bytes-unit specifier is omitted in the former case:

    Content-Range: 29360128-116762500/116762501

    The latter case has the correct bytes-unit specifier 'bytes':

    Content-Range: bytes 15728640-60338207/60338208
                   ~~~~~

    According to RFC 2616, the bytes-unit specifier ('bytes') is required there. I don't know whether this is a bug in Apache.

    I created a patch to handle this situation.

    Thanks for the heads-up.

    File Added: aria2-0.13.0+1-range.patch
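
For readers without access to the attached patch, here is a rough sketch of what lenient Content-Range handling could look like. It is not the contents of aria2-0.13.0+1-range.patch: the `ContentRange` struct and the `parseNumber` and `parseContentRange` functions are hypothetical names invented for this example. It assumes C++17 (for `std::optional`) and deliberately ignores forms such as `bytes */total`.

```cpp
#include <cstdint>
#include <optional>
#include <string>

// Hypothetical result type for this sketch; not aria2's own class.
struct ContentRange {
  std::int64_t first;
  std::int64_t last;
  std::int64_t total;
};

// Parses a non-negative decimal integer that occupies all of s.
// Inputs longer than 18 digits are rejected to avoid overflow.
static std::optional<std::int64_t> parseNumber(const std::string& s) {
  if (s.empty() || s.size() > 18) return std::nullopt;
  std::int64_t value = 0;
  for (char c : s) {
    if (c < '0' || c > '9') return std::nullopt;
    value = value * 10 + (c - '0');
  }
  return value;
}

// Accepts the standard "bytes first-last/total" form and, leniently,
// "first-last/total" with the bytes-unit missing.
std::optional<ContentRange> parseContentRange(std::string value) {
  const std::string unit = "bytes ";
  if (value.compare(0, unit.size(), unit) == 0) {
    value.erase(0, unit.size());  // drop the unit when it is present
  }
  const auto dash = value.find('-');
  const auto slash = value.find('/');
  if (dash == std::string::npos || slash == std::string::npos || dash > slash) {
    return std::nullopt;
  }
  const auto first = parseNumber(value.substr(0, dash));
  const auto last = parseNumber(value.substr(dash + 1, slash - dash - 1));
  const auto total = parseNumber(value.substr(slash + 1));
  if (!first || !last || !total) return std::nullopt;
  // Reject ranges that are reversed or extend past the end of the entity.
  if (*first > *last || *last >= *total) return std::nullopt;
  return ContentRange{*first, *last, *total};
}
```

With this sketch, both header values quoted in this thread parse to the same kind of result, so a download thread would no longer be dropped only because the unit is missing.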

     
  • Nobody/Anonymous

    Logged In: NO

    Thanks, now it works without --ignore-length.

    But maybe you would still want to include something like it, since wget and curl both have this option.

     
  • tujikawa

    tujikawa - 2008-03-15

    Logged In: YES
    user_id=1450148
    Originator: NO

    OK, I'll check these apps.

     
