#26 infinite loop on unexpected response

0.9.35
open
Crawler (17)
5
2007-12-18
2007-12-18
Anonymous
No

When downloading from servers that do not support partial downloads (using -force_reget), pavuk gets caught in an infinite loop like the following:

> Trying to resume from position 18651401
> Unexpected response "403 Forbidden" when trying to reget!

And one more "unexpected response":

> Unexpected response "301 Moved Permanently" when trying to reget!

A better approach would probably be to stop further attempts on such responses.
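
The suggestion above could be sketched roughly as follows. This is only an illustration in Python, not pavuk's code: the function name `should_retry_reget` and the choice of which status codes count as transient are assumptions made for the example.

```python
def should_retry_reget(status: int) -> bool:
    """Hypothetical policy: repeat the same range request only when a
    retry could plausibly succeed."""
    if status == 206:
        # Partial Content: the resume worked, nothing to retry.
        return False
    if 500 <= status < 600 or status in (408, 429):
        # Assumed to be transient: server errors, timeouts, rate limiting.
        return True
    # Everything else (e.g. 403 Forbidden, 301 Moved Permanently) will
    # not change by asking again, so stop instead of looping.
    return False
```

With this policy, both responses quoted in the report end the reget attempts immediately, while a temporary server error would still be retried.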

Discussion

  • Ger Hobbelt

    Ger Hobbelt - 2008-02-04

    Logged In: YES
    user_id=1799833
    Originator: NO

    Added a provisional fix for this issue in tonight's CVS source tree. The code will now try to grab the complete file, without bothering with ranges/partial downloads, once such a partial download attempt has failed on the first try.

    This means pavuk will need to retry each such URL at least once, so '-nregets' should never be set below 1, or this fix won't get a chance to work.

    As added to the documentation, this of course will not work for singlereget mode, as that one will run forever or until done successfully (or until a fatal error occurs instead).

    The change must still be tested, but I lack an example to reproduce this issue. :-(

    Ger Hobbelt
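
The fallback described in this comment might look roughly like the sketch below. It is written in Python for illustration and is not the code committed to CVS. The names `download`, `fetch` and `max_regets` are hypothetical, and `max_regets` stands in for '-nregets' on the assumption that it counts retries after the first attempt.

```python
from typing import Callable, List, Optional, Tuple

def download(fetch: Callable[[Optional[int]], int],
             resume_from: int,
             max_regets: int) -> Tuple[str, List[Tuple[Optional[int], int]]]:
    """Try to resume with a range request; if that fails, retry once
    without a range. fetch(offset) returns an HTTP status code, and
    offset=None means 'no Range header'."""
    use_range = resume_from > 0
    log: List[Tuple[Optional[int], int]] = []
    for _ in range(max_regets + 1):  # first attempt plus allowed retries
        offset = resume_from if use_range else None
        status = fetch(offset)
        log.append((offset, status))
        if use_range and status == 206:
            return "resumed", log    # server honoured the range
        if status == 200:
            return "full", log       # whole file delivered
        if use_range:
            use_range = False        # fall back to a full download
            continue
        return "failed", log         # full download failed too: stop
    return "failed", log             # ran out of retries


# Simulated server that refuses range requests but serves full files.
def no_ranges(offset: Optional[int]) -> int:
    return 403 if offset is not None else 200

result, attempts = download(no_ranges, resume_from=18651401, max_regets=1)
```

In this sketch a retry budget of 0 leaves no room for the fallback, which mirrors the remark about '-nregets' above, and a server that rejects every request ends the loop after two attempts instead of looping forever.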

     
  • Ger Hobbelt

    Ger Hobbelt - 2008-10-22

    Dirk,

    do you have a URL for this bug, so I can test the solution?

    (If not, we'll have to wait until we run into one... :-S )

     
