#318 percent-encoded '?' in URLs problem

djview
closed
nobody
None
5
2020-06-01
2020-06-01
Janusz
No

This URL with %3F gives
Received http status 404 while retrieving https://djvu.szukajwslownikach.uw.edu.pl/IMPACT_GT_1/00426979.djvu%3Fdjvuopts&page=1&zoom=width&showposition=0.460,0.523&highlight=357,1207,1457,442.
Unexpected End Of File.
This URL opens correctly.
Tested with Debian buster and the git version.

Discussion

  • Dr Bill C Riemers

    Why would you expect the URL to work? The purpose of the % encoding characters is to prevent the characters from being interpreted with special meaning.

    If you create a file on your server named "IMPACT_GT_1/00426979.djvu?djvuopts", how would you request that file if we ignored the escape you intended by writing %3F instead of ? in the URL?

    Some code tries to do what you mean, not what you say, so you may well find URL parsers that see the & and decide the question mark must be literal. But if you review the RFCs, you'll find that & has special meaning on the server, not in the client. So the request that gets passed along is to open a file named: "IMPACT_GT_1/00426979.djvu%3Fdjvuopts&page=1&zoom=width&showposition=0..."

    i.e., the full string.

    The other component of a URL that you frequently see abused in this "do what I mean, not what it literally says" way is the anchor. Some websites use anchors as cgi-bin arguments, even though an anchor is intended only to tag a location in the page for the web browser.
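
As a rough illustration of the point above, here is a minimal Python sketch of how a standards-following parser splits the two forms of the URL. It uses the standard library's `urllib.parse.urlsplit`, not djview's own parser, so it only shows the general rule; the shortened argument string and the `example.org` address are assumptions made for the example.

```python
from urllib.parse import urlsplit

BASE = "https://djvu.szukajwslownikach.uw.edu.pl/IMPACT_GT_1/00426979.djvu"
ARGS = "djvuopts&page=1&zoom=width"  # shortened argument list, for illustration only

# Escaped question mark: %3F is ordinary path data, so there is no query part.
escaped = urlsplit(BASE + "%3F" + ARGS)
# Literal question mark: '?' separates the path from the query.
literal = urlsplit(BASE + "?" + ARGS)

print("escaped path :", escaped.path)
print("escaped query:", repr(escaped.query))
print("literal path :", literal.path)
print("literal query:", repr(literal.query))

# An anchor (fragment) is split off separately; a client does not send it
# to the server as part of the request. example.org is a hypothetical URL.
anchored = urlsplit("https://example.org/page.html#section-2")
print("fragment     :", anchored.fragment)
```

With the escaped form, everything after `.djvu` stays in the path and the query is empty, which is why the server is asked for a file that does not exist.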

     
  • Janusz

    Janusz - 2020-06-01

    Thank you very much for your quick reaction.
    To be on the safe side:
    The first link in my posting is
    https://djvu.szukajwslownikach.uw.edu.pl/IMPACT_GT_1/00426979.djvu%3Fdjvuopts&page=1&zoom=width&showposition=0.460,0.523&highlight=357,1207,1457,442.
    The second one is
    https://djvu.szukajwslownikach.uw.edu.pl/IMPACT_GT_1/00426979.djvu?djvuopts&page=1&zoom=width&showposition=0.460,0.523&highlight=357,1207,1457,442
    I still don't understand why djview reports an unexpected end of file.

     
  • Janusz

    Janusz - 2020-06-01

    In a sense a nonexisting file has an unexpected end of file, but I would prefer a more explicit error message :-)
    Anyway please close the ticket.

     
  • Leon Bottou

    Leon Bottou - 2020-06-01

    Bill is entirely right. Escaping the question mark means that you're looking for a document named '00426979.djvu?djvuopts'. Then djview reports a sequence of errors. First, a 404 HTTP error because the URL goes nowhere. Then, an end-of-file error because the decoder tries to decode a zero-length document. Finally, the abort message summarizes the findings:

    djview: Received http status 404 while retrieving https://djvu.szukajwslownikach.uw.edu.pl/IMPACT_GT_1/00426979.djvu0.000000djvuopts&page=1&zoom=width&showposition=0.460,0.523&highlight=357,1207,1457,442.
    djview: Unexpected End Of File.
    djview: Cannot open URL 'https://djvu.szukajwslownikach.uw.edu.pl/IMPACT_GT_1/00426979.djvu0.000000djvuopts&page=1&zoom=width&showposition=0.460,0.523&highlight=357,1207,1457,442'.
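
To see which document name the escaped URL actually asks for, one can decode the last path segment. This is only a sketch using Python's standard library, not what djview or the web server does internally; the shortened argument string is an assumption made for the example.

```python
from posixpath import basename
from urllib.parse import unquote, urlsplit

# Escaped form of the URL, with a shortened argument list for illustration.
url = ("https://djvu.szukajwslownikach.uw.edu.pl/IMPACT_GT_1/"
       "00426979.djvu%3Fdjvuopts&page=1")

path = urlsplit(url).path          # still percent-encoded, no query part
last_segment = basename(path)      # take the final path segment first
requested_name = unquote(last_segment)  # then decode %3F to a literal '?'

print(requested_name)
```

The decoded name contains the question mark and the arguments as ordinary characters, so the server looks for a file with that literal name and answers 404 when it finds none.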
    
     
  • Leon Bottou

    Leon Bottou - 2020-06-01
    • Description has changed:

    Diff:

    --- old
    +++ new
    @@ -1,4 +1,3 @@
    -
     [This URL with %3F](https://djvu.szukajwslownikach.uw.edu.pl/IMPACT_GT_1/00426979.djvu%3Fdjvuopts&page=1&zoom=width&showposition=0.460,0.523&highlight=357,1207,1457,442)  gives
     `Received http status 404 while retrieving https://djvu.szukajwslownikach.uw.edu.pl/IMPACT_GT_1/00426979.djvu%3Fdjvuopts&page=1&zoom=width&showposition=0.460,0.523&highlight=357,1207,1457,442.`
     `Unexpected End Of File.`
    
    • status: open --> closed
     
  • Leon Bottou

    Leon Bottou - 2020-06-01

    Closing. This is not a bug but the normal behavior. URLs are confusing things.

     
