#302 Behavior change for canonicalize from 1.0 to 1.1.1

open
uri (21)
5
2003-04-21
2003-04-21
No

For uri 1.0, we had the following behavior:

[rngadam@taipei rngadam]$ tclsh
% package require uri
1.0
% ::uri::canonicalize [::uri::resolve "/toon/obe/fr/3/18/10044/buy/" ../../../]
/toon/obe/fr/3/

For uri 1.1.1, we now have:

[rngadam@montreal rngadam]$ tclsh
% package require uri
1.1.1
% ::uri::canonicalize [::uri::resolve "/toon/obe/fr/3/18/10044/buy/" ../../../]
http:///toon/obe/fr/3/

Although, from what I've read in RFC2396, a scheme is
always needed for a URI, the behavior in 1.0 was more
satisfactory for us since browsers will prepend the
scheme and authority. Ideally, if no authority is
specified, canonicalize should leave the URL as-is.

In any case, uri should not just prepend an arbitrary
scheme. It should instead fail with an error if it has
insufficient information to construct a canonical URI,
especially when the documentation only mentions: "The
canonical form of a URI is one where relative path
specifications, ie. . and .., have been resolved."
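
To illustrate the behavior requested above, here is a minimal sketch of a path-only canonicalization that resolves "." and ".." without adding a scheme or authority. The proc name removeDotSegments is hypothetical and is not part of the uri package; it assumes Tcl 8.4 or later and treats its argument as a bare path with no query or fragment.

```tcl
# Hypothetical helper, NOT part of the uri package: collapse "." and ".."
# path segments while leaving any scheme/authority decision to the caller.
proc removeDotSegments {path} {
    set out {}
    set segments [split $path /]
    set last [expr {[llength $segments] - 1}]
    set i 0
    foreach seg $segments {
        if {$seg eq "."} {
            # "." names the current directory: drop it, but keep the
            # trailing slash when it is the final segment.
            if {$i == $last} { lappend out "" }
        } elseif {$seg eq ".."} {
            # ".." removes the previous segment, but never the leading
            # empty segment that represents the root "/".
            if {[llength $out] > 1 ||
                ([llength $out] == 1 && [lindex $out 0] ne "")} {
                set out [lrange $out 0 end-1]
            }
            if {$i == $last} { lappend out "" }
        } else {
            lappend out $seg
        }
        incr i
    }
    return [join $out /]
}

# The path from this report, after the relative part has been appended:
puts [removeDotSegments "/toon/obe/fr/3/18/10044/buy/../../../"]
# prints /toon/obe/fr/3/
```

With this approach a reference that arrives without a scheme also leaves without one, which matches the 1.0 result shown above.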

Discussion

  • Andreas Kupries

    Andreas Kupries - 2003-04-21

    Research: There are two entries in the changeLog which may
    apply, i.e. may have introduced the change.

    2002-02-25 Andreas Kupries <andreas_kupries@users.sourceforge.net>

    * uri.tcl: Fixed "::uri::canonicalize" to pass the extended
    testsuite. The change to testsuite and command implementation
    here was triggered through work on a spider and real life urls,
    some of which were handled incorrectly.

    * uri.test: Extended the testsuite for "::uri::canonicalize" a
    lot. Handling of uris with a path, without a path, unknown uri
    schemes, path components which contain a ".", but are neither
    "." nor "..".

    2002-11-15 David N. Welton <davidw@dedasys.com>

    * uri.tcl (uri::canonicalize): Take care of trailing .., as in
    "http://foobar.com/foo/bar/..".

     
  • Pat Thoyts

    Pat Thoyts - 2003-04-29

    There are also two other bugs concerning uri: 664392 and
    581781. The canonicalize proc is very broken, especially
    on Windows, and IMO needs a rewrite to conform to RFC2396.
    Then perhaps special cases can be dealt with.

    These other bugs have more discussion about this.
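
As a starting point for such a rewrite, the sketch below splits a URI reference with the regular expression given in RFC 2396, Appendix B, so that a missing scheme or authority stays empty instead of being defaulted. The proc name splitUriRef and the key names in the returned list are hypothetical and are not the uri package's API.

```tcl
# Hypothetical helper, NOT part of the uri package: split a URI reference
# into its components using the regular expression from RFC 2396,
# Appendix B. Components that are absent come back as empty strings.
proc splitUriRef {ref} {
    regexp {^(([^:/?#]+):)?(//([^/?#]*))?([^?#]*)(\?([^#]*))?(#(.*))?} $ref \
        -> s1 scheme a1 authority path q1 query f1 fragment
    return [list scheme $scheme authority $authority path $path \
                query $query fragment $fragment]
}

# A relative reference keeps an empty scheme and authority:
array set rel [splitUriRef "/toon/obe/fr/3/"]
puts "scheme=<$rel(scheme)> authority=<$rel(authority)> path=<$rel(path)>"

# An absolute URI yields all three components:
array set abs [splitUriRef "http://foobar.com/foo/bar/.."]
puts "scheme=<$abs(scheme)> authority=<$abs(authority)> path=<$abs(path)>"
```

A canonicalize built on a split like this could normalize only the path component and reassemble exactly the components that were present in the input.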