From: Gustaf N. <ne...@wu...> - 2017-04-01 15:49:52
Dear all,

While working on the encodings, I found the following issue with NaviServer URL decoding. RFC 3986 (as well as earlier RFCs) defines a path as a sequence of segments, separated by slashes "/":

    path-abempty  = *( "/" segment )
    path-absolute = "/" [ segment-nz *( "/" segment ) ]
    path-noscheme = segment-nz-nc *( "/" segment )
    path-rootless = segment-nz *( "/" segment )

In request.c, NaviServer decodes the whole URL with a single Ns_UrlPathDecode(), which is effectively the decode operation for a single segment (!). This means that the following two URLs are treated identically:

    /foo/bar%2fbaz.tcl
    /foo/bar/baz.tcl

whereas they should yield the following two distinct [ns_conn urlv] values:

    {foo bar/baz.tcl}
    {foo bar baz.tcl}

See also [1], which states explicitly that the URIs

    http://www.w3.org/albert/bertram/marie-claude

and

    http://www.w3.org/albert/bertram%2Fmarie-claude

are NOT identical, as in the second case the encoded slash does not have hierarchical significance.

It is not good that a user of NaviServer currently has no means to detect the difference between these two cases, since the server treats them as identical. Interestingly, Apache by default rejects requests with paths containing %2f (see the discussion in [2]).

I am currently considering keeping [ns_conn url] as it is, but returning the correct hierarchical structure in [ns_conn urlv].

Comments?

-g

[1] https://www.w3.org/Addressing/URL/4_URI_Recommentations.html
[2] http://stackoverflow.com/questions/3235219/urlencoded-forward-slash-is-breaking-url
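
PS: To make the difference between the two decoding orders concrete, here is a minimal sketch in Python. It is not NaviServer code; the function names decode_then_split and split_then_decode are hypothetical and only illustrate the idea. The first function mimics decoding the whole path in one step before splitting, the second splits on the literal "/" first and decodes each segment separately, which is the behaviour proposed for [ns_conn urlv].

```python
from urllib.parse import unquote


def decode_then_split(path):
    """Decode the whole path first, then split on "/".

    An encoded slash (%2f) becomes a real slash before splitting,
    so it is indistinguishable from a hierarchical separator.
    """
    return [seg for seg in unquote(path).split("/") if seg]


def split_then_decode(path):
    """Split on literal "/" first, then decode each segment.

    An encoded slash stays inside its segment and keeps its
    non-hierarchical meaning.
    """
    return [unquote(seg) for seg in path.split("/") if seg]


encoded = "/foo/bar%2fbaz.tcl"
plain = "/foo/bar/baz.tcl"

# Whole-path decoding: both inputs give the same segment list.
print(decode_then_split(encoded))  # ['foo', 'bar', 'baz.tcl']
print(decode_then_split(plain))    # ['foo', 'bar', 'baz.tcl']

# Per-segment decoding: the two inputs remain distinguishable.
print(split_then_decode(encoded))  # ['foo', 'bar/baz.tcl']
print(split_then_decode(plain))    # ['foo', 'bar', 'baz.tcl']
```

With per-segment decoding the two results correspond to the urlv values {foo bar/baz.tcl} and {foo bar baz.tcl} mentioned above.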