#5046 Improper support for IPv6 raw address in http library

Status: closed-fixed
Owner: Reinhard Max
Priority: 5
Updated: 2012-07-08
Created: 2012-05-31
Creator: Emmanuel Frécon
Private: No

The current implementation of the http library does not support URLs that contain raw IPv6 addresses, e.g. http://[2001:5c0:1400:b::d623]/, to take mine as an example. The problem is that the atom of the regexp that should match the host part of the authority is written as [^/:\#?]+, which cannot work because IPv6 addresses use : as a separator (well, not *entirely* true, but almost). The parsing therefore stops at the wrong place within the URL, and the library ends up unable to open the socket to the IP address, even though Tcl 8.6 now has support for IPv6.
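To make the failure concrete, here is a minimal sketch that applies only the host atom quoted above to the example URL. The surrounding pattern is simplified for illustration and is not the library's full matcher; the variable names are hypothetical.

```tcl
# Host atom as quoted in this report; everything else is a simplification.
set hostAtom {[^/:\#?]+}
set url {http://[2001:5c0:1400:b::d623]/}

# Anchor only at the start so we can see where host matching stops.
regexp -- "^\\w+://($hostAtom)" $url -> host
puts $host   ;# prints "[2001"
```

The atom stops at the first colon inside the brackets, so the "host" is cut off after the first group of the address.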

The following code (an extract from the "parsing" part of the geturl procedure) actually performs the work properly:

set URLmatcher {(?x)            # this is _expanded_ syntax
    ^
    (?: (\w+) : ) ?             # <protocol scheme>
    (?: //
        ( [^/\#?]+ )            # <location=userinfo + host + port>
    )?
    ( / [^\#]*)?                # <path> (including query)
    (?: \# (.*) )?              # <fragment>
    $
}

# Phase one: parse the URL
if {![regexp -- $URLmatcher $url -> proto location srvurl]} {
    unset $token
    return -code error "Unsupported URL: $url"
}
# Split the optional userinfo from host[:port]
if { [string first "@" $location] >= 0 } {
    foreach {user hostport} [split $location "@"] break
} else {
    set user ""
    set hostport $location
}
# The text after the last ":" is the port only if it is an integer;
# otherwise that colon belongs to a raw IPv6 address.
set idx [string last ":" $hostport]
if { $idx >= 0 } {
    set port [string trimleft [string range $hostport $idx end] ":"]
    if { [string is integer $port] } {
        set host [string trimright [string range $hostport 0 $idx] ":"]
    } else {
        set port ""
        set host $hostport
    }
} else {
    set port ""
    set host $hostport
}
# Strip the square brackets around an IPv6 literal
set host [string trim $host "\[\]"]

# Phase two: validate

This version minimises the regexp parsing and adds a number of string operations. It's not perfect, but it at least solves the issue. A regexp guru would be needed to fix this at the regexp level.
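For comparison, a regexp-level approach might look roughly like the sketch below. It is only an illustration, not necessarily what was eventually committed: it assumes the host is either a bracketed IPv6 literal or a name without colons, and the proc name parseUrl is hypothetical.

```tcl
# Hypothetical matcher: the host alternation is an assumption made for
# this sketch, not the pattern used by the http package.
set URLmatcher {(?x)
    ^
    (?: (\w+) : )?                  # scheme
    (?: //
        (?: ( [^@/\#?]+ ) @ )?      # userinfo
        ( \[ [0-9A-Fa-f:.]+ \]      # bracketed IPv6 literal
        | [^/:\#?]+ )               # host name or IPv4 address
        (?: : (\d+) )?              # port
    )?
    ( / [^\#]* )?                   # path (including query)
    (?: \# (.*) )?                  # fragment
    $
}

# Hypothetical helper: returns {scheme userinfo host port path}.
proc parseUrl {url} {
    global URLmatcher
    if {![regexp -- $URLmatcher $url -> proto user host port path]} {
        return -code error "Unsupported URL: $url"
    }
    # Remove the square brackets before handing the address to [socket].
    set host [string trim $host {[]}]
    return [list $proto $user $host $port $path]
}

puts [parseUrl {http://[2001:5c0:1400:b::d623]:8080/index.html}]
```

Because the bracketed alternative is the only one that may contain colons, the port group can stay a plain `: \d+` and no string post-processing is needed beyond trimming the brackets.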

Discussion

  • I have a copy of the fix above, as part of a copy of the http package, available as more readable code. See: http://code.google.com/p/efr-tools/source/browse/trunk/apps/cxManager/lib/http/http.tcl. I have used the code review facilities at Google Code to highlight the changes that I have made and what still needs to be done. Once again, this isn't a proper fix from my point of view; it needs a regexp expert to fix the URL parsing instead.

     
  • Jan Nijtmans
    2012-07-08

    • assigned_to: patthoyts --> rmax
     
  • Reinhard Max
    2012-07-08

    Fixed in Check-in [abc8fa71fe].

     
  • Reinhard Max
    2012-07-08

    • status: open --> closed-fixed