Tcl / Read-Only Bugs / #4487 http::geturl abuses vwait on async call

Alaoui Youness - 2009-12-09

test code to reproduce bug

tst-http.tcl

If you would like to refer to this comment somewhere else in this project, copy and paste the following link:

Alaoui Youness - 2010-01-11

Hi,
I fixed this bug in the latest http 2.7.5 (from core-8-5 branch in CVS).
You can get the code here : http://amsn.svn.sf.net/viewvc/amsn/trunk/amsn/utils/http/http.tcl?revision=11885
You can see my changes if you diff it against CVS 1.67.2.9 : http://tcl.cvs.sourceforge.net/viewvc/\*checkout*/tcl/tcl/library/http/http.tcl?revision=1.67.2.9&pathrev=core-8-5-branch

In theory, no more vwait is used anymore when you do the call to geturl with
a -command option.
However, the test code previously attached to this bug is still having the same issue, but I have no idea why unfortunately... :(
I hope this gets merged upstream.
Thanks.

If you would like to refer to this comment somewhere else in this project, copy and paste the following link:

Alaoui Youness - 2010-01-11

Alright, I found the bug and fixed it in SVN revision 11889 : http://amsn.svn.sf.net/viewvc/amsn/trunk/amsn/utils/http/http.tcl?revision=11885
The reason why my fix didn't work is because I had a typo :
if {[info exists $state(-command)] }
instead of
if {[info exists state(-command)] }

p.s: I just realized I have some whitespace issues in my code, maybe it's because I did a 'untabify' on the tab, so you may want to do a 'ignore whitespace changes' diff. Sorry.

If you would like to refer to this comment somewhere else in this project, copy and paste the following link:

Alaoui Youness - 2010-01-11

patch to fix http 2.7.5

http.patch

If you would like to refer to this comment somewhere else in this project, copy and paste the following link:

Alaoui Youness - 2010-01-11

typo in my previous comment, the url should have had revision=11889... anyways, I did a diff (ignoring whitespace changes) and I added it to this bug.
Thanks for checking it out!

If you would like to refer to this comment somewhere else in this project, copy and paste the following link:

Alexandre Ferrieux - 2011-09-05

priority: 5 --> 9
If you would like to refer to this comment somewhere else in this project, copy and paste the following link:

Alexandre Ferrieux - 2011-09-05

Hum, looks like this got completely drowned, never committed to trunk, right ?
As of today, in both 8.[56], there is no way to do a fully async http request (including an async connect): the code insists on doing [socket -async] *only* in presence of -timeout, and -timeout forces a vwait.
Why that nonsense ? (Youness's version gets it right)

If you would like to refer to this comment somewhere else in this project, copy and paste the following link:

Donal K. Fellows - 2012-05-26

We should consider what impact coroutines would have on this in 8.6

If you would like to refer to this comment somewhere else in this project, copy and paste the following link:

Donal K. Fellows - 2012-05-26

milestone: --> 2361739
If you would like to refer to this comment somewhere else in this project, copy and paste the following link:

kjnash - 2012-10-30

I had problem with the same symptoms, so I'll add my comments to this tracker, though I am not sure if my fix will work for the original problem.

My problem was with EOF detection when a server closes a keep-alive connection.

If http::geturl is called with -keepalive 1 and a -command that enters the event loop (e.g for an authorization dialog), and the HTTP response includes a "Content-Length" header, then a (slow) race condition ensues between the server's closing of the connection, and the user's response to the dialog. If the connection is closed first, the unpatched code fails at line 1126 with repeated eof fileevents, because the command http::Eof tries repeatedly to run the user's -command, and never returns. This can be reported as "too many nested evaluations".

While it might not be good style to allow the -command script to enter the event loop, http(n) does not exclude this possibility.

There's more than one way to fix the bug. Two possible patches are attached. EIther will fix the bug; both fixes can be applied.

Patches against core-8-5-branch checkin 69687a01db, dated 2012-10-24 11:28:33.

If you would like to refer to this comment somewhere else in this project, copy and paste the following link:

kjnash - 2012-10-30

Sorry folks, I couldn't add the files without creating a new bug report, which is 3581754. The two patches are there.

If you would like to refer to this comment somewhere else in this project, copy and paste the following link:

Jan Nijtmans - 2013-01-17

assigned_to: patthoyts --> dkf
If you would like to refer to this comment somewhere else in this project, copy and paste the following link:

Jan Nijtmans - 2013-01-17

kakaroto's patch committed to branch "bug-2911139" (based on core-8-5-branch). Looks good to me.

Donal, do you agree?

If you would like to refer to this comment somewhere else in this project, copy and paste the following link:

Donal K. Fellows - 2013-01-19

Causes one test failure (http-4.14) but I don't know if that's significant yet. Or maybe it is even a good thing! I'll have to analyze further. Failure is as below (and is the only one):

==== http-4.14 http::Event FAILED
==== Contents of test case:

set token [http::geturl $badurl/?timeout=10 -timeout 10000 -command \#]
if {$token eq ""} {
error "bogus return from http::geturl"
}
http::wait $token
http::status $token
# error code varies among platforms.

---- Test completed normally; Return code was: 0
---- Return code should have been one of: 1
==== http-4.14 FAILED

If you would like to refer to this comment somewhere else in this project, copy and paste the following link:

Alaoui Youness - 2013-01-22

Hey, dkf! Glad to see this bug taken care of! :)
It looks like that test case expects an error to happen with the badurl, but since there's a -command, the socket gets created asynchronously, which means the error will happen in the command callback, not during the socket creation.
So I think that this is a good thing indeed and working as expected. You'd just have to confirm that the error happens asynchronously.
Thanks!

If you would like to refer to this comment somewhere else in this project, copy and paste the following link:

Jan Nijtmans - 2013-01-22

I think that kakaroto is right, shouldn't the testcase just check whether the timeout occurred?

Suggested test-case fix committed to the branch. Agreed? I'm OK with this
change being merged to core-8-5-branch and up, but I'm not the expert here....

If you would like to refer to this comment somewhere else in this project, copy and paste the following link:

Donal K. Fellows - 2013-01-22

I'm still getting a failure, but now it's getting 'error' rather than 'timeout' as the result string.

Generally, I think that fixing the test is (probably) the correct approach, as the test is looking for something that is intended to be changed. Once we've got the test working, merging through to the 8.5 branch, trunk and novem (though there's no hurry on novem) seems reasonable.

If you would like to refer to this comment somewhere else in this project, copy and paste the following link:

Alaoui Youness - 2013-01-22

Unless I misunderstood the test, isn't error what you're expecting?
If the bad url is a badly formatted URI, then you should expect an error right away.
If the bad url is a valid URI but with an invalid/unreachable server, then you expect it to error out in the callback, and not for it to timeout. It would only timeout if the server was real and dropping packets, so you don't get a "host unreachable" ICMP or "connection refused" error. Otherwise, it will give you an error directly.

If you would like to refer to this comment somewhere else in this project, copy and paste the following link:

Donal K. Fellows - 2013-01-22

At the moment, I'm just reporting a test failure when I do 'make test' :-)

If you would like to refer to this comment somewhere else in this project, copy and paste the following link:

Jan Nijtmans - 2013-01-22

>If the bad url is a valid URI but with an invalid/unreachable server, then
>you expect it to error out in the callback, and not for it to timeout
Well, "timeout" is what I see. If that's wrong, maybe the error
handling is still not 100% correct. I'm not sure yet.

If you would like to refer to this comment somewhere else in this project, copy and paste the following link:

Jan Nijtmans - 2013-01-22

It might also be that the test outcome is timing-dependant: What
if the [http::wait $token] awakens twice, once when the connection
is created (with invalid socket), once when the connection times
out. Then it just depends on timing whether the last [http::status]
returns "error" or "timeout". Shouldn't the error-situation cancel
the time-out then?

If you would like to refer to this comment somewhere else in this project, copy and paste the following link:

Donal K. Fellows - 2013-01-22

I'm not sure; I'm out on a business trip this week so I don't know how clean my network configuration is. OTOH, if I do:

make test TESTFLAGS="-file h*.test -match http-4.14"

This runs just the relevant test, and it fails very quickly; it doesn't look like a timeout to me. If I add:

puts [set ${token}(error)]

To the test immediately before the [http::status $token], I get this output as well:

{connect failed connection refused} {can't read "state(sock)": no such element in array
while executing
"fileevent $state(sock) writable {}"} NONE

I have no idea at all how we're getting into that state!

If you would like to refer to this comment somewhere else in this project, copy and paste the following link:

Jan Nijtmans - 2013-01-22

>I have no idea at all how we're getting into that state!
I think I have now. The
can't read "state(sock)":...
comes from line 271, just a left-over from the $errorInfo
at the previous http::reset.

But the "connect failed connection refused" is a
better string to test for, so we know exactly what
happend and not some random other error. Done
now in the "bug-2911139" branch.

I don't get the "timeout" any more, which I got at the
office. At home this test-case passes now, both on
win32 as on Linux. I'm satisfied, everything looks OK
to me now. Agreed?

If you would like to refer to this comment somewhere else in this project, copy and paste the following link:

Donal K. Fellows - 2013-01-23

Works for me now. :-)

If you would like to refer to this comment somewhere else in this project, copy and paste the following link:

Donal K. Fellows - 2013-01-23

Committed to 8.5 branch; not yet looked into porting to trunk.

If you would like to refer to this comment somewhere else in this project, copy and paste the following link:

http::geturl abuses vwait on async call

The Tool Command Language implementation

Group

Searches

Help

#4487 http::geturl abuses vwait on async call

Discussion