Your patch makes the connect phase blocking. I don't see why that is necessary. libcurl should simply proceed as usual until connected, why is using a (socks) proxy any different that any other proxy or connection?
If you would like to refer to this comment somewhere else in this project, copy and paste the following link:
Better description of problem: I was seeing 75% of my Socks4a connections failing on setup with socket not connected. Now I see 0%.
The flow I was seeing (before the patch) is the following:
ConnectPlease calls Curl_connecthere which returns immediately in case of curl_multi, not waiting for connet to finish
ConnectPlease calls Curl_SOCKS4
which (in the case of SOCKS4a) calls Curl_write_plain
which calls send() which gets back socket is not connected
The eventual solution is to make Curl_SOCKS support multi directly and not require blocking, but that will take a lot more time. In the meantime, there is bug on the known bugs list that says that curl blocks during proxy connect and multi, but it actually doesn't on Windows.
If you would like to refer to this comment somewhere else in this project, copy and paste the following link:
But instead of taking the route of making the behavior even more blocking, I would instead suggest that we make the socks proxy approach more similar to the http proxy one.
It uses the asynch connection approch, so for the case when we use the multi interface and Curl_connecthost() doesn't complete the connect, we could instead make the socks negotiation magic get called when we reach the CURLM_STATE_PROTOCONNECT state in lib/multi.c. Wouldn't that work (better) you think?
If you would like to refer to this comment somewhere else in this project, copy and paste the following link:
Agree that not blocking would be the best solution but until that gets written, I suggest this patch as a stopgap. Since many SOCKS proxies are in the LAN i.e. low delay, blocking may not be that bad for the app. The current behavior is bad.
Here are the states I see for SOCKS:
- waiting for SOCKS connect
- Send DNS request for final hostname (not 4a or 5_HOSTNAME)
- waiting for DNS response
- sending SOCKS request
- waiting for SOCKS reply
I suppose those states can get assigned states at the top level state machine or SOCKS could
maintain its own substate under CURLM_STATE_WAITCONNECT.
If you would like to refer to this comment somewhere else in this project, copy and paste the following link:
Right, but my suggested approach is only a stop-gap take anyway, just less crude than yours. See my uploaded patch attempt for an explanation on what I mean.
This approach waits for the TCP connect to be verified before it calls the socks-functions, and thus I hope it'll survive better.
What do you think about this?
If you would like to refer to this comment somewhere else in this project, copy and paste the following link:
I committed the fix for the primary connect, as shown already. And yes, we need something in the same fashion for the FTP data connection. It would happen in the CURLM_STATE_DO_MORE state when Curl_is_connected() for the secondary socket returns connected. Could you try making a patch for this and see if you can do FTP fine over a socks proxy then?
Thanks a lot for your work this far!
If you would like to refer to this comment somewhere else in this project, copy and paste the following link:
Ok, with the obvious (HTTP) case fixed I'll close this report. Thanks.
I've also created KNOWN_BUGS #65 which is about this problem, but for FTP and the data connection. I don't have the energy to work on this now, but hopefully I or someone else can get it working for next release.
If you would like to refer to this comment somewhere else in this project, copy and paste the following link:
Patch to block on connect for SOCKS proxies
See attached patch. Not all connections fail - some succeed.
Your patch makes the connect phase blocking. I don't see why that is necessary. libcurl should simply proceed as usual until connected, why is using a (socks) proxy any different that any other proxy or connection?
Better description of problem: I was seeing 75% of my Socks4a connections failing on setup with socket not connected. Now I see 0%.
The flow I was seeing (before the patch) is the following:
ConnectPlease calls Curl_connecthere which returns immediately in case of curl_multi, not waiting for connet to finish
ConnectPlease calls Curl_SOCKS4
which (in the case of SOCKS4a) calls Curl_write_plain
which calls send() which gets back socket is not connected
The eventual solution is to make Curl_SOCKS support multi directly and not require blocking, but that will take a lot more time. In the meantime, there is bug on the known bugs list that says that curl blocks during proxy connect and multi, but it actually doesn't on Windows.
Ah, yes thanks for clarifying!
But instead of taking the route of making the behavior even more blocking, I would instead suggest that we make the socks proxy approach more similar to the http proxy one.
It uses the asynch connection approch, so for the case when we use the multi interface and Curl_connecthost() doesn't complete the connect, we could instead make the socks negotiation magic get called when we reach the CURLM_STATE_PROTOCONNECT state in lib/multi.c. Wouldn't that work (better) you think?
sorry, it should be at the CURLM_STATE_WAITCONNECT position, which happens at TCP connect
Agree that not blocking would be the best solution but until that gets written, I suggest this patch as a stopgap. Since many SOCKS proxies are in the LAN i.e. low delay, blocking may not be that bad for the app. The current behavior is bad.
Here are the states I see for SOCKS:
- waiting for SOCKS connect
- Send DNS request for final hostname (not 4a or 5_HOSTNAME)
- waiting for DNS response
- sending SOCKS request
- waiting for SOCKS reply
I suppose those states can get assigned states at the top level state machine or SOCKS could
maintain its own substate under CURLM_STATE_WAITCONNECT.
my take at doing the socks connect magic after the tcp connect
Right, but my suggested approach is only a stop-gap take anyway, just less crude than yours. See my uploaded patch attempt for an explanation on what I mean.
This approach waits for the TCP connect to be verified before it calls the socks-functions, and thus I hope it'll survive better.
What do you think about this?
I like your approach better. I will test it.
You had a chance to see if it works yet?
Works great for me. Thanks!
One thing - there is similar code in ftp.c -- worth using Curl_connected there too?
Ack.
I committed the fix for the primary connect, as shown already. And yes, we need something in the same fashion for the FTP data connection. It would happen in the CURLM_STATE_DO_MORE state when Curl_is_connected() for the secondary socket returns connected. Could you try making a patch for this and see if you can do FTP fine over a socks proxy then?
Thanks a lot for your work this far!
Ok, with the obvious (HTTP) case fixed I'll close this report. Thanks.
I've also created KNOWN_BUGS #65 which is about this problem, but for FTP and the data connection. I don't have the energy to work on this now, but hopefully I or someone else can get it working for next release.