On October 26, 2004 04:45 am, Jake Appelbaum wrote:
> So in the system I am working on, it's using a few different apache
> servers, each server has two IPs. Each IP has a different cert. So that
> means a total of two certs between any number of webhosts, but each
> webhost has two certs, one for each ip.
> Dist-cache is running as a client, tunneled over stunnel and dist-cache
> is running as a server on a non-webserver host.
> One dc_client per server, and one dc_server for all the clients.
This all sounds fine.
> However, SSL "breaks" sometimes. It just stops responding.
> I wonder if this means it has the session keys for another host using
> the system that's using a different cert. Is that possible?
> It seems to be tied to the same time length as the amount of time it
> takes to expire session keys from the cache, then the error goes away.
Hmm... a couple of suggestions.
(1) Can you reproduce this problem with Apache's logging cranked right up
(e.g. set to "debug" level) and then check the log for "dc" notes at the
time the problem is observed? The logging should include timestamps, so
you should be able to observe pauses, etc. In fact, if you look at the
ssl module source code, you'll see that the "dc" cache has its own C file,
and you should be able to add any additional logging you like.
If you think a particular call might contain the hang/pause/bug, try
putting some logging either side of the call and grep the resulting logs
after running. It's a bit mundane, but it can be a useful way to dig for
this kind of bug.
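If grepping by eye gets tedious, something like the following could flag
the pauses for you. It's only a rough sketch in Python: it assumes each log
line starts with a timestamp like "[Tue Oct 26 04:45:12 2004]", and the
sample lines, the "dc" marker text, the function name and the 5 second
threshold are all made up for illustration, so adjust them to whatever
your logs actually contain.

```python
import re
from datetime import datetime, timedelta

# Assumption: every log line of interest starts with a timestamp of the
# form "[Tue Oct 26 04:45:12 2004]". Adjust the pattern if yours differs.
STAMP = re.compile(r"^\[(\w{3} \w{3} [ \d]\d \d\d:\d\d:\d\d \d{4})\]")


def find_pauses(lines, needle="dc", min_gap=timedelta(seconds=5)):
    """Return (gap, earlier_line, later_line) for each pair of consecutive
    lines containing `needle` whose timestamps are at least `min_gap` apart."""
    pauses = []
    prev_time = prev_line = None
    for line in lines:
        match = STAMP.match(line)
        if not match or needle not in line:
            continue  # no timestamp, or not one of the lines we care about
        when = datetime.strptime(match.group(1), "%a %b %d %H:%M:%S %Y")
        if prev_time is not None and when - prev_time >= min_gap:
            pauses.append((when - prev_time, prev_line, line.rstrip()))
        prev_time, prev_line = when, line.rstrip()
    return pauses


# Hypothetical log lines, invented purely to exercise the function.
SAMPLE = [
    "[Tue Oct 26 04:45:10 2004] [debug] dc: hypothetical 'before call' note",
    "[Tue Oct 26 04:45:11 2004] [info] unrelated message",
    "[Tue Oct 26 04:45:40 2004] [debug] dc: hypothetical 'after call' note",
    "[Tue Oct 26 04:45:41 2004] [debug] dc: hypothetical follow-up note",
]

if __name__ == "__main__":
    for gap, before, after in find_pauses(SAMPLE):
        print(f"{gap} between:\n  {before}\n  {after}")
```

To run it against a real log, pass an open file instead of SAMPLE, e.g.
find_pauses(open("error_log")), where the file name is again just a
placeholder.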
(2) Try setting the "-idle" flag for dc_client and see whether that changes
the behaviour. That would help identify what is "hanging", or at least
eliminate certain links in the chain.
BTW, which version of Apache are you running?
> Oh and the error is less than verbose, it's just "cannot read data from
> server." Mozilla firefox, debian, etc
Yeah the browser certainly won't be able to say anything intelligent about
this (not that browsers are given to saying intelligent things at other
times, but that's another bleat for another time :-). The issue lies
somewhere between Apache's SSL module code and, working backwards from
there, whichever caching module is active. Is this on Unix with the
standard fork() model, or are you using Apache's threaded abomination on
Win32? (Which is no more abominable than IIS, mind you.)
> Strange stuff, but I think that's the issue?
Strange indeed ... :-( If you get no joy from the above and want something
else to try, perhaps build the latest release of distcache-1.5 and
rebuild and relink your Apache stuff against that. A lot of the networking
underbelly has been overhauled since 1.4.*, and though I doubt it, it's
still possible the "hang" is an I/O condition getting deadlocked. More
importantly, I haven't looked at the 1.4.* code for ages, and it would be
easier to debug issues in 1.5 and back-port from there if applicable.