Thread: [Memcacheddotnet-devel] Re: memcache exception
From: Tim G. <ti...@ge...> - 2006-04-01 01:56:31
I'm pretty confident the problem should be solved, and also that your out of memory errors may have been related to the client library. I'm not sure how much you've dug into the code, but basically the problem was twofold:

1) In the Java version, the socket object treats the input and output streams as two separate objects. However, in .NET the socket object uses the same stream for both input and output. So the TrueClose method of SockIO would sometimes have a problem with the code that looked like this:

    outstream.close();
    instream.close();
    socket.close();

After you close the outstream, you close the instream (which is already closed) and .NET didn't like that. But for some reason it didn't complain every time, only selectively. Probably has something to do with exception catching on threads that aren't the main thread, or perhaps it's part of problem #2, in that we didn't really reuse the sockets much (see below), so it didn't occur that often. But this was bubbling up to the top and that was your unhandled exception noted below. In any case I changed everything to use a single stream in the SockIO class and that should solve that problem.

2) In the AddSocketToPool (not quite sure of the name, I'm at home not at work) there's a check to make sure you're not adding it to a blank "host". So take the typical case of adding a socket to the available pool. SockIOPool would pass the SockIO object and the host string "memcached1:11211" to the function. First it checks to make sure the host string is not null. But that's where the bug was. I'm not sure if this bug's always been in there or if it was introduced during the big refactor, but basically it would make sure that the host WAS null. If it wasn't null, it would create a brand new pool to add the socket to. I think you can see where this goes... :( Basically on every single call you ended up creating a new available pool, and as a result you lost all the sockets that were in the old pool.
So for every add you would create a new Hashtable object, and every time the maintenance thread ran (I have mine set for 30 ms) it would create a whole new set of sockets for you.

Now the kicker was that this would also play into the first problem noted above, because you could have a socket object that you had a dangling reference to (because its available pool might disappear), but then it would be disposed (because it didn't belong to an active available pool), and then the memcachedclient library would call TrueClose on it, which would dispose an already disposed object! That's why your unhandled exceptions were scaling with your load: it was creating a bunch of new connections, and the likelihood of you running through this scenario would go up. #2 is probably where your out of memory errors were coming from.

Ugh. It really helps to have someone with such an extreme environment using this library 'cause it really highlights and magnifies when there's something wrong :). But I'm very confident that it's in good shape now and ready for some real load.

As a result of today's work the client library is friggin' fast now. The stress tests that I was doing (with MemcachedBench) were going about 3-10x faster after I made the bug fixes. I'd say the performance is finally on par with or better than some of the other stable client libraries listed on the memcached homepage.

As for your question about setup: if you're doing an ASP.NET application it probably doesn't make sense to make more than 25 connections, and even that might be a lot if you don't have a multi processor or multi core machine. The thread pool built into .NET by default uses 25 threads to process work requests (which ASP.NET uses to queue up requests to process), so I don't think it would make much sense to have more than that unless you've changed that setting in the machine.config file.

Anyway, thanks again for all your testing and telling me about the troubles you've been having.
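The inverted check described in point 2 can be illustrated with a small toy model. This is a language-neutral sketch in Python, not the library's C# code: the names `add_socket_buggy`, `add_socket_fixed`, `maintain`, and `sockets_opened` are all hypothetical, a dict stands in for the Hashtable, and the exact shape of the real check is an assumption based only on the description above.

```python
import itertools

_ids = itertools.count(1)

def new_socket(host):
    """Stand-in for opening a connection; returns a unique label (hypothetical)."""
    return f"{host}#{next(_ids)}"

def add_socket_fixed(avail, host, sock):
    """Create the per-host pool only when it does not exist yet."""
    if host not in avail:
        avail[host] = {}
    avail[host][sock] = True

def add_socket_buggy(avail, host, sock):
    """Inverted check: an existing pool is replaced, dropping its sockets."""
    if host in avail:                  # BUG: condition is the wrong way round
        avail[host] = {}               # every pooled socket for this host is lost
    avail.setdefault(host, {})[sock] = True

def maintain(avail, host, min_conns, add_socket):
    """One maintenance pass: top the pool up to min_conns.

    Returns how many new sockets had to be opened on this pass.
    """
    shortfall = max(0, min_conns - len(avail.get(host, {})))
    for _ in range(shortfall):
        add_socket(avail, host, new_socket(host))
    return shortfall

def sockets_opened(add_socket, passes=5, min_conns=3):
    """Sockets opened on each of several maintenance passes."""
    avail = {}
    host = "memcached1:11211"
    return [maintain(avail, host, min_conns, add_socket) for _ in range(passes)]

if __name__ == "__main__":
    print("fixed:", sockets_opened(add_socket_fixed))
    print("buggy:", sockets_opened(add_socket_buggy))
```

In this model the fixed version opens sockets only on the first pass, while the buggy version never holds more than one socket per host and so keeps opening new ones on every pass, which matches the connection churn described above.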
I'm very confident now that the library should run very well in your environment.

-Tim Gebhardt
ti...@ge...

p.s. If your logging is "crappy" you might want to consider using log4net. I've used it on several projects of mine (including the memcached library) and I'm in love with it. I would have never figured out these bugs today if I didn't have great logging, and great logging is a pain to do yourself.

Maxim Mass wrote:
> The object is disposed exception?? That'd be awesome if you can nail it! We also stress tested the app under all sorts of extreme conditions and really found no issues -- it's solid.
>
> We are running live on an asp.net app distributed over 25+ web servers. The frequency of the disposed object exception correlates to the traffic we're getting. The stack trace unfortunately is being cut off by our crappy logging facility.
>
> We're also getting sporadic out of memory exceptions on our web server for reasons unrelated to mc. I'm wondering if that out of memory exception is causing unexpected disposal of objects in the heap (such as sockets) and thus causing a disposed object exception shortly after.
>
> If you google the exception you'll see that the c# mysql client has a similar if not the same problem. According to mysql's bug tracker a workaround/fix is to catch all exceptions on the socket.close method. We put this 'hack' into prod and although it masks the true problem it may be a decent temporary solution.
>
> I'll gladly work on integrating the new version with any patches once you release them and let you know how it works. Please describe the bugs as verbosely as possible in the release notes ;)
>
> Thanks Tim!!
>
> Max
>
> On 3/31/06, *Tim Gebhardt* <ti...@ge... <mailto:ti...@ge...>> wrote:
>
> > Update:
> >
> > I think I managed to reproduce the exception that you've found. I'm trying to figure out why it occurs and hopefully I'll have a fix by the end of the day.
> >
> > -Tim
> >
> > ------------------------------------------------------------------------
> > *From:* Maxim Mass
> > *Sent:* Thursday, March 30, 2006 7:16 PM <mailto:le...@gm...>
> > *Subject:* memcache exception
> >
> > We've been getting this exception pretty often in our prod environment. Seems to happen more as more load gets put on it. There's def. a bug with the client -- I don't think this is server related at all.
> >
> > Here's a partial stack. Unfortunately, I don't have the full trace.
> >    at System.Net.Sockets.NetworkStream.Write(Byte[] buffer, Int32 offset, Int32 size)
> >    at System.IO.BufferedStream.FlushWrite()
> >    at System.IO.BufferedStream.Flush()
> >    at System.IO.BufferedStream.Close()
> >    at MemCached.clientlib.SockIO.TrueClose()
> >    at MemCache
> >
> > The exception.message is
> > Cannot access a disposed object named "System.Net.Sockets.NetworkStream". Object name: "System.Net.Sockets.NetworkStream".
> >
> > Have you been experiencing anything like this? Can you tell me the settings you've been using for the client?
> > We're still using the previous version, right before your fxcop changes. Do you know if that version fixes anything?
> >
> > Max
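The double-close failure discussed in this message can be modelled in a few lines. The sketch below is a toy in Python, not the library's C# code: `StrictStream`, `SharedStreamConnection`, `naive_close`, and `closes_cleanly` are hypothetical names, and `RuntimeError` merely stands in for the "Cannot access a disposed object" error. The fix Tim describes is switching SockIO to a single stream; the "closed" flag shown here is an extra guard added for illustration and is an assumption, not something the thread says the library does.

```python
class StrictStream:
    """Toy stream that, like the one in the stack trace, refuses a second close."""

    def __init__(self):
        self.disposed = False

    def close(self):
        if self.disposed:
            # Stand-in for the "Cannot access a disposed object" failure.
            raise RuntimeError("Cannot access a disposed object")
        self.disposed = True


def naive_close(stream):
    """Mimics closing 'outstream' and 'instream' when both are the same object."""
    stream.close()   # closes the shared stream
    stream.close()   # same stream again, so this raises


class SharedStreamConnection:
    """Connection that owns ONE stream for both reading and writing."""

    def __init__(self, stream):
        self._stream = stream
        self._closed = False

    def true_close(self):
        """Close the shared stream exactly once; later calls are no-ops.

        Returns True if this call actually closed the stream.
        """
        if self._closed:
            return False
        self._closed = True
        self._stream.close()
        return True


def closes_cleanly(action):
    """Run a close routine and report whether it finished without raising."""
    try:
        action()
        return True
    except RuntimeError:
        return False
```

Compared with catching every exception around the close call, which is the workaround Max mentions, a guard like this avoids the second close instead of hiding its failure, though it would not help if some other owner disposed the stream first.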
From: Tim G. <ti...@ge...> - 2006-04-05 16:24:29
I think I figured out what was wrong with the performance in the 1.1.* line and I'm redeploying a new version. Should be up in 10 minutes or so (Sourceforge file release process is pretty cumbersome).

You can check the changelog or the news when it's up for a detailed description of what was wrong, but my benchmarks go something like this now:

1.1.1: 10,000 gets/sets - 9.3s
1.1.2: 10,000 gets/sets - 8.2s

The other notable thing is that the CPU usage from 1.1.1 to 1.1.2 is down from 30% to ~10%.

Lei and Max, I BCC'ed you and sent this email to the developer's mailing list.

-Tim Gebhardt
ti...@ge...

-----Original Message-----
From: Lei Sun [mailto:le...@gm...]
Sent: Tuesday, April 04, 2006 12:47 PM
To: ti...@ge...
Cc: Maxim Mass
Subject: Re: memcache exception

Hi Tim,

We tried to do a stress test on all 3 versions of the client.

Conditions:
1) 2 memcached servers, both dual cpu, 4 GB memory
2) Client machine is p4 dual core 3.0G hyper threading
3) Basically a unit of our test consists of 100 set, 100 get, 100 getmulti, 1 flushall
4) We run a unit of test 300 times and 1000 times across all three versions of the client.

Here is what we found:

1.0.3 1.1.0 1.1.1
# of actions sec to finish/exceptions
30,000 26 30
30,000 25 30
30,000 25 29
100,000 81/0 97/30 101/0
100,000 94/0 109/47 102/0
100,000 82/0 111/24 97/0

Result:
1) 1.0.3 is actually the best in performance, network stats are steady and high, and no exceptions
2) 1.1.0 is the worst in performance, network stats are zigzaggy and lower, and lots of exceptions
3) 1.1.1's performance isn't that good, network stats are a little better than 1.1.0, but still very zigzaggy compared to 1.0.3, but no exceptions

Good thing that you have reduced the exceptions in 1.1.1 compared to 1.1.0! Would you please look into this slower performance issue?

Thanks
Lei

On 3/31/06, Tim Gebhardt <ti...@ge...> wrote:
> The new release is up. Let me know if you guys have any more problems.
>
> -Tim
>
> ________________________________
>
> From: max...@gm... [mailto:max...@gm...] On Behalf Of Maxim Mass
> Sent: Thursday, March 30, 2006 7:16 PM
> To: ti...@ge...
> Cc: le...@gm...
> Subject: memcache exception
>
> We've been getting this exception pretty often in our prod environment. Seems to happen more as more load gets put on it. There's def. a bug with the client -- I don't think this is server related at all.
>
> Here's a partial stack. Unfortunately, I don't have the full trace.
>    at System.Net.Sockets.NetworkStream.Write(Byte[] buffer, Int32 offset, Int32 size)
>    at System.IO.BufferedStream.FlushWrite()
>    at System.IO.BufferedStream.Flush()
>    at System.IO.BufferedStream.Close()
>    at MemCached.clientlib.SockIO.TrueClose()
>    at MemCache
>
> The exception.message is
> Cannot access a disposed object named "System.Net.Sockets.NetworkStream". Object name: "System.Net.Sockets.NetworkStream".
>
> Have you been experiencing anything like this? Can you tell me the settings you've been using for the client?
> We're still using the previous version, right before your fxcop changes. Do you know if that version fixes anything?
>
> Max
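For comparing the two benchmark figures in Tim's message, it can help to turn them into rates. The snippet below is only arithmetic on the quoted numbers, in Python. It assumes "10,000 gets/sets" means 10,000 operations in total, which the message does not spell out, and the helper names are made up for illustration.

```python
def ops_per_second(operations, seconds):
    """Average throughput over a run."""
    return operations / seconds


def percent_time_saved(old_seconds, new_seconds):
    """Reduction in wall-clock time, as a percentage of the old time."""
    return 100.0 * (old_seconds - new_seconds) / old_seconds


# Seconds per 10,000 gets/sets, as quoted in the message above.
runs = {"1.1.1": 9.3, "1.1.2": 8.2}

if __name__ == "__main__":
    for version, seconds in runs.items():
        print(f"{version}: {ops_per_second(10_000, seconds):.0f} ops/s")
    saved = percent_time_saved(runs["1.1.1"], runs["1.1.2"])
    print(f"time saved: {saved:.1f}%")
```

Under that assumption the figures work out to roughly 1,075 and 1,220 operations per second, or about 12% less wall-clock time for 1.1.2.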