memcacheddotnet-devel Mailing List for .NET memcached client library
Status: Beta
Archive: 2006: Jan (9), Feb (1), Mar (1), Apr (2), Nov (1); 2007: Jun (1), Jul (2), Nov (2); 2013: Dec (1)
From: Tim Lovell-S. <til...@mi...> - 2013-12-06 18:25:53
I know there are a lot of NuGet packages for memcached already; I just thought one more wouldn't hurt. And it's the popular way of consuming libraries these days. :)

If you're interested in principle, but not really interested in the work involved, I could publish one for you using binaries downloaded from SourceForge. Just let me know.

Tim
From: John S. <joh...@gm...> - 2007-11-13 03:00:12
Tim,

Thanks for the clarification. I should have known something like that wouldn't be neglected by the server. Didn't they teach us in school that it would be impractical to lock down resources on the client side? Heh, my college Networks prof would have a fit if he heard me ask this.

John

On Nov 12, 2007 8:50 PM, Tim Gebhardt <ti...@ge...> wrote:
> Hi John,
>
> Locking is handled by the memcached server. I believe it does optimistic
> locking for concurrency, but a quick check on the memcached homepage will
> confirm that.
>
> -Tim Gebhardt
> ti...@ge...
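To illustrate the server-side guarantee Tim is pointing at: each set ships the complete value, and the server applies it as a single atomic item replacement, so a concurrent reader sees either the whole old value or the whole new one, never a half-updated mix. A minimal raw-protocol transcript (per the protocol spec, not this client's API; server replies are STORED, VALUE and END):

```
set greeting 0 0 5
hello
STORED
get greeting
VALUE greeting 0 5
hello
END
```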
From: John S. <joh...@gm...> - 2007-11-12 22:08:29
Hi,

Looking at the client code there doesn't seem to be any object locking/concurrency checking. I'm wondering if this is on purpose and left as an exercise for specific implementations? For example, in the middle of a thread calling set() on a large object, what prevents the half-updated object from being fetched by another reader thread? Are there locks on memcached? (win32 port)

Thanks,
John
From: Tim G. <ti...@ge...> - 2007-07-25 14:17:44
Hi John,

According to the memcached protocol (which you can find here: http://cvs.danga.com/browse.cgi/wcmtools/memcached/doc/protocol.txt?rev=HEAD) every set command to the server goes like this:

set <key> <flags> <exptime> <bytes>\r\n
<data>

The byte[0] you're referring to is part of the <flags> portion of the command. We use this portion to mark the type of the data (whether it's a simple data type such as bool, int, string, etc. or a serialized object graph) and whether or not it's compressed. If it's a value type such as int, bool, etc. it's much faster and much more space efficient to store the raw bytes and indicate it's just a simple type, rather than serialize the data with the .NET serialization machinery. Most other memcached client APIs use the <flags> field in this way.

Does this help? Do you have any specific concerns about this, or are you just curious?

-Tim

On 7/24/07, John. H <xia...@gm...> wrote:
>
> Hello, Tim:
> I am using your C# client for memcached,
> and I have a question to ask you.
> I found that at the beginning of every item value you add something such as
> 'byte[0] = BOOLMAKER'.
> Can you tell me what this is used for?
>
> Best Regards
> Thanks a lot
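To make Tim's description of the <flags> field concrete, here is a minimal sketch of how a client might pack a type marker and a compression bit into that field (16-bit in the protocol spec of the day). The constant names, values and layout are illustrative, not the library's actual ones:

```csharp
using System;

static class FlagsSketch
{
    // Hypothetical layout: low bits carry a type code, the high bit marks
    // compression. The real client's constants may differ.
    const ushort TypeSerialized = 0;       // serialized object graph (default)
    const ushort TypeBool       = 1;       // raw single byte
    const ushort TypeInt32      = 2;       // raw 4 bytes, no serializer overhead
    const ushort TypeString     = 3;       // raw UTF-8 bytes
    const ushort FlagCompressed = 1 << 15; // payload is compressed

    // Encode an int as its raw bytes and report the flags a reader needs
    // in order to decode it again.
    static byte[] EncodeInt32(int value, out ushort flags)
    {
        flags = TypeInt32;
        return BitConverter.GetBytes(value);
    }

    static void Main()
    {
        byte[] payload = EncodeInt32(42, out ushort flags);
        Console.WriteLine($"{payload.Length} bytes, flags={flags}");
    }
}
```

Storing 4 raw bytes instead of a BinaryFormatter graph is where the space and speed win Tim mentions comes from.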
From: Josef F. <car...@gm...> - 2007-07-06 12:21:32
I've been using the client for a couple of months now and have encountered some issues that lead me to wonder if I'm implementing the client wrong. One issue is in the SockIOPool.Start method. I was getting failures that I coded around using:

public void Start()
{
    _stopThread = false;
    //_thread.Start();
    try
    {
        _thread.Start();
    }
    catch
    {
        _thread = new Thread(new ThreadStart(Maintain));
        _thread.Start();
    }
}

Has anyone else seen this behavior?

Josef
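A likely explanation, though this is inferred rather than confirmed from the library source: a System.Threading.Thread can only be started once, and calling Start() on a thread that has already run throws a ThreadStateException. A pool that is stopped and restarted therefore has to allocate a fresh Thread each time, roughly like this (Maintain, _thread and _stopThread mirror the names in Josef's snippet):

```csharp
using System.Threading;

// Sketch of a restartable maintenance loop.
class MaintenanceRunner
{
    private Thread _thread;
    private volatile bool _stopThread;

    public void Start()
    {
        _stopThread = false;
        // A Thread that has already run can never be Start()ed again
        // (it throws ThreadStateException), so allocate a fresh one
        // whenever the previous thread is not in its pristine state.
        if (_thread == null || _thread.ThreadState != ThreadState.Unstarted)
            _thread = new Thread(new ThreadStart(Maintain));
        _thread.Start();
    }

    public void Stop() => _stopThread = true;

    private void Maintain()
    {
        while (!_stopThread)
            Thread.Sleep(5000); // pool upkeep would go here
    }
}
```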
From: Tim G. <ti...@ge...> - 2007-06-07 15:49:33
(from the changelog)

7 June 2007 (TG)
-Bugfix. Applied a patch from a contributor (thanks Dave Peckham!) that fixes a bug where we didn't get the proper size of the buffer. In many cases the code would still work fine, but you may have experienced periodic errors when working with very large objects, or seen wasted space on your memcached servers.

-Tim Gebhardt
ti...@ge...
From: Ayende R. <ay...@ay...> - 2006-11-20 10:12:29
Hi,

I am one of the developers of NHibernate, and one of our cache implementations makes use of the memcached client. We have recently moved to log4net 1.2.10, but there is no version of the memcached client available that works with log4net 1.2.10. Is it possible to release a version that does?

Specifically, the issue is with the strong name key, which I would like to keep the same (to avoid a repeat of the log4net 1.2.9 -> 1.2.10 public key change fiasco).
From: Tim G. <ti...@ge...> - 2006-04-05 16:24:29
I think I figured out what was wrong with the performance in the 1.1.* line and I'm redeploying a new version. It should be up in 10 minutes or so (the SourceForge file release process is pretty cumbersome). You can check the changelog or the news when it's up for a detailed description of what was wrong, but my benchmarks go something like this now:

1.1.1: 10,000 gets/sets - 9.3s
1.1.2: 10,000 gets/sets - 8.2s

The other notable thing is that CPU usage from 1.1.1 to 1.1.2 is down from 30% to ~10%.

Lei and Max, I BCC'ed you and sent this email to the developer's mailing list.

-Tim Gebhardt
ti...@ge...

-----Original Message-----
From: Lei Sun [mailto:le...@gm...]
Sent: Tuesday, April 04, 2006 12:47 PM
To: ti...@ge...
Cc: Maxim Mass
Subject: Re: memcache exception

Hi Tim,

We tried to do a stress test on all 3 versions of the client.

Conditions:
1) 2 memcached servers, both dual-CPU, 4 GB memory
2) The client machine is a P4 dual-core 3.0 GHz with hyper-threading
3) Basically a unit of our test consists of 100 set, 100 get, 100 getmulti, 1 flushall
4) We ran a unit of the test 300 times and 1,000 times across all three versions of the client.

Here is what we found (sec to finish/exceptions):

# of actions    1.0.3    1.1.0    1.1.1
30,000          26       30
30,000          25       30
30,000          25       29
100,000         81/0     97/30    101/0
100,000         94/0     109/47   102/0
100,000         82/0     111/24   97/0

Result:
1) 1.0.3 is actually the best in performance; network stats are steady and high, and no exceptions
2) 1.1.0 is the worst in performance; network stats are zigzaggy and lower, and lots of exceptions
3) 1.1.1's performance isn't that good; network stats are a little better than 1.1.0, but still very zigzaggy compared to 1.0.3, though no exceptions

Good thing that you have reduced the exceptions in 1.1.1 compared to 1.1.0! Would you please look into this slower performance issue?

Thanks
Lei

On 3/31/06, Tim Gebhardt <ti...@ge...> wrote:
>
> The new release is up. Let me know if you guys have any more problems.
>
> -Tim
>
> From: max...@gm... [mailto:max...@gm...] On Behalf Of Maxim Mass
> Sent: Thursday, March 30, 2006 7:16 PM
> To: ti...@ge...
> Cc: le...@gm...
> Subject: memcache exception
>
> We've been getting this exception pretty often in our prod environment.
> It seems to happen more as more load gets put on it. There's definitely a
> bug with the client -- I don't think this is server related at all.
>
> Here's a partial stack. Unfortunately, I don't have the full trace.
>   at System.Net.Sockets.NetworkStream.Write(Byte[] buffer, Int32 offset, Int32 size)
>   at System.IO.BufferedStream.FlushWrite()
>   at System.IO.BufferedStream.Flush()
>   at System.IO.BufferedStream.Close()
>   at MemCached.clientlib.SockIO.TrueClose()
>   at MemCache
>
> The exception message is:
> Cannot access a disposed object named "System.Net.Sockets.NetworkStream".
> Object name: "System.Net.Sockets.NetworkStream".
>
> Have you been experiencing anything like this? Can you tell me the settings
> you've been using for the client?
> We're still using the previous version right before your fxcop changes.
> Do you know if that version fixes anything?
>
> Max
From: Tim G. <ti...@ge...> - 2006-04-01 01:56:31
I'm pretty confident the problem should be solved, and also that your out of memory errors may have been related to the client library. I'm not sure how much you've dug into the code, but basically the problem was twofold:

1) In the Java version, the socket object treats the input and output streams as two separate objects. However, in .NET the socket object uses the same stream for both input and output. So the TrueClose method of SockIO would sometimes have a problem with code that looked like this:

outstream.close();
instream.close();
socket.close();

After you close the outstream, you close the instream (which is already closed), and .NET didn't like that. But for some reason it didn't complain every time, only selectively. It probably has something to do with exception catching on threads that aren't the main thread, or perhaps it's part of problem #2: we didn't really reuse the sockets that much (see below), so it didn't occur that often. But this was bubbling up to the top, and that was your unhandled exception noted below. In any case I changed everything to use a single stream in the SockIO class, and that should solve that problem.

2) In AddSocketToPool (not quite sure of the name, I'm at home not at work) there's a check to make sure you're not adding it to a blank "host". So take the typical case of adding a socket to the available pool. SockIOPool would pass the SockIO object and the host string "memcached1:11211" to the function. First it checks to make sure the host string is not null. But that's where the bug was. I'm not sure if this bug's always been in there or if it was introduced during the big refactor, but basically it would make sure that the host WAS null. If it wasn't null, it would create a brand new pool to add the socket to. I think you can see where this goes... :( Basically for every single call you end up creating a new available pool, and as a result you lose all the sockets that were in the old one. So for every add you would create a new Hashtable object, and every time the maintenance thread ran (I have mine set for 30 ms) it would create a whole new set of sockets for you.

Now the kicker was that this would also play into the first problem noted above, because you could have a socket object that you had a dangling reference to (because its available pool might disappear), but then it would be disposed (because it didn't belong to an active available pool), and then the memcachedclient library would call TrueClose on it, which would dispose an already disposed object! That's why your unhandled exceptions were scaling with your load: the library was creating a bunch of new connections, and the likelihood of running through this scenario went up with traffic. #2 is probably where your out of memory errors were coming from. Ugh.

It really helps to have someone with such an extreme environment using this library 'cause it really highlights and magnifies when there's something wrong :). But I'm very confident that it's in good shape now and ready for some real load. As a result of today's work the client library is friggin' fast now. The stress tests that I was doing (with MemcachedBench) were going about 3-10x faster after I made the bug fixes. I'd say the performance is finally on par with or better than some of the other stable client libraries listed on the memcached homepage.
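A sketch of the inverted check Tim describes, reconstructed purely from his description (the method and field names are approximations, not the library's actual code):

```csharp
using System;
using System.Collections;

static class PoolSketch
{
    // Buggy shape: the null check ran in the wrong sense, so every add
    // replaced the per-host pool and orphaned all previously pooled sockets:
    //
    //     if (host != null)
    //         pools[host] = new Hashtable();   // new pool on EVERY add
    //
    // Fixed shape: create the per-host pool only when it doesn't exist yet.
    static void AddSocketToPool(Hashtable pools, string host, object socket)
    {
        if (host == null) throw new ArgumentNullException("host");

        Hashtable perHost = (Hashtable)pools[host];
        if (perHost == null)
        {
            perHost = new Hashtable();
            pools[host] = perHost;
        }
        perHost[socket] = DateTime.Now; // remember last-used time
    }
}
```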
As for your question about setup: if you're doing an ASP.NET application it probably doesn't make sense to make more than 25 connections, and even that might be a lot if you don't have a multi-processor or multi-core machine. The thread pool built into .NET by default uses 25 threads to process work requests (which ASP.NET uses to queue up requests to process), so I don't think it would make much sense to have more than that unless you've changed that setting in the machine.config file.

Anyway, thanks again for all your testing and for telling me about the troubles you've been having. I'm very confident that the library should run very well in your environment now.

-Tim Gebhardt
ti...@ge...

p.s. If your logging is "crappy" you might want to consider using log4net. I've used it on several projects of mine (including the memcached library) and I'm in love with it. I would never have figured out these bugs today if I didn't have great logging, and great logging is a pain to do yourself.

Maxim Mass wrote:
> The object is disposed exception?? That'd be awesome if you can nail
> it! We also stress tested the app under all sorts of extreme
> conditions and really found no issues -- it's solid.
>
> We are running live on an asp.net app distributed
> over 25+ web servers. The frequency of the disposed object exception
> correlates to the traffic we're getting. The stack trace
> unfortunately is being cut off by our crappy logging facility.
>
> We're also getting sporadic out of memory exceptions on our web servers
> for reasons unrelated to mc. I'm wondering if that out of memory
> exception is causing unexpected disposal of objects in the heap (such
> as sockets) and thus soon causing a disposed object exception shortly
> after.
>
> If you google the exception you'll see that the C# mysql client has a
> similar if not the same problem. According to mysql's bug tracker a
> workaround/fix is to catch all exceptions on the socket.close method.
> We put this 'hack' into prod and although it masks the true problem
> it may be a decent temporary solution.
>
> I'll gladly work on integrating the new version with any patches once
> you release them and let you know how it works. Please describe the
> bugs as verbosely as possible in the release notes ;)
>
> Thanks Tim!!
>
> Max
>
> On 3/31/06, Tim Gebhardt <ti...@ge...> wrote:
>
> Update:
>
> I think I managed to reproduce the exception that you've found.
> I'm trying to figure out why it occurs and hopefully I'll have a
> fix by the end of the day.
>
> -Tim
>
> From: Maxim Mass
> Sent: Thursday, March 30, 2006 7:16 PM
> Subject: memcache exception
>
> We've been getting this exception pretty often in our prod
> environment. It seems to happen more as more load gets put on it.
> There's definitely a bug with the client -- I don't think this is
> server related at all.
>
> Here's a partial stack. Unfortunately, I don't have the full trace.
>   at System.Net.Sockets.NetworkStream.Write(Byte[] buffer, Int32 offset, Int32 size)
>   at System.IO.BufferedStream.FlushWrite()
>   at System.IO.BufferedStream.Flush()
>   at System.IO.BufferedStream.Close()
>   at MemCached.clientlib.SockIO.TrueClose()
>   at MemCache
>
> The exception message is:
> Cannot access a disposed object named "System.Net.Sockets.NetworkStream".
> Object name: "System.Net.Sockets.NetworkStream".
>
> Have you been experiencing anything like this? Can you tell me the
> settings you've been using for the client?
> We're still using the previous version right before your fxcop
> changes. Do you know if that version fixes anything?
>
> Max
From: Tim G. <ti...@ge...> - 2006-03-31 22:27:51
31 March 2006 (TG)
-Bugfix. There were problems where the busy and available pools were continually being created over and over again. This has been fixed, and not only does this solve a very subtle outstanding bug, but it also results in a massive performance increase!
-Bugfix. Fixed the code where we treated the input and output streams of a SockIO object as two different objects when .NET just treats them as a single object. This was a holdover from Java.

The code isn't in CVS yet (but it is in the file release) because it looks like the SourceForge CVS server is down, or doesn't want me to connect.
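For the second fix, the shape of the change is roughly this: .NET's NetworkStream is bidirectional, so SockIO only needs, and should only close, one stream. A simplified sketch, not the library's actual code:

```csharp
using System.Net.Sockets;

// Simplified: the real SockIO class has much more to it.
class SockIOSketch
{
    private readonly Socket _socket;
    private readonly NetworkStream _stream; // used for BOTH reads and writes

    public SockIOSketch(Socket socket)
    {
        _socket = socket;
        _stream = new NetworkStream(socket);
    }

    public void TrueClose()
    {
        _stream.Close(); // one stream, closed exactly once
        _socket.Close();
    }
}
```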
From: Tim G. <ti...@ge...> - 2006-02-01 06:07:34
Max,

This is actually a known issue with VS.NET 2003: it can't compile if there are binaries and debugger symbols hanging around from VS.NET 2005. Delete all of the 'bin' directories from all the projects and rebuild. I can't find the MSDN blog entry about it, but that should do it.

My email has been down for a couple of days because I am switching my host and the domain got screwed up. For some reason in their system they put a <space> character before my domain name: 'tim@ gebhardtcomputing.com', which screwed up everything. Hopefully it should be good now. But that's why it took a few days to reply. I just remembered to check the sourceforge archive to make sure nothing has blown up yet with the new release :).

-Tim
From: Maxim M. <mm...@um...> - 2006-01-31 01:43:44
Hey Tim, I'm having problems compiling the new library with the 1.1 framework.. Weird error -- never seen anything like it.

C:\Projects\web\Shared\memcached\memcacheddotnet_clientlib-1.1.0\memcacheddotnet\src\clientlib\CRCTool.cs(3): Identifier 'Memcached' differing only in case is not CLS-compliant

All sorts of other weird compile errors as well.. I tried using the old csproj file but still no luck.. Not sure what's going on. Can you see if this works for you in VS2003?

Btw, something is up with your email Tim. I'm getting a relay access denied error from gmail.

Max
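For readers who haven't hit this compiler error before, it fires when a CLS-compliant assembly exposes two identifiers that differ only by letter case. A contrived reproduction (my own example, not the library's actual clash):

```csharp
using System;

[assembly: CLSCompliant(true)]

// Two public identifiers differing only in case trip the CLS rules,
// because case-insensitive languages like VB.NET cannot tell them apart.
public class Memcached { }
public class memcached { } // CS3005-style error: differs only in case
```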
From: Tim G. <ti...@ge...> - 2006-01-24 18:51:48
After a huge refactoring of the client library to match the coding standards recommended by Microsoft, and some excellent bug hunting by Maxim Mass, I am confident that the library is not only stable to use, but stable to code against (as in the API shouldn't be changing very much now). I am changing this project's status to "Beta" so developers that stumble across the project feel a little bit more confident about using it.

On a related note, we may get more developers stumbling across it ever since it has been listed in the Client APIs section of the memcached website: http://www.danga.com/memcached/apis.bml

w00t! Thanks Max for all your help!

-Tim Gebhardt
ti...@ge...

Quidquid latine dictum sit, altum sonatur.
Whatever is said in Latin sounds profound.
From: Tim G. <ti...@ge...> - 2006-01-24 18:45:20
23 Jan 2006 (TG)
-Huge overhaul of code. Basically ran the client library through FxCop (http://www.gotdotnet.com/team/fxcop/) and tried to fix every error. There are breaking changes, but this should be the one and only time this ever happens.
-Internationalized the exception messages and log messages.
-Added 2 FxCop projects (one for .NET 1.1 and one for .NET 2.0) and a custom dictionary file to add our custom words for FxCop.
-Removed the NestedIOException class because the .NET IOException class can do inner exceptions.
-Split off the SockIO class into its own source file.

-Tim Gebhardt
ti...@ge...
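On the NestedIOException removal: the point is that the framework's exception type already supports chaining, so a custom wrapper adds nothing. For example:

```csharp
using System;
using System.IO;

static class Rethrow
{
    static void WrapAndThrow(Exception cause)
    {
        // IOException carries an inner exception directly, which is all
        // the removed NestedIOException class existed to provide.
        throw new IOException("error writing to memcached socket", cause);
    }
}
```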
From: Tim G. <ti...@ge...> - 2006-01-18 16:31:42
Overriding the GetHashCode method was a holdover from when I ported the library from the Java version. I think you may be right that we don't need it, so I removed it. I will redeploy the download today with three new items:

-Enabled log4net to work in the MemCachedBench program (or any other program interested in logging memcacheddotnet).
-Removed the GetHashCode method from the SockIO class.
-Fixed a bug closing down the connection pool that would occur because I was removing items from a hashtable while iterating over it.

-Tim

From: mem...@li... [mailto:mem...@li...] On Behalf Of Maxim Mass
Sent: Wednesday, January 18, 2006 1:05 AM
To: ti...@ge...
Cc: mem...@li...
Subject: [Memcacheddotnet-devel] Re: memcache patch

After a bit more digging, this seems to be happening because the socket is not removed from the availPool. The reason it's not removed is because the object's hash code changes as a result of the TrueClose method. The GetHashCode method is overridden in SockIO and changes based on the private Socket variable. Once the socket is closed, GetHashCode returns 0. Was there a reason for overriding the default GetHashCode method? It appears to be working fine without it.

Max

On 1/17/06, Maxim Mass <mm...@um...> wrote:

There appears to be another bug in that maintenance thread. It's still not closing connections, but this time due to a different exception. This line:

DateTime expire = (DateTime) sockets[ socket ];

throws a null ref exception since that socket doesn't exist in the hash. socket comes from a foreach loop that iterates over all the keys... So looking at this for about 10 mins now, it seems like there are a couple of issues.. For some reason, it's iterating over keys (sockets) that have already been removed. Or it's not removing and closing correctly.

I'm going to look at this further, but let me know if you come up with a fix for it.

Max

On 1/17/06, Tim Gebhardt <ti...@ge...> wrote:

The patch looks good. The timing stuff is a lot easier to read now. I redeployed the SourceForge.net download.

-Tim

From: max...@gm... [mailto:max...@gm...] On Behalf Of Maxim Mass
Sent: Monday, January 16, 2006 9:04 PM
To: ti...@ge...
Subject: memcache patch

I fixed a few things in the SockIOPool.. Take a look and sanity check this please..
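The failure mode Maxim describes generalizes: a Hashtable locates entries by hash code, so a key whose GetHashCode depends on mutable state can no longer be found (or removed) once that state changes. A minimal, self-contained illustration with hypothetical names:

```csharp
using System;
using System.Collections;

class SockIO
{
    private object _socket = new object();
    public void TrueClose() { _socket = null; }

    // The removed override, reconstructed for illustration only:
    public override int GetHashCode()
        => _socket == null ? 0 : _socket.GetHashCode();
}

class Demo
{
    static void Main()
    {
        var pool = new Hashtable();
        var sock = new SockIO();
        pool[sock] = DateTime.Now;     // stored under the original hash
        sock.TrueClose();              // hash code silently changes to 0
        pool.Remove(sock);             // probes the wrong bucket...
        Console.WriteLine(pool.Count); // ...so the entry typically leaks: 1
    }
}
```

With the override gone, Object.GetHashCode stays stable for the lifetime of the instance and removal works as expected.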
From: Maxim M. <mm...@um...> - 2006-01-18 07:05:29
After a bit more digging, this seems to be happening because the socket is not removed from the availPool. The reason it's not removed is because the object's hash code changes as a result of the TrueClose method. The GetHashCode method is overridden in SockIO and changes based on the private Socket variable. Once the socket is closed, GetHashCode returns 0.

Was there a reason for overriding the default GetHashCode method? It appears to be working fine without it.

Max

On 1/17/06, Maxim Mass <mm...@um...> wrote:
>
> There appears to be another bug in that maintenance thread. It's still not
> closing connections, but this time due to a different exception. This line:
>
> DateTime expire = (DateTime) sockets[ socket ];
>
> throws a null ref exception since that socket doesn't exist in the hash.
> socket comes from a foreach loop that iterates over all the keys... So
> looking at this for about 10 mins now, it seems like there are a couple of issues..
> For some reason, it's iterating over keys (sockets) that have already
> been removed. Or it's not removing and closing correctly.
>
> I'm going to look at this further, but let me know if you come up with a
> fix for it.
>
> Max
>
> On 1/17/06, Tim Gebhardt <ti...@ge...> wrote:
> >
> > The patch looks good. The timing stuff is a lot easier to read now. I
> > redeployed the SourceForge.net download.
> >
> > -Tim
> >
> > From: max...@gm... [mailto:max...@gm...] On Behalf Of Maxim Mass
> > Sent: Monday, January 16, 2006 9:04 PM
> > To: ti...@ge...
> > Subject: memcache patch
> >
> > I fixed a few things in the SockIOPool.. Take a look and sanity check
> > this please..
From: Maxim M. <mm...@um...> - 2006-01-17 21:52:35
There appears to be another bug in that maintenance thread. It's still not closing connections, but this time due to a different exception. This line:

DateTime expire = (DateTime) sockets[ socket ];

throws a null ref exception since that socket doesn't exist in the hash. socket comes from a foreach loop that iterates over all the keys... So looking at this for about 10 mins now, it seems like there are a couple of issues.. For some reason, it's iterating over keys (sockets) that have already been removed. Or it's not removing and closing correctly.

I'm going to look at this further, but let me know if you come up with a fix for it.

Max

On 1/17/06, Tim Gebhardt <ti...@ge...> wrote:
>
> The patch looks good. The timing stuff is a lot easier to read now. I
> redeployed the SourceForge.net download.
>
> -Tim
>
> From: max...@gm... [mailto:max...@gm...] On Behalf Of Maxim Mass
> Sent: Monday, January 16, 2006 9:04 PM
> To: ti...@ge...
> Subject: memcache patch
>
> I fixed a few things in the SockIOPool.. Take a look and sanity check this
> please..
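The fix Tim lists in his 2006-01-18 message above (he was removing items from a hashtable while iterating over it) is the classic remedy for this: snapshot the keys before mutating. A minimal sketch with illustrative names:

```csharp
using System;
using System.Collections;

static class MaintenanceSketch
{
    // Copy the keys first so entries can be removed without invalidating
    // the enumeration of the hashtable itself.
    static void CloseExpired(Hashtable sockets, TimeSpan maxIdle)
    {
        ArrayList keys = new ArrayList(sockets.Keys);
        foreach (object socket in keys)
        {
            object expire = sockets[socket];
            if (expire == null)
                continue; // already removed elsewhere
            if (DateTime.Now - (DateTime)expire > maxIdle)
                sockets.Remove(socket); // safe: we enumerate the snapshot
        }
    }
}
```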
From: Tim G. <ti...@ge...> - 2006-01-17 17:49:44
Hi Max,

I can't really install AIM at work, otherwise I don't think I would get much done :-). However, I am on Skype; you can look me up by my email address.

Log4net needs a line in AssemblyInfo.cs to enable logging on the assembly. I added it to the bench program and it will be included in the next release.

-Tim

From: mem...@li... [mailto:mem...@li...] On Behalf Of Maxim Mass
Sent: Monday, January 16, 2006 5:54 PM
To: ti...@ge...
Cc: mem...@li...
Subject: [Memcacheddotnet-devel] Re: Memcached 1.0

Hi Tim, I got your message (on my phone though). You're probably at work now.. Do you have aim access there? If so, sign on! :)

While trying to get logging to work, I think I found another bug.. We're getting type cast exceptions in the selfMaint() method as we look up the expiration time. TimeSpan.TotalMilliseconds returns a double rather than a long. One way to correct this is to simply use the DateTime.Now object as the value of the socket timeout vs. converting it to a long/double. This will make the code a little bit more readable and shouldn't be a performance hit (it's a value type anyway that is just holding a # of ticks).

I'm not certain what the extent of this bug is and what the symptoms are, but seemingly the result is that sockets aren't closing when they should be, since the exception breaks it out of the loop immediately.

Also, does logging for your simple benchmark project work for you? I'm not sure why but I can't get any logging whatsoever. I've mostly used nlog and never log4net, but I've tried a few different app.config examples and still no luck. Any ideas?

Max

On 1/13/06, Tim Gebhardt <ti...@ge...> wrote:

Yes, I noticed the exceptions as well. I've broadly narrowed it down to the TCP connections timing out. I would really like to squash that if I could.

As time frees up, here are my three development tasks that I would like to tackle:

-Clean up the API (the longer it goes on like this, the more of a pain it will be to change later)
-Fix those exceptions that occur when the sockets time out
-Make an App.config section handler so we can configure all of the options for the client API in the App or Web.config file (I just thought of this one today).

And wow, that's quite a setup you have there. Ours isn't quite as big as that right now, but there is a good possibility that we'll need to scale out that big, so we're planning ahead :-).

As for an explanation of how the client handles failover:

The socket pool keeps persistent connections to each of the memcached servers. When the client loses a connection to one, it tries to reconnect after the reconnect timeout. If it doesn't connect, it doubles the timeout and tries again. It keeps doubling the timeout every time it fails to connect. If it ever reconnects, the timeout length is reset. This works really well because if there is a slight network "hiccup" you'll reconnect very quickly, but if the node goes down for a very long time the client quickly ignores it. The failover code could use a little more work, and it's all related to those exceptions. There isn't really any "redundancy", only failover. But the failover process works pretty well, and it's how the other clients I looked at handle it.

Are you familiar with how memcached works? It's really just a hashtable of hashtables. The first level of hashing decides "what server does this go on?"; then the next level of hashing happens on the server and it says "where does this item go in my hashtable?" Since all the clients use the same hashing algorithm, they all end up coming up with the same values. That's one thing I had to explain to some of the devs on my team. It's only a cache, it's not a persistent store. If one of the nodes goes down, there will be a small hiccup in our web application because most of the stuff will have to be re-cached.

Unfortunately I haven't had time to tune and performance test our cache yet, so we're just using the default values right now for our limited beta test of our system. We're a pretty small development shop and I'm pretty pressed for time just trying to add features and squash bugs in the application. Unfortunately the users can't see how cool memcached is; they only see how cool the interface for our application looks, and that gets priority.

I'm copying this email to the developer mailing list. I'll IM you my AIM screenname.

Have a great weekend and drink up!

-Tim

This sounds great. It's 'alpha' but it's working great for us so far. We've been stable with our patched client for a few days now, running on 25+ web servers with 10 memcache boxes, with just a small handful of unhandled exceptions. I've recently reimaged our machines from a 2.4 kernel to an SMP 2.6 kernel. Load on each box went from ~85% to less than 5%! This was extremely encouraging and paves the way to more extensive use of memcache throughout our site.

Go ahead and post the bugs and emails -- anything that gets more people to download the library and more eyes on the code would be great.

I'd like to gain a bit more insight into some aspects of the client.. One question I get frequently is about redundancy. From looking at the code, it seems like if a node goes down then it's marked as down and the next server on the list is used instead. Since every client performs the same check and goes on to the same next server, there's barely a performance hit. Then the server is checked periodically (at increasing time intervals?) until it comes back up. Is this about right? Can you describe this process any further? Do you have some ideas for improving it?

What do you think is a good value to use for max idle connections? I currently have roughly 2000 simultaneous connections to each memcache node, and I'm pretty sure that most of these are idle sockets in client pools -- though I'm not certain how many are actually used. It'd be cool to be able to see various client stats to get more transparency into things. I think the perl client does something like it already. I could take a look at this closer next week and get some more concrete ideas together.

This email is getting long... are you on aim or any im? i'm **** on aim or msn: ****

gonna go out drinking.. ttyl! :)

Max

Max,

I made the changes to the library and re-uploaded the binaries. I also have two sets of projects/solutions: one for VS.NET 2003 and .NET 1.1, and one for VS.NET 2005 and .NET 2.0. They both run off the exact same source files (although I think I may make the 2.0 version work with the native GZIP stuff new in the 2.0 framework at some point).

Keep in mind though that technically the project is still in alpha, mainly because the API isn't very clean. I would like to clean it up to adhere more to the .NET coding standards. It would be pretty easy to change any code (mostly just going from lowercase stuff to uppercase and small stuff like that), but I'll make sure to put any changes in the changelog.

-Tim

p.s. Would you mind if I copied this message to the development mailing list for the project so that it looks like there is some activity? It might help us out if some other people have an indication that something is going on with this project.

I'll see if I can send diffs later on when I have a chance, but for the first bug I changed the sleeps to this:

Thread.Sleep((int)interval);
and
Thread.Sleep((int)interval * 10) in the catch

and the while loop changed to this:

while ((count = gzi.Read(tmp, 0, 2048)) != 0)

please confirm that -1 really is never returned though (or why it would be), but we've had no probs so far.

Regarding 2.0, we are using it for some projects but not for those using this client, so it'd be great if you can maintain two branches if you start using 2.0-only stuff.

We're using this pretty heavily at this point and I'll be sure to let you know as soon as we find new issues.

Max

Hi Max,

Hey, thanks for the extra set of eyes. Yes, the nanoseconds stuff caused a lot of small errors when I was porting it from Java (which uses milliseconds).

If you made any changes, would it be possible to send me a diff of your project? I could incorporate them (and give you credit) and repost the project.

On another note, are you using .NET 1.1 or 2.0? One thing I would like to do is move the project to .NET 2.0 because the library performs so much faster (I think serialization in .NET 2.0 has been much improved). If not, I'll make sure to keep both project files around and build it for both frameworks.

-Tim

Hi Tim, great work with the memcached C# client -- been using it over here in production with a lot of good results. I've had to modify the client so far in a couple of ways to make it work and wanted to let you know so you could consider fixing it.

In SockIOPool.Maintain(): when you pass a TimeSpan into Thread.Sleep you should instead be just passing in that number of milliseconds (5000). The TimeSpan constructor takes ticks (100-nanosecond units). This causes 100% CPU time since it's polling way faster than it should.

In Memcacheclient.LoadItems you have a while loop that reads until gzi.Read returns -1. I'm not sure if it ever returns -1, but looking at the zip code, it does return 0. This again was causing 100% CPU time as it never left this tight loop.

Please keep up the awesome work you're doing and I will let you know if I find other issues (I suspected the maintenance thread has a bug somewhere causing exceptions but haven't nailed it yet).

Max
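The AssemblyInfo.cs line Tim mentions at the top of this message isn't shown in the thread; log4net's standard assembly-level hook (from log4net's own documentation) looks like this, so something along these lines is presumably what was added:

```csharp
// Tells log4net to read its configuration from the app.config file and
// watch it for changes; goes in AssemblyInfo.cs of the host application.
[assembly: log4net.Config.XmlConfigurator(Watch = true)]
```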
From: Maxim M. <mm...@um...> - 2006-01-16 23:54:23
Hi Tim, I got your message (on my phone though). You're probably at work now.. Do you have aim access there? If so, sign on! :)

While trying to get logging to work, I think I found another bug.. We're getting type cast exceptions in the selfMaint() method as we look up the expiration time. TimeSpan.TotalMilliseconds returns a double rather than a long. One way to correct this is to simply use the DateTime.Now object as the value of the socket timeout vs. converting it to a long/double. This will make the code a little bit more readable and shouldn't be a performance hit (it's a value type anyway that is just holding a # of ticks).

I'm not certain what the extent of this bug is and what the symptoms are, but seemingly the result is that sockets aren't closing when they should be, since the exception breaks it out of the loop immediately.

Also, does logging for your simple benchmark project work for you? I'm not sure why but I can't get any logging whatsoever. I've mostly used nlog and never log4net, but I've tried a few different app.config examples and still no luck. Any ideas?

Max

On 1/13/06, Tim Gebhardt <ti...@ge...> wrote:
> Yes, I noticed the exceptions as well. I've broadly narrowed it down to the
> TCP connections timing out. [...]
From: Tim G. <ti...@ge...> - 2006-01-14 07:34:31
Yes, I noticed the exceptions as well. I've broadly narrowed it down to the TCP connections timing out. I would really like to squash that if I could.

As time frees up, here are my three development tasks that I would like to tackle:

-Clean up the API (the longer it goes on like this, the more of a pain it will be to change later)
-Fix those exceptions that occur when the sockets time out
-Make an App.config section handler so we can configure all of the options for the client API in the App or Web.config file (I just thought of this one today).

And wow, that's quite a setup you have there. Ours isn't quite as big as that right now, but there is a good possibility that we'll need to scale out that big, so we're planning ahead :-).

As for an explanation of how the client handles failover:

The socket pool keeps persistent connections to each of the memcached servers. When the client loses a connection to one, it tries to reconnect after the reconnect timeout. If it doesn't connect, it doubles the timeout and tries again. It keeps doubling the timeout every time it fails to connect. If it ever reconnects, the timeout length is reset. This works really well because if there is a slight network "hiccup" you'll reconnect very quickly, but if the node goes down for a very long time the client quickly ignores it. The failover code could use a little more work, and it's all related to those exceptions. There isn't really any "redundancy", only failover. But the failover process works pretty well, and it's how the other clients I looked at handle it.

Are you familiar with how memcached works? It's really just a hashtable of hashtables. The first level of hashing decides "what server does this go on?"; then the next level of hashing happens on the server and it says "where does this item go in my hashtable?" Since all the clients use the same hashing algorithm, they all end up coming up with the same values. That's one thing I had to explain to some of the devs on my team. It's only a cache, it's not a persistent store. If one of the nodes goes down, there will be a small hiccup in our web application because most of the stuff will have to be re-cached.

Unfortunately I haven't had time to tune and performance test our cache yet, so we're just using the default values right now for our limited beta test of our system. We're a pretty small development shop and I'm pretty pressed for time just trying to add features and squash bugs in the application. Unfortunately the users can't see how cool memcached is; they only see how cool the interface for our application looks, and that gets priority.

I'm copying this email to the developer mailing list. I'll IM you my AIM screenname.

Have a great weekend and drink up!

-Tim

This sounds great. It's 'alpha' but it's working great for us so far. We've been stable with our patched client for a few days now, running on 25+ web servers with 10 memcache boxes, with just a small handful of unhandled exceptions. I've recently reimaged our machines from a 2.4 kernel to an SMP 2.6 kernel. Load on each box went from ~85% to less than 5%! This was extremely encouraging and paves the way to more extensive use of memcache throughout our site.

Go ahead and post the bugs and emails -- anything that gets more people to download the library and more eyes on the code would be great.

I'd like to gain a bit more insight into some aspects of the client.. One question I get frequently is about redundancy. From looking at the code, it seems like if a node goes down then it's marked as down and the next server on the list is used instead. Since every client performs the same check and goes on to the same next server, there's barely a performance hit. Then the server is checked periodically (at increasing time intervals?) until it comes back up. Is this about right? Can you describe this process any further? Do you have some ideas for improving it?

What do you think is a good value to use for max idle connections? I currently have roughly 2000 simultaneous connections to each memcache node, and I'm pretty sure that most of these are idle sockets in client pools -- though I'm not certain how many are actually used. It'd be cool to be able to see various client stats to get more transparency into things. I think the perl client does something like it already. I could take a look at this closer next week and get some more concrete ideas together.

This email is getting long... are you on aim or any im? i'm **** on aim or msn: ****

gonna go out drinking.. ttyl! :)

Max

Max,

I made the changes to the library and re-uploaded the binaries. I also have two sets of projects/solutions: one for VS.NET 2003 and .NET 1.1, and one for VS.NET 2005 and .NET 2.0. They both run off the exact same source files (although I think I may make the 2.0 version work with the native GZIP stuff new in the 2.0 framework at some point).

Keep in mind though that technically the project is still in alpha, mainly because the API isn't very clean. I would like to clean it up to adhere more to the .NET coding standards. It would be pretty easy to change any code (mostly just going from lowercase stuff to uppercase and small stuff like that), but I'll make sure to put any changes in the changelog.

-Tim

p.s. Would you mind if I copied this message to the development mailing list for the project so that it looks like there is some activity? It might help us out if some other people have an indication that something is going on with this project.

I'll see if I can send diffs later on when I have a chance, but for the first bug I changed the sleeps to this:

Thread.Sleep((int)interval);
and
Thread.Sleep((int)interval * 10) in the catch

and the while loop changed to this:

while ((count = gzi.Read(tmp, 0, 2048)) != 0)

please confirm that -1 really is never returned though (or why it would be), but we've had no probs so far.

Regarding 2.0, we are using it for some projects but not for those using this client, so it'd be great if you can maintain two branches if you start using 2.0-only stuff.

We're using this pretty heavily at this point and I'll be sure to let you know as soon as we find new issues.

Max

Hi Max,

Hey, thanks for the extra set of eyes. Yes, the nanoseconds stuff caused a lot of small errors when I was porting it from Java (which uses milliseconds).

If you made any changes, would it be possible to send me a diff of your project? I could incorporate them (and give you credit) and repost the project.

On another note, are you using .NET 1.1 or 2.0? One thing I would like to do is move the project to .NET 2.0 because the library performs so much faster (I think serialization in .NET 2.0 has been much improved). If not, I'll make sure to keep both project files around and build it for both frameworks.

-Tim

Hi Tim, great work with the memcached C# client -- been using it over here in production with a lot of good results. I've had to modify the client so far in a couple of ways to make it work and wanted to let you know so you could consider fixing it.

In SockIOPool.Maintain(): when you pass a TimeSpan into Thread.Sleep you should instead be just passing in that number of milliseconds (5000). The TimeSpan constructor takes ticks (100-nanosecond units). This causes 100% CPU time since it's polling way faster than it should.

In Memcacheclient.LoadItems you have a while loop that reads until gzi.Read returns -1. I'm not sure if it ever returns -1, but looking at the zip code, it does return 0. This again was causing 100% CPU time as it never left this tight loop.

Please keep up the awesome work you're doing and I will let you know if I find other issues (I suspected the maintenance thread has a bug somewhere causing exceptions but haven't nailed it yet).

Max
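To make the "first level of hashing" Tim describes concrete, here is a toy version of client-side server selection. The hash function and modulo scheme are illustrative only; the real client mirrors the algorithm shared by the other memcached clients so that every client maps a given key to the same server:

```csharp
using System;

static class ServerSelectionSketch
{
    // Toy deterministic string hash; stands in for the shared algorithm.
    static uint Hash(string key)
    {
        uint h = 0;
        foreach (char c in key)
            h = h * 31 + c;
        return h;
    }

    static string PickServer(string key, string[] servers)
    {
        return servers[Hash(key) % (uint)servers.Length];
    }

    static void Main()
    {
        string[] servers = { "memcached1:11211", "memcached2:11211" };
        // The same key always lands on the same server, on every client.
        Console.WriteLine(PickServer("user:42", servers));
    }
}
```

This is also why a node going down causes the "small hiccup" Tim mentions: keys that hashed to the dead node simply miss until they are re-cached elsewhere.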