From: Randy M. <mac...@no...> - 2003-07-11 20:29:42
Hi,

Thanks for the congestion details. Blocking on congestion is usually the
right thing to do. It looks like you haven't implemented non-blocking
send()s. Is that right? Is this a planned feature that I could implement?
I suppose one could use the poll interface to avoid blocking. How much of
the socket interface makes sense to implement for TIPC?

Has anyone run TIPC on PPC Linux? I only noticed one x86 asm instruction
(used to get a random port), so I don't expect that it's a problem. I'm
not considering mixed clusters.

When you think of a device processor, what are the minimum resources it
would have? Say a >100 MHz CPU and 16 MB of RAM running a stripped-down
Linux OS? Or something even smaller?

In the excellent TIPC documentation, you might consider including the
simpler kernel module insert commands. Most of the time

  /sbin/insmod tipc.o

will be sufficient to work on a single desktop. Then add the ethernet
binding:

  /sbin/insmod tipc.o eth0=1 processor=X

(BTW, someone told me that zone=0 crashes the system ;-) )

Jon, I may have some more questions... when is your vacation?

Thanks,
// Randy
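P.S. For concreteness, the non-blocking pattern I have in mind from user
space is just send() with MSG_DONTWAIT plus poll() on POLLOUT. The sketch
below uses only standard socket calls on an already-connected descriptor
and assumes nothing about the TIPC API itself (the function name and
error handling are my own):

/* Sketch only: standard POSIX non-blocking send pattern
 * (MSG_DONTWAIT + poll() for POLLOUT) on an already-connected
 * socket.  Nothing here is TIPC-specific. */
#include <errno.h>
#include <poll.h>
#include <sys/socket.h>

static ssize_t send_or_wait(int sd, const void *buf, size_t len)
{
        for (;;) {
                ssize_t n = send(sd, buf, len, MSG_DONTWAIT);

                if (n >= 0)
                        return n;               /* sent (possibly partial) */
                if (errno != EAGAIN && errno != EWOULDBLOCK)
                        return -1;              /* real error */

                /* Congested: wait until the socket is writable again. */
                struct pollfd pfd = { .fd = sd, .events = POLLOUT, .revents = 0 };
                if (poll(&pfd, 1, -1) < 0 && errno != EINTR)
                        return -1;
        }
}

Whether the TIPC send path could return EAGAIN here instead of sleeping
is exactly what I'm asking about above.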
>
> Message: 1
> From: "Jon Maloy (QB/LMC)" <jon...@er...>
> To: Paul Jardetzky <pw...@fa...>
> Cc: "Jon Maloy (QB/LMC)" <jon...@er...>,
>     TIPC mailing list <osd...@li...>
> Date: Thu, 10 Jul 2003 12:47:25 -0400
> Subject: [Osdlcluster-tipc] Re: More debugging news
>
> Hi,
> I will try to update my document with this information after my vacation,
> which starts soon.
>
> About link congestion the story is very simple: I never throw away any
> messages. If the link queue limit is hit (it is configurable, but by
> default set to 48 packets for messages of importance "low"), the sending
> process/thread is transparently put to sleep until the window opens up
> again, which normally happens after a few microseconds. This gives the
> impression that a send "never" fails.
>
> All messages are given an importance priority in the range 0-3, set in
> the sending port/socket. The default value is 0 (low).
> Messages of priority "low" are subject to the "base queue limit", which
> is the same as the window limit, as configured, or 48.
> Messages of priority "medium" (1) are subject to a queue limit of
> (base queue limit * 4/3). If the send queue is beyond the window limit,
> but below the calculated queue limit, the message is queued but not sent
> until the window opens up again; the process is not blocked in this
> case. If the queue size is beyond the calculated queue limit, the
> process is temporarily put to sleep, just as with importance-zero
> messages.
> Messages of priority "high" (2) and "non_rejectable" (3) are treated the
> same way, but are subject to the queue limits (base queue limit * 5/3)
> and (base queue limit * 6/3) respectively.
>
> For TIPC internal (e.g. name table update) messages, or routed messages,
> these limits are far more generous, in order not to let TIPC have to
> compete with its users.
>
> When it comes to processor overload, incoming messages are also handled
> according to importance priority. Each message is matched against two
> values before it is put into the read queue of a socket.
>
> The basic value is the "global queue limit", which keeps track of the
> number of queued incoming, but not yet read, messages on the whole
> processor. The limits here are 1000 for low importance, 2000 for medium
> importance, and 10000 for high and non-rejectable importance.
> If an incoming message is connection oriented, these thresholds are
> multiplied by four, under the assumption that it has more consequences
> to tear down a transaction in progress than to reject it in the setup
> phase, which is where non-connection-oriented messages are typically
> used.
> If the upper limit is hit for a message, the message is not thrown away,
> but sent back (i.e. the first 1024 bytes) to the sender along with an
> error code. The importance of the rejected message is raised one step,
> to reduce the risk that the rejected message will hit the limit at the
> return (which is often at the same processor). If, despite all this, the
> rejected message hits the global limit for its importance level, the
> message is thrown away.
> **This is the only situation where a message is thrown away silently,
> and as you see it takes a really bad overload situation to end up
> there.**
> The consequence is also that a "connection abortion" message, which is
> connection oriented, contains an error code, and has a raised importance
> level, is virtually never thrown away.
>
> To protect the processor from locally misbehaving applications there is
> also a "local queue limit", which keeps track of the number of unread
> messages queued in each socket. Based on empirical experience, these
> values are set to 1/2 of the global thresholds, that is 500, 1000 and
> 5000 messages respectively. Otherwise the algorithm for rejecting and
> throwing away messages is the same as described above.
>
> I hope this gives you the information you need.
>
> Regards /Jon
>
>
> Paul Jardetzky wrote:
>
>> Ok. I've read and understood your mail. When I get the time,
>> I'll make the appropriate changes. They are not hard and don't
>> require modifying anything outside the adaptation layer. That
>> said, if you decide to make the change for a future release,
>> just let me know.
>>
>> Mostly, I needed to fix this quickly even though we are not
>> going to use multiple zones. It is more about the perception
>> of TIPC's stability with the other engineers here. They see
>> machines crashing (their own desktops, unfortunately) and they
>> immediately become concerned about building our product on top
>> of this code. Fast bug fixes and some reassurance are needed to
>> develop the required confidence. You know the story ... :).
>>
>> We have a few solid networking types who need to know TIPC's
>> behavior under transient congestion ... e.g. when messages are
>> dropped and under what circumstances, window sizes, etc. If
>> you have information outside what is already in your document
>> (or the code), it will help with convincing folk that it has
>> what we need
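For reference, the limits described above reduce to the following
arithmetic. This is an illustrative sketch only: the names are invented,
the base link window (48) is configurable, and none of it is taken from
the actual TIPC sources.

/* Illustration only: the send- and receive-side limits as described
 * above, expressed as arithmetic.  Not actual TIPC code. */

enum importance { IMP_LOW = 0, IMP_MEDIUM, IMP_HIGH, IMP_NON_REJECTABLE };

#define BASE_QUEUE_LIMIT 48     /* default link window, "low" importance */

/* Outgoing link queue limit per importance level:
 * low: base, medium: base*4/3, high: base*5/3, non-rejectable: base*6/3.
 * The sender is put to sleep once its level's limit is exceeded. */
static int send_queue_limit(enum importance imp)
{
        return BASE_QUEUE_LIMIT * (3 + imp) / 3;
}

/* Per-processor ("global") limit on queued-but-unread incoming messages:
 * 1000 / 2000 / 10000 / 10000, multiplied by four for connection-
 * oriented messages.  Hitting it triggers rejection back to the sender
 * (first 1024 bytes plus an error code, importance raised one step). */
static int global_recv_limit(enum importance imp, int connection_oriented)
{
        static const int base[] = { 1000, 2000, 10000, 10000 };
        return base[imp] * (connection_oriented ? 4 : 1);
}

/* Per-socket ("local") limit: half the global one (500 / 1000 / 5000). */
static int local_recv_limit(enum importance imp)
{
        return global_recv_limit(imp, 0) / 2;
}

With the default window of 48, the send-side limits come out to 48, 64,
80 and 96 packets for the four importance levels.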