From: Neil H. <nh...@tu...> - 2008-07-25 16:57:15
On Sat, Jul 26, 2008 at 01:39:23AM +1000, leon zadorin wrote:
> On Fri, Jul 25, 2008 at 10:03 PM, Neil Horman <nh...@tu...> wrote:
> > On Fri, Jul 25, 2008 at 03:30:58PM +1000, leon zadorin wrote:
> >> >> ok... is there a way to increase/control the "amount of system-wide
> >> >> memory available to SCTP" with something like sysctl (at the system
> >> >> admin level that is)?
> >> >>
> >> > Yes, see /proc/sys/net/sctp/sctp_[r|w]mem
> >>
> >> sweet :-) thanks for that :-)
> >>
> > No problem :)
> >
> >> >>
> >> > This seems rather antithetical to what you note above, that there can be
> >> > multiple senders. Perhaps it won't be threads, but you will have independent
> >> > contexts of execution to deal with here. This probably isn't germane to the
> >> > conversation though, sorry.
> >>
> >> that's cool. the choice of not having threads *or* forks/ipc/contexts
> >> of execution is personal - I want to relegate (as much as possible)
> >> all io-bound stuff to a single-thread 'multiplexed'-like style. I don't
> >> think that there is anything inelegant or impossible there :-)
> >>
> > Sorry to drag this on, but I'm trying to understand how many contexts you are
> > going to have here. From your description, it sounds to me like you will have
> > two or more processes, acting as data sources, calling your library api. Will
> > you have an additional execution context (thread or process), which is
> > responsible for taking the data from your previous two sources, and writing
> > them to your destination streams (by calling write, sendmsg, etc)?
>
> Mmm.. not really... for simplicity's sake let's say there is only 1
> process - a server. Of course there may be various clients that may
> connect to this server, but consider each of those clients as a *single*
> stream data generator or consumer - each connects to the server via TCP
> or SCTP and either sends data to it, or reads data from it etc..
>
> It is the *server* (which consists of only 1, one, context/thread of
> execution) which will provide for many-2-many stream 'mappings'
> (sockets A and B may be read from and both sent to socket C, whilst
> also reading from socket D and sending to socket E). And it is the
> server which is the subject of our discussion. The client apps are
> irrelevant - they may well be 3rd party applications which use
> whatever they want to communicate the data via TCP/SCTP/whatever to
> the server, etc
>
Ok, that makes sense.

> >> >
> >> > If I can try to summarize this, what I read from the above is that:
> >> >
> >> > 1) You have two data sources, call them A and B
> >> correct - but there could be other 'associations' (not sctp, but
> >> rather 'abstracted' stream relationships/flows-of-data) - all on the
> >> same thread.
> >>
> > ok
> >
> >> > 2) A and B share an output channel
> >> correct.
> >>
> >> > 3) the output channel (a socket or file) is bandwidth limited, and may block a
> >> > caller if the bandwidth is exceeded
> >>
> >> physically it is capable of blocking, but my api will imply that
> >> blocking won't happen (at least in normal conditions) - because the
> >> heavy, endless data sources/streams shall be throttled (in their
> >> reading) before this happens
> >>
> > ok
> >
> >> > 4) A is a high rate source, and may starve the output channel from B
> >>
> >> correct for "A is a high rate source, and may starve the output
> >> channel" (at a physical level)... and I don't understand the 'from B'
> >> qualification at the end of the sentence...
> >>
> > I mean to say that (in the absence of this code you are trying to write),
> > if A and B were multiplexing this output channel, since A was capable of
> > consuming all the bandwidth of that channel, it was possible (and likely)
> > that when B had the opportunity to write to the output stream, its execution
> > would be blocked by the kernel (in fact A's execution would be as well).
> > And blocking is catastrophic to the behavior of your application. The purpose of
> > the code you are writing is to avoid blocking in either the context of A or B,
> > with some sort of indicator as to whether or not the next packet of data would
> > block the execution context.
> >
>
> correct (for A it is important, B is expected not to block due to
> sufficiently sized buffers).
>
ok

> >> > 5) B is a low rate or bursty source, and cannot block
> >>
> >> ideally, none of them are allowed to block at the 'api' level, but the "B"
> >> one is expected to be solvable by sufficient buffer sizes, whilst
> >> "A" (an endless, heavy data stream) is in need of 'throttling when the
> >> destination is running low on free space' mechanisms - because no
> >> matter how large the buffers are - if the stream is 'endless' it will
> >> end up depleting the destination's buffer space.
> >>
> >> The whole thing (inclusive of the single-thread predicate) does not seem
> >> to me overly complex when one further predicates the design on being
> >> able to 'query' the free space for sending...
> >>
> >
> >
> > Well, I'll say it again, go ahead and write the patch, I don't think anyone is
> > going to complain if you submit it (myself included). Although, based on our
> > conversation here, I wonder if you couldn't solve this more generically for any
> > output descriptor using timestamps and counters:
> >
> > 1) Set the output descriptor to be non-blocking (I know, bear with me)
>
> that's cool - I already do, and 'EWOULDBLOCK' is used to set 'error =
> true' for the stream's state (or at least shoot a scary warning).
>
> > 2) Keep a running counter of the total number of bytes successfully written to
> > the output stream
>
> I already do - to aggregate the check for 'free space' so it is not done
> at every system write call.
>
Ok.
> > 3) Record a timestamp after every successful send
>
> wouldn't this be more costly as compared to checking for 'free space'
> once, say, every 1Meg of data has been transmitted? Plus, one would have
> to be careful wrt clock jumps - clock-monotonic is perhaps the only
> choice (cpu counters are not good on SMPs and the NTP daemon can step
> gettimeofday...) ... i'm not sure...
>
That kind of depends on how exactly you want to record the time stamp.
gettimeofday on many Linux arches is implemented as a vsyscall these days,
which means no kernel trap, so your savings over the use of an ioctl may be
significant, if a 1:1 replacement of calls is possible. If ntp adjustments
are a problem, that limits usefulness here, but IIRC those can be capped and
are very small (the ntpd man page indicates clock drift nominally doesn't
stray more than 128ms). YMMV of course.

> > 4) Your available output rate after every send can be computed as (total byte
> > counter)/(last timestamp - time at descriptor open)
> > 5) If a write returns EWOULDBLOCK, simply try again until such time as the
> > write succeeds.
>
> Oops - this is no good - keeping in mind that there may not only be a
> single mapping of "A and B sources directed to a C stream"... - there
> may also be a completely independent D source directed to an E
> destination (i.e. a completely independent mapping of streams) - and
> "D->E" may well be capable of writing and I *don't* want to starve the
> latter because the former is blocking whilst trying to finish its
> writes to C. All of the stream mappings are serviced on a single
> thread by a single process (i.e. a server as per the above sentence)...
> and returning from the loop without waiting for the write to complete
> implies caching/queuing...
>
Hmm, ok. That kind of nixes this plan then.

Neil
-- 
/****************************************************
 * Neil Horman <nh...@tu...>
 * Software Engineer, Red Hat
 ****************************************************/