>>>2) How well does the TCP stack work in a congested environment?
>>> - Does it play badly with other simultaneous TCP connections?
>>> - Does the implemented window scaling work well?
>>
>>I don't have much in the way of test environments. Hopefully, somebody else can
>>
>>take a stab at testing the code in more real-life situations.
>
>
> Some of that can be done by simple inspection and theoretical calculation.
> Testing helps two of course. But I know I put in all of the exponential
> back off code into etherboot based on pure inspection.
Apart from the fact that I forgot all my documentation at work and had
to base the code on what I remembered and what I could find online, I
think the algorithm used mostly makes sense.
The window scaling algorithms that you usually find in TCP stacks assume
that the kernel receives packets in interrupt service handlers and
delivers them once the user space application is ready to consume data.
If the application cannot consume fast enough, then flow control must
kick in.
None of this is really true for Etherboot. Receiving data is the
bottleneck, as we can only poll the driver and not all cards support
sufficiently large buffers; but once the data is received, it can be
consumed immediately. Also, we don't really know if we just dropped
packets on the floor because we did not poll fast enough.
That's why my code slowly grows the window size, but as soon as it
discovers problems, shrinks it back to half. It could try to do smarter
things based on timeouts and RTT estimates, but I am not sure this would
really help. On the other hand, in my small test environment, I have
never seen it try to shrink the window; but in more congested networks
or on slower clients, I wouldn't be surprised if it happened occasionally.
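Roughly, the grow-slowly/halve-on-trouble policy described above could be sketched like this (purely an illustration; the constants and function names here are made up, not taken from the actual code):

```c
#include <stdint.h>

/* Hypothetical sketch of the window policy described above.
 * MSS and MAX_WINDOW are illustrative values, not Etherboot's. */

#define MSS        1460u
#define MAX_WINDOW (16u * 1024u)

static uint32_t window = MSS;           /* start small */

/* Call after each successfully received in-order segment. */
static void window_grow(void)
{
	if (window + MSS <= MAX_WINDOW)
		window += MSS;          /* linear growth, one MSS at a time */
}

/* Call when a problem is detected (gap, retransmission, timeout). */
static void window_shrink(void)
{
	window /= 2;                    /* multiplicative decrease */
	if (window < MSS)
		window = MSS;           /* never advertise less than one MSS */
}
```

The asymmetry (slow additive growth, fast multiplicative shrink) is the same additive-increase/multiplicative-decrease idea that full TCP stacks use for congestion control.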
In fact, I do try to correctly compute smoothed estimates of the
round-trip time, but realistically that is just overkill. The client
never sends more than the initial SYN packet, one data packet, and
occasional ACKs. This is definitely not enough to estimate RTT very
precisely.
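For reference, the textbook smoothed-RTT estimator (Jacobson's algorithm, as later codified in RFC 6298) looks roughly like this in integer arithmetic; the variable names here are illustrative, not from the actual code:

```c
#include <stdint.h>

/* Standard smoothed RTT estimator with the usual gains
 * alpha = 1/8 and beta = 1/4, done in integer arithmetic. */

static uint32_t srtt;    /* smoothed RTT, in timer ticks */
static uint32_t rttvar;  /* RTT variance estimate, in timer ticks */

static void rtt_update(uint32_t sample)
{
	if (srtt == 0) {                /* first measurement */
		srtt = sample;
		rttvar = sample / 2;
		return;
	}
	uint32_t diff = (sample > srtt) ? sample - srtt : srtt - sample;
	rttvar = rttvar - rttvar / 4 + diff / 4;    /* beta = 1/4 */
	srtt   = srtt - srtt / 8 + sample / 8;      /* alpha = 1/8 */
}

/* Retransmission timeout: SRTT plus four times the variance. */
static uint32_t rto(void)
{
	return srtt + 4 * rttvar;
}
```

With only a handful of ACK round trips per transfer, the estimator barely gets enough samples to converge, which is the "overkill" point above.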
This would be different if somebody really went to the trouble of
implementing more complicated TCP based protocols (such as the iSCSI
stuff that you mentioned). If there is a lot of data being sent in both
directions, then the TCP implementation probably needs to be more
sophisticated than what I do right now.
As for the more advanced tuning algorithms that modern TCP stacks use
(e.g. slow start, restart after a stopped connection, ECN, ...), I'll
check with my documentation and see if any of those are applicable, but
I don't expect that much of it is usable.
>>>3) Are there alignment considerations that need to be taken care of?
>>
>>??? Can you elaborate what you were thinking of?
>
>
> Compilers on non-x86 platforms assume that certain kinds of data are
> aligned to certain boundaries: 4 bytes for a 32-bit int, for example.
> Network packets have a tendency to misalign structures.
>
> For running the code on the Itanium for example it is very important
> that we don't have that kind of issue.
I see. I don't believe any of my code violates any alignment
requirements (other than the HTTP handler calling getdec() in a slightly
broken way; we really should have something like getndec()), but I am
not sure about the OS loaders. The TCP code just hands the received data
off as it gets it. Can all of our loaders deal with arbitrarily
fragmented blocks or do I have to reassemble blocks before I can hand
them off? If I have to do this, what block size do I have to use? Is any
power of two OK, or does it have to be 512 bytes?
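As an illustration of the alignment issue: on strict-alignment machines like the Itanium, casting a byte pointer into the middle of a packet buffer and dereferencing it as a 32-bit int can fault, so fields should be assembled byte by byte (or memcpy'd into a local). The get_be32() helper below is hypothetical, just to show the pattern:

```c
#include <stddef.h>
#include <stdint.h>

/* UNSAFE on strict-alignment targets if buf + off is misaligned:
 *     uint32_t v = *(uint32_t *)(buf + off);
 * Safe alternative: assemble the value byte by byte, which also
 * handles network (big-endian) byte order at the same time. */

static uint32_t get_be32(const uint8_t *buf, size_t off)
{
	return ((uint32_t)buf[off]     << 24) |
	       ((uint32_t)buf[off + 1] << 16) |
	       ((uint32_t)buf[off + 2] <<  8) |
	        (uint32_t)buf[off + 3];
}
```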
Markus