[Etherboot-developers] proposed patch to speedup image loading

Hi,
I'd like to suggest a patch for NFS (and maybe TFTP as well) downloads
in Etherboot.

As it is now, both protocols allow only one packet in flight, and
a loss triggers a long timeout (exponential backoff with a base
interval of 10 seconds). On paths with even a small percentage of
random loss (as often happens on mixed 10/100 networks with
underbuffered switches and/or coax/UTP converters), booting is
painfully slow.
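
To put rough numbers on this, here is a back-of-the-envelope sketch.
It assumes each exchange is lost independently, that every loss costs
exactly one timeout, and a made-up image of 4000 packets at 1% loss;
stall_seconds() and print_stall_estimate() are illustrative names, not
Etherboot functions.

```c
#include <stdio.h>

/*
 * Rough estimate of the time a one-packet-in-flight download spends
 * waiting on timeouts. Assumes independent losses and one timeout per
 * loss (repeated losses of the same packet are ignored).
 */
double stall_seconds(long packets, double loss_rate, double timeout_s)
{
	return (double)packets * loss_rate * timeout_s;
}

/* Compare a 10 s timeout with a 0.5 s one for a hypothetical download. */
void print_stall_estimate(void)
{
	long packets = 4000;	/* hypothetical image size, in packets */
	double loss = 0.01;	/* hypothetical random loss rate */

	printf("10 s timeout:  %.0f s stalled\n",
	       stall_seconds(packets, loss, 10.0));
	printf("0.5 s timeout: %.0f s stalled\n",
	       stall_seconds(packets, loss, 0.5));
}
```

Under these assumptions the download stalls for about 400 seconds
with the 10-second timeout, against about 20 seconds with a
half-second one.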

I have been looking at a way to implement a window scheme similar
to TCP's, but it would require too much restructuring of the code,
especially because the *download() routines assume in-order reception
of packets.

So I came up with a very simple change (attached) which should be
network-safe, yet quite effective and simple to implement.

The idea is to interpret previous successes in receiving a packet
as a 'right' to reduce the timeout for the next attempt to a small
value, whereas the expiration of a timeout drastically reduces our
right to do so. So, an occasional loss after a bunch of successes
will only stall the connection for a few ticks, but repeated losses
will quickly revert the behaviour to the standard exponential
backoff.

This way you partially compensate for the fact that, with only one
packet in flight, the protocol is quite vulnerable to timeouts.

Implementation-wise, the whole change is basically four lines of C:
a static variable (tokens) in nfs.c::nfs_read() is incremented
on each success (up to some upper bound, say 256), and halved
on each timeout. The actual timeout value passed to await_reply()
is then shortened to 1/2 s (or something similarly short) whenever
tokens >= 2. By playing with the constants you can make the scheme
more or less aggressive; you get the idea...
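
Pulled out of the retry loop, the token logic can be sketched as two
small functions. This is only an illustration of the scheme, not the
patch itself: pick_timeout(), update_tokens() and replay() are
hypothetical helpers, TICKS_PER_SEC is given an assumed value of 18,
and the 180-tick backoff value is made up.

```c
#include <stdio.h>

#define TICKS_PER_SEC	18	/* assumed value for this sketch */
#define TOKENS_MAX	256	/* upper bound on accumulated credit */
#define TOKENS_MIN	2	/* credit needed to use the short timeout */
#define SHORT_TIMEOUT	(TICKS_PER_SEC / 2)

/* Timeout for the next attempt: short only if we hold enough tokens. */
long pick_timeout(int tokens, long backoff_ticks)
{
	return tokens >= TOKENS_MIN ? SHORT_TIMEOUT : backoff_ticks;
}

/* Update the credit after an attempt: +1 on a reply, halve on a timeout. */
int update_tokens(int tokens, int got_reply)
{
	if (got_reply)
		return tokens < TOKENS_MAX ? tokens + 1 : tokens;
	return tokens >> 1;
}

/* Replay a made-up sequence of outcomes: '1' = reply, '0' = timeout. */
void replay(const char *outcomes)
{
	int tokens = 0;

	for (; *outcomes; outcomes++) {
		/* 180 ticks stands in for the exponential backoff value */
		long t = pick_timeout(tokens, 180);

		tokens = update_tokens(tokens, *outcomes == '1');
		printf("%c: waited up to %ld ticks, tokens now %d\n",
		       *outcomes, t, tokens);
	}
}
```

Calling replay("1111001") shows the credit building up, being halved
twice, and recovering. With these constants, a connection that has
reached the cap of 256 tokens uses the short timeout for eight
consecutive losses before the ninth attempt falls back to the
exponential backoff.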

Ideally, one could also try to add an RTT estimator in nfs_read()
so that both the "short" timeout and the base timeout for the
exponential backoff are adaptive. That might take another 10-20 lines
of code I guess, probably with reasonably good benefits.
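
As a rough idea of what such an estimator could look like, here is a
sketch that borrows the smoothed-RTT and variance update used for TCP
retransmission timers (RFC 6298), in integer arithmetic on ticks. It
is not part of the attached patch; struct rtt_est, rtt_update() and
rtt_timeout() are hypothetical names, and the samples are assumed to
come from exchanges that succeeded on the first attempt, so that a
reply is never matched against the wrong transmission.

```c
/* Hypothetical estimator state, in ticks, kept in scaled form. */
struct rtt_est {
	long srtt8;	/* 8 * smoothed RTT */
	long rttvar4;	/* 4 * RTT variation */
	int valid;	/* nonzero once a sample has been seen */
};

/* Fold one measured round-trip time (in ticks) into the estimate. */
void rtt_update(struct rtt_est *e, long sample)
{
	long err;

	if (!e->valid) {
		e->srtt8 = sample * 8;		/* SRTT = R */
		e->rttvar4 = sample * 2;	/* RTTVAR = R/2 */
		e->valid = 1;
		return;
	}
	err = sample - e->srtt8 / 8;
	if (err < 0)
		err = -err;
	/* RTTVAR = 3/4 RTTVAR + 1/4 |SRTT - R|, using the old SRTT */
	e->rttvar4 += err - e->rttvar4 / 4;
	/* SRTT = 7/8 SRTT + 1/8 R */
	e->srtt8 += sample - e->srtt8 / 8;
}

/*
 * Adaptive timeout: SRTT + 4 * RTTVAR, never below min_ticks.
 * Returns fallback_ticks until the first sample has been recorded.
 */
long rtt_timeout(const struct rtt_est *e, long min_ticks, long fallback_ticks)
{
	long rto;

	if (!e->valid)
		return fallback_ticks;
	rto = e->srtt8 / 8 + e->rttvar4;
	return rto < min_ticks ? min_ticks : rto;
}
```

The value returned by rtt_timeout() could stand in for the fixed
half-second timeout; with a coarse tick counter the min_ticks floor
matters, since most samples on a LAN would round down to zero or one
tick.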

I am planning to commit this as a patch to the FreeBSD port
of Etherboot, but it might also be worth integrating
into the standard Etherboot sources.

	cheers
	luigi

-----------------------------------+-------------------------------------
  Luigi RIZZO, lu...@ie...  . Dip. di Ing. dell'Informazione
  http://www.iet.unipi.it/~luigi/  . Universita` di Pisa
  TEL/FAX: +39-050-568.533/522     . via Diotisalvi 2, 56126 PISA (Italy)
  Mobile   +39-347-0373137
-----------------------------------+-------------------------------------

--- nfs.c.orig	Tue Mar 12 21:44:19 2002
+++ nfs.c	Tue Mar 12 23:07:32 2002
@@ -321,6 +321,14 @@
 	int retries;
 	long *p;
 
+	static int tokens=0;
+	/*
+	 * Try to implement some kind of window protocol in terms of
+	 * response to losses. On each successful receive, increment the
+	 * number of tokens by 1 (capped at 256). On a timeout, halve it.
+	 * When the number of tokens is >= 2, use a very short timeout.
+	 */
+
 	id = rpc_id++;
 	buf.u.call.id = htonl(id);
 	buf.u.call.type = htonl(MSG_CALL);
@@ -336,9 +344,14 @@
 	*p++ = 0;		/* unused parameter */
 	for (retries = 0; retries < MAX_RPC_RETRIES; retries++) {
 		long timeout = rfc2131_sleep_interval(TIMEOUT, retries);
+		if (tokens >= 2)
+			timeout = TICKS_PER_SEC/2;
+
 		udp_transmit(arptable[server].ipaddr.s_addr, sport, port,
 			(char *)p - (char *)&buf, &buf);
 		if (await_reply(AWAIT_RPC, sport, &id, timeout)) {
+			if (tokens < 256)
+				tokens++;
 			rpc = (struct rpc_t *)&nic.packet[ETH_HLEN];
 			if (rpc->u.reply.rstatus || rpc->u.reply.verifier ||
 			    rpc->u.reply.astatus || rpc->u.reply.data[0]) {
@@ -355,7 +368,8 @@
 			} else {
 				return 0;
 			}
-		}
+		} else
+			tokens >>= 1;
 	}
 	return -1;
 }