On Thursday 02 October 2003 03:23, Jeff Garzik wrote:
> Larry McVoy wrote:
> > On Wed, Oct 01, 2003 at 04:29:16PM -0700, Andrew Morton wrote:
> >>If you have a loop like:
> >>
> >> char *buf;
> >>
> >> for (lots) {
> >> read(fd, buf, size);
> >> }
> >>
> >>the optimum value of `size' is small: as little as 8k. Once `size' gets
> >>close to half the size of the L1 cache you end up pushing the memory at
> >>`buf' out of CPU cache all the time.
> >
> > I've seen this too, not that Andrew needs me to back him up, but in many
> > cases even 4k is big enough. Linux has a very thin system call layer so
> > it is OK, good even, to use reasonable buffer sizes.
>
> Slight tangent, FWIW... Back when I was working on my "race-free
> userland" project, I noticed that the fastest cp(1) implementation was
> GNU's: read/write from a single, statically allocated, page-aligned 4K
> buffer. I experimented with various buffer sizes, mmap-based copies,
> and even with sendfile(2) where both arguments were files.
> read(2)/write(2) of a single 4K buffer was always the fastest.
That sounds reasonable, but today's RAM throughput is on the order
of 1GB/s, not 100Mb/s. The 'out of L1' theory can't explain a 100Mb/s
ceiling, it seems.
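
One way to check this would be to time the read() loop directly at
several values of `size'. Below is a minimal sketch, not a careful
benchmark. timed_read() is a made-up helper name, and the 4096-byte
alignment is an assumption about the page size. The file should already
be in the page cache, otherwise this measures the disk and not the copy.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <time.h>
#include <unistd.h>

/*
 * Hypothetical helper: read `path' from start to EOF with read(2),
 * using one page-aligned buffer of `size' bytes.
 * Returns the number of bytes read, or -1 on error.
 * On success the elapsed wall-clock time in seconds is stored in *secs.
 */
long long timed_read(const char *path, size_t size, double *secs)
{
    void *buf;
    struct timespec t0, t1;
    long long total = 0;
    ssize_t n;
    int fd;

    /* 4096 is an assumed page size; sysconf(_SC_PAGESIZE) would be exact */
    if (posix_memalign(&buf, 4096, size) != 0)
        return -1;

    fd = open(path, O_RDONLY);
    if (fd < 0) {
        free(buf);
        return -1;
    }

    clock_gettime(CLOCK_MONOTONIC, &t0);
    while ((n = read(fd, buf, size)) > 0)
        total += n;             /* same loop as above, plus a byte count */
    clock_gettime(CLOCK_MONOTONIC, &t1);

    close(fd);
    free(buf);
    if (n < 0)
        return -1;

    *secs = (double)(t1.tv_sec - t0.tv_sec)
          + (double)(t1.tv_nsec - t0.tv_nsec) / 1e9;
    return total;
}
```

Calling it on the same file with size = 4096, 8192, 65536 and so on,
and dividing the byte count by *secs, gives MB/s per buffer size. If the
numbers stay far below RAM bandwidth at every size, the L1 effect alone
is not the limit.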
--
vda