#89 Enumerating two or more directories in parallel

None
open
nobody
None
1
2014-02-10
2014-02-08
Ralph Böhme
No

Summary

If a client sends directory enumeration requests for two or more directories in parallel and these directories contain many files, performance may be severely affected.

Details

Interleaved directory enumeration means the client sends enumeration requests for at least two directories in parallel, alternating between them:

req1:   40 elements from directory id 1 from index 1
req2:   40 elements from directory id 2 from index 1
req3:   40 elements from directory id 1 from index 41
req4:   40 elements from directory id 2 from index 41
req5:   40 elements from directory id 1 from index 81
req6:   40 elements from directory id 2 from index 81
req7:   40 elements from directory id 1 from index 121
req8:   40 elements from directory id 2 from index 121

Whenever the directory id changes between consecutive enumeration requests, the afpd process discards the pre-cached folder enumeration listing and rescans the full directory.

If both directories contain e.g. 20000 files and the requests are interleaved as above, the afpd process rescans the full directory for every request. With 20000 files per directory and 40 files per AFP enumeration request, 500 AFP enumeration requests are needed per directory, for a total of 1000 full directory scans across both.
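The scan count above can be reproduced with a small simulation. This is only a sketch of the behaviour described in this ticket, not afpd code: it assumes a cache that holds exactly one directory listing and is refilled whenever the requested directory id differs from the cached one. The names `count_rescans` and `make_requests` are made up for illustration.

```python
def count_rescans(requests):
    """Count full directory scans for a cache that holds a single directory.

    Assumption (from the ticket text): the cached listing is discarded and
    the directory rescanned whenever the requested directory id changes.
    """
    cached_dir = None
    scans = 0
    for dir_id, _start_index in requests:
        if dir_id != cached_dir:
            cached_dir = dir_id
            scans += 1
    return scans


def make_requests(dir_ids, files_per_dir, batch, interleaved):
    """Build (directory id, start index) requests, 'batch' entries each."""
    per_dir = {
        d: [(d, i) for i in range(1, files_per_dir + 1, batch)]
        for d in dir_ids
    }
    if interleaved:
        # req1: dir 1, req2: dir 2, req3: dir 1, ... as in the listing above
        return [req for group in zip(*per_dir.values()) for req in group]
    # Finish one directory completely before starting the next one
    return [req for d in dir_ids for req in per_dir[d]]


interleaved = make_requests([1, 2], 20000, 40, interleaved=True)
sequential = make_requests([1, 2], 20000, 40, interleaved=False)

print(count_rescans(interleaved))  # 1000
print(count_rescans(sequential))   # 2
```

Under these assumptions the same 1000 requests cost 1000 scans when interleaved, but only 2 scans when each directory is enumerated to completion before the next one starts.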

One directory scan requires roughly 20000 / 40 = 500 getdents() syscalls, assuming that on Linux each getdents() call uses a 32k buffer which fits about 40 entries on average.

Result: 1000 directory scans * 500 getdents() syscalls per directory scan = 500,000 getdents() syscalls.
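The arithmetic can be written out step by step. The figures are the ones given in this ticket (20000 files, 40 entries per AFP request, about 40 entries per getdents() call); the final comparison value is an added assumption that each directory would need only one full scan if both listings stayed cached.

```python
import math

files_per_dir = 20000
directories = 2
entries_per_request = 40   # entries returned per AFP enumeration request
entries_per_getdents = 40  # average from the ticket; an estimate, not measured here

requests_per_dir = math.ceil(files_per_dir / entries_per_request)
total_scans = requests_per_dir * directories  # one full rescan per request
getdents_per_scan = math.ceil(files_per_dir / entries_per_getdents)
total_getdents = total_scans * getdents_per_scan

# Assumed best case: each directory is scanned exactly once
ideal_getdents = directories * getdents_per_scan

print(requests_per_dir)   # 500
print(total_scans)        # 1000
print(getdents_per_scan)  # 500
print(total_getdents)     # 500000
print(ideal_getdents)     # 1000
```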

Improve directory enumeration so that it can cache more than one directory.
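One possible shape for such a cache is a small LRU map keyed by directory id. The following is a minimal sketch, not a proposal for the actual afpd implementation: the class `DirListingCache`, its `scan` callback and the `capacity` parameter are hypothetical, and the sketch ignores invalidation when a directory changes on disk.

```python
from collections import OrderedDict


class DirListingCache:
    """Hypothetical LRU cache keeping listings for several directories."""

    def __init__(self, scan, capacity=4):
        self._scan = scan            # callback: dir_id -> list of entry names
        self._capacity = capacity    # max number of cached directory listings
        self._listings = OrderedDict()
        self.scans = 0               # number of full directory scans performed

    def enumerate(self, dir_id, start_index, count):
        """Return 'count' entries of dir_id starting at 1-based start_index."""
        if dir_id in self._listings:
            # Cache hit: mark this directory as most recently used
            self._listings.move_to_end(dir_id)
        else:
            # Cache miss: do one full scan and keep the result
            self._listings[dir_id] = self._scan(dir_id)
            self.scans += 1
            if len(self._listings) > self._capacity:
                # Evict the least recently used listing
                self._listings.popitem(last=False)
        listing = self._listings[dir_id]
        return listing[start_index - 1:start_index - 1 + count]


def fake_scan(dir_id):
    """Stand-in for a real directory scan, for demonstration only."""
    return [f"dir{dir_id}-file{i}" for i in range(1, 201)]


cache = DirListingCache(fake_scan, capacity=2)
for start in range(1, 201, 40):
    for dir_id in (1, 2):            # interleaved requests, as in the ticket
        cache.enumerate(dir_id, start, 40)

print(cache.scans)  # 2
```

With a capacity of at least the number of directories being enumerated in parallel, each directory is scanned once. With `capacity=1` the sketch degrades to the behaviour described in this ticket, one scan per request.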

Discussion