ReadAhead-NG Code
Status: Alpha
Brought to you by:
bluefoxicy
File | Date | Author | Commit |
---|---|---|---|
include | 2006-10-17 | bluefoxicy | [r19] Work on rah_inotify.c; Updated copyrights, adde... |
COPYING | 2006-10-17 | bluefoxicy | [r19] Work on rah_inotify.c; Updated copyrights, adde... |
Makefile | 2006-10-17 | bluefoxicy | [r19] Work on rah_inotify.c; Updated copyrights, adde... |
README | 2006-10-06 | bluefox | [r3] Updated README |
rah.c | 2006-10-17 | bluefoxicy | [r19] Work on rah_inotify.c; Updated copyrights, adde... |
rah_core.c | 2006-10-17 | bluefoxicy | [r20] fixed rah_core.c to not have dPRINTF defined |
rah_inotify.c | 2006-10-17 | bluefoxicy | [r19] Work on rah_inotify.c; Updated copyrights, adde... |
rah_internal.h | 2006-10-17 | bluefoxicy | [r19] Work on rah_inotify.c; Updated copyrights, adde... |
rhtest.c | 2006-10-17 | bluefoxicy | [r19] Work on rah_inotify.c; Updated copyrights, adde... |
ReadAHead-ng

This is a library and support framework for dynamic readahead() in Linux. For systems without readahead(), we supply a simulation method using mmap() and assignment; this simulation is less efficient, but should do the same job.

There are several major parts of the framework, and the subsystem is object oriented.

RAH OBJECTS

ReadAHead-ng is object oriented. Objects are not thread safe; the program builds an RAH-ng module and then passes it to RAH-ng. This module contains instructions such as "Monitor my process tree," "RAH these files," and "Write an analysis log to /var/log/rah-ng/xxx". RAH-ng runs in a separate thread to avoid blocking and COW CPU thrashing.

Currently the instructions we foresee include:

 - OPTIMIZE. Optimizations to readahead order, such as micro-reordering, will be performed at run time.
 - FILE. Specifies a file to readahead().
 - MONITOR. Specifies a directory to monitor non-recursively for access.
 - LOG. Specifies a file handle to log monitor data to.
 - REPLAY. Specifies a file handle to a log to replay.

These instructions will be used in various ways to carry out RAH-ng functionality. For example, MONITOR can be used to monitor access during a specific operation such as booting, and write it out to a file descriptor specified by LOG. Later, REPLAY can specify the log when that operation occurs, while MONITOR and LOG can be used to monitor the operation again and eventually re-log it, optimizing for any changes made to the process.

The behavior of the OPTIMIZE instruction is not yet specified. An optimization function will reorder RAH-ng modules so that MONITOR and LOG commands come first, and place OPTIMIZE at the beginning. The exact behavior of RAH-ng after encountering OPTIMIZE is something that will be tuned over time. Currently we are thinking of OPTIMIZE techniques such as micro-reordering files during execution of a module.
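The module reordering described above can be illustrated with a minimal sketch. Everything named here (enum rah_op, struct rah_insn, rah_rank(), rah_reorder()) is hypothetical and not taken from the RAH-ng sources. It assumes one reading of the text: OPTIMIZE goes to the very front, MONITOR and LOG follow, and the remaining instructions keep their relative order.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical instruction kinds, named after the list in this README. */
enum rah_op { RAH_OPTIMIZE, RAH_FILE, RAH_MONITOR, RAH_LOG, RAH_REPLAY };

/* Hypothetical instruction record: one operation plus its argument. */
struct rah_insn {
    enum rah_op op;
    const char *path;   /* file or directory; NULL when unused */
    int fd;             /* log descriptor; -1 when unused */
};

/* Assumed ordering: OPTIMIZE first, then MONITOR/LOG, then everything else. */
static int rah_rank(enum rah_op op)
{
    switch (op) {
    case RAH_OPTIMIZE: return 0;
    case RAH_MONITOR:
    case RAH_LOG:      return 1;
    default:           return 2;
    }
}

/*
 * Stable insertion sort by rank. Instructions of equal rank keep
 * their original relative order, so the FILE sequence is preserved.
 */
void rah_reorder(struct rah_insn *insn, size_t n)
{
    assert(insn != NULL || n == 0);
    for (size_t i = 1; i < n; i++) {
        struct rah_insn key = insn[i];
        size_t j = i;
        while (j > 0 && rah_rank(insn[j - 1].op) > rah_rank(key.op)) {
            insn[j] = insn[j - 1];
            j--;
        }
        insn[j] = key;
    }
}
```

A stable sort is used so that the order of FILE entries, which encodes the intended read order, is never disturbed by moving the control instructions forward.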
Under micro-reordering, files will still be read almost in order; the technique will move a file up to two positions away from its original position in an attempt to get the smaller files earlier. This preserves overall flow, but brings the highest number of seeks to the earliest access, so if the program catches up it will likely catch up to a minimally blocking read. Another possible gain is that large files are probably read in pieces, so they may benefit from kernel-initiated readahead anyway.

ADAPTIVE READAHEAD

ReadAHead-ng will take advantage of inotify on Linux to log access patterns for certain events. For example, init can trigger RAH-ng so that it can read files needed for boot off disk; during boot, RAH-ng will analyze disk usage and make any amendments to its previous analysis as needed. When boot finishes, init will terminate RAH-ng, which will write its amendments back to disk.

By supplying an adaptive framework, RAH-ng allows system administrators to benefit from RAH-ng without any reconfiguration as the system changes. The object oriented nature of RAH-ng allows for separate logs and analyses to be made. For example, a process such as init can start RAH-ng on /lib, /lib/modules/[booted kernel tree], /bin, and /sbin; and then, after mounting /usr, construct another RAH-ng module to execute on /usr.

A method for blocking threads until another thread finishes its readahead chain is also in the works; the thread will finish its final readahead() call, unblock dependent threads, and continue monitoring its paths.

INTELLIGENT READAHEAD

ReadAHead-ng's intelligent readahead treats files according to their type. In particular, several sections of an ELF file are targeted, and other sections are read only if they fall between the targeted sections and aren't very long. With ELF files, we particularly want to avoid reading .text, the longest and least relevant section of the ELF file.
Relocations, the GOT, and a few other headers are used during dynamic linking; these are important and must be faulted in before the library or executable is used. Other sections such as .text will not have their entire area used, and will not be used until dynamic linking is finished. Reading in the ELF headers will allow the program to load faster; its execution will not be as IO bound, and reading its .text in ahead of time is a waste of time and page cache.

We predict that skipping only a single page or so would just increase latency. If .text is 6000 bytes and falls between two useful sections, such that there is a single page between them that consists of only .text, then it is faster to just read straight across .text and not bother skipping. On the other hand, if .text is 800K, it's worth skipping and picking back up at .got. (NOTE: Systems using flash probably should just skip anything they can; seek time is effectively 0.)

There are other special optimizations that can be performed on ELF files. We know .data and .bss will be needed, but not immediately; RAH-ng can perform all relevant readahead() operations and then return to ELF files to read .data and .bss where reading them could not be justified in the first pass. It would also be possible to discover the location of main() and call readahead() on it and on the .init section.
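The skip-or-read-across decision above can be sketched as a range-coalescing step. This is only an illustration: struct rah_range, rah_coalesce() and the max_gap parameter are hypothetical names, and this README does not fix a threshold, so any value passed as max_gap is an assumption to be tuned.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical byte range inside a file: [off, off + len). */
struct rah_range {
    unsigned long off;
    unsigned long len;
};

/*
 * Merge neighbouring ranges in place when the gap between them is at
 * most max_gap bytes, i.e. when reading straight across is assumed to
 * be cheaper than skipping. The input must be sorted by offset and
 * must not overlap. Returns the number of ranges left.
 */
size_t rah_coalesce(struct rah_range *r, size_t n, unsigned long max_gap)
{
    if (n == 0)
        return 0;
    assert(r != NULL);

    size_t out = 0;                 /* index of the range being grown */
    for (size_t i = 1; i < n; i++) {
        unsigned long end = r[out].off + r[out].len;
        assert(r[i].off >= end);    /* sorted, non-overlapping input */
        if (r[i].off - end <= max_gap) {
            /* Small gap: extend the current range over it. */
            r[out].len = r[i].off + r[i].len - r[out].off;
        } else {
            /* Large gap: keep it as a skip and start a new range. */
            r[++out] = r[i];
        }
    }
    return out + 1;
}
```

With a threshold of a couple of pages, a 6000-byte stretch of .text between two wanted sections is read across, while an 800K stretch stays a skip. Passing max_gap = 0 corresponds to the flash case in the NOTE, where every gap is skipped. Each resulting range could then be handed to readahead() as an offset and a count.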