From: Vlad H. <hv...@us...> - 2007-07-26 10:13:57
> > > I know various db "vendors" support direct I/O as a db option
> > > on Linux, using the current O_DIRECT implementation, but they
> > > had to be careful in their code to avoid certain O_DIRECT
> > > kernel bugs and race conditions. For example, it seems mixing
> > > O_DIRECT and non-O_DIRECT reads and writes on the same file
> > > simultaneously may result in stale data reads or writes. The
> > > workaround: don't do that!
> >
> > posix_fadvise is free from this drawback and does not restrict us
> > to aligned memory buffers, right?
>
> But it lacks one more important feature: direct I/O from/to user space.

If I understand correctly what Linus said about the O_DIRECT implementation details, there is no direct I/O from/to user space. It must stay coherent with the cache state anyway. I may be wrong. In any case, the cost of a context switch and a page copy is far less than the cost of direct access to disk, isn't it?

> I suggest we begin with O_DIRECT, and only in case it has serious problems try
> with posix_fadvise().

Ok. In unix.cpp\PIO_force_write there is code I don't completely understand. Please explain:

Why is there no call to fcntl(F_GETFL)?

Why is "union fcntlun" used for the "hpux" platform? I can't find it on http://docs.hp.com (fcntl is described at http://docs.hp.com/en/B2355-60130/fcntl.2.html).

I've attached a diff with a proposed fix; could you take a look at it?

> This is even more important taking into an account that
> 2.4 kernels do not support posix_fadvise().

Does "do not support" mean:
a) not implemented, but the entry point exists, or
b) the entry point is missing?

If (a), I think we can live with it.

Regards,
Vlad
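
To illustrate the F_GETFL question above, here is a minimal sketch, assuming the intent is to change a single file status flag without disturbing the others. It is not the attached diff and not the code from unix.cpp; set_status_flag is a hypothetical name used only for this example. Which flags F_SETFL may actually change depends on the platform, so treat it as an illustration of the read-modify-write pattern only.

```c
#include <fcntl.h>

/* Hypothetical helper, not taken from unix.cpp: turn one file status flag
 * on or off while preserving all the others.
 * Returns 0 on success, -1 on failure (errno is set by fcntl()). */
static int set_status_flag(int fd, int flag, int enable)
{
    /* Read the current flags first, so that F_SETFL below does not
     * silently clear flags that were set when the file was opened. */
    int flags = fcntl(fd, F_GETFL);
    if (flags == -1)
        return -1;

    if (enable)
        flags |= flag;
    else
        flags &= ~flag;

    /* Write the modified flag set back. */
    return fcntl(fd, F_SETFL, flags) == -1 ? -1 : 0;
}
```

A call such as set_status_flag(fd, O_APPEND, 1) would then leave the other status flags as they were; calling F_SETFL with a hard-coded value instead would replace them.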