From: Leyne, S. <Se...@br...> - 2007-07-24 16:11:01
Alex,

> > Disk sectors have been 512 bytes for the last 20 years and likely for
> > many more. Accordingly, our disk accesses will always be on sector
> > boundaries.
>
> Sean, problem is that not all of our buffers are 512-byte aligned now.

Which buffers?

The only buffers which need to be aligned would be those that are written
to disk AND where the file was opened with the special flag. I was
expecting that only the database file would be opened with the flag.

Sean
From: Alex P. <pes...@ma...> - 2007-07-25 06:38:27
On Tuesday 24 July 2007 20:12, Leyne, Sean wrote:
> Alex,
>
> > > Disk sectors have been 512 bytes for the last 20 years and likely
> > > for many more. Accordingly, our disk accesses will always be on
> > > sector boundaries.
> >
> > Sean, problem is that not all of our buffers are 512-byte aligned
> > now.
>
> Which buffers?

For example, this one:

    static bool raw_devices_validate_database (
    ...
        char header[MIN_PAGE_SIZE];
    ...
        const ssize_t bytes = read(desc, header, sizeof(header));

I don't say it's impossible to make all of them aligned, but it has not
been done yet.

> The only buffers which need to be aligned would be those that would be
> written to disk AND where the file was opened with the special flag.

And read from disk too, methinks?

> I was expecting that only the database file would be opened with the
> flag.

Yes, certainly.
From: Alex P. <pes...@ma...> - 2007-07-26 06:11:25
On Wednesday 25 July 2007 17:30, Vlad Horsun wrote:
> > I know various db "vendors" support direct I/O as a db option
> > on Linux, using the current O_DIRECT implementation, but they
> > had to be careful in their code to avoid certain O_DIRECT
> > kernel bugs and race conditions. For example, it seems mixing
> > O_DIRECT and non-O_DIRECT reads and writes on the same file
> > simultaneously may result in stale data reads or writes. The
> > workaround: don't do that!
>
> posix_fadvise is free from this drawback and does not restrict us to
> aligned memory buffers, right?

But it lacks one more important piece of support - direct I/O from/to
user space.

I suggest we begin with O_DIRECT, and only in case it has serious
problems try posix_fadvise(). This is even more important taking into
account that 2.4 kernels do not support posix_fadvise().
From: Vlad H. <hv...@us...> - 2007-07-26 10:13:57
Attachments:
unix.diff
> > > I know various db "vendors" support direct I/O as a db option
> > > on Linux, using the current O_DIRECT implementation, but they
> > > had to be careful in their code to avoid certain O_DIRECT
> > > kernel bugs and race conditions. For example, it seems mixing
> > > O_DIRECT and non-O_DIRECT reads and writes on the same file
> > > simultaneously may result in stale data reads or writes. The
> > > workaround: don't do that!
> >
> > posix_fadvise is free from this drawback and does not restrict us to
> > aligned memory buffers, right?
>
> But it lacks one more important piece of support - direct I/O from/to
> user space.

If I understand correctly what Linus said about the O_DIRECT
implementation details - there is no direct I/O from/to user space. It
must in any case be coherent with the cache state. I may be wrong.

Anyway - the cost of a context switch and a page copy is far less than
the cost of direct access to disk, isn't it?

> I suggest we begin with O_DIRECT, and only in case it has serious
> problems try posix_fadvise().

Ok. In unix.cpp\PIO_force_write there is code I don't understand
completely. Please explain:

- why is there no call to fcntl(F_GETFL)?
- why does the "hpux" platform use "union fcntlun" while I can't find it
  on http://docs.hp.com (fcntl is described at
  http://docs.hp.com/en/B2355-60130/fcntl.2.html)?

I've attached a diff with the proposed fix, could you take a look at it?

> This is even more important taking into account that
> 2.4 kernels do not support posix_fadvise().

"Do not support" means:
a) not implemented but the entry point exists, or
b) the entry point is missing?

If (a) - I think we can live with it.

Regards,
Vlad
From: Alex P. <pes...@ma...> - 2007-07-26 10:53:38
On Thursday 26 July 2007 14:13, Vlad Horsun wrote:
> > But it lacks one more important piece of support - direct I/O from/to
> > user space.
>
> If I understand correctly what Linus said about the O_DIRECT
> implementation details - there is no direct I/O from/to user space. It
> must in any case be coherent with the cache state. I may be wrong.

This is from 'man open':

    O_DIRECT
        Try to minimize cache effects of the I/O to and from this file.
        In general this will degrade performance, but it is useful in
        special situations, such as when applications do their own
        caching. File I/O is done directly to/from user space buffers.
        The I/O is synchronous, i.e., at the completion of a read(2) or
        write(2), data is guaranteed to have been transferred. Under
        Linux 2.4 transfer sizes, and the alignment of user buffer and
        file offset must all be multiples of the logical block size of
        the file system. Under Linux 2.6 alignment to 512-byte
        boundaries suffices.

> Anyway - the cost of a context switch and a page copy is far less than
> the cost of direct access to disk, isn't it?

But if we can avoid them, why not?

> > I suggest we begin with O_DIRECT, and only in case it has serious
> > problems try posix_fadvise().
>
> Ok. In unix.cpp\PIO_force_write there is code I don't understand
> completely. Please explain:
>
> why is there no call to fcntl(F_GETFL)?

Hmmm... looks like it will be useful to do it.

> why does the "hpux" platform use "union fcntlun" while I can't find it
> on http://docs.hp.com (fcntl is described at
> http://docs.hp.com/en/B2355-60130/fcntl.2.html)?
>
> I've attached a diff with the proposed fix, could you take a look at it?

OK from the F_GETFL point of view, but I do not want to modify hpux
blindly. Maybe it does not match the docs? This code has been unmodified
since 1.5, which has an hpux port, which works.

> > This is even more important taking into account that
> > 2.4 kernels do not support posix_fadvise().
>
> "Do not support" means:
> a) not implemented but the entry point exists, or
> b) the entry point is missing?
>
> If (a) - I think we can live with it.

Yes and no. People who work with Firebird under really high loads have
noticed that CS works absolutely fine with 2.4 but has problems with 2.6.
A quick test (far from a final result) showed that the kernels behave
slightly differently with SysV semaphores. Therefore it's rather
desirable to have the implementation fully operational for 2.4 too.
O_DIRECT is present and implemented in both.
From: Adriano d. S. F. <adr...@uo...> - 2007-07-27 16:39:38
Hi!

Testing the new feature, I see a problem.

Since we maintain the cache per-database, when all attachments are
disconnected, we lose all the cache, causing huge performance problems.

Adriano
From: Vlad H. <hv...@us...> - 2007-07-27 16:49:16
> Testing the new feature, I see a problem.
>
> Since we maintain the cache per-database, when all attachments are
> disconnected, we lose all the cache, causing huge performance problems.

How is this related to the new feature?

Regards,
Vlad
From: Adriano d. S. F. <adr...@uo...> - 2007-07-27 16:52:54
Vlad Horsun wrote:
> > Testing the new feature, I see a problem.
> >
> > Since we maintain the cache per-database, when all attachments are
> > disconnected, we lose all the cache, causing huge performance
> > problems.
>
> How is this related to the new feature?

Not directly related, but with the FS cache enabled, the OS caches even
closed files. This is a big problem with the FS cache disabled...

Imagine all users disconnect from the application at the end of the day
and reconnect again the next day.

Just for example, I have a query that runs in 250 seconds when the
database is not cached, and 3 seconds when it is cached (i.e., running
it again even after I close the server).

Adriano
From: Vlad H. <hv...@us...> - 2007-07-27 17:00:35
> > > Testing the new feature, I see a problem.
> > >
> > > Since we maintain the cache per-database, when all attachments are
> > > disconnected, we lose all the cache, causing huge performance
> > > problems.
> >
> > How is this related to the new feature?
>
> Not directly related, but with the FS cache enabled, the OS caches even
> closed files.

Yes, for some time.

> This is a big problem with the FS cache disabled...

I.e. you disabled it explicitly for some database?

> Imagine all users disconnect from the application at the end of the day
> and reconnect again the next day.

Do you think the OS will retain the FS cache over the night? ;) It must
not be a very busy server ;)

> Just for example, I have a query that runs in 250 seconds when the
> database is not cached, and 3 seconds when it is cached (i.e., running
> it again even after I close the server).

And what?

This feature:

a) is disabled by default. To enable it you must set page buffers for
   some (or all) databases to more than 65536 (not a very common value,
   is it?) or set MaxFileSystemCache to some small value (less than
   2048, the default).

b) if you decide to use it, you must understand what you are doing.

I'll write a short guide on how and when to use it.

Regards,
Vlad
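[Editor's note] Based on Vlad's description, enabling the feature might look like the fragment below. The parameter name and the 2048-page default are taken from this thread only, so treat the exact spelling and semantics as assumptions until the promised guide appears:

```
# firebird.conf -- databases whose page buffers setting exceeds this
# threshold (in pages) bypass the OS file system cache.
# The thread gives 2048 as the default value.
MaxFileSystemCache = 1024
```

The per-database page buffers value mentioned alongside it is set with the standard gfix -buffers switch on each database.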
From: Adriano d. S. F. <adr...@uo...> - 2007-07-27 17:07:55
Vlad Horsun wrote:
> > This is a big problem with the FS cache disabled...
>
> I.e. you disabled it explicitly for some database?

Increased page buffers to 65536.

> > Imagine all users disconnect from the application at the end of the
> > day and reconnect again the next day.
>
> Do you think the OS will retain the FS cache over the night? ;) It must
> not be a very busy server ;)

If there is no activity on the server, I expect the OS not to lose the
cache for no reason.

> > Just for example, I have a query that runs in 250 seconds when the
> > database is not cached, and 3 seconds when it is cached (i.e.,
> > running it again even after I close the server).
>
> And what?
>
> This feature:
>
> a) is disabled by default. To enable it you must set page buffers for
>    some (or all) databases to more than 65536 (not a very common
>    value, is it?) or set MaxFileSystemCache to some small value (less
>    than 2048, the default).
>
> b) if you decide to use it, you must understand what you are doing.
>
> I'll write a short guide on how and when to use it.

I know this is as designed, but it is not good.

Also, imagine the admin puts the database in exclusive mode for some
task, then disconnects after making it online again.

This really limits the usage of the new feature.

Adriano
From: Alexandre B. S. <ib...@th...> - 2007-07-27 18:18:40
Adriano dos Santos Fernandes wrote:
> I know this is as designed, but it is not good.
>
> Also, imagine the admin puts the database in exclusive mode for some
> task, then disconnects after making it online again.
>
> This really limits the usage of the new feature.

I got your point... But I think it's worth it... How much time will FB
need to fill the cache? 10 minutes? (I believe even less.) When a
non-cached database starts to work, the first queries will be slow, but
I think that in very little time the most used data will be in the cache
again.

see you !

--
Alexandre Benson Smith
Development
THOR Software e Comercial Ltda
Santo Andre - Sao Paulo - Brazil
www.thorsoftware.com.br
From: Vlad H. <hv...@us...> - 2007-07-27 17:30:30
> I know this is as designed, but it is not good.

"Good" and "bad" depend very much on what you do and what you expect.

> Also, imagine the admin puts the database in exclusive mode for some
> task, then disconnects after making it online again.
>
> This really limits the usage of the new feature.

Nobody said it is for each and every usage.

Regards,
Vlad
From: Dmitry Y. <fir...@ya...> - 2007-07-27 16:51:06
Adriano dos Santos Fernandes wrote:
>
> Since we maintain the cache per-database, when all attachments are
> disconnected, we lose all the cache, causing huge performance problems.

Do you mean problems in subsequent [newly created] attachments? Well,
this is kind of expected. Apps that connect/disconnect often should not
use this feature, IMO.

Dmitry
From: Alex P. <pes...@ma...> - 2007-07-31 07:06:46
On Friday 27 July 2007 20:51, Dmitry Yemanov wrote:
> Adriano dos Santos Fernandes wrote:
> > Since we maintain the cache per-database, when all attachments are
> > disconnected, we lose all the cache, causing huge performance
> > problems.
>
> Do you mean problems in subsequent [newly created] attachments? Well,
> this is kind of expected. Apps that connect/disconnect often should
> not use this feature, IMO.

Problems after an idle night may be solved by adding one more attachment
to the database that does nothing by itself. But what about database
shutdown: if one uses a single attachment to shut down the database, do
some actions with it and bring it back online, the cache will be
preserved. But if one needs to change attachments in single-user
shutdown mode, the cache will be dropped. I suppose a well-designed
system does not need this mode too often.
From: Alexandre B. S. <ib...@th...> - 2007-07-27 16:54:46
Adriano dos Santos Fernandes wrote:
> Hi!
>
> Testing the new feature, I see a problem.
>
> Since we maintain the cache per-database, when all attachments are
> disconnected, we lose all the cache, causing huge performance problems.

I think it is as expected/designed.

see you !

--
Alexandre Benson Smith
Development
THOR Software e Comercial Ltda
Santo Andre - Sao Paulo - Brazil
www.thorsoftware.com.br
From: Vlad H. <hv...@us...> - 2007-07-27 17:01:11
> > Testing the new feature, I see a problem.
> >
> > Since we maintain the cache per-database, when all attachments are
> > disconnected, we lose all the cache, causing huge performance
> > problems.
>
> I think it is as expected/designed.

Exactly.

Regards,
Vlad