From: Fulvio S. <ma...@fs...> - 2011-08-30 21:42:44
|
I have read a recent thread about ext4 being slower than ext3, and I have just discovered another problem. I have developed an open source program that catalogs digital images: all the information, included image thumbnails, are stored in a single database. My own database contains almost 20000 images and its size is about 460 MB. The program uses an embedded copy of Firebird for the Linux, Mac and Windows versions. When I open a catalog (a Firebird database) the program executes a number of queries to (partially) fill some tree controls that work like folders in a file system. The queries are always very simple, something like "SELECT * FROM Table WHERE someID = 5". The number of queries that are executed when opening a catalog changes with the stored data: for my own catalog the program executes approximately 600 queries. In Windows opening a catalog has not perceivable delay: the queries execute instantly. The same happens for the Linux version when it runs in a virtual machine. Now I have installed Ubuntu 10.04 LTS 64 bit in my laptop, formatting the disk partitions with the ext4 file system. When I use the same program to open my catalog, it takes about 23 seconds. That is in incredibly long time and I really cannot imagine how can it be so slow. For the whole 23 seconds the hard disk works continuously and cpu usage is very low. If I close the program and I open the catalog again, it takes the same 23 seconds again. When opening a catalog the program executes only SELECT queries, so it is only reading the database. It looks like Firefox reads are not cached, but this is very strange. Disk reads are usually cached: if I open a large program for the first time the hard disk works, if I open it again the disk does not work any more, so those reads are cached. I cannot imagine why Firebird disk reads are not cached, but this could be an explanation. The problem is that not only opening a catalog takes a lot of time, the program is painfully slow also when it is used. It is rather near to being unusable. I have tried copying exactly the same database file to an external hard disk connected to an USB port. The disk is formatted with ntfs and if I open that database the program's performance it lightning fast: instant catalog opening, and very fast usage. They are the same performances as the Windows version. The program uses an embedded copy of version 2.0.6. I have also tried installing version 2.1.3 super from the Ubuntu repositories but performance is the same. I have just reduced a little the size of my /home partition and I have added a small ext3 partition. I have copied the database to that ext3 partition: opening the catalog is very fast and the program runs very fast again. So it looks like there is a very big problem with ext4. Unfortunately mine is a desktop program so I cannot say "do not use ext4" or "do not use Ubuntu". People expect that the program works well with their own computers. I have also checked the hard disk with GSmartControl and it seems to work well. No errors or warnings reported. No disk errors in the system logs. It would be great if it were possible to fix this problem in some way. I would be glad to help in any possible way, giving more information and eventually testing. I can also post the database if anybody wants to test it. Thanks in advance Fulvio Senore |
From: Adriano d. S. F. <adr...@gm...> - 2011-08-30 22:04:07
|
On 30-08-2011 18:27, Fulvio Senore wrote: > > Now I have installed Ubuntu 10.04 LTS 64 bit in my laptop, formatting > the disk partitions with the ext4 file system. > When I use the same program to open my catalog, it takes about 23 > seconds. That is in incredibly long time and I really cannot imagine how > can it be so slow. For the whole 23 seconds the hard disk works > continuously and cpu usage is very low. If I close the program and I > open the catalog again, it takes the same 23 seconds again. > When opening a catalog the program executes only SELECT queries, so it > is only reading the database. > It looks like Firefox reads are not cached, but this is very strange. > Disk reads are usually cached: if I open a large program for the first > time the hard disk works, if I open it again the disk does not work any > more, so those reads are cached. > I cannot imagine why Firebird disk reads are not cached, but this could > be an explanation. > Are you sure you're not executing each query in a different transaction? Please also try to set the database to async mode (forced write off) and see if it changes. Adriano |
From: Fulvio S. <ma...@fs...> - 2011-08-31 08:25:52
|
Il 31/08/2011 0.03, Adriano dos Santos Fernandes ha scritto: > On 30-08-2011 18:27, Fulvio Senore wrote: >> >> Now I have installed Ubuntu 10.04 LTS 64 bit in my laptop, formatting >> the disk partitions with the ext4 file system. >> When I use the same program to open my catalog, it takes about 23 >> seconds. That is in incredibly long time and I really cannot imagine how >> can it be so slow. For the whole 23 seconds the hard disk works >> continuously and cpu usage is very low. If I close the program and I >> open the catalog again, it takes the same 23 seconds again. >> When opening a catalog the program executes only SELECT queries, so it >> is only reading the database. >> It looks like Firefox reads are not cached, but this is very strange. >> Disk reads are usually cached: if I open a large program for the first >> time the hard disk works, if I open it again the disk does not work any >> more, so those reads are cached. >> I cannot imagine why Firebird disk reads are not cached, but this could >> be an explanation. >> > > Are you sure you're not executing each query in a different transaction? > > Please also try to set the database to async mode (forced write off) and > see if it changes. > Thank you for your quick answer. Setting the database to forced writes off made opening the catalog a very quick task. I must admit that I was executing each query in its own transaction. Since I was only executing SELECT statements I was thinking that there was no problem in working this way. I never had performance problems with Windows and Linux in virtual machines, so I did non think that this might be a problem. I have tried executing all the queries in a single transaction and now opening the catalog is lightning fast! It looks like opening and closing a transaction requires writing to the database, even if the transaction only executes SELECT statements. Lesson learned. Thank you for solving my problem. I am still wondering why performance was so bad with ext4 only, but this is not a real problem any more, just a matter of curiosity. Fulvio |
From: Alex P. <pes...@ma...> - 2011-08-31 08:40:34
|
On 08/31/11 12:25, Fulvio Senore wrote: > > I must admit that I was executing each query in its own transaction. > Since I was only executing SELECT statements I was thinking that there > was no problem in working this way. I never had performance problems > with Windows and Linux in virtual machines, so I did non think that this > might be a problem. > That's due to caching effects of VM. > I have tried executing all the queries in a single transaction and now > opening the catalog is lightning fast! It looks like opening and closing > a transaction requires writing to the database, even if the transaction > only executes SELECT statements. The fact that transaction is started/ended must be stored in database. The only exception from this rule is read-only database. > Lesson learned. > > Thank you for solving my problem. > > I am still wondering why performance was so bad with ext4 only, but this > is not a real problem any more, just a matter of curiosity. > A bit offtopic - can you provide a link to take a look at that application? |
From: Fulvio S. <ma...@fs...> - 2011-08-31 14:21:25
|
Il 31/08/2011 10.40, Alex Peshkoff ha scritto: > On 08/31/11 12:25, Fulvio Senore wrote: > > A bit offtopic - can you provide a link to take a look at that application? > Sure: http://vvvp.sourceforge.net/index.html Fulvio |
From: Mark R. <ma...@la...> - 2011-08-31 08:57:41
|
On Tue, 30 Aug 2011 23:27:01 +0200, Fulvio Senore <ma...@fs...> wrote: > When I open a catalog (a Firebird database) the program executes a > number of queries to (partially) fill some tree controls that work like > folders in a file system. The queries are always very simple, something > like "SELECT * FROM Table WHERE someID = 5". The number of queries that > are executed when opening a catalog changes with the stored data: for my > own catalog the program executes approximately 600 queries. Unrelated to your actual problem: I don't know your database structure and requirements, but a query explosion to 600 queries for opening a catalog (which on the surface sounds easy) seems a bit excessive. Are you sure you aren't falling in the N+1 query trap, where 1 query could suffice? Mark |
From: Fulvio S. <ma...@fs...> - 2011-08-31 14:20:39
|
Il 31/08/2011 10.57, Mark Rotteveel ha scritto: > On Tue, 30 Aug 2011 23:27:01 +0200, Fulvio Senore<ma...@fs...> > wrote: >> When I open a catalog (a Firebird database) the program executes a >> number of queries to (partially) fill some tree controls that work like >> folders in a file system. The queries are always very simple, something >> like "SELECT * FROM Table WHERE someID = 5". The number of queries that >> are executed when opening a catalog changes with the stored data: for my > >> own catalog the program executes approximately 600 queries. > > Unrelated to your actual problem: I don't know your database structure and > requirements, but a query explosion to 600 queries for opening a catalog > (which on the surface sounds easy) seems a bit excessive. Are you sure you > aren't falling in the N+1 query trap, where 1 query could suffice? > > Mark I didn't know anything about the N+1 query trap, but Google was my friend. Yes, that is my problem, but I don't know how to find a better solution. My program shows 3 user selectable views. For each view a tree control at the left shows a structure of "folders" and "subfolders" where images are organized. When I open a catalog I load the first 2 levels of items in the tree controls, so that users can see all the first level items and they can see if an item has sub-items so it can be expanded. Further items loading in the tree controls happens when the user expands an existing item. Of course if the user has a lot of items in the tree controls this leads to an explosion of queries. This has never been a problem before, but I realize that a better solution would be welcome. At the moment I enumerate all the first level items with a query, then I execute a query for each item to find its sub-items. Items are stored in a single table, with a parent-child relationship created using a FATHER_ID field. I probably need a better solution, but probably this question would be better suited for the Firebird Support list. Fulvio |
From: Mark R. <ma...@la...> - 2011-08-31 14:37:26
|
On Wed, 31 Aug 2011 16:20:24 +0200, Fulvio Senore <ma...@fs...> wrote: > I didn't know anything about the N+1 query trap, but Google was my friend. > Yes, that is my problem, but I don't know how to find a better solution. > > My program shows 3 user selectable views. For each view a tree control > at the left shows a structure of "folders" and "subfolders" where images > are organized. > > When I open a catalog I load the first 2 levels of items in the tree > controls, so that users can see all the first level items and they can > see if an item has sub-items so it can be expanded. > Further items loading in the tree controls happens when the user expands > an existing item. > > Of course if the user has a lot of items in the tree controls this leads > to an explosion of queries. This has never been a problem before, but I > realize that a better solution would be welcome. > > At the moment I enumerate all the first level items with a query, then I > execute a query for each item to find its sub-items. > Items are stored in a single table, with a parent-child relationship > created using a FATHER_ID field. > I probably need a better solution, but probably this question would be > better suited for the Firebird Support list. Not 100% sure, but it sounds like a recursive Common Table Expression might be your friend here. See for example http://mycodingexperience.blogspot.com/2011/04/common-table-expression-cte-and-tree.html (in Firebird you will need to use WITH RECURSIVE for this) Mark |