From: Vlad K. <hv...@us...> - 2008-02-27 23:32:45
|
> So, you're basically saying that it is known that the query containing > exists(select * from MOVI m where p.PROC = m.PROC and m.TIPO = 'B') is > considerably slower if the BLOB m.TEXT (not referenced at all) isn't > NULL. I still find it strange that I can do two things (count + query) > in 20 seconds, whereas only doing the query takes 10 minutes, but I'll > consider it a Firebird gotcha that I wasn't aware of and that hopefully > will be improved in a future version. Oh, it seems i was not clear enough. Well, give me a second try ;) When you run query with EXISTS (let's name it 'slow' query) first time, Firebird's page cache is empty, so many of the data pages must be read from disk. As MOVI is biggest relation we will consider only it. As it is not innermost (as shown by query plan) its pages reads from disk in random order. This is very slow process as you already know. When you run SELECT COUNT(*) FROM MOVI you fill the Firebird's page cache by MOVI's pages, also you fill OS's cache by this data and subsequent queries are faster because of this. With SELECT COUNT(*) all pages are reads in mostly sequential order and much faster than random ordered reads. So, reading big table in sequential order (SELECT COUNT(*)) is much faster than reading the same table in random order ('slow' query). When most amount of table pages is cached (by Firebird or by OS) this difference almost disappeared. About blobs. If MOVI contains blobs there is three cases : a) all blobs is null - table occupied only space needed to store non-blobs columns. Lets say it occupy N pages b) blobs is mostly small and stored within data pages - any queries against table will read more pages than in case (a). Tables data is diluted by small blobs. In this case table may occupy 2N, 3N and more pages c) blobs is mostly big (more than page size) - in this case blobs stored off data pages and will not be read from disk if query not referenced it. In this case table will occupy a little bit than N data pages and (not interested here) M pages of blobs So, any query against this table and not referenced blobs will read much more pages in case (b) than in cases (a) and (c). Hope this is more clear now ;) Regards, Vlad |