Re: [IBPP-DISCUSS] Slow writing to blobs
IBPP is a C++ client class library for FirebirdSQL
From: Olivier M. <om...@ti...> - 2014-06-09 08:30:08
On 7 June 2014 at 22:07, Mike Ro <mik...@gm...> wrote:

> I recently started a thread on this topic on the Firebird support
> mailing list, but would like to continue here because it seems to be
> more of an IBPP or API issue.
>
> I am using Firebird embedded (2.5.2) on Linux (Ubuntu 14.04) via IBPP
> with a completely default firebird.conf.
>
> My hardware is a Dell Optiplex 755, Intel Core 2 Duo @ 2.33GHz with
> 4 GB RAM and a 2 TB Western Digital WD20EARX-32P hard disk.
>
> When I use IBPP to write a blob (actually a music MP3) to a simple test
> database (which contains about 12 tables with 6 records in each) it
> takes around 35 seconds to write a 10 MB file.
>
> The bottleneck seems to be when IBPP is writing a segment:
>
>     (*gds.Call()->m_put_segment)(status.Self(), &mHandle,
>         (unsigned short)size, (char*)buffer);
>
> The blob type is zero and has a segment size of 4096, and I have
> matched this so the segment is written 4096 bytes at a time. I have
> tried smaller and larger sizes with little change in performance.

I have never even tried to specify a 'segment' size when defining a blob at the SQL level. I always use binary blobs in their simplest expression: something BLOB, which implies the binary subtype, in the spirit of 'leave the blob for what it's for': a binary large object of bytes meaningful only to the client saving and reading it. We use them an *awful* lot (though, yes, on Windows builds) and have never suffered from performance issues writing or reading them.

> I see similar performance inserting a blob using FlameRobin, but that
> is of course hardly surprising, and also with PHP. Reading the blob is
> instantaneous in all cases. On Windows the performance is much better.
>
> By contrast, using a UDF to write the blob (more or less the same as
> the Adhoc UDF) on Linux, it takes just 0.36 seconds to write 61 MB.
Of course the performance can be impacted by various factors:

- your DB page size and the size of the blob segments you write (the segment size does NOT have to be constant);
- the path between the client side and the server side (whether the data has to travel over a transport mechanism or not);
- the way I/O is handled by the FB server and its host OS (very different between the Linux and Windows ports).

The general rule is to write blobs in segments as large as possible: some old documentation refers to 32K-1 bytes as the upper safe bound. That's what Blob::Save(const std::string&) does, as an example. When reading back, you'll get the data in pieces never larger than their initial write size: if you wrote three 16K segments, you will get at best 3 reads of 16K, provided you supply a buffer large enough to read 16K. If your read buffer is smaller, the data will be sliced further and will require more round trips to be returned to you.

The API declares the size parameter as unsigned short, so it should be OK to use 64K-1 bytes as the segment size, but I'm pretty sure from memory that it had issues with older FB/IB versions, due to the usual programming error of mixing signed and unsigned in old C code hacked on by many different people over time.

So please:

- Do NOT use 4 KB as the size of the segments you write to your blobs. Use, let's say, 16 KB or 32K-1. Of course, if you have to write a single segment (or for the last segment), use the exact number of bytes needed so you don't store useless bytes. Be prepared to reuse equally large buffers when reading back.
- Check your database page size: larger is generally better when blobs are involved (more space on ONE page to store blob data, so fewer pages to allocate when writing a new blob and fewer pages to fetch when reading it back). I often use 16 KB as the database page size, which is a good trade-off for my own needs.
- Move your blob handling to the server side (if at all possible).
I understand I have an advantage here, as none of my programs ever use the network link of FB between the client and the server. Our code actually uses either the embedded server or connects locally (server name empty, so no ':' in the connection string).

> Do you think this could be something in IBPP or is it API related?

IBPP: no (I'd love it if IBPP had an impact, as it would mean I could easily do something about it), but unfortunately not. You've seen the Blob::Read() and Blob::Write() code; there is little they can do except wait for the FB API to return. The move of the bytes from the client realm to the server realm and back matters for performance, so the protocol and its implementation (involving both the client and server sides) are important here, but they may not be your primary source of unusual slowness. 35 seconds to write about 2560 segments of 4 KB sounds like a lot, but may not be so much if you're hit by an awful combination of page size / segment size / network writes / OS disk flush-write behavior. I'd anyway prefer to write ~330 segments of 32K than 2560 smaller ones.

Hope it helps you pinpoint the real perf issue,
Best,
--
Olivier Mascia
tipgroup.com/om