From: Mike C. F. <mcf...@ro...> - 2003-12-05 21:49:19
|
Ivar Zarrans and myself have both encountered (separately) an interesting "bug" in PostgreSQL performance. Basically, the query planner, if given a query like this: UPDATE trigger SET frozen=False WHERE object_id=23; where object_id is an indexed big-int field of trigger, will *always* choose a sequential scan. On a large table, that is going to be unacceptably slow, especially when one is writing the queries with the assumption that you are using an indexed field. The problem here is that, even when a PgInt8 is passed as the object_id value, it is inserted into the query as an int4 literal (a simple number). The server then sees a request to search for the field with a value of a *different type* than the available index, so ignores the index. Apparently PostgreSQL 7.5 has a fix checked in for it, however, in the meantime, any 7.4 or below DB using bigints for specifying where's on large tables will be ridiculously slow on UPDATE (think multiple whole minutes to update a single row of a 88,000 row table). Apparently this is a general problem, int2 indexed fields also won't use the index if an int4 literal is specified as the search term, though I'd guess that int8 is the most common problem, as both people discovering it were using that. This can be worked around by: * adding quotes around the value '23', which will make it an unknown type * coercing value to target type; i.e. 23::bigint anywhere you are using an integer as a query term against a field that is int8 and for which you need indexing * altering all queries to explicitly coerce to the target type * Greg Stark, on the postgres performance list, suggested using "the new binary protocol for parameters [within pyPgSQL] so that there are no quoting issues at all", but that's not going to work with older PostgreSQL versions, and still wouldn't help if someone had passed a regular integer (it would help if a PgInt8 was passed, however). The first two need to be done by creating a subclass or wrapper of PgInt8 which does the munged representation in it's __str__/__repr__ method (so that pyPgSQL won't complain about an improperly formatted value). Here's a sample class: class Wrapper: def __init__( self, value ): self.value = value def __repr__( self ): return '%s::int8'%(self.value, ) __str__ = __repr__ It doesn't address the underlying problem, but it should allow people to move forward if they find themselves in this situation. Enjoy, Mike _______________________________________ Mike C. Fletcher Designer, VR Plumber, Coder http://members.rogers.com/mcfletch/ |