Re: [Gmod-schema] Postgresql DBA-type question

SourceForge Headquarters 1320 Columbia Street Suite 310 San Diego, CA 92101 +1 (858) 422-6466

Hi Scott,

Which Pg server are you using?

Explaining your query1 on Pg 7.3 using srcfeature_id = 1, 4, 6, I'm not seeing 
any difference in index usage.   Eg:

explain select distinct f.name,fl.min,fl.max,fl.strand,f.type_id,f.feature_id
   from feature f, featureloc fl
     where
     fl.srcfeature_id = 6 and
     f.feature_id  = fl.feature_id and
     fl.max >= 390956 and
     fl.min <= 393164
;

-Dave

From: Scott Cain <ca...@cs...>
>> To: gmod schema <gmo...@li...>
>> Subject: [Gmod-schema] Postgresql DBA-type question
>> 
>> Hello,
>> 
>> This is going to be fairly long.  
>> 
>> I am on a continuing quest to improve query performance for queries used
>> frequently by gbrowse.  The most common time consuming query is to find
>> all of the features in a given range.  To do that, there are two types
>> of queries that can be used, which I will refer to as Query1 and Query2.
>> Query1 uses the coordinate system I have advocated the last couple of
>> weeks, with min and max columns and min < max always.  (Those column
>> names should be changed to avoid conflicts with restricted words.)
>> Query2 uses a union of two queries that use the other coordinate system
>> (nbeg and nend) combined with strand.  Here are examples:
>> 
>> Query1:
>> select distinct f.name,fl.min,fl.max,fl.strand,f.type_id,f.feature_id
>>    from feature f, featureloc fl
>>      where
>>      fl.srcfeature_id = 1 and
>>      f.feature_id  = fl.feature_id and
>>      fl.max >= 390956 and
>>      fl.min <= 393164
>> 
>> Query2:
>>     select distinct f.name,fl.nbeg,fl.nend,fl.strand,f.type_id,f.feature_id
>>     from feature f, featureloc fl
>>     where
>>       fl.srcfeature_id = 1 and
>>       f.feature_id  = fl.feature_id and
>>       fl.nbeg <= 393164 and fl.nend >= 390956 and strand=1
>>    union
>>     select distinct f.name,fl.nbeg,fl.nend,fl.strand,f.type_id,f.feature_id
>>     from feature f, featureloc fl
>>     where
>>       fl.srcfeature_id = 1 and
>>       f.feature_id  = fl.feature_id and
>>       fl.nbeg >= 390956 and fl.nend <= 393164 and strand=-1
>> 
>> Here is the odd thing that I can't explain: the performance (read:index
>> usage) seems to vary by srcfeature_id.  Here's a table showing what I
>> mean (values are msec from explain analyze):
>> 
>> src_id      Query1      Query2
>> 1        116433.34   209532.05
>> 2           979.55     1039.81
>> 4            36.76      251.25
>> 5           369.95      901.55
>> 6        108404.04   208322.54
>> 7           798.28     1131.86
>> 
>> Note that srcfeature_id 4 is much faster because there are relatively
>> few features on it.  The biggest difference is srcfeature_ids 1 and 6
>> taking 3 orders of magnitude longer than the other queries.  That is
>> because the query planner uses seq scans when the srcfeature_id is 1 or
>> 6, but for other srcfeature_ids, it uses the appropriate index (either
>> featureloc_src_nbeg_nend or featureloc_src_min_max depending on the
>> query).  I am at a total loss as to why postgres would do this; can
>> anyone with more experience with postgres enlighten me?
>> 
>> Note that this was done with gadfly 3 (filename gadfly_chado_dump2.gz)
>> if it matters to anyone.  I will probably take this to the postgres
>> mailing list as well to get there input.
>> 
>> Thanks,
>> Scott
>> 
>> 
>> -- 
>> ------------------------------------------------------------------------
>> Scott Cain, Ph. D.                                         ca...@cs...
>> GMOD Coordinator (http://www.gmod.org/)                     216-392-3087
>> Cold Spring Harbor Laboratory
>> 
>> 
>> 
>> -------------------------------------------------------
>> This SF.net email is sponsored by: Scholarships for Techies!
>> Can't afford IT training? All 2003 ictp students receive scholarships.
>> Get hands-on training in Microsoft, Cisco, Sun, Linux/UNIX, and more.
>> www.ictp.com/training/sourceforge.asp
>> _______________________________________________
>> Gmod-schema mailing list
>> Gmo...@li...
>> https://lists.sourceforge.net/lists/listinfo/gmod-schema
>> 
>>