From: SourceForge.net <no...@so...> - 2010-02-17 01:06:43
|
Bugs item #2945391, was opened at 2010-02-03 13:04 Message generated for change (Comment added) made by vgapeyev You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=1126676&aid=2945391&group_id=248804 Please note that this message will contain a full copy of the comment thread, including the initial issue submission, for this request, not just the latest update. Category: data Group: None >Status: Closed Priority: 9 Private: No Submitted By: William Piel (sfrgpiel) >Assigned to: Vladimir Gapeyev (vgapeyev) Summary: New Indices Needed Initial Comment: Because we have reverted to an older instance of the data (and schema), there are some changes that we may need to make. One is to create indexes on fields that are likely to be queried a lot using something like "CREATE INDEX taxonlabel_i ON taxonlabel USING btree (taxonlabel);". At the bottom is a list of all our existing indices -- although all the PKs are covered other fields would seem to benefit from an index. I would assume that the following fields need to be indexed: Table Field -------- ------- taxon name taxonvariant fullname taxonlabel taxonlabel citation title citation abstract person lastname In addition, I would think that FKs need to be indexed as well, although I don't know which are the critical ones. What is clear is that some queries are very very slow. For example, do a taxon variant name search on "Homo sapiens" (this results in a selection of two taxon records) and then click the "Trees" tab -- after 2 minutes it returned a "Proxy Error: The proxy server received an invalid response from an upstream server. Reason: Error reading from remote host" which I'm guessing is a time-out of some sort. At any rate, the FK called taxonlabel_id and the FK called phylotree_id in the phylotreenode table probably should be indexed to speed up querying for trees based on a selection of taxa. Undoubtedly there are other FKs that need indexing, but this requires knowledge on how common queries are made via hibernate. These changes to the schema should be applied to both the dev and stage databases, as well as noted in the instantiation scripts. bp SELECT n.nspname AS "Schema", t.relname AS "Table", c.relname AS "Index" FROM pg_catalog.pg_class c JOIN pg_catalog.pg_namespace n ON n.oid = c.relnamespace JOIN pg_catalog.pg_index i ON i.indexrelid = c.oid JOIN pg_catalog.pg_class t ON i.indrelid = t.oid WHERE c.relkind = 'i' AND n.nspname NOT IN ('pg_catalog', 'pg_toast') AND pg_catalog.pg_table_is_visible(c.oid) ORDER BY n.nspname, t.relname, c.relname; Schema | Table | Index --------+-----------------------------+-------------------------------------------- public | algorithm | algorithm_pkey public | analysis | analysis_pkey public | analysis | analysis_study_id_idx public | analysisstep | analysisstep_pkey public | analyzeddata | analyzeddata_pkey public | ancestralstate | ancestralstate_pkey public | ancstateset | ancstateset_pkey public | chargroup | chargroup_pkey public | chargroup_colrange | chargroup_colrange_columnrange_id_key public | charpartition | charpartition_pkey public | charset | charset_pkey public | charset_colrange | charset_colrange_columnrange_id_key public | charweight | charweight_pkey public | charweight_colrange | charweight_colrange_columnrange_id_key public | charweightset | charweightset_pkey public | citation | citation_pkey public | citation_author | citation_author_pkey public | citation_editor | citation_editor_pkey public | citationstatus | citationstatus_pkey public | coderecord_colrange | coderecord_colrange_columnrange_id_key public | codonchar1_colrange | codonchar1_colrange_columnrange_id_key public | codonchar2_colrange | codonchar2_colrange_columnrange_id_key public | codonchar3_colrange | codonchar3_colrange_columnrange_id_key public | codonnoncoding_colrange | codonnoncoding_colrange_columnrange_id_key public | codonpositionset | codonpositionset_pkey public | columnrange | columnrange_pkey public | compound_element | compound_element_element_id_key public | compound_element | compound_element_pkey public | cstreenode | cstreenode_pkey public | discretecharstate | discretecharstate_phylochar_id_idx public | discretecharstate | discretecharstate_pkey public | distancematrixelement | distancematrixelement_pkey public | gapmode | gapmode_pkey public | geneticcode | geneticcode_pkey public | geneticcoderecord | geneticcoderecord_pkey public | geneticcodeset | geneticcodeset_pkey public | help | help_pkey public | itemdefinition | itemdefinition_pkey public | itemvalue | itemvalue_pkey public | matrix | matrix_pkey public | matrix_itemdefinition | matrix_itemdefinition_pkey public | matrixcolumn | matrixcolumn_phylochar_id_idx public | matrixcolumn | matrixcolumn_pkey public | matrixcolumn_itemdefinition | matrixcolumn_itemdefinition_pkey public | matrixdatatype | matrixdatatype_phylochar_id_idx public | matrixdatatype | matrixdatatype_pkey public | matrixelement | matrixelement_discretecharstate_id_idx public | matrixelement | matrixelement_itemdefinition_id_idx public | matrixelement | matrixelement_pkey public | matrixkind | matrixkind_pkey public | matrixrow | matrixrow_matrix_id_idx public | matrixrow | matrixrow_pkey public | matrixrow | matrixrow_taxonlabel_id_idx public | nodeattribute | nodeattribute_pkey public | person | person_pkey public | phylochar | phylochar_pkey public | phylotree | phylotree_pkey public | phylotreenode | phylotreenode_pkey public | polytcount | polytcount_pkey public | rowsegment | rowsegment_matrixrow_id_idx public | rowsegment | rowsegment_pkey public | software | software_pkey public | statechangeset | statechangeset_pkey public | stateformat | stateformat_pkey public | statemodifier | statemodifier_pkey public | stateset | stateset_pkey public | stepmatrixelement | stepmatrixelement_pkey public | study | study_citation_id_idx public | study | study_pkey public | study_nexusfile | study_nexusfile_pkey public | studystatus | studystatus_pkey public | sub_matrix | sub_matrix_matrix_id_key public | sub_matrix | sub_matrix_pkey public | sub_taxonlabel | sub_taxonlabel_pkey public | sub_taxonlabel | sub_taxonlabel_taxonlabel_id_key public | sub_treeblock | sub_treeblock_pkey public | sub_treeblock | sub_treeblock_treeblock_id_key public | submission | submission_pkey public | submission | submission_study_id_idx public | submission | submission_study_id_key public | taxon | taxon_pkey public | taxonauthority | taxonauthority_pkey public | taxonlabel | taxonlabel_pkey public | taxonlabelgroup | taxonlabelgroup_pkey public | taxonlabelpartition | taxonlabelpartition_pkey public | taxonlabelset | taxonlabelset_pkey public | taxonlabelset_taxonlabel | taxonlabelset_taxonlabel_pkey public | taxonlink | taxonlink_pkey public | taxonset | taxonset_pkey public | taxonset_taxon | taxonset_taxon_pkey public | taxonvariant | taxonvariant_pkey public | treeattribute | treeattribute_pkey public | treeblock | treeblock_pkey public | treegroup | treegroup_pkey public | treegroup_phylotree | treegroup_phylotree_pkey public | treekind | treekind_pkey public | treenodeedge | treenodeedge_pkey public | treepartition | treepartition_pkey public | treequality | treequality_pkey public | treeset | treeset_pkey public | treeset_phylotree | treeset_phylotree_pkey public | treetype | treetype_pkey public | typeset | typeset_pkey public | user | user_pkey public | user | user_username_key public | userrole | userrole_pkey public | usertype | usertype_pkey public | usertyperecord | usertyperecord_pkey public | usertyperrd_colrange | usertyperrd_colrange_columnrange_id_key (109 rows) ---------------------------------------------------------------------- >Comment By: Vladimir Gapeyev (vgapeyev) Date: 2010-02-16 20:06 Message: For fixes requested in the ticket, see below. The problem, however, persists, see #2953212, https://sourceforge.net/tracker/index.php?func=detail&aid=2953212&group_id=248804&atid=1126676 I have created indexes for the fields mentioned in the request, except for citation(abstract). I doubt, however, these would lead to any efficiency gains: any standard DB index (btree, hash, ...) helps with "full-field" comparisons, while the request probably meant google-style "within-the-field" searches. I did not index citation(abstract) exactly because the point is the most apparent for this field. I have created indexes for FKs that participate in tree reconstruction. It is possible some FKs that participate in matrix reconstruction would also benefit -- we can do those later, when we catch slow queries. ---------------------------------------------------------------------- Comment By: William Piel (sfrgpiel) Date: 2010-02-08 10:55 Message: This is assigned to Hilmar. ---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=1126676&aid=2945391&group_id=248804 |