Menu

#1259 Excessive DB Queries/Privacy Checks in 4.1

v4.0.2
open
None
5
2009-11-08
2007-01-22
KosherJava
No

I am not sure that all of these can be addressed, but there seem to be too many queries and privacy checks in the lists.
1) Media List. I have about 255 media items. When showing the default 20 it runs "Total Database Queries: 598. Total privacy checks: 785". Changing it to the max and showing all: "Total Database Queries: 1025. Total privacy checks: 982". Firstly the default 20 should have nowhere near that many queries, and even showing all results in about 4 queries per item.

2) indilist.php: defailt letter A indilist.php?alpha=B, 179 names "Total Database Queries: 137. Total privacy checks: 171.", not too bad, but try drilling into one name indilist.php?surname=Bachrach with 43 names: "Total Database Queries: 63. Total privacy checks: 421" almost 2 queries per individual, and 421 privacy checks (this is not DB but obviously is costly in processing time at over 6 checks per person). indilist.php?alpha=B&surname_sublist=no&show_all=no = "indilist.php?alpha=B&surname_sublist=no&show_all=no"

3) sourcelist.php 452 sources "Total Database Queries: 1363. Total privacy checks: 4062". that is 3 queries per source and almost 10 privacy checks per source.

4) placelist.php?action=show&display=hierarchy for a location with 8 families and 35 individuals: "Total Database Queries: 387. Total privacy checks: 309" over 10 queries per row. placelist.php?action=show&level=1&parent[0]=Poland : 506 individuals and 105 families: "Execution time: 29.817 sec. Total Database Queries: 4528. Total privacy checks: 4275."

While caching can help, the examples above show that we would be caching hundreds or thousands of different pages per site. Is there any way to streamline this?

Discussion

  • KosherJava

    KosherJava - 2007-01-22

    Logged In: YES
    user_id=634811
    Originator: YES

    5) famlist.php?alpha=B&surname_sublist=yes 86 families "Total Database Queries: 162. Total privacy checks: 86." famlist.php?surname=Bachrach 26 names "Total Database Queries: 72. Total privacy checks: 421"

     
  • John Finlay

    John Finlay - 2007-01-24

    Logged In: YES
    user_id=300048
    Originator: NO

    One of the things you can do to debug this is turn on the SQL query log. This can be enabled by setting the $SQL_LOG = true in the functions_db.php file. This will cause every DB query that is run to be logged in a file in the index directory.

    --John

     
  • KosherJava

    KosherJava - 2007-01-25

    Logged In: YES
    user_id=634811
    Originator: YES

    I emailed you the logs for the sourcelist and medialist (set to only show 20 media).
    for each source there are 3 queries to get the counts of items lined to each source:
    SELECT count(i_id) FROM pgv_individuals WHERE i_file='1' AND i_gedcom REGEXP 'SOUR @S316@'
    SELECT count(f_id) FROM pgv_families WHERE f_file='1' AND f_gedcom REGEXP 'SOUR @S316@'
    SELECT count(m_id) FROM pgv_media WHERE m_gedfile='1' AND m_gedrec REGEXP 'SOUR @S316@'

    For every media in the DB, even if I elected to only show 20, there is the following query:
    select m_media from pgv_media where m_file LIKE '%media/someMedia.jpg'
    SELECT mm_gid FROM pgv_media_mapping WHERE mm_media='M50' AND mm_gedfile='1'
    48 cases of
    SELECT i_gedcom, i_name, i_isdead, i_file, i_letter, i_surname FROM pgv_individuals WHERE i_id LIKE 'I1591' AND i_file='1'

    and for the 13 ones that link to sources (out of the 20), there is
    SELECT s_gedcom, s_name, s_file FROM pgv_sources WHERE s_id LIKE 'S144' AND s_file='1'

    and a repeat query of
    select m_media from pgv_media where m_file LIKE '%media/myMedia1.jpg'
    for all 20 media to display.

     
  • John Finlay

    John Finlay - 2007-01-25
    • assigned_to: nobody --> yalnifj
     
  • John Finlay

    John Finlay - 2007-01-25

    Logged In: YES
    user_id=300048
    Originator: NO

    I have just committed code that optimizes the medialist. The medialist is a difficult one to really optimize because the privacy of a media item is defined by the privacy of the items it links to. So you can't check the privacy without loading up all of the linked records.

    On my site before the optimization the medialist ran at:
    Execution time: 1.040 sec. Total Database Queries: 389. Total privacy checks: 268. Total Memory Usage: 17497.47 KB.

    After the optimization:
    Execution time: 0.659 sec. Total Database Queries: 17. Total privacy checks: 227. Total Memory Usage: 17210.81 KB.

    Let me know if the medialist is improved for you.

    Now on to the source list.
    --John

     
  • KosherJava

    KosherJava - 2007-01-25

    Logged In: YES
    user_id=634811
    Originator: YES

    Nice improvement for the media list. For the first page (20 media) it is now:
    "Execution time: 1.853 sec. Total Database Queries: 21. Total privacy checks: 736". This is a great improvement since it had been 598. I do not get memory usage output with the stats. This is a 27 fold improvement in the number of queries!!!

     
  • John Finlay

    John Finlay - 2007-01-25

    Logged In: YES
    user_id=300048
    Originator: NO

    I had a hard time with the sourcelist at first because it was running in under 1 second with my data. But then I found that there was a check to see if there were more than 4000 people in the DB. If there were more than 4000 people, then the number of related indis, fams, etc were not being shown making the list run much faster with my data. If I disable that check and always show the count of the related records, then the page takes 9 secs because it has run 3 regular expression searches on the database for each source.

    I see no way to optimize this with our current database schema. For now I recommend disabling the option to show the count of related records for DBs of more than 1000 people.

    For the future (v4.2), we need to change the import procedure to support it. Probably by creating a pgv_sourcelinks table which links all of the IDs that are related to a source. During the import we will look for all source links and add them to the table. This will increase import time, but it will improve performance of the source list and the source pages.

    --John

     
  • KosherJava

    KosherJava - 2007-01-25

    Logged In: YES
    user_id=634811
    Originator: YES

    the latest code that blocks the indi and families linked to the source for gedcoms > 1000 indies dropped the source page calls to 7. This can be managed by a config change as well, but ideally I will look forward to the 4.2 enhancement.

     
  • Greg Roach

    Greg Roach - 2008-11-17
    • milestone: --> v4.0.2
     
  • Greg Roach

    Greg Roach - 2009-11-07

    The XXXXlist functions have been largely rewritten since this bug was opened.

    Is it still an issue?

     
  • Greg Roach

    Greg Roach - 2009-11-07
    • status: open --> pending
     
  • KosherJava

    KosherJava - 2009-11-08

    Here are updated stats for all the scenarios above. In general I would say that this should not be closed. My data changed in nature since my post, so I cant do an apples to apples comparison.

    1) Medialist Total Database Queries: 1519. Total privacy checks: 1050 (in original post it was "Total Database Queries: 598. Total privacy checks: 785". This increased number is no greater if I try to bring back 20 or 200.

    2) Indilist for name BACHRACH Total Database Queries: 186. Total privacy checks: 93. this is VS Total Database Queries: 63. Total privacy checks: 421 in the old, so the number of queries is up 50%, but privacy checks see a great improvement to 1/5th of what it used to be.

    3) Sourcelist - Total Database Queries: 3807. Total privacy checks: 952. This is VS Total Database Queries: 1363. Total privacy checks: 4062. The number of sources I now have doubled since this post from 452 to 940 sources, but the queries tripled while privacy checks are now about 1 per source

    4) Can't replicate the exact scenario for the placelist since it now runs out of memory for the large Poland query, but privacy checks seem somewhat lower

     
  • KosherJava

    KosherJava - 2009-11-08
    • status: pending --> open
     

Log in to post a comment.

Want the latest updates on software, tech news, and AI?
Get latest updates about software, tech news, and AI from SourceForge directly in your inbox once a month.