From: <bac...@li...> - 2004-09-20 11:42:48
A BUGNOTE has been added to this bug.
======================================================================
http://bugs.bacula.org/bug_view_advanced_page.php?bug_id=0000106
======================================================================
Reported By:        martin
Assigned To:        Dan Langille
======================================================================
Project:            bacula
Bug ID:             106
Category:           Director
Reproducibility:    always
Severity:           major
Priority:           normal
Status:             assigned
======================================================================
Date Submitted:     09-14-2004 05:16 PDT
Last Modified:      09-20-2004 04:49 PDT
======================================================================
Summary:            VolumeToCatalog verify jobs are unusably slow

Description:
Each file is taking 35 seconds! The problem is in db_get_file_record(),
which accesses the file table like this:

  SELECT FileId, LStat, MD5 FROM File
  WHERE File.JobId=260
    AND File.PathId=25180
    AND File.FilenameId=243613

but there is no index for jobid, pathid and filenameid.
======================================================================

----------------------------------------------------------------------
 martin - 09-15-2004 05:07 PDT
----------------------------------------------------------------------
DiskToCatalog verify jobs are also impossibly slow, e.g. doing

  SELECT FileId, LStat, MD5 FROM File,Job
  WHERE File.JobId=Job.JobId
    AND File.PathId=2027
    AND File.FilenameId=7839
    AND Job.Type='B'
    AND Job.JobSTATUS='T'
    AND ClientId=1
  ORDER BY StartTime DESC LIMIT 1

at 30 seconds per file.

----------------------------------------------------------------------
 Dan Langille - 09-15-2004 05:57 PDT
----------------------------------------------------------------------
Good! We have a query and a time.
Please paste the output of the following:

  explain analyse
  SELECT FileId, LStat, MD5
  FROM File,Job
  WHERE File.JobId=Job.JobId
  AND File.PathId=2027
  AND File.FilenameId=7839
  AND Job.Type='B'
  AND Job.JobSTATUS='T'
  AND ClientId=1
  ORDER BY StartTime DESC LIMIT 1

Then run "vacuum analyse" and then do the above again, and give us the
results.

----------------------------------------------------------------------
 martin - 09-15-2004 06:28 PDT
----------------------------------------------------------------------
Thanks for the suggestions. I ran this as the pgsql user and the vacuum
took several minutes, but it doesn't seem to have changed much for the
query.

bacula=# explain analyse
bacula-# SELECT FileId, LStat, MD5
bacula-# FROM File,Job
bacula-# WHERE File.JobId=Job.JobId
bacula-# AND File.PathId=2027
bacula-# AND File.FilenameId=7839
bacula-# AND Job.Type='B'
bacula-# AND Job.JobSTATUS='T'
bacula-# AND ClientId=1
bacula-# ORDER BY StartTime DESC LIMIT 1;
                              QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=18786.64..18786.64 rows=1 width=105) (actual time=34102.67..34102.68 rows=1 loops=1)
   ->  Sort  (cost=18786.64..18786.64 rows=1 width=105) (actual time=34102.65..34102.66 rows=2 loops=1)
         Sort Key: job.starttime
         ->  Nested Loop  (cost=0.00..18786.63 rows=1 width=105) (actual time=20545.13..34102.47 rows=3 loops=1)
               ->  Seq Scan on file  (cost=0.00..18783.60 rows=1 width=93) (actual time=20519.81..34051.89 rows=3 loops=1)
                     Filter: ((pathid = 2027) AND (filenameid = 7839))
               ->  Index Scan using job_pkey on job  (cost=0.00..3.02 rows=1 width=12) (actual time=16.81..16.82 rows=1 loops=3)
                     Index Cond: ("outer".jobid = job.jobid)
                     Filter: (("type" = 'B'::bpchar) AND (jobstatus = 'T'::bpchar) AND (clientid = 1))
 Total runtime: 34102.88 msec
(10 rows)

bacula=# vacuum analyse;
VACUUM
bacula=# explain analyse
bacula-# SELECT FileId, LStat, MD5
bacula-# FROM File,Job
bacula-# WHERE File.JobId=Job.JobId
bacula-# AND File.PathId=2027
bacula-# AND File.FilenameId=7839
bacula-# AND Job.Type='B'
bacula-# AND Job.JobSTATUS='T'
bacula-# AND ClientId=1
bacula-# ORDER BY StartTime DESC LIMIT 1;
                              QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=274212.10..274212.11 rows=1 width=109) (actual time=34092.11..34092.12 rows=1 loops=1)
   ->  Sort  (cost=274212.10..274212.11 rows=1 width=109) (actual time=34092.10..34092.11 rows=2 loops=1)
         Sort Key: job.starttime
         ->  Nested Loop  (cost=0.00..274212.09 rows=1 width=109) (actual time=20760.43..34092.05 rows=3 loops=1)
               ->  Seq Scan on file  (cost=0.00..274207.51 rows=1 width=97) (actual time=20742.78..34053.71 rows=3 loops=1)
                     Filter: ((pathid = 2027) AND (filenameid = 7839))
               ->  Index Scan using job_pkey on job  (cost=0.00..4.57 rows=1 width=12) (actual time=12.73..12.74 rows=1 loops=3)
                     Index Cond: ("outer".jobid = job.jobid)
                     Filter: (("type" = 'B'::bpchar) AND (jobstatus = 'T'::bpchar) AND (clientid = 1))
 Total runtime: 34092.21 msec
(10 rows)

bacula=# \dt
             List of relations
 Schema |     Name     | Type  | Owner
--------+--------------+-------+-------
 public | basefiles    | table | pgsql
 public | client       | table | pgsql
 public | counters     | table | pgsql
 public | file         | table | pgsql
 public | filename     | table | pgsql
 public | fileset      | table | pgsql
 public | job          | table | pgsql
 public | jobmedia     | table | pgsql
 public | media        | table | pgsql
 public | path         | table | pgsql
 public | pool         | table | pgsql
 public | unsavedfiles | table | pgsql
 public | version      | table | pgsql
(13 rows)

bacula=# \di
                        List of relations
 Schema |             Name              | Type  | Owner |    Table
--------+-------------------------------+-------+-------+--------------
 public | basefiles_pkey                | index | pgsql | basefiles
 public | client_name_idx               | index | pgsql | client
 public | client_pkey                   | index | pgsql | client
 public | counters_pkey                 | index | pgsql | counters
 public | file_pkey                     | index | pgsql | file
 public | filename_name_idx             | index | pgsql | filename
 public | filename_pkey                 | index | pgsql | filename
 public | fileset_name_idx              | index | pgsql | fileset
 public | fileset_pkey                  | index | pgsql | fileset
 public | job_media_job_id_media_id_idx | index | pgsql | jobmedia
 public | job_name_idx                  | index | pgsql | job
 public | job_pkey                      | index | pgsql | job
 public | jobmedia_pkey                 | index | pgsql | jobmedia
 public | media_pkey                    | index | pgsql | media
 public | media_volumename_id           | index | pgsql | media
 public | path_name_idx                 | index | pgsql | path
 public | path_pkey                     | index | pgsql | path
 public | pool_name_idx                 | index | pgsql | pool
 public | pool_pkey                     | index | pgsql | pool
 public | unsavedfiles_pkey             | index | pgsql | unsavedfiles
(20 rows)

bacula=#

----------------------------------------------------------------------
 Dan Langille - 09-15-2004 10:57 PDT
----------------------------------------------------------------------
I have good news. This takes a while on my box too:

bacula=# explain analyse
bacula-# SELECT FileId, LStat, MD5
bacula-# FROM File,Job
bacula-# WHERE File.JobId=Job.JobId
bacula-# AND File.PathId=2027
bacula-# AND File.FilenameId=7839
bacula-# AND Job.Type='B'
bacula-# AND Job.JobSTATUS='T'
bacula-# AND ClientId=1
bacula-# ORDER BY StartTime DESC LIMIT 1
bacula-# ;
                              QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=134360.90..134360.90 rows=1 width=96) (actual time=18971.874..18971.874 rows=0 loops=1)
   ->  Sort  (cost=134360.90..134360.90 rows=1 width=96) (actual time=18971.866..18971.866 rows=0 loops=1)
         Sort Key: job.starttime
         ->  Nested Loop  (cost=0.00..134360.89 rows=1 width=96) (actual time=18971.852..18971.852 rows=0 loops=1)
               ->  Seq Scan on file  (cost=0.00..134355.54 rows=1 width=92) (actual time=18971.841..18971.841 rows=0 loops=1)
                     Filter: ((pathid = 2027) AND (filenameid = 7839))
               ->  Index Scan using job_pkey on job  (cost=0.00..5.34 rows=1 width=12) (never executed)
                     Index Cond: ("outer".jobid = job.jobid)
                     Filter: (("type" = 'B'::bpchar) AND (jobstatus = 'T'::bpchar) AND (clientid = 1))
 Total runtime: 18971.966 ms
(10 rows)

bacula=#

Now, let's create one index:

bacula=# create index file_pathid on file(pathid);
CREATE INDEX

And run that query again!

bacula=# explain analyse
bacula-# SELECT FileId, LStat, MD5
bacula-# FROM File,Job
bacula-# WHERE File.JobId=Job.JobId
bacula-# AND File.PathId=2027
bacula-# AND File.FilenameId=7839
bacula-# AND Job.Type='B'
bacula-# AND Job.JobSTATUS='T'
bacula-# AND ClientId=1
bacula-# ORDER BY StartTime DESC LIMIT 1;
                              QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=1083.85..1083.85 rows=1 width=96) (actual time=385.884..385.884 rows=0 loops=1)
   ->  Sort  (cost=1083.85..1083.85 rows=1 width=96) (actual time=385.875..385.875 rows=0 loops=1)
         Sort Key: job.starttime
         ->  Nested Loop  (cost=0.00..1083.84 rows=1 width=96) (actual time=385.860..385.860 rows=0 loops=1)
               ->  Index Scan using file_pathid on file  (cost=0.00..1078.49 rows=1 width=92) (actual time=385.850..385.850 rows=0 loops=1)
                     Index Cond: (pathid = 2027)
                     Filter: (filenameid = 7839)
               ->  Index Scan using job_pkey on job  (cost=0.00..5.34 rows=1 width=12) (never executed)
                     Index Cond: ("outer".jobid = job.jobid)
                     Filter: (("type" = 'B'::bpchar) AND (jobstatus = 'T'::bpchar) AND (clientid = 1))
 Total runtime: 385.975 ms
(11 rows)

bacula=#

Much, much better. Less than 0.4s. Try that and see if it helps.

edited on: 09-15-04 10:57

----------------------------------------------------------------------
 martin - 09-16-2004 05:23 PDT
----------------------------------------------------------------------
Thanks, that fixes it (for both VolumeToCatalog and DiskToCatalog)!
Hopefully the PathId index will be sufficient until someone creates a
million files in one directory :-)

Was there some performance problem with having this index in the past?
I see that the mysql config has several indexes for File, including
this one and others that probably speed up 'list files jobid=272' and
the 'List where a File is saved regardless of the directory:' query,
which also take over 30 seconds for me. These are not too important,
though, because they are one-off operations.

I will check the time taken for some medium-sized full backups (around
200,000 files per fileset) over the weekend.

----------------------------------------------------------------------
 Dan Langille - 09-16-2004 05:30 PDT
----------------------------------------------------------------------
The original PostgreSQL DDL contained foreign keys. That, by
implication, created indexes (IIRC). The removed FKs were not replaced
with indexes. I think we'll just add indexes as they are found to be
justified.

Please monitor your backups and restores for a while to see if there
are any side effects. I think you are already running "vacuum analyse"
on a regular basis. But if not, you should be.

----------------------------------------------------------------------
 martin - 09-16-2004 08:08 PDT
----------------------------------------------------------------------
I will report back on the timings. Thanks again.

----------------------------------------------------------------------
 martin - 09-20-2004 04:49 PDT
----------------------------------------------------------------------
I checked the times for the weekend backups with the file(pathid) index
and they are almost identical to last weekend's backups without it.
Adding this looks like a worthwhile change.
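
The fix discussed in this thread reduces to one DDL statement. As a further sketch, a composite index covering all three columns that db_get_file_record() filters on would serve the same lookups even when many files share a single path; the composite form and its name (file_jpfid_idx) are a suggestion, not part of the schema shown above.

```sql
-- The index Dan created above, which resolved this bug:
CREATE INDEX file_pathid ON file (pathid);

-- Hypothetical composite alternative: matches the full predicate
-- (jobid, pathid, filenameid) used by db_get_file_record(), so the
-- planner can narrow the lookup to exact matches via the index alone.
CREATE INDEX file_jpfid_idx ON file (jobid, pathid, filenameid);
```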
Bug History

Date Modified   Username      Field         Change
======================================================================
09-14-04 05:16  martin        New Bug
09-15-04 05:07  martin        Bugnote Added:  0000234
09-15-04 05:57  Dan Langille  Bugnote Added:  0000237
09-15-04 06:28  martin        Bugnote Added:  0000238
09-15-04 10:56  Dan Langille  Bugnote Added:  0000239
09-15-04 10:57  Dan Langille  Bugnote Edited: 0000239
09-15-04 17:04  Dan Langille  Assigned To     => Dan Langille
09-15-04 17:04  Dan Langille  Status          new => assigned
09-16-04 05:23  martin        Bugnote Added:  0000240
09-16-04 05:30  Dan Langille  Bugnote Added:  0000241
09-16-04 08:08  martin        Bugnote Added:  0000242
09-20-04 04:49  martin        Bugnote Added:  0000249
======================================================================
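
Dan's advice in the thread to run "vacuum analyse" on a regular basis can be reduced to a single statement, scheduled e.g. from cron as the pgsql user against the bacula database (the scheduling details are an assumption, not part of this report).

```sql
-- Refresh planner statistics for the whole catalog so PostgreSQL can
-- cost the new index correctly; with stale statistics the planner may
-- keep choosing sequential scans over the File table.
VACUUM ANALYZE;
```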