From: Mantis B. T. <no...@bu...> - 2011-11-24 20:27:59
A NOTE has been added to this issue.
======================================================================
http://bugs.bacula.org/view.php?id=1788
======================================================================
Reported By:                ba...@ip...
Assigned To:                ebollengier
======================================================================
Project:                    bacula
Issue ID:                   1788
Category:                   sql
Reproducibility:            sometimes
Severity:                   minor
Priority:                   normal
Status:                     assigned
======================================================================
Date Submitted:             2011-11-14 13:31 UTC
Last Modified:              2011-11-24 20:27 UTC
======================================================================
Summary:                    5.2.1 - Wrong storage id.
Description:
Hi,

Background:
We are using Bacula 5.2.1 with MySQL. We are using 12 different catalogs, 12 pools and 23 storage nodes.

Issue:
Cannot restore files, but backups work.

What I have found:
In the MySQL database of one random catalog, the StorageId in the Media table points to the wrong storage. I can fix this with a command such as:

update Media set StorageId = '41'

but I want to know why this happens.

Some info:

mysql> select * from Media limit 1\G
*************************** 1. row ***************************
MediaId: 1
VolumeName: Pool-fs15-1
Slot: 1
PoolId: 23
MediaType: File-15
MediaTypeId: 19
LabelType: 0
FirstWritten: 2011-11-12 03:31:20
LastWritten: 2011-11-12 03:32:19
LabelDate: 2011-11-11 23:05:02
VolJobs: 1
VolFiles: 1
VolBlocks: 77504
VolMounts: 6
VolBytes: 4999938114
VolParts: 0
VolErrors: 0
VolWrites: 153127
VolCapacityBytes: 0
VolStatus: Full
Enabled: 1
Recycle: 1
ActionOnPurge: 1
VolRetention: 604800
VolUseDuration: 345600
MaxVolJobs: 0
MaxVolFiles: 0
MaxVolBytes: 5000000000
InChanger: 1
StorageId: 41
DeviceId: 0
MediaAddressing: 0
VolReadTime: 0
VolWriteTime: 10579697
EndFile: 1
EndBlock: 704970817
LocationId: 0
RecycleCount: 1
InitialWrite: 0000-00-00 00:00:00
ScratchPoolId: 0
RecyclePoolId: 0
Comment: NULL
1 row in set (0.00 sec)

mysql> select * from Storage;
+-----------+------------------+-------------+
| StorageId | Name             | AutoChanger |
+-----------+------------------+-------------+
|         1 | File-fs02        |           1 |
|         3 | File-fs02-bak    |           1 |
|         5 | File-fs03        |           1 |
|         7 | File-fs03-bak    |           1 |
|         9 | File-fs04        |           1 |
|        11 | File-fs04-bak    |           1 |
|        13 | File-fs05        |           1 |
|        15 | File-fs05-bak    |           1 |
|        17 | File-fs06-01     |           1 |
|        19 | File-fs06-01-bak |           1 |
|        21 | File-fs06-02     |           1 |
|        23 | File-fs06-02-bak |           1 |
|        25 | File-fs07        |           1 |
|        27 | File-fs07-bak    |           1 |
|        29 | File-fs08        |           1 |
|        31 | File-fs08-bak    |           1 |
|        33 | File-fs10        |           1 |
|        35 | File-fs10-bak    |           1 |
|        37 | File-fs12        |           1 |
|        39 | File-fs12-bak    |           1 |
|        41 | File-fs15-bak    |           1 |
|        43 | File-fs16        |           1 |
|        45 | File-fs16-bak    |           1 |
+-----------+------------------+-------------+
23 rows in set (0.00 sec)

Thank you,
/Robin
======================================================================

----------------------------------------------------------------------
(0006045) ebollengier (administrator) - 2011-11-14 19:16
http://bugs.bacula.org/view.php?id=1788#c6045
----------------------------------------------------------------------
It looks to be a configuration issue, I don't see anything wrong
in your output. Please explain again what is wrong with StorageId 41, which points to File-fs15-bak.
(The StorageId field is a Postgres/MySQL autoincrement field; we don't manage it directly.)

----------------------------------------------------------------------
(0006047) ba...@ip... (reporter) - 2011-11-15 06:42
http://bugs.bacula.org/view.php?id=1788#c6047
----------------------------------------------------------------------
Yes, this one is OK; I fixed it. It was pointing to File-fs12 before.
The storages are in all catalogs but with different ids. Should I change them so that all catalogs have the same storage structure?

Here is one with the error:

mysql> select * from Media limit 1\G
*************************** 1. row ***************************
MediaId: 1
VolumeName: Pool-fs07-1
Slot: 1
PoolId: 1
MediaType: File-07
MediaTypeId: 9
LabelType: 0
FirstWritten: 2011-11-08 03:16:46
LastWritten: 2011-11-08 03:18:23
LabelDate: 2011-11-07 23:05:23
VolJobs: 1
VolFiles: 1
VolBlocks: 77505
VolMounts: 33
VolBytes: 4999953334
VolParts: 0
VolErrors: 0
VolWrites: 2236014
VolCapacityBytes: 0
VolStatus: Full
Enabled: 1
Recycle: 1
ActionOnPurge: 1
VolRetention: 604800
VolUseDuration: 345600
MaxVolJobs: 0
MaxVolFiles: 0
MaxVolBytes: 5000000000
InChanger: 1
StorageId: 27
DeviceId: 0
MediaAddressing: 0
VolReadTime: 0
VolWriteTime: 211308071
EndFile: 1
EndBlock: 704792501
LocationId: 0
RecycleCount: 28
InitialWrite: 0000-00-00 00:00:00
ScratchPoolId: 0
RecyclePoolId: 0
Comment: NULL
1 row in set (0.00 sec)

mysql> select * from Storage;
+-----------+------------------+-------------+
| StorageId | Name             | AutoChanger |
+-----------+------------------+-------------+
|         1 | File-fs02        |           1 |
|         3 | File-fs03        |           1 |
|         5 | File-fs04        |           1 |
|         7 | File-fs05        |           1 |
|         9 | File-fs06        |           1 |
|        11 | File-fs06-02     |           1 |
|        13 | File-fs07        |           1 |
|        15 | File-fs08        |           1 |
|        17 | File-fs10        |           1 |
|        19 | File-fs12        |           1 |
|        21 | File-fs13        |           1 |
|        23 | File-fs06-01     |           1 |
|        25 | st-10            |           1 |
|        27 | st-12            |           1 |
|        29 | Storage-fs10     |           1 |
|        31 | Storage-fs12     |           1 |
|        33 | File-fs05-bak    |           1 |
|        35 | File-fs02-bak    |           1 |
|        37 | File-fs03-bak    |           1 |
|        39 | File-fs04-bak    |           1 |
|        41 | File-fs06-01-bak |           1 |
|        43 | File-fs06-02-bak |           1 |
|        45 | File-fs07-bak    |           1 |
|        47 | File-fs08-bak    |           1 |
|        49 | File-fs10-bak    |           1 |
|        51 | File-fs12-bak    |           1 |
|        53 | File-fs15-bak    |           1 |
|        55 | File-fs16        |           1 |
|        57 | File-fs16-bak    |           1 |
+-----------+------------------+-------------+

StorageId should be 13. Also, StorageId 11, 25, 27, 29 and 31 are removed.

What else do you need?

Thanks
/Robin

----------------------------------------------------------------------
(0006048) ebollengier (administrator) - 2011-11-15 07:09
http://bugs.bacula.org/view.php?id=1788#c6048
----------------------------------------------------------------------
I still don't understand the problem, and especially what kind of trouble it creates in your production.

Bacula might have a problem when using multiple Catalogs; this is not the most common setup.

Once we understand the problem, we will need the exact procedure to reproduce it.

----------------------------------------------------------------------
(0006049) ba...@ip... (reporter) - 2011-11-15 07:28
http://bugs.bacula.org/view.php?id=1788#c6049
----------------------------------------------------------------------
Backups are all running fine, but I get trouble when I try to restore. It says:

15-Nov 08:20 birch-dir JobId 12229: Start Restore Job Restore-Files.2011-11-15_08.20.29_38
15-Nov 08:20 birch-dir JobId 12229: Using Device "FileStorage02"
15-Nov 08:20 birch-sd-fs07 JobId 12229: Fatal error: No Volume names found for restore.
15-Nov 08:18 vz-0067.s.ipeer.se-fd JobId 12229: Fatal error: /home/kern/bacula/k/bacula/src/filed/job.c:2031 Bad response to Read Data command. Wanted 3000 OK data , got 3000 error
15-Nov 08:20 birch-dir JobId 12229: Using Device "FileStorage02"
15-Nov 08:18 vz-0067.s.ipeer.se-fd JobId 12229: Fatal error: Failed to authenticate Storage daemon.
15-Nov 08:20 birch-dir JobId 12229: Fatal error: Bad response to Storage command: wanted 2000 OK storage , got 2902 Bad storage

And if I run this:

mysql> update Media set StorageId = '13';
Query OK, 2700 rows affected (0.06 sec)
Rows matched: 2700 Changed: 2700 Warnings: 0

*mess
15-Nov 08:22 birch-dir JobId 12231: Bacula birch-dir 5.2.1 (30Oct11):
  Build OS: x86_64-unknown-linux-gnu ubuntu 10.04
  JobId: 12231
  Job: Restore-Files.2011-11-15_08.22.13_40
  Restore Client: vz-0067.s.ipeer.se-fd
  Start time: 15-Nov-2011 08:22:15
  End time: 15-Nov-2011 08:22:22
  Files Expected: 46
  Files Restored: 46
  Bytes Restored: 85,726
  Rate: 12.2 KB/s
  FD Errors: 0
  FD termination status: OK
  SD termination status: OK
  Termination: Restore OK

So the problem is to find out why some of the catalogs have the wrong StorageId attached to each media. Or is the only way to check and fix this the MySQL query above?

I understand that our setup is not the most common :) We are using vtapes and have around 750 clients. Not everyone who works in bconsole and runs restores knows how to handle the MySQL databases, which is why I want to find out what is causing this.

What exactly do the "update volume" and "update pool" commands do?

----------------------------------------------------------------------
(0006058) kbiernat (reporter) - 2011-11-24 05:23
http://bugs.bacula.org/view.php?id=1788#c6058
----------------------------------------------------------------------
I ran across the same problem (on PostgreSQL), so let me try to explain.
Below are the results of the same query:

select storageid, name from storage order by storageid;

on 2 different catalogs:

bacula5_lan=# select storageid, name from storage order by storageid;
 storageid |          name
-----------+-------------------------
         1 | backup1-internal-sd
         2 | srv-areca-sd
         3 | backup1-xen-sd
         4 | backup1-external-sd
         5 | backup1-mrsaxo-sd
         6 | backup1-mrregiopraca-sd
         7 | backup1-mr-sd
         8 | backup1-iklu-sd
         9 | backup1-mrtest-sd
        10 | backup1-polityka-sd
        11 | backup1-phx-sd
        12 | backup1-sensi-sd
        13 | srv-areca-sensi-sd
        14 | srv-areca-lan2-sd
        15 | backup1-mr2-sd
(15 rows)

bacula5_mr=# select storageid, name from storage order by storageid;
 storageid |          name
-----------+-------------------------
         1 | backup1-internal-sd
         2 | backup1-xen-sd
         3 | backup1-external-sd
         4 | backup1-mrsaxo-sd
         5 | backup1-mrregiopraca-sd
         6 | backup1-mr-sd
         7 | backup1-iklu-sd
         8 | backup1-mrtest-sd
         9 | backup1-polityka-sd
        10 | backup1-phx-sd
        11 | backup1-sensi-sd
        12 | srv-areca-sd
        13 | srv-areca-sensi-sd
        14 | srv-areca-lan2-sd
        15 | backup1-mr2-sd
(15 rows)

As you can see, in the bacula5_lan catalog, srv-areca-sd has id = 2 and sits between backup1-internal-sd and backup1-xen-sd.

Here is a query on the bacula5_mr catalog, looking for media that match dbrw%, which matches a client that is configured to use the bacula5_mr catalog:

bacula5_mr=# select volumename, storageid from media where volumename like 'dbrw%' order by mediaid desc limit 1;
               volumename                | storageid
-----------------------------------------+-----------
 dbrw-dbbackup.mrsaxo.internal-Full-4074 |         5
(1 row)

As you can see, the storageid = 5. It should obviously equal 4, since (as you can also see) this client is configured in the mrsaxo zone, and in the bacula5_mr catalog the storage for that zone is defined as backup1-mrsaxo-sd with id = 4.

For some reason, though, Bacula uses an ID from a different catalog than it should for this client.

I have 6 catalogs configured; all of them (except the bacula5_lan catalog) have the same storage order as bacula5_mr.
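
The comparison above can also be done mechanically. The following is a minimal Python sketch that uses only the two storage listings quoted in this note; the function and variable names are hypothetical, and in a real setup the dictionaries would presumably be filled from each catalog database rather than typed in. It lists the storages whose ids differ between the two catalogs and shows in which catalog's numbering id 5 resolves to backup1-mrsaxo-sd.

```python
# Storage tables copied from the two query results above, keyed by storageid.
# The dictionary and function names are illustrative, not part of Bacula.
BACULA5_LAN = {
    1: "backup1-internal-sd", 2: "srv-areca-sd", 3: "backup1-xen-sd",
    4: "backup1-external-sd", 5: "backup1-mrsaxo-sd",
    6: "backup1-mrregiopraca-sd", 7: "backup1-mr-sd", 8: "backup1-iklu-sd",
    9: "backup1-mrtest-sd", 10: "backup1-polityka-sd", 11: "backup1-phx-sd",
    12: "backup1-sensi-sd", 13: "srv-areca-sensi-sd",
    14: "srv-areca-lan2-sd", 15: "backup1-mr2-sd",
}
BACULA5_MR = {
    1: "backup1-internal-sd", 2: "backup1-xen-sd", 3: "backup1-external-sd",
    4: "backup1-mrsaxo-sd", 5: "backup1-mrregiopraca-sd", 6: "backup1-mr-sd",
    7: "backup1-iklu-sd", 8: "backup1-mrtest-sd", 9: "backup1-polityka-sd",
    10: "backup1-phx-sd", 11: "backup1-sensi-sd", 12: "srv-areca-sd",
    13: "srv-areca-sensi-sd", 14: "srv-areca-lan2-sd", 15: "backup1-mr2-sd",
}


def mismatched_names(cat_a, cat_b):
    """Return {name: (id_in_a, id_in_b)} for storages whose ids differ."""
    ids_a = {name: sid for sid, name in cat_a.items()}
    ids_b = {name: sid for sid, name in cat_b.items()}
    return {
        name: (ids_a[name], ids_b[name])
        for name in sorted(ids_a.keys() & ids_b.keys())
        if ids_a[name] != ids_b[name]
    }


def catalogs_resolving_to(storage_id, expected_name, catalogs):
    """Return the labels of catalogs where storage_id maps to expected_name."""
    return [
        label for label, table in catalogs.items()
        if table.get(storage_id) == expected_name
    ]


if __name__ == "__main__":
    for name, (lan_id, mr_id) in mismatched_names(BACULA5_LAN, BACULA5_MR).items():
        print(f"{name}: bacula5_lan={lan_id} bacula5_mr={mr_id}")
    # The media row in bacula5_mr carries storageid 5; see which catalog's
    # numbering makes that id point at backup1-mrsaxo-sd.
    print(catalogs_resolving_to(
        5, "backup1-mrsaxo-sd",
        {"bacula5_lan": BACULA5_LAN, "bacula5_mr": BACULA5_MR},
    ))
```

With the listings above, id 5 maps to backup1-mrsaxo-sd only in the bacula5_lan numbering, which is consistent with the id having been taken from the wrong catalog.
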

I'm thinking about trying these approaches:
a) update the lan catalog to match the other catalogs
b) update the other catalogs so they match the lan catalog
c) write a script to parse the config and update the catalogs as needed (but it'll just fix the symptom, not the cause, so not too smart)

----------------------------------------------------------------------
(0006059) marcovw (developer) - 2011-11-24 10:04
http://bugs.bacula.org/view.php?id=1788#c6059
----------------------------------------------------------------------
OK, regarding the last report: did you create all the catalogs at the same time, or did you create the bacula5_mr catalog later on?

Looking at the code, the multiple catalog stuff tries to make sure the other catalogs (other than the master catalog) have the minimum set of data available to make sure that more than one catalog can be used. For this it populates the other catalogs with fake clients etc. It does this by scanning the config again. If you don't create all catalogs at once, you may end up with different ids for certain clients or storages etc., as these get auto-assigned serials by the database.

Fixing this would be quite some work, as we should probably make one catalog the master catalog (the first one?) and then populate the other catalogs with the data from the master database. The whole concept of multiple catalogs is far from being tested properly, so I'm wondering if we should keep it in its current form.

----------------------------------------------------------------------
(0006063) kbiernat (reporter) - 2011-11-24 20:27
http://bugs.bacula.org/view.php?id=1788#c6063
----------------------------------------------------------------------
I don't remember exactly, because we migrated all clients from bacula5_*lan* from another director. But as far as I remember, we created a whole new catalog then, which means it was created as the last one.

The question is: where does Bacula take the 'storageid' field value from when inserting into media for a given job?
Obviously not from the storage table in the catalog to which the client for this job is bound.

Issue History
Date Modified    Username       Field                    Change
======================================================================
2011-11-14 13:31 ba...@ip...    New Issue
2011-11-14 19:16 ebollengier    Note Added: 0006045
2011-11-14 19:16 ebollengier    Assigned To              => ebollengier
2011-11-14 19:16 ebollengier    Status                   new => feedback
2011-11-14 19:17 ebollengier    Priority                 high => normal
2011-11-14 19:17 ebollengier    Severity                 major => minor
2011-11-15 06:42 ba...@ip...    Note Added: 0006047
2011-11-15 06:42 ba...@ip...    Status                   feedback => assigned
2011-11-15 07:09 ebollengier    Note Added: 0006048
2011-11-15 07:28 ba...@ip...    Note Added: 0006049
2011-11-24 05:23 kbiernat       Note Added: 0006058
2011-11-24 10:03 marcovw        Note Added: 0006059
2011-11-24 10:04 marcovw        Note Edited: 0006059
2011-11-24 20:27 kbiernat       Note Added: 0006063
======================================================================