From: Hans Schou <hans@mo...> - 2014-02-11 13:04:17
When going into restore:
2: List Jobs where a given File is saved
it takes quite a lot of time to get the list.
To speed it up I have added an index:
CREATE INDEX FilenameId ON File(FilenameId);
Am I doing something wrong, or could this be added to future releases
(and if so, what would the index be called)?
It will add a little more time when adding/deleting files as the index
has to be maintained but I think it is worth it.
If you have a lot of files and want to test it, run:
time mysql bacula -e "SELECT * FROM File WHERE File.FilenameId=7;" # replace the number
then add the index and run again.
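For readers without a large catalog at hand, the same before/after comparison can be sketched on a toy database. This is only an illustration under stated assumptions: it uses Python's built-in sqlite3 module instead of MySQL, a hypothetical, trimmed-down File table, and made-up rows. Rather than timing the query, it prints the query plan, which shows whether the lookup scans the whole table or goes through the index.

```python
import sqlite3

# Assumption: SQLite in memory stands in for the MySQL catalog, and this
# trimmed-down File table is hypothetical (the real table has more columns).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE File ("
    "FileId INTEGER PRIMARY KEY, JobId INTEGER, PathId INTEGER, FilenameId INTEGER)"
)
# Made-up rows: 20000 files spread over 1000 distinct FilenameId values.
conn.executemany(
    "INSERT INTO File (JobId, PathId, FilenameId) VALUES (?, ?, ?)",
    [(i % 50, i % 200, i % 1000) for i in range(20000)],
)

QUERY = "SELECT * FROM File WHERE File.FilenameId = 7"


def plan(connection):
    """Return the detail column of EXPLAIN QUERY PLAN for QUERY."""
    rows = connection.execute("EXPLAIN QUERY PLAN " + QUERY).fetchall()
    return " | ".join(row[3] for row in rows)


plan_before = plan(conn)  # no index yet: full table scan
conn.execute("CREATE INDEX FilenameId ON File(FilenameId)")
plan_after = plan(conn)   # the lookup can now go through the index
match_count = len(conn.execute(QUERY).fetchall())

print("before:", plan_before)
print("after: ", plan_after)
print("matching rows:", match_count)
```

On MySQL, prefixing the SELECT with EXPLAIN gives comparable information about whether an index is used.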
The select which actually is run:
Job.JobId as JobId,
CONCAT(Path.Path,Filename.Name) as Name,
Type as JobType,
Client, Job, File, Filename, Path
AND File.FileIndex > 0
From: Eric Bollengier <eric.bollengier@ba...> - 2014-02-14 09:26:10
On 02/11/2014 01:32 PM, Hans Schou wrote:
> When going into restore:
> 2: List Jobs where a given File is saved
> it takes quite a lot of time to get the list.
> To speed it up I have added an index:
> CREATE INDEX FilenameId ON File(FilenameId);
> Am I doing something wrong or could this be added to future releases
> (and then what will be the name of the index?).
Your query is not part of Bacula itself. Since the catalog schema is
open and easy to query, it's great that you can use it this way.
> It will add a little more time when adding/deleting files as the index
> has to be maintained but I think it is worth it.
> If you have a lot of files and want to test it, run:
> time mysql bacula -e "SELECT * FROM File WHERE File.FilenameId=7;" # replace the number
> then add the index and run again.
This query looks like a
find / -name 'xxxx'
over potentially hundreds of thousands of jobs, thousands of systems, and
sometimes billions of files. In real life, I don't think it's
something you would do very often ;-)
With the current database structure, Bacula is efficient if you give the
Job or the Client (to limit the number of jobs), and the Path. This is a
scalable way to query a big catalog.
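As a rough sketch of what such a scoped lookup looks like, the example below restricts the search to one job and one path before matching the filename. Everything in it is an assumption made for illustration: SQLite (via Python's sqlite3 module) stands in for the real catalog, the File table is trimmed down, the composite index name file_job_path_idx is hypothetical, and the rows are made up.

```python
import sqlite3

# Assumption: SQLite stands in for the catalog; table and index are a
# hypothetical, simplified version of the real schema.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE File ("
    "FileId INTEGER PRIMARY KEY, FileIndex INTEGER, "
    "JobId INTEGER, PathId INTEGER, FilenameId INTEGER)"
)
# Composite index led by JobId, so a lookup that names the job is narrowed first.
conn.execute("CREATE INDEX file_job_path_idx ON File (JobId, PathId, FilenameId)")
conn.executemany(
    "INSERT INTO File (FileIndex, JobId, PathId, FilenameId) VALUES (?, ?, ?, ?)",
    [(i + 1, i % 100, i % 40, i % 1000) for i in range(20000)],
)

# Scoped lookup: job, then path, then filename.
SCOPED = "SELECT FileId FROM File WHERE JobId = ? AND PathId = ? AND FilenameId = ?"
params = (3, 3, 3)

scoped_plan = " | ".join(
    row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + SCOPED, params)
)
scoped_rows = conn.execute(SCOPED, params).fetchall()

print(scoped_plan)
print("matching rows:", len(scoped_rows))
```

Because the job and path are given, the lookup only touches the slice of the index belonging to that job, however many other jobs the table holds.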
The index you propose is OK for your query, but as Bacula doesn't search
the catalog using this approach, I don't think we will add it as a
default index. Each new index, especially on the largest table of Bacula
(which can have 3 or 4 billion records), has a huge impact on backup
speed, maintenance tasks and catalog size. So we prefer not to add
new "default" indexes unless the Bacula core needs them.
Now, if you want to search for files with no selection criteria, your index
is welcome and Bacula will have no problems with it. This is a specific
problem that can be solved very easily with your proposal.
I would even advise creating your index with a case-independent key
(building the index in lowercase, for example). That way, on systems such
as Windows, where filenames do not always use the same case, your
query would return all matching records directly from your index.
search_file('resolv.conf') => Resolv.conf, RESOLV.CONF, resolv.conf
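One possible reading of this suggestion is an index built on the lowercased name, queried with the same lowercased expression. The sketch below is an assumption-laden illustration, not Bacula code: it uses SQLite through Python's sqlite3 module, a hypothetical minimal Filename table, a hypothetical index name filename_lower_idx, and a search_file helper modeled on the one-line example above. Whether and how an expression index can be declared depends on the database engine and version in use.

```python
import sqlite3

# Assumption: SQLite stands in for the catalog; this Filename table is a
# hypothetical minimal version with a few made-up names.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Filename (FilenameId INTEGER PRIMARY KEY, Name TEXT NOT NULL)"
)
conn.executemany(
    "INSERT INTO Filename (Name) VALUES (?)",
    [("Resolv.conf",), ("RESOLV.CONF",), ("resolv.conf",), ("hosts",)],
)

# Expression index on the lowercased name (hypothetical index name).
conn.execute("CREATE INDEX filename_lower_idx ON Filename (lower(Name))")


def search_file(connection, name):
    """Return every stored spelling of `name`, ignoring ASCII case."""
    rows = connection.execute(
        "SELECT Name FROM Filename WHERE lower(Name) = lower(?) "
        "ORDER BY FilenameId",
        (name,),
    )
    return [row[0] for row in rows]


matches = search_file(conn, "resolv.conf")
print(matches)  # ['Resolv.conf', 'RESOLV.CONF', 'resolv.conf']
```

The query has to use the same lower(Name) expression as the index definition, otherwise the engine cannot match the two and falls back to scanning the table.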