#48 Feature: purgeall command

closed
nobody
None
5
2022-02-16
2020-06-15
No

I was setting up a few backups today, also including some purge configuration. However, for some profiles, I used MAX_AGE, and for some backups I used MAX_FULL_BACKUPS. This means that for the former profiles, I have to call purge, while for the latter I have to call purgefull (and for both, I also configured MAX_FULLS_WITH_INCRS, so those need purgeincr as well).

This means I cannot just do a for loop over all profiles and run the same backup command for each. Also, if I change the profile configuration, I also have to change the calling script/cronjob/systemd file, which is inconvenient. I cannot just run all purge commands, since they error out when the relevant configuration is not set.

I would suggest a new command, purgeall, which would:
- If MAX_FULLS_WITH_INCRS is set, run purgeincr
- If MAX_FULL_BACKUPS is set, run purgefull
- If MAX_AGE is set, run purge
(maybe the order needs to be revised)

This would allow just running the same command (e.g. backup_purgeall) for all profiles.
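
Until such a command exists, the selection logic above could be approximated in a wrapper script. The following is only a minimal sketch under two assumptions: the profile conf file is plain shell that is safe to source, and the purge commands can be chained with `_` like the other batch commands. `purge_cmds` is a hypothetical helper name, not part of duply.

```shell
#!/bin/sh
# Hypothetical helper (not part of duply): print the purge commands that
# match the variables set in a profile conf file, joined with "_".
# Assumption: the conf file is plain shell and safe to source.
purge_cmds() {
    conf=$1
    (
        # Work in a subshell so the profile's variables do not leak out.
        unset MAX_AGE MAX_FULL_BACKUPS MAX_FULLS_WITH_INCRS
        . "$conf" || exit 1
        cmds=
        [ -n "${MAX_FULLS_WITH_INCRS:-}" ] && cmds="${cmds}_purgeincr"
        [ -n "${MAX_FULL_BACKUPS:-}" ] && cmds="${cmds}_purgefull"
        [ -n "${MAX_AGE:-}" ] && cmds="${cmds}_purge"
        # Strip the leading "_" (prints an empty line if nothing is set).
        printf '%s\n' "${cmds#_}"
    )
}
```

A loop over profiles could then build the batch command from the helper's output, for example `duply "$p" "backup_$(purge_cmds "/path/to/$p/conf")_cleanup" --force`, where the conf path is a placeholder. Handling a profile with no purge variables at all is left out of this sketch.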


Discussion

  • ede

    ede - 2020-06-15

    hey big foot hunting Matthijs,

    On 15.06.2020 09:51, Matthijs Kooijman wrote:


    [feature-requests:#48] https://sourceforge.net/p/ftplicity/feature-requests/48/ Feature: purgeall command

    Status: open
    Group: Next Release (example)
    Created: Mon Jun 15, 2020 07:51 AM UTC by Matthijs Kooijman
    Last Updated: Mon Jun 15, 2020 07:51 AM UTC
    Owner: nobody

    I was setting up a few backups today, also including some purge configuration. However, for some profiles, I used |MAX_AGE|, and for some backups I used |MAX_FULL_BACKUPS|. This means that for the former profiles, I have to call |purge|, while for the latter I have to call |purgefull| (and for both, I also configured |MAX_FULLS_WITH_INCRS|, so those need |purgeincr| as well).

    how many profiles are we talking? just to get an idea of the effort needed in relation.

    This means I cannot just do a for loop over all profiles and run the same backup command for each. Also, if I change the profile configuration, I also have to change the calling script/cronjob/systemd file, which is inconvenient.

    what do you mean. can you give an example?

    I cannot just run all purge commands, since they error out when the relevant configuration is not set.

    I would suggest a new command, |purgeall|, which would:
    - If |MAX_FULLS_WITH_INCRS| is set, run |purgeincr|
    - If |MAX_FULL_BACKUPS| is set, run |purgefull|
    - If |MAX_AGE| is set, run |purge|
    (maybe the order needs to be revised)

    This would allow just running the same command (e.g. |backup_purgeall|) for all profiles.

    well in theory we could make 'purge' a little more intelligent by testing which vars are set and purging accordingly. before tackling that are you aware that you can leave out purge conf vars completely and give them as parameter? e.g.
    duply test purge 365D
    ?

    ..ede/duply.net

     


  • Matthijs Kooijman

    how many profiles are we talking? just to get an idea of the effort needed in relation.

    I had 5 profiles, though some fiddling around with excludes and mountpoints reduced this to three, so it's not terrible.

    what do you mean. can you give an example?

    I currently have e.g. this in my systemd unit file:

    ExecStart=/usr/bin/duply data backup_purgeincr_purge_cleanup --force
    ExecStart=/usr/bin/duply root backup_purgeincr_purgefull_cleanup --force
    

    Now, if I for example change /etc/duply/data/conf from MAX_AGE=1M to MAX_FULL_BACKUPS=10, I have to also change the systemd unit file above replacing purge with purgefull for the data profile.

    Also, I cannot currently do something like:

    for p in data root; do
        duply "$p" backup_purge_cleanup --force
    done
    

    Since this would require different purge options for different profiles.

    well in theory we could make 'purge' a little more intelligent by testing which vars are set and purging accordingly.

    Yeah, exactly. Though I proposed a new command, to prevent breaking existing installations (e.g. if you let purge apply all three possible purge actions, then someone explicitly calling purge_purgeincr would run purgeincr twice).

    before tackling that are you aware that you can leave out purge conf vars completely and give them as parameter? e.g. duply test purge 365D

    I had not realized that, and it would help a bit, since then I can move the purge config entirely into the systemd file, except that that would (AFAICS) require calling purge and purgeincr separately. Also, I would still need to configure --full-if-older-than in the duply config file, so that would again split the age config a bit (and having it all in the duply config seems nicer anyway, the less profile-specific config there is in the systemd file, the better IMHO).

     
    • ede

      ede - 2020-06-16

      how many profiles are we talking? just to get an idea of the effort needed in relation.

      I had 5 profiles, though some fiddling around with excludes and mountpoints reduced this to three, so it's not terrible.

      agreed

      what do you mean. can you give an example?

      I currently have e.g. this in my systemd unit file:

      ExecStart=/usr/bin/duply data backup_purgeincr_purge_cleanup --force
      ExecStart=/usr/bin/duply root backup_purgeincr_purgefull_cleanup --force
      

      Now, if I for example change /etc/duply/data/conf from MAX_AGE=1M to MAX_FULL_BACKUPS=10, I have to also change the systemd unit file above replacing purge with purgefull for the data profile.

      any reason not to use cron, where it is a oneliner?

      Also, I cannot currently do something like:

      for p in data root; do
          duply "$p" backup_purge_cleanup --force
      done
      

      Since this would require different purge options for different profiles.

      that's true.

      just a hint, consider adding 'verify_and_' or 'verify+' before the purge so that purging only happens when the backups have been verified as proper.

      well in theory we could make 'purge' a little more intelligent by testing which vars are set and purging accordingly.

      Yeah, exactly. Though I proposed a new command, to prevent breaking existing installations (e.g. if you let purge apply all three possible purge actions, then someone explicitly calling purge_purgeincr would run purgeincr twice).

      never imagined someone would run two purges. but hey, you're right it's possible.

      before tackling that are you aware that you can leave out purge conf vars completely and give them as parameter? e.g. duply test purge 365D

      I had not realized that, and it would help a bit, since then I can move the purge config entirely into the systemd file, except that that would (AFAICS) require calling purge and purgeincr separately. Also, I would still need to configure --full-if-older-than in the duply config file, so that would again split the age config a bit (and having it all in the duply config seems nicer anyway, the less profile-specific config there is in the systemd file, the better IMHO).

      the problem seems to be the whole one commandline for different backups thing. why should it be identical if it is not. the question is how one could make it more one-for-all, maybe.

      thinking about it.. ede

       
  • Matthijs Kooijman

    I had 5 profiles, though some fiddling around with excludes and mountpoints reduced this to three, so it's not terrible.

    I changed my mind about this and want to split my backups a bit further (to make them more manageable in size), so now I'll probably have 6 or so :-)

    any reason not to use cron, where it is a oneliner?

    I rather like systemd for running services, since it takes care of logging output and exit status, has a status command that shows the current state, allows manually triggering a service, etc. I guess the oneliner you mean is the for loop, which I could probably also do in systemd with a /bin/sh -c 'for ...' wrapper, but even that approach would need the original request (purgeall) to be implemented.

    just a hint, consider adding 'verify_and_' or 'verify+' before the purge so that purging only happens when the backups have been verified as proper.

    Oh, that's actually a good suggestion (though I'm a bit afraid of verify: this 200GB backup already took 48h to run a full backup, so verify is probably not much better and might block new backups in the meantime? Though it seems verify does not compare contents by default, so that should be faster I guess). I do wonder: if verify fails whenever the file list (or any file contents with --compare-data) has changed, won't it pretty much always fail on a big backup? In the time a full backup takes, I can be pretty sure that new files have been created. Even if I do another one or two incrementals afterwards to make the backup more up-to-date, I'm afraid that verify will never actually work...

    Anyway, it is probably good practice to purge only occasionally, not on every backup run, which could match with verifying only occasionally too (e.g. every month do a new full backup, verify and purge or so).

    never imagined someone would run two purges. but hey, you're right it's possible.

    Hm, that surprises me. How is purgeincr useful by itself? AFAICS it only removes incrementals, never full backups? So if you only run purgeincr, you'll keep full backups indefinitely. Seems like you should always use purgeincr together with purge or purgefull?

    the problem seems to be the whole one commandline for different backups thing. why should it be identical if it is not. the question is how one could make it more one-for-all, maybe.

    Yeah, exactly. Hence my suggestion of a purgeall command that runs whichever of the three purge commands are configured. With that, all my profiles can use the same commands, just the config is different.

     
  • ede

    ede - 2020-12-28

    Hm, that surprises me. How is purgeincr useful by itself? AFAICS it only removes incrementals, never full backups? So if you only run purgeincr, you'll keep full backups indefinitely. Seems like you should always use purgeincr together with purge or purgefull?

    purgeIncr is translated to the duplicity command remove-all-inc-of-but-n-full which the manpage explains like this

    remove-all-inc-of-but-n-full <count> [--force] <url>
    Delete incremental sets of all backups sets that are older than the
    count:th last full backup (in other words, keep only old full backups 
    and not their increments). count must be larger than zero. 
    A value of 1 means that only the single most recent backup chain will be 
    kept intact. Note that --force will be needed to delete the files instead 
    of just listing them.
    

    so yeah, it removes old chain's incrementals only. you are totally right. obviously i never used it so far.
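
To make the manpage wording concrete, here is a small illustrative model in plain shell (no duplicity involved). It lists, for a hypothetical number of backup chains numbered from oldest to newest, which chains would keep their incrementals for a given count. `chains_keeping_incrs` is a made-up helper for illustration only, and it reflects one reading of the manpage excerpt above rather than duplicity's actual implementation.

```shell
#!/bin/sh
# Illustration only: model of remove-all-inc-of-but-n-full <count> as
# described in the manpage excerpt. Chains are numbered 1 (oldest) to
# <total> (newest); the newest <count> chains keep their incrementals.
chains_keeping_incrs() {
    total=$1
    count=$2
    i=1
    out=
    while [ "$i" -le "$total" ]; do
        # A chain stays intact if it is among the last <count> chains.
        if [ "$i" -gt $((total - count)) ]; then
            out="${out}${out:+ }$i"
        fi
        i=$((i + 1))
    done
    printf '%s\n' "$out"
}

chains_keeping_incrs 5 1   # prints: 5
chains_keeping_incrs 5 2   # prints: 4 5
```

In this model the full backups of all five chains remain in both cases, which is why purgeincr on its own never frees the space taken by old full backups.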

    if i find time i may have a look at a purgeAuto command or such. be patient. ..ede

     

    Last edit: ede 2020-12-28
  • Matthijs Kooijman

    I came back to this issue after discovering my remote backup disk was full (because I had not enabled purging yet due to this). Looking at this again, I realized there is an easy workaround: just specify large values for any purging options you do not need. E.g. if you only want to use MAX_FULL_BACKUPS you can specify:

    MAX_FULL_BACKUPS=3
    MAX_FULLS_WITH_INCRS=9999
    MAX_AGE=9999Y
    

    This should allow running backup+verify+purge+purgefull+purgeincr unconditionally and without errors (at the expense of doing some additional listing of the backup sets that are not really needed).
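
The same idea could be expressed with shell default assignments, so that only the variables a profile leaves unset receive the large values. This is an untested sketch that assumes the conf file is sourced as shell; `apply_purge_defaults` is a hypothetical helper name, and the large values are the ones from the example above.

```shell
#!/bin/sh
# Sketch: give every purge variable a value so all purge commands can run.
# Variables already set by the profile keep their value; only unset or
# empty ones receive the large "effectively never purge" defaults.
apply_purge_defaults() {
    : "${MAX_FULL_BACKUPS:=9999}"
    : "${MAX_FULLS_WITH_INCRS:=9999}"
    : "${MAX_AGE:=9999Y}"
}

# Example profile that only cares about the number of full backups.
unset MAX_FULL_BACKUPS MAX_FULLS_WITH_INCRS MAX_AGE
MAX_FULL_BACKUPS=3
apply_purge_defaults
echo "$MAX_FULL_BACKUPS $MAX_FULLS_WITH_INCRS $MAX_AGE"   # prints: 3 9999 9999Y
```

The three `:` lines could also be placed directly at the end of a profile conf, after the profile's own purge settings, instead of being wrapped in a function.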

    I haven't actually tried the full command line yet (I did check that duply/duplicity runs fine with the individual purge commands), since on closer inspection all of my configurations now use MAX_AGE. I guess I unified that to work around this issue but then forgot to actually enable purging :-)

     
  • ede

    ede - 2022-02-08

    thanks for the heads up,

    i'm currently thinking about adding purgeAuto as a batch command that gets expanded to [purge_purgeFull_purgeIncr] depending on which conf vars are set. not sure enforcing conditional AND would be a good idea.

     
  • Matthijs Kooijman

    That sounds like a good approach. With "AND", you mean whether e.g. purgeincr should be allowed to run when purgefull fails? It's probably ok to continue running when one command fails, but it might be safer to abort on the first failure, also to prevent the error messages from being scrolled out of sight by a subsequent successful operation?

     
  • ede

    ede - 2022-02-08

    yeah, conditional chaining. seeing as backup currently does not enforce AND, i am inclined to keep it simple for the proposed purgeAuto as well. after all, just because one fails, you might still want the others to clean up?
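
The two chaining behaviours under discussion can be illustrated with stand-in commands in plain shell. This is a generic sketch, not duply code: `run_all`, `run_until_failure` and the `step_*` functions are hypothetical names, and the second step is made to fail on purpose.

```shell
#!/bin/sh
# Illustration with stand-in commands (no duply involved).

# run_all: run every command, remember whether any of them failed.
run_all() {
    rc=0
    for cmd in "$@"; do
        "$cmd" || rc=1
    done
    return "$rc"
}

# run_until_failure: stop at the first command that fails.
run_until_failure() {
    for cmd in "$@"; do
        "$cmd" || return 1
    done
}

# Stand-ins for three purge steps; the second one fails.
step_purge()     { echo "purge ok"; }
step_purgefull() { echo "purgefull FAILED"; return 1; }
step_purgeincr() { echo "purgeincr ok"; }

# Runs all three steps, then reports the failure.
run_all step_purge step_purgefull step_purgeincr \
    || echo "run_all: at least one step failed"

# Stops after the second step, so purgeincr never runs.
run_until_failure step_purge step_purgefull step_purgeincr \
    || echo "run_until_failure: aborted at first failure"
```

With the first variant the remaining steps still get to clean up, at the cost of the failure message being followed by later output. The second variant keeps the failure as the last thing printed.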

     
  • ede

    ede - 2022-02-11
    • status: open --> closed
     
  • ede

    ede - 2022-02-11

    purgeAuto as outlined above will be in the next release. thanks!

     
  • ede

    ede - 2022-02-16

    hey Matthijs,

    could you please test if the purgeAuto as implemented in the snapshot
    https://duply.net/tmp/duply.sh

    works as you expect? thanks! ..ede

     
