I create an archive file like this:
dd if=/dev/sda1 bs=1M | 7zz a -tzip -m0=zstd -mx1 -sipartition.1 backup.zip
which works as expected. When I try to add a second file to the same archive:
dd if=/dev/sda2 bs=1M | 7zz a -tzip -m0=zstd -mx1 -sipartition.2 backup.zip
then I see that, alongside backup.zip, a temporary file is created in the same folder. I am not sure, but it looks like the contents of the original backup.zip are copied to the temporary file, the new data is added to it, and finally the original backup.zip is deleted and the temporary file is renamed to backup.zip.
As I am archiving large files (> 5 GB), copying the existing data to a temporary file slows the process down considerably. I couldn't find any command-line flag that prevents the creation of the temporary file.
Does anybody know why 7-Zip works like this? My understanding of the zip archive format is that the central directory, which lists the contents of the zip file, is located at the end of the file. This means that adding a new file to an existing archive does not require a temporary copy. You should be able to read the central directory into memory, write the new data at the offset where the central directory used to start (overwriting it), add an entry for the new file to the in-memory central directory, and then write the updated central directory at the end of the file.
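For illustration, the end-of-file layout described above can be inspected with a short Python sketch. It scans the tail of an archive for the end-of-central-directory record and reads the central directory's offset, size, and entry count from it. The function name `locate_central_directory` is made up for this example, and the sketch assumes a plain single-disk archive without Zip64 records, so it would not be sufficient for archives over 4 GB.

```python
import io
import struct
import zipfile

EOCD_SIGNATURE = b"PK\x05\x06"  # end-of-central-directory record signature
EOCD_SIZE = 22                  # fixed part of the record, without the comment
MAX_COMMENT = 0xFFFF            # the comment length field is 16 bits wide


def locate_central_directory(f):
    """Return (offset, size, entry_count) of the central directory.

    Hypothetical helper: handles only non-Zip64, single-disk archives.
    """
    f.seek(0, io.SEEK_END)
    file_size = f.tell()
    # The record sits in the last 22 + comment bytes of the file.
    tail_len = min(file_size, EOCD_SIZE + MAX_COMMENT)
    f.seek(file_size - tail_len)
    tail = f.read(tail_len)
    pos = tail.rfind(EOCD_SIGNATURE)
    if pos < 0 or len(tail) - pos < EOCD_SIZE:
        raise ValueError("no end-of-central-directory record found")
    # signature, disk numbers (2), entry counts (2), size, offset, comment length
    fields = struct.unpack("<4s4H2LH", tail[pos:pos + EOCD_SIZE])
    entry_count, cd_size, cd_offset = fields[4], fields[5], fields[6]
    return cd_offset, cd_size, entry_count


if __name__ == "__main__":
    # Build a small archive in memory and show where its directory starts.
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr("partition.1", b"a" * 100)
    print(locate_central_directory(buf))
```

Everything before the returned offset is member data that an in-place append would not need to touch; only the directory and the trailing record have to be rewritten.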
And when I use the zipfile module from Python, this is exactly how it works. I can add new files to an existing zip file without creating a time- (and space-) consuming copy. As the zstd compression method is much faster than the standard Deflate method, I was hoping to use 7zz directly instead of the zipfile module. But the temporary file slows it down and uses too much disk space.
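A minimal sketch of that Python approach, for comparison. Opening the archive with mode "a" makes `zipfile` write the new member where the old central directory started and then write a fresh directory on close, so the existing member data stays where it is. The helper name `append_member` and the file names are only illustrative, and the sketch uses Deflate because Zstandard support in `zipfile` depends on the Python version.

```python
import os
import tempfile
import zipfile


def append_member(archive_path, arcname, chunks):
    """Stream an iterable of byte chunks into a new member, in place."""
    with zipfile.ZipFile(archive_path, "a",
                         compression=zipfile.ZIP_DEFLATED) as zf:
        # force_zip64 allows members whose final size may exceed 4 GiB,
        # which matters when the size is not known in advance.
        with zf.open(arcname, "w", force_zip64=True) as dest:
            for chunk in chunks:
                dest.write(chunk)


if __name__ == "__main__":
    workdir = tempfile.mkdtemp()
    path = os.path.join(workdir, "backup.zip")

    # First member creates the archive.
    with zipfile.ZipFile(path, "w", compression=zipfile.ZIP_DEFLATED) as zf:
        zf.writestr("partition.1", b"first partition image")
    size_before = os.path.getsize(path)

    # Second member is appended without copying the first one.
    append_member(path, "partition.2", [b"second ", b"partition image"])

    with zipfile.ZipFile(path) as zf:
        print(zf.namelist())
    print(size_before, os.path.getsize(path))
```

To feed it from a pipe, as in the dd commands above, the chunks could come from something like `iter(lambda: sys.stdin.buffer.read(1 << 20), b"")`.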
Is there a way to get 7zz to add new files to the end of an existing archive without creating a temporary file?
There is no way to disable the creation of the temporary archive file.
7-zip creates a temporary file to prevent data loss in case of failures.
It seems like the possibility of data loss would be very small. You could leave the existing central directory in place at the end of the existing file and just append the new file and a new central directory after it. If you ran out of disk space during this operation, you could then just truncate the file back to its original size.
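A rough sketch of this kind of rollback, written with Python's `zipfile` rather than 7-Zip. It is a variation on the idea above: `zipfile` overwrites the old central directory instead of leaving it in place, so the sketch first saves those tail bytes (small compared with the member data) and, on failure, writes them back and truncates the file to its original size. The helper name `append_with_rollback` is hypothetical, and `start_dir` is an undocumented attribute of CPython's `ZipFile`, so relying on it is an assumption.

```python
import os
import tempfile
import zipfile


def append_with_rollback(archive_path, arcname, chunks):
    """Append a member in place; restore the original bytes on failure."""
    original_size = os.path.getsize(archive_path)

    # Find where the central directory starts.
    # Assumption: start_dir is an undocumented CPython attribute.
    with zipfile.ZipFile(archive_path, "r") as zf:
        cd_start = zf.start_dir

    # Keep a copy of the directory and trailing record only.
    with open(archive_path, "rb") as f:
        f.seek(cd_start)
        saved_tail = f.read()

    try:
        with zipfile.ZipFile(archive_path, "a",
                             compression=zipfile.ZIP_DEFLATED) as zf:
            with zf.open(arcname, "w", force_zip64=True) as dest:
                for chunk in chunks:
                    dest.write(chunk)
    except BaseException:
        # Put the old directory back and cut off anything written after it.
        with open(archive_path, "r+b") as f:
            f.seek(cd_start)
            f.write(saved_tail)
            f.truncate(original_size)
        raise


if __name__ == "__main__":
    workdir = tempfile.mkdtemp()
    path = os.path.join(workdir, "backup.zip")
    with zipfile.ZipFile(path, "w") as zf:
        zf.writestr("partition.1", b"first partition image")

    append_with_rollback(path, "partition.2", [b"second partition image"])
    with zipfile.ZipFile(path) as zf:
        print(zf.namelist())
```

This only covers failures the process can react to, such as a write error or an interrupted input stream. A power loss or a killed process in the middle of the append would still leave the archive without a valid directory, which is the case a temporary file protects against.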
For my use case I would much prefer an option to accept this small risk, as otherwise 7-Zip is not a workable solution and I will have to use something else. I understand if you prefer not to implement such an option, but I feel that for some use cases it is necessary.