dump currently does not build on systems (alpine, gentoo, void linux, etc) where the libc is musl and not glibc. Some of the issues are easy to fix, for example by replacing the __BEGIN_DECLS and __END_DECLS macros from sys/cdefs.h per https://wiki.musl-libc.org/faq.html#Q:-When-compiling-something-against-musl,-I-get-error-messages-about-%3Ccode%3Esys/cdefs.h%3C/code%3E
But others are less easy. fstab.h is missing, is it important? The daddr_t type is also nowhere to be found and I don't know what to do about that.
I'm not even sure if this is within the scope of the dump tool, but we have a bug report about it on Gentoo, so now you have one too :)
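For the `sys/cdefs.h` part, the musl FAQ linked above suggests spelling out the `extern "C"` guards rather than relying on `__BEGIN_DECLS` and `__END_DECLS`. A minimal sketch of that pattern follows; `example_add` is a hypothetical function used only for illustration and is not taken from dump.

```c
/* Explicit C++ linkage guards, used in place of __BEGIN_DECLS/__END_DECLS
 * from <sys/cdefs.h>, which musl does not ship. */
#ifdef __cplusplus
extern "C" {
#endif

/* Hypothetical declaration standing in for a real prototype in a header. */
int example_add(int a, int b);

#ifdef __cplusplus
}
#endif

/* Definition, shown in the same file only so the sketch compiles on its own. */
int example_add(int a, int b)
{
    return a + b;
}
```

A C compiler sees only the plain declaration, so the guards cost nothing in C-only code.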
Thanks for reporting this. I'm not sure exactly how far I'll be able to get with this, but I see that Debian does package musl, so at some point I'll try to get this compiling with that and see how far that gets us. I don't have a huge amount of time to work on dump, so I almost certainly won't be setting up a new distribution to test with in the foreseeable future, but perhaps just getting it building on Debian will be enough.
I had a brief look into this and I'm not sure if I'm going to be able to do much for this in the foreseeable future. While installing musl-tools got me as far as running configure, it looks like, at least on Debian, I'm stuck on things as fundamental as ext2fs/ext2_fs.h.
But the `fstab.h` includes are irrelevant; I will remove them in the next release. So are the `cdefs.h` includes and the DECLS stuff, which can be deleted.
`daddr_t` is defined in `sys/types.h` and ultimately resolves to:
`__STD_TYPE __DADDR_T_TYPE __daddr_t; /* The type of a disk address. */`
But as it's only used in `sizeof` and the underlying code uses `__u32` in `new_bsd_inode`, I'm not completely sure why this type is used at all now. It was used in many more places back in 2001. I will change these as follows in the next release (unless I find a reason why they cannot be changed)
No problem, every little bit helps, thanks. What's the issue with the ext2fs headers? Don't they come from either e2fsprogs or the kernel headers?
With a little more hacking I was able to get `make` to complete. Pending more serious testing,
- the `timelocal` function can be replaced by `mktime`. According to the man-pages project, the latter is a POSIX replacement for the former.
- `#include <fcntl.h>` for `open()` and friends.
- `#include <limits.h>` for `PATH_MAX`.
I also had to comment out the `rcmd` call in common/dumprmt.c, which probably is not the solution you want, but maybe the remote dump feature could be compiled out or something on systems with no `rcmd`.
That is good news. I'll have a look at those changes and see what I can do. I thought there was a way to compile out the rcmd stuff but obviously not. I'm seriously tempted to add that so that it can be only rsh based. I'm not sure if anybody still uses rcmd for anything.
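As an illustration of the `timelocal` to `mktime` substitution mentioned above, here is a minimal sketch. `local_to_epoch` is a hypothetical helper, not a function from dump; it only shows that `mktime` takes a broken-down local time and returns seconds since the epoch.

```c
#include <time.h>

/* Convert a broken-down local time to seconds since the epoch.
 * Hypothetical helper: it stands in for a call site that previously
 * used the non-POSIX timelocal(). */
time_t local_to_epoch(const struct tm *local)
{
    /* mktime() may normalise the fields of its argument, so work on a
     * copy and leave the caller's struct untouched. */
    struct tm copy = *local;
    return mktime(&copy);
}
```

A round trip through `localtime` and back should return the original timestamp, which makes for a cheap sanity check on a new libc.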
FWIW there's a big stack of tests in testing/scripts - but only from git, not from the release tarball. Unfortunately, they require root to run and do many "aggressive" things like `mke2fs` and `rm -fr`, so please be cautious. They work for me but I cannot guarantee that there's not some quirk in there that will be catastrophic for someone else. The entire suite can be run from `./run-all-tests.sh -h`
(One of my goals for the v0.4b50 release was to allow them to run as non-root using unshare and fuse but I've hit a couple of "hard" issues to solve that are going to take a bit of perseverance. I've collected a fair few minor fixes and cleanups locally so I'll probably create 0.4b50 soon without non-root tests)
`./configure --enable-rcmd=no` appears to DTRT.
It should be `#include <linux/limits.h>` for `PATH_MAX`. Does that work for you?
These changes have been made in 0.4b50
Thanks a lot, it builds now with `--disable-rcmd`. I've been meaning to run the test suite, but I'm not brave enough to do it on my real workstation. I should be able to use an Alpine Linux (they use musl) VM but my home machine is a bit weird and I haven't been able to get kvm/qemu working. The next time I'm at my office I have a nice normal amd64 machine I can use though.
A quick test you can do, assuming you have an ext[234] partition mounted somewhere (ro to be extra safe), is something like this, which dumps a fs to stdout and then pipes it into restore to verify against the mounted fs
(you still need to be root but it doesn't involve all the complex logic to create filesystems and clean up afterwards.)
```
umount /mnt/nobackup/backup/
mount -o ro /mnt/nobackup/backup/
dump -0 -f - /dev/mapper/vg--dirac-backup | restore -C -D /mnt/nobackup/backup/ -f -
DUMP: Date of this level 0 dump: Thu Mar 20 18:39:56 2025
DUMP: Dumping /dev/mapper/vg--dirac-backup (/mnt/nobackup/backup) to standard output
DUMP: Label: none
DUMP: Writing 10 Kilobyte records
DUMP: mapping (Pass I) [regular files]
DUMP: mapping (Pass II) [directories]
DUMP: estimated 1372620 blocks.
DUMP: Volume 1 started with block 1 at: Thu Mar 20 18:39:57 2025
Dump date: Thu Mar 20 18:39:56 2025
Dumped from: the epoch
Level 0 dump of /mnt/nobackup/backup on dirac.home.woodall.me.uk:/dev/mapper/vg--dirac-backup
Label: none
DUMP: dumping (Pass III) [directories]
filesys = /mnt/nobackup/backup/
DUMP: dumping (Pass IV) [regular files]
DUMP: Volume 1 completed at: Thu Mar 20 18:40:04 2025
DUMP: Volume 1 1369690 blocks (1337.59MB)
DUMP: Volume 1 took 0:00:07
DUMP: Volume 1 transfer rate: 195670 kB/s
DUMP: 1369690 blocks (1337.59MB)
DUMP: finished in 7 seconds, throughput 195670 kBytes/sec
DUMP: Date of this level 0 dump: Thu Mar 20 18:39:56 2025
DUMP: Date this dump completed: Thu Mar 20 18:40:04 2025
DUMP: Average transfer rate: 195670 kB/s
DUMP: DUMP IS DONE
```
And add `-v` to the restore command if you want lots of logging from restore (otherwise it only logs on problems)
dump seems to work OK...
but restore hit some problem:
Keep in mind though that in addition to using musl, this is also a RISC-V system. It has an old forked kernel and brand new compilers. The risk of conflating issues is high.
Oh dear, that's going wrong right at the very start of reading the tape.
You got to restore/tape.c line 388 so it found the start of the CLRI bitmap, which will have been full of zeros for a L0 dump, but it didn't then find the BITS bitmap following it (which will have had a bit for each inode that is going to be dumped set.)
Don't know if you've tried building it, but `make check` builds a few helper programs in faketape.
(none of this needs to be done as root)
First run ./faketape_test (it takes 30s or so to run on my system, but if it doesn't work there's no point continuing with the rest of this until that is sorted. It's a standalone program that doesn't depend on any of the code in dump/restore)
Then try converting your file image into a faketape image.
First parameter is a NEW file that it will create, it will refuse to run if that file exists. Second parameter is your image. This will take a while even if it works. If it fails then that cannot read your tape either but it's going to be much easier to see what it's complaining about, the entire program is only around 200 lines.
Finally, if that works (and I suspect that it won't)
(note that you have to rewind again to run dump-info again. faketape emulates a non-auto rewinding tape)
and you should get something like
at the start. You see that `CLRI size=16384`, which is why there are 16 blocks. Then there's the TS_BITS which is the same size and then beyond that the data stuff starts.
Ha! `file-to-tape: Dump ../x.img written successfully` that's a bug. It's not writing x.img at all, only reading it.
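The relationship between the CLRI size of 16384 bytes and the 16 blocks mentioned above is a ceiling division by the tape block size. The sketch below assumes a block size of 1024 bytes, which is what the 16384-to-16 ratio in this thread implies; the macro and function names are hypothetical.

```c
#include <stddef.h>

/* Tape block size in bytes. Assumption: 1024, inferred from a
 * 16384-byte bitmap occupying 16 blocks. */
#define EXAMPLE_TAPE_BLOCK_SIZE 1024u

/* Number of whole tape blocks needed to hold nbytes of bitmap data. */
size_t example_blocks_for_bytes(size_t nbytes)
{
    return (nbytes + EXAMPLE_TAPE_BLOCK_SIZE - 1) / EXAMPLE_TAPE_BLOCK_SIZE;
}
```

The same arithmetic tells you how many data blocks a reader should expect after a bitmap header before the next header appears.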
Actually, now I remind myself about how this works, file-to-tape should work, it's dump-info that will hopefully give some clues as to what's going wrong.
I'm trying to guess what is most likely to be the cause. Struct padding would be my first thought but `u_spcl_size_assert` is supposed to do a compile time check on this.
You might want to change say line 129 of compat/include/protocols/dumprestore.h to something like:
`case sizeof(us) == 1025 ? 1 : 0:`
and ensure that it fails to build and the musl compiler isn't doing something clever and not compiling an unused function which is then hiding a struct padding issue.
This does fail
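For readers unfamiliar with the `case sizeof(...)` trick discussed above, here is a minimal sketch of how a duplicate case label can act as a compile-time size check, alongside the C11 `_Static_assert` equivalent. The union and function names are hypothetical stand-ins, not the definitions from dumprestore.h.

```c
#include <stddef.h>

/* Hypothetical stand-in for a fixed-size on-tape header. */
union example_header {
    char raw[1024];
    struct {
        int type;
        int date;
    } fields;
};

/* Pre-C11 trick: if the size test is false, the second label evaluates
 * to 0 and collides with "case 0", which the compiler must reject as a
 * duplicate case label. The function does not need to be called. */
static inline void example_size_assert(void)
{
    switch (0) {
    case 0:
    case sizeof(union example_header) == 1024 ? 1 : 0:
        break;
    }
}

/* C11 equivalent: checked at file scope, independent of any function. */
_Static_assert(sizeof(union example_header) == 1024,
               "example header must be exactly 1024 bytes");
```

Changing either `1024` to `1025` should break the build, which is the same experiment suggested above for confirming that the check is really being compiled.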
What version of the catch library are you using? I think the two that are available on Gentoo are too old / too new, but I could build one myself. Ignoring that for the moment...
To get the rest to build, I had to add `#include <sys/stat.h>` to faketape/bswap_header.cpp and faketape/dump-info.cpp. Afterwards, file-to-tape does work. This is what dump-info has to say:
and it goes on for another 33MiB before finishing with
Never mind about catch, I was able to get it working with v3.7.1:
This looks like an off-by-one bug in dump / restore. Hopefully easily fixed in restore otherwise..
It fits exactly in 1843 blocks.
I suspect that's counting from 0 and it looks like it's gone wrong on the last block of the CLRI bitmap - which (my guess) dump is writing but restore is not expecting.
I'm more than a bit disappointed that restore didn't resync on the next block, I need to have a dig into that - which is probably where the fix will lie (as well as fixing dump to not write the extra block)
Hmmm, now I've looked at the code I don't know why it's gone wrong.
converthead sets blksread to 0, which means `Incorrect block for <file removal list> at 1845 blocks` is reading what should be the TS_BITS header, which means that something went wrong with gethead at line 1316 of restore/tape.c
It's clearly read 1844 blocks at line 1286
Ohhh, it's gone wrong at line 1305
`Incorrect block for <file removal list> at 1845 blocks` has read one too many blocks because it starts at 0, so it's skipped over the TS_BITS and then fails when it gets to the TS_INODE
I'm pretty sure you are reaching line 1305 but I cannot see how that can happen.
You get to 1299 with size==0 and b==0 and i==1843 having read the last block of the bitmap at line 1286.
Then that do { } while loop sets enclen to 1 so readtape(junk); should never be reached.
You could try this fix but I'm suspicious it's wrong where we've read a file of say 1K on a system with blocks of 4K although interestingly, all the "non slow tests" have passed, I'm running the historical-regression tests while I write this.
It would be interesting if you could add some logging before each of the readtape calls in this function, outputting the line number and the values of i, b and blksread. That should only output 3K lines of trace before it fails. I'd be particularly interested in what it says just before that first "Incorrect block for" message.
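One way to add that kind of trace is a small macro that records the source line together with the three counters. This is only a sketch: the helper and macro names are hypothetical, and the types of i, b and blksread in restore/tape.c are assumed here to be convertible to `long`.

```c
#include <stdio.h>

/* Format one trace line into buf and return the length snprintf reports.
 * Hypothetical helper; the counter names follow the discussion above. */
int example_format_trace(char *buf, size_t buflen, int line,
                         long i, long b, long blksread)
{
    return snprintf(buf, buflen,
                    "readtape at line %d: i=%ld b=%ld blksread=%ld",
                    line, i, b, blksread);
}

/* Drop this immediately before a readtape() call; __LINE__ identifies
 * which call site produced the message. Output goes to stderr. */
#define EXAMPLE_TRACE_READTAPE(i, b, blksread)                             \
    do {                                                                   \
        char example_buf[128];                                             \
        example_format_trace(example_buf, sizeof(example_buf), __LINE__,   \
                             (long)(i), (long)(b), (long)(blksread));      \
        fprintf(stderr, "%s\n", example_buf);                              \
    } while (0)
```

Writing to stderr keeps the trace separate from anything restore prints on stdout, so it can be redirected to its own file.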
Actually, I think `Incorrect block for <file removal list> at 1845 blocks` is correct, it's read the 1845th block after the start of the bitmap - which should be the TS_BITS but somehow gethead doesn't like it.
Could you add trace to converthead too so we can know which particular FAIL is being triggered? I think it has to be the NFS_MAGIC case because a checksum failure should have logged.
This is very puzzling!