cdc-file-transfer is a bandwidth-efficient file transfer and sync tool that uses content-defined chunking (CDC) to detect and transmit only the parts of a file that actually changed. With fixed-size blocks, inserting or removing a few bytes moves every later block boundary, so everything after the edit looks new. Content-defined chunks are cut at boundaries determined by the data itself, so a shift or insertion changes only the chunks around the edit and the rest of the file still matches. The tool maintains chunk fingerprints and a manifest, which lets it de-duplicate across versions of a file and across different files that share regions of content. This makes it well suited to large logs, VM images, and binary assets, where a small edit would otherwise force a full re-upload. The implementation is built to be robust over unreliable networks: transfers can be resumed, and every chunk is verified with an integrity check. Operationally, it behaves like a straightforward CLI: point it at a source and a destination, and it negotiates what needs to move while keeping CPU and bandwidth use low.
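The boundary behavior described above can be illustrated with a small Python sketch. This is not the tool's implementation: the gear-style rolling hash, the `GEAR` table, the `MASK`, `MIN_SIZE`, and `MAX_SIZE` values, the use of SHA-256 for fingerprints, and the function names are all assumptions chosen to keep the example short. It only shows why an insertion near the start of a file leaves most later chunks unchanged.

```python
import hashlib
import random
from typing import List

# All constants below are illustrative assumptions, not the tool's settings.
_rng = random.Random(0)
GEAR = [_rng.getrandbits(32) for _ in range(256)]  # one random value per byte
MASK = (1 << 6) - 1   # cut when the low 6 hash bits are zero
MIN_SIZE = 16         # never cut a chunk shorter than this
MAX_SIZE = 256        # always cut once a chunk reaches this length


def chunk(data: bytes) -> List[bytes]:
    """Split data at content-defined boundaries (gear-style rolling hash)."""
    chunks = []
    start = 0
    h = 0
    for i, byte in enumerate(data):
        # Older bytes shift out of the low bits, so the cut test depends
        # only on the last few bytes, not on the absolute file offset.
        h = ((h << 1) + GEAR[byte]) & 0xFFFFFFFF
        length = i - start + 1
        if (length >= MIN_SIZE and (h & MASK) == 0) or length >= MAX_SIZE:
            chunks.append(data[start:i + 1])
            start = i + 1
            h = 0
    if start < len(data):
        chunks.append(data[start:])  # trailing partial chunk
    return chunks


def fingerprints(data: bytes) -> List[str]:
    """Fingerprint each chunk (SHA-256 is an assumption for this sketch)."""
    return [hashlib.sha256(c).hexdigest() for c in chunk(data)]


def demo():
    """Insert 8 bytes near the start and count chunks that are still known."""
    rng = random.Random(1)
    original = bytes(rng.getrandbits(8) for _ in range(20000))
    edited = original[:1000] + b"INSERTED" + original[1000:]
    old = set(fingerprints(original))
    new = fingerprints(edited)
    reused = sum(1 for fp in new if fp in old)
    return reused, len(new)


if __name__ == "__main__":
    reused, total = demo()
    print(f"{reused} of {total} chunks already known")
```

Running the script prints how many chunks of the edited data already exist in the original. With fixed-size blocks, the same 8-byte insertion would shift every later block and almost none would match.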
Features
- Content-defined chunking for resilient delta detection
- Cross-file and cross-version de-duplication of identical chunks
- Resumable transfers with per-chunk integrity verification
- Efficient handling of insertions, shifts, and small edits
- Simple CLI workflow for one-off transfers and scripted syncs
- Manifests and fingerprints to avoid needless re-transmission
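
The manifest, de-duplication, and integrity features listed above can be sketched in a few lines of Python. Everything here is hypothetical: the tool's actual manifest format, hash function, and negotiation protocol are not described in this document, so `fingerprint`, `plan_transfer`, and `reassemble` are illustrative names, and SHA-256 is an assumed fingerprint.

```python
import hashlib
from typing import Dict, Iterable, List, Set, Tuple


def fingerprint(chunk: bytes) -> str:
    """Assumed fingerprint: hex SHA-256 of the chunk contents."""
    return hashlib.sha256(chunk).hexdigest()


def plan_transfer(
    source_chunks: Iterable[bytes], known: Set[str]
) -> Tuple[List[str], Dict[str, bytes]]:
    """Return the file's manifest and only the chunks the destination lacks."""
    manifest = []   # ordered fingerprints describing the whole file
    to_send = {}    # fingerprint -> chunk bytes that must be transmitted
    for chunk in source_chunks:
        fp = fingerprint(chunk)
        manifest.append(fp)
        # Skip chunks the destination already has, and send repeats only once.
        if fp not in known and fp not in to_send:
            to_send[fp] = chunk
    return manifest, to_send


def reassemble(manifest: List[str], store: Dict[str, bytes]) -> bytes:
    """Rebuild a file from stored chunks, verifying each one on the way."""
    parts = []
    for fp in manifest:
        chunk = store[fp]
        if fingerprint(chunk) != fp:
            raise ValueError(f"chunk {fp[:12]} failed integrity check")
        parts.append(chunk)
    return b"".join(parts)
```

In this sketch, a resumed transfer is just another call to `plan_transfer` with a larger `known` set: chunks that arrived before the interruption are already in the destination's store and are skipped.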