The program that I use to rip CDs performs really poorly when I rip a handful of CDs at the same time, and it looks like sox is the bottleneck. To isolate the bug, I first created a test file:
sox -n -b 16 -r 44100 -c 2 silence.wav trim 0 60
The sox command that whipper uses ran quickly when only one instance was running:
$ time sox silence.wav -n stats -b 16
               Overall    Left   Right
DC offset            0       0       0
Min level           -1      -1      -1
Max level            1       1       1
Pk lev dB       -90.31  -90.31  -90.31
RMS lev dB      -96.33  -96.33  -96.33
RMS Pk dB       -95.89  -95.92  -95.89
RMS Tr dB       -96.83  -96.83  -96.78
Crest factor         -    2.00    2.00
Flat factor       2.18    2.18    2.19
Pk count          661k    662k    661k
Bit-depth         2/16    2/16    2/16
Num samples      2.65M
Length s        60.000
Scale max        32767
Window s         0.050
real 0m0,067s
user 0m0,384s
sys 0m0,008s
If I run the same command 3 times in parallel, it takes 38 seconds instead of less than 1:
$ time (for x in {1..3}; do sox silence.wav -n stats -b 16 & done; wait)
[...]
real 0m38,170s
user 7m23,014s
sys 0m0,240s
Changing the command above to run in series instead of in parallel brings it back to a reasonable amount of time:
$ time (for x in {1..3}; do sox silence.wav -n stats -b 16; done; wait)
[...]
real 0m0,379s
user 0m4,052s
sys 0m0,024s
Assuming sox doesn't do any inter-process communication between those 3 processes, I'm not sure why running 3 in parallel should be so much slower than running them in series on a 6-core CPU with an audio file on a tmpfs filesystem.
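One quick sanity check on the timings above: dividing user time by real time gives a rough figure for how many cores were busy on average. This is only a sketch. The helper name ratio is made up for illustration, and the numbers are copied from the three runs above with the decimal commas changed to points (7m23,014s is 443.014 s).

```shell
# Hypothetical helper: print user/real, i.e. roughly how many cores were busy.
# LC_ALL=C keeps awk from printing a decimal comma in locales that use one.
ratio() {
  LC_ALL=C awk -v u="$1" -v r="$2" 'BEGIN { printf "%.1f\n", u / r }'
}

ratio 0.384 0.067      # single run             -> 5.7
ratio 443.014 38.170   # three runs in parallel -> 11.6
ratio 4.052 0.379      # three runs in series   -> 10.7
```

If I'm reading this right, a ratio well above 1 for a single run suggests that one sox process is already spreading its work over several cores, so three of them at once would be competing for the same cores.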
I'm running sox 14.7.0.9+ds1-1 on linux 6.18.5+deb14-amd64 from Debian testing, with an Intel i5-11600.
From here on, I'm just guessing/speculating: Is there any chance that sox is using some CPU instructions that interact really poorly with context switching between threads/processes? I vaguely remember reading something about AVX-512 having some multi-threaded performance issues, but I'm not really familiar with it.
I just tested on another computer I have access to. It's an AMD 5600G running Debian trixie, sox 14.4.2+git20190427-5+b3 and linux 6.12.63+deb13-amd64. On that system, running in parallel is actually faster than in series, which is what I'd normally expect:
Oops, apparently Debian switched from sox to sox_ng and I didn't realize it. I filed this at https://codeberg.org/sox_ng/sox_ng/issues/826 so this issue can be closed.
In Debian testing, yes.
sox_ng has --multi-threaded enabled by default, which usually completes sooner at a cost of more total CPU time due to more inter-process communication. Sorry this has caused breakage; maybe it should go back to --single-threaded by default to avoid such scenarios.
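To check whether the threading default is what hurts the parallel case, the same test could be rerun with --single-threaded passed explicitly. This is an untested sketch that assumes bash and the silence.wav file from the first post; run_parallel is a hypothetical helper, not part of sox.

```shell
# Hypothetical helper: start N copies of a command in the background,
# then wait for all of them to finish.
run_parallel() {
  n=$1
  shift
  i=0
  while [ "$i" -lt "$n" ]; do
    "$@" &
    i=$((i + 1))
  done
  wait
}

# Only run the timing if sox and the test file are actually present.
# --single-threaded is a global option, so it goes before the input file.
if command -v sox >/dev/null 2>&1 && [ -f silence.wav ]; then
  time run_parallel 3 sox --single-threaded silence.wav -n stats -b 16
fi
```

If the real time of this run is close to the in-series result, that would point to the multi-threaded default rather than to anything specific to the stats effect.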