Re: [SoX-users] toward floating-point?

Cory Nelson <ph...@gm...>:

> The current guard system does work fine for most of the filters,
> though it really, really sucks when you happen upon one where it
> doesn't (like compand) and just need to take wild guesses until you
> find a gain that works. This is really my only motivation here.

I think switching to floating-point wouldn't help you there. The  
lowpass, highpass, equalizer etc. filters simply amplify part of a  
signal by, say, 20%, and it's irrelevant whether that's from 0.8 to  
0.96, or from 0.01 to 0.012, or whatever. The compand effect, however,  
looks at absolute levels and may change 0.8 to 0.96, but 0.01 to 0.2.  
That's its very purpose; otherwise we could replace it with gain or vol.

So instead of manually fiddling with the gain to avoid clipping, you'd  
have to manually fiddle with the gain to get the signal to the right  
level for companding. (Alternatively, you'd have to manually fiddle  
with the transfer function.)
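
To illustrate the difference, here is a minimal sketch in C. Both
functions are made up for this example; in particular, toy_compand is
a hypothetical piecewise-linear curve and not SoX's compand
implementation. A plain gain commutes with any pre-scaling of the
signal, while a level-dependent transfer function does not.

```c
/* Plain gain: every sample is scaled by the same factor,
 * regardless of its absolute level. */
static double apply_gain(double x, double gain)
{
    return x * gain;
}

/* Toy compander (hypothetical curve, NOT SoX's compand):
 * magnitudes up to `knee` are boosted by `boost`; above the knee the
 * curve flattens so that full scale (1.0) still maps to 1.0.
 * Assumes 0 < knee < 1 and knee * boost <= 1. */
static double toy_compand(double x, double knee, double boost)
{
    double mag = x < 0 ? -x : x;
    double slope = (1.0 - knee * boost) / (1.0 - knee);
    double out = mag <= knee ? mag * boost
                             : knee * boost + (mag - knee) * slope;
    return x < 0 ? -out : out;
}
```

With knee = 0.01 and boost = 20, an input of 0.01 comes out at about
0.2, i.e. 20 times louder, while an input of 1.0 stays at about 1.0.
Halving the input before apply_gain simply halves the output; halving
it before toy_compand moves it to a different part of the curve.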

> Doug Cook <idi...@us...>:
>> That said, it is definitely interesting to consider whether sox's
>> current preferred encoding is the right one. Seeing as how you
>> essentially can't do anything useful in sox without the sample being
>> converted to a double at least once (for example, any gain adjustment
>> involves a floating-point multiply), I'm pretty confident that the
>> best choice of preferred encoding for sox is a floating-point format,

I tend to agree, seeing how much processing time is burned by
SOX_ROUND_CLIP_COUNT alone. On the other hand, format conversion
without applying any effects would take longer in most cases if all
data had to pass through floating point.
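
For readers who haven't looked at it, this is roughly the kind of
per-sample work I mean. The function below is a sketch written for
this mail, assuming 32-bit integer samples; its name and signature
are hypothetical and it is not the definition from sox.h.

```c
#include <stdint.h>

/* Round a double to the nearest 32-bit integer sample (halves away
 * from zero), clamp it to the int32 range, and count every clamp.
 * Hypothetical helper for illustration; NaN input is not handled. */
static int32_t round_clip_count(double d, uint64_t *clips)
{
    if (d >= (double)INT32_MAX + 0.5) {
        ++*clips;
        return INT32_MAX;
    }
    if (d <= (double)INT32_MIN - 0.5) {
        ++*clips;
        return INT32_MIN;
    }
    return (int32_t)(d < 0 ? d - 0.5 : d + 0.5);
}
```

Two comparisons, a branch, an add and a conversion for every sample,
every time an effect hands data back in integer form.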

Perhaps only the effects chain should use FP, not the format handlers.  
Then again, there are some file formats that store FP data ...

>> not an integer format. That means it would be a fight between float32
>> (float) and float64 (double).
...
>> A float64-based sox might take a few
>> milliseconds longer to encode something, but it probably wouldn't
>> affect me. In fact, on an x86, there might not be any difference at
>> all. But on an arm chip, or perhaps in the future when somebody wants
>> to make sox do SSE-optimized (or GPU-optimized) vectorized effects
>> calculations, the difference between a 32-bit and 64-bit float might
>> be more significant.

I'm not so sure about the x86 claim. My concern is not the CPU
cycles but the cache: its capacity in samples effectively halves when
you go from 4 bytes to 8 bytes per sample. Cache size is very
important; just try increasing --buffer until the data doesn't fit
anymore.
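
The arithmetic is trivial, but to put numbers on it, here is a tiny
sketch. The helper name and the 32 KiB cache size used in the note
below are assumptions for illustration, not measurements of any
particular CPU.

```c
#include <stddef.h>

/* Number of whole samples that fit into a cache of `cache_bytes`
 * bytes when each sample occupies `bytes_per_sample` bytes.
 * Hypothetical helper; ignores cache lines, associativity and
 * everything else that competes for the cache. */
static size_t samples_fitting(size_t cache_bytes, size_t bytes_per_sample)
{
    return cache_bytes / bytes_per_sample;
}
```

For an assumed 32 KiB cache, samples_fitting(32768, 4) gives 8192
samples and samples_fitting(32768, 8) gives 4096.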

In principle, I'm in favour of keeping as much precision as possible,
so that would mean float64. But as float32 and float64 behave very
similarly from a C perspective, it might be possible to offer both
float32 and float64 as a compile-time option. Most code would simply
use sox_sample_t and, where necessary, macros like
SOX_FLOAT_32BIT_TO_SAMPLE, as today; these would be #defined away in
the float32 version. Only a few functions (like those that compute
sample precision) would need to look under the hood.
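
A minimal sketch of what I have in mind, assuming a build-time switch.
All names here (DEMO_SAMPLE_FLOAT32, demo_sample_t,
DEMO_FLOAT32_TO_SAMPLE and the two functions) are hypothetical and
only stand in for whatever SoX would actually use.

```c
#include <float.h>
#include <stddef.h>

/* Hypothetical compile-time switch: build with -DDEMO_SAMPLE_FLOAT32
 * for float32 samples; the default is float64. */
#ifdef DEMO_SAMPLE_FLOAT32
typedef float demo_sample_t;
#define DEMO_FLOAT32_TO_SAMPLE(f) (f)                  /* no-op */
#else
typedef double demo_sample_t;
#define DEMO_FLOAT32_TO_SAMPLE(f) ((demo_sample_t)(f)) /* widen */
#endif

/* Typical effect code: identical source for both builds. */
static void demo_gain(demo_sample_t *buf, size_t n, demo_sample_t gain)
{
    for (size_t i = 0; i < n; ++i)
        buf[i] *= gain;
}

/* One of the few places that has to look under the hood:
 * significand bits of the sample type actually in use. */
static int demo_sample_mantissa_bits(void)
{
    return sizeof(demo_sample_t) == sizeof(float) ? FLT_MANT_DIG
                                                  : DBL_MANT_DIG;
}
```

Effects like demo_gain never need to know which type was chosen; only
helpers such as demo_sample_mantissa_bits would differ in behaviour
between the two builds.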

Perhaps additionally supporting float128 (double double) would make  
SoX attractive to certain audiophiles. :-)

> With ARM getting more relevant, though, I guess it is important to
> take it into consideration and I've got no clue how it handles double
> or prefetching.

I think you can't speak of "ARM" by itself; it's more like a  
collection of building blocks that a vendor may choose from. They  
certainly offer fast float64 logic, too, but a vendor may refrain from  
including it for cost or power usage reasons.

>> The second unanswered question is how much any of this matters. A
>> floating-point format would make some normalization tasks easier, but
>> the current system also works fine. And just as my ears can't tell the
>> difference between float32 and float64, they probably also can't tell
>> the difference between int32 and float64. Where would this fit in a
>> list of sox feature requests? For my needs, it wouldn't rank very
>> high.

I agree, also regarding priority. I'd rather see it as a speed  
optimization for effects processing.

Ulrich