In some cases the bit depth values printed by the "stats" effect
change when the input file is inverted. I think this is a bug: as I
understand it, the "bit depth" figures are meant to be a property of
the signal as a whole and should be independent of a change of sign,
like other level measurements such as "Pk lev dB", and unlike
"Min/Max level" and "DC offset", which are clearly signed values.
You can show this behaviour in a very simple way by feeding just
two 16-bit samples into the "stats" effect, one being zero and
the other 0x3ff0 (positive):
    $ echo -ne '\xf0\x3f\0\0' \
    > | sox -t raw -e signed -b 16 -r 48000 - -n stats
    ...
    Bit-depth      10/12
    ...
This is what I would expect, consistent with the explanation of the "Bit-depth"
figures in "man sox". However, if we invert the signal, using for
example "-v -1":
    $ echo -ne '\xf0\x3f\0\0' \
    > | sox -t raw -e signed -b 16 -r 48000 -v -1 - -n stats
    ...
    Bit-depth      11/12
    ...
and to me "11" looks one bit too many.
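To see why the sign can matter at all, it may help to look at the raw
two's-complement bit patterns of the sample and of its negation. The
helper below is only an illustration; "bits16" is a made-up name and
is not part of sox.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper (not from sox): write the 16 bits of v,
 * most significant bit first, into out as a NUL-terminated string. */
void bits16(int16_t v, char out[17])
{
    uint16_t u = (uint16_t)v;   /* two's-complement bit pattern of v */

    assert(out != NULL);
    memset(out, '0', 16);
    for (int i = 0; i < 16; ++i)
        if (u & (0x8000u >> i))
            out[i] = '1';
    out[16] = '\0';
}
```

For 0x3ff0 this yields 0011111111110000, while for -0x3ff0 (0xc010)
it yields 1100000000010000. The magnitude is the same, but the raw
patterns differ, so a calculation that looks at the raw pattern rather
than at the magnitude can plausibly count a different number of bits.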
The root reason for my objection to this behaviour is that I am using
the "stats" effect in sox to assess how two PCM files differ. I do
this by using the '--combine mix' option and the "-v -1" option before
one of the files to feed "stats" with the difference signal. However,
given the current behaviour of the bit-depth calculation, in some
cases this result depends on which one of the two files is inverted,
which makes the measurement unreliable.
The code responsible for this is at line 133 of "stats.c":
Working with 16-bit samples ('-b 16'), for the sample sequence [2, -2, 2, -2] this code gives a bit depth of "1/15", as expected, while for the sequence [-2, -2, -2, -2] it produces "15/15", which is clearly wrong.
I propose the following change:
This clarifies the meaning of the "bit depth" and makes it independent of the sign of individual samples. The code modified as above yields "1/15" for all sequences [2, -2, 2, -2], [2, 2, 2, 2] and [-2, -2, -2, -2].
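As an illustration of what "independent of the sign" could mean in
practice (this is a standalone sketch, not the patch itself and not
code from "stats.c"), one possible approach is to OR together the
magnitudes of all samples and count bits on that mask. The names
"depth16_t" and "bit_depth16" are hypothetical, the sketch assumes
plain 16-bit input, and capping x at 15 is an assumption made here so
that the sign bit is never counted in x.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical result type: "used/overall", i.e. the x/y figures. */
typedef struct {
    unsigned used;      /* x: span of magnitude bits actually used */
    unsigned overall;   /* y: 16 minus the always-zero low bits    */
} depth16_t;

/* Sketch only: sign-independent bit depth of n 16-bit samples. */
depth16_t bit_depth16(const int16_t *s, size_t n)
{
    depth16_t d = {0, 0};
    uint32_t mask = 0;

    /* OR the magnitudes together; widen first so -32768 is safe. */
    for (size_t i = 0; i < n; ++i) {
        int32_t v = s[i];
        mask |= (uint32_t)(v < 0 ? -v : v);
    }
    if (mask == 0)
        return d;                       /* digital silence: 0/0 */

    unsigned low = 0, high = 16;
    while (!(mask & (1u << low)))       /* lowest bit ever set  */
        ++low;
    while (!(mask & (1u << high)))      /* highest bit ever set */
        --high;

    d.used = high - low + 1;
    if (d.used > 15)                    /* assumption: x excludes sign bit */
        d.used = 15;
    d.overall = 16 - low;
    return d;
}
```

With this sketch the sequences [2, -2, 2, -2], [2, 2, 2, 2] and
[-2, -2, -2, -2] all give 1/15, and the samples 0x3ff0 and -0x3ff0
both give 10/12.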
It would be good to indicate in the 'man' page that in the "x/y" notation y includes the sign bit, while x does not. So for a 16-bit signal that fills the entire dynamic range one would get "15/16", and never "16/16".
Last edit: Giuseppe Scelsi 2015-10-16
I have revised my patch to better behave in some corner cases and I
also wrote a test script that I am attaching. Comments at the bottom
of the script indicate in which test cases the output of my
implementation differs from sox 14.4.1. I also slightly reworded the
explanation in the man page to make it more consistent with the new
behaviour.
I believe this gives sane, consistent results: