#274 Codec bug in IMA and OKI ADPCM algorithms.

Status: open

Owner: nobody

Labels: None

Priority: 5

Updated: 2025-08-13

Created: 2016-04-20

Creator: John Brandwood

Private: No

Hi,

I'm afraid to report that the current v14.4.2 version of SoX has a bug in its IMA and OKI ADPCM compression/decompression codecs.

The algorithm that you're using in "adpcms.c" doesn't match the bit-rounding that's implicit in the IMA and OKI specifications.

The result is that when I decode an IMA/OKI file with SoX I get a low-frequency "slinky" adjustment, with the whole waveform moving up and down across the zero line.

The same problem is seen when I encode a WAV file into an IMA/OKI file with SoX and then decompress it with a different IMA/OKI-compliant utility.

As I said, the problem is in your codec in "adpcms.c". Here's an example of the math error (using an example IMA step-size of 21) ...

SOX IMA-ADPCM

p->setup.steps[p->step_index] = 21 (for example)

// code = 0 1 2 3 4 5 6 7

int s = ((code & (p->setup.sign - 1)) << 1) | 1;

// s = 1 3 5 7 9 11 13 15

s = (p->setup.steps[p->step_index] * s);

// s = 21 63 105 147 189 231 273 315

s = (s >> (p->setup.shift + 1)) & p->setup.mask;

// s = 2 7 13 18 23 28 34 39

IMA-ADPCM SHOULD BE ...

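(step ) = 21
(step >> 1) = 10
(step >> 2) = 5
(step >> 3) = 2

// delta = (step >> 3), plus (step) if bit 2 of code is set, plus (step >> 1) if bit 1 is set, plus (step >> 2) if bit 0 is set

// s = 2 7 12 17 23 28 33 38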

The small error between your code and the official spec is causing a slow-drift in the math over time ... leading to the "slinky" result.
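
The two calculations above can be put side by side in a few lines of C. This is only a sketch: the function names are made up for illustration, and the fixed shift of 3 is an assumption chosen because it reproduces the values quoted in this report (it corresponds to `p->setup.shift + 1` with no masking).

```c
#include <assert.h>
#include <stdio.h>

/* Delta magnitude in the style of the adpcms.c lines quoted above.
 * Assumption: (shift + 1) == 3 and the mask keeps every bit. */
int delta_multiply(int step, int code)
{
    assert(code >= 0 && code <= 7);
    int s = (code << 1) | 1;          /* 1, 3, 5, ..., 15 */
    return (step * s) >> 3;           /* one truncation, at the very end */
}

/* Delta magnitude in shift-and-add form: every term is truncated
 * before it is added to the total. */
int delta_shift_add(int step, int code)
{
    assert(code >= 0 && code <= 7);
    int d = step >> 3;
    if (code & 4) d += step;
    if (code & 2) d += step >> 1;
    if (code & 1) d += step >> 2;
    return d;
}

/* Print both rows for one step size so they can be compared by eye. */
void print_delta_table(int step)
{
    printf("multiply :");
    for (int code = 0; code < 8; code++)
        printf(" %d", delta_multiply(step, code));
    printf("\nshift-add:");
    for (int code = 0; code < 8; code++)
        printf(" %d", delta_shift_add(step, code));
    printf("\n");
}
```

Calling `print_delta_table(21)` from a `main()` gives the two rows of values listed in the report; they differ by one for codes 2, 3, 6 and 7.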

Please note that just encoding/decoding the same file with SoX produces "correct-looking" results because both the encoder and the decoder share the same "incorrect" algorithm.

I'm attaching screengrabs from Audacity showing the effect, and here are the SoX command lines that produce the bad files ...

sox -V -V ICouldJustHaveYouRevoked16K.wav -e oki-adpcm ICouldJustHaveYouRevoked.vox

sox: SoX v14.4.2
time: Feb 22 2015 15:05:01
compiler: gcc 4.9.2 20141030 (Fedora MinGW 4.9.2-1.fc21)
arch: 1248 48 44 L OMP
sox INFO formats: detected file format type `wav'
sox DBUG wav: Searching for 66 6d 74 20
sox DBUG wav: WAV Chunk fmt
sox DBUG wav: Searching for 64 61 74 61
sox DBUG wav: WAV Chunk data
sox DBUG wav: Reading Wave file: Microsoft PCM format, 1 channel, 16000 samp/sec

sox DBUG wav: 32000 byte/sec, 2 block align, 16 bits/samp, 8684800 data bytes
sox DBUG wav: 4342400 Samps/chans
sox DBUG wav: Searching for 4c 49 53 54

Input File : 'ICouldJustHaveYouRevoked16K.wav'
Channels : 1
Sample Rate : 16000
Precision : 16-bit
Duration : 00:04:31.40 = 4342400 samples ~ 20355 CDDA sectors
File Size : 8.68M
Bit Rate : 256k
Sample Encoding: 16-bit Signed Integer PCM
Endian Type : little
Reverse Nibbles: no
Reverse Bits : no

Output File : 'ICouldJustHaveYouRevoked.vox'
Channels : 1
Sample Rate : 16000
Precision : 12-bit
Duration : 00:04:31.40 = 4342400 samples ~ 20355 CDDA sectors
Sample Encoding: 4-bit OKI ADPCM
Reverse Nibbles: no
Reverse Bits : no
Comment : 'Processed by SoX'

sox INFO sox: effects chain: input 16000Hz 1 channels (multi) 16 bits 00:04:31.40
sox INFO sox: effects chain: dither 16000Hz 1 channels 12 bits 00:04:31.40
sox INFO sox: effects chain: output 16000Hz 1 channels (multi) 12 bits 00:04:31.40
sox DBUG sox: start-up time = 0.030002

sox -V -V -r 16000 -e oki-adpcm xanadu1_adpcm_2a.vox xanadu1_adpcm_2a.wav

sox: SoX v14.4.2
time: Feb 22 2015 15:05:01
compiler: gcc 4.9.2 20141030 (Fedora MinGW 4.9.2-1.fc21)
arch: 1248 48 44 L OMP

Input File : 'xanadu1_adpcm_2a.vox'
Channels : 1
Sample Rate : 16000
Precision : 12-bit
Duration : 00:01:29.34 = 1429504 samples ~ 6700.8 CDDA sectors
File Size : 715k
Bit Rate : 64.0k
Sample Encoding: 4-bit OKI ADPCM
Reverse Nibbles: no
Reverse Bits : no

sox DBUG wav: Writing Wave file: Microsoft PCM format, 1 channel, 16000 samp/sec

sox DBUG wav: 32000 byte/sec, 2 block align, 16 bits/samp

Output File : 'xanadu1_adpcm_2a.wav'
Channels : 1
Sample Rate : 16000
Precision : 16-bit
Duration : 00:01:29.34 = 1429504 samples ~ 6700.8 CDDA sectors
Sample Encoding: 16-bit Signed Integer PCM
Endian Type : little
Reverse Nibbles: no
Reverse Bits : no
Comment : 'Processed by SoX'

sox INFO sox: effects chain: input 16000Hz 1 channels (multi) 12 bits 00:01:29.34
sox INFO sox: effects chain: output 16000Hz 1 channels (multi) 16 bits 00:01:29.34
sox DBUG sox: start-up time = 0.025001
sox WARN adpcms: xanadu1_adpcm_2a.vox: ADPCM state errors: 7

2 Attachments

ICouldJustHaveYouRevoked.png

xanadu1_adpcm_2a.png

Discussion

John Brandwood - 2016-04-20

I've just checked the SoX source history on SourceForge. This bug was introduced sometime between versions 12.18.2 and 12.99.10, when "ima_rw.c" and "vox.c" were merged into "adpcms.c" and the reason why the STRICT_IMA flag was critical to the math was somehow missed.


Jeff Calwood - 2021-02-28

It seems the problem is this; I refer to https://wiki.multimedia.cx/index.php?title=IMA_ADPCM for background.
The site states that ADPCM codecs perform the calculation
[1] diff = ((sign/mag.)nibble + 0.5) * step / 4
The author then describes possible optimizations, depending on whether the hardware can compute multiplications efficiently or not. The author (or an editor) notes that the two optimizations don't compute the same result, and he is correct: the optimization for hardware with poor multiplication capabilities introduces rounding effects (errors?) compared with formula [1].
It now appears that the IMA ADPCM reference implementation (linked in the above page) implements the math with rounding effects (errors?), while SoX's adpcms.c avoids the rounding error.
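
To make the two rounding orders concrete, formula [1] can be written out term by term. The symbols m (the 3-bit magnitude of the nibble) and b_2, b_1, b_0 (its bits) are introduced here only for illustration, and the sign bit is left out:

```latex
\[
\left(m + \tfrac{1}{2}\right)\frac{\mathrm{step}}{4}
  = \frac{(2m+1)\,\mathrm{step}}{8}
  = \frac{\mathrm{step}}{8} + b_2\,\mathrm{step}
    + b_1\,\frac{\mathrm{step}}{2} + b_0\,\frac{\mathrm{step}}{4},
\qquad m = 4b_2 + 2b_1 + b_0 .
\]
\[
\text{single truncation: }
\left\lfloor \frac{(2m+1)\,\mathrm{step}}{8} \right\rfloor
\qquad
\text{per-term truncation: }
\left\lfloor \frac{\mathrm{step}}{8} \right\rfloor + b_2\,\mathrm{step}
  + b_1 \left\lfloor \frac{\mathrm{step}}{2} \right\rfloor
  + b_0 \left\lfloor \frac{\mathrm{step}}{4} \right\rfloor
\]
\[
\mathrm{step} = 21,\ m = 2:\qquad
\left\lfloor \tfrac{105}{8} \right\rfloor = 13
\quad\text{vs.}\quad
\left\lfloor \tfrac{21}{8} \right\rfloor + \left\lfloor \tfrac{21}{2} \right\rfloor = 2 + 10 = 12 .
\]
```

Because a sum of truncated terms can never exceed the truncated sum, the per-term form is always less than or equal to the single-truncation form.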


John Brandwood - 2021-10-24

Jeff Calwood wrote: "It now appears that the IMA ADPCM reference implementation (linked in the above page) implements the math with rounding effects (errors?), while SoX's adpcms.c avoids the rounding error."

The problem is that the author of that wiki article and the person who merged "ima_rw.c" and "vox.c" into "adpcms.c" are both wrong. The IMA reference implementation is correct, specifically because it is a reference implementation.

The algorithm was originally designed to be implemented on cheap hardware using nothing more than bit-shifts for multiplication and division, and with integer-only precision.

SoX's "adpcms.c" provides too much precision, and breaks the algorithm.

Sure, you can certainly do that (and the mathematical results will be more accurate) ... but then it isn't really IMA or Dialogic ADPCM anymore, and the output will not play properly on all of the hardware (and software) that implements the IMA algorithm.
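
A rough way to see why a mismatch between implementations matters is to decode one nibble stream with both delta rules and watch the predictors separate. The sketch below is an assumption-laden toy, not SoX's decoder: the type and function names are hypothetical, the step and index tables are the commonly published IMA ones, and the nibble stream (3, 10, 3, 10, ...) is contrived so that the step size stays at its minimum of 7.

```c
#include <assert.h>
#include <stdio.h>

/* Commonly published IMA step table (89 entries). */
static const int ima_steps[89] = {
    7, 8, 9, 10, 11, 12, 13, 14, 16, 17, 19, 21, 23, 25, 28, 31, 34, 37,
    41, 45, 50, 55, 60, 66, 73, 80, 88, 97, 107, 118, 130, 143, 157, 173,
    190, 209, 230, 253, 279, 307, 337, 371, 408, 449, 494, 544, 598, 658,
    724, 796, 876, 963, 1060, 1166, 1282, 1411, 1552, 1707, 1878, 2066,
    2272, 2499, 2749, 3024, 3327, 3660, 4026, 4428, 4871, 5358, 5894,
    6484, 7132, 7845, 8630, 9493, 10442, 11487, 12635, 13899, 15289,
    16818, 18500, 20350, 22385, 24623, 27086, 29794, 32767
};
static const int ima_index_adjust[8] = { -1, -1, -1, -1, 2, 4, 6, 8 };

typedef struct { int predictor; int index; } adpcm_state; /* hypothetical */

int clamp_int(int v, int lo, int hi)
{
    return v < lo ? lo : (v > hi ? hi : v);
}

/* Decode one 4-bit nibble. use_shift_add selects per-term truncation;
 * otherwise the multiply-then-shift form is used. */
int decode_nibble(adpcm_state *st, int nibble, int use_shift_add)
{
    assert(nibble >= 0 && nibble <= 15);
    int step = ima_steps[st->index];
    int code = nibble & 7;
    int delta;
    if (use_shift_add) {
        delta = step >> 3;
        if (code & 4) delta += step;
        if (code & 2) delta += step >> 1;
        if (code & 1) delta += step >> 2;
    } else {
        delta = (step * ((code << 1) | 1)) >> 3;
    }
    if (nibble & 8) delta = -delta;               /* sign bit */
    st->predictor = clamp_int(st->predictor + delta, -32768, 32767);
    st->index = clamp_int(st->index + ima_index_adjust[code], 0, 88);
    return st->predictor;
}

/* Decode the contrived stream 3, 10 repeated `repeats` times with both
 * rules and return how far apart the two predictors end up. */
int divergence_after(int repeats)
{
    adpcm_state shift_add = { 0, 0 }, multiply = { 0, 0 };
    for (int i = 0; i < repeats; i++) {
        decode_nibble(&shift_add, 3, 1);
        decode_nibble(&shift_add, 10, 1);
        decode_nibble(&multiply, 3, 0);
        decode_nibble(&multiply, 10, 0);
    }
    return multiply.predictor - shift_add.predictor;
}
```

With this particular stream the gap grows by one per repeated pair, so the two outputs drift apart steadily even though each individual delta differs by very little. Real files will not behave this regularly; the point is only that the per-sample differences accumulate in the predictor instead of cancelling.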


Martin Guy - 2025-08-13

Thanks for reporting. https://codeberg.org/sox_ng/sox_ng/issues/562

I've tried encoding a short sine wave with

sox -n 440.wav synth 1 sine
ffmpeg -i 440.wav -acodec adpcm_ima_wav 440-adpcm.wav
sox 440-adpcm.wav -e signed -b 16 440-adpcm-s16.wav

but the result looks like a centered sine wave.

Also, regarding your statement that this bug was introduced sometime between versions 12.18.2 and 12.99.10, when "ima_rw.c" and "vox.c" were merged into "adpcms.c": I'm assuming it came in as part of the new adpcms.c, but I am having trouble finding the line in adpcms.c that needs correcting.

Do you have one of the test files that produces slinky output?