Please correct me if I'm wrong here, but 4095 bytes is
_per granule per channel_ hard limit for MP3 format, not
a _per granule_ hard limit.
on_pe(), however, sets MAX_BITS per the entire granule.
This does not affect CBR because CBR_iteration_loop()
does not use the return value, but in VBR mode, this
sets a contrived constraint on the frame size in mid/side
stereo mode (reduce_size() clamps estimates so they
don't exceed mxb = on_pe()). So in quantize_pvt.c:
on_pe():
if (max_bits > MAX_BITS) /* hard limit per granule */
max_bits = MAX_BITS;
should probably be
if (max_bits > gfc->channels_out * MAX_BITS) /* hard
limit per granule */
max_bits = gfc->channels_out * MAX_BITS;
The same goes with ABR: in calc_target_bits,
reduce_side (targ_bits[gr], ms_ener_ratio[gr],
mean_bits*gfc->channels_out,
MAX_BITS);
should probably be
reduce_side (targ_bits[gr], ms_ener_ratio[gr],
mean_bits*gfc->channels_out,
gfc->channels_out * MAX_BITS);
It looks like no other changes need made because both
on_pe() and reduce_side() make sure that no single
channel gets more than MAX_BITS.
(un)related suggestion: unless (gfp->strict_ISO), lose all
artificial frame size constraints altogether. I.e. in
reservoir.c:ResvFrameBegin():
maxmp3buf = 8*2047;
instead of
maxmp3buf = 8*1440;
(do not restrict frame size below the actual limit
supported by the format), and in reservoir.c:
ResvMaxBits():
if (gfp->strict_ISO) {
*extra_bits =
(ResvSize < (gfc->ResvMax*6)/10 ? ResvSize :
(gfc->ResvMax*6)/10);
}
else {
*extra_bits = ResvSize;
}
instead of
*extra_bits =
(ResvSize < (gfc->ResvMax*6)/10 ? ResvSize : (gfc-
>ResvMax*6)/10);
(do not restrict available reservoir size that can be used
in 1 granule to 0.6 of the reservoir capacity).
In addition: in 320 kbps CBR, ISO says there should be
no reservoir (obviously, because at 320kbps 48 khz frame
length is 8*960 == 7680, more at lower samplerates, and
the absolute imposed maximum is 7680), but if we are
not to limit max framesize to framesize at 320 kbps at
the specified samplerate (as per GBouvigne's
interpretation) and use the absolute maximum of 8*2097,
then reservoir at 320 kbps is possible. So, in set_get.c:
lame_set_bitrate(), we can remove:
if (brate >= 320) {
gfp->disable_reservoir = 1;
}
This is bad practice anyway, as behavior depends on the
order the user passes parameters -- could well do
lame_set_bitrate(320) and then
lame_set_disable_reservoir(0).
Instead, we can modify the decision logic in reservoir.c:
ResvFrameBegin():
if (gfc->ResvMax < 0 || gfp->disable_reservoir)
gfc->ResvMax = 0;
replace with
if (gfc->ResvMax < 0 || gfp->disable_reservoir || (gfp-
>brate >= 320 && gfp->strict_ISO))
gfc->ResvMax = 0;
I understand this is redundant because in strict ISO at
320 kbps GBouvigne's formula will calculate maxmp3buf
equal to frameLength anyway, but what if something
decides to muck with the FPU control word and we
suddenly decide to round our floats up instead of down.
Finally, in reservoir.c:ResvFrameBegin(), we will need to
replace
if (gfp->brate >= 320) {
with
if (gfp->brate >= 320 && gfp->free_format) {
(from the comments, this was apparently meant for free
format, but not resulting in any bogus behavior currently
because bit reservoir is identically disabled for 320 kbps
anyway).
Logged In: YES
user_id=499949
Correction: I should have been more general.
maxmp3buf = gfc->mode_gr * gfc->channels_out * MAX_BITS
Logged In: YES
user_id=1071
From ISO mp3 specification (ISO 11172-3), section 2.4.3,
The Audio Decoding Process
>Buffer considerations
>The following rule can be used to calculate the maximum
>number of bits used for one granule:
:
>The actual maximum deviation is 2**9 * 8 bit = 4096 bits.
:
>The exchange of buffer between the left and right channel in
>a stereo bitstream is allowed without restrictions.
So, I think 4095 "bits" is the maximum for a granule. This
limitation is used to detemine the size of the decoding buffer
and the max. computation power decoders needed to decode
the huffman code.
Logged In: YES
user_id=499949
But then, shouldn't LR stereo also be limited to 4095 bytes/
granule? Because currently it looks like MS stereo is limited
and LR is not, and "stereo bitstream" does not specifically
refer to MS(+IS) or LR.
"One granule" -- does this refer to both channels though, or to
one channel? One would think that "granule" would be referring
to a single decodable atom, regardless of the number of
channels in the bitstream ("This value is used as the
maximum buffer _per channel_ at the lower bitrates" -- does
that refer to stereo, or does that refer to dual channel?).
Otherwise it just doesn't seem right, because we are artificially
limiting ourselves to twice as few bits (on the average) as the
format can handle.
Besides, in that context, 4096 bits seems to refer to the
reservoir size ("deviation"). But reservoir (deviation) is per
frame, not per granule; main_data_end is stored in a frame
header (2.4.1.7) and is a 9-bit value, hence the maximum
_frame_ size variability of 4096 bits, byte aligned (which LAME
does observe correctly); the rest of the deviation (as 4608 -
4095 = 513 bits in that example) must be padded. But this
has nothing to do with how bits are allocated _within_ the
frame, or does it? -- because frame boundaries, not granule
boundaries, are synchronization points, so bit allocations
within the frame are irrelevant to the stream buffer size.
Of course, with the exception of part2_3_length, which is a 12-
bit value, hence the maximum of 4095 bits per granule per
channel.
Which decoders would you recommend that are most
capricious with respect to buffer size, that I could test with? I
experimented with fatboy.wav and had the bit reservoir enabled
at 320 kbps (which is directly against the spec, but the format
allows it, so why not); my theoretically allowed maximum
frame size was 2*2*4095 = 16380 bits, and of course the
practically allowed was around 12 kbits (around 8 kbits mean
frame size and 4 kbits reservoir); I assume I have to set the
bitrate to 550 (reservoir disabled) to actually reach the
theoretical maximum. But even at 320 kbps, I had a good
number of frames approaching 11 kbits (which guarantees that
at least one granule had more than 4 kbits for both channels),
and both Winamp 5.08 and stock Windows decoder (FhG)
appear to handle that correctly. The next thing would probably
be to try this on my GPX portable CD/mp3 player (a very
mediocre brand) to see how it would fare...
Logged In: YES
user_id=499949
Correction: Pardon me, max reservoir is actually 4088 bits
byte aligned (0*8 .. 511* 8) -- ISO example does not seem to
take into account that it's zero based -- but that's correct in
LAME anyway.
Logged In: YES
user_id=499949
Upon further testing, I'll take back the part with "handling
correctly"; apparently FhG does sometimes overrun its buffer
trying to decode 320 kbps with full bit reservoir (4088),
although Winamp's Nitrane does play alright, and so does
LAME's mpglib (no bitstream desyncs). I'll try to rig LAME to
produce some oversized (> 4095 over both channels) granules
within acceptable frame size and test it out more thoroughly
with FhG and some mediocre portables.
Logged In: YES
user_id=499949
Done; hacked the code to produce 5.5k on the second granule
(over both channels -- at the expense of the first granule) at
320 kbps. Tested with FhG, Nitrane (winamp), mpglib
(mplayerc), and a portable player (GPX). All appear to handle
this correctly (no desyncs, no obvious glitches, all sound
about the same). Sample attached.
Large MS granule test sample
Logged In: YES
user_id=1071
Thank you for your suggestions.
I changed the code on the experimental branch to ignore the
MAX_BITS when not strictly-ISO mode.
The bug report itself, I close this one. Further discussion
about MAX_BITS, please use lame-dev ML.