From: Pa3PyX <pa3...@ic...> - 2005-11-22 22:18:12
Martin:

I didn't finish my prior email and hit "send" by mistake. Sorry. This one corrects the mistakes.

As you mentioned, the paper you are quoting (from 1997) is not an official standard, though. In the Scope section, it explicitly states:

"Note: If anything in this document does not meet the specifications in ISO/IEC 11172-3 or ISO/IEC 13818-3.2, then the definitions in these documents are valid. On the other hand, some mistakes, contradictions and unclarities were tried to be solved in this document."

Of course, the logical way to resolve contradictions and unclarities while remaining compliant is simply to be on the safe side and take the strictest possible interpretation. The painful thing is that the ISO 11172-3 standard itself, to my knowledge, was never revised based on this paper or any other feedback.

Robert,

The thing I noticed is that in CBR 320, because the first frame has no padding, you will still have 8 or so bits of reservoir, even with strict-ISO, unless you explicitly disable the bit reservoir. There is a hack in set_get.c, lame_set_brate(), that does this in all cases, freeformat or not:

    int CDECL
    lame_set_brate( lame_global_flags* gfp, int brate )
    {
        gfp->brate = brate;

        if (brate >= 320) {
            gfp->disable_reservoir = 1;
        }

        return 0;
    }

I'm not sure whether having some bits of bit reservoir on the first frame is per standard or not. I don't see why not: the frame will still fit in the buffer, and the subsequent frames will be constant size even if the first frame happens to save some bits. But who knows which decoders assume main_data_begin to be identically 0 at 320 kbps, because the standard says that at 320 kbps the frame is of constant length, and blow up on such a stream.
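One plausible reading of the "8 or so bits" above, as a minimal sketch rather than LAME's actual code: at 320 kbps and 44.1 kHz (the sample rate is my assumption, it is not stated above) a Layer III frame is 1044 bytes without padding and 1045 bytes with it, so a buffer limit sized for the padded frame leaves one spare byte on an unpadded first frame. The helper names frame_bytes() and spare_bits() are hypothetical.

```c
/* Sketch only: these helpers are illustrative and are not part of LAME. */

/* MPEG-1 Layer III frame length in bytes:
 * floor(144 * bitrate / samplerate), plus one slot if the padding bit is set. */
static int frame_bytes(int bitrate_bps, int samplerate_hz, int padding)
{
    return (144 * bitrate_bps) / samplerate_hz + (padding ? 1 : 0);
}

/* Bits left over when a frame of frame_len_bytes is placed in a buffer
 * of limit_bits. Assumption: the limit is the size of a padded frame. */
static int spare_bits(int limit_bits, int frame_len_bytes)
{
    int spare = limit_bits - 8 * frame_len_bytes;
    return spare > 0 ? spare : 0;
}
```

With these, spare_bits(8 * frame_bytes(320000, 44100, 1), frame_bytes(320000, 44100, 0)) gives 8. At 48 kHz the same bitrate gives exactly 960 bytes per frame, so padding never occurs and nothing is left over.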
My personal opinion (if anyone cares) is that LAME should be on the safe side:

1) in freeformat, allow unrestricted use of frame buffer and bit reservoir (maxmp3buf = gfc->mode_gr * gfc->channels_out * 4095) [<- someone correct me if I'm wrong here; does freeformat allow bit reservoir at all?]. Freeformat is what it says -- freeformat; right?
2) otherwise, if strict-ISO, set maxmp3buf = 7680 (regardless of the sample rate);
3) otherwise (non-strict-ISO), calculate maxmp3buf based on the sample rate (as does the current strict-ISO formula).

Of course, it's up to the developers here to decide; this is probably still too restrictive.

By the way: there is yet another "kludge" in the bit reservoir code, this time in ResvMaxBits():

    /* amount from the reservoir we are allowed to use. ISO says 6/10 */
    *extra_bits =
        (ResvSize < (gfc->ResvMax*6)/10 ? ResvSize : (gfc->ResvMax*6)/10);

I wasn't able to find any such restriction in ISO 11172-3 (could someone point me to it?), and, in light of the buffering restrictions that already exist, it seems unnecessary -- especially when the reservoir size is already limited because of the high bitrate. I was able to remove it without encountering any problems.

Thanks,
Pa3PyX

<-----Original Message----->
>From: Ruckert Martin
>Sent: 11/22/2005 9:59:49 AM
>To: lam...@li...
>Subject: Re: [Lame-dev] Re: frame sizes and bitreservoir
>
>I try to review some material from the standard regarding the
>question of buffering:
>
>I guess that the buffer limitation is connected with the minimum delay
>requirements. ISO/IEC 11172-3 says in the Introduction:
>0.2 Layers
>....
>The theoretical minimum encoding/decoding delay for Layer I is about 19ms,
>.... for layer II is about 35 ms, ... for Layer III is about 59 ms.
>
>These decoding delays set constraints on the amount of allowed buffering.
>I do not recall, however, a specific mentioning of this connection to the
>maximum buffer in the standard.
>The standard specifies in section 2.4.3.4..4
>a Section that is specific to Layer III:
>"The buffer length is 7680 bits. This value is used as the maximum buffer at
>every bitrate" The section then continues to explain the difference between
>the maximum deviation from the bit_rate, determined by this buffer limit and
>the actual maximum deviation from the bit_rate determined by the maximum
>range of main_data_begin. And hints that from this you can calculate the
>delay[!].
>[The ISO document of 1993 I have reads a bit different than the document
>cited below.]
>
>In their paper "MPEG-Layer3 Bitstream Syntax and Decoding", the quasi
>standard for MPEG 2.5, Sieler and Sperschneider come up with
>a buffer size of even 9984 bit. The text reads:
>
>4.3.6 Buffer considerations
>
>According to ISO/IEC 11172-3 the required buffer size for Layer3 is 7680 bits.
>This value is derived from frame length at the highest allowed bitrate and
>sampling frequency. This value is valid for MPEG2 and MPEG2.5 as well.
>It has to be interpreted as follows:
>
>"The length of the actual frame plus the length of additional data caused
> by main_data_begin must not exceed 7680 bits. Therefore main_data_begin is
> limited for higher bitrates according to the formula:
>
>     max_mdb = max( 0, min( mdb_limit, 7680 - bitstream_frame_length))
>
>mdb is used as abbreviation for main_data_begin. mdb_limit represents the
>limitation of the backpointer due to the limitation of the main_data_begin
>bit field.
>mdb_limit is 4088 for MPEG1 (9 bit field) and 2040 for MPEG2 and MPEG2.5
>(8 bit field). bitstream_frame_length is the length of the current bitstream
>frame.
>max_mdb is directly connected with this length, which on its part depends on
>bitrate_index. If bitrate_index switches, the length of the bitstream frame
>changes, and simultaneously max_mdb may change.
>
>"For MPEG1 an additional restriction has to be applied: The length of one
> granule must not exceed 7680 bit.
> Within frames with a frame length less or
> equal to 7680 bit the bit exchange between the two granules is unlimited.
> For frames with a greater frame length the bit exchange is limited.
> E. g. a 320 kbit frame at 32 kHz, length 11520 bit, must use at least
> 11520 - 7680 = 3840 bit for each granule.
>
>"If two channels are contained in the bitstream (mode!='11') the bit
> exchange between the channels within a granule is unlimited.
>
>Layer3 supports two Basic coding modes:
>
>variable rate coding -- Variable rate coding means that the encoder may change
>the bitrate_index on a frame by frame basis depending on the psychoacoustic
>demands. All allowed bitrate indices (see Table 8) may be used. The frames must
>follow the rules given above. Decoding of variable bitstreams can only be done
>"on demand", i. e. the decoder communicates with the server the amount of data
>that is needed to continue decoding. A constant rate transmission is impossible
>with such bitstreams.
>
>constant rate coding -- This coding method is used for constant rate
>transmissions. For constant rate transmissions, Layer3 supports switching of the
>bitrate_index between nearby values in the bitrate table. This feature is used
>for two different purposes:
>
>1. Emulation of intermediate bitrates without using free format. E. g. a bitrate
>of 60 kbit/s can be achieved by continuous switching between 64 kbit/s and
>56 kbit/s.
>
>2. Asynchronous operation mode. Layer3 allows to work with bitstream clocks
>asynchronous to the sampling frequency of the input audio data. The asynchronity
>is intercepted by switching the bitrate_index. In the decoder the sampling
>frequency at the encoder input may then be reconstructed. Asynchronous
>bitrate_index switching requires usage of 3 nearby bitrate indices at most.
>E. g. to perform asynchronous operation at a bitrate of 64 kbit/s, the bitrates
>56 kbit/s and 80 kbit/s may be used as well to adjust the data rate.
>Such a bitstream will mainly consist of 64 kbit/s frames. After a number of
>64 kbit/s frames, a frame of 56 kbit/s or a frame of 80 kbit/s or a sequence
>of 80 kbit/s followed by 56 kbit/s (emulating a frame of 72 kbit/s) can occur.
>Then a number of 64 kbit/s frames is following a. s. o.
>
>A full ISO compliant decoder must support variable rate coding as well as both
>types of constant rate coding. Regarding buffer considerations that means that
>the bitstream input buffer must be designed in such a way that all possible
>bitrates are covered. Example: A 192 kbit/s bitstream in asynchronous operation
>mode may switch to 160 kbit/s or to 224 kbit/s to adjust the data rate. If the
>bitstream switches to 224 kbit/s the decoder must have access to the whole
>224 kbit/s frame (and not only the number of bits which corresponds to a
>192 kbit/s frame). If the bitstream switches to 160 kbit/s, a higher
>main_data_begin value is allowed according to the formula given above.
>The decoder must have access to all data which can be addressed by this
>main_data_begin (and not only the data which can be addressed by the
>main_data_begin value allowed at 192 kbit/s). The latter case becomes
>unimportant at bitrates where main_data_begin is limited due to the maximum
>size of the main_data_begin bit field for the bitrates affected. Worst case is
>the asynchronous operation mode at a target bitrate of 256 kbit/s at 48 kHz
>sampling frequency, with possible switches to 224 kbit/s and 320 kbit/s. The
>allowed backpointer of 224 kbit/s is 2304 bit, the frame length of 320 kbit is
>7680 bits.
>Therefore a range of 2304 bit + 7680 bit = 9984 bit may be accessed during the
>decoding process of one granule.
>
>For specific applications it may be known that asynchronous operation and/or
>bitrate_index switching are not used. The additional buffer needed for these
>features can then be omitted.
>
>Martin Ruckert
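The buffer arithmetic in the quoted section 4.3.6 can be checked with a short sketch. It only restates the max_mdb formula and the worked numbers from the quoted text, with all lengths in bits. The function names are hypothetical, and frame lengths are computed for MPEG1 Layer III (1152 samples per frame) with padding ignored, which is my simplification.

```c
/* Sketch only: restates the quoted formula, not code from LAME or the paper. */
#define LAYER3_BUFFER_BITS 7680   /* buffer length per ISO/IEC 11172-3 */
#define MDB_LIMIT_MPEG1    4088   /* 9 bit main_data_begin field, in bits */

static int min_int(int a, int b) { return a < b ? a : b; }
static int max_int(int a, int b) { return a > b ? a : b; }

/* Frame length in bits for MPEG1 Layer III, padding ignored. */
static int frame_bits(int bitrate_kbps, int samplerate_hz)
{
    return (int)(1152L * bitrate_kbps * 1000L / samplerate_hz);
}

/* max_mdb = max( 0, min( mdb_limit, 7680 - bitstream_frame_length)) */
static int max_mdb(int mdb_limit, int frame_len_bits)
{
    return max_int(0, min_int(mdb_limit, LAYER3_BUFFER_BITS - frame_len_bits));
}

/* MPEG1 only: minimum bits each granule must use in a frame longer than
 * 7680 bits, so that no single granule exceeds 7680 bits. */
static int min_granule_bits(int frame_len_bits)
{
    return max_int(0, frame_len_bits - LAYER3_BUFFER_BITS);
}

/* Range a decoder may have to address when the bitrate_index can switch
 * between low_kbps and high_kbps: the backpointer allowed at the lower
 * bitrate plus a whole frame at the higher bitrate. */
static int worst_case_buffer_bits(int low_kbps, int high_kbps,
                                  int samplerate_hz, int mdb_limit)
{
    return max_mdb(mdb_limit, frame_bits(low_kbps, samplerate_hz))
         + frame_bits(high_kbps, samplerate_hz);
}
```

For the paper's worst case, worst_case_buffer_bits(224, 320, 48000, MDB_LIMIT_MPEG1) evaluates to 2304 + 7680 = 9984, and min_granule_bits(frame_bits(320, 32000)) gives the 3840 bits from the 32 kHz example. Note that max_mdb() is 0 for a 320 kbit/s frame at 48 kHz, which agrees with the concern about main_data_begin at 320 kbps raised at the top of this message.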