From: Torsten J. <t....@gm...> - 2011-11-19 15:11:59
Attachments:
hifi_aac_avcodec.diff
hifi_aac_libfaad.diff
When AAC came up I could not really make friends with it. It sounded distinctly worse than mp3 to me. I have now found out that most of my bad feelings were caused by the libfaad/libavcodec decoders.

Possible reason:

The encoder input signal may have been previously optimized for rms power using some sort of wave shaping algorithm. Most audio CDs are done this way. There even was an Audacity plugin doing this.

The encoder then splits this into various spectral, pulse and noise components, dropping the optimization info. Each component can still use the full volume range, though. Unlike mp3, it may even drop large frequency windows to keep the bitrate down.

The decoder reassembles those components, which can produce peaks overshooting +/- 1.0f. These get clipped when exported to integer samples. The resulting distortion cannot be corrected later by just adjusting the volume control.

The problem is reproducible with the ffmpeg aac encoder.

Possible solutions:

a) Implement a wave shaper inside the decoder. Let it apply small time shifts to the components to avoid overshooting. This is non-trivial, and will require quite some extra processing power.

b) Attenuate output float samples by 0.7 aka sqrt(2)/2 aka -3dB, which is the rms to peak ratio of a single sine wave. This is what I chose. It comes out just right on a lot of tested streams here.

Please Sir Reimar Doeffinger of ffmpeg: If you are still "listening along here a bit", tell me what you think of it, too.

Torsten
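The clipping effect described above can be illustrated with a small sketch. It is not taken from the attached patches and does not touch libfaad or libavcodec; the signal, the helper names `float_to_int16` and `count_clipped`, and the constant `GAIN_MINUS_3DB` are all made up for illustration. It only assumes that a decoder hands out float samples with a nominal range of +/- 1.0 and that these are converted to 16 bit integers afterwards.

```python
import math

INT16_MIN, INT16_MAX = -32768, 32767
GAIN_MINUS_3DB = math.sqrt(2.0) / 2.0  # about 0.7071, the factor from option b)


def float_to_int16(sample, gain=1.0):
    """Scale a float sample (nominal range -1.0..1.0) to int16, clipping overshoots."""
    scaled = round(sample * gain * 32768.0)
    return max(INT16_MIN, min(INT16_MAX, scaled))


def count_clipped(samples, gain=1.0):
    """Count samples that fall outside the int16 range after scaling."""
    return sum(
        1 for s in samples
        if not INT16_MIN <= round(s * gain * 32768.0) <= INT16_MAX
    )


# Hypothetical "decoded" signal: two sine components of amplitude 0.6 each.
# Each one fits into +/- 1.0 alone, but their peaks line up and reach about 1.2.
N = 64
decoded = [
    0.6 * math.sin(2.0 * math.pi * t / N) + 0.6 * math.sin(2.0 * math.pi * 5 * t / N)
    for t in range(N)
]

peak = max(abs(s) for s in decoded)
clipped_plain = count_clipped(decoded)                   # conversion as is
clipped_attenuated = count_clipped(decoded, GAIN_MINUS_3DB)

# Turning the volume down after the clip does not bring the lost peak back:
late = round(float_to_int16(1.2) * GAIN_MINUS_3DB)  # clip first, attenuate later
early = float_to_int16(1.2, GAIN_MINUS_3DB)         # attenuate before converting
```

In this toy case the plain conversion clips some samples while the attenuated one clips none, and `late` ends up lower than `early` because the peak was already cut off at full scale. Whether real streams behave like this depends on the encoder and the material.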
From: Reimar D. <Rei...@gm...> - 2011-11-19 17:59:35
On 19 Nov 2011, at 16:11, "Torsten Jager" <t....@gm...> wrote:
> When AAC came up I could not really make friends with it.
> It sounded distinctly worse than mp3 to me.
> Now I found out that most of my bad feelings were caused by
> libfaad/libavcodec decoders.
>
> Possible reason:
>
> Encoder input signal can have been previously optimized for
> rms power using some sort of wave shaping algorithm. Most
> audio CDs are done this way. There even was an Audacity
> plugin doing this.
>
> Encoder now splits this into various spectral, pulse and noise
> components, dropping optimization info. Each component can still
> use full volume range, though. Unlike mp3, it may even drop
> large frequency windows to keep bitrate down.
>
> Decoder reassembles those components which can produce
> peaks overshooting +/- 1.0f. These get clipped when exported to
> integer samples. Resulting distortion cannot be corrected
> later by just volume control adjustment.

You tested this? How?

> The problem is reproducible with ffmpeg aac encoder.
>
> Possible solutions:
>
> a) Implement a wave shaper inside decoder. Let it apply small
> time shifts to the components to avoid overshooting.
> This is non-trivial, and will require quite some extra
> processing power.

That might have unexpected side-effects once audio encoders actually go beyond "incredibly stupid" and try to figure in the artefacts a real-world decoder will produce.

> b) Attenuate output float samples by 0.7 aka sqrt(2)/2 aka -3dB,
> which is the rms to peak ratio of a single sine wave.
> This is what I chose. It comes out just right on a lot of
> tested streams here.

At best you could extend FFmpeg so the decoder can do volume adjustments, i.e. multiply with a value that can be specified. However, there is only a point in that if the decoder can do it more efficiently than one multiplication per sample.

Your description really sounds like it is most of all two issues outside FFmpeg:
- crappy encoders that produce values that result in out-of-range values in reasonable decoder implementations.
- xine not requesting float output and then doing volume control on the float (which also means that if someone wants these hacks in a local FFmpeg copy, please at least only use the multiplication for the int output path - particularly since I think the factor 0.7 isn't nice enough for the multiplication to be (mostly) lossless).
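The two points in the parenthesis can be sketched roughly as follows. This is an illustration only and not FFmpeg or xine code: `export_samples` and `is_exact` are hypothetical helpers, the sample values are made up, and "lossless" is read here as "the float multiplication needs no rounding", which is one possible reading of the remark. Python floats are doubles, but the same argument applies to single precision.

```python
from fractions import Fraction


def is_exact(sample, gain):
    """True if the float product sample * gain needed no rounding."""
    return Fraction(sample * gain) == Fraction(sample) * Fraction(gain)


def export_samples(samples, want_float, int_gain=1.0):
    """Hypothetical output stage.

    Float output is passed through untouched, so the player can do its own
    volume control. The gain is applied only on the int16 path, where
    overshoots would otherwise be clipped.
    """
    if want_float:
        return list(samples)
    out = []
    for s in samples:
        scaled = round(s * int_gain * 32768.0)
        out.append(max(-32768, min(32767, scaled)))
    return out


decoded = [0.3, -0.9, 1.2]  # made-up values; 1.2 overshoots full scale
as_float = export_samples(decoded, want_float=True)
as_int = export_samples(decoded, want_float=False, int_gain=0.7)

# A power of two only changes the exponent, so no precision is lost.
# A factor like 0.7 generally forces the result to be rounded.
half_is_exact = is_exact(0.3, 0.5)
point_seven_is_exact = is_exact(0.3, 0.7)
```

With these inputs the float path returns the samples unchanged, the int path stays inside the 16 bit range, and only the multiplication by 0.5 comes out exact.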