Thread: [Audacity-devel] Audio track sample format suggestions
From: Steve t. F. <ste...@gm...> - 2013-02-17 15:19:07
If 32 bit audio is pasted into a 16 bit audio track, the track information still says 16 bit, but the audio data is actually still 32 bit float, so the track information is incorrect and misleading. Similarly if 16 bit data is pasted into a 32 bit track.

When a 16 bit audio track is processed, the data is processed as 32 bit float, then converted to 16 bit integer on being returned to the audio track, creating unnecessary loss of sound quality either through the addition of dither noise or quantization errors.

In some, but not all cases, when 32 bit audio data that is in a 16 bit audio track is processed (in 32 bit float), it may be needlessly converted to 16 bit integer format on being returned to the track, thus reducing the quality. This is confusing and I think undocumented.

If an integer format track has several processes applied to it then the audio is converted from 32 bit float format to integer format multiple times. With the default settings, shaped dither is applied each time, resulting in significantly more noise than is necessary.

FFMpeg import produces 16 bit integer tracks regardless of the Quality preferences. Unless the user is diligent about manually changing the track settings to 32 bit float they will suffer the above problems.

My suggestions to resolve these issues are:

1) Remove the (misleading and possibly incorrect) sample format information from the track control panel.

2) Remove the Sample Format options from the track drop down menu.

3) Convert FFMpeg imports to 32 bit float on import (unless a lower format is specified in Quality preferences).

4) Always return processed audio as 32 bit float regardless of the original sample format or Preference Quality settings.

5) Apply dither once only on export if converting audio data to a lower bit format.

If the Quality settings are set to an integer format and no processing has been done, then all of the audio data will probably be integer format (should be tested on Export), in which case if exporting to the same or higher sample format, dither should not be applied.

In effect, all tracks will become 32-bit float format, but may (as now) contain integer format data if recording or importing integer format.

The only downside that I can see is that in a few cases the Audacity Project data may be bigger than now, but I think that this is massively outweighed by the benefit of not degrading the audio quality unnecessarily.

A fringe benefit is that it would simplify Audacity from a user perspective. It would hopefully also reduce the number of complaints about excessive dither noise (though there is still room for improvement in the shaped dither that we use).

Are there any reasons not to do this?

Steve
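The point about dither being applied after every process can be illustrated with a small simulation. This is only a sketch under stated assumptions, not Audacity code: it uses plain TPDF dither rather than the shaped dither mentioned above, the "effects" are simple gain changes with a net gain of 1.0, and the names `quantize_16`, `run_chain` and `compare` are made up for the example. It compares writing back to 16-bit after every effect with keeping float data and converting once at the end.

```python
import math
import random

SCALE = 32768  # full scale for 16-bit signed samples


def quantize_16(x, rng):
    """Round a float sample to the 16-bit grid after adding TPDF dither."""
    dither = rng.random() - rng.random()  # triangular, within +/- 1 LSB
    value = round(x * SCALE + dither)
    value = max(-SCALE, min(SCALE - 1, value))  # clip to the int16 range
    return value / SCALE


def run_chain(reference, gains, rng, store_as_int16):
    """Apply each gain in float; optionally write back to 16-bit after every step."""
    signal = list(reference)
    for gain in gains:
        signal = [s * gain for s in signal]
        if store_as_int16:
            signal = [quantize_16(s, rng) for s in signal]
    if not store_as_int16:
        # Float track: a single dithered conversion at "export" time.
        signal = [quantize_16(s, rng) for s in signal]
    return signal


def rms_error(reference, signal):
    """Root-mean-square difference between two equal-length sample lists."""
    total = sum((a - b) ** 2 for a, b in zip(reference, signal))
    return math.sqrt(total / len(reference))


def compare(n=5000, seed=1):
    """Return (error with per-effect dither, error with export-only dither)."""
    rng = random.Random(seed)
    reference = [0.5 * math.sin(2 * math.pi * 440 * i / 44100) for i in range(n)]
    gains = [0.5, 2.0, 0.5, 2.0, 0.5, 2.0]  # six "effects" with net gain 1.0
    per_effect = rms_error(reference, run_chain(reference, gains, rng, True))
    at_export = rms_error(reference, run_chain(reference, gains, rng, False))
    return per_effect, at_export


if __name__ == "__main__":
    per_effect, at_export = compare()
    print(f"RMS error, dither after every effect: {per_effect * SCALE:.2f} LSB")
    print(f"RMS error, dither once on export:     {at_export * SCALE:.2f} LSB")
```

In this toy setup the per-effect path accumulates noticeably more error than the export-only path, which is the behaviour suggestion 5 is meant to avoid.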
From: Gale (A. Team) <ga...@au...> - 2013-02-21 18:08:11
Summary:
* I support upconversion of 16-bit FFmpeg imports according to the user's Quality Prefs setting.
* Equalization for whatever reason is already doing what Steve is asking for.

This needs a lot of thought and I'm not sure how much work is needed here to accomplish this.

I would support upconversion of FFmpeg-imported audio from 16-bit to 24-bit or 32-bit float respectively if the user's Quality Preferences are set to 24-bit or 32-bit. I understand it is not desirable to force FFmpeg to do this, but doing this conversion would give parity with PCM import. The conversion could be a second progress dialogue that says what's happening. I don't know what the implications may be if FFmpeg-on-demand is brought back at some future time.

In Steve's wider suggestions I would see a philosophical objection to the extra (unasked for) data usage, but doing all processing in float even when in integer quality could already be seen as somewhat "hidden".

One reason for possibly addressing this issue is that Equalization (at least) is already returning 32-bit float data to the track in integer projects. So it proves it can be done. To test that:

1 New project in Audacity HEAD, 16-bit quality
2 Generate tone 1.0 Amplitude, 1 minute.
3 Open EQ, Flatten, drag the line up to +30 dB, OK.
4 Effect > Amplify.

The "Amplification dB" box says -30 dB. I would expect it to say "0.0 dB".

Then look at the data usage changes on disk. Initially we have 5 MB of 16-bit data. After processing EQ we have (I think)
* 5 MB of original 16 bit audio data +
* 10 MB of 32 bit EQ processed audio data
for a total of 15 MB. We expected 10 MB.

So we have 32-bit data when the Track Control Panel says 16-bit. Changing the track format to 32-bit in the Track Drop-Down Menu then actually doesn't increase the size on disk, because there is nothing to increase.

Note that when I retested this today to write this, those symptoms were not happening, but happened as soon as I quit Audacity, initialised audacity.cfg, restarted Audacity then changed to 16-bit quality in Prefs. I don't have any explanation of that, but it seems to show that Prefs corruption was not causing the issue.

Gale
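The figures in the test above can be checked with a little arithmetic. The sketch below assumes a mono tone at a 44100 Hz project rate (neither is stated in the message) and ignores any file overhead; `audio_bytes` and `amplify_default_db` are illustrative helpers, not Audacity functions. It also shows why a -30 dB suggestion from Amplify points to unclipped float data: Amplify proposes the gain that brings the peak to full scale, so a peak 30 dB above full scale can only survive if the samples were kept as float.

```python
import math


def audio_bytes(seconds, sample_rate, bytes_per_sample, channels=1):
    """Raw size of uncompressed audio data, ignoring any file overhead."""
    return seconds * sample_rate * bytes_per_sample * channels


def amplify_default_db(peak):
    """Gain in dB that would bring `peak` to full scale (1.0)."""
    return -20.0 * math.log10(peak)


RATE = 44100  # assumption: mono tone at a 44100 Hz project rate

original_16 = audio_bytes(60, RATE, 2)    # the generated 16-bit tone
processed_16 = audio_bytes(60, RATE, 2)   # what a 16-bit result would need
processed_32 = audio_bytes(60, RATE, 4)   # what a 32-bit float result needs

# Peak after +30 dB of EQ gain on a 1.0 amplitude tone.
float_peak = 1.0 * 10 ** (30 / 20)        # kept as float: well above 1.0
int16_peak = 32767 / 32768                # clipped on conversion to 16-bit

if __name__ == "__main__":
    mb = 1_000_000
    print(f"expected total: {(original_16 + processed_16) / mb:.1f} MB")
    print(f"observed total: {(original_16 + processed_32) / mb:.1f} MB")
    print(f"Amplify suggestion, float data: {amplify_default_db(float_peak):.1f} dB")
    print(f"Amplify suggestion, 16-bit data: {amplify_default_db(int16_peak):.1f} dB")
```

Under these assumptions the totals come out close to the 10 MB and 15 MB quoted above.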
From: Vaughan J. <va...@au...> - 2013-02-22 00:47:02
On 2/17/2013 7:19 AM, Steve the Fiddle wrote:
> If 32 bit audio is pasted into a 16 bit audio track, the track
> information still says 16 bit, but the audio data is actually still 32
> bit float, so the track information is incorrect and misleading.
> [...]

Thanks, Steve. I'm curious why you brought this up on -devel. As it's about bugs, I think it's more appropriate to -quality.

Thanks,
Vaughan
From: Steve t. F. <ste...@gm...> - 2013-02-22 11:43:05
On 22 February 2013 00:46, Vaughan Johnson <va...@au...> wrote:
> On 2/17/2013 7:19 AM, Steve the Fiddle wrote:
>> If 32 bit audio is pasted into a 16 bit audio track, the track
>> information still says 16 bit, but the audio data is actually still 32
>> bit float, so the track information is incorrect and misleading.
>> [...]
>
> Thanks, Steve. I'm curious why you brought this up on -devel. As it's
> about bugs, I think it's more appropriate to -quality.
>
> Thanks,
> Vaughan

I don't think this is in essence about bugs. Most of the "symptoms" that I raised are not "bugs" because they are working as designed. I'm wanting to raise the underlying issue about the current design, which I think originates from the limitation in Audacity 1.x that it did not support multiple bit formats in the same track. When tracks only supported one bit format it was a technical requirement to ensure that data returned to a track was in the same bit format as the track.

What I'm asking (on -devel) is, are there now any technical reasons why this still has to be so? I assume that there aren't, because the Equalization effect already does something very similar to what I'm suggesting, but I could be missing something important, which is why I'm asking.

If there are no blocking technical issues, then yes we can discuss this on -quality and decide if it is an improvement that we want to make. If it is, then it comes back to being a -devel issue, but there is no point in discussing this on -quality if there are technical reasons why it can't or shouldn't be done.

Steve
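The idea of a track holding more than one bit format can be pictured with a toy data structure. This is a hypothetical sketch only: the `Block` and `Track` classes below are invented for illustration and are not Audacity's actual classes. Each block remembers its own sample format, and readers always receive float, so a single per-track format label cannot describe the contents.

```python
from dataclasses import dataclass
from typing import List, Sequence, Set


@dataclass
class Block:
    """A run of samples stored in one sample format (hypothetical)."""
    sample_format: str   # "int16" or "float32"
    samples: Sequence    # ints for int16, floats for float32

    def as_float(self) -> List[float]:
        """Return the samples as floats, whatever the storage format."""
        if self.sample_format == "int16":
            return [s / 32768.0 for s in self.samples]
        if self.sample_format == "float32":
            return [float(s) for s in self.samples]
        raise ValueError(f"unknown sample format: {self.sample_format}")


@dataclass
class Track:
    """A track made of blocks that may differ in sample format (hypothetical)."""
    blocks: List[Block]

    def formats(self) -> Set[str]:
        """All sample formats actually present in the track."""
        return {block.sample_format for block in self.blocks}

    def read_float(self) -> List[float]:
        """Concatenate every block as float, as an effect would see it."""
        out: List[float] = []
        for block in self.blocks:
            out.extend(block.as_float())
        return out


if __name__ == "__main__":
    track = Track([
        Block("int16", [16384, -32768]),   # e.g. recorded or imported data
        Block("float32", [0.25, 1.5]),     # e.g. pasted or processed data
    ])
    print(sorted(track.formats()))
    print(track.read_float())
```

With storage like this, returning processed audio as float needs no change to the reading side, which is the gist of the question asked above.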
From: Martyn S. <mar...@gm...> - 2013-02-26 00:59:36
Thanks Steve

I find all your suggestions useful, and I think that we should convert all data in tracks to 32-bit floats; I see no reason not to. Any issue of storage space is irrelevant, given the increase in space/price even over our release cycles!

And yes, having several bit-depths in a track saying XX-bits is simply a lie! We should and can do better. And simplify things as a result.

We should be handling everything internally as 32-bit and only doing the conversions on import/export. That means there is no need to indicate a track as 'XX-bit (float)' etc. But that would be a big patch, so if you go for it, please do it in stages.

Thanks
Martyn

On 17/02/2013 15:19, Steve the Fiddle wrote:
> [...]
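Converting only on import and export is cheap for untouched integer audio, because every 16-bit value is exactly representable in 32-bit float. The sketch below is an assumption-laden illustration rather than Audacity's implementation: `to_float32`, `import_int16` and `export_int16` are made-up names, the dither is plain TPDF, and checking every sample for exactness is just one possible way to do the "tested on Export" step from the original proposal.

```python
import random
import struct


def to_float32(x):
    """Round a Python float to the nearest 32-bit float value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def import_int16(samples):
    """Convert 16-bit integers to 32-bit float samples in [-1.0, 1.0)."""
    return [to_float32(s / 32768.0) for s in samples]


def is_exact_int16(x):
    """True if the float sample lies exactly on the 16-bit grid."""
    scaled = float(x) * 32768.0
    return scaled.is_integer() and -32768 <= scaled <= 32767


def export_int16(samples, rng=None):
    """Convert to 16-bit, adding TPDF dither only if precision would be lost."""
    rng = rng or random.Random()
    needs_dither = not all(is_exact_int16(s) for s in samples)
    out = []
    for s in samples:
        dither = (rng.random() - rng.random()) if needs_dither else 0.0
        value = round(s * 32768.0 + dither)
        out.append(max(-32768, min(32767, value)))  # clip to the int16 range
    return out, needs_dither


if __name__ == "__main__":
    original = [-32768, -1, 0, 1, 12345, 32767]
    floats = import_int16(original)

    # Unprocessed audio: the round trip is exact and no dither is applied.
    exported, dithered = export_int16(floats)
    print(exported == original, dithered)

    # Processed audio (a gain change) no longer sits on the 16-bit grid.
    processed = [to_float32(f * 0.3) for f in floats]
    _, dithered = export_int16(processed, random.Random(1))
    print(dithered)
```

The design choice here is to decide about dither from the data itself rather than from a track label, so a project that was only recorded and exported is returned bit for bit.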
From: Richard A. <ri...@au...> - 2013-03-03 20:44:58
On Tue, 26 Feb 2013 00:59:12 +0000 Martyn Shaw <mar...@gm...> wrote:
> I find all your suggestions useful, and I think that we should
> convert all data in tracks to 32-bit floats, I see no reason not to.

Ack to this - we already know that dithering back makes integer processing slower than float for many effects!

> We should be handling everything internally as 32-bit and only doing
> the conversions on import/export. That means there is no need to
> indicate a track as 'XX-bit (float)' etc. But that would be a big
> patch, so if you go for it, please do it in stages.

We would need to decide how to set the recording bit depth - whilst nearly always requesting float from portaudio is fine, there are cases where requesting integer formats may give different results from the sound hardware, so we may need to retain a way of doing that. It doesn't stop the data going into a floating point track though!

Richard
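Requesting an integer format from the hardware and still storing float amounts to a conversion at capture time. As a rough sketch, assuming the device delivers packed little-endian signed 24-bit samples (an assumption for the example; the function name `int24_le_to_float` is also made up), the capture buffer could be turned into float samples like this:

```python
from typing import List


def int24_le_to_float(buffer: bytes) -> List[float]:
    """Convert packed little-endian signed 24-bit samples to floats in [-1.0, 1.0)."""
    if len(buffer) % 3 != 0:
        raise ValueError("buffer length must be a multiple of 3 bytes")
    out = []
    for i in range(0, len(buffer), 3):
        value = int.from_bytes(buffer[i:i + 3], "little", signed=True)
        out.append(value / 8388608.0)  # 2**23, full scale for 24-bit audio
    return out


if __name__ == "__main__":
    # Four samples: zero, half scale, negative full scale, largest positive value.
    captured = b"\x00\x00\x00" + b"\x00\x00\x40" + b"\x00\x00\x80" + b"\xff\xff\x7f"
    print(int24_le_to_float(captured))
```

Every 24-bit integer fits exactly in a 32-bit float's 24-bit significand, so nothing is lost by storing such a recording in a float track.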