Re: [Audacity-devel] [PATCH] Corrected: Fix for ImportRaw audio format classification
From: Philipp S. <phi...@go...> - 2014-07-27 21:33:45
On 24.07.2014 17:37, Gale Andrews wrote:
> On 24 July 2014 00:17, Vaughan Johnson <va...@au...> wrote:
>>
>> On 7/22/2014 2:37 PM, Gale (Audacity Team) wrote:
>>>> "Stevethefiddle" wrote:
>>>>> On 22 July 2014 21:05, Gale (Audacity Team) wrote:
>>>>> Thanks. It will need more tuning even for the formats it supports.
>>>>> I had not previously done a lot of testing, as I said (except for
>>>>> getting it to compile).
>>>>>
>>>>> The dummy read seems to work fine for headered files but I sent a
>>>>> batch of files to Philipp for debug a couple of days ago. Problems
>>>>> found:
>>>>>
>>>>> * Most mono files are detected as stereo (this is worse
>>>>> than 2.0.5 behaviour)
>>>>>
>>>>> * s16be in a container is detected as 32-bit float.
>>>>>
>>>>> * s16le in a matroska container is detected as u8le.
>>>>>
>>>>> * Non-44100 Hz files I tried (8000 Hz, 384000 Hz) were
>>>>> all detected as 44100 Hz - I don't know if the patch was
>>>>> addressing this but I think the 2.0.5 code has the same
>>>>> problem.
>>>>>
>>>>> * s24le or s24be is misdetected as another encoding
>>>>> (it is not listed in FormatClassifier.cpp).
>>>>>
>>>>> So some improvements on the 2.0.5 code, but one thing worse.
>>>> So perhaps it would be best (for a speedy release of 2.0.6) to revert
>>>> these changes for now, then put them back in straight after releases
>>>> so that we have time for thorough testing and ironing out whatever
>>>> kinks remain?
>>> The formats in containers are mostly wrongly detected in 2.0.5 (but in
>>> different ways) so the only thing that is worse in some cases is mono
>>> detection. Balanced against that, some other parameters in wrongly
>>> detected stereo files are correct in 2.0.6-alpha but wrong in 2.0.5.
>>>
>>> To me the commit is acceptable if Philipp undertakes to improve it,
>>> but I just wanted to point out that it isn't perfect yet (and the major
>>> gain many are looking for - detection of U-Law/A-Law - still has to be
>>> worked on).
>>>
>>> Gale
>> Okay, thanks, but what does that mean you actually support it for 2.0.6
>> release or not? Is it in or out? Seems like a net gain, but recently
>> contributed and not much tested.
> My vote is leave it in, now I've tested it versus 2.0.5 and Philipp
> has agreed to work further on it after 2.0.6.
>
> It is a net gain, and will be a larger net gain when mono detection is
> improved. I'll track the mono detection as a bug in what we have now,
> if it stays in.

Hi Gale,

in the meantime I did some debugging on the classifier using your raw files and my test set. While doing this I took a closer look at their spectral behaviour, which for your int16 files was quite surprising: starting at roughly 15 kHz and continuing towards higher frequencies, they show a distinct "noise hump" a few dB above the typical noise floor below 15 kHz.

At first I thought this might be an effect related to the recording equipment (e.g. for the guitar tune) or to the preprocessing steps in the classifier. But the effect also proved to be present when importing the whole tracks both into Audacity itself (Spectrogram view, Spectral Analysis view, especially salient after normalizing quiet parts) and into the Matlab prototype of the classifier.

To make a long story short: this made the int16/int32 estimation, and partly also the mono/stereo estimation, unstable, because small deviations in the signal's high-band content are the distinctive feature, especially when deciding between integer sample formats (as mentioned earlier). Frankly, it's not the only reason for the misclassifications, but it had a strong impact.

As the int16/int32 separation is the one most susceptible to disturbances, and the int32 format does not seem to be very common, I might remove the int32 class from FormatClassifier. Again, this is in line with RawAudioGuess, where only int8 and int16 guessing is supported.
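
To illustrate why the int16/int32 decision is delicate, here is a minimal Python sketch (not code from FormatClassifier; the function name and sample values are hypothetical). It decodes the same little-endian bytes once as int16 and once as int32. When int16 data is read as int32, the upper 16 bits of each value are exactly every second int16 sample, so the misread signal still resembles the audio and only the low-order bits differ.

```python
import struct

def decode(raw, fmt):
    """Decode little-endian raw bytes; fmt is 'h' (int16) or 'i' (int32)."""
    size = struct.calcsize("<" + fmt)
    count = len(raw) // size
    return list(struct.unpack("<%d%s" % (count, fmt), raw[:count * size]))

# Eight int16 samples of a hypothetical signal (illustration only).
samples16 = [100, -200, 300, -400, 500, -600, 700, -800]
raw = struct.pack("<%dh" % len(samples16), *samples16)

as_int16 = decode(raw, "h")   # correct interpretation
as_int32 = decode(raw, "i")   # same bytes misread as int32

# Each misread int32 value packs two neighbouring int16 samples:
# the upper word is the second sample, the lower word the first one.
high_words = [v >> 16 for v in as_int32]
low_words = [v & 0xFFFF for v in as_int32]

print(high_words)  # every second int16 sample
print(low_words)   # the remaining samples as unsigned 16-bit values
```

This only shows the byte-level relationship between the two formats; how the classifier actually weighs the high-band content is not shown here.
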
Instead, as a future next step, introducing an int24 classification for studio-quality recordings might be worthwhile. int16 and int24 are not byte-aligned to each other, so I wouldn't expect the "crosstalk effects" there that are present in the int16/int32 case.

Now, what's causing the noise effect seems to be the export functionality of Audacity itself. The effect can be reproduced with these steps:

- Generate one or more tracks in Audacity; even a single track with total silence (Generate | Silence...) will do.
- Export the project to a raw uncompressed file, headerless int16.
- Re-import the raw file.
- Normalize silent parts of the signal to -1 dBFS (to make the effect even more visible).
- The effect is now visible in Analyze | Plot Spectrum... or in the spectrogram view of the track.

Interestingly, the effect (its noise frequency response) is a little different on Linux and Windows (both 2.0.5), but it's there on both systems. It's also not a phantom bug within the editor: small fluctuations around 0 (for an exported track with total silence) are actually visible in the saved raw file.

So what might be the cause? Some numerical fluctuations in the downmix stage, or filtering / resampling of the signal applied there?

Philipp
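
The observation that an exported track of total silence contains small values around 0 can be checked outside the editor. The following is a minimal Python sketch, assuming a headerless little-endian int16 file; the helper name `summarize_raw_int16` and the demo data are hypothetical stand-ins, not an actual Audacity export.

```python
import array
import os
import sys
import tempfile

def summarize_raw_int16(path):
    """Return (sample_count, nonzero_count, peak_abs) for a headerless
    int16 file, assumed to be little-endian."""
    with open(path, "rb") as f:
        data = f.read()
    data = data[:len(data) - (len(data) % 2)]  # ignore a trailing odd byte
    samples = array.array("h")
    samples.frombytes(data)
    if sys.byteorder == "big":
        samples.byteswap()  # convert little-endian file data to native order
    nonzero = sum(1 for s in samples if s != 0)
    peak = max((abs(s) for s in samples), default=0)
    return len(samples), nonzero, peak

# Synthetic stand-in for an exported "silence" file: mostly zeros with a
# few +/-1 values. Replace demo_path with a real exported raw file.
demo = array.array("h", [0, 0, 1, 0, -1, 0, 0, 1])
if sys.byteorder == "big":
    demo.byteswap()
with tempfile.NamedTemporaryFile(suffix=".raw", delete=False) as tmp:
    tmp.write(demo.tobytes())
    demo_path = tmp.name
try:
    result = summarize_raw_int16(demo_path)
finally:
    os.remove(demo_path)

print(result)  # (8, 3, 1) for the synthetic data above
```

A nonzero count above 0 for a file exported from pure silence would confirm that the fluctuations are written at export time rather than introduced on re-import.
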