
#453 gapless decoding issues

Compatibility
closed-out-of-date
nobody
None
5
2017-08-30
2015-09-03
Cálestyo
No

Hey.

Some issues about gapless decoding:
1) gapless decoding seems to be the default nowadays, i.e. even when --nogap isn't specified, things will still work perfectly in players supporting it (e.g. mpv).

However
2) When just --nogap (regardless of whether --nogapout is used or not) is used, then it seems that gapless playback actually DOESN'T work... using --verbose implies that simply no LAME tag is written.
While the manpage doesn't mention this at all, the --longhelp at least says that --nogaptags would be needed for VBR; however, this whole point (2) seems to apply to CBR/ABR as well, and --nogaptags seems to work there as well.

3) Now when encoding two files with the following different options:
$ lame -v 1.wav 2.wav
$ lame -v --nogapout -nogap 1.wav 2.wav
I would expect that the files were identical, however they're not.

Even worse, the 2nd version doesn't play right, at least in mpv (while the first does).
Of course it's now open whether this is a lame bug or an mpv bug,... for more details have a look here:
https://github.com/mpv-player/mpv/issues/2276#issuecomment-137252561
which features an image that shows the problem that seems to arise in the 2nd mode.

Cheers,
Chris.

Discussion

  • Josep Maria Antolín Segura

    --nogap and gapless playback are two different things. Let me clarify it:

    The MP3 format itself does not have an exact audio size. It works in blocks of audio and its output is always the content of the whole block. This means that if you have 2000 samples, the MP3 output might be 2500 (this is not a real example).
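To make the block arithmetic concrete, here is a minimal sketch. It assumes MPEG-1 Layer III, where each frame carries 1152 samples per channel, and it ignores encoder delay entirely. The function name is made up for illustration and is not part of LAME.

```python
import math

SAMPLES_PER_FRAME = 1152  # MPEG-1 Layer III frame size; an assumption of this sketch


def padded_length(n_samples, samples_per_frame=SAMPLES_PER_FRAME):
    """Return the smallest whole-frame sample count that can hold n_samples."""
    frames = math.ceil(n_samples / samples_per_frame)
    return frames * samples_per_frame


# 2000 input samples need two frames, so the decoded output has 2304 samples.
print(padded_length(2000))
```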

    There are also two quantities called encoder delay and decoder delay. Encoder delay comes from the fact that the first output block of the encoder has some silent audio prepended; its size depends on the encoder's implementation and on the block sizes used in the first block.
    The decoder delay is almost the same concept: the decoder's output also has some audio prepended at the beginning, and the amount again depends on how the decoder is implemented.
    A decoder is able to skip its own delay (if it is coded to do so), but initially it had no way to know the encoder delay. Usually it doesn't matter, but for gapless music (mixed or continuous music), it can be annoying.

    To improve on this situation, LAME originally implemented the --nogap option. The --nogap option generates multiple MP3 files, but those files are encoded internally as a single stream, which is simply split where the input files end. This means that the last block of the first file contains part of the audio of the second file, and that the encoder delay disappears, because the encoder is already initialized with the ongoing audio. So the only remaining problems are the decoder delay and the player's ability to not stop the audio between tracks.
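The splitting idea can be sketched roughly as follows. This is a simplified model, not LAME's actual code: it assumes 1152-sample frames, treats all tracks as one continuous stream, and assigns each frame to the track in which the frame starts. The function name and the assignment rule are assumptions made for illustration, and real MP3 details such as the bit reservoir are ignored.

```python
SAMPLES_PER_FRAME = 1152  # assumed frame size (MPEG-1 Layer III)


def split_frames(track_lengths, samples_per_frame=SAMPLES_PER_FRAME):
    """Cut a continuous stream into frames and group them per track.

    Returns one list per track; each entry is a (start, end) sample range
    in the continuous stream. A frame belongs to the track it starts in.
    """
    total = sum(track_lengths)
    boundaries = []
    pos = 0
    for n in track_lengths:
        pos += n
        boundaries.append(pos)  # stream position where each track ends

    files = [[] for _ in track_lengths]
    track = 0
    for start in range(0, total, samples_per_frame):
        # Advance to the track that contains this frame's first sample.
        while start >= boundaries[track]:
            track += 1
        files[track].append((start, min(start + samples_per_frame, total)))
    return files


files = split_frames([2000, 3000])
# The last frame of the first file crosses the track boundary at sample 2000,
# so it also carries the first samples of the second track.
print(files[0][-1])
```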

    Some time later, LAME extended the original Xing header (used in VBR files) to add more information, such as the encoder delay and the last block padding, which made the exact audio size recoverable from the file. The Fraunhofer encoder also added its own VBR header (VBRI) with its own size information, so two standards were born.
    At some point, the LAME header was also made available for CBR files. The LAME header in CBR files is similar to the Xing VBR header, but without the seeking table.

    With this header, the decoders are now able to skip the encoder delay, and the last block padding, which by itself almost solves the problem that --nogap tried to solve, without the need of knowing the files in advance.
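As a rough sketch of what a decoder does with that information: to my understanding, the LAME tag stores the encoder delay and the end padding as two 12-bit values packed into three bytes. The helper names below are hypothetical, locating the field inside a real file is left out, and the decoder's own delay is not handled.

```python
def unpack_delay_padding(field):
    """Split a 3-byte field into two 12-bit values: (encoder_delay, padding)."""
    if len(field) != 3:
        raise ValueError("expected exactly 3 bytes")
    value = int.from_bytes(field, "big")
    return value >> 12, value & 0xFFF


def trim_gapless(samples, encoder_delay, padding):
    """Drop the leading encoder delay and the trailing padding samples."""
    end = len(samples) - padding
    return samples[encoder_delay:end]


# Example values chosen for illustration: delay 576 (0x240), padding 1000 (0x3E8).
delay, padding = unpack_delay_padding(bytes([0x24, 0x03, 0xE8]))
print(delay, padding)
```

A player that applies this trimming to every track can join consecutive files without the inserted silence, which is what gapless playback via the header amounts to.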
    Gapless playback can still have glitches that --nogap would solve, but those are rare. They can happen with loud bass sounds across the gap, where the samples around the join point might not be generated as accurately as they would have been if the encoder had known the following data in advance, and so a discontinuity appears.

    About your point two, I'll first quote what the HTML documentation says:

    --nogap file1 file2 [...] Encodes multiple continuous files.

    Encodes multiple files (ordered by position) which are meant to be played gaplessly.

    By default, LAME will encode the files with accurate length, but the first and last frame may contain a few erroneous samples for signals that don't fade-in/out (as is the case of continuous playback).

    This setting solves that by using the samples from the next/previous file to compute the encoding.

    --nogapout dir Specify a directory for the output of the files encoded with --nogap

    This setting should precede --nogap, and is used to specify the alternate directory where to store the encoded files. The default one is the input file directory.

    --nogaptags Enables the use of VBR tags with files encoded with --nogap

    Tells LAME to put VBR tags to encoded files if they are encoded in VBR or ABR modes. Else, using the --nogap option doesn't generate it.

    So yes, --nogaptags would be needed for the files to have the LAME or Xing header. The description should probably no longer imply that it is only about VBR files, since the LAME tag is also added to CBR files.
    Anyway, the files should still sound gapless when played one after another on a player that supports gapless playback via the LAME header.

    As for your point 3, I think I've explained above why the files will be different if they are encoded with --nogap.
    That said, your first line is incorrect: it will overwrite 2.wav with the MP3 output.

    As for the graphic, I cannot be sure which is correct, or who is at fault, if anyone. The difference looks to be around 20 milliseconds (approx. 1024 samples at 44 kHz). It might be related to the encoder or decoder delay, but I am not sure.
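For anyone who wants to check an offset like this against the picture, a small conversion sketch follows. It assumes a sample rate of 44100 Hz, and the helper names are made up for illustration.

```python
SAMPLE_RATE = 44100  # Hz; assumed CD-quality rate


def samples_to_ms(n_samples, sample_rate=SAMPLE_RATE):
    """Duration of n_samples in milliseconds."""
    return n_samples / sample_rate * 1000


def ms_to_samples(ms, sample_rate=SAMPLE_RATE):
    """Number of samples (rounded) covering the given duration."""
    return round(ms * sample_rate / 1000)


print(round(samples_to_ms(1024), 1))  # about 23.2 ms
print(ms_to_samples(20))              # 882 samples
```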

     
  • Cálestyo

    Cálestyo - 2015-09-16

    I'll read/check/answer this later... I'm currently on vacation :-)

     
  • Robert Hegemann

    Robert Hegemann - 2017-08-30
    • status: open --> closed-out-of-date
     
