#323 cuesheet parser can't handle BOM marker

open
Josh Coalson
4
2012-12-12
2007-12-31
Josh Coalson
No

--- Artyom Karpenko <****@gmail.com> wrote:

> Is there any way for flac or metaflac to support input from files
> with a Unicode byte order mark (BOM)?
> For example, in the command:
> metaflac --remove-all-tags --set-tag-from-file="CUESHEET=CDImage.cue"
> --set-tag-from-file="LOGFILE=CDImage.log" CDImage.flac
>
> If "CDImage.cue" contains Unicode characters, most text editors
> prepend a BOM to the first line of the file.
>
> metaflac does not handle this. It simply writes the BOM into the FLAC
> file as part of the tag value, so the CUESHEET tag appears broken.
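One possible workaround, sketched here in Python rather than taken from flac or metaflac, is to write a BOM-free copy of the cuesheet and pass that copy to `--set-tag-from-file` instead. The helper names `strip_utf8_bom` and `write_bom_free_copy` are hypothetical, and the sketch assumes the file is UTF-8, so the BOM is the three bytes EF BB BF.

```python
import codecs


def strip_utf8_bom(data: bytes) -> bytes:
    """Return data without a leading UTF-8 BOM (EF BB BF), if one is present."""
    if data.startswith(codecs.BOM_UTF8):
        return data[len(codecs.BOM_UTF8):]
    return data


def write_bom_free_copy(src_path: str, dst_path: str) -> bool:
    """Copy src_path to dst_path without the BOM.

    Returns True if a BOM was found and removed, False otherwise.
    """
    with open(src_path, "rb") as src:
        original = src.read()
    cleaned = strip_utf8_bom(original)
    with open(dst_path, "wb") as dst:
        dst.write(cleaned)
    return len(cleaned) != len(original)
```

Working on raw bytes leaves the rest of the file untouched, so non-ASCII titles keep their original UTF-8 encoding.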

Discussion

  • Logged In: NO

    artyomk3d@gmail.com
    Example cuesheet:
    The first 3 symbols (the UTF-8 BOM, bytes EF BB BF) are added automatically by Notepad to any UTF-8 text file.
    ---
    я╗┐REM GENRE Rock
    REM DATE 1997
    REM DISCID A30E1A0C
    REM COMMENT "ExactAudioCopy v0.95b4"
    PERFORMER "U2"
    TITLE "Pop (1997 Island CIDU210 524 334-2)"
    FILE "CDImage.wav" WAVE
    TRACK 01 AUDIO
    TITLE "Discothèque"
    INDEX 01 00:00:00
    TRACK 02 AUDIO
    TITLE "Do You Feel Loved"
    INDEX 01 05:18:50
    TRACK 03 AUDIO
    TITLE "Mofo"
    INDEX 01 10:26:05
    TRACK 04 AUDIO
    TITLE "If God Will Send His Angels"
    INDEX 00 16:13:10
    INDEX 01 16:15:15
    TRACK 05 AUDIO
    TITLE "Staring At The Sun"
    INDEX 01 21:37:50
    TRACK 06 AUDIO
    TITLE "Last Night On Earth"
    INDEX 01 26:14:35
    TRACK 07 AUDIO
    TITLE "Gone"
    INDEX 01 31:00:10
    TRACK 08 AUDIO
    TITLE "Miami"
    INDEX 01 35:26:67
    TRACK 09 AUDIO
    TITLE "The Playboy Mansion"
    INDEX 01 40:19:62
    TRACK 10 AUDIO
    TITLE "If You Wear That Velvet Dress"
    INDEX 01 45:00:50
    TRACK 11 AUDIO
    TITLE "Please"
    INDEX 01 50:15:22
    TRACK 12 AUDIO
    TITLE "Wake Up Dead Man"
    INDEX 01 55:17:65
    ---

     
  • gharris999
    2008-02-03

    Logged In: YES
    user_id=1734443
    Originator: NO

    Why is this a problem? Are you saying that some software (other than flac and metaflac) has trouble reading comment[x]=CUESHEET content if it contains a BOM? None of the software that I use has any problem with this. I regularly store a BOM in my embedded metadata cuesheets.

    If your cuesheet is UTF-8 encoded with a BOM, then embed the cuesheet with:

    metaflac --no-utf8-convert "--set-tag-from-file=CUESHEET=cuesheet file.cue"

    and, as you say, the BOM will be embedded in the FLAC file along with the correctly encoded UTF-8 data.

    To extract, I use:

    metaflac --no-utf8-convert --show-tag=CUESHEET flacfile.flac >"some cuesheet.cue"

    Then you need to strip out the first 9 bytes of the file (the "cuesheet=" part), and you'll be left with a properly UTF-8-encoded cuesheet with the correct BOM.

    Personally I can see the utility of an enhancement to metaflac: a "--export-tag-to=TAGNAME=filename" option that would be the "data out" partner to the existing "--set-tag-from-file=TAGNAME=filename" option. This would obviate the need to strip out the tagname header from the output file.

    That's what I'd like to see, anyway.
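The manual step described in the comment above, dropping the 9-byte "cuesheet=" prefix from the redirected `--show-tag` output, could be sketched as follows. This is an illustration only, not part of metaflac; `strip_tag_prefix` is a hypothetical helper. It matches the tag name case-insensitively on the assumption that the stored field name may be in either case, and it returns the data unchanged if the prefix is absent.

```python
def strip_tag_prefix(data: bytes, tag_name: str = "CUESHEET") -> bytes:
    """Remove a leading 'TAGNAME=' prefix from extracted tag data.

    The comparison ignores ASCII case; everything after the '=' is
    returned byte for byte, including any BOM that was embedded.
    """
    prefix = tag_name.encode("ascii") + b"="
    if data[:len(prefix)].upper() == prefix.upper():
        return data[len(prefix):]
    return data
```

For the CUESHEET tag the prefix is 9 bytes long, which matches the count given in the comment.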

     
  • gharris999
    2008-02-03

    Logged In: YES
    user_id=1734443
    Originator: NO

    Actually, I suppose that metaflac enhancement option ought to be:

    --export-tag-to-file=TAGNAME=filename

     
  • gharris999
    2008-02-16

    Logged In: YES
    user_id=1734443
    Originator: NO

    OK, so metaflac --no-utf8-convert "--import-tags-from=somefile.txt" can't handle somefile.txt if it starts with a BOM. Not a big deal, but "--no-utf8-convert" ought to be a hint that a BOM may be present and should be skipped.
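To illustrate the behaviour requested in the last comment, here is a minimal Python sketch of a NAME=VALUE reader that skips a leading BOM. It is not metaflac's parser; `parse_tag_lines` is a hypothetical name, and the sketch assumes a UTF-8 file with one NAME=VALUE pair per line. It relies on Python's standard `utf-8-sig` codec, which drops a leading BOM when one is present and otherwise decodes as plain UTF-8.

```python
from typing import List, Tuple


def parse_tag_lines(data: bytes) -> List[Tuple[str, str]]:
    """Parse NAME=VALUE lines from UTF-8 data, ignoring a leading BOM."""
    # 'utf-8-sig' removes the BOM (U+FEFF) only at the very start of the data.
    text = data.decode("utf-8-sig")
    tags = []
    for line in text.splitlines():
        if "=" not in line:
            continue  # skip blank or malformed lines in this sketch
        name, value = line.split("=", 1)  # split on the first '=' only
        tags.append((name, value))
    return tags
```

Without the BOM handling, the first field name would begin with U+FEFF and no longer match the expected tag name.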