#95 Genre table doesn't follow any convention

open
nobody
None
1
2014-08-13
2004-11-26
No

The genre table doesn't follow the id3 convention
(http://www.id3.org/id3v2.4.0-frames.txt) for genre
id-->name mapping.

I'm attaching the genre table, as copied from the spec
(above). Here is how to import it into your database
from the MySQL shell:

mysql> LOAD DATA INFILE '/path/to/id3genre.txt'
-> REPLACE
-> INTO TABLE genre
-> FIELDS TERMINATED BY '.';

WinAmp has also appended their own tags to this list
(information from
http://linuxselfhelp.com/HOWTO/MP3-HOWTO-13.html),
which I'm also attaching here. Import those in exactly
the same way as above.
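
For anyone who wants to check the attached files before importing them, here is a minimal Python sketch that reads the same layout. It assumes each line looks like `0.Blues` (number, a dot, then the name), which is a guess based on the `FIELDS TERMINATED BY '.'` clause above. The function names are illustrative only and are not part of Jukebox.

```python
def parse_genre_line(line):
    """Split one 'id.name' line into (id, name), or return None for blanks."""
    line = line.strip()
    if not line:
        return None
    # Split on the first '.' only, so a name that contains a dot stays intact.
    number, _, name = line.partition(".")
    return int(number), name.strip()


def parse_genre_table(text):
    """Parse a whole genre listing into a dict of id -> name."""
    table = {}
    for line in text.splitlines():
        parsed = parse_genre_line(line)
        if parsed is not None:
            genre_id, name = parsed
            table[genre_id] = name
    return table


sample = "  0.Blues\n 13.Pop\n"
print(parse_genre_table(sample))
```

Splitting on the first dot only is a deliberate choice in this sketch: a genre name that itself contains a dot would otherwise be cut short.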

If desired (after importing both tables), delete tags
that are not in either of these lists by using the
following command in the mysql shell:

mysql> delete from genre where id>125;

Discussion

  • William Skellenger

    Logged In: YES
    user_id=631721

    You will also have to add a genre "0", which is "Blues" in
    the id3 spec. Id 0 is not in the default installation, and
    it was also left out of my list somehow, so I'm updating the
    files to recognize that one.

    Here is how I did this:

    mysql> insert into genre(id,name) values(0,"Blues");
    mysql> update genre set id=0 where name="Blues";

     
  • Lewis Jardine

    Lewis Jardine - 2004-12-08

    Logged In: YES
    user_id=436862

    This is not necessarily a good solution: id is intended to
    be a unique identifier, not necessarily a mapping with any
    global table. In my opinion, a better solution would be to
    add a column 'ID3_ID' or similar, and use that if it is
    necessary to look up id3 genre numbers.

    As I understand it, MP3::Info returns the genre as a string
    after doing the lookup itself (and thus jukebox only has to
    deal with strings, and can ignore the id3 genre numbers)?

     
  • Paradxum

    Paradxum - 2005-08-23

    Logged In: YES
    user_id=596175

    ok... I think that this issue is ridiculous because it's
    matched by name... but OK... I'm trying to close out these
    "bugs" slowly... so why don't we make everyone happy by
    adding a column called "winamp_id" and loading that with
    data. (the previously mentioned solution was not a good one
    because of the ability for the user to add custom genres
    .... and the id should not be equal to the winamp id just
    some of the time....)

     
  • William Skellenger

    Logged In: YES
    user_id=631721

    Again, this is not a "winamp id", it's part of the ID3
    specification. The most recent additions were defined by
    Winamp, and they seem to be accepted by most players
    and rippers as the new standard, probably due to Winamp
    popularity alone.

    As ridiculous as you may find this issue, you will find that
    some rippers, namely Grip, will tag the song by default with
    only the genre ID number. Matching by name, therefore,
    doesn't work -- it only ends up adding new numeric "genres"
    in the genre table. Matching by number doesn't work,
    because the numbers have no meaning other than as a
    unique index for the table itself.

    If a new field were added to the table, I would gladly populate
    it with proper values (will also mean adding some new genres
    to the list).

    Please don't call it winamp_id, as the intention is just to
    reflect the ID3 standard, which now seems to be adopting the
    new Winamp genres as well.

    Attaching the id3genres.c, which contains a recent array of
    genres, including the Winamp ones.

     
  • Paradxum

    Paradxum - 2005-08-23

    Logged In: YES
    user_id=596175

    Excellent.... then we should call it id3_id (although that
    sounds redundant... what does id3 stand for.... does anyone
    know?) maybe id3_num that sounds a little better... it
    indicates it's the id3 numeric id ... maybe id3_nid???
    what are your thoughts?

     
  • William Skellenger

    Logged In: YES
    user_id=631721

    Good news: surprisingly enough, all of the genres from
    tables.sql seem to be represented in the file that I posted.
    I wrote a simple Python script to append an ID3 index to
    each existing genre. There are only six conflicts due to
    spelling or punctuation. I will further research these and
    find out what is appropriate -- the .c file that I posted is
    probably not a standard and perhaps it's got some errors.

    Old genres not found in "new" list:
    A Cappella
    Fast-Fusion
    Folk/Rock
    Rhytmic Soul
    Synthpop
    Thrash Metal

    New genres not found in current list
    Folk-Rock
    Fast Fusion
    Rhythmic Soul
    A capella
    Trash Metal
    SynthPop

    Notice that all are represented, but just with
    case/spelling/punctuation issues. I will post, soon, a .sql
    file to update the existing genre table.
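
A rough Python sketch of the kind of comparison described above, not the actual script: it assumes that lowercasing and dropping punctuation is enough to pair names that differ only in case or punctuation. The `normalize` and `match_genres` names are made up for illustration.

```python
import re


def normalize(name):
    """Lowercase a genre name and drop everything except letters and digits."""
    return re.sub(r"[^a-z0-9]", "", name.lower())


def match_genres(old_names, new_names):
    """Pair names that normalize to the same key; return (matches, unmatched)."""
    by_key = {normalize(n): n for n in new_names}
    matches, unmatched = {}, []
    for old in old_names:
        new = by_key.get(normalize(old))
        if new is None:
            unmatched.append(old)  # a real spelling difference, needs a human
        else:
            matches[old] = new
    return matches, unmatched


old = ["Fast-Fusion", "Folk/Rock", "Synthpop",
       "Rhytmic Soul", "A Cappella", "Thrash Metal"]
new = ["Fast Fusion", "Folk-Rock", "SynthPop",
       "Rhythmic Soul", "A capella", "Trash Metal"]
matches, unmatched = match_genres(old, new)
print(matches)
print(unmatched)
```

With the six conflicts listed above, this pairs the three punctuation/case variants and leaves the three spelling variants (Rhytmic Soul, A Cappella, Thrash Metal) for manual review.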

    I'd say that the new field could be called something like
    id3index or something. I also found some history of "ID3"
    itself at id3.org:

    " The original standard for tagging digital files was
    developed in 1996 by Eric Kemp and he coined the term ID3.
    At that time ID3 simply meant 'IDentify an MP3'. "

     
  • William Skellenger

    Logged In: YES
    user_id=631721

    Did some more searching, and it seems that the file I posted
    does have some spelling errors in it.

    Using the file "in_mp3.dll" as the "standard" from the
    Winamp distribution, I stripped out the genre list and came
    up with the following differences:

    Current --> "Accepted"
    ---------------
    AlternRock --> Alt. Rock
    Gangsta --> Gangsta Rap
    Rhytmic Soul --> Rhythmic Soul

    All other "accepted" genres are currently in
    jukebox/sql/update4112.sql.

    So, this leaves the mapping. With no more discussion about
    what the new field should be called, I created a new field
    called "id3_map", type INT(3), default value NULL.

    I wrote a Python script that populates two lists, one with
    the current genres and one with the accepted genres, and
    then matches the two to create the map. It outputs SQL
    commands which populate the new field. I tested these on my
    own database.

    Attaching the sql file, which can be executed from within
    the MySQL shell by using the "source <filename>" command.
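
For illustration, a minimal sketch of how such a script could be written (this is not the attached script). It assumes the accepted genres are held in a list whose position is the ID3 number, and that names match exactly after the spelling fixes above. `build_id3_map_sql` is a hypothetical helper name.

```python
def build_id3_map_sql(current_names, accepted_genres):
    """Return UPDATE statements that set id3_map for each matching genre name.

    accepted_genres is a list whose index is the ID3 genre number.
    """
    index_by_name = {name: i for i, name in enumerate(accepted_genres)}
    statements = []
    for name in current_names:
        id3_index = index_by_name.get(name)
        if id3_index is None:
            continue  # user-defined genre: id3_map keeps its NULL default
        escaped = name.replace("'", "''")  # double single quotes for SQL
        statements.append(
            f"UPDATE genre SET id3_map={id3_index} WHERE name='{escaped}';"
        )
    return statements


accepted = ["Blues", "Classic Rock", "Country"]  # first three ID3v1 genres
for line in build_id3_map_sql(["Country", "Blues", "My Custom Genre"], accepted):
    print(line)
```

Genres that are not in the accepted list produce no statement, so user-added rows simply keep `id3_map` as NULL.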

     
  • William Skellenger

    correct spelling in current genre list, add and populate mapping field to id3/Winamp accepted genre ids

     
  • William Skellenger

    Logged In: YES
    user_id=631721

    Now, we would still have to change some code to look at the
    new id3_map field, especially when importing files with
    numeric genres. There is an open bug on this point and I
    submitted a patch already, but the patch will need updating
    with this new field addition.

     
  • Lewis Jardine

    Lewis Jardine - 2005-08-28

    Logged In: YES
    user_id=436862

    Jukebox's database should never have any files with a
    numeric genre: mp3::info[1] replaces the number with the
    proper string when jukebox imports the file; the
    ascii-coded-number format should never make it to the
    database. If that's not happening, it's a bug in mp3::info*,
    which is where the fix should be done. Check line 629ish of
    Info.pm[2]. If you post me an example mp3, I'll see if
    I can fix this bug in mp3::Info.

    While genre as an ascii-coded number is completely acceptable
    in id3v2, mp3::info's contract is to give the string
    representation, not the number, and thus the Right Thing
    would be to make mp3::info do what it claims to, not make
    gjukebox work around its weirdness.

    * or a malformed tag, in which case, in my opinion, it's
    still a bug in mp3::info : it should be able to read it,
    malformed tag or no.

    [1] - http://sourceforge.net/projects/pudge/
    [2] - http://cvs.sourceforge.net/viewcvs.py/pudge/mp3-info/Info.pm?rev=1.19&view=markup

     
  • Nobody/Anonymous

    Logged In: NO

    We're coming full circle now. Interesting...

    Pudge's genre list includes the (potential?) mis-named
    genres that I mentioned earlier. So... Can I assume that the
    original genre list was borrowed from MP3::Info? However,
    instead of inserting them into the db in the correct order,
    they were inserted in alphabetical order, screwing up their IDs.

    'AlternRock'
    'Gangsta'

    (Rhythmic Soul is spelled correctly as far back as I can
    look in Info.pm.)

    This would explain why all of the genres that I found had
    matches in the db!

    Next:

    The bit you mention seems to be:

    } elsif ($id =~ /^TCON?$/) {
        if ($data =~ /^ \(? (\d+) (?:\)|\000)? (.+)?/sx) {
            my($index, $name) = ($1, $2);
            if ($name && $name ne "\000") {
                $data = $name;
            } else {
                $data = $mp3_genres[$index];
            }
        }
    }
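
To make the behaviour of that snippet easier to follow, here is a loose Python rendering of the same logic. It is a sketch for discussion, not MP3::Info's code: `GENRES` holds only a four-entry excerpt of the ID3v1 list, and `resolve_genre` is an illustrative name.

```python
import re

# Excerpt of the ID3v1 genre list; a real table would hold every entry.
GENRES = {0: "Blues", 4: "Disco", 13: "Pop", 21: "Ska"}

# Same shape as the Perl pattern: optional "(", digits,
# optional ")" or NUL, then whatever text follows.
TCON_RE = re.compile(r"^\(?(\d+)(?:\)|\x00)?(.+)?", re.DOTALL)


def resolve_genre(data):
    """Return the genre string for a TCON value, looking up bare numbers."""
    m = TCON_RE.match(data)
    if m is None:
        return data  # plain-text genre, left unchanged
    index, name = int(m.group(1)), m.group(2)
    if name and name != "\x00":
        return name  # a text refinement wins over the numeric reference
    return GENRES.get(index)  # None if the number is not in the table


print(resolve_genre("(13)"))
print(resolve_genre("13"))
print(resolve_genre("(4)Eurodisco"))
```

A bare number such as the "11" that Grip writes goes through the lookup branch here, which is what keeps numeric strings from being stored as genre names.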

    This code is introduced in 1.16 of Info.pm (who knew?),
    which goes with the 1.10 release of MP3::Info -- Dec 2004?

    I am using 1.15 of Info.pm, included in the 1.02 release and
    the most current at the time that I installed GJukebox.

    So, at this point, this bug can be marked invalid if
    everyone were to upgrade their version of MP3::info (after I
    do this, back out my patch, and test it).

    We come back to convention. I personally don't like the
    fact that the songs database stores the ID that is not at
    all related to the accepted genre list. IMO this is a
    design flaw -- it means that the db can't be used "properly"
    as a back end for anything else (amaroK, for example, can
    use MySQL as a backend for your music list) unless you
    re-map or perform the lookup anyway. The three
    inconsistencies are also bothering me. Referring to the ID
    number, as we all know, is the best way to handle this.
    Parsing and comparing strings, especially since some genre
    names contain symbols, is a great way to cause more problems.

    But at any rate, back to the current issue.

    If the new MP3::Info handles numeric genres, great.
    However, I still believe the current implementation to be
    incorrect. IMO the proper way to fix this is to re-write
    the genre table and *not* allow user additions to it, OR at
    least implement the convention somewhere here.

    Convention. I'm glad to see that the names come from the
    convention, but when you're reading files that have many
    different spellings or case issues or whatever, it's best to
    store a number.

    I am upgrading my MP3::Info right now, but this means that
    everyone else should. There should be Jukeboxen out there
    right now with genres in the genre table called "11" with id
    = "142".
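
A quick way to check a database for such rows, sketched in Python with an in-memory SQLite table standing in for the real MySQL `genre` table. The sample rows are invented for the demonstration; only the `id` and `name` columns are taken from the commands earlier in this thread.

```python
import sqlite3

# In-memory stand-in for the Jukebox genre table (sample rows are made up).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE genre (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO genre (id, name) VALUES (?, ?)",
    [(17, "Rock"), (142, "11"), (143, "Alt. Rock")],
)


def numeric_genres(conn):
    """Return (id, name) rows whose name consists only of digits."""
    rows = conn.execute("SELECT id, name FROM genre ORDER BY id").fetchall()
    return [(gid, name) for gid, name in rows if name.isdigit()]


print(numeric_genres(conn))  # [(142, '11')]
```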

     
  • Lewis Jardine

    Lewis Jardine - 2005-08-28

    Logged In: YES
    user_id=436862

    Regarding storing and comparing strings, we have to: it's
    part of the ID3v2 spec*. Were we to populate the genre table
    from the spec and make it unmodifiable, there'd be no way to
    represent an mp3 with an ID3v2 genre that is not present in
    the ID3v2 list of id3v1 genres (which, I note, is missing
    the Winamp genres; Primus fans will be disappointed), or if
    we break/extend the standard, the Winamp genre list.

    *TCON
    The 'Content type', which ID3v1 was stored as a one byte numeric
    value only, is now a string. You may use one or several of the ID3v1
    types as numerical strings, or, since the category list would be
    impossible to maintain with accurate and up to date categories,
    define your own. Example: "21" $00 "Eurodisco" $00
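
Read literally, the 2.4.0 example is a run of NUL-separated strings. A minimal Python sketch of splitting such a value, assuming the frame text has already been decoded to a string; `split_tcon_v24` is an illustrative name, not an existing API.

```python
def split_tcon_v24(text):
    """Split a decoded ID3v2.4 TCON value on its NUL separators."""
    parts = [p for p in text.split("\x00") if p]
    # Numeric strings are ID3v1 references; anything else is free text.
    return [int(p) if p.isdecimal() else p for p in parts]


print(split_tcon_v24("21\x00Eurodisco\x00"))  # [21, 'Eurodisco']
```

The free-text entries are exactly why the genre table cannot be closed to additions if the spec is followed.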

     
  • William Skellenger

    Logged In: YES
    user_id=631721

    Thanks for researching this.

    You're right. From reading the below, it seems that Winamp
    genres are not accepted at all, if ID3 is the de-facto
    standard. Numerically you're allowed to use ID3v1 genres,
    but nothing else. In the future, if I read this correctly,
    *anyone* is allowed to define whatever string they want,
    giving it *whatever* ID they want! How in the hell will
    this work?

    Operating purely from the specification, the genre table
    will no longer hold much value. If I receive a MP3 with a
    user-defined genre, and receive another one attempting to
    define a different genre with the same ID, how should this
    be handled? I don't like the idea of simply plopping the
    names into the table...

    I've written several MS-Access databases that are used daily
    in my place of work... Give the user the opportunity: "This
    item isn't in the list, would you like to add it?" Most
    will lazily add the item to the list. What results is a
    genre table that looks like this, years later:

    Rock & Roll
    Rock-n-Roll
    rock&roll
    Rock
    rockin'
    Trash Metal
    Thrash Metal
    Metallica
    Primus

    I followed up with your own research. You were reading the
    2.4.0 spec. Look at the 2.4.0-changes.txt document:

    This document describes the changes between ID3v2.3.0
    [ID3v2.3.0] and ID3v2.4.0
    [ID3v2.4.0-strct][ID3v2.4.0-frames]. This document does not
    claim to be complete nor correct.

    Here is the same section you posted, albeit from the 2.3.0
    document:

    ---------------
    TCON

    The 'Content type', which previously was stored as a one
    byte numeric value only, is now a numeric string. You may
    use one or several of the types as ID3v1.1 did or, since the
    category list would be impossible to maintain with accurate
    and up to date categories, define your own.

    References to the ID3v1 genres can be made by, as first
    byte, enter "(" followed by a number from the genres list
    (appendix A.) and ended with a ")" character. This is
    optionally followed by a refinement, e.g. "(21)" or
    "(4)Eurodisco". Several references can be made in the same
    frame, e.g. "(51)(39)". If the refinement should begin with
    a "(" character it should be replaced with "((", e.g. "((I
    can figure out any genre)" or "(55)((I think...)". The
    following new content types is defined in ID3v2 and is
    implemented in the same way as the numerig content types,
    e.g. "(RX)".

    RX Remix
    CR Cover

    -------------------

    Notice that here, it's indicated that you may specify
    SEVERAL genres to a single song, or if you can't match to
    the existing ID3V1 list, just enter whatever you want.
    Using two parentheses (( at the beginning of your definition
    indicates a custom entry. (??)
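
Here is how I read the 2.3.0 rules, as a small Python sketch. It is my interpretation of the quoted text, not a tested reference parser, and `parse_tcon_v23` is a made-up name. It collects the leading "(n)", "(RX)" and "(CR)" references and treats the rest as the refinement, turning a leading "((" back into "(".

```python
import re

# One reference: "(" + digits, RX or CR + ")".
_REF = re.compile(r"\((\d+|RX|CR)\)")


def parse_tcon_v23(text):
    """Return (references, refinement) for an ID3v2.3-style TCON string."""
    refs = []
    pos = 0
    while True:
        m = _REF.match(text, pos)
        if m is None:
            break
        token = m.group(1)
        refs.append(int(token) if token.isdigit() else token)
        pos = m.end()
    refinement = text[pos:]
    if refinement.startswith("(("):
        refinement = refinement[1:]  # "((" is an escaped literal "("
    return refs, refinement


for example in ["(21)", "(4)Eurodisco", "(51)(39)", "(55)((I think...)", "(RX)"]:
    print(example, "->", parse_tcon_v23(example))
```

Under this reading, "((" does not mark a custom entry by itself. It only escapes a refinement that happens to start with a parenthesis.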

    Perhaps the "songs" table should take the TCON frame,
    verbatim, and store it in the songs table. Rely on other
    libraries (MP3::Info) to interpret it based on accepted
    formulae.

     
  • Paradxum

    Paradxum - 2005-08-28

    Logged In: YES
    user_id=596175

    heheh, it seems like we're getting into a "whose standard is
    standard" type of debate... honestly, that's one of the
    reasons I wrote it as I did.

    Ok, I agree that the "standard" id3 numeric id should be
    stored in the genre table.

    my response to a few supposed problems....
    --- "it means that the db can't be used "properly" as a back
    end for anything else"

    has anyone ever done a join in their life....:)

    --- "re-write the genre table and *not* allow user additions"

    all I can say is "it's not a bug, it's a feature".... I'm
    sorry but this idea just seems inane to me.

    Now, I do think that the id3idNum should be stored in a
    column as we've discussed. Simply for the interoperability
    issues. I never have a problem storing more data... today
    storage is cheap and the database is fast. So add a column
    and move on to more important issues.... If someone can't do
    a join on a table then they need to learn sql. If a program
    doesn't support a configurable sql query then patch their
    program so it does.

    Sorry for my harsh attitude, I probably should re-write my
    comment, but I'm not in the mood to be political. I guess
    I'm just tired of expending energy on an issue that I see as
    being pretty small and easy to fix (without removing features).

     
  • Paradxum

    Paradxum - 2005-08-28
    • priority: 5 --> 1
     
  • William Skellenger

    Logged In: YES
    user_id=631721

    > I'm just tired of expending energy on an issue that I see
    as being pretty small and easy to fix (without removing
    features).

    Toyota Principle #13: "Make decisions slowly by consensus,
    thoroughly considering all options; implement rapidly."

    Sure, after/if the new field is added, a JOIN would work
    well. However, I still see this as a design flaw, as the
    songs table just simply lists "genre" and specifies an ID.
    The join isn't that big of a deal (if this field existed for
    anyone except myself), it's just not clean.
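
For reference, the join in question would look roughly like this. The sketch uses Python's sqlite3 with an in-memory database in place of MySQL. The `songs` columns (`title`, and a `genre` column holding the genre table's id) are guesses at the schema, and the rows are invented.

```python
import sqlite3

# In-memory stand-in; the songs schema and all rows are assumptions.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE genre (id INTEGER PRIMARY KEY, name TEXT, id3_map INTEGER);
    CREATE TABLE songs (id INTEGER PRIMARY KEY, title TEXT, genre INTEGER);
    INSERT INTO genre VALUES (57, 'Blues', 0), (162, 'My Custom Genre', NULL);
    INSERT INTO songs VALUES (1, 'First song', 57), (2, 'Second song', 162);
""")

# Resolve each song's local genre id to its name and ID3 number.
rows = conn.execute("""
    SELECT songs.title, genre.name, genre.id3_map
    FROM songs JOIN genre ON songs.genre = genre.id
    ORDER BY songs.id
""").fetchall()
print(rows)
```

Songs tagged with a user-defined genre come back with no ID3 number (None here, NULL in SQL), so an external player would have to fall back to the name.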

     
  • William Skellenger

    Logged In: YES
    user_id=631721

    Today using import.pl:
    --------------------------------------------
    Adding: <some artist> - <title> - <album> - <path>
    Adding Genre Unclassifiable
    Adding: <some artist> - <title> - <album> - <path>
    Adding Genre General Unclassifiable
    --------------------------------------------

    Checking genre table, I now have two new and very useful genres:

    | 162 | Unclassifiable | NULL |
    | 163 | General Unclassifiable | NULL |

     
  • William Skellenger

    Logged In: YES
    user_id=631721

    Looked at both files with mp3info.

    The first contains:
    Genre: [255]
    And was marked as "Unclassifiable"

    The other contains:
    Genre: Pop [13]
    And was marked as "General Unclassifiable"

    This must be the new version of MP3::Info.

     
