From: Joel S. <jsf...@jk...> - 2012-11-09 04:56:39
|
Hi all, I have added some patches to the Atomic Parsley wrapper to get support for reading MP4 Quicktime-style chapters and reading/writing some missing tags, such as the sorting tags. Additionally, I have some minor bugfixes for reading the compilation flag (which should fix http://gtkpod.org/bugs/index.php?do=details&task_id=42 I think) and reading the media type tag from MP4 files. I have tested the code with my MP4 audiobooks, but most of my files are more-or-less the same in terms of atom layout, etc, so if anybody has some files that break my code, I'd be happy to take a look and fix the code to accommodate other variations. You can find the patches in my sandbox here: https://gitorious.org/~joelsmith/gtkpod/joelsmiths-sandbox/commits/master There are 5 patches, all pushed 2012/11/09 @ 05:08 UTC. Regards, Joel |
From: phantomjinx <p.g...@ph...> - 2012-11-10 21:35:10
|
On 11/09/2012 04:29 AM, Joel Smith wrote: > Hi all, > I have added some patches to the Atomic Parsley wrapper to get support > for reading MP4 Quicktime-style chapters and reading/writing some > missing tags, such as the sorting tags. > > Additionally, I have some minor bugfixes for reading the compilation > flag (which should fix > http://gtkpod.org/bugs/index.php?do=details&task_id=42 I think) and > reading the media type tag from MP4 files. > > I have tested the code with my MP4 audiobooks, but most of my files are > more-or-less the same in terms of atom layout, etc, so if anybody has > some files that break my code, I'd be happy to take a look and fix the > code to accommodate other variations. > > You can find the patches in my sandbox here: > https://gitorious.org/~joelsmith/gtkpod/joelsmiths-sandbox/commits/master > There are 5 patches, all pushed 2012/11/09 @ 05:08 UTC. > > Regards, > Joel Hi Joel, When you are ready can you push your patches onto sf. I would like to roll them into the 2.1.3 release. Any idea whether your patches would solve [1]? Cheers PGR [1] https://sourceforge.net/tracker/index.php?func=detail&aid=3575205&group_id=67873&atid=519273 -- Paul Richardson * p.g...@ph... * p.g...@re... * pgr...@li... "I know exactly who reads the papers ... * The Daily Mirror is read by people who think they run the country. * The Guardian is read by people who think they ought to run the country. * The Times is read by people who do actually run the country. * The Daily Mail is read by the wives of the people who run the country. * The Financial Times is read by the people who own the country. * The Morning Star is read by the people who think the country ought to be run by another country. * The Daily Telegraph is read by the people who think it is." Jim Hacker, Yes Minister |
From: Joel S. <jsf...@jk...> - 2012-11-11 02:07:34
|
On 11/10/2012 02:34 PM, phantomjinx wrote: > Hi Joel, > > When you are ready can you push your patches onto sf. I would like to roll them into the 2.1.3 release. > > Any idea whether your patches would solve [1]? > > Cheers > > PGR > > [1] https://sourceforge.net/tracker/index.php?func=detail&aid=3575205&group_id=67873&atid=519273 > > Hi PGR, I had forgotten that I had commit access on sf. I'll push soon. Regarding [1] above, I had tested all the tags and not found any problems other than the ones I fixed, but after reading the bug report, I looked into it. It looks like that's an Atomic Parseley bug where if you have a tempo above 99 BPM, it won't fit into the malloc'd buffer that sprintf() is using for its output. (The buffer is size 4 and after the \n and the \0, any tempo of more than 2 digits will overflow the buffer). I'll make a fix and push that too. Regards, Joel |
From: Daniele F. <df...@gm...> - 2012-11-11 10:09:40
|
2012/11/9 Joel Smith: > I have added some patches to the Atomic Parsley wrapper to get support > for reading MP4 Quicktime-style chapters and reading/writing some > missing tags, such as the sorting tags. I don't know that code but in this commit https://gitorious.org/~joelsmith/gtkpod/joelsmiths-sandbox/commit/6157a8c10646ff25c98651755c6dbbafa63170ba you are using g_strdup() while existing code uses charset_to_utf8() exisitng code: 182 // MP4 Title 183 value = find_atom_value(TITLE); 184 if (value) { 185 track->title = charset_to_utf8(value); 186 free(value); 187 } new code: 383 // MP4 Sort Title 384 value = find_atom_value(SORT_TITLE); 385 if (value) { 386 track->sort_title = g_strdup(value); 387 free(value); 388 } if that's ok can you please explain why with a comment in the code? -- Daniele Forsi |
From: Joel S. <jsf...@jk...> - 2012-11-13 01:44:24
|
On 11/11/2012 03:09 AM, Daniele Forsi wrote: > I don't know that code but in this commit > https://gitorious.org/~joelsmith/gtkpod/joelsmiths-sandbox/commit/6157a8c10646ff25c98651755c6dbbafa63170ba > > you are using g_strdup() while existing code uses charset_to_utf8() > > exisitng code: > 182 // MP4 Title > 183 value = find_atom_value(TITLE); > 184 if (value) { > 185 track->title = charset_to_utf8(value); > 186 free(value); > 187 } > > new code: > 383 // MP4 Sort Title > 384 value = find_atom_value(SORT_TITLE); > 385 if (value) { > 386 track->sort_title = g_strdup(value); > 387 free(value); > 388 } > > if that's ok can you please explain why with a comment in the code? I think the existing code is wrong. MP4 tags are already stored as UTF-8 (or at least they're supposed to be). And AP doesn't do any further conversions before returning the strings to us, so we shouldn't be trying to convert an already UTF-8 string from local charset to UTF-8. If run with a non-UTF-8 charset locale, that would corrupt the data. For example, with ISO-8859-1, all but the lower 128 char values would get corrupted. Funny that I didn't even notice the other ones doing it wrong. Thanks for reviewing the patch so we could catch the other bug. I'll commit a fix. I'm no expert on i18n, charsets, etc, so maybe we'll end up reverting my change right away, but that's how I see it. Regards, Joel |
From: Javier K. <jk...@gm...> - 2012-11-13 09:10:43
|
On Tue, Nov 13, 2012 at 2:44 AM, Joel Smith <jsf...@jk...>wrote: > On 11/11/2012 03:09 AM, Daniele Forsi wrote: > > I don't know that code but in this commit > > > https://gitorious.org/~joelsmith/gtkpod/joelsmiths-sandbox/commit/6157a8c10646ff25c98651755c6dbbafa63170ba > > > > you are using g_strdup() while existing code uses charset_to_utf8() > > > > exisitng code: > > 182 // MP4 Title > > 183 value = find_atom_value(TITLE); > > 184 if (value) { > > 185 track->title = charset_to_utf8(value); > > 186 free(value); > > 187 } > > > > new code: > > 383 // MP4 Sort Title > > 384 value = find_atom_value(SORT_TITLE); > > 385 if (value) { > > 386 track->sort_title = g_strdup(value); > > 387 free(value); > > 388 } > > > > if that's ok can you please explain why with a comment in the code? > > I think the existing code is wrong. MP4 tags are already stored as > UTF-8 (or at least they're supposed to be). And AP doesn't do any > further conversions before returning the strings to us, so we shouldn't > be trying to convert an already UTF-8 string from local charset to > UTF-8. If run with a non-UTF-8 charset locale, that would corrupt the > data. For example, with ISO-8859-1, all but the lower 128 char values > would get corrupted. Funny that I didn't even notice the other ones > doing it wrong. Thanks for reviewing the patch so we could catch the > other bug. I'll commit a fix. I'm no expert on i18n, charsets, etc, so > maybe we'll end up reverting my change right away, but that's how I see it. > I'm no expert in i18n and Unicode either, but I have some experience. I can review the change if you'd like. |
From: Jan K. <JKubitschek@JakPod.de> - 2012-11-13 14:12:48
|
Hi all, has someone discovered something about the new iPhone stuff? As far as I can see the naming scheme for artwork files has changed - and seems (on a first view) not to be related to anything inside the iTunesCDB or MediaLibrary.sqlitedb. If I can help with some files or anything else from the iPhone, or if there is somewhere a better place to discuss those things - please let me know. Kind Regards, Jan. ------------------------------ JakPod - the iPod-manager http://www.jakpod.de |
From: Joel S. <jsf...@jk...> - 2012-11-16 03:39:41
|
On 11/13/2012 02:10 AM, Javier Kohen wrote: > > On Tue, Nov 13, 2012 at 2:44 AM, Joel Smith <jsf...@jk... > <mailto:jsf...@jk...>> wrote: > > On 11/11/2012 03:09 AM, Daniele Forsi wrote: > > I don't know that code but in this commit > > > https://gitorious.org/~joelsmith/gtkpod/joelsmiths-sandbox/commit/6157a8c10646ff25c98651755c6dbbafa63170ba > <https://gitorious.org/%7Ejoelsmith/gtkpod/joelsmiths-sandbox/commit/6157a8c10646ff25c98651755c6dbbafa63170ba> > > > > you are using g_strdup() while existing code uses charset_to_utf8() > > > > exisitng code: > > 182 // MP4 Title > > 183 value = find_atom_value(TITLE); > > 184 if (value) { > > 185 track->title = charset_to_utf8(value); > > 186 free(value); > > 187 } > > > > new code: > > 383 // MP4 Sort Title > > 384 value = find_atom_value(SORT_TITLE); > > 385 if (value) { > > 386 track->sort_title = g_strdup(value); > > 387 free(value); > > 388 } > > > > if that's ok can you please explain why with a comment in the code? > > I think the existing code is wrong. MP4 tags are already stored as > UTF-8 (or at least they're supposed to be). And AP doesn't do any > further conversions before returning the strings to us, so we > shouldn't > be trying to convert an already UTF-8 string from local charset to > UTF-8. If run with a non-UTF-8 charset locale, that would corrupt the > data. For example, with ISO-8859-1, all but the lower 128 char values > would get corrupted. Funny that I didn't even notice the other ones > doing it wrong. Thanks for reviewing the patch so we could catch the > other bug. I'll commit a fix. I'm no expert on i18n, charsets, etc, so > maybe we'll end up reverting my change right away, but that's how > I see it. > > > I'm no expert in i18n and Unicode either, but I have some experience. > I can review the change if you'd like. > The more eyes, the the better. Here's the patch: http://gtkpod.git.sourceforge.net/git/gitweb.cgi?p=gtkpod/gtkpod;a=commitdiff;h=0fd91b5fe7e93e5b2409b7b5a36e9dd052042237 My reasoning is that find_atom_value() calls APar_ExtractDataAtom() which for text types, simply seeks to an offset in the MP4 file and reads the atom value data into the return buffer, without doing any kind of charset conversion. Since MP4 atom text values are supposed to be stored as UTF-8, we should just copy these UTF-8 strings directly, rather than calling charset_to_utf8() on them. Regards, Joel |
From: Javier K. <jk...@gm...> - 2012-11-16 06:26:28
|
On Fri, Nov 16, 2012 at 4:39 AM, Joel Smith <jsf...@jk...>wrote: > On 11/13/2012 02:10 AM, Javier Kohen wrote: > > > On Tue, Nov 13, 2012 at 2:44 AM, Joel Smith <jsf...@jk...>wrote: > >> On 11/11/2012 03:09 AM, Daniele Forsi wrote: >> > I don't know that code but in this commit >> > >> https://gitorious.org/~joelsmith/gtkpod/joelsmiths-sandbox/commit/6157a8c10646ff25c98651755c6dbbafa63170ba >> > >> > you are using g_strdup() while existing code uses charset_to_utf8() >> > >> > exisitng code: >> > 182 // MP4 Title >> > 183 value = find_atom_value(TITLE); >> > 184 if (value) { >> > 185 track->title = charset_to_utf8(value); >> > 186 free(value); >> > 187 } >> > >> > new code: >> > 383 // MP4 Sort Title >> > 384 value = find_atom_value(SORT_TITLE); >> > 385 if (value) { >> > 386 track->sort_title = g_strdup(value); >> > 387 free(value); >> > 388 } >> > >> > if that's ok can you please explain why with a comment in the code? >> >> I think the existing code is wrong. MP4 tags are already stored as >> UTF-8 (or at least they're supposed to be). And AP doesn't do any >> further conversions before returning the strings to us, so we shouldn't >> be trying to convert an already UTF-8 string from local charset to >> UTF-8. If run with a non-UTF-8 charset locale, that would corrupt the >> data. For example, with ISO-8859-1, all but the lower 128 char values >> would get corrupted. Funny that I didn't even notice the other ones >> doing it wrong. Thanks for reviewing the patch so we could catch the >> other bug. I'll commit a fix. I'm no expert on i18n, charsets, etc, so >> maybe we'll end up reverting my change right away, but that's how I see >> it. >> > > I'm no expert in i18n and Unicode either, but I have some experience. I > can review the change if you'd like. > > > The more eyes, the the better. Here's the patch: > > http://gtkpod.git.sourceforge.net/git/gitweb.cgi?p=gtkpod/gtkpod;a=commitdiff;h=0fd91b5fe7e93e5b2409b7b5a36e9dd052042237 > > My reasoning is that find_atom_value() calls APar_ExtractDataAtom() which > for text types, simply seeks to an offset in the MP4 file and reads the > atom value data into the return buffer, without doing any kind of charset > conversion. Since MP4 atom text values are supposed to be stored as UTF-8, > we should just copy these UTF-8 strings directly, rather than calling > charset_to_utf8() on them. > Sounds good to me. Here's what the official site<http://atomicparsley.sourceforge.net/>says on this topic: > With the exception of artwork, all iTunes-style tags support only 1 piece > of metadata - you can't have for example 3 title tags. 3gp assets and > copyright notices support multiple tags, but they must differ in the > language setting for that tag. You can have for example 3 title tags, but 1 > in english, 1 in spanish, 1 in undefined. *All iTunes-style text metadata > is always in UTF-8*; 3gp assets & copyright notices can be in set in > UTF-8 or UTF-16. *All strings are converted internally to UTF-8*, > converting as necessary. On *nix platforms, input is in UTF-8; the native > Windows port supports full UTF-16 input. |