From: SourceForge.net <no...@so...> - 2004-11-29 17:56:43
Bugs item #1075421, was opened at 2004-11-29 09:56
Message generated for change (Tracker Item Submitted) made by Item Submitter
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=109655&aid=1075421&group_id=9655

Category: None
Group: None
Status: Open
Resolution: None
Priority: 5
Submitted By: Nobody/Anonymous (nobody)
Assigned to: Nobody/Anonymous (nobody)
Summary: UTF-8 encoding in meta info

Initial Comment:
I'm trying to use libxine, and I'm using this function to read meta info:

const char *xine_get_meta_info (xine_stream_t *stream, int info)

According to xine.h the output should be UTF-8 encoded, but this is not the case. The output is ISO-8859-1 encoded.

When using it on MP3 files with ASCII tags, the output is ASCII.
When using it on MP3 files with Unicode meta tags, the output is in an unknown encoding (some garbage output).

The same thing happens with CDDB entries: when reading US-ASCII encoded data from freedb.org the output is good, and when reading UTF-8 encoded data from freedb.org the output is a lot of question marks (?????????).

So why does xine.h claim the output is UTF-8 encoded?

----------------------------------------------------------------------
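One way for a frontend to check the claim in xine.h is to test whether the bytes returned by xine_get_meta_info() are well-formed UTF-8 before displaying them. The following is a minimal sketch in C. `is_valid_utf8` is a hypothetical helper, not part of libxine, and it only checks well-formedness: it cannot tell which legacy encoding a string that fails the check is actually in.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper (not part of libxine): returns true if the
 * NUL-terminated string s is well-formed UTF-8. Pure ASCII passes too. */
bool is_valid_utf8(const char *s)
{
    const unsigned char *p = (const unsigned char *)s;

    while (*p) {
        unsigned int cp;   /* decoded code point */
        unsigned int min;  /* smallest code point allowed for this length */
        int extra;         /* number of continuation bytes expected */

        if (*p < 0x80) { p++; continue; }  /* plain ASCII byte */
        else if ((*p & 0xE0) == 0xC0) { cp = *p & 0x1F; extra = 1; min = 0x80; }
        else if ((*p & 0xF0) == 0xE0) { cp = *p & 0x0F; extra = 2; min = 0x800; }
        else if ((*p & 0xF8) == 0xF0) { cp = *p & 0x07; extra = 3; min = 0x10000; }
        else return false;  /* stray continuation byte or invalid lead byte */

        p++;
        while (extra-- > 0) {
            /* a terminating NUL is not a continuation byte, so this
             * check also stops us from reading past the end */
            if ((*p & 0xC0) != 0x80) return false;
            cp = (cp << 6) | (*p & 0x3F);
            p++;
        }

        /* reject overlong forms, UTF-16 surrogates and out-of-range values */
        if (cp < min || cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
            return false;
    }
    return true;
}
```

A frontend could call this on any non-NULL result of xine_get_meta_info() and, when it returns false, fall back to interpreting the bytes as ISO-8859-1 or the locale encoding. That fallback is a guess, which is exactly the limitation discussed in the rest of this thread.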
From: SourceForge.net <no...@so...> - 2004-11-29 20:47:51
Bugs item #1075421, was opened at 2004-11-29 18:56
Message generated for change (Comment added) made by mroi
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=109655&aid=1075421&group_id=9655

Status: Open
Resolution: None
Summary: UTF-8 encoding in meta info

----------------------------------------------------------------------

>Comment By: Michael Roitzsch (mroi)
Date: 2004-11-29 21:47

Message:
Logged In: YES
user_id=552060

The UTF-8 support is only best effort. Sorry.

I am not familiar with either ID3 tags or the CDDB protocol. If the format does contain information about the encoding used, we should indeed convert the string to UTF-8 in xine-lib. If it does not contain such information, this is simply a design limitation of the format which we cannot fix. I don't think guessing the encoding would be a good idea.

Michael

----------------------------------------------------------------------
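When a format does declare its encoding, the conversion mentioned above is mechanical. In ID3v2, for example, each text frame starts with an encoding byte, and the value 0 means ISO-8859-1; ID3v1 has no encoding field at all. The sketch below shows only the ISO-8859-1 case. `latin1_to_utf8` is a hypothetical helper written for illustration, not code from xine-lib.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper (not xine-lib code): converts a NUL-terminated
 * ISO-8859-1 string to a newly allocated UTF-8 string.
 * The caller frees the result. Returns NULL if allocation fails. */
char *latin1_to_utf8(const char *in)
{
    size_t len = strlen(in);
    /* worst case: every input byte expands to two output bytes */
    char *out = malloc(2 * len + 1);
    char *q = out;

    if (!out)
        return NULL;

    for (const unsigned char *p = (const unsigned char *)in; *p; p++) {
        if (*p < 0x80) {
            /* ASCII is identical in both encodings */
            *q++ = (char)*p;
        } else {
            /* U+0080..U+00FF become a two-byte sequence 110xxxxx 10xxxxxx */
            *q++ = (char)(0xC0 | (*p >> 6));
            *q++ = (char)(0x80 | (*p & 0x3F));
        }
    }
    *q = '\0';
    return out;
}
```

Because every ISO-8859-1 byte maps directly to the Unicode code point with the same value, no lookup table is needed. Other declared encodings (such as the UTF-16 variants that ID3v2 also allows) would need a real conversion library such as iconv.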
From: SourceForge.net <no...@so...> - 2004-11-29 23:14:06
Bugs item #1075421, was opened at 2004-11-29 09:56
Message generated for change (Comment added) made by nobody
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=109655&aid=1075421&group_id=9655

Status: Open
Resolution: None
Summary: UTF-8 encoding in meta info

----------------------------------------------------------------------

Comment By: Nobody/Anonymous (nobody)
Date: 2004-11-29 15:14

Message:
Logged In: NO

OK. I know you can't guess the encoding.

So why can't you output the char* that was in the id3tag or cddb entry without encoding at all? Output it transparently and the backend will decide what to do with it.

As it is, xine breaks the char* string along the way, and there is no way to retrieve the input. The input is just a string of chars.

Eventually UTF-8 encoding will be the standard.

----------------------------------------------------------------------
From: SourceForge.net <no...@so...> - 2004-11-30 15:00:19
Bugs item #1075421, was opened at 2004-11-29 18:56
Message generated for change (Comment added) made by mroi
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=109655&aid=1075421&group_id=9655

>Status: Pending
Resolution: None
Summary: UTF-8 encoding in meta info

----------------------------------------------------------------------

>Comment By: Michael Roitzsch (mroi)
Date: 2004-11-30 16:00

Message:
Logged In: YES
user_id=552060

> So why can't you output the char* that was in id3tag, or
> cddb entry without encoding at all.

I just looked at xine-lib/src/input/input_cdda.c and xine-lib/src/demuxers/id3.c and found that we are currently doing exactly that: Pass the string unmodified.

> eventually UTF-8 encoding will be the standard.

Hopefully.

Michael

----------------------------------------------------------------------
From: SourceForge.net <no...@so...> - 2004-11-30 16:46:55
Bugs item #1075421, was opened at 2004-11-29 09:56
Message generated for change (Comment added) made by nobody
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=109655&aid=1075421&group_id=9655

Status: Pending
Resolution: None
Summary: UTF-8 encoding in meta info

----------------------------------------------------------------------

Comment By: Nobody/Anonymous (nobody)
Date: 2004-11-30 08:46

Message:
Logged In: NO

> I just looked at xine-lib/src/input/input_cdda.c and
> xine-lib/src/demuxers/id3.c and found that we are currently
> doing exactly that: Pass the string unmodified.

Hi again,

I also looked at input_cdda.c and found no problem, but I don't know why the title, artist and album become question marks instead of UTF-8 characters (one question mark for every UTF-8 character).

I looked at the file retrieved from freedb.org and found question marks there too. The file was in the cache: ".xine/cddbcache/"

Maybe there is a conversion somewhere (unsigned to signed or something).

I tested the problem, and it appeared in xine-ui, Totem and Kaffeine. However, when I used Grip I saw the correct encoding.

Thanks for your help :-)

----------------------------------------------------------------------
From: SourceForge.net <no...@so...> - 2004-11-30 18:57:32
Bugs item #1075421, was opened at 2004-11-29 09:56
Message generated for change (Comment added) made by nobody
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=109655&aid=1075421&group_id=9655

Status: Pending
Resolution: None
Summary: UTF-8 encoding in meta info

----------------------------------------------------------------------

Comment By: Nobody/Anonymous (nobody)
Date: 2004-11-30 10:57

Message:
Logged In: NO

I found the cause of the question marks.

freedb has protocol levels that the client selects when connecting. xine uses protocol level 1, and in order to support UTF-8 it should use protocol level 6.

You can read all about this at:
http://www.freedb.org/modules.php?name=Sections&sop=viewarticle&artid=28

Until then, UTF-8 will not be supported in CDDB titles.

----------------------------------------------------------------------
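As far as the CDDB protocol documentation describes it, a client raises the protocol level with a `proto` command sent after the `cddb hello` handshake (over HTTP the level is passed as a `proto=` parameter instead). The sketch below only formats the CDDBP command line and interprets the status code of the reply; socket handling is left out. The function names are hypothetical and not taken from input_cdda.c, and the reply codes (201 for "level changed", 502 for "already at that level") and the bare "\n" line terminator are assumptions that should be checked against the freedb document linked above.

```c
#include <stdio.h>
#include <stdlib.h>

/* Per this thread: the protocol level needed for UTF-8 data. */
#define CDDB_PROTO_UTF8 6

/* Hypothetical helper: writes "proto <level>\n" into buf.
 * Returns 0 on success, -1 if buf is too small or formatting fails. */
int cddb_format_proto_cmd(char *buf, size_t size, int level)
{
    int n = snprintf(buf, size, "proto %d\n", level);
    return (n < 0 || (size_t)n >= size) ? -1 : 0;
}

/* Hypothetical helper: returns 1 if the server reply indicates that the
 * connection is now at the requested level, 0 otherwise.
 * Assumed codes: 201 = level changed, 502 = already at that level. */
int cddb_proto_accepted(const char *reply)
{
    long code = strtol(reply, NULL, 10);
    return code == 201 || code == 502;
}
```

A client would send the formatted command once per connection, before issuing `cddb query` and `cddb read`, and keep treating the data as non-UTF-8 if the server does not accept the requested level.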
From: SourceForge.net <no...@so...> - 2004-11-30 21:17:53
Bugs item #1075421, was opened at 2004-11-29 18:56
Message generated for change (Comment added) made by mroi
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=109655&aid=1075421&group_id=9655

>Status: Open
Resolution: None
Summary: UTF-8 encoding in meta info

----------------------------------------------------------------------

>Comment By: Michael Roitzsch (mroi)
Date: 2004-11-30 22:17

Message:
Logged In: YES
user_id=552060

That's interesting. Someone should implement this in xine. Unfortunately, I don't have the time, so I hope someone else is interested in improving xine's CDDB support.

----------------------------------------------------------------------
From: SourceForge.net <no...@so...> - 2005-01-26 13:30:46
Bugs item #1075421, was opened at 2004-11-29 18:56
Message generated for change (Comment added) made by mroi
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=109655&aid=1075421&group_id=9655

>Status: Closed
>Resolution: Fixed
Summary: UTF-8 encoding in meta info

----------------------------------------------------------------------

>Comment By: Michael Roitzsch (mroi)
Date: 2005-01-26 14:30

Message:
Logged In: YES
user_id=552060

The CDDA plugin has been updated to use CDDB protocol level 6 as you suggested. Thanks for the pointer.

I hope you agree to closing this bug. If not, please reopen it.

Michael

----------------------------------------------------------------------