From: Jason T. <ta...@ur...> - 2006-04-26 13:04:40
On Wed, 2006-04-26 at 10:54 +0200, Dirk Meyer wrote:
> Because you broke it. You last check ins are:

Now now, let's not hastily point fingers. :)

> | ctxt = libxml2.createFileParserCtxt(file.name)
> | # Silence parse errors
> | ctxt.setErrorHandler(lambda *args: None, None)
> | ctxt.parseDocument()
> | tag = ctxt.doc().children.name
>
> libxml2 returns UTF-8, that is bad. My xml.pt wrapper in kaa.base
> always returns unicode and you need to use that

First off, this has nothing to do with eyed3. This is the xml file
parser. And you're also reading the code out of context. It doesn't
matter if it returns utf-8 or latin-1.

> | if isinstance(fileName, str):
> |     self.name = unicode(fileName, sys.getfilesystemencoding());
> | else:
> |     self.name = fileName;
> | self.name = fileName
>
> I don't know the code, it is eyed3 what we only use, but you changed
> the part where name is unicode to make it string.

name refers to filename. It is broken to treat it as unicode. And
indeed this fixes a problem where I have a latin1-encoded filename on my
filesystem where the default encoding is utf-8.

The problem I'm observing is in the id3 fields, like artist and title.

> If this two parts are not the problem, please send my a file with the
> bug. One side note: if a problem happens with mp3 files created my
> grip, it is not something kaa should handle. grip uses UTF-8 in the id
> tags, but sets the encoding of the file to latin-1.

Ok, it could very well be that grip was used to encode this mp3. I'll
see if I can find out. But have a look at this mminfo code:

    if medium:
        print unicode(medium).encode('latin-1', 'replace')

My charset is UTF-8. You're encoding the output as latin-1, and yet
things look correct printed on my screen. So my assumption was that
you've got UTF-8 codes inside a unicode object. Maybe that assumption
is wrong, but something here doesn't look quite right to me.

Ping me on IRC and I'll get you the mp3 in question.

Jason.
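
For illustration only: a minimal sketch of the "UTF-8 codes inside a
unicode object" assumption described above, written in Python 3 syntax
rather than the Python 2 used in the thread. The sample title is made up
and is not taken from the mp3 in question; the variable names are
hypothetical and do not come from kaa, mminfo, or eyed3.

```python
# Hypothetical tag value, chosen only because it has a non-ASCII letter.
title = "Mot\u00f6rhead"

# Bytes a tagger would store if it writes UTF-8 into the tag
# (while labelling the field as latin-1).
raw = title.encode("utf-8")

# A reader that trusts the latin-1 label maps every byte to one code
# point, so the two UTF-8 bytes of the o-umlaut become two characters.
mislabeled = raw.decode("latin-1")

# Encoding that string as latin-1 (as the mminfo print does) maps each
# code point back to its byte, reproducing the original UTF-8 bytes.
printed = mislabeled.encode("latin-1", "replace")

# For comparison: a correctly decoded title encoded as latin-1 gives a
# single byte for the o-umlaut, which is not valid UTF-8 on its own.
correct_as_latin1 = title.encode("latin-1", "replace")

print(printed == raw)            # the round trip is lossless
print(correct_as_latin1 == raw)  # the correct string yields other bytes
```

If the assumption holds, the bytes sent to a UTF-8 terminal are valid
UTF-8 and display correctly, which would match the behaviour seen here
even though the unicode object itself holds the wrong characters.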