Re: [Audacity-devel] non-ascii labels
From: Stuart <smc...@fr...> - 2006-08-21 15:18:50
"Leland" wrote:
[...snip...]
> > When I export the labels, then import them, they are imported
> > correctly. The labels file encoding also appears to be cp-932.
>
> We're just getting lucky with this one since we expect the start and end
> times to be in plain ASCII and allow the label text to be in Unicode.

With most (nearly all?) encodings, the ASCII letters, digits, and basic punctuation have the same values as in ASCII, I think, so if it's lucky, it's predictably lucky. But, yes, it would be better to do it right.

If by Unicode you mean UTF-8, I think that would be fine. Isn't it pretty straightforward to encode the output data before writing it to the file? (He says glibly without looking at the code. :-) If you mean UCS-2 or UCS-4, I think that would be less desirable due to byte order and BOM issues. And using the system default encoding (which would produce files byte-for-byte identical to those produced now in most (all?) cases) is not horrible either, although it does have portability issues.

> A couple of us were tossing around the idea of changing the label export to
> write XML files instead. But, we'll have to hold off on that until we
> figure out how we're going to handle Unicode in the various XML files we
> create.

In my current use of Audacity, the label file *is* the work product. I have tools that read the label files into a database, so that the labeled audio clips can be analyzed, played, correlated, and otherwise used by other tools. So I hope you don't consider label files to be purely a persistence mechanism for Audacity.

While I can process XML files without a big problem, it will mean including XML libraries, more complex code, etc. Not a huge burden, but the file structure of label files is so simple that I wonder if using XML is really worthwhile. (Of course you may have plans for introducing more capabilities and a more complex structure...)

I am surprised (probably an indication of my ignorance) that Unicode in XML presents a problem. Is that not why the default encoding for XML files is UTF-8? The alternative would be to use the default system encoding (double-byte cp932 on my system) as is done now but just include an encoding line in the XML file. But I admit to being totally mystified by double-byte vs. Unicode in Windows, so please take this comment more as an expression of my ignorance than a suggestion! :-)
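
To make the UTF-8 option concrete, here is a minimal Python sketch of writing and reading a label file with an explicit encoding. It is not Audacity's code. The function names (`format_label`, `write_labels`, `read_labels`) are hypothetical, and the one-label-per-line layout of start time, end time, and text separated by tabs is an assumption made for illustration.

```python
def format_label(start, end, text):
    """Return one label line: start<TAB>end<TAB>text (assumed layout)."""
    # The two time fields contain only ASCII digits and a period.
    return f"{start:.6f}\t{end:.6f}\t{text}\n"


def write_labels(path, labels):
    """Write (start, end, text) tuples, always encoding as UTF-8."""
    # Naming the encoding explicitly avoids depending on the system
    # default (cp932 on a Japanese Windows system, for example).
    with open(path, "w", encoding="utf-8", newline="\n") as f:
        for start, end, text in labels:
            f.write(format_label(start, end, text))


def read_labels(path):
    """Read a UTF-8 label file back into (start, end, text) tuples."""
    labels = []
    with open(path, "r", encoding="utf-8") as f:
        for line in f:
            line = line.rstrip("\n")
            if not line:
                continue  # skip blank lines
            # Split on the first two tabs only, so the label text may
            # itself contain tab characters.
            start, end, text = line.split("\t", 2)
            labels.append((float(start), float(end), text))
    return labels
```

A tool that loads labels into a database could call `read_labels("labels.txt")` and get the same tuples on any platform, because the encoding is fixed by the file format and not by the system's locale. The time fields come out as the same bytes under UTF-8 and cp932, which is the "predictably lucky" behavior described above. Only the label text differs between the two encodings.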