From: Chris M. <cj...@fr...> - 2008-02-07 20:17:12
On Feb 7, 2008, at 11:00 AM, Jim Balhoff wrote:

> On Feb 7, 2008, at 1:29 PM, Hilmar Lapp wrote:
>
>> On Feb 7, 2008, at 11:49 AM, Chris Mungall wrote:
>>
>>> I think the recommendation would be to avoid non-ascii where
>>> possible - most downstream consumers of obo files will react
>>> unpredictably
>>
>> I would venture to suggest that meanwhile we live in an age in
>> which i) most programming languages and libraries support
>> different character encodings perfectly fine (for example,
>> supporting a non-ASCII character encoding in Java is simply a
>> matter of passing in an additional argument to the file reader
>> constructor), and ii) in science we're collaborating globally.
>> Needing to tell collaborators how they should specify their
>> native-language names to fit the ASCII limitation doesn't feel
>> that good, frankly.
>>
>> Also, frankly, I would hate to have to entertain an argument of
>> OWL/RDF/XML vs OBO on the basis of character encoding support - my
>> take is that that argument should be unfounded.
>>
>> Maybe there is more involved than just putting an 'encoding' tag
>> into the header, but it sounds unlikely that it's difficult to
>> accommodate?
>
> I think it would be great to have something like
> "encoding: UTF-8" (or whatever encoding) in the document header.
> It could be optional, and UTF-8 could be standardly assumed if no
> encoding is specified. While I don't have a good idea of what the
> consequences for current ontologies would be if the OBO.jar parser
> started assuming UTF-8, I think it would be better than the current
> situation which I think depends on the OS and default charset a
> user is running.

OK, let's slate this for obof1.3

In the meantime you can start using this header now - it will be
ignored by existing parsers but should be roundtripped.

> Thanks,
> Jim
>
> ____________________________________________
> James P. Balhoff, Ph.D.
> National Evolutionary Synthesis Center
> 2024 West Main St., Suite A200
> Durham, NC 27705
> USA
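
For illustration only, here is a rough sketch of how a reader might honour the header proposed above: look for an "encoding:" line before the first stanza and fall back to UTF-8 when none is present. This is not how OBO.jar or any existing parser works. The function names are hypothetical, the tag name is simply the one suggested in this thread, and the sketch assumes the header itself is written in an ASCII-compatible charset (so it would not cope with UTF-16).

```python
import codecs
import re

# Hypothetical header tag, as suggested in this thread: "encoding: UTF-8".
ENCODING_TAG = re.compile(rb"^encoding:\s*(\S+)\s*$")


def detect_obo_encoding(raw: bytes, default: str = "utf-8") -> str:
    """Return the charset named in the document header, or the default."""
    for line in raw.splitlines():
        line = line.strip()
        if line.startswith(b"["):
            # First stanza reached, so the header is over.
            break
        match = ENCODING_TAG.match(line)
        if match:
            name = match.group(1).decode("ascii", errors="replace")
            # Raises LookupError if Python does not know this charset.
            codecs.lookup(name)
            return name
    return default


def read_obo_text(raw: bytes) -> str:
    """Decode the whole document using the declared (or default) charset."""
    return raw.decode(detect_obo_encoding(raw))
```

A caller would read the file in binary mode and pass the bytes to `read_obo_text`, which avoids depending on the OS default charset mentioned above.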