From: Christiaan H. <cmh...@gm...> - 2006-11-24 18:27:03
On 11/24/06, Adam R. Maxwell <ama...@ma...> wrote:
>
> On Nov 24, 2006, at 08:44, Christiaan Hofman wrote:
>
> > On 11/24/06, Adam R. Maxwell <ama...@ma...> wrote:
> >
> > On Nov 24, 2006, at 05:47, Christiaan Hofman wrote:
> >
> > > I can only think of one solution: always hand UTF-8 to the parser
> > > (or perhaps for some encodings). This has a problem though. First,
> > > the parser should always parse the data instead of the file. This
> > > means that the filepath for errors is wrong.
> >
> > And that's a useful feature I'd hate to lose. We could do this just
> > for certain encodings, though.
> >
> > I wonder which. Do we really need the original filepath, can't we
> > just use a temporary one?
>
> Unicode and non-Western encodings are likely problem candidates. It's
> possible to pass a temporary file to the parser, but it just means
> more of a mess. If we go that route, let's do the conversion(s) in
> the document, then pass BibTeXParser an NSString just like the other
> parsers. We used to create a temporary file for paste/drag data, but
> I removed that after discovering Omni's NSData-based file stream,
> since it was slow and messy.

Can every string be converted to UTF-8?

> > > Also we then can't save groups in UTF-8 anymore, because we need to
> > > convert the whole file content. This could give problems with
> > > compatibility and when saving in ASCII.
> >
> > I don't see any way around that, other than saving the groups in a
> > file wrapper plist or as xattrs.
> >
> > We can escape non-ascii characters.
>
> That seems kind of fragile, at least from a compatibility standpoint.

Compatibility would be a problem, yes. It would be a one-time switch
though. An ASCII file could always be opened as UTF-8. I never liked the
mixed encoding for group files; it's fragile by itself, and it makes
external editing of the file difficult.

> I really think we're just avoiding the inevitable switch away from
> BibTeX as a file format, though. How much work would it be to add the
> BibTeX exporter infrastructure that Mike originally talked about? We
> can already serialize an array of BibItems.

I would prefer not to use a binary format in BD1, and to stay with the
BibTeX format.

> BTW, does BibTeX itself have problems with the Japanese characters?
>
> adam

I don't think BibTeX has a problem. At least LaTeX can handle Shift-JIS
encoding, so I think BibTeX should also be able to handle it. I have no
experience with any of that though.

Christiaan
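
As a rough illustration of the two ideas discussed in this message (transcoding the file content to UTF-8 before it reaches the parser, and escaping non-ASCII characters so the result can be saved as ASCII), here is a minimal Python sketch. It is not BibDesk code: the function names `to_utf8` and `escape_non_ascii`, the sample title, and the backslash escape scheme are all assumptions made for illustration, and BibDesk itself would do the equivalent with NSString/NSData.

```python
def to_utf8(raw: bytes, source_encoding: str) -> bytes:
    """Decode raw file bytes from source_encoding and re-encode them as UTF-8.

    Hypothetical helper: raises UnicodeDecodeError if raw is not valid
    in source_encoding.
    """
    return raw.decode(source_encoding).encode("utf-8")


def escape_non_ascii(text: str) -> str:
    """Replace every non-ASCII character with a backslash escape.

    Assumed scheme (Python's "backslashreplace" handler), not a format
    that BibTeX or BibDesk defines.
    """
    return text.encode("ascii", errors="backslashreplace").decode("ascii")


# A Japanese title as it might be stored in a Shift-JIS encoded .bib file.
title = "日本語の論文"
raw = title.encode("shift_jis")

# Any text that decodes cleanly can be re-encoded as UTF-8 without loss.
utf8_bytes = to_utf8(raw, "shift_jis")
assert utf8_bytes.decode("utf-8") == title

# The escaped form contains only ASCII, so it survives an ASCII save.
escaped = escape_non_ascii("group: " + title)
assert escaped.isascii()
```

The first step only works when the source encoding is known; text decoded with the wrong encoding either raises an error or silently produces the wrong characters, which is the fragility the thread is worried about.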