From: Terje B. <li...@po...> - 2002-04-14 18:15:33
Peter Newcomb <pet...@ep...> wrote:

>>If nobody comes up with any suggestions, I'm going to try to hack
>>$CVSROOT/unicode/gensyntax.pl to grok the current version of the
>>UNICODE Tables from <URL:ftp://ftp.unicode.org/> and build with that.
>>If it looks like it "works" I'll check it in and see if anyone screams.
>>:-)
>
>This is a good idea, though it may not be necessary to solve your
>problem.

But, of course, when someone points me in the right direction I run out
of time for actually trying it out. :-(

>If I understand correctly, your problem is not that SP is mapping
>characters incorrectly or that it is not recognizing characters'
>syntactic functions (which is what updating gensyntax.pl would fix), but
>rather that it is refusing characters because it doesn't think they
>exist.

I think that is an accurate assessment, yes.

>This behavior is governed by the DESCSET section of the SGML
>declaration.

Yes. The problem occurs when I put OpenSP into Fixed Charset Mode by
setting SP_CHARSET_FIXED to "YES". AFAICT, this makes OpenSP use an
internal SGML Declaration, which, when SP_ENCODING is "XML", will be
equivalent to the one from "pubtext/xml.dcl". Since MathML uses code
points above 0x10FFFF that were only defined after the UNICODE tables in
OpenSP were last sync'ed, you cannot process MathML with OpenSP in "XML
Mode" (as that requires setting SP_CHARSET_FIXED).

>This says that code points 9-10, 13, 32-126, 160-55295, 57344-65533, and
>65536-1114111 (0x10000-0x10FFFF) are defined. If you need to use code
>points above 0x10FFFF, just add a line to the DESCSET...
[...]

The question is more: What SGML Declaration is "onsgmls" using when in
Fixed Charset Mode? Is it one compiled into libosp.so? Or is it the one
from "pubtext/xml.dcl" that gets installed by "make install" and whose
path is compiled into OpenSP? Or is it the one from "unicode/unicode.sd"?
Are these two the source for OpenSP's internal SGML Declaration, or is
OpenSP's internal SGML Declaration the source for these two files? IOW,
would simply adding that line to xml.dcl or unicode.sd have the desired
effect, or would I need to modify any actual code to do it?

As mentioned, I'll give it a try to see if it "just works" as soon as I
can find some time for it, but I'd kinda like to understand this a bit
better.

[ Besides, the MathML guys are really chewing my ass about adding ]
[ decent support for it before the heat death of the universe. :-) ]

-- 
As a cat owner, I know this for a fact...
Nothing says "I love you" like a decapitated gopher on your front porch.
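
P.S. For reference, here is a minimal Python sketch of the "does this
code point exist" check discussed above. It assumes the declared ranges
are exactly the ones Peter quoted from the DESCSET; the names
DECLARED_RANGES and is_declared are made up for illustration and are
not part of OpenSP, and this is not how OpenSP itself implements the
check.

```python
# Hypothetical helper, not part of OpenSP: mirrors the code point ranges
# quoted from the DESCSET discussion above (bounds are inclusive).
DECLARED_RANGES = [
    (9, 10),
    (13, 13),
    (32, 126),
    (160, 55295),
    (57344, 65533),
    (65536, 1114111),  # 0x10000-0x10FFFF
]


def is_declared(code_point: int) -> bool:
    """Return True if code_point falls inside one of the quoted ranges."""
    return any(low <= code_point <= high for low, high in DECLARED_RANGES)


if __name__ == "__main__":
    # Print the verdict for a few sample code points.
    for cp in (0x41, 0xD800, 0xFFFE, 0x10FFFF, 0x110000):
        verdict = "declared" if is_declared(cp) else "not declared"
        print(f"U+{cp:04X}: {verdict}")
```

A character reference whose number fails a check like this is the kind
of thing that would be refused as non-existent, regardless of what
gensyntax.pl generates.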