Re: [Silgraphite-fonts] Another attachment problem

On 7/31/2012 12:52 PM, Carsten Becker wrote:
> Hi Sharon,
>
> On 30.07.2012 22:36, Sharon Correll wrote:
>> Oops, HTML or something got me! There probably need to be spaces
>> around the caret. That's the key magic that makes the chaining happen.
>>
>> takes_upper  upperdiac {attach {to=@1; at=TopS; with=TopM}}  /  _
>> LOWERSEQ  ^  _;
> I'm afraid I can't get that to work like my two-pass solution for some
> reason. As far as I can tell, in my case I'd have to change the rules
> this way (all the indented lines should be a single one):
>
>      takes_upper = ( clsCons, AnyTopDiacritic );
>      takes_lower = ( clsCons, AnyBotDiacritic, clsDiaBotRight );
>
>      takes_upper AnyTopDiacritic { attach { to=@1; at=topAnch;
>         with=attGeneric }} / _ BOTSEQ ^ _;
>      takes_lower AnyBotMidDiacritic { attach { to=@1; at=botAnchMid;
>         with=attGeneric }} / _ TOPSEQ BOTSEQ ^ _;
>      takes_lower clsDiaBotRight { attach { to=@1; at=botAnchRight;
>         with=attGeneric }} / _ TOPSEQ ^ _;

You need to move the caret to *before* the BOTSEQ, so that processing 
goes back and handles the bottom diacritics. In general, you want to 
make sure that each rule "consumes" only one character at a time; 
otherwise some will get skipped.
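
As an untested sketch, here is your first rule with nothing changed 
except the caret position (class and anchor names are yours, and it 
should all be on one line):

takes_upper  AnyTopDiacritic {attach {to=@1; at=topAnch; with=attGeneric}}  /  _  ^  BOTSEQ  _;

With the caret there, only the base is consumed, so the next match 
starts again at the first item after it. On its own this does nothing 
to stop the same rule from matching again.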

Also there is a problem with the second rule: including BOTSEQ (which 
I assume means "optional bottom diacritics") as part of the context 
means that the last bottom diacritic will be attached first, since the 
longest rule gets matched first. You probably need two optional 
sequences, say BOTMIDSEQ and BOTRTSEQ. Also notice that the way your 
optional items are arranged implies that the top diacritics come first, 
before the bottom ones (contrary to the Unicode specification, but I 
guess it's up to you in this case).

Also you'd probably need to include tests to keep the same rules from 
being fired over and over. I would do it something like this:

#define NOTTOPSEQ [ NotTopDiac NotTopDiac?]?    // these sequences can be longer...
#define NOTBOTRIGHTSEQ [ NotBotRightDiac NotBotRightDiac?]?
#define NOTBOTMIDSEQ [ NotBotMidDiac NotBotMidDiac?]?

NotTopDiac = (clsDiaBotMid, clsDiaBotRight);
NotBotRightDiac = (clsDiaTop, clsDiaBotMid);
NotBotMidDiac = (clsDiaTop, clsDiaBotRight);

takes_upper  AnyTopDiacritic {attach...}  /  _  ^  NOTTOPSEQ  _ {attach.to == 0};
takes_lower  AnyBotMidDiacritic {attach...}  /  _  ^  NOTBOTMIDSEQ  _ {attach.to == 0};
takes_lower  AnyBotRightDiacritic {attach...}  /  _  ^  NOTBOTRIGHTSEQ  _ {attach.to == 0};

Something like that anyway. Note that to make sure it can handle all 
the combinations, the optional sequences need to be very flexible. You 
might be able to simplify if you can assume that the diacritics appear 
in a particular order.

The attach.to tests will generate warnings; it's a slightly 
unconventional thing to do, but it works. Or you can use a 
user-defined slot attribute, e.g.:

#define isattached user1
takes_upper  AnyTopDiacritic {attach...; isattached = 1}  /  _  ^  BOTSEQ  _ {isattached == 0};
takes_lower  AnyBotMidDiacritic {attach...; isattached = 1}  /  _  ^  TOPorBRTSEQ  _ {isattached == 0};
takes_lower  AnyBotRightDiacritic {attach...; isattached = 1}  /  _  ^  TOPorBMIDSEQ  _ {isattached == 0};

Just some ideas. Doing it in two passes is fine, but keep in mind that 
every pass slows down the processing. So if performance were an issue, 
one pass would be preferable if you could make it work.

> FWIW, I've also tested my font in Firefox 14 now that Firefox has been
> supporting Graphite since a couple of versions back. However, it will
> only follow the rules in the first positioning pass, so it won't do
> diacritic-to-diacritic attachment for me. I've tested this on both
> Ubuntu 12.04 and Windows XP (there with Firefox 16.0.1a). Is this a bug?
> Everything works fine in Libre Office 3.5.5.3.

:-(  I've never heard of a problem along those lines with Firefox. The 
only bug I know of is that features are not recognized if they don't 
have exactly 4-character IDs (numbers don't work, nor do 3-character IDs).