Re: [CZT-Devel] Object-Z parser v1.0


Hi Petra,

> thanks for providing this nice parser :-)
> 
I'm glad you like it!! :)

> Mark suggested to use my scanner together with your parser. I had a
> short look at your token names and was very happy to find out that most
> of the token names are equal.
Yes, Mark mentioned your Unicode scanner when he rang the other day, so 
I checked it out of CVS and connected it to the parser. It seemed to work 
ok, but the toolkit libraries are all in LaTeX, so I couldn't use 
operators etc. yet.

> However, there are still a few things I am not sure how to 
> handle:
> 
> - DELTA (one of your tokens) is not defined as a symbolic keyword in the
>   Z standard. Do you need that at all? Should I change my scanner to
>   recognise that token as well?
> 
Yes, that is an Object-Z token that is used to identify secondary 
attributes, so an OZ scanner will need to return it.

> - ZSECTION (one of your tokens):
>   What is the difference between ZSECTION and SECTION?
>   Is ZSECTION an alphabetic keyword as well?  Is it an OZ keyword?
> 
ZSECTION is the LaTeX tag "\begin{zsection}" and SECTION is just the 
word "section". Because Unicode doesn't have a ZSECTION character, I 
think the best solution would be to remove the ZSECTION token from the 
parser and have the LaTeX scanner ignore it. This won't cause any 
problems with an LALR parser.

> - NUMSTROKE, NEXTSTROKE, OUTSTROKE, and INSTROKE
>   (some of your tokens) are handled equally by my scanner. My scanner
>   returns the STROKE token, which is a String, and it is left to the parser
>   to figure out which kind of stroke it is.  I guess it is quite easy
>   to change the scanner to return the four different kinds of strokes ...
>   but if we want to stay as close to the standard as possible we should go
>   for STROKE.
> 
Yes, I noticed this. The reason I am using the 4 different types is that 
I am trying to make the parser as independent of the scanner as 
possible. The same problem occurs when we want to extract strokes from 
DECORWORDs to create a DeclName instance. However, if we want to follow 
the standard, I propose two possible solutions:
- Create an interface called something like CZTScanner, which provides 
methods for extracting strokes from DECORWORDs and from STROKEs, and 
have each scanner implement this interface. This removes any lexical 
work from the parser; or
- Do what I think your LaTeX scanner does and convert everything 
to Unicode before sending it to the parser.

The best solution from a design point of view would probably be to do both?
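
As a rough sketch of the first option: the interface name CZTScanner is 
the one suggested above, but the method names, the StrokeKind enum and 
the LatexStrokeExtractor class are invented here for illustration. It 
assumes LaTeX-style decorations, where ' is a next stroke, ? an in 
stroke, ! an out stroke and _ followed by one digit a number stroke.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical interface: each scanner would supply its own stroke handling. */
interface CZTScanner {
    enum StrokeKind { NEXT, IN, OUT, NUM }

    /** The word with all trailing strokes removed. */
    String baseName(String decorWord);

    /** The trailing strokes of the word, in left-to-right order. */
    List<StrokeKind> strokes(String decorWord);
}

/** Illustrative implementation for LaTeX-style markup (assumed syntax). */
final class LatexStrokeExtractor implements CZTScanner {

    /** Index at which the run of trailing strokes begins. */
    private static int strokeStart(String w) {
        int i = w.length();
        while (i > 0) {
            char c = w.charAt(i - 1);
            if (c == '\'' || c == '?' || c == '!') {
                i--;                       // single-character stroke
            } else if (i >= 2 && Character.isDigit(c) && w.charAt(i - 2) == '_') {
                i -= 2;                    // number stroke: '_' plus one digit
            } else {
                break;
            }
        }
        return i;
    }

    @Override
    public String baseName(String w) {
        return w.substring(0, strokeStart(w));
    }

    @Override
    public List<StrokeKind> strokes(String w) {
        List<StrokeKind> out = new ArrayList<>();
        for (int i = strokeStart(w); i < w.length(); i++) {
            switch (w.charAt(i)) {
                case '\'': out.add(StrokeKind.NEXT); break;
                case '?':  out.add(StrokeKind.IN);   break;
                case '!':  out.add(StrokeKind.OUT);  break;
                default:   out.add(StrokeKind.NUM); i++; break; // skip the digit after '_'
            }
        }
        return out;
    }
}
```

With something like this, the parser could take a single STROKE or 
DECORWORD token and ask the scanner's helper what kind of strokes it 
carries, so the lexical details stay out of the grammar.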

> - GENSCH (one of my tokens):
>   I couldn't find the corresponding token in your parser. Is it GCH?
>   Since the name GENSCH is used in the standard I would like to use that
>   name as well (makes it easier for others to read the code).
>
Yes, I noticed this too. The token should be GENSCH, but the problem is 
that a schema definition in LaTeX is:
   \begin{schema}{Name}

and a generic schema definition is
   \begin{schema}{Name}[Params]

So we need a two-token lookahead to tell whether it has parameters. I 
was avoiding this in the parser by combining them:
   SCH boxName:bn optFormalParameters:ofp schemaText:st END

So the parameters are optional. The easiest solution for now is to add a 
GENSCH rule as well, but a longer term solution would be to implement 
the lookahead.
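
One way the lookahead could be done on the scanner side is sketched 
below. The class and method names are hypothetical, and it assumes the 
box name contains no nested braces and that only whitespace may separate 
the closing brace from an opening "[".

```java
/** Hypothetical helper: decides between SCH and GENSCH by peeking past {Name}. */
final class SchemaHeadClassifier {
    static final String BEGIN = "\\begin{schema}";

    /**
     * Returns "GENSCH" if the schema head is followed by '[', "SCH" if not,
     * and null if the input does not start with \begin{schema}{Name}.
     */
    static String classify(String input) {
        if (!input.startsWith(BEGIN)) {
            return null;
        }
        int open = BEGIN.length();
        if (open >= input.length() || input.charAt(open) != '{') {
            return null;
        }
        int close = input.indexOf('}', open);   // end of {Name}
        if (close < 0) {
            return null;
        }
        int j = close + 1;
        while (j < input.length() && Character.isWhitespace(input.charAt(j))) {
            j++;                                 // skip whitespace before a possible '['
        }
        return (j < input.length() && input.charAt(j) == '[') ? "GENSCH" : "SCH";
    }
}
```

The scanner would then return GENSCH or SCH directly, and the grammar 
could keep separate rules for the two forms without the optional 
formal parameters.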

> - the following tokens are probably OZ tokens (could you please check
>   whether I am right):
>   CLASS, STATE, INIT, INITWORD, OPSCH, VISIBILITY, INHERITS,
>   DCNJ, DGCH, DSQC, PARALLEL, ASSOCPARALLEL, GCH,
>   CLASSCOM, ENDCLASSCOM, CLASSCOMWORD, DECLWORD,
Yes, these are all OZ tokens, except DECLWORD, which is a normal 
DECORWORD, but it occurs before a colon in a declaration. This is 
returned in Mark's SmartScanner to eliminate the set elaboration vs. set 
comprehension problem.

>   BOXNAME
I return BOXNAME after I see a "\begin{schema}" or "\begin{class}"; 
otherwise the SmartScanner gets confused. The rule is:
   SCH NAME SchemaText END

The SmartScanner sees NAME and begins lookahead, consuming the first 
DECORWORD token in SchemaText if there is one. When it stops lookahead, 
it returns all the backed-up tokens without analysing them to see if they 
are before a colon, so they will never be returned as DECLWORDs. I 
know that probably doesn't make sense just reading it. The BOXNAME was 
just a quick workaround to get the parser up and running. The best 
solution is to change the SmartScanner class.
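
To make the intended behaviour concrete, here is a small sketch of 
re-tagging a buffer of token kinds so that every DECORWORD before a 
colon becomes a DECLWORD, including backed-up ones. This is not Mark's 
SmartScanner code; the DeclWordTagger class is hypothetical and token 
kinds are plain strings purely for illustration.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch: re-tag DECORWORDs that precede a colon as DECLWORDs. */
final class DeclWordTagger {

    /** Returns a copy of the token kinds with declared names re-tagged. */
    static List<String> tag(List<String> kinds) {
        List<String> out = new ArrayList<>(kinds);
        for (int c = 0; c < out.size(); c++) {
            if (!out.get(c).equals("COLON")) {
                continue;
            }
            // Walk left from the colon over DECORWORD (COMMA DECORWORD)*.
            int k = c - 1;
            while (k >= 0 && out.get(k).equals("DECORWORD")) {
                out.set(k, "DECLWORD");
                if (k >= 1 && out.get(k - 1).equals("COMMA")) {
                    k -= 2;   // continue with the name before the comma
                } else {
                    break;
                }
            }
        }
        return out;
    }
}
```

If the buffered tokens were passed through a step like this before being 
handed back to the parser, the schema name would stay an ordinary word 
and the first declared name in SchemaText would still come out as a 
DECLWORD, which would make the BOXNAME workaround unnecessary.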

>   I don't know how to change my scanner to recognise these tokens. Where
>   can I learn about the unicode characters in OZ?
> 
I'm not sure about the Unicode characters. The best bet would be the 
people from NUS. I recall having read some papers on their XML stuff, 
and they mention Unicode characters. I will ask Roger Duke or Graeme 
Smith the next time I see one of them - they may have an idea.

Perhaps someone on this mailing list can help?

> - _APPLICATION, _RENAME (some of your tokens)
>   I have got no idea what those are good for. I guess my scanner doesn't
>   have to worry about those since these are used internally?
> 
Yes, they are just used to force precedence in the parser - they are not 
real tokens, so the scanner never returns them. This is why they start 
with an underscore... I should really document that in the parser! :)

> 
> By the way, do you know whether it is possible to connect one scanner to
> different parsers? I am worried about the sym.java class generated by the
> cup parser. The scanner usually needs the sym.java class, but since I've
> got several of them there remains the question which one should the
> scanner use?
> 
I'm not sure I understand your problem. I know that you can get CUP to 
write the symbols to a specified file name, but it seems that your 
problem is in knowing which class to use in the scanner? I doubt that 
JFlex would have any support for that... it doesn't seem very well suited 
to reuse. I do know that you can ask CUP to produce the sym.java file as a 
class instead of an interface, so perhaps you could create an instance of 
that class and pass it to the scanner?
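
A sketch of what passing the symbols to the scanner might look like. 
Everything here is hypothetical: the TokenCodes interface, the two 
adapter classes and the numeric codes are made up, and in practice each 
adapter would return the constants from the symbol class that CUP 
generated for that parser.

```java
/** Hypothetical abstraction over the token codes of one generated parser. */
interface TokenCodes {
    int decorword();
    int stroke();
}

/** Adapter for a plain Z parser; the numbers stand in for its sym constants. */
final class ZTokenCodes implements TokenCodes {
    @Override public int decorword() { return 2; }
    @Override public int stroke()    { return 3; }
}

/** Adapter for an Object-Z parser, whose generated codes may differ. */
final class OzTokenCodes implements TokenCodes {
    @Override public int decorword() { return 5; }
    @Override public int stroke()    { return 9; }
}

/** One scanner that works with whichever parser's codes it is given. */
final class SharedScanner {
    private final TokenCodes codes;

    SharedScanner(TokenCodes codes) {
        this.codes = codes;
    }

    /** Toy classification: decoration characters are strokes, the rest are words. */
    int codeFor(String lexeme) {
        boolean isStroke = lexeme.equals("'") || lexeme.equals("?") || lexeme.equals("!");
        return isStroke ? codes.stroke() : codes.decorword();
    }
}
```

The scanner is then built once per parser, for example 
new SharedScanner(new OzTokenCodes()), and never refers to a particular 
sym.java directly.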

Thanks for your feedback,
Tim
