>>> "David PONCE" <David.Ponce@...> seems to think that:
[ ... ]
>> I do like the idea of renaming it too. I think Hollywood when you
>> say 'production' though. ;)
>
>Maybe simply `parse-rule' is better ;-)
I like that name.
>> In the previous email, the format you suggested was something vaguely
>> like this:
>>
>> (wisent-token normal args here :property prop info)
>>
>> I find it a tad strange to have a mix of regular arguments followed
>> by a property list.
>>
>> I found myself sometimes trying to put comments in my token generation
>> so I could keep track of which position means which value. Perhaps all
>> of wisent-token could be this way. Reading it would be clearer:
>>
>> (wisent-token name 'function ; required args
>>               :args $3
>>               :type $1
>>               :something $5
>>               :else $6
>>               :property 'reparse 'cool-thingy)
>>
>> Then the properties could be in random order, and :something and
>> :else would automatically get turned into ((something . $5) (else . $6))
>> in the extra specifier slot.
>>
>> Dilemma: Languages introducing new token types would have to have a
>> way of specifying preferred order.
>>
>> I think this would make things much more readable, AND let us change
>> the token format without re-writing language files. Nifty!
>> Cons: make things slower, not faster. ;( Perhaps a crazy macro could
>> compile them into the right format. Zoiks.
>
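A minimal sketch of the keyword form quoted above, to show how unknown
keywords such as :something and :else could be collected into an alist
for the extra specifier slot. The names `kw-known-properties' and
`kw-split-properties' are hypothetical and are not part of wisent or
semantic; which keywords count as "known" is an assumption made for
illustration.

```elisp
;; Hypothetical: keywords that would map to fixed token slots.
(defvar kw-known-properties '(:args :type)
  "Keywords treated as standard token slots in this sketch.")

(defun kw-split-properties (plist)
  "Split PLIST into (KNOWN . EXTRA).
KNOWN is a plist of the recognized keywords, in call order.
EXTRA is an alist of the remaining ones, with the leading colon
stripped from each key."
  (let (known extra)
    (while plist
      (let ((key (car plist))
            (val (cadr plist)))
        (if (memq key kw-known-properties)
            (setq known (append known (list key val)))
          ;; :something -> something
          (push (cons (intern (substring (symbol-name key) 1)) val)
                extra)))
      (setq plist (cddr plist)))
    (cons known (nreverse extra))))

(kw-split-properties '(:args ("a" "b") :type "int" :something 5 :else 6))
;; → ((:args ("a" "b") :type "int") (something . 5) (else . 6))
```

Doing this split at run time is the slowdown mentioned above; a macro
could perform the same split once at compile time.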
>Maybe we should go a little further and build semantic tokens as
>property lists? The token API would be simple and efficient
>(basically wrappers to `plist-get' and `plist-put'), and very flexible
>(no more dependence on positional values, easy to add new properties).
>Certain property names would be "standardized" like :name, :type,
>:doc, :parts, :bounds, etc. Also it would be possible to
>"propertize" property symbols with useful flags like a 'mandatory'
>flag, maybe a 'check' function, etc.
That's an interesting idea. It would eliminate extra specifiers, and
the need to know which details were specifiers and which weren't.

I still think it's useful to keep the name as the first element of
the token, though. This way you can pass a stream into
`try-completion' and it will work.

I'm also attached to having the token type symbol second, though I
don't have any good reasons for it.
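
To illustrate the point about `try-completion': it accepts an alist
whose keys are strings, so a list of tokens with the name in the car
can be passed to it directly. This is only a sketch under one assumed
layout, (NAME TYPE . PLIST); the `nf-' functions are hypothetical names,
not an existing API.

```elisp
;; Assumed layout for this sketch: (NAME TYPE . PLIST)
(defun nf-make-token (name type &rest plist)
  "Build a token with NAME first, TYPE second, then the property PLIST."
  (cons name (cons type plist)))

(defun nf-token-get (token prop)
  "Return the value of PROP in TOKEN's property list."
  (plist-get (cddr token) prop))

(defvar nf-stream
  (list (nf-make-token "semantic-token-p" 'function :args '("object"))
        (nf-make-token "semantic-token-check" 'function :args '("object"))
        (nf-make-token "semantic-make-token" 'function :args '("fields")))
  "A small example token stream.")

;; The stream is usable as a completion table as-is.
(try-completion "semantic-t" nf-stream)
;; → "semantic-token-"
```

The properties after the type remain order-independent, so this keeps
most of the plist flexibility while leaving the name easy to find.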
>Here is a first basic idea, looking a bit like the widget API:
>
>(defconst semantic-token-tag (make-symbol "semantic-token")
>  "Unique symbol used to tag semantic tokens.")
>
>(defsubst semantic-make-token (&rest fields)
>  "Create a new Semantic token with given FIELDS.
>FIELDS is a property list."
>  (cons semantic-token-tag fields))
>
>(defsubst semantic-token-p (object)
>  "Return non-nil if OBJECT is a Semantic token."
>  (and (consp object) (eq (car object) semantic-token-tag)))
>
>(defsubst semantic-token-check (object)
>  "Signal an error if OBJECT is not a Semantic token."
>  (or (semantic-token-p object)
>      (error "Invalid Semantic token %S" object)))
>
>(defsubst semantic-token-get-field (token field)
>  "Given a Semantic TOKEN, return its FIELD value.
>Signal an error if TOKEN is not a Semantic token."
>  (semantic-token-check token)
>  (plist-get (cdr token) field))
>
>(defsubst semantic-token-put-field (token field value)
>  "Given a Semantic TOKEN, set its FIELD with VALUE.
>Return VALUE.
>Signal an error if TOKEN is not a Semantic token."
>  (semantic-token-check token)
>  (setcdr token (plist-put (cdr token) field value))
>  value)
We could also make all the tokens with `defstruct' (in CL) or
with EIEIO. Converting to EIEIO would make token->text and other
token querying functions into methods instead of our overload
functions. (Or we could do both.)

The nice thing about plists is the simplicity of the code (as above).
The nice thing about structs or classes is that they show a distinct
structure, and individual languages can add the named fields
they find useful. In particular, there is a declaration syntax for
the structure that other programmers can read.
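
For comparison with the plist code, here is a rough sketch of the
struct alternative using `cl-defstruct' from cl-lib. The structure name
`demo-token' and its slots are hypothetical, chosen only to show what a
declared layout looks like; nothing here is an agreed token format.

```elisp
(require 'cl-lib)

;; Hypothetical token structure; the slot names are illustrative only.
(cl-defstruct (demo-token (:constructor demo-token-create))
  name        ; string, e.g. "foo"
  type        ; symbol, e.g. function or variable
  args        ; list of argument names
  doc)        ; optional documentation string

(defun demo-token-names-matching (regex tokens)
  "Return the names of TOKENS whose name matches REGEX."
  (let (hits)
    (dolist (tok tokens (nreverse hits))
      (when (string-match regex (demo-token-name tok))
        (push (demo-token-name tok) hits)))))

(demo-token-names-matching
 "^semantic-"
 (list (demo-token-create :name "semantic-make-token" :type 'function)
       (demo-token-create :name "wisent-token" :type 'function)))
;; → ("semantic-make-token")
```

The declaration documents the slots in one place, and the generated
accessors reach the name at a fixed position rather than searching a
property list.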

I'd be a little worried about speed for any of these solutions,
however, particularly with name regex matching (seemingly the most
common operation), as the name will be in an unknown location.

It might be worth using an API (as above) and hacking in a few
back-ends to see what works well; in particular, just enough to do
name or name regex matching so we can know for sure.

If you have an API you think is good, I can create such a back end for
EIEIO to compare with. If the difference is very small, I'd prefer
declarative structure; otherwise speed is important.
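
One possible shape for that experiment, sketched with `benchmark-run'
from benchmark.el. It times name regex matching over two assumed
layouts: name-first positional tokens, and tagged plists with :name
placed last (the worst case for `plist-get'). All `bench-' names, the
token layouts, and the sizes are assumptions for illustration; no
timings are claimed here.

```elisp
(require 'benchmark)

(defun bench-make-positional (names)
  "Build name-first tokens (NAME TYPE ARGS) from NAMES."
  (mapcar (lambda (n) (list n 'function nil)) names))

(defun bench-make-plist (names)
  "Build tagged plist tokens from NAMES, with :name as the last property."
  (mapcar (lambda (n) (list 'token :type 'function :args nil :name n))
          names))

(defun bench-find-positional (regex tokens)
  "Return the TOKENS whose car (the name) matches REGEX."
  (let (hits)
    (dolist (tok tokens (nreverse hits))
      (when (string-match regex (car tok))
        (push tok hits)))))

(defun bench-find-plist (regex tokens)
  "Return the TOKENS whose :name property matches REGEX."
  (let (hits)
    (dolist (tok tokens (nreverse hits))
      (when (string-match regex (plist-get (cdr tok) :name))
        (push tok hits)))))

(defun bench-compare ()
  "Time both back-ends; return (POSITIONAL-RESULT PLIST-RESULT).
Each result is the (ELAPSED GC-COUNT GC-TIME) list from `benchmark-run'."
  (let* ((names (mapcar (lambda (i) (format "name%d" i))
                        (number-sequence 1 1000)))
         (positional (bench-make-positional names))
         (plists (bench-make-plist names)))
    (list (benchmark-run 100 (bench-find-positional "^name99" positional))
          (benchmark-run 100 (bench-find-plist "^name99" plists)))))

(bench-compare)
```

An EIEIO or struct back-end could be added as a third pair of
make/find functions and timed the same way.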
>[...]
>> Hmm, I suspected that there was some reason but wasn't sure what it
>> was. You could enable the reparse if you had specialty rules for
>> the first token, and following tokens separate from the rule which
>> generates the master list during a full parse. Then the incremental
>> parser would work.
>
>Clever idea! It would be worth trying, even though I don't much like
>that sort of artificial distinction between the first nonterminal rule
>and the next ones.
>
>I am just wondering if the incremental parser would work for a
>language like Python, where just adding or removing spaces
>(indentation) can require a complete rebuild of the parent token
>surrounding the change because the parent-child relationship between
>tokens has probably changed!
[ ... ]
In your case, I don't think it is too important since reparsing the
parent would be so fast. Languages where you can't use the repeating
parser to identify children, and where the parent is large, could take
advantage of such a technique, however. As such, an experiment would
be good.
Eric
--
Eric Ludlam: zappo@..., eric@...
Home: http://www.ludlam.net Siege: http://www.siege-engine.com
Emacs: http://cedet.sourceforge.net GNU: http://www.gnu.org