[Pyparsing] action field
From: spir <den...@fr...> - 2008-11-19 11:43:01
Here is how I imagine the right-hand field of pattern lines, as of now (Wed 19 Nov 2008, 12:41 here).

=== What fits in?

Post-match jobs: all actions that apply to, or use, the results generated by this particular pattern. This includes:

* mutation
  (Post-)parse action that returns a transformed result.
* task
  Parse action that performs an additional operation when a match is found, but leaves the result unchanged.
* format
  Intermediate between mutation and task. Does not change the result's content, only its format. This includes packing (= Group-ing), concatenation (= Combine), maybe asList() and Dict().
* pattern name = result type
  Products, or relevant results, can get a 'nature' field, implemented as an attribute or a dict item. It says what kind of result it is, so there is no need to reparse when the kinds of results are not predictable: appropriate actions can be launched directly. Grammar/parser directives may let relevant results be typed automatically, using the pattern name.
* 'star' (?) (new idea)
  Product identification, a kind of flag. A special code, e.g. '*', used to identify relevant patterns, meaning the ones needed for further processing.

=== Order

There may be a kind of natural order. It could be, for instance:

    action : star? (mutation | task | format)* name?

=== comments

separator
I am not fully happy with ':' being the separator between the expression and action fields. It is all right, especially because it is the same as the name/expression separator. But there may be something better; it is not obvious enough for my taste, actually.

Mutations
Must obviously be defined elsewhere; this is outside the scope of the grammar itself. Can be built-in types/functions such as int(). Typically, I guess, they take only the result as argument.

Mutations & Formats
It is not clear to me whether the formatted result should then really be of the target type (e.g. list, int) or a ParseResults object holding this content.
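To make the question concrete, here is a minimal sketch of what pyparsing itself does in each of these cases, as far as I can tell. The camelCase method names (setParseAction, setResultsName, parseString, asList) are the ones pyparsing offered at the time; the toy expressions and variable names are illustrations only, not part of the proposal.

```python
from pyparsing import Word, nums, alphas, Group, Combine, ParseResults

# Mutation: the value returned by the parse action replaces the matched token.
number = Word(nums).setParseAction(lambda tokens: int(tokens[0]))
mutated = number.parseString("42")

# Task: a side effect only; returning None leaves the result unchanged.
seen = []
word = Word(alphas).setParseAction(lambda tokens: seen.append(tokens[0]))
tasked = word.parseString("hello")

# Format: Group packs tokens into a nested ParseResults,
# Combine joins adjacent tokens into a single string.
pair = Group(Word(alphas) + Word(nums))
dotted = Combine(Word(nums) + "." + Word(nums))
grouped = pair.parseString("abc 123")
combined = dotted.parseString("3.14")

# Name: a results name gives keyed access to the matched token.
named = Word(nums).setResultsName("number").parseString("7")

print(type(mutated).__name__, mutated.asList())   # ParseResults [42]
print(tasked.asList(), seen)                      # ['hello'] ['hello']
print(grouped.asList())                           # [['abc', '123']]
print(combined.asList())                          # ['3.14']
print(named["number"])                            # 7
```

So a mutation changes the item stored inside the result, while what parseString returns is still a ParseResults object; only asList() hands back a plain list.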
I am not sure I understand what really happens when such functions are used in ParseResults itself. The cases are very heterogeneous:

    int() or whatever           ParseAction returning a new result object
    asList()                    ParseResults method
    Dict() Group() Combine()    ParserElement subtypes

=== star: token vs product

Paul chose to write a single-pass parser generator. This is perfectly good, especially when the grammar is pure code ;-). Still, for the programmer, results do not all have the same status: some are intermediate results (I call them tokens), some are the results one needs to deal with after parsing (products), even if only for output. I find it rather natural to make a kind of flagging possible. I have some use for it in mind, but there is probably more.

Note: there is no absolute need to write down token patterns; they could be sub-expressions of products. But doing so greatly enhances legibility and avoids repetition. This is even more relevant for a text grammar, whose first purpose is clarity. Actually, I would support 'starring' (or anything with similar semantics) even if it had no real use, because it makes sense...

The flag may also be stored on the product itself, together with nature/role. Precisely, as I see it, this flag is particularly meaningful in combination with nature and/or role. It shows what's important, and what's not.

The 'star' code would allow the programmer, after parsing, to sort out relevant results. This code would also enable, and filter, automatic actions such as naming, suppression, or whatever. These actions could be controlled with parser 'switch' directives (all defaulting to False). All of this mess can be written manually for the patterns to which it applies. But global control seems clearer to me and avoids useless grammar overload.

Now, this is nothing specific to text grammars. Instead, global control could exist for pyParsing coded grammars, as whitespace control does at present.

=== grammar/parser directives

Some rough ideas.

whitespace
_ respect_whitespace   (default: ignore)
_ list of              (default: same as pyParsing)

product
_ all_products         (default: no)    avoids starring all
_ product_list         (default: [])    alternative to starring in-line
_ type_dict            (default: {})    --> automatic instantiation (*)

typing/naming
_ type_all             (default: no)
_ type_products

actions
_ product_format       (default: None)  default format, e.g. Dict
_ product_action       (default: None)  default conversion, e.g. int
_ product_task         (default: None)  default side task, e.g. count, add, print
_ all_format           (default: None)  default format, e.g. Dict
_ all_action           (default: None)  default conversion, e.g. int
_ all_task             (default: None)  default side task, e.g. count, add, print

(*) This is a specific need of mine. No idea whether it matches a common requirement. Objects will be instantiated with a type defined by the product's nature, and init data taken from the product itself. It looks like:

    type_dict[product.nature](product)

This could be a parse action applying to all products, but products only.

denis
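
P.S. A rough sketch of how the star flag, the all_products switch and type_dict could work together, in plain Python. Everything here is hypothetical: Result stands in for a parse result carrying 'nature' and a star flag, and Number, Name and instantiate_products are made-up names, not anything pyparsing provides.

```python
from dataclasses import dataclass


@dataclass
class Result:
    """Hypothetical stand-in for a parse result with a nature and a star flag."""
    nature: str
    value: str
    starred: bool = False


class Number:
    """Hypothetical target type, initialised from the product itself."""
    def __init__(self, product):
        self.value = int(product.value)


class Name:
    """Hypothetical target type, initialised from the product itself."""
    def __init__(self, product):
        self.text = product.value


# Maps a product's nature to the type to instantiate.
type_dict = {"number": Number, "name": Name}


def instantiate_products(results, type_dict, all_products=False):
    """Build objects for starred results only, or for all if all_products is set."""
    objects = []
    for result in results:
        if not (all_products or result.starred):
            continue  # a plain token, not a product
        factory = type_dict.get(result.nature)
        if factory is not None:
            # The expression from the note above: type_dict[product.nature](product)
            objects.append(factory(result))
    return objects


results = [
    Result("name", "x", starred=True),
    Result("operator", "="),
    Result("number", "42", starred=True),
]
objects = instantiate_products(results, type_dict)
print([type(obj).__name__ for obj in objects])  # ['Name', 'Number']
```

Natures without an entry in type_dict are simply skipped here; raising an error instead would be an equally reasonable choice.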