From: Benny B. <Ben...@gm...> - 2011-10-12 23:52:39
Hi,

On 10.10.2011 20:13, W P Blatchley wrote:
> Hi,
>
> On Mon, 10 Oct 2011 01:41:39 +0200, Benny Baumann <Ben...@gm...>
> wrote:
>>> So you're thinking something like this (with a bit more error
>>> checking!):
>>>
>>> (from class.geshirendererhtml.php, parseToken())
>>>
>>> if (isset($data['url']) || isset($data['anchor']['name'])) {
>>>     // There's a URL or anchor associated with this token
>>>     $result .= '<a';
>>>     $result .= isset($data['url']) ? ' href="' . GeSHi::hsc($data['url'])
>>>         . '"' : '';
>>>
>>>     if (isset($data['anchor']['name']))
>>>     {
>>>         $result .= ' name="' .
>>>             GeSHi::hsc(call_user_func($preprocess_anchor_fn,
>>>                 $data['anchor']['name'])) . '"';
>>>     }
>>>     $result .= '>';
>>> }
>>>
>>> So in other words, every time an anchor gets inserted, there's an
>>> opportunity to munge it in some way first?
>> Correct. That's what I had in mind.
>>> And have similar code for ['anchor']['desc'], too?
>> Yes.
>>> Where would the preprocess anchor callback live? Should it just be a
>>> global public function, defined in the language file?
>> The global preprocess function would be called as part of the Source ->
>> Tokenizer step, before the Tokenizer starts working. The function would
>> be part of the language code parser with other hooks called as necessary
>> from there. As the language file only gives the "static structure" of
>> the language I doubt this is a good place to locate such "dynamic"
>> features in. The Code Parser is the more fitting place IMHO.
> As far as I can imagine, the code parser (in particular a
> language-specific code parser) is the only code that is likely to be
> inserting anchors in the code.

Not quite. Imagine a user who wants to add some additional markers.

> If that's true, and you then also put the preprocess callback in the
> code parser, too, I'm wondering what the point of it would be... I
> would have thought the idea of a hook was that it could be used to
> modify behaviour /outside/ of the normal scope of the code.

The hook is called before the tokenizing step so that it can do
preprocessing, such as finding variables, procedures, functions, ...
So there is a first preprocessing pass (a "preview") over the whole
source, and after that you get the source again, token by token. The
preview step is there so you know what you'll get, not to do any
processing that generates output beforehand.

> In the case of the BASICV code parser, it injects the anchors into the
> source output. If it needed to do any processing on the anchor name or
> desc, it could happily do that inside its own scope, without calling
> the renderer and having the renderer call back into it to process a
> change.

Let's assume you get some BasicV source that defines some of the
functions it calls, but not all. To keep the links consistent and avoid
unmet references, you "preview" the source with the hook in the preview
function. There you build a list of all functions that are defined and
combine it with an additional list, optionally provided as metadata
input to GeSHi (specifying other anchors that might live on the same
output page for functions not in this snippet). After this you process
the code token by token, and eventually you end up with a list of
tokens that need to be rendered.

Since you are producing a browsable PDF, you decide that function
anchors should reference the page documenting that function as part of
the mouse-over hint. As this information is not known to GeSHi, the
user supplies a callback which, when such a function reference is
rendered, inserts the correct page number into the hint.

Hope this clarifies things a bit.

> Can you imagine a practical situation in which we'd need the
> above-mentioned callback system?
> I'm sure there is one; I'm just
> struggling to see it ;)

The example above ;-)

>>> BTW, you can see an example of the anchor code working here (very
>>> experimental!): http://riscos.willowroom.co.uk/highlighter/index.php
>> Looks really nice. Though I'd make the actual procedure name after PROC
>> be a subcontext like basicv/basicv/procedural/name. Also you seem to
>> still have a prefix/postfix delimiter problem for names. But looks
>> really good already!
> Yes, nicely pointed out ;) I will make the procedure name a subcontext
> so it can be styled independently.

Yep. That's what subcontexts and virtual contexts are for ;-)

> The prefix/postfix delimiter problems are serious and non-trivial to
> fix, as BASICV doesn't require spaces between anything! I think it
> will require changes/additions to geshicodecontext to fix, but I
> haven't quite figured out exactly what yet... Maybe it can all be done
> in the codeparser; I'm not sure at this stage. Do any other languages
> implemented allow whitespace to be missed out?

Hmmm, Brainfuck and maybe Whitespace ;-P
No, no idea ATM. At least not to that extent.

Regarding your code parser: you can split tokens by returning multiple
tokens for one presented input token.

HTH.

> Cheers,
>
> WPB

Regards,
BenBE.
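
P.S. For illustration, the two-pass idea from the BasicV example
(preview the whole source to learn which anchors exist, then go token
by token and let a user-supplied callback add the hint at render time)
could be sketched roughly as below. This is a stand-alone Python
sketch, not GeSHi code: preview(), render(), the hint_for callback and
the regular expressions are all hypothetical names made up for this
example, and the PROC matching is deliberately naive.

```python
import html
import re
from typing import Callable, Iterable, List, Optional, Set

# Hypothetical patterns: "DEF PROCname" defines a procedure and any
# other "PROCname" is a call. Real BASIC V parsing is much messier.
PROC_DEF = re.compile(r"DEF\s*PROC(\w+)")
PROC_ANY = re.compile(r"(DEF\s*)?PROC(\w+)")


def preview(source: str, extra_anchors: Iterable[str] = ()) -> Set[str]:
    """Pass 1: collect every name that will have an anchor to link to.

    extra_anchors stands in for the optional metadata list of anchors
    that live elsewhere on the same output page.
    """
    defined = {m.group(1) for m in PROC_DEF.finditer(source)}
    return defined | set(extra_anchors)


def render(
    source: str,
    known: Set[str],
    hint_for: Optional[Callable[[str], Optional[str]]] = None,
) -> List[str]:
    """Pass 2: walk the PROC tokens and emit anchors or links.

    hint_for is the user-supplied callback; it returns the mouse-over
    text for a name (e.g. a page number) or None if it has nothing.
    """
    out = []
    for m in PROC_ANY.finditer(source):
        text, name = m.group(0), m.group(2)
        if m.group(1):
            # Definition: emit the anchor itself.
            out.append(f'<a name="{html.escape(name)}">{text}</a>')
        elif name in known:
            # Reference with a known target: link it, plus optional hint.
            hint = hint_for(name) if hint_for else None
            title = f' title="{html.escape(hint)}"' if hint else ""
            out.append(f'<a href="#{html.escape(name)}"{title}>{text}</a>')
        else:
            # Unknown target: leave as plain text, no dangling link.
            out.append(text)
    return out


source = "DEF PROCdraw\nPROCdraw\nPROCsave\nPROCmissing"
pages = {"draw": 12}  # information only the user knows
known = preview(source, extra_anchors=["save"])
tokens = render(
    source,
    known,
    lambda name: f"see page {pages[name]}" if name in pages else None,
)
```

Here PROCdraw gets a link with the page hint, PROCsave gets a plain
link because it was supplied through the extra list, and PROCmissing
stays unlinked, so there are no unmet references.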