Re: [Python-markdown-discuss] Overriding Functions, etc.

On 4/10/07, Waylan Limberg <wa...@gm...> wrote:
[...]
> > I was thinking that _perhaps_ the pre-processors could be a dictionary
> > instead of an array, which would allow deleting by name. Presently, I
> > set all the preprocessors by copying from the class, then delete the
> > one preprocessor I don't want. This would also allow use of has_key(),
> > as the present implementation (correct me if I'm wrong) does not allow
> > listing or deleting a preprocessor directly.
>
> This seems like a good idea until we remember that dictionaries do not
> preserve order. In this case, order is very important, as certain
> pre-processors must absolutely be run before others. That's impossible
> with a dictionary. With a little searching you'll find that various
> projects have implemented their own non-standard sorted-dict to address
> this issue, but every implementation is a little different. Another
> possibility could be a list of tuples [(key, value), (key, value)], but
> that can be a pain to work with. That is why a simple list is used.
> Currently it is easy enough to refer to each item by its index, but if
> you're making multiple changes, I can see how that could be problematic.

Right, dict is not sequenced, but there are a couple of ways off-hand
that might work. One is to keep the dict for lookup by name and a
separate list of keys for the order:

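def PreprocessorInsert(self, key, val, index=None):
    # the dict gives lookup/deletion by name; the list keeps the run order
    self.preprocessors[key] = val
    if index is None:
        index = len(self.preprocessors_order)  # default: append at the end
    self.preprocessors_order.insert(index, key)  # list.insert(index, item)

then when running preprocessors:

    for key in self.preprocessors_order:
        pre = self.preprocessors[key]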
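
A slightly fuller, self-contained sketch of the same dict-plus-order-list idea, including delete-by-name and a membership test in place of has_key(). The class name PreprocessorRegistry and its methods are made up for illustration and are not part of Markdown(). It also assumes, as a simplification, that a preprocessor is just a callable that takes text and returns text.

```python
class PreprocessorRegistry:
    """Hypothetical sketch: a dict for lookup by name plus a list for order."""

    def __init__(self):
        self.preprocessors = {}        # name -> callable (assumed interface)
        self.preprocessors_order = []  # names, in the order they should run

    def insert(self, key, val, index=None):
        # Re-registering an existing name replaces it and moves it.
        if key in self.preprocessors:
            self.preprocessors_order.remove(key)
        self.preprocessors[key] = val
        if index is None:
            index = len(self.preprocessors_order)  # default: append at the end
        self.preprocessors_order.insert(index, key)

    def remove(self, key):
        # Delete by name; raises KeyError if the name is unknown.
        del self.preprocessors[key]
        self.preprocessors_order.remove(key)

    def __contains__(self, key):
        # Lets callers write `"name" in registry` instead of has_key().
        return key in self.preprocessors

    def run(self, text):
        # Apply every preprocessor in the recorded order.
        for key in self.preprocessors_order:
            text = self.preprocessors[key](text)
        return text


registry = PreprocessorRegistry()
registry.insert("strip", str.strip)
registry.insert("upper", str.upper)
registry.insert("tabs", lambda s: s.replace("\t", "    "), index=0)
registry.remove("upper")               # delete by name, no index needed
print(registry.preprocessors_order)    # ['tabs', 'strip']
print(registry.run("\thello \n"))      # hello
```

The cost of this approach is that the dict and the list have to be kept in sync by hand, which is why all changes go through insert() and remove().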

PmWiki has a situation where markups may be added willy-nilly while
maintaining order. It would be rather radical to introduce to
Markdown(). I'll try to describe it as best I can as there are two
ways to position markups: by group and relative to other markups.
First, the basic syntax for adding markup is:

Markup('name','phase','regex','substitute')

* Name is the key under which the regex is stored, which allows a
standard markup to be overridden by custom markup and makes each markup
easy to identify.
* Phase sets the position, either by category (e.g. preprocessor,
inline, postprocessor) or relative to another entry (e.g. '<##' would
occur before '##', and '<inline' would occur before all other inlines).
In '<##', the '##' is the name of the markup it precedes (or else the
name of a phase), not the markup text itself.
* Regex is self-explanatory.
* Substitute is self-explanatory; it can be either text or a function.

PmWiki's code does the shuffling before text conversion. This has the
advantage that you do not need to know the full sequence, only that you
want the conversion to occur before/after/during another item. I
mention PmWiki only because I'm very familiar with its approach and
know its author seeks ease-of-customization. Markdown() is generally
not meant to be as customizable, since it follows the standard Markdown
format.
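
To make the phase idea concrete, here is a rough Python sketch of relative ordering. It is not PmWiki's actual code or algorithm, only one simple way to get the behaviour described above. The function name order_markups, the PHASES list and the sample markup names are all invented for the example, and the '>' prefix for "after" is assumed by analogy with '<'.

```python
PHASES = ["preprocessor", "inline", "postprocessor"]  # assumed phase names


def order_markups(markups):
    """Return markup names in run order.

    markups is a list of (name, phase) pairs in registration order.
    phase is 'X' (during X), '<X' (before X) or '>X' (after X), where X
    is a phase name or the name of another markup. Entries whose target
    is never reached from a phase are silently left out in this sketch.
    """
    before, during, after = {}, {}, {}
    for name, phase in markups:
        if phase.startswith("<"):
            before.setdefault(phase[1:], []).append(name)
        elif phase.startswith(">"):
            after.setdefault(phase[1:], []).append(name)
        else:
            during.setdefault(phase, []).append(name)

    names = {name for name, _ in markups}
    result = []

    def visit(target):
        # Everything registered as "before target" runs first.
        for child in before.get(target, []):
            visit(child)
        # Phases are only anchors; real markups are emitted here.
        if target in names:
            result.append(target)
        # Then "during" and "after" entries, in registration order.
        for child in during.get(target, []) + after.get(target, []):
            visit(child)

    for phase in PHASES:
        visit(phase)
    return result


markups = [
    ("em", "inline"),
    ("strong", "<em"),         # must run before 'em'
    ("tabs", "preprocessor"),
    ("links", ">em"),          # must run after 'em'
    ("first", "<inline"),      # before all other inlines
    ("tidy", "postprocessor"),
]
print(order_markups(markups))
# ['tabs', 'first', 'strong', 'em', 'links', 'tidy']
```

Note that registration order does not matter for the relative entries: 'strong' is registered after 'em' but still ends up ahead of it.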

[...]
> > Perhaps the example given could be an actual
> > implementation of a simple preprocessor?
>
> Generally the footnote extension is referred to as an example as it uses
> all three methods of adding extensions (pre-processors, patterns, and
> post-processors).

Thanks!