Re: [Python-markdown-discuss] cleaning up the code

Actually, how about this:

We can add two subclasses of Preprocessor: TextPreprocessor and
LinePreprocessor.  (For now LinePreprocessor would behave as
Preprocessor does, but we can deprecate it later.)  Each will have a
get_input_type() method which returns "lines" or "text" to signify
what input it expects.  Either would be allowed to return a list of
lines _or_ text.  Markdown will check the type of the returned value
and do the conversion if adjacent preprocessors want different formats.

So, you would be able to do this:

     class FooPreprocessor(TextPreprocessor):
         def run(self, text):
             # foo could return a single string _or_ a list of lines
             return foo(text)

     class BarPreprocessor(LinePreprocessor):
         def run(self, lines):
             # bar could return a single string _or_ a list of lines
             return bar(lines)

You could then insert the BarPreprocessor and the FooPreprocessor into
the queue in any order.  If you put Bar after Foo and Foo.run returns
a string, it will be split into lines before being fed to Bar.run.  If
Foo.run returns a list, it will be fed into Bar.run as is.
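
To make the conversion step concrete, here is a rough sketch of how
Markdown could drive the queue.  This is not existing code: convert(),
run_preprocessors() and the two example classes are made-up names, the
base classes just follow the proposal above, and it assumes Python 3
strings.

```python
class Preprocessor:
    """Hypothetical base class following the proposal; expects lines."""

    def get_input_type(self):
        return "lines"

    def run(self, data):
        raise NotImplementedError


class TextPreprocessor(Preprocessor):
    def get_input_type(self):
        return "text"


class LinePreprocessor(Preprocessor):
    pass  # inherits get_input_type() == "lines"


def convert(data, wanted):
    """Coerce data (a str or a list of lines) to the wanted format."""
    if wanted == "text":
        return data if isinstance(data, str) else "\n".join(data)
    return data.split("\n") if isinstance(data, str) else list(data)


def run_preprocessors(source, preprocessors):
    """Run each preprocessor in order, converting only where needed."""
    data = source
    for p in preprocessors:
        data = p.run(convert(data, p.get_input_type()))
    return data


# Two toy preprocessors (assumed, for illustration only).
class UpperText(TextPreprocessor):
    def run(self, text):
        return text.upper()  # returns a single string


class DropBlank(LinePreprocessor):
    def run(self, lines):
        return [line for line in lines if line.strip()]  # returns a list


result = run_preprocessors("a\n\nb", [UpperText(), DropBlank()])
```

With this shape, the preprocessors never need to know what their
neighbours return; the join or split happens only at the boundaries
where the formats differ.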

  - yuri

On 5/15/07, Erick Tryzelaar <ida...@us...> wrote:
> Yuri Takhteyev wrote:
> > Yes, but how much performance do you gain compared to doing a
> > "\n".join, storing the string, doing whatever you want to it, then
> > returning s.split("\n")?
> >
> > I mean it as a serious empirical question.  If it makes a substantial
> > difference, it would be worth making the API a bit more complicated.
> > If it gains something like 1% in performance, I am not sure it makes
> > sense to introduce a new type of post processor.
>
> You're right, I don't see any performance benefits. I do find it
> semantically easier to work directly with regexes, but I guess I can do
> that in my extensions.
>

-- 
Yuri Takhteyev
UC Berkeley School of Information
http://www.freewisdom.org/