Re: [Python-markdown-discuss] cleaning up the code

Yes, but how much performance do you gain compared to doing a
"\n".join, storing the string, doing whatever you want to it, then
returning s.split("\n")?

I mean it as a serious empirical question.  If it makes a substantial
difference, it would be worth making the API a bit more complicated.
If it gains something like 1% in performance, I am not sure it makes
sense to introduce a new type of post processor.
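
One way to put a number on this is to time a whole-text function directly
against the same function wrapped in a join/split line-based API. What
follows is only a rough sketch under stated assumptions: PATTERN,
text_preprocessor, line_preprocessor and time_both are hypothetical names
made up for illustration, not anything in markdown.py or in the patch.

```python
import re
import timeit

# Hypothetical pattern, for illustration only; not taken from markdown.py.
PATTERN = re.compile(r"^#+ .*$", re.MULTILINE)


def text_preprocessor(text):
    """Work on the whole document as a single string."""
    return PATTERN.sub(lambda m: m.group(0).upper(), text)


def line_preprocessor(lines):
    """Line-based API: join the lines, reuse the text version, split again."""
    return text_preprocessor("\n".join(lines)).split("\n")


def time_both(lines, repeat=100):
    """Return (seconds for text version, seconds for join/split version)."""
    text = "\n".join(lines)
    t_text = timeit.timeit(lambda: text_preprocessor(text), number=repeat)
    t_lines = timeit.timeit(lambda: line_preprocessor(lines), number=repeat)
    return t_text, t_lines


if __name__ == "__main__":
    sample = ["# heading", "some text", ""] * 1000
    t_text, t_lines = time_both(sample)
    print("text: %.4fs  join/split: %.4fs" % (t_text, t_lines))
```

The difference between the two timings is the cost of the extra join and
split, which is the overhead in question. The result will depend on the
document size and on the regex, so it says nothing general by itself.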

I applied your patch and then changed HTML_PREPROCESSOR to work as a
"text" preprocessor, but this gave me no performance improvement on
the "markdown-test".  I am attaching the output of test-markdown.py
with repeat set to 100 times (i.e., 10x the default value).  (The gray
numbers are values for the pre-patch version.)

I do understand that this may not be a fair test.  Can you send me one
that shows more of a difference?

test-markdown.py by default runs all the files in a directory without
any extensions.  However, if the directory name starts with "ext-x-"
then whatever follows "-x-" is taken as a "-"-delimited list of
extensions.  So, if you write an extension "foo" which uses text
preprocessors, the test cases for this extension should go under
"ext-x-foo".
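
The naming rule can be sketched as follows. This is only an illustration
of the convention as described above, not the actual code in
test-markdown.py; extensions_from_dirname is a hypothetical helper name.

```python
PREFIX = "ext-x-"


def extensions_from_dirname(dirname):
    """Return the extension names encoded in a test directory name.

    Directories that do not start with "ext-x-" run without extensions.
    """
    if not dirname.startswith(PREFIX):
        return []
    tail = dirname[len(PREFIX):]
    # Whatever follows "-x-" is a "-"-delimited list of extension names.
    return [name for name in tail.split("-") if name]
```

For example, "ext-x-foo" yields ["foo"] and "ext-x-foo-bar" yields
["foo", "bar"], while a plain directory such as "markdown-test" yields an
empty list.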

(The reason there are no "ext-x" test directories in SVN is that I
started making them, discovered that the wikilinks extension is broken
in the new version and haven't had time to fix it since March.)

  - yuri

On 5/15/07, Erick Tryzelaar <ida...@us...> wrote:
> With the whole text preprocessor, I can use re.finditer to find all the
> matches in a string, instead of having to test a regex against each line
> and maintain state between lines, so it can be a little easier to use. I
> haven't done too many performance tests, but on a large string, it ought
> to be faster since the string searching should remain in the c kernel.
>
> -e
>
>
> Yuri Takhteyev wrote:
> > Thanks for this patch.  About the preprocessors: did you actually get
> > a noticeable performance improvement with this?  If so, I will be
> > happy to put it in.
> >
> >  - yuri
> >
> > On 5/15/07, Erick Tryzelaar <ida...@us...> wrote:
> >> I noticed that the code for markdown.py isn't consistent in how it does
> >> spaces. I've tried to normalize it to the python coding standard,
> >> from here:
> >>
> >> http://www.python.org/dev/peps/pep-0008/
> >>
> >> I've also made the objects subclass from object, if that's alright. This
> >> also assumes that my previous patch has been applied, so if you don't
> >> want the text preprocessors, we'll have to edit this patch.
> >>
> >> I uploaded the patch here, since it's kind of big:
> >>
> >> http://sourceforge.net/tracker/index.php?func=detail&aid=1719072&group_id=153041&atid=790200
> >>
> >>
> >> -e
> >>
> >
> >
>
>

-- 
Yuri Takhteyev
UC Berkeley School of Information
http://www.freewisdom.org/