#115 Handling of large content



I ran into issues with my host when parsing content of about 29 KB: the server bids farewell with a 500 error.

The problem arises with a regex and preg_replace in the function parse_non_string_part() near line 3454 of geshi.php. I suggest not parsing the whole $stuff_to_parse at once, but splitting it at \n and applying preg_replace line by line. That way the content could be very large without running into any limitations (according to a quick search, the preg_* functions may only handle about 100 KB).

I've changed the code near line 3450 in geshi.php as described above, which solved the problem with my host. An item that previously triggered a 500 error is now parsed properly:
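The actual patch did not survive in this archived copy of the thread, so the following is only a minimal sketch of the line-wise approach the poster describes. The function name and the regex are placeholders, not the real code from geshi.php:

```php
<?php
// Hypothetical sketch: split $stuff_to_parse at \n so each preg_replace
// call sees only one line, staying well below the size limits of the
// preg_* functions. The pattern here is a placeholder, not GeSHi's.
function parse_line_wise($stuff_to_parse, $pattern, $replacement)
{
    $result = array();
    foreach (explode("\n", $stuff_to_parse) as $line) {
        $result[] = preg_replace($pattern, $replacement, $line);
    }
    return implode("\n", $result);
}

// Example: wrap every number in placeholder markup.
echo parse_line_wise("foo 1\nbar 22", '/\d+/', '<N>$0</N>');
// prints "foo <N>1</N>\nbar <N>22</N>"
```

Note that this is exactly the explode/implode pattern BenBE objects to below; it trades memory peaks for staying under the preg_* size limit.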



  • BenBE

    BenBE - 2010-10-09

The proposed patch is basically okay, but it doesn't cover all the situations it needs to be applied to, such as keyword matching. Even then, your patch is going to waste some performance on small non-string blocks, and it might even break language files that use regular expressions spanning multiple lines.

Your idea should be refined a bit: split not at every line, but at line breaks once a certain block size is reached. Any ideas on how this could best be realized?
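This refinement could be sketched as follows (a hypothetical helper, not GeSHi code): accumulate text up to a block-size threshold, cut only at the last line break inside that window, and run preg_replace once per chunk. Small inputs are still processed in a single call, and multi-line regexes keep working within a chunk:

```php
<?php
// Hypothetical sketch of splitting at line breaks per block size.
// $block_size is an assumed tuning knob, not a GeSHi setting.
function parse_in_chunks($stuff_to_parse, $pattern, $replacement, $block_size = 65536)
{
    $out = '';
    $offset = 0;
    $len = strlen($stuff_to_parse);
    while ($offset < $len) {
        if ($len - $offset <= $block_size) {
            // Remainder fits in one chunk.
            $chunk = substr($stuff_to_parse, $offset);
            $offset = $len;
        } else {
            // Cut at the last \n within the window; fall back to a hard
            // cut if a single line exceeds the block size.
            $cut = strrpos(substr($stuff_to_parse, $offset, $block_size), "\n");
            $chunk_len = ($cut === false) ? $block_size : $cut + 1;
            $chunk = substr($stuff_to_parse, $offset, $chunk_len);
            $offset += $chunk_len;
        }
        $out .= preg_replace($pattern, $replacement, $chunk);
    }
    return $out;
}
```

Walking the string with substr/strrpos avoids building an array of every line, which sidesteps the explode/implode memory peaks mentioned below.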

BTW: Your patch wastes memory AND performance by not storing the result of the explode call. Furthermore, you'll see clear memory peaks when the implode function is called. That's why GeSHi avoids it like hell ;-)

  • BenBE

    BenBE - 2010-10-09


    This canned response is to tell you that a GeSHi developer has looked at this feature request. Further information may follow. Thanks for your report!

  • BenBE

    BenBE - 2010-10-09
    • assigned_to: nobody --> benbe
    • milestone: --> Next_Release_(Stable)
    • status: open --> open-accepted
