
#1216 Markdown lexer emphasis spans should be paragraph-oriented

Milestone: Bug
Status: closed-fixed
Priority: 4
Updated: 2021-07-26
Created: 2011-08-26
Private: No

According to http://github.github.com/github-flavored-markdown/preview.html
text like these bits:

_starts with underscore

_do_not_emphasize

should *not* be emphasized, because the paragraph ends with an unclosed "_".
Note that in this text:

one two _three
four five_ six

only "three four five" will be emphasized. Leaving off the "_"
after "five" leaves all text on both lines unemphasized.

I'm relying on http://github.github.com/github-flavored-markdown/preview.html
to determine the actual Markdown syntax, since the spec is minimal, even though
GitHub says the preview is deprecated.
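
The rule described above can be sketched roughly as follows. This is only an illustration under simplifying assumptions: paragraphs are separated by blank lines, "_" is the only marker, and a paragraph with an odd number of markers is treated as having an unclosed one. The name emphasized_spans is hypothetical; this is not how LexMarkdown or GitHub's parser is implemented.

```python
import re

# Assumption: a paragraph ends at a blank line (two or more line endings).
PARAGRAPH_BREAK = re.compile(r"(?:\r?\n){2,}")


def emphasized_spans(text, marker="_"):
    """Return the text between paired markers, paragraph by paragraph.

    Hypothetical helper for illustration only.
    """
    spans = []
    for paragraph in PARAGRAPH_BREAK.split(text):
        pieces = paragraph.split(marker)
        # An even number of markers yields an odd number of pieces.
        # Any other count means a marker is left unclosed in this
        # paragraph, so nothing in the paragraph is emphasized.
        if len(pieces) % 2 == 0:
            continue
        # Odd-indexed pieces sit between an opening and a closing marker.
        spans.extend(pieces[1::2])
    return spans
```

With the examples from this report, the two single-line paragraphs yield no spans, while the two-line paragraph yields one span that crosses the line break.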

Discussion

  • Neil Hodgson

    Neil Hodgson - 2011-08-27
    • assigned_to: nobody --> nyamatongwe
    • priority: 5 --> 4
    • milestone: --> Bug
    • labels: --> Scintilla
    • status: open --> open-accepted
     
  • Neil Hodgson

    Neil Hodgson - 2011-08-27

    OK. I'll leave this to any Markdown user.

     
  • Bewied

    Bewied - 2017-03-21

    Over five years later, same problem :(

    Ideally, the parser would recognize that emphasis cannot span paragraphs. I guess checking for two consecutive line endings, (\r?\n)(\r?\n), should be both minimal and sufficient.
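
    A minimal sketch of that check, assuming the lexer already knows where an opening marker sits and only needs to decide whether a closing marker arrives before the paragraph ends. The function name and signature are hypothetical and not taken from LexMarkdown.

```python
import re

# Two consecutive line endings, each either "\n" or "\r\n".
BLANK_LINE = re.compile(r"\r?\n\r?\n")


def closing_marker_in_paragraph(text, open_pos, marker="_"):
    """Return the index of the closing marker, or -1 if a paragraph
    break (or the end of the text) comes first. Hypothetical helper."""
    start = open_pos + len(marker)
    close_pos = text.find(marker, start)
    if close_pos == -1:
        return -1  # no closing marker anywhere
    paragraph_break = BLANK_LINE.search(text, start)
    if paragraph_break is not None and paragraph_break.start() < close_pos:
        return -1  # the paragraph ends before the marker is closed
    return close_pos
```

    A single line break between the markers is still accepted; only a blank line cancels the emphasis.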

     
  • rdipardo

    rdipardo - 2021-06-27

    Patched by implementing Bewied's suggestion above.[1]

    Note that the corrected parsing rules don't conform to any standard, whether GitHub's [2] or otherwise.

    In fact, GitHub's Markdown editor will only apply the emphatic style when a closing token is seen on the same line as the first:

    [Image: GH-markdown-em-syntax-wrong-rendered]

    [Image: GH-markdown-em-syntax-correct]

    [Image: GH-markdown-em-syntax-correct-rendered]

    Evidently GitHub's parser scans backward from the end of the style region, while LexMarkdown does the opposite. Emulating the former's behaviour would involve a substantial rewrite, and the original author's comments make clear that broad feature support was never part of the lexer's design.


    [1]: Two type specifiers were changed and one cast was added to weed out some more signed/unsigned mismatches; new tests are enclosed separately.

    [2]: This link was the best substitute I could find for the obsolete original one

     
  • rdipardo

    rdipardo - 2021-06-27

    On a second attempt, it wasn't that hard to get LexMarkdown to behave in the GitHub way:

    [Image: Bug1216.md-lexed]

    It was just a matter of scanning forward for a closing emphasis marker.

    Which one looks better is a matter of taste; but the original issue was really about compliance with GitHub's interpretation of Markdown syntax, so I guess this newer patch set is a better solution.

    Note: the earlier test files had misspellings that have now been corrected.
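
    As a rough sketch of the idea (not the patch itself), a forward scan that only accepts a closing marker on the same line as the opening one might look like this. The helper name is hypothetical, and intraword rules for "_" are deliberately ignored.

```python
def same_line_emphasis(text, marker="_"):
    """Return the spans enclosed by a marker pair on a single line.

    Hypothetical helper for illustration only.
    """
    spans = []
    for line in text.splitlines():
        pos = 0
        while True:
            open_pos = line.find(marker, pos)
            if open_pos == -1:
                break
            # Scan forward for a closing marker before the line ends.
            close_pos = line.find(marker, open_pos + len(marker))
            if close_pos == -1:
                break  # unclosed on this line: leave the rest unstyled
            spans.append(line[open_pos + len(marker):close_pos])
            pos = close_pos + len(marker)
    return spans
```

    Under this rule the two-line example from the original report is left unstyled, because the closing marker sits on the following line.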

     
  • Neil Hodgson

    Neil Hodgson - 2021-06-28
    • labels: Scintilla --> Scintilla, lexilla, markdown
    • status: open-accepted --> open-fixed
     
  • Neil Hodgson

    Neil Hodgson - 2021-07-26
    • status: open-fixed --> closed-fixed
     
