Heading(<h1-3>) support for HTML parser
Attached is a simple patch to add detection of HTML headings like <h1>Text</h1> and the same for <h2> and <h3>.
This is not yet perfect: it won't detect a heading that contains a link. The patch includes a commented-out regexp that can parse links in headings, but with that one enabled, headings without links are no longer matched.
I'm not very familiar with regular expressions, so maybe someone else wants to improve it and ideally combine the two regexps.
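For anyone picking this up, here is a rough sketch of one way the two cases might be combined. It is not the attached patch and it is written in Python's `re` syntax rather than the regexp dialect the HTML parser uses, so it would need adapting. The idea is to capture everything between the opening and closing heading tag, then strip any inline tags (such as `<a>`) from the captured text. The names `HEADING_RE`, `TAG_RE` and `find_headings` are made up for this illustration.

```python
import re

# Hypothetical pattern: matches <h1>..</h1> through <h3>..</h3>, with
# optional attributes on the opening tag. Group 1 is the heading level,
# group 2 is the raw inner HTML, which may contain <a> or other inline tags.
HEADING_RE = re.compile(
    r"<h([1-3])\b[^>]*>(.*?)</h\1\s*>",
    re.IGNORECASE | re.DOTALL,
)

# Matches any single tag, so it can be removed from the heading text.
TAG_RE = re.compile(r"<[^>]+>")


def find_headings(html):
    """Return (level, text) pairs for every h1-h3 heading in html."""
    headings = []
    for match in HEADING_RE.finditer(html):
        level = int(match.group(1))
        # Drop inline tags such as <a href="...">, keeping only the text.
        text = TAG_RE.sub("", match.group(2))
        # Collapse runs of whitespace left behind by removed tags.
        text = " ".join(text.split())
        headings.append((level, text))
    return headings


if __name__ == "__main__":
    sample = '<h1>Text</h1>\n<h2><a href="#intro">Linked</a> heading</h2>'
    for level, text in find_headings(sample):
        print(level, text)
```

Because the inner text is captured first and cleaned up afterwards, the same pattern covers headings with and without links. The backreference `\1` and the non-greedy `.*?` are not available in every regexp dialect, so a port may have to spell out one pattern per heading level instead.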
Heading detection for HTML parser