I'm trying to clean an HTML page exported from MS
Publisher, and Tidy outputs:
line 418 column 97 - Warning: missing </span> before
line 429 column 189 - Error: <o:p> is not recognized!
line 429 column 189 - Warning: discarding unexpected <o:p>
60 warnings, 19 errors were found! Not all
warnings/errors were shown.
This document has errors that must be fixed before
using HTML Tidy to generate a tidied up version.
However, neither that last message nor the Tidy help explains
how the document errors can be fixed. Isn't Tidy
supposed to fix them? How can I ask it to remove or
ignore the unrecognized <o:p> elements?
My goal is to remove all tables and magic markups from