
#197 garbled page layout

workingwiki
open
None
5
2013-05-07
2012-08-02
Lee Worden
No

I have an example on a private wiki of a page that looks very weird, with the top tabs nested inside a LaTeXML paper, and the sidebar in the wrong column. Seems like it could be a problem somewhere in all the WW/LaTeXML output.

Discussion

  • Lee Worden

    Lee Worden - 2012-08-03

    Interesting, looks like LaTeXML has started using self-closing elements like <.../> in its XHTML, and in our output (which is parsed as regular HTML with MathML) that isn't parsed right: it's parsed as an opening element that never gets closed. And LaTeXML now offers html5 output as well as xhtml, so probably the best thing is to use that. I checked it semi-manually and it does seem to fix the garbling.

     

    Last edit: Lee Worden 2012-08-30
  • Lee Worden

    Lee Worden - 2012-08-03

    There are a number of things in the code that presume that we serve XHTML to some browsers and HTML to others, depending on whether they can handle MathML. Mostly it's obsolete, since we are giving them all the XHTML and trusting MathJax to handle the MathML. Once I make this change it'll be even more obsolete, because even the XHTML won't be XHTML, so to speak, it'll be HTML5. So really I should review that code, and take out the parts that don't apply (should I make $wwUseMathJax=true mandatory and remove it as an option?). If there's a distinction to be made, it's (1) whether to use MathML vs images, and (2) whether to use MathML with MathJax or without it.

     
  • Lee Worden

    Lee Worden - 2012-08-04

    Basic fixes for this are up and running on yushan. All latex is now displayed as html5 by default. I still want to fix up the logic in the code as I said. Also I should notify people who use .latexml.xhtml targets explicitly.

     
  • Lee Worden

    Lee Worden - 2012-08-30

    Got another example of this kind of stuff. It happens in a complicated project on a private wiki, so no link here.

    In this case, the first big problem is a garbled .latexml.html5 file where some of the bibliography items are outside of the <...> and, in fact, even outside of the <...>. It looks like there are too many closing tags, because the wiki's <...> also gets closed before the end of the latexml document. Thus some of the latexml output appears in weird places on the wiki page, including in the sidebar.

    This suggests multiple needs:

    • latexml shouldn't produce garbled output. I suspect (somewhat randomly) that this is brought on by advanced stuff in the .bib file that latexml can't handle. Anyway, we can file bug reports against latexml but we aren't in a position to fix them.
    • my wiki-inclusion script might be partly responsible for the excessive closing tags and for ending the <...> in the wrong place.
    • when scripts produce messed-up html, I shouldn't display it in an innocent way that lets it disrupt the wiki page. This probably means either filtering the html output in some way (which overlaps with the solution to the XSS issue, though in this case tidy could probably go a long way), or putting it in iframes, or both, or something.
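A minimal sketch of the iframe option, assuming the output can be embedded through the HTML5 `srcdoc` attribute. The function name `wrap_in_iframe` and the `title` parameter are hypothetical and not part of WorkingWiki. Because the fragment is escaped into an attribute value, stray closing tags in it cannot close any of the wiki page's own elements.

```python
from html import escape


def wrap_in_iframe(fragment: str, title: str = "script output") -> str:
    """Embed an untrusted HTML fragment as an isolated document.

    Hypothetical helper: the fragment is escaped and placed in the
    iframe's srcdoc attribute, so the browser parses it as a separate
    document rather than as part of the surrounding page.
    """
    return '<iframe sandbox title="{}" srcdoc="{}"></iframe>'.format(
        escape(title, quote=True),
        escape(fragment, quote=True),
    )


# A fragment with too many closing tags, like the garbled output described above.
example = wrap_in_iframe("</div></div><p>stray closing tags</p>")
print(example)
```

One trade-off: an empty `sandbox` attribute also blocks scripts inside the frame, which would matter if MathJax has to run on the embedded output.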

    Before going any further, I need to figure out to what extent each of these 3 layers is involved.

     

    Last edit: Lee Worden 2012-08-30
  • Lee Worden

    Lee Worden - 2012-08-30

    I don't know. The source code is complicated, and the output markup is complicated. It looks like the first problem arises (always start with the first failure, because it could be involved in any of the later failures) at a .bib entry like

    , annote = {first paragraph
    
    second paragraph},
    

    which makes latexml choke at the paragraph break. The markup looks like

    <div class="bibblock">Note: <span class="text bib-note">first paragraph
    <li id="thesis.scaffold.bib.bib13" class="bibitem">
    <span class="bibtag">13</span><div class="bibblock">second paragraph</div>
    </li></span></div>.
    

    Here, I think the final </span></div> is supposed to close the first <div> and <span>, but in HTML5 the </li> closes everything that's open since the previous <li>. So the subsequent </span> and </div> close things from earlier in the document that aren't supposed to get closed yet.

    If I'm right about this, it's a latexml bug, and to fix it we need to construct a minimal example and submit it to the latexml bug tracker.

    This doesn't contradict the point that I shouldn't let markup like this destroy the wiki page's layout, of course.
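The suspected mechanism can be traced with a small script. This is only a sketch under stated assumptions: it models just two HTML5 tree-construction rules (a new `<li>` implicitly closes an open `<li>` in the same list, and an end tag with no matching open element is ignored), it does not handle void elements, and the `CONTEXT` string standing in for the enclosing wiki page and bibliography list is hypothetical. A real browser parser applies many more rules, so this suggests what may be happening rather than proving it.

```python
from html.parser import HTMLParser

LIST_CONTAINERS = {"ul", "ol"}


class LiRuleTracer(HTMLParser):
    """Simplified model of an HTML5 parser's stack of open elements."""

    def __init__(self):
        super().__init__()
        self.stack = []   # (tag, label) for each currently open element
        self.events = []  # (description, labels of the elements closed)

    @staticmethod
    def _label(tag, attrs):
        attrs = dict(attrs)
        ident = attrs.get("id") or attrs.get("class")
        return f"{tag}[{ident}]" if ident else tag

    def handle_starttag(self, tag, attrs):
        if tag == "li":
            # A new <li> closes an open <li> in the same list,
            # together with everything opened inside it.
            for i in range(len(self.stack) - 1, -1, -1):
                open_tag = self.stack[i][0]
                if open_tag in LIST_CONTAINERS:
                    break
                if open_tag == "li":
                    closed = [label for _, label in self.stack[i:]]
                    self.events.append(("new <li> implicitly closed", closed))
                    del self.stack[i:]
                    break
        self.stack.append((tag, self._label(tag, attrs)))

    def handle_endtag(self, tag):
        # Close the nearest open element with this name, if there is one.
        for i in range(len(self.stack) - 1, -1, -1):
            if self.stack[i][0] == tag:
                closed = [label for _, label in self.stack[i:]]
                self.events.append((f"</{tag}> closed", closed))
                del self.stack[i:]
                return
        self.events.append((f"</{tag}> ignored", []))


# Hypothetical surroundings: a wiki container, the bibliography list,
# and the previous bibliography item that is still open.
CONTEXT = '<div id="wiki-content"><ul class="biblist"><li id="bib12">'

# The markup quoted in this comment.
FRAGMENT = '''<div class="bibblock">Note: <span class="text bib-note">first paragraph
<li id="thesis.scaffold.bib.bib13" class="bibitem">
<span class="bibtag">13</span><div class="bibblock">second paragraph</div>
</li></span></div>.'''

tracer = LiRuleTracer()
tracer.feed(CONTEXT + FRAGMENT)
tracer.close()

for description, labels in tracer.events:
    print(description, labels)
print("still open:", [label for _, label in tracer.stack])
```

In this model the inner `<li>` closes the previous item along with its `<div>` and `<span>`, the later `</span>` finds nothing to close, and the final `</div>` ends up closing the hypothetical wiki container.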

 
