I have an example on a private wiki of a page that looks very weird, with the top tabs nested inside a LaTeXML paper, and the sidebar in the wrong column. Seems like it could be a problem somewhere in all the WW/LaTeXML output.
Interesting, looks like LaTeXML has started using self-closing elements like \
in its XHTML. Our output is parsed as regular HTML with MathML, where such an element isn't parsed right: it's treated as an opening element that never gets closed. LaTeXML now offers html5 output as well as xhtml, so probably the best thing is to use that. I checked it semi-manually and it does seem to fix the garbling.
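To illustrate the parsing problem: an HTML parser ignores the trailing slash on a non-void element, so a self-closing element opens and never closes. The sketch below is not the fix described above (which is switching LaTeXML to html5 output). It is a hypothetical alternative that rewrites self-closing non-void elements into explicit open/close pairs. The function name `expand_self_closing` and the regex are assumptions for illustration, and a regex is only a rough stand-in for a real parser.

```python
import re

# Elements that HTML treats as void: they never take a closing tag,
# so a trailing slash on them is harmless.
VOID_ELEMENTS = {
    "area", "base", "br", "col", "embed", "hr", "img", "input",
    "link", "meta", "param", "source", "track", "wbr",
}

# Matches an XML-style self-closing tag such as <div class="x"/>,
# capturing the tag name and the raw attribute text.
SELF_CLOSING = re.compile(r"<([A-Za-z][A-Za-z0-9:-]*)((?:\s[^<>]*?)?)\s*/>")


def expand_self_closing(markup: str) -> str:
    """Rewrite <tag .../> as <tag ...></tag> for non-void elements (hypothetical helper)."""
    def fix(match):
        name, attrs = match.group(1), match.group(2)
        if name.lower() in VOID_ELEMENTS:
            return match.group(0)  # leave <br/>, <img .../> and friends alone
        return f"<{name}{attrs}></{name}>"
    return SELF_CLOSING.sub(fix, markup)


print(expand_self_closing('<div class="a"/><br/>'))
```

This would only paper over the symptom for one kind of malformed output; asking LaTeXML for html5 directly avoids the mismatch at its source.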
Last edit: Lee Worden 2012-08-30
There are a number of things in the code that presume that we serve XHTML to some browsers and HTML to others, depending on whether they can handle MathML. Mostly it's obsolete, since we are giving them all the XHTML and trusting MathJax to handle the MathML. Once I make this change it'll be even more obsolete, because even the XHTML won't be XHTML, so to speak, it'll be HTML5. So really I should review that code, and take out the parts that don't apply (should I make $wwUseMathJax=true mandatory and remove it as an option?). If there's a distinction to be made, it's (1) whether to use MathML vs images, and (2) whether to use MathML with MathJax or without it.
Basic fixes for this are up and running on yushan. All latex is now displayed as html5 by default. I still want to fix up the logic in the code as I said. Also I should notify people who use .latexml.xhtml targets explicitly.
Got another example of this kind of stuff. It happens in a complicated project on a private wiki, so no link here.
In this case, the first big problem is a garbled .latexml.html5 file where some of the bibliography items are outside of the \ and, in fact, even outside of the \ . It looks like there are too many closing tags, because the wiki's \ also gets closed before the end of the latexml document. Thus some of the latexml output appears in weird places on the wiki page, including in the sidebar.
This suggests multiple needs:
1. LaTeXML shouldn't produce garbled output. I suspect (somewhat randomly) that this is brought on by advanced stuff in the .bib file that latexml can't handle. Anyway, we can file bug reports against latexml, but we aren't in a position to fix them.
2. My wiki-inclusion script might be partly responsible for the excessive closing tags and for ending the \ in the wrong place.
3. When scripts produce messed-up html, I shouldn't display it in a naive way that lets it disrupt the wiki page. This probably means either filtering the html output in some way (which overlaps with the solution to the XSS issue, though in this case tidy could probably go a long way), or putting it in iframes, or both.
Before going any further, I need to figure out to what extent each of these 3 layers is involved.
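One way to start separating the layers is to check each intermediate file for tag balance before it is embedded in the wiki page. The following is a minimal sketch using Python's standard `html.parser`, not anything from the WorkingWiki code. The names `BalanceChecker` and `check_fragment` are hypothetical, and the check is much cruder than what tidy or a full HTML5 parser would do.

```python
from html.parser import HTMLParser

# Void elements never have a closing tag, so they are not tracked.
VOID_ELEMENTS = {
    "area", "base", "br", "col", "embed", "hr", "img", "input",
    "link", "meta", "param", "source", "track", "wbr",
}


class BalanceChecker(HTMLParser):
    """Track open elements and record closing tags that match nothing."""

    def __init__(self):
        super().__init__()
        self.open_elements = []   # stack of currently open tag names
        self.stray_closers = []   # closing tags with no matching opener

    def handle_starttag(self, tag, attrs):
        if tag not in VOID_ELEMENTS:
            self.open_elements.append(tag)

    def handle_startendtag(self, tag, attrs):
        # HTML parsers ignore the trailing slash on non-void elements,
        # so <div/> is counted as an opener, as a browser would treat it.
        self.handle_starttag(tag, attrs)

    def handle_endtag(self, tag):
        if tag in self.open_elements:
            # Close everything up to and including the matching opener.
            while self.open_elements.pop() != tag:
                pass
        else:
            self.stray_closers.append(tag)


def check_fragment(markup):
    """Return (stray closing tags, elements still open at the end)."""
    checker = BalanceChecker()
    checker.feed(markup)
    checker.close()
    return checker.stray_closers, checker.open_elements


print(check_fragment("<div><p>ok</p></div></div>"))
```

A stray closing tag in the raw latexml output would point at latexml itself; one that appears only after inclusion would point at the wiki-inclusion script. Either way, a fragment that fails the check is a candidate for filtering or an iframe rather than direct embedding.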
Last edit: Lee Worden 2012-08-30
I don't know. The source code is complicated, and the output markup is complicated. It looks like the first problem arises (always start with the first failure, because it could be involved in any of the later failures) at a .bib entry like
    , annote = {first paragraph

    second paragraph},
which makes latexml choke at the paragraph break. The markup looks like
Do you know why this paragraph looks like this on my email?
In this case, the first big problem is a garbled .latexml.html5 file where
some of the bibliography items are outside of the and, in fact, even outside
of the
Just curious,
JD
yes, the new bug tracker uses markdown, which means instead of writing
\
I have to write \\
, or else it gets formatted wrong. I've
caught some other formatting issues too. I fixed that one, and others,
as soon as I saw them, but that happens after the email has gone out.
Last edit: Lee Worden 2012-08-31
yes, the new bug tracker uses markdown, which means instead of writing
I have to write
, or else it gets formatted wrong. I've
caught some other formatting issues too. I fixed that one, and others,
as soon as I saw them, but that happens after the email has gone out.
Thanks, but I remain curious. What is this mystery tag that you can't
email me by any means?
JD
This is hilarious. I responded to the email I got with your message in it, and didn't realize I was responding back to the bug tracker, not directly to you. So I didn't use backslashes, and I added my message to the ticket when I thought I was just writing to you, and then it sent the comment to you, but after parsing it and removing the mystery tags. Your response is also on the tracker now. The mystery tag I used as an example is "div", but there are different tags involved in the actual bug ticket. Click through to the tracker to get the readable version.
I don't know. The source code is complicated, and the output markup is complicated. It looks like the first problem arises (always start with the first failure, because it could be involved in any of the later failures) at a .bib entry like
which makes latexml choke at the paragraph break. The markup looks like
Here, I think the final \\ is supposed to close the first \
If I'm right about this, it's a latexml bug, and to fix it we need to construct a minimal example and submit it to the latexml bug tracker.
This doesn't contradict the point that I shouldn't let markup like this destroy the wiki page's layout, of course.
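As a starting point for building that minimal example, a small script can list the places in a .bib file where a blank line (a TeX paragraph break) falls inside a braced entry. This is a rough sketch under stated assumptions: the function name `blank_lines_in_bib_entries` is hypothetical, brace depth is counted naively (escaped braces and `@comment` blocks are ignored), and it has not been checked against the actual file from this ticket.

```python
def blank_lines_in_bib_entries(bib_text):
    """Return 1-based line numbers of blank lines inside a braced BibTeX entry."""
    depth = 0   # current brace nesting depth, counted naively
    hits = []
    for lineno, line in enumerate(bib_text.splitlines(), start=1):
        if depth > 0 and not line.strip():
            hits.append(lineno)
        depth += line.count("{") - line.count("}")
        depth = max(depth, 0)  # tolerate unbalanced closing braces
    return hits


# Made-up entry in the same shape as the annote field quoted in this ticket.
sample = """@misc{example,
  title = {A title}
, annote = {first paragraph

second paragraph},
}
"""
print(blank_lines_in_bib_entries(sample))
```

Each reported line is a candidate to cut down into a one-entry .bib file and run through latexml on its own.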