
#1001 Tag validation false positives for some tags from html2 filter

Milestone: 5.3
Status: closed-fixed
Priority: 5
Updated: 2020-08-03
Created: 2020-07-04
Private: No

In html (using html2 filter), if you have "<form> <span> text <input> text </span></form>", you get tags <s0>, <i1>, </s0> and if you paste that in the translation too, you'll get a tag validation error: bad nesting.
It considers <i1> a start tag, since it is not like '<i1/>', and thus expects an end-tag before </s0>.

Same for <br>
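To illustrate the report, here is a minimal sketch of an XML-style nesting check. It is not OmegaT's actual validation code; the `TAG` pattern and the function `naive_nesting_errors` are hypothetical names made up for this example. It shows why `<i1>` is counted as a start tag while `<i1/>` is not.

```python
import re

# Hypothetical pattern for shortcut tags such as <s0>, </s0>, <i1>, <i1/>, <br1/>.
TAG = re.compile(r"<(/?)([a-z]+\d+)(/?)>")

def naive_nesting_errors(segment):
    """Validate tags as if they were XML: every <x> needs a matching </x>."""
    stack, errors = [], []
    for closing, name, standalone in TAG.findall(segment):
        if standalone:                 # <i1/> needs no partner
            continue
        if not closing:                # <i1> is treated as a start tag
            stack.append(name)
        elif stack and stack[-1] == name:
            stack.pop()
        else:
            errors.append(f"bad nesting at </{name}>")
            if name in stack:          # recover: drop everything opened since <name>
                del stack[stack.index(name):]
    errors.extend(f"unclosed <{name}>" for name in stack)
    return errors

print(naive_nesting_errors("<s0> text <i1> text </s0>"))   # one bad-nesting error
print(naive_nesting_errors("<s0> text <i1/> text </s0>"))  # no errors
```

With `<i1>` written as a start tag, the check reaches `</s0>` while `i1` is still open and flags it, even though the translation matches the source exactly.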

possible fixes:

  1. don't complain if the source has no closing tag either
  2. fix html2 filter to create <i1/>, <br1/> etc. (breaks existing translations!)
  3. allow ignoring these errors
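Option 1 could be sketched roughly as follows. This is an assumption about how such a check might look, not the fix that was actually committed; `unpaired_in_source` and `nesting_errors` are hypothetical names. The idea is to collect the tags that have no end tag in the source and to treat them as standalone when checking the translation.

```python
import re

# Hypothetical pattern for shortcut tags such as <s0>, </s0>, <i1>, <i1/>, <br1/>.
TAG = re.compile(r"<(/?)([a-z]+\d+)(/?)>")

def unpaired_in_source(source):
    """Names that occur as a start tag in the source but never as an end tag."""
    opened, closed = set(), set()
    for closing, name, standalone in TAG.findall(source):
        if standalone:
            continue
        (closed if closing else opened).add(name)
    return opened - closed

def nesting_errors(source, translation):
    """Check nesting in the translation, skipping tags left unclosed in the source."""
    standalone_names = unpaired_in_source(source)
    stack, errors = [], []
    for closing, name, standalone in TAG.findall(translation):
        if standalone or (not closing and name in standalone_names):
            continue                   # nothing to pair up
        if not closing:
            stack.append(name)
        elif stack and stack[-1] == name:
            stack.pop()
        else:
            errors.append(f"bad nesting at </{name}>")
            if name in stack:          # recover: drop everything opened since <name>
                del stack[stack.index(name):]
    errors.extend(f"unclosed <{name}>" for name in stack)
    return errors

src = "<s0> text <i1> text </s0>"
print(nesting_errors(src, src))                    # no errors
print(nesting_errors(src, "<s0> text <i1> text"))  # <s0> is still reported as unclosed
```

Genuinely mispaired tags are still reported; only tags that the source itself leaves unclosed are exempt.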

Discussion

  • Aaron Madlon-Kay

    • Description has changed:

    Diff:

    --- old
    +++ new
    @@ -1,9 +1,10 @@
    -In html (using html2 filter), if you have "<form> <span> text <input> text </span></form>", you get tags <s0>, <i1>, </s0> and if you paste that in the translation too, you'll get a tag validation error: bad nesting.
    -It considers <i1> a start tag, since it is not like '<i1/>', and thus expects an end-tag before </s0>.
    +In html (using html2 filter), if you have "`<form> <span> text <input> text </span></form>`", you get tags `<s0>`, `<i1>`, `</s0>` and if you paste that in the translation too, you'll get a tag validation error: bad nesting.
    +It considers `<i1>` a start tag, since it is not like '`<i1/>`', and thus expects an end-tag before `</s0>`.
    
    -Same for <br>
    +Same for `<br>`
    
     possible fixes: 
    -a) don't complain if the source has no closing tag either
    -b) fix html2 filter to create <i1/>, <br1/> etc.  (breaks existing translations!)
    -c) allow ignoring these errors
    +
    +1. don't complain if the source has no closing tag either
    +2. fix html2 filter to create `<i1/>`, `<br1/>` etc.  (breaks existing translations!)
    +3. allow ignoring these errors
    
     
  • Aaron Madlon-Kay

    There is an impedance mismatch between HTML's conception of tags and OmegaT's conception of tags (basically XML-like).

    In my opinion it doesn't make sense to make the tag validation logic try to handle this, because it would need to understand the HTML spec to know what tags are OK to leave unclosed, and that is the filter's job.

    I also don't like making tag validation too configurable, because it increases the user's burden. How can a user know what is valid and what's not? It will probably depend on the specific project and maybe even differ between files in the same project.

    Thus I think the correct thing to do is ensure that the HTML filter outputs tags compatible with OmegaT's tag format; this is your suggestion (2). Yes it's breaking, but that isn't necessarily disqualifying.

    A bigger idea is to allow filters to participate in the validation process somehow. Obviously knowledge of what HTML tags can be standalone belongs in the HTML filter; it makes sense then that the HTML filter should know if tags in the translation are valid.
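As a rough illustration of keeping that knowledge on the filter side (suggestion 2), the sketch below emits a self-closing shortcut for void elements. The function `shortcut` and its naming scheme (first letter plus a running index, with `br` kept whole) are assumptions inferred from the `<s0>`, `<i1>`, `<br1/>` examples in this ticket, not the html2 filter's real code. The element list is the set of void elements from the HTML Living Standard.

```python
# Void elements per the HTML Living Standard: they never take an end tag.
VOID_ELEMENTS = frozenset({
    "area", "base", "br", "col", "embed", "hr", "img",
    "input", "link", "meta", "source", "track", "wbr",
})

def shortcut(element, index):
    """Hypothetical shortcut builder: <s0> for paired tags, <i1/> for void ones."""
    # Assumed naming scheme, chosen only to match the examples in the ticket.
    prefix = "br" if element == "br" else element[0]
    suffix = "/" if element in VOID_ELEMENTS else ""
    return f"<{prefix}{index}{suffix}>"

print(shortcut("span", 0))   # <s0>
print(shortcut("input", 1))  # <i1/>
print(shortcut("br", 1))     # <br1/>
```

Because the filter marks void elements itself, an XML-like validator needs no knowledge of the HTML spec; the cost is that segments stored with the old `<i1>` form no longer match.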

     
  • Aaron Madlon-Kay

    Actually "1. don't complain if the source has no closing tag either" is a reasonable idea too. I don't expect it will be easy to implement though.

     
  • Martin Fleurke

    Martin Fleurke - 2020-07-04

    I implemented option 1 (don't complain if the source has no closing tag either).
    It was a fairly simple fix.

     
  • Aaron Madlon-Kay

    • summary: some tags recognized as start-tag causing validation errors --> Tag validation false positives for some tags from html2 filter
    • status: open --> open-fixed
     
  • Aaron Madlon-Kay

    Fixed in OmegaT 5.3.0.

     
  • Aaron Madlon-Kay

    • status: open-fixed --> closed-fixed
     
