#930 single ampersands do not get changed to &amp;

Status: closed-out-of-date
Owner: nobody
Labels: None
Priority: 5
Updated: 2015-12-22
Created: 2010-03-04
Private: No

If '--preserve-entities yes' is part of the configuration, ampersands that are not part of an entity do not get escaped to &amp;.

Contents of test file:
--- start test file ---
<p>asdlkfjdsalk &&& & &mdash; ldsakfjladsj</p>
--- end test file ---

Tidy version:
[Thu Mar 04 15:44:28] ~: tidy -v
HTML Tidy for Mac OS X released on 25 March 2009

Computer is Mac Book Pro 2.2GHz Intel Core 2 Duo with 4GB RAM running OS X 10.6.2.

Tidy command used to reproduce issue:
[Thu Mar 04 15:43:18] ~: tidy -m -wrap 0 --preserve-entities yes Desktop/blah.txt
line 1 column 1 - Warning: missing <!DOCTYPE> declaration
line 1 column 1 - Warning: inserting implicit <body>
line 1 column 17 - Warning: unescaped & which should be written as &amp;
line 1 column 18 - Warning: unescaped & which should be written as &amp;
line 1 column 19 - Warning: unescaped & which should be written as &amp;
line 1 column 21 - Warning: unescaped & which should be written as &amp;
line 1 column 1 - Warning: inserting missing 'title' element
Info: Document content looks like HTML 4.01 Strict
Info: No system identifier in emitted doctype
7 warnings, 0 errors were found!

--- start output of above command ---
<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01//EN">
<html>
<head>
<meta name="generator" content="HTML Tidy for Mac OS X (vers 25 March 2009), see www.w3.org">
<title></title>
</head>
<body>
<p>asdlkfjdsalk &&& & &mdash; ldsakfjladsj</p>
</body>
</html>
--- end output of above command ---

Tidy command that works (i.e., ampersands that are not part of an entity are changed to &amp;):
[Thu Mar 04 15:43:57] ~: tidy -m -wrap 0 Desktop/blah.txt
line 1 column 1 - Warning: missing <!DOCTYPE> declaration
line 1 column 1 - Warning: inserting implicit <body>
line 1 column 17 - Warning: unescaped & which should be written as &amp;
line 1 column 18 - Warning: unescaped & which should be written as &amp;
line 1 column 19 - Warning: unescaped & which should be written as &amp;
line 1 column 21 - Warning: unescaped & which should be written as &amp;
line 1 column 1 - Warning: inserting missing 'title' element
Info: Document content looks like HTML 4.01 Strict
Info: No system identifier in emitted doctype
7 warnings, 0 errors were found!

--- start output of above command ---
<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01//EN">
<html>
<head>
<meta name="generator" content="HTML Tidy for Mac OS X (vers 25 March 2009), see www.w3.org">
<title></title>
</head>
<body>
<p>asdlkfjdsalk &amp;&amp;&amp; &amp; &mdash; ldsakfjladsj</p>
</body>
</html>
--- end output of above command ---
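
For reference, the behavior this report expects (bare ampersands escaped, existing entities left alone) can be sketched in a few lines of Python. This is only an illustration of the expected result, not how Tidy implements it; the function name `escape_bare_ampersands` and the regular expression are assumptions made for this example.

```python
import re

# Matches an "&" that does NOT begin a named (&mdash;), decimal (&#8212;),
# or hexadecimal (&#x2014;) character reference.
BARE_AMPERSAND = re.compile(
    r"&(?![A-Za-z][A-Za-z0-9]*;|#[0-9]+;|#[xX][0-9A-Fa-f]+;)"
)


def escape_bare_ampersands(text: str) -> str:
    """Escape ampersands that are not part of an entity; keep entities as-is."""
    return BARE_AMPERSAND.sub("&amp;", text)


# The line from the test file in this report.
sample = "<p>asdlkfjdsalk &&& & &mdash; ldsakfjladsj</p>"
print(escape_bare_ampersands(sample))
# prints: <p>asdlkfjdsalk &amp;&amp;&amp; &amp; &mdash; ldsakfjladsj</p>
```

The printed line matches the paragraph in the output of the command that works, with `&mdash;` untouched.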

Discussion

  • Geoff - 2015-12-22
    • status: open --> closed-out-of-date
    • Group: --> Current - all platforms

  • Geoff - 2015-12-22

    Thanks for the report... now long ago... sorry for the delay...

    The Tidy source has moved to https://github.com/htacg/tidy-html5,
    and the site to http://www.html-tidy.org/

    Quite recently Tidy has had some updates concerning ampersands...

    Running your sample through modern Tidy with the preserve-entities option on does not escape any of the ampersands... so maybe this is solved?

    In the meantime, closing this as out-of-date...

    If there is still a bug in modern Tidy, please file an issue. If you find and fix the problem in a Tidy fork, you can submit a Pull Request.

    Tidy needs your support...

     
