HTML Tidy

Tracker: Bugs

5 Tidy not cleaning HTML generated by MS Word 2000 / Word 2002 - ID: 795245
Last Update: Settings changed ( hoehrmann )

Dear Friends,

We are using Tidy.dll and Tidy.cs as suggested at:
http://www.mattstan.pwp.blueyonder.co.uk/tidy/tidycs.html

We are using these with Visual Studio .NET 2002 and C#.

The application works fine except when we take content
from Microsoft Word 2000 or Word 2002 and pass it through
Tidy for cleaning and better formatting. Tidy does not seem
to clean the HTML generated by Word 2000 and Word 2002.
We also tried setting the "word-2000" option to "yes" in
the Tidy configuration, but this does not help either.

Following is the configuration we use for Tidy:
--------------------------------------------------------------

tidy-mark: no
show-body-only: yes
replace-color: yes
indent: auto
indent-spaces: 2
wrap: 72
markup: yes
clean: no
show-warnings: yes
numeric-entities: yes
quote-marks: yes
quote-nbsp: yes
quote-ampersand: no
break-before-br: yes
uppercase-tags: yes
uppercase-attributes: no
word-2000: yes
char-encoding: latin1
new-inline-tags: cfif, cfelse, math, mroot, mrow, mi, mn, mo, msqrt, mfrac, msubsup, munderover, munder, mover, mmultiscripts, msup, msub, mtext, mprescripts, mtable, mtr, mtd, mth
new-blocklevel-tags: cfoutput, cfquery
new-empty-tags: cfelse

--------------------------------------------------------------
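
One way to check this configuration outside the Tidy.dll / Tidy.cs wrapper is to run the same options through the command-line tidy on a small Word-style file. The following is only a minimal sketch in Python under several assumptions: a `tidy` executable is on the PATH, the option list is trimmed to a few of the settings above, and `SAMPLE_HTML`, `write_repro_files`, `build_command`, and `run_tidy` are hypothetical names and data that are not part of the original report.

```python
import shutil
import subprocess
import tempfile
from pathlib import Path

# A trimmed copy of the options listed in the report.
CONFIG_TEXT = """\
tidy-mark: no
show-body-only: yes
clean: no
word-2000: yes
char-encoding: latin1
"""

# Hypothetical stand-in for Word output; NOT taken from the report.
SAMPLE_HTML = """\
<html xmlns:o="urn:schemas-microsoft-com:office:office">
<head><title>Sample</title></head>
<body>
<p class=MsoNormal><span style='mso-bidi-font-size:10.0pt'>Hello<o:p></o:p></span></p>
</body>
</html>
"""


def write_repro_files(directory):
    """Write the config and the sample input into `directory`; return both paths."""
    directory = Path(directory)
    directory.mkdir(parents=True, exist_ok=True)
    config_path = directory / "tidy.cfg"
    input_path = directory / "word.html"
    config_path.write_text(CONFIG_TEXT, encoding="latin-1")
    input_path.write_text(SAMPLE_HTML, encoding="latin-1")
    return config_path, input_path


def build_command(config_path, input_path, tidy_exe="tidy"):
    """Build the argument list for `tidy -config <file> <input>`."""
    return [tidy_exe, "-config", str(config_path), str(input_path)]


def run_tidy():
    """Run tidy on the sample; return (stdout, stderr), or None if tidy is not installed."""
    exe = shutil.which("tidy")
    if exe is None:
        return None
    with tempfile.TemporaryDirectory() as tmp:
        config_path, input_path = write_repro_files(tmp)
        # Tidy exits non-zero when it reports warnings or errors, so do not use check=True.
        result = subprocess.run(
            build_command(config_path, input_path, exe),
            capture_output=True,
            text=True,
            encoding="latin-1",
        )
        return result.stdout, result.stderr
```

Calling `run_tidy()` returns the cleaned markup together with Tidy's diagnostics, which can then be compared with what the wrapper produces for the same input.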

Could you please suggest any config options we need to
set or change to resolve this problem?

We would appreciate any help that you can provide to
solve this problem.

Thank you...

Best Regards,
Maulik Bhansali.
nbhama1@netweb.soft.net


Submitted: Maulik Bhansali ( maulikb ) - 2003-08-26 10:15

Priority: 5

Status: Closed

Resolution: Works For Me

Assigned: Björn Höhrmann

Category: HTML/XHTML Parser

Group: Current - Other platforms

Visibility: Public


Comment ( 1 )

Date: 2003-08-27 07:47
Sender: hoehrmann (Project Admin)

Logged In: YES
user_id=188003

Do you have a small test file to reproduce the problem?


Attached File

No Files Currently Attached

Changes ( 4 )

Field          Old Value  Date              By
status_id      Open       2004-01-17 11:34  hoehrmann
resolution_id  None       2004-01-17 11:34  hoehrmann
assigned_to    nobody     2004-01-17 11:34  hoehrmann
close_date     -          2004-01-17 11:34  hoehrmann