from jtidy-devel mailing list
Hi,
I'm using JTidy to clean up Word HTML.
I've come across the following case (I've removed all of the Word
attributes and comments for clarity):
Start Middle End
Note the spaces before the word "Middle".
When I run this through JTidy (jtidy-r8-SNAPSHOT 2005-01-05), I get:
StartMiddleEnd
The Node method trimInitialSpace seems designed to handle this case,
but possibly does not handle nested inlines?
I've made the following change, which seems to rectify this issue:
--- src/main/org/w3c/tidy/Node.orig 2005-01-06 09:14:05.000000000 +1100
+++ src/main/org/w3c/tidy/Node.java 2005-01-06 10:23:26.000000000 +1100
@@ -983,9 +983,9 @@
element.prev = node;
node.parent = element.parent;
}
+ // discard the space in current node
+ ++text.start;
}
- // discard the space in current node
- ++text.start;
}
}
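To illustrate the effect of the change, here is a simplified, self-contained model of the logic. It is a sketch, not JTidy's actual code: the class TrimInitialSpaceSketch, its Node type, and the "patched" flag are hypothetical. The relocation step (moving the space in front of the element) is an assumption about what the surrounding code does, based on the context lines in the diff, and the nested markup in run() is an assumed stand-in for the stripped test case.

```java
import java.util.ArrayList;
import java.util.List;

public class TrimInitialSpaceSketch {

    /** Hypothetical, simplified node: a text node has tag == null. */
    static final class Node {
        final String tag;
        String text = "";
        Node parent;
        final List<Node> children = new ArrayList<>();

        Node(String tag) { this.tag = tag; }

        static Node text(String s) {
            Node n = new Node(null);
            n.text = s;
            return n;
        }

        static Node element(String tag, Node... kids) {
            Node n = new Node(tag);
            for (Node k : kids) {
                k.parent = n;
                n.children.add(k);
            }
            return n;
        }

        boolean isText() { return tag == null; }

        /** Concatenated text of this subtree, ignoring markup. */
        String textContent() {
            if (isText()) return text;
            StringBuilder sb = new StringBuilder();
            for (Node c : children) sb.append(c.textContent());
            return sb.toString();
        }
    }

    /**
     * Model of the trimming step. Every element is treated as inline here.
     * patched == false: the leading space is always discarded.
     * patched == true: it is discarded only after being moved in front
     * of the element.
     */
    static void trimInitialSpace(Node element, Node text, boolean patched) {
        if (!text.isText() || !text.text.startsWith(" ")) return;

        Node parent = element.parent;
        int idx = (parent == null) ? -1 : parent.children.indexOf(element);
        if (idx > 0) { // element is not the first child of its parent
            Node prev = parent.children.get(idx - 1);
            if (prev.isText()) {
                if (!prev.text.endsWith(" ")) prev.text += " ";
            } else {
                Node space = Node.text(" ");
                space.parent = parent;
                parent.children.add(idx, space);
            }
            if (patched) text.text = text.text.substring(1);
        }
        if (!patched) text.text = text.text.substring(1);
    }

    /** Nested inlines: <p>Start<b><i> Middle</i></b> End</p> (assumed markup). */
    public static String run(boolean patched) {
        Node middle = Node.text(" Middle");
        Node i = Node.element("i", middle);
        Node b = Node.element("b", i);
        Node p = Node.element("p", Node.text("Start"), b, Node.text(" End"));
        trimInitialSpace(i, middle, patched);
        return p.textContent();
    }

    /** Single inline with a preceding text sibling: <p>Start<i> Middle</i></p>. */
    public static String runSibling(boolean patched) {
        Node middle = Node.text(" Middle");
        Node i = Node.element("i", middle);
        Node p = Node.element("p", Node.text("Start"), i);
        trimInitialSpace(i, middle, patched);
        return p.textContent();
    }

    public static void main(String[] args) {
        System.out.println("nested, original:  [" + run(false) + "]");
        System.out.println("nested, patched:   [" + run(true) + "]");
        System.out.println("sibling, original: [" + runSibling(false) + "]");
        System.out.println("sibling, patched:  [" + runSibling(true) + "]");
    }
}
```

In this model the two variants differ only when the inline element is the first child of its parent, as with the nested case: the original ordering drops the space without relocating it, while the patched ordering leaves it in place. The sibling case gives the same text either way.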
Is this likely to cause other issues?
Regards,
Justin.
Logged In: YES
user_id=798060
Test case added. Unable to reproduce with the snippet provided.
Logged In: YES
user_id=577796
The issue is with the space before the word "Middle". I'm
sending files to the mailing list, as I don't seem to have an
option to attach files.