From: JEFFREY H. <has...@sb...> - 2010-08-19 22:06:38
This issue may actually be related to BUG 3031869, as there is whitespace in the tag: they have a space between the end of the tag name and the > bracket.

Jeffrey Haskovec

________________________________
From: JEFFREY HASKOVEC <has...@sb...>
To: nek...@li...
Sent: Thu, August 19, 2010 3:05:12 PM
Subject: How does NekoHTML deal with custom tags in an HTML file

I was trying to parse a NYTimes.com story and I am not seeing any of the nodes with the article in them. I noticed that they use some custom tags around the article, such as:

<NYT_TEXT >

What would be the behavior of NekoHTML upon parsing a tag such as that? It seems like it is dropping everything within it, or else I am not seeing the expected nodes that live inside of that tag.

Thanks,
Jeffrey Haskovec
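
One way to test the whitespace theory is to strip the space before the closing bracket and parse the page again. The sketch below is only an assumption about a possible workaround, not a confirmed fix for BUG 3031869 and not part of NekoHTML. The class name TagWhitespace and the method normalize are hypothetical, and it uses only java.util.regex. It deliberately handles only tags with no attributes, such as <NYT_TEXT > and </NYT_TEXT >, and leaves everything else untouched.

```java
import java.util.regex.Pattern;

// Hypothetical helper (not part of NekoHTML): removes whitespace that sits
// between a bare tag name and the closing '>' so that "<NYT_TEXT >" becomes
// "<NYT_TEXT>". Tags that carry attributes are intentionally left alone.
public class TagWhitespace {

    // '<', an optional '/', a tag name, one or more whitespace characters, '>'.
    private static final Pattern TRAILING_SPACE =
            Pattern.compile("<(/?[A-Za-z][A-Za-z0-9_:-]*)\\s+>");

    public static String normalize(String html) {
        // Keep the captured "/name" part and drop the whitespace before '>'.
        return TRAILING_SPACE.matcher(html).replaceAll("<$1>");
    }

    public static void main(String[] args) {
        String sample = "<NYT_TEXT >article body</NYT_TEXT >";
        System.out.println(normalize(sample));
    }
}
```

If the article nodes appear after feeding the normalized string to the parser, that would point at the trailing whitespace as the cause. If they are still missing, the custom tag name itself is the more likely culprit.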