[Htmlparser-user] Presentation tags
From: Chris B. <cba...@mi...> - 2011-06-30 09:39:56
Hi there,

I use Aperture to extract text, and Aperture runs Htmlparser when processing HTML. My question relates to the handling of presentation tags such as <u>, <b> and <i> when they are embedded within a word - for example:

<html><body><u>north</u>ern</body></html>

I would expect to be delivered the single word "northern" - but instead I get two tokens, "north" and "ern", which is clearly wrong in this context. It seems that Htmlparser is replacing tags with whitespace, including inline tags like these - why is this?

Thanks for any help.

- Chris
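
To make the behaviour in question concrete, here is a minimal sketch of the two tag-handling policies. It is written in Python with the standard-library html.parser module, not with Htmlparser or Aperture, and it does not claim to reflect how either of them works internally. The names INLINE_TAGS, TextExtractor, extract_tokens and pad_inline are made up for this illustration, and the set of inline tags is an assumption, not a complete list.

```python
from html.parser import HTMLParser

# Assumption: an illustrative (not exhaustive) set of inline presentation tags.
INLINE_TAGS = {"u", "b", "i", "em", "strong", "span", "font"}


class TextExtractor(HTMLParser):
    """Collects text, emitting a space at tag boundaries.

    pad_inline=True  -> every tag becomes whitespace (the behaviour observed).
    pad_inline=False -> only non-inline tags become whitespace (the behaviour expected).
    """

    def __init__(self, pad_inline):
        super().__init__()
        self.pad_inline = pad_inline
        self.parts = []

    def _boundary(self, tag):
        # Insert a separator unless the tag is inline and padding is disabled.
        if self.pad_inline or tag not in INLINE_TAGS:
            self.parts.append(" ")

    def handle_starttag(self, tag, attrs):
        self._boundary(tag)

    def handle_endtag(self, tag):
        self._boundary(tag)

    def handle_data(self, data):
        self.parts.append(data)


def extract_tokens(html, pad_inline):
    """Return the whitespace-separated tokens of the extracted text."""
    parser = TextExtractor(pad_inline)
    parser.feed(html)
    parser.close()
    return "".join(parser.parts).split()


sample = "<html><body><u>north</u>ern</body></html>"
print(extract_tokens(sample, pad_inline=True))   # every tag padded
print(extract_tokens(sample, pad_inline=False))  # inline tags not padded
```

With pad_inline=True the sample splits at the closing </u>, which matches the two tokens reported above. With pad_inline=False the inline tag adds no separator, while block-level tags such as <p> still separate words.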