[Htmlparser-developer] aside tag problem
From: Marc P. <mar...@we...> - 2016-07-19 13:49:23
Hi!
I have a problem parsing an "aside" tag. The source HTML has an aside
tag with text inside, but after parsing, getChildren() on that tag returns null.
String page = "<html><head></head><body><h1>Good Text 1, " + //
"<aside class=\"test it\">Irrelevant Text A, </aside>" + //
"<div class=\"news-footer\">Irrelevant Text B, </div>" + //
"Good Text 2 </h1></body></html>";
Page p = new Page(page, charset);
Lexer l = new Lexer(p);
Parser parser = new Parser(l);
NodeList nodes = parser.parse(null);
Node body = nodes.elementAt(0).getChildren().elementAt(1);
Node h1 = body.getChildren().elementAt(0);
assertNotNull(h1.getChildren().elementAt(1).getChildren());
On the other hand, the div tag has children as expected.
Is there anything wrong?
Thanks in advance
Regards
--
Marc Poch