Before I post this to bugs, I want to ask whether anyone has experienced a similar problem or whether I am doing something wrong.
page: http://www.podcastdirectory.com/podcasts/index.php?iid=14362
// -------- code ---------
Parser parser = new Parser(connection); // connection is a URLConnection to that page
parser.setEncoding("iso-8859-1");
try {
    nodelist = new NodeList(); // nodelist is declared elsewhere and reused later
    // Walk the top-level nodes and keep them all in memory
    for (NodeIterator e = parser.elements(); e.hasMoreNodes();) {
        Node a = e.nextNode();
        nodelist.add(a); // URL conversion occurs in the tags
    }
} catch (Exception e) {
    e.printStackTrace();
}
// ---- end of code
I'm getting this error on that page:
Exception in thread "Thread-2" java.lang.StackOverflowError
at org.htmlparser.util.NodeList.extractAllNodesThatMatch(NodeList.java:232)
at org.htmlparser.util.NodeList.extractAllNodesThatMatch(NodeList.java:242)
at org.htmlparser.util.NodeList.extractAllNodesThatMatch(NodeList.java:242)
... (the last line repeats, about 79k of text)
The code never reaches e.printStackTrace(), and the exception isn't thrown from e.nextNode().
Is there a better way to load the whole NodeList into memory, so that I don't have to download
the page again each time I want to work with the whole page (visitors, extractAllNodesThatMatch, etc.)?
Thanks for any answers.
JerryMouse
I'm sorry. The error actually occurs in this code:
// ------- CODE ----------
NodeClassFilter filter = new NodeClassFilter(ImageTag.class);
list = nodelist.extractAllNodesThatMatch(filter, true); // select all
// ------------------------
A Throwable is thrown and it can be caught, so please ignore this :/
Sorry.
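
For anyone who finds this thread later, here is a small sketch of why a Throwable handler sees this failure when an Exception handler would not. java.lang.StackOverflowError is a subclass of Error, not Exception, so even with the failing call inside the try block, catch (Exception e) would not match it. The sketch does not use HTML Parser at all; the class CatchDemo and its methods recurse() and run() are made up for illustration, and the unbounded recursion only stands in for the deep recursion seen in the stack trace.

```java
// Hypothetical demo class, not part of HTML Parser.
final class CatchDemo {

    // Recurses without a base case, so the thread's stack eventually overflows.
    static int recurse(int depth) {
        return recurse(depth + 1) + 1;
    }

    // Returns a label saying which handler actually caught the failure.
    static String run() {
        try {
            try {
                recurse(0);
                return "no error";
            } catch (Exception e) {
                // Not reached: StackOverflowError is not an Exception.
                return "caught as Exception";
            }
        } catch (StackOverflowError e) {
            // Reached: the error passes the inner handler and lands here.
            return "caught as StackOverflowError";
        }
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```

Catching the error only hides the symptom. If the recursion depth comes from the page itself, the same call will probably overflow again on the next run.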