From: David M. G. <mic...@gm...> - 2014-04-30 06:46:46
Hi,

I am already using the functions that wait for the JavaScript. I tried both waitForBackgroundJavaScript(10000); and waitForBackgroundJavaScriptStartingBefore(10000), and neither helped.

Besides, I need a generic solution, and one should be achievable. For example, jsoup knows how to cope with this HTML:

    package test;

    import java.io.File;
    import java.io.IOException;

    import org.jsoup.Jsoup;
    import org.jsoup.nodes.Document;
    import org.jsoup.nodes.Element;
    import org.jsoup.select.Elements;

    public class JsoupTest1 {
        public static void main(String[] args) throws IOException {
            // Parse the saved page; a null charset lets jsoup detect the encoding.
            File in = new File("l.html");
            Document doc = Jsoup.parse(in, null);

            // Print the text content of every table in the document.
            Elements elems = doc.select("table");
            for (Element elem : elems) {
                System.out.println(elem.text());
            }
        }
    }

Maybe I should file a bug, but I don't think there is a reason to execute those specific JavaScript calls.

Thanks,
David

> In the source page I could see the body tag appended with:
> onload="hideDiv(true);initBoxes('listview');callSubScroll('frm_tagged_documents',0,1);updateResultsNav();reloadClassification('false');scrollToHitPos('false');"
> onUnload="storeScrollToHitPos('false');
> Execute these JS functions, then try extracting the page.
> One more thing: what is your desired output?