From: Jack Bi <814...@qq...> - 2014-03-15 04:54:10
Hi all,

HtmlUnit almost always fails to execute the JavaScript in the page below, yet the same code occasionally succeeds. This has puzzled me for a long time. My Java code is shown below; I hope someone can help me solve this problem. Thank you very much.

    WebClient webClient = new WebClient(BrowserVersion.FIREFOX_24);
    webClient.addWebWindowListener(new WebWindowListener() {

        public void webWindowOpened(WebWindowEvent event) {
        }

        public void webWindowContentChanged(WebWindowEvent event) {
            page = (HtmlPage) webClient.getCurrentWindow().getEnclosedPage();
            if (page.getTitleText().equals("Security Warning")) {
                warningTag = true;
            }
            if (page.asText().contains(descTag)) {
                finalTag = true;
            }
        }

        public void webWindowClosed(WebWindowEvent event) {
        }
    });

    page = webClient.getPage(urlString);
    for (int i = 0; i < 200; i++) {
        if (warningTag || finalTag) {
            break;
        }
        Thread.sleep(1000);
    }

The HTML code looks like this:

    <html>
    <head>
    <meta http-equiv="Content-Type" content="text/html; charset=utf-8"/>
    <title>
    title
    </title>
    <script type="text/javascript">
    window.setTimeout('document.getElementById("formxh4t").submit();',1000);
    <form name="formxh4t" id="formxh4t"
          action="http://k2.ddtuangou.com/daili/process.php?action=update"
          method="post" onsubmit="return updateLocation(this);">
    <input name="u" type="hidden" class="textbox" id="input"
           value="http://dongtaiwang.com" size="60"/>
    <input type="hidden" name="type" value="0"/>
    </form>
    </body>
    </html>

Thank you!

Yours,
Jack

--
View this message in context: http://htmlunit.10904.n7.nabble.com/htmlUnit-always-failed-when-execute-JS-in-html-tp33378.html
Sent from the HtmlUnit - General mailing list archive at Nabble.com.
From: David N. <dav...@gm...> - 2014-03-15 06:26:38
I have also found that the JavaScript engine used by HtmlUnit cannot always execute JS properly. For my project I wound up switching to PhantomJS/CasperJS, and I'd recommend giving it a try. I still use HtmlUnit for some things, and it does a really nice job with them, but when it comes to executing JavaScript I have to use something else.
From: Jack Bi <814...@qq...> - 2014-03-15 09:37:32
Hi David,

I will try PhantomJS and CasperJS as you suggested. Thank you very much!

Yours,
Jack
From: Ronald B. <rb...@rb...> - 2014-03-15 09:18:42
Hi Jack,

I personally think that the JS support of HtmlUnit is not that bad. From your info I'm not able to see what your problem is. Do you get an error, or just not the response you expect?

In general, HtmlUnit tries to mimic the browser as closely as possible. So if you find a situation where the behaviour differs, you have to isolate it (see http://htmlunit.sourceforge.net/submittingBugs.html and http://htmlunit.sourceforge.net/submittingJSBugs.html) and open an issue. Depending on the level of detail you provide, we usually fix such issues as fast as possible.

RBRi
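On the question of whether any error appears: the complete program posted later in this thread sets the java.util.logging level for "com.gargoylesoftware" to Level.OFF (and also disables exceptions on script errors), which can hide the very messages that would help here. The following is a minimal sketch, assuming HtmlUnit's log output ends up in java.util.logging in this setup (HtmlUnit logs through Commons Logging, so the actual backend depends on the classpath and configuration). The class name HtmlUnitLogLevel and the helper wouldLog are made up for illustration.

```java
import java.util.logging.Level;
import java.util.logging.Logger;

final class HtmlUnitLogLevel {

    // Keep a strong reference: java.util.logging holds loggers weakly,
    // so a level set on an unreferenced logger can be lost.
    static final Logger HTMLUNIT = Logger.getLogger("com.gargoylesoftware");

    /**
     * Hypothetical helper: sets the logger to the given level and reports
     * whether a record of the given severity would then be published.
     */
    static boolean wouldLog(Level configured, Level message) {
        HTMLUNIT.setLevel(configured);
        return HTMLUNIT.isLoggable(message);
    }

    public static void main(String[] args) {
        // Level.OFF, as in the posted main(), suppresses even SEVERE records.
        System.out.println("OFF lets SEVERE through: "
                + wouldLog(Level.OFF, Level.SEVERE));
        // Level.WARNING keeps errors and warnings but drops INFO chatter.
        System.out.println("WARNING lets SEVERE through: "
                + wouldLog(Level.WARNING, Level.SEVERE));
        System.out.println("WARNING lets INFO through: "
                + wouldLog(Level.WARNING, Level.INFO));
    }
}
```

While isolating the problem, raising the level to Level.WARNING instead of Level.OFF, and not forcing the NoOpLog implementation, may show whether a script error is being swallowed.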
From: Jack Bi <814...@qq...> - 2014-03-15 09:42:58
Hi Ronald,

Thank you for your reply. Here is my complete code, in case you are interested.

    /******************************************************/
    import java.io.IOException;
    import java.net.MalformedURLException;
    import java.util.ArrayList;
    import java.util.List;
    import java.util.logging.Level;

    import org.apache.log4j.Logger;
    import org.apache.log4j.chainsaw.Main;
    import org.jsoup.Jsoup;
    import org.jsoup.nodes.Document;
    import org.jsoup.nodes.Element;
    import org.jsoup.select.Elements;

    import com.gargoylesoftware.htmlunit.BrowserVersion;
    import com.gargoylesoftware.htmlunit.FailingHttpStatusCodeException;
    import com.gargoylesoftware.htmlunit.WebClient;
    import com.gargoylesoftware.htmlunit.WebWindowEvent;
    import com.gargoylesoftware.htmlunit.WebWindowListener;
    import com.gargoylesoftware.htmlunit.html.HtmlElement;
    import com.gargoylesoftware.htmlunit.html.HtmlForm;
    import com.gargoylesoftware.htmlunit.html.HtmlInput;
    import com.gargoylesoftware.htmlunit.html.HtmlOption;
    import com.gargoylesoftware.htmlunit.html.HtmlPage;
    import com.gargoylesoftware.htmlunit.html.HtmlSelect;

    public class A {

        private static Logger logger = Logger.getLogger(A.class);
        static boolean warningTag = false;
        static boolean finalTag = false;
        static boolean htmlTag = false;
        static final String descUrl = "https://twitter.com/";
        static String descTag = "welcome twitter";
        static HtmlPage page = null;
        static HtmlPage page2 = null;
        static WebClient webClient = null;
        static String htmlString = null;
        static String urlString = "";

        public static void initWebClient() {
            webClient = new WebClient(BrowserVersion.FIREFOX_24);
            webClient.getOptions().setThrowExceptionOnFailingStatusCode(false);
            webClient.getOptions().setThrowExceptionOnScriptError(false);
            webClient.getOptions().setCssEnabled(false);
            webClient.getOptions().setRedirectEnabled(true);
            webClient.getOptions().setJavaScriptEnabled(true);
            // webClient.getOptions().setTimeout(20*1000);
            webClient.getOptions().setUseInsecureSSL(true);
            webClient.addWebWindowListener(new WebWindowListener() {

                @Override
                public void webWindowOpened(WebWindowEvent event) {
                }

                @Override
                public void webWindowContentChanged(WebWindowEvent event) {
                    page = (HtmlPage) webClient.getCurrentWindow().getEnclosedPage();
                    if (page.getTitleText().equals("Security Warning")) {
                        warningTag = true;
                    }
                    if (page.asText().contains(descTag)) {
                        finalTag = true;
                    }
                }

                @Override
                public void webWindowClosed(WebWindowEvent event) {
                }
            });
        }

        public static void webClientGetPage(String url) throws IOException, InterruptedException {
            try {
                page = webClient.getPage(url);
            } catch (FailingHttpStatusCodeException e) {
                logger.info(url + ":Error:Http Status");
                return;
            } catch (MalformedURLException e) {
                logger.info(url + ":Error:URL Error");
                return;
            } catch (Exception e) {
                logger.info(url + ":Error:Connect Error");
                return;
            }
            try {
                dealRootPage(page);
            } catch (Exception e) {
                logger.info(urlString + ":dealRootPage Failed");
                return;
            }
        }

        public static void dealRootPage(HtmlPage htmlPage) throws IOException, InterruptedException {
            Document document = Jsoup.parse(htmlPage.asXml());
            Elements forms = document.getElementsByTag("form");
            List<HtmlForm> forms2 = htmlPage.getForms();
            if (forms.size() == 0) {
                logger.info(urlString + ":Error:Cannot find Form");
                return;
            } else {
                for (int i = 0; i < forms.size(); i++) {
                    Element form = forms.get(i);
                    HtmlForm form2 = forms2.get(i);
                    System.out.println("Execute:form" + (i+1));
                    try {
                        parseForm(form,form2);
                    } catch (Exception e) {
                        logger.info(urlString + ":parForm:form" + i + "Failed");
                        continue;
                    }
                }
            }
        }

        public static int parseForm(Element formElement, HtmlForm form2) throws IOException, InterruptedException {
            String inputName = formElement.select("input[type=text]").attr("name");
            String buttonValue = formElement.select("input[type=submit]").attr("value");
            Elements selects = formElement.getElementsByTag("select");
            HtmlSelect select = null;
            if (inputName.length() < 1) {
                logger.info(urlString + ":Error:Cannot find Input");
                return -2;
            }
            HtmlInput inputText = form2.getInputByName(inputName);
            inputText.setAttribute("value", descUrl);
            HtmlInput buttonInput = form2.getInputByValue(buttonValue);
            if (selects.size() == 0) {
                try {
                    doRequest(buttonInput);
                } catch (Exception e) {
                    logger.info(urlString + ":doRequest(button)" + "Failed");
                    return -7;
                }
            } else {
                Elements options = formElement.getElementsByTag("select").get(0).getElementsByTag("option");
                String selectName = selects.get(0).attr("name");
                select = form2.getSelectByName(selectName);
                for (int i = 0; i < options.size(); i++) {
                    String optionValue = options.get(i).attr("value");
                    HtmlOption option = select.getOptionByValue(optionValue);
                    select.setSelectedAttribute(option, true);
                    System.out.println("Select Server:" + (i+1));
                    try {
                        doRequest(buttonInput);
                    } catch (Exception e) {
                        logger.info(urlString + ":doRequest " + "Server:" + (i+1) + "Failed");
                        continue;
                    }
                }
            }
            return 0;
        }

        public static int doRequest(HtmlInput buttonInput) throws InterruptedException, IOException {
            page = (HtmlPage)(buttonInput.click());
            for (int i = 0; i < 1000; i++) {
                if (warningTag || finalTag) {
                    break;
                }
                if (i == 100) {
                    logger.info(urlString + "Error:JS Timeout");
                    System.out.println(page.asXml());
                    return -4;
                }
                Thread.sleep(1000);
            }
            int result = dealPage(page);
            warningTag = false;
            finalTag = false;
            Thread.sleep(5000);
            return result;
        }

        public static int dealPage(HtmlPage page) throws IOException, InterruptedException {
            if (page.asText().contains("Twitter")) {
                logger.info(urlString + ":" + "Succ");
                System.out.println(page.asXml());
                return 1;
            } else if (page.asText().contains("Warning")) {
                return dealWarning(page);
            }
            return 0;
        }

        public static int dealWarning(HtmlPage page) throws IOException, InterruptedException {
            HtmlForm form2 = page.getForms().get(0);
            HtmlElement button2 = form2.getInputByValue("Continue anyway...");
            @SuppressWarnings("unused")
            HtmlPage page3 = button2.click();
            for (int i = 0; i < 1000; i++) {
                if (finalTag) {
                    return 2;//Succ
                }
                if (i == 50) {
                    logger.info(urlString + ":" + "Error:Warning timeout");
                    return -5;
                }
                Thread.sleep(1000);
            }
            return 0;
        }

        public static void main(String[] args) throws IOException, InterruptedException {
            java.util.logging.Logger.getLogger("com.gargoylesoftware").setLevel(Level.OFF);
            System.setProperty("org.apache.commons.logging.Log","org.apache.commons.logging.impl.NoOpLog");
            urlString = "http://www.q8daili.com/";
            initWebClient();
            Log.loadLogProperties();
            logger.info("**************************************************");
            logger.info("begin");
            try {
                webClientGetPage(urlString);
            } catch (Exception e) {
                logger.info(urlString + ":Get root HTMLPage failed");
            }
        }
    }

Thank you!

Yours,
Jack
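The wait loops in doRequest and dealWarning poll plain static booleans once per second. If the window listener is invoked on a different thread than the one that polls (an assumption here, since HtmlUnit runs setTimeout jobs in the background), a non-volatile flag is not guaranteed to become visible to the polling thread, which could be one source of the intermittent behaviour. The following is a minimal sketch of a latch-based wait with a timeout, using only the JDK. PageSignal, markReached and await are hypothetical names, not HtmlUnit API.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

/** Hypothetical one-shot signal replacing a polled static boolean flag. */
final class PageSignal {

    private final CountDownLatch reached = new CountDownLatch(1);

    /** Would be called from the listener once the expected page is seen. */
    void markReached() {
        reached.countDown();
    }

    /**
     * Blocks until markReached() has been called or the timeout elapses.
     * Returns true if the signal arrived in time, false on timeout or
     * if the waiting thread was interrupted.
     */
    boolean await(long timeout, TimeUnit unit) {
        try {
            return reached.await(timeout, unit);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt status
            return false;
        }
    }

    public static void main(String[] args) {
        PageSignal signal = new PageSignal();
        // Stand-in for the listener firing on a background thread.
        Thread background = new Thread(signal::markReached);
        background.start();
        boolean reached = signal.await(5, TimeUnit.SECONDS);
        System.out.println("reached before timeout: " + reached);
    }
}
```

A CountDownLatch is one-shot, so a fresh PageSignal would be needed for each request, where the posted code resets its flags. Declaring the existing flags volatile would be a smaller change with a similar visibility guarantee. HtmlUnit also provides WebClient.waitForBackgroundJavaScript(long timeoutMillis), which may make manual polling unnecessary in some cases.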