From: David M. G. <mic...@gm...> - 2013-12-17 11:35:27
Hi Ahmed,

This worked fine. I understand that I need to get the top page, because
window.open opens an unconnected page.

Thanks a lot for your prompt response,
David

On Tue, Dec 17, 2013 at 12:27 PM, <htm...@li...> wrote:

> Today's Topics:
>
>    1. Re: How to identify if page gets refreshed and how to wait
>       for it (Ahmed Ashour)
>
> ----------------------------------------------------------------------
>
> Message: 1
> Date: Tue, 17 Dec 2013 02:27:11 -0800 (PST)
> From: Ahmed Ashour <asa...@ya...>
> Subject: Re: [Htmlunit-user] How to identify if page gets refreshed
>         and how to wait for it
> To: "htm...@li..." <htm...@li...>
> Message-ID: <138...@we...>
> Content-Type: text/plain; charset="iso-8859-1"
>
> Hi David,
>
> It seems you need to get the top page, something like:
>
>     public static void main(String[] args)
>             throws FailingHttpStatusCodeException, MalformedURLException, IOException {
>         WebClient webClient = new WebClient();
>         webClient.setRefreshHandler(new ImmediateRefreshHandler());
>         webClient.getOptions().setThrowExceptionOnScriptError(false);
>         String url = "http://www.scielo.org.co/scielo.php?script=sci_arttext&pid=S0120-99572012000600002";
>         HtmlPage page = webClient.getPage(url);
>
>         List<HtmlAnchor> anchors = page.getAnchors();
>         for (HtmlAnchor anchor : anchors) {
>             String linkText = anchor.getTextContent();
>             if (linkText.contains("pdf")) {
>                 System.out.println("clicking on anchor:" + anchor);
>
>                 Page pdfPage = anchor.click();
>                 System.out.println("URL 1 " + pdfPage.getUrl());
>                 // The onclick handler calls window.open through setTimeout,
>                 // so wait for background JavaScript before checking the window.
>                 webClient.waitForBackgroundJavaScriptStartingBefore(10000);
>                 System.out.println("URL 2 " + pdfPage.getUrl());
>                 // Ask the top-level window for the page it encloses now.
>                 System.out.println("URL 3 "
>                         + webClient.getTopLevelWindows().get(0).getEnclosedPage().getUrl());
>                 if (pdfPage.isHtmlPage()) {
>                     HtmlPage p = (HtmlPage) pdfPage;
>                 }
>                 else {
>                     System.out.println("Page is pdf");
>                     System.out.println(pdfPage);
>                 }
>             }
>         }
>     }
>
> Yours,
> Ahmed
>
> ________________________________
> From: David Michael Gang <mic...@gm...>
> To: htm...@li...
> Sent: Tuesday, December 17, 2013 1:06 PM
> Subject: Re: [Htmlunit-user] How to identify if page gets refreshed and
>         how to wait for it
>
> Hi,
>
> It seems that there is a more basic issue.
> In the page
> http://www.scielo.org.co/scielo.php?script=sci_arttext&pid=S0120-99572012000600002
> I have the pdf article link.
>
> When I click on the link, I don't get the new page.
>
> Here is the code:
>
> package test;
>
> import java.io.IOException;
> import java.net.MalformedURLException;
> import java.util.List;
>
> import com.gargoylesoftware.htmlunit.BrowserVersion;
> import com.gargoylesoftware.htmlunit.FailingHttpStatusCodeException;
> import com.gargoylesoftware.htmlunit.ImmediateRefreshHandler;
> import com.gargoylesoftware.htmlunit.NiceRefreshHandler;
> import com.gargoylesoftware.htmlunit.Page;
> import com.gargoylesoftware.htmlunit.WebClient;
> import com.gargoylesoftware.htmlunit.html.HtmlAnchor;
> import com.gargoylesoftware.htmlunit.html.HtmlPage;
> import com.gargoylesoftware.htmlunit.javascript.configuration.WebBrowser;
> import com.google.common.collect.ImmutableList;
>
> public class Test {
>
>     public static void main(String[] args)
>             throws FailingHttpStatusCodeException, MalformedURLException, IOException {
>         WebClient webClient = new WebClient();
>         webClient.setRefreshHandler(new ImmediateRefreshHandler());
>         webClient.getOptions().setThrowExceptionOnScriptError(false);
>         List<String> urls = ImmutableList.of(
>                 "http://www.scielo.org.co/scielo.php?script=sci_arttext&pid=S0120-99572012000600002");
>         for (String url : urls) {
>             HtmlPage page = webClient.getPage(url);
>
>             List<HtmlAnchor> anchors = page.getAnchors();
>             for (HtmlAnchor anchor : anchors) {
>                 String linkText = anchor.getTextContent();
>                 if (linkText.contains("pdf")) {
>                     System.out.println("clicking on anchor:" + anchor);
>                     Page pdfPage = anchor.click();
>                     webClient.waitForBackgroundJavaScriptStartingBefore(1000);
>                     if (pdfPage.isHtmlPage()) {
>                         HtmlPage p = (HtmlPage) pdfPage;
>
>                         System.out.println(p.asText());
>                     }
>                     else {
>                         System.out.println("Page is pdf");
>                         System.out.println(pdfPage);
>                     }
>                 }
>             }
>         }
>     }
> }
>
> Here is the output:
>
> 17/12/2013 12:04:40 com.gargoylesoftware.htmlunit.IncorrectnessListenerImpl notify
> WARNING: Obsolete content type encountered: 'application/x-javascript'.
> 17/12/2013 12:04:40 com.gargoylesoftware.htmlunit.javascript.StrictErrorReporter runtimeError
> SEVERE: runtimeError: message=[Unexpected call to method or property access] sourceName=[http://www.scielo.org.co/applications/scielo-org/js/jquery-1.4.2.min.js] line=[35] lineSource=[null] lineOffset=[0]
> 17/12/2013 12:04:40 com.gargoylesoftware.htmlunit.javascript.StrictErrorReporter runtimeError
> SEVERE: runtimeError: message=[The data necessary to complete this operation is not yet available.] sourceName=[http://www.scielo.org.co/applications/scielo-org/js/jquery-1.4.2.min.js] line=[16] lineSource=[null] lineOffset=[0]
> 17/12/2013 12:04:41 org.apache.http.impl.client.DefaultHttpClient tryExecute
> INFO: I/O exception (org.apache.http.NoHttpResponseException) caught when processing request: The target server failed to respond
> 17/12/2013 12:04:41 org.apache.http.impl.client.DefaultHttpClient tryExecute
> INFO: Retrying request
> 17/12/2013 12:04:42 com.gargoylesoftware.htmlunit.IncorrectnessListenerImpl notify
> WARNING: Obsolete content type encountered: 'application/x-javascript'.
> 17/12/2013 12:04:43 com.gargoylesoftware.htmlunit.IncorrectnessListenerImpl notify
> WARNING: Obsolete content type encountered: 'application/x-javascript'.
> 17/12/2013 12:04:43 com.gargoylesoftware.htmlunit.IncorrectnessListenerImpl notify
> WARNING: Obsolete content type encountered: 'application/x-javascript'.
> 17/12/2013 12:04:44 com.gargoylesoftware.htmlunit.IncorrectnessListenerImpl notify
> WARNING: Obsolete content type encountered: 'application/x-javascript'.
> 17/12/2013 12:04:44 com.gargoylesoftware.htmlunit.javascript.StrictErrorReporter runtimeError
> SEVERE: runtimeError: message=[The data necessary to complete this operation is not yet available.] sourceName=[http://s7.addthis.com/static/r07/core113.js] line=[2] lineSource=[null] lineOffset=[0]
> 17/12/2013 12:04:46 com.gargoylesoftware.htmlunit.IncorrectnessListenerImpl notify
> WARNING: Obsolete content type encountered: 'text/javascript'.
> 17/12/2013 12:04:47 com.gargoylesoftware.htmlunit.DefaultCssErrorHandler error
> WARNING: CSS error: 'http://s7.addthis.com/static/r07/widget118.css' [1:5310] Error in style rule. (Invalid token "*". Was expecting one of: <EOF>, <S>, <IDENT>, "}", ";".)
> 17/12/2013 12:04:47 com.gargoylesoftware.htmlunit.DefaultCssErrorHandler warning
> WARNING: CSS warning: 'http://s7.addthis.com/static/r07/widget118.css' [1:5310] Ignoring the following declarations in this rule.
> 17/12/2013 12:04:47 com.gargoylesoftware.htmlunit.DefaultCssErrorHandler error
> WARNING: CSS error: 'http://s7.addthis.com/static/r07/widget118.css' [1:5383] Error in style rule. (Invalid token "*". Was expecting one of: <EOF>, <S>, <IDENT>, "}", ";".)
> 17/12/2013 12:04:47 com.gargoylesoftware.htmlunit.DefaultCssErrorHandler warning
> WARNING: CSS warning: 'http://s7.addthis.com/static/r07/widget118.css' [1:5383] Ignoring the following declarations in this rule.
> 17/12/2013 12:04:47 com.gargoylesoftware.htmlunit.DefaultCssErrorHandler error
> WARNING: CSS error: 'http://s7.addthis.com/static/r07/widget118.css' [1:62894] Error in expression. (Invalid token "#0d98fb". Was expecting one of: <S>, <NUMBER>, <IDENT>, <STRING>, <PLUS>, <COMMA>, <EMS>, <EXS>, <LENGTH_PX>, <LENGTH_CM>, <LENGTH_MM>, <LENGTH_IN>, <LENGTH_PT>, <LENGTH_PC>, <ANGLE_DEG>, <ANGLE_RAD>, <ANGLE_GRAD>, <TIME_MS>, <TIME_S>, <FREQ_HZ>, <FREQ_KHZ>, <PERCENTAGE>, <URI>, "-", "=", ")".)
> 17/12/2013 12:04:47 com.gargoylesoftware.htmlunit.DefaultCssErrorHandler error
> WARNING: CSS error: 'http://s7.addthis.com/static/r07/widget118.css' [1:62911] Error in style rule. (Invalid token "background-image". Was expecting one of: <EOF>, "}", ";".)
> 17/12/2013 12:04:47 com.gargoylesoftware.htmlunit.DefaultCssErrorHandler warning
> WARNING: CSS warning: 'http://s7.addthis.com/static/r07/widget118.css' [1:62911] Ignoring the following declarations in this rule.
> 17/12/2013 12:04:47 com.gargoylesoftware.htmlunit.DefaultCssErrorHandler error
> WARNING: CSS error: 'http://s7.addthis.com/static/r07/widget118.css' [1:63424] Error in expression. (Invalid token "#0a85dd". Was expecting one of: <S>, <NUMBER>, <IDENT>, <STRING>, <PLUS>, <COMMA>, <EMS>, <EXS>, <LENGTH_PX>, <LENGTH_CM>, <LENGTH_MM>, <LENGTH_IN>, <LENGTH_PT>, <LENGTH_PC>, <ANGLE_DEG>, <ANGLE_RAD>, <ANGLE_GRAD>, <TIME_MS>, <TIME_S>, <FREQ_HZ>, <FREQ_KHZ>, <PERCENTAGE>, <URI>, "-", "=", ")".)
> 17/12/2013 12:04:47 com.gargoylesoftware.htmlunit.DefaultCssErrorHandler error
> WARNING: CSS error: 'http://s7.addthis.com/static/r07/widget118.css' [1:63441] Error in style rule. (Invalid token "background-image". Was expecting one of: <EOF>, "}", ";".)
> 17/12/2013 12:04:47 com.gargoylesoftware.htmlunit.DefaultCssErrorHandler warning
> WARNING: CSS warning: 'http://s7.addthis.com/static/r07/widget118.css' [1:63441] Ignoring the following declarations in this rule.
> 17/12/2013 12:04:47 com.gargoylesoftware.htmlunit.DefaultCssErrorHandler error
> WARNING: CSS error: 'http://s7.addthis.com/static/r07/widget118.css' [1:83274] Error in @media rule. (Invalid token "and". Was expecting one of: <S>, <LBRACE>, <COMMA>.)
> 17/12/2013 12:04:47 com.gargoylesoftware.htmlunit.DefaultCssErrorHandler warning
> WARNING: CSS warning: 'http://s7.addthis.com/static/r07/widget118.css' [1:83274] Ignoring the whole rule.
> 17/12/2013 12:04:51 com.gargoylesoftware.htmlunit.javascript.host.ActiveXObject jsConstructor
> WARNING: Automation server can't create object for 'ShockwaveFlash.ShockwaveFlash.7'.
> 17/12/2013 12:04:51 com.gargoylesoftware.htmlunit.javascript.StrictErrorReporter runtimeError
> SEVERE: runtimeError: message=[Automation server can't create object for 'ShockwaveFlash.ShockwaveFlash.7'.] sourceName=[http://www.google-analytics.com/ga.js] line=[24] lineSource=[null] lineOffset=[0]
> 17/12/2013 12:04:51 com.gargoylesoftware.htmlunit.javascript.host.ActiveXObject jsConstructor
> WARNING: Automation server can't create object for 'ShockwaveFlash.ShockwaveFlash.6'.
> 17/12/2013 12:04:51 com.gargoylesoftware.htmlunit.javascript.StrictErrorReporter runtimeError
> SEVERE: runtimeError: message=[Automation server can't create object for 'ShockwaveFlash.ShockwaveFlash.6'.] sourceName=[http://www.google-analytics.com/ga.js] line=[24] lineSource=[null] lineOffset=[0]
> 17/12/2013 12:04:51 com.gargoylesoftware.htmlunit.javascript.host.ActiveXObject jsConstructor
> WARNING: Automation server can't create object for 'ShockwaveFlash.ShockwaveFlash'.
> 17/12/2013 12:04:51 com.gargoylesoftware.htmlunit.javascript.StrictErrorReporter runtimeError
> SEVERE: runtimeError: message=[Automation server can't create object for 'ShockwaveFlash.ShockwaveFlash'.] sourceName=[http://www.google-analytics.com/ga.js] line=[24] lineSource=[null] lineOffset=[0]
> 17/12/2013 12:04:53 com.gargoylesoftware.htmlunit.IncorrectnessListenerImpl notify
> WARNING: Obsolete content type encountered: 'application/x-javascript'.
> 17/12/2013 12:04:53 com.gargoylesoftware.htmlunit.javascript.StrictErrorReporter runtimeError
> SEVERE: runtimeError: message=[An invalid or illegal selector was specified (selector: '.box:eq(0)' error: Invalid selector: *.box:eq(0)).] sourceName=[http://www.scielo.org.co/applications/scielo-org/js/jquery-1.4.2.min.js] line=[91] lineSource=[null] lineOffset=[0]
> 17/12/2013 12:04:53 com.gargoylesoftware.htmlunit.javascript.StrictErrorReporter runtimeError
> SEVERE: runtimeError: message=[An invalid or illegal selector was specified (selector: '.box:last' error: Invalid selector: *.box:last).] sourceName=[http://www.scielo.org.co/applications/scielo-org/js/jquery-1.4.2.min.js] line=[91] lineSource=[null] lineOffset=[0]
> clicking on anchor:HtmlAnchor[<a href="javascript:%20void(0);%20" onclick="setTimeout("window.open('http://www.scielo.org.co/scielo.php?script=sci_pdf&pid=S0120-99572012000600002&lng=en&nrm=iso&tlng=es','_self')", 3000);">]
> Revista Colombiana de Gastroenterologia - I. Epidemiología
> Services on Demand
> Article
> Article in pdf format
> Article in xml format
> Article references
> How to cite this article
> Automatic translation
> Send this article by e-mail
> Indicators
> Related links
> Bookmark
> ...
>
> What am I doing wrong?
>
> Here is the log:
>
> _______________________________________________
> Htmlunit-user mailing list
> Htm...@li...
> https://lists.sourceforge.net/lists/listinfo/htmlunit-user
>
> End of Htmlunit-user Digest, Vol 91, Issue 11
> *********************************************
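
The anchor printed in the quoted output does not navigate through its href. Its onclick handler schedules window.open(url, '_self') through setTimeout, which is why the page returned by click() is still the article page. As an alternative to clicking and waiting (this is not what Ahmed suggested, only a rough sketch), the target URL could be read out of the onclick text and fetched directly. The class name OnclickUrlExtractor, the method extractWindowOpenUrl and the regular expression are assumptions made for illustration; only java.util.regex is used, and the pattern only covers a quoted first argument like the one on this page.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of HtmlUnit.
class OnclickUrlExtractor {

    // Matches window.open('<url>' ...) and captures the first quoted argument,
    // ignoring whitespace around the URL inside the quotes.
    private static final Pattern WINDOW_OPEN =
            Pattern.compile("window\\.open\\(\\s*['\"]\\s*([^'\"]+?)\\s*['\"]");

    /** Returns the URL passed to window.open in the handler text, if any. */
    static Optional<String> extractWindowOpenUrl(String onclick) {
        if (onclick == null) {
            return Optional.empty();
        }
        Matcher m = WINDOW_OPEN.matcher(onclick);
        return m.find() ? Optional.of(m.group(1)) : Optional.empty();
    }

    public static void main(String[] args) {
        // The handler text as it appears in the output quoted in this thread.
        String onclick = "setTimeout(\"window.open('http://www.scielo.org.co/scielo.php"
                + "?script=sci_pdf&pid=S0120-99572012000600002&lng=en&nrm=iso&tlng=es',"
                + "'_self')\", 3000);";
        System.out.println(
                extractWindowOpenUrl(onclick).orElse("no window.open call found"));
    }
}
```

With HtmlUnit, the input would presumably come from anchor.getAttribute("onclick"), and the extracted URL could then be passed to webClient.getPage(...). Getting the enclosed page of the top-level window, as in Ahmed's code, remains the more general approach because it does not depend on how the handler is written.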