From: Ahmed A. <asa...@ya...> - 2013-12-26 15:17:32
Hi David,

The page refreshes after 2 seconds and forwards to the PDF location. You can try:

    // ImmediateRefreshHandler lives in com.gargoylesoftware.htmlunit;
    // it follows the page's refresh at once instead of waiting.
    WebClient webClient = new WebClient();
    webClient.setRefreshHandler(new ImmediateRefreshHandler());
    Page page = webClient.getPage("http://xn--archivoespaoldearte-53b.revistas.csic.es/index.php/aea/article/viewDownloadInterstitial/552/549");

Ahmed

________________________________
From: David Michael Gang <mic...@gm...>
To: htm...@li...
Sent: Monday, December 23, 2013 12:24 PM
Subject: [Htmlunit-user] deal with application/force-download

Hi all,

I have the following URL:
http://xn--archivoespaoldearte-53b.revistas.csic.es/index.php/aea/article/view/552/549

In Firefox or IE8 the page refreshes and a PDF is downloaded.

With HtmlUnit I tried the code below. When I go to the top page, it returns a sort of HTML page and not the PDF. Even when I go directly to the download page, it does not download the PDF.

    package test;

    import java.io.IOException;
    import java.net.MalformedURLException;

    import com.gargoylesoftware.htmlunit.FailingHttpStatusCodeException;
    import com.gargoylesoftware.htmlunit.Page;
    import com.gargoylesoftware.htmlunit.WebClient;
    import com.gargoylesoftware.htmlunit.html.HtmlPage;

    public class ForceDownload {

        public static void main(String[] args)
                throws FailingHttpStatusCodeException, MalformedURLException, IOException {
            WebClient client = new WebClient();

            // First attempt: load the article view page.
            System.out.println("get to top page");
            final String topUrl = "http://xn--archivoespaoldearte-53b.revistas.csic.es/index.php/aea/article/view/552/549";
            final Page topPage = client.getPage(topUrl);
            if (topPage.isHtmlPage()) {
                System.out.println("topPage is htmlPage");
                System.out.println("source of top page is " + ((HtmlPage) topPage).asXml());
            }

            // Second attempt: request the download URL directly.
            System.out.println("get to download page directly");
            final String downloadUrl = "http://archivoespañoldearte.revistas.csic.es/index.php/aea/article/download/552/549";
            final Page page = client.getPage(downloadUrl);
            System.out.println(page.getWebResponse().getContentType());
        }
    }

This is the output of the script:

    get to top page
    topPage is htmlPage
    source of top page is <?xml version="1.0" encoding="UTF-8"?>
    <html xmlns="http://www.w3.org/1999/xhtml">
      <head>
        <meta http-equiv="Content-Type" content="text/html; charset=utf-8"/>
        <title>
          Vasallo Toranzo
        </title>
        <link rel="stylesheet" href="http://xn--archivoespaoldearte-53b.revistas.csic.es/styles/common.css" type="text/css"/>
        <link rel="stylesheet" href="http://xn--archivoespaoldearte-53b.revistas.csic.es/styles/articleView.css" type="text/css"/>
        <link rel="icon" href="http://xn--archivoespaoldearte-53b.revistas.csic.es/favicon.ico" type="image/x-icon"/>
        <script type="text/javascript" src="http://xn--archivoespaoldearte-53b.revistas.csic.es/js/general.js">
        </script>
        <!-- Add javascript required for font sizer -->
        <script type="text/javascript" src="http://xn--archivoespaoldearte-53b.revistas.csic.es/js/sizer.js">
        </script>
        <!-- Add stylesheets for the font sizer -->
        <link rel="alternate stylesheet" title="Pequeña" href="http://xn--archivoespaoldearte-53b.revistas.csic.es/styles/fontSmall.css" type="text/css" disabled="disabled"/>
        <link rel="stylesheet" title="Mediana" href="http://xn--archivoespaoldearte-53b.revistas.csic.es/styles/fontMedium.css" type="text/css"/>
        <link rel="alternate stylesheet" title="Grande" href="http://xn--archivoespaoldearte-53b.revistas.csic.es/styles/fontLarge.css" type="text/css" disabled="disabled"/>
      </head>
      <frameset cols="220,*" style="border: 0;">
        <!-- cols="*,180"-->
        <frame src="http://xn--archivoespaoldearte-53b.revistas.csic.es/index.php/aea/article/viewRST/552/549" noresize="noresize" frameborder="0" scrolling="auto"/>
        <frame src="http://xn--archivoespaoldearte-53b.revistas.csic.es/index.php/aea/article/viewDownloadInterstitial/552/549" frameborder="0"/>
        <noframes>
          <body>
            <table width="100%">
              <tr>
                <td align="center">
                  Esta página usa marcos.
                  <a href="http://xn--archivoespaoldearte-53b.revistas.csic.es/index.php/aea/article/viewDownloadInterstitial/552/549">Haga click aquí</a>
                  para ir a la versión sin marcos.
                </td>
              </tr>
            </table>
          </body>
        </noframes>
      </frameset>
    </html>
    get to download page directly
    application/force-download

How can I solve this? How can I tell HtmlUnit to download the file directly?

Thanks,
David

_______________________________________________
Htmlunit-user mailing list
Htm...@li...
https://lists.sourceforge.net/lists/listinfo/htmlunit-user
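
Editor's sketch, not from either poster: the `application/force-download` content type printed above only labels the response; the body should still be the raw file bytes. Assuming the page that HtmlUnit returns after the refresh really is that response, its body stream (`page.getWebResponse().getContentAsStream()`) could be copied to disk and checked for the PDF signature. `DownloadSaver`, `save`, `looksLikePdf`, and the target file name are hypothetical names made up for this illustration, and the demo in `main` uses in-memory bytes in place of a real response so that it runs without HtmlUnit or network access.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardCopyOption;
import java.util.Arrays;

// Hypothetical helper for illustration only; it is not part of HtmlUnit.
final class DownloadSaver {

    /** Copies the whole body stream to target (overwriting it) and returns the byte count. */
    static long save(InputStream body, Path target) {
        try (InputStream in = body) {
            return Files.copy(in, target, StandardCopyOption.REPLACE_EXISTING);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Returns true if the file starts with the PDF signature "%PDF-". */
    static boolean looksLikePdf(Path file) {
        byte[] magic = "%PDF-".getBytes(StandardCharsets.US_ASCII);
        byte[] head = new byte[magic.length];
        try (InputStream in = Files.newInputStream(file)) {
            int off = 0;
            // read() may return fewer bytes than requested, so loop until full or EOF.
            while (off < head.length) {
                int n = in.read(head, off, head.length - off);
                if (n < 0) {
                    break;
                }
                off += n;
            }
            return off == magic.length && Arrays.equals(head, magic);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Stand-in for page.getWebResponse().getContentAsStream() (assumption:
        // the page fetched after the refresh carries the PDF bytes).
        byte[] fake = "%PDF-1.4\n% demo bytes only\n".getBytes(StandardCharsets.US_ASCII);
        Path target = Paths.get(System.getProperty("java.io.tmpdir"), "article-552-549.pdf");
        long written = save(new ByteArrayInputStream(fake), target);
        System.out.println(written + " bytes written, PDF signature found: " + looksLikePdf(target));
    }
}
```

Checking the first bytes rather than the content type is a deliberate choice here, since this server reports `application/force-download` instead of `application/pdf`. If the check fails, the saved file is probably still an HTML interstitial page.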