From: David M. G. <mic...@gm...> - 2014-01-02 12:57:06
Hi Ahmed,

Thanks for your reply. This solves half of the problem. The ImmediateRefreshHandler redirects me automatically to the page that responds with the content type "application/force-download". The question is how to emulate the browser behavior so that I get the PDF automatically.

Here is the code:

package test;

import java.io.IOException;
import java.net.MalformedURLException;

import com.gargoylesoftware.htmlunit.FailingHttpStatusCodeException;
import com.gargoylesoftware.htmlunit.ImmediateRefreshHandler;
import com.gargoylesoftware.htmlunit.Page;
import com.gargoylesoftware.htmlunit.WebClient;

public class ForceDownload {

    public static void main(String[] args) throws FailingHttpStatusCodeException, MalformedURLException, IOException {
        WebClient client = new WebClient();
        // Follow refresh instructions immediately instead of waiting for the delay.
        client.setRefreshHandler(new ImmediateRefreshHandler());
        final String downloadUrl = "http://archivoespañoldearte.revistas.csic.es/index.php/aea/article/download/552/549";
        final Page page = client.getPage(downloadUrl);
        // Print the content type reported by the server for the download URL.
        System.out.println(page.getWebResponse().getContentType());
    }
}

I get the output:

application/force-download

How can I get to the PDF?

Thanks,
David

Message: 2
> Date: Thu, 26 Dec 2013 07:17:24 -0800 (PST)
> From: Ahmed Ashour <asa...@ya...>
> Subject: Re: [Htmlunit-user] deal with application/force-download
> To: "htm...@li..."
> <htm...@li...>
> Message-ID:
> <138...@we...>
> Content-Type: text/plain; charset="iso-8859-1"
>
> Hi David,
>
> The page refreshes in 2 seconds and forwards to the PDF location.
>
> You can try:
>
>         WebClient webClient = new WebClient();
>         webClient.setRefreshHandler(new ImmediateRefreshHandler());
>         Page page = webClient.getPage("http://xn--archivoespaoldearte-53b.revistas.csic.es/index.php/aea/article/viewDownloadInterstitial/552/549");
>
> Ahmed
>
> ________________________________
> From: David Michael Gang <mic...@gm...>
> To: htm...@li...
> Sent: Monday, December 23, 2013 12:24 PM
> Subject: [Htmlunit-user] deal with application/force-download
>
> Hi all,
>
> I have the following URL:
>
> http://xn--archivoespaoldearte-53b.revistas.csic.es/index.php/aea/article/view/552/549
>
> In Firefox or IE8 the page refreshes and a PDF is downloaded.
> With HtmlUnit I tried the following: when I request the top page, it
> returns an HTML page and not the PDF.
>
> Even when I go directly to the download page, it does not download
> the PDF.
>
> package test;
>
> import java.io.IOException;
> import java.net.MalformedURLException;
>
> import com.gargoylesoftware.htmlunit.FailingHttpStatusCodeException;
> import com.gargoylesoftware.htmlunit.Page;
> import com.gargoylesoftware.htmlunit.WebClient;
> import com.gargoylesoftware.htmlunit.html.HtmlPage;
>
> public class ForceDownload {
>
>     public static void main(String[] args) throws FailingHttpStatusCodeException, MalformedURLException, IOException {
>         WebClient client = new WebClient();
>         System.out.println("get to top page");
>         final String topUrl = "http://xn--archivoespaoldearte-53b.revistas.csic.es/index.php/aea/article/view/552/549";
>         final Page topPage = client.getPage(topUrl);
>         if (topPage.isHtmlPage()) {
>             System.out.println("topPage is htmlPage");
>             System.out.println("source of top page is " + ((HtmlPage) topPage).asXml());
>         }
>
>         System.out.println("get to download page directly");
>
>         final String downloadUrl = "http://archivoespañoldearte.revistas.csic.es/index.php/aea/article/download/552/549";
>
>         final Page page = client.getPage(downloadUrl);
>         System.out.println(page.getWebResponse().getContentType());
>     }
>
> }
>
> This is the output of the program:
>
> get to top page
> topPage is htmlPage
> source of top page is <?xml version="1.0" encoding="UTF-8"?>
> <html xmlns="http://www.w3.org/1999/xhtml">
>   <head>
>     <meta http-equiv="Content-Type" content="text/html; charset=utf-8"/>
>     <title>
>       Vasallo Toranzo
>     </title>
>     <link rel="stylesheet" href="http://xn--archivoespaoldearte-53b.revistas.csic.es/styles/common.css" type="text/css"/>
>     <link rel="stylesheet" href="http://xn--archivoespaoldearte-53b.revistas.csic.es/styles/articleView.css" type="text/css"/>
>     <link rel="icon" href="http://xn--archivoespaoldearte-53b.revistas.csic.es/favicon.ico" type="image/x-icon"/>
>     <script type="text/javascript" src="http://xn--archivoespaoldearte-53b.revistas.csic.es/js/general.js">
>     </script>
>     <!-- Add javascript required for font sizer -->
>     <script type="text/javascript" src="http://xn--archivoespaoldearte-53b.revistas.csic.es/js/sizer.js">
>     </script>
>     <!-- Add stylesheets for the font sizer -->
>     <link rel="alternate stylesheet" title="Pequeña" href="http://xn--archivoespaoldearte-53b.revistas.csic.es/styles/fontSmall.css" type="text/css" disabled="disabled"/>
>     <link rel="stylesheet" title="Mediana" href="http://xn--archivoespaoldearte-53b.revistas.csic.es/styles/fontMedium.css" type="text/css"/>
>     <link rel="alternate stylesheet" title="Grande" href="http://xn--archivoespaoldearte-53b.revistas.csic.es/styles/fontLarge.css" type="text/css" disabled="disabled"/>
>   </head>
>   <frameset cols="220,*" style="border: 0;">
>     <!-- cols="*,180"-->
>     <frame src="http://xn--archivoespaoldearte-53b.revistas.csic.es/index.php/aea/article/viewRST/552/549" noresize="noresize" frameborder="0" scrolling="auto"/>
>     <frame src="http://xn--archivoespaoldearte-53b.revistas.csic.es/index.php/aea/article/viewDownloadInterstitial/552/549" frameborder="0"/>
>     <noframes>
>       <body>
>         <table width="100%">
>           <tr>
>             <td align="center">
>               Esta página usa marcos. <a href="http://xn--archivoespaoldearte-53b.revistas.csic.es/index.php/aea/article/viewDownloadInterstitial/552/549">Haga click aquí</a> para ir a la versión sin marcos.
>             </td>
>           </tr>
>         </table>
>       </body>
>     </noframes>
>   </frameset>
> </html>
>
> get to download page directly
> application/force-download
>
> How can I solve this challenge?
>
> How can I tell HtmlUnit to download the file directly?
>
> Thanks,
>
> David
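
On the question of getting to the PDF once the response comes back as "application/force-download": a response with that content type presumably already carries the file bytes, so one option is to write the response body to disk instead of treating it as a page. Below is a minimal sketch under that assumption, using only the JDK (Java 9 or later). The class name SaveDownload and the method saveIfPdf are hypothetical, and the in-memory stream in main is a stand-in; with HtmlUnit the stream would presumably come from page.getWebResponse().getContentAsStream(). The sketch checks for the "%PDF-" signature first, so an HTML interstitial is not saved by mistake.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.PushbackInputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.Arrays;

public class SaveDownload {

    // PDF files begin with the ASCII bytes "%PDF-".
    private static final byte[] PDF_MAGIC = "%PDF-".getBytes(StandardCharsets.US_ASCII);

    /**
     * Writes the body to target if it starts with the PDF signature.
     * Returns the number of bytes written, or -1 if the body is not a PDF.
     * (Hypothetical helper, not part of HtmlUnit.)
     */
    static long saveIfPdf(InputStream body, Path target) throws IOException {
        try (PushbackInputStream in = new PushbackInputStream(body, PDF_MAGIC.length)) {
            byte[] head = new byte[PDF_MAGIC.length];
            int n = in.readNBytes(head, 0, head.length);
            if (n < PDF_MAGIC.length || !Arrays.equals(head, PDF_MAGIC)) {
                return -1; // e.g. an HTML page was returned instead of the file
            }
            // Put the signature bytes back so the saved file is complete.
            in.unread(head, 0, n);
            return Files.copy(in, target, StandardCopyOption.REPLACE_EXISTING);
        }
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for the response body; assumption: with HtmlUnit this would be
        // page.getWebResponse().getContentAsStream().
        byte[] fake = "%PDF-1.4 stand-in body".getBytes(StandardCharsets.US_ASCII);
        Path target = Files.createTempFile("article", ".pdf");
        long written = saveIfPdf(new ByteArrayInputStream(fake), target);
        System.out.println("bytes written: " + written);
    }
}
```

If saveIfPdf returns -1, the response was not the PDF itself, and the content type and body are worth printing to see what the server actually sent.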