From: David M. G. <mic...@gm...> - 2014-01-06 08:19:51
Hi all,

I have the following problem. When I fetch this page with a browser, I get a PDF:

http://xn--archivoespaoldearte-53b.revistas.csic.es/index.php/aea/article/view/557/554

Through HtmlUnit I just get an HTML page. Here is the program:

    package test;

    import java.io.File;
    import java.io.IOException;
    import java.io.InputStream;
    import java.net.MalformedURLException;

    import org.apache.commons.io.FileUtils;
    import org.apache.pdfbox.exceptions.COSVisitorException;

    import com.gargoylesoftware.htmlunit.FailingHttpStatusCodeException;
    import com.gargoylesoftware.htmlunit.ImmediateRefreshHandler;
    import com.gargoylesoftware.htmlunit.Page;
    import com.gargoylesoftware.htmlunit.WebClient;

    public class ForceDownload {

        public static void main(String[] args) throws FailingHttpStatusCodeException,
                MalformedURLException, IOException, COSVisitorException {
            WebClient client = new WebClient();
            // Follow refresh requests immediately instead of waiting for the delay.
            client.setRefreshHandler(new ImmediateRefreshHandler());

            final String downloadUrl =
                "http://xn--archivoespaoldearte-53b.revistas.csic.es/index.php/aea/article/view/557/554";
            final Page page = client.getPage(downloadUrl);

            // Print the content type of whatever page was finally loaded.
            System.out.println(page.getWebResponse().getContentType());

            // Save the response body to disk.
            final InputStream is = page.getWebResponse().getContentAsStream();
            FileUtils.copyInputStreamToFile(is, new File("file.pdf"));
        }
    }

The output file contains the HTML page, not the PDF. I already tried setting an ImmediateRefreshHandler, but it did not help. I tried to understand why, and saw in the Firefox web developer tools that the browser sends keep-alive signals.

How can I refresh the page and wait on the keep-alive until I get the PDF page?

Thanks,
David