
#1395 IOException in GAE when fetching URL

closed
None
5
2012-10-21
2012-03-26
José Reis
No

Hi,

Fetching
http://www.jumbo.pt:80/Frontoffice/ContentPages/BrowseCatalog.aspx?C=050701
using WebClient works fine in the local GAE development environment.

The same does not happen on Google's servers, where the following exception is thrown:

java.lang.RuntimeException: java.io.IOException: Could not fetch URL: http://www.jumbo.pt:80/Frontoffice/ContentPages/BrowseCatalog.aspx?C=050701
at com.gargoylesoftware.htmlunit.UrlFetchWebConnection.getResponse(UrlFetchWebConnection.java:150)
at com.gargoylesoftware.htmlunit.WebClient.loadWebResponseFromWebConnection(WebClient.java:1439)
at com.gargoylesoftware.htmlunit.WebClient.loadWebResponse(WebClient.java:1358)
at com.gargoylesoftware.htmlunit.WebClient.getPage(WebClient.java:307)
at com.gargoylesoftware.htmlunit.WebClient.getPage(WebClient.java:373)
at com.gargoylesoftware.htmlunit.WebClient.getPage(WebClient.java:358)

If we try fetching it through http://ajax-crawler.appspot.com/
we get:
"Unable to get a DOM for
http://www.jumbo.pt:80/Frontoffice/ContentPages/BrowseCatalog.aspx?C=050701".

I didn't set the group because I don't know how to check the HtmlUnit version in the GAE environment. Help with this would also be much appreciated.

Thanks!
José
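
On the question of which HtmlUnit version is deployed: one possible approach, sketched here under assumptions, is to read the `Implementation-Version` entry of the jar manifest through the standard `Package` API. The class name `VersionProbe` and its method are hypothetical helpers, not part of HtmlUnit, and the result is only meaningful if the deployed jar's manifest actually carries that entry; otherwise the helper reports "unknown".

```java
public final class VersionProbe {
    private VersionProbe() {}

    /**
     * Returns the manifest Implementation-Version of the package that
     * contains the named class, "unknown" if the manifest has no such
     * entry, or "class not found" if the class is not on the classpath.
     */
    public static String implementationVersion(String className) {
        try {
            // Load without initializing, so no static initializers run.
            Class<?> type = Class.forName(className, false, VersionProbe.class.getClassLoader());
            Package pkg = type.getPackage();
            String version = (pkg == null) ? null : pkg.getImplementationVersion();
            return (version == null) ? "unknown" : version;
        } catch (ClassNotFoundException e) {
            return "class not found";
        }
    }

    public static void main(String[] args) {
        // The class name is taken from the stack trace in this report.
        System.out.println(implementationVersion("com.gargoylesoftware.htmlunit.WebClient"));
    }
}
```

In a deployed application the value would more likely be written to the application log than to standard output.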

Discussion

  • Marc Guillemot - 2012-04-20

    Can you provide the stack trace of the root cause (the IOException)?

     
  • José Reis - 2012-09-30

    Sure. Here is the stack trace of the root exception. The problem does not occur in the local environment. Thanks.
    2012-09-30 15:39:55.082
    com.maisbarato.server.web.WebpageRetriver getHtmlPage: [ZECAS3 JUMBAS_REF BEER_ALCOOHOLIC] GETTING URL >http://www.jumbo.pt:80/Frontoffice/ContentPages/BrowseCatalog.aspx?C=050301<
    java.lang.RuntimeException: java.io.IOException: Too many redirects at URL: http://www.jumbo.pt:80/Frontoffice/ContentPages/BrowseCatalog.aspx?C=050301 with redirect=true
    at com.gargoylesoftware.htmlunit.UrlFetchWebConnection.getResponse(UrlFetchWebConnection.java:150)
    at com.gargoylesoftware.htmlunit.WebClient.loadWebResponseFromWebConnection(WebClient.java:1439)
    at com.gargoylesoftware.htmlunit.WebClient.loadWebResponse(WebClient.java:1358)
    at com.gargoylesoftware.htmlunit.WebClient.getPage(WebClient.java:307)
    at com.gargoylesoftware.htmlunit.WebClient.getPage(WebClient.java:373)
    at com.gargoylesoftware.htmlunit.WebClient.getPage(WebClient.java:358)
    at com.maisbarato.server.web.WebpageRetriver.getHtmlPage(WebpageRetriver.java:80)
    at com.maisbarato.server.retailer.JumbasSeller.getBeverageList(JumbasSeller.java:99)
    at com.maisbarato.server.retailer.WebRetailer.getGoodList(WebRetailer.java:94)
    at com.maisbarato.server.GoodsProvider.refreshGoods(GoodsProvider.java:47)
    at com.maisbarato.server.CustomServletContextListener.contextInitialized(CustomServletContextListener.java:8)
    at org.mortbay.jetty.handler.ContextHandler.startContext(ContextHandler.java:548)
    at org.mortbay.jetty.servlet.Context.startContext(Context.java:136)
    at org.mortbay.jetty.webapp.WebAppContext.startContext(WebAppContext.java:1250)
    at org.mortbay.jetty.handler.ContextHandler.doStart(ContextHandler.java:517)
    at org.mortbay.jetty.webapp.WebAppContext.doStart(WebAppContext.java:467)
    at org.mortbay.component.AbstractLifeCycle.start(AbstractLifeCycle.java:50)
    at com.google.apphosting.runtime.jetty.AppVersionHandlerMap.createHandler(AppVersionHandlerMap.java:219)
    at com.google.apphosting.runtime.jetty.AppVersionHandlerMap.getHandler(AppVersionHandlerMap.java:194)
    at com.google.apphosting.runtime.jetty.JettyServletEngineAdapter.serviceRequest(JettyServletEngineAdapter.java:134)
    at com.google.apphosting.runtime.JavaRuntime$RequestRunnable.run(JavaRuntime.java:447)
    at com.google.tracing.TraceContext$TraceContextRunnable.runInContext(TraceContext.java:452)
    at com.google.tracing.TraceContext$TraceContextRunnable$1.run(TraceContext.java:459)
    at com.google.tracing.TraceContext.runInContext(TraceContext.java:701)
    at com.google.tracing.TraceContext$AbstractTraceContextCallback.runInInheritedContextNoUnref(TraceContext.java:336)
    at com.google.tracing.TraceContext$AbstractTraceContextCallback.runInInheritedContext(TraceContext.java:328)
    at com.google.tracing.TraceContext$TraceContextRunnable.run(TraceContext.java:456)
    at com.google.apphosting.runtime.ThreadGroupPool$PoolEntry.run(ThreadGroupPool.java:251)
    at java.lang.Thread.run(Thread.java:679)
    Caused by: java.io.IOException: Too many redirects at URL: http://www.jumbo.pt:80/Frontoffice/ContentPages/BrowseCatalog.aspx?C=050301 with redirect=true
    at com.google.appengine.api.urlfetch.URLFetchServiceImpl.convertApplicationException(URLFetchServiceImpl.java:126)
    at com.google.appengine.api.urlfetch.URLFetchServiceImpl.fetch(URLFetchServiceImpl.java:43)
    at com.google.apphosting.utils.security.urlfetch.URLFetchServiceStreamHandler$Connection.fetchResponse(URLFetchServiceStreamHandler.java:417)
    at com.google.apphosting.utils.security.urlfetch.URLFetchServiceStreamHandler$Connection.getInputStream(URLFetchServiceStreamHandler.java:296)
    at com.google.apphosting.utils.security.urlfetch.URLFetchServiceStreamHandler$Connection.getResponseCode(URLFetchServiceStreamHandler.java:149)
    at com.gargoylesoftware.htmlunit.UrlFetchWebConnection.getResponse(UrlFetchWebConnection.java:115)
    ... 28 more

     
  • Marc Guillemot - 2012-10-01

    I believe the problem lies with the site or with GAE rather than with HtmlUnit, since googling for "google app engine Too many redirects at URL" gives a lot of results.
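
To narrow down whether the site or the platform is responsible for the "Too many redirects" error, one option is to follow the redirect chain by hand and compare the result locally and on GAE. The sketch below is only an illustration: `RedirectTracer`, `Hop` and `Fetcher` are hypothetical names, not HtmlUnit or GAE APIs. It relies on the standard `HttpURLConnection.setInstanceFollowRedirects(false)` and assumes the deployed runtime honours that setting.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.HttpURLConnection;
import java.net.URI;
import java.util.ArrayList;
import java.util.List;

public final class RedirectTracer {
    private RedirectTracer() {}

    /** One response in the chain: status code plus Location header (may be null). */
    public static final class Hop {
        public final int status;
        public final String location;

        public Hop(int status, String location) {
            this.status = status;
            this.location = location;
        }
    }

    /** Fetches a single URL without following redirects. */
    public interface Fetcher {
        Hop fetch(String url);
    }

    /** Fetcher backed by HttpURLConnection with automatic redirects switched off. */
    public static final Fetcher HTTP = url -> {
        try {
            HttpURLConnection conn = (HttpURLConnection) URI.create(url).toURL().openConnection();
            conn.setInstanceFollowRedirects(false);
            try {
                return new Hop(conn.getResponseCode(), conn.getHeaderField("Location"));
            } finally {
                conn.disconnect();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    };

    /**
     * Returns the URLs visited, in order. The walk stops at the first
     * non-redirect response, at the first repeated URL (the last entry then
     * equals an earlier one, which indicates a loop), or after maxHops entries.
     */
    public static List<String> trace(String start, Fetcher fetcher, int maxHops) {
        List<String> visited = new ArrayList<>();
        String current = start;
        for (int i = 0; i < maxHops; i++) {
            boolean seenBefore = visited.contains(current);
            visited.add(current);
            if (seenBefore) {
                break; // redirect loop detected
            }
            Hop hop = fetcher.fetch(current);
            boolean redirect = hop.status >= 300 && hop.status < 400 && hop.location != null;
            if (!redirect) {
                break; // final response reached
            }
            // Location may be relative, so resolve it against the current URL.
            current = URI.create(current).resolve(hop.location).toString();
        }
        return visited;
    }
}
```

Calling `RedirectTracer.trace(url, RedirectTracer.HTTP, 10)` from both environments and logging the returned list would show whether the chain differs. The stack trace above shows that on GAE the connection goes through `URLFetchServiceStreamHandler`, so the deployed run would exercise the same fetch path that raised the error.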

     
