Hi,
Fetching
http://www.jumbo.pt:80/Frontoffice/ContentPages/BrowseCatalog.aspx?C=050701
with WebClient works fine on GAE in the local development environment.
It fails when deployed to Google's servers, where the following exception is thrown:
java.lang.RuntimeException: java.io.IOException: Could not fetch URL: http://www.jumbo.pt:80/Frontoffice/ContentPages/BrowseCatalog.aspx?C=050701
at com.gargoylesoftware.htmlunit.UrlFetchWebConnection.getResponse(UrlFetchWebConnection.java:150)
at com.gargoylesoftware.htmlunit.WebClient.loadWebResponseFromWebConnection(WebClient.java:1439)
at com.gargoylesoftware.htmlunit.WebClient.loadWebResponse(WebClient.java:1358)
at com.gargoylesoftware.htmlunit.WebClient.getPage(WebClient.java:307)
at com.gargoylesoftware.htmlunit.WebClient.getPage(WebClient.java:373)
at com.gargoylesoftware.htmlunit.WebClient.getPage(WebClient.java:358)
Fetching the same URL through http://ajax-crawler.appspot.com/
gives:
"Unable to get a DOM for
http://www.jumbo.pt:80/Frontoffice/ContentPages/BrowseCatalog.aspx?C=050701".
I didn't set the group because I don't know how to check the HtmlUnit version in the GAE environment. Help with that would also be much appreciated.
Thanks!
José
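On checking the HtmlUnit version in the GAE environment: a GAE application deploys its own jars from WEB-INF/lib, so the version should be whatever htmlunit jar is bundled there. To confirm it at runtime, one option is to read the Implementation-Version recorded in the jar manifest. The following is only a sketch; the class name LibraryVersion and the helper implementationVersionOf are made up for illustration, and it assumes the jar's manifest actually carries an Implementation-Version entry.

```java
final class LibraryVersion {

    /**
     * Returns the Implementation-Version of the package that contains the
     * given class, as recorded in the manifest of the jar it was loaded from.
     * Hypothetical helper, not part of HtmlUnit or GAE.
     */
    static String implementationVersionOf(String className) {
        try {
            // Load without initializing, so no static initializers run.
            Class<?> clazz = Class.forName(
                    className, false, LibraryVersion.class.getClassLoader());
            Package pkg = clazz.getPackage();
            String version = (pkg == null) ? null : pkg.getImplementationVersion();
            // The manifest entry is optional, so it may be missing.
            return (version == null) ? "unknown" : version;
        } catch (ClassNotFoundException e) {
            return "not on classpath";
        }
    }
}
```

Logging LibraryVersion.implementationVersionOf("com.gargoylesoftware.htmlunit.WebClient") from the deployed application should then show the version in the GAE logs, provided the manifest entry is present.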
Can you provide the stack trace of the root cause (the IOException)?
Sure. Here is the stack trace including the root cause. The problem does not occur in the local environment. Thanks.
2012-09-30 15:39:55.082
com.maisbarato.server.web.WebpageRetriver getHtmlPage: [ZECAS3 JUMBAS_REF BEER_ALCOOHOLIC] GETTING URL >http://www.jumbo.pt:80/Frontoffice/ContentPages/BrowseCatalog.aspx?C=050301<
java.lang.RuntimeException: java.io.IOException: Too many redirects at URL: http://www.jumbo.pt:80/Frontoffice/ContentPages/BrowseCatalog.aspx?C=050301 with redirect=true
at com.gargoylesoftware.htmlunit.UrlFetchWebConnection.getResponse(UrlFetchWebConnection.java:150)
at com.gargoylesoftware.htmlunit.WebClient.loadWebResponseFromWebConnection(WebClient.java:1439)
at com.gargoylesoftware.htmlunit.WebClient.loadWebResponse(WebClient.java:1358)
at com.gargoylesoftware.htmlunit.WebClient.getPage(WebClient.java:307)
at com.gargoylesoftware.htmlunit.WebClient.getPage(WebClient.java:373)
at com.gargoylesoftware.htmlunit.WebClient.getPage(WebClient.java:358)
at com.maisbarato.server.web.WebpageRetriver.getHtmlPage(WebpageRetriver.java:80)
at com.maisbarato.server.retailer.JumbasSeller.getBeverageList(JumbasSeller.java:99)
at com.maisbarato.server.retailer.WebRetailer.getGoodList(WebRetailer.java:94)
at com.maisbarato.server.GoodsProvider.refreshGoods(GoodsProvider.java:47)
at com.maisbarato.server.CustomServletContextListener.contextInitialized(CustomServletContextListener.java:8)
at org.mortbay.jetty.handler.ContextHandler.startContext(ContextHandler.java:548)
at org.mortbay.jetty.servlet.Context.startContext(Context.java:136)
at org.mortbay.jetty.webapp.WebAppContext.startContext(WebAppContext.java:1250)
at org.mortbay.jetty.handler.ContextHandler.doStart(ContextHandler.java:517)
at org.mortbay.jetty.webapp.WebAppContext.doStart(WebAppContext.java:467)
at org.mortbay.component.AbstractLifeCycle.start(AbstractLifeCycle.java:50)
at com.google.apphosting.runtime.jetty.AppVersionHandlerMap.createHandler(AppVersionHandlerMap.java:219)
at com.google.apphosting.runtime.jetty.AppVersionHandlerMap.getHandler(AppVersionHandlerMap.java:194)
at com.google.apphosting.runtime.jetty.JettyServletEngineAdapter.serviceRequest(JettyServletEngineAdapter.java:134)
at com.google.apphosting.runtime.JavaRuntime$RequestRunnable.run(JavaRuntime.java:447)
at com.google.tracing.TraceContext$TraceContextRunnable.runInContext(TraceContext.java:452)
at com.google.tracing.TraceContext$TraceContextRunnable$1.run(TraceContext.java:459)
at com.google.tracing.TraceContext.runInContext(TraceContext.java:701)
at com.google.tracing.TraceContext$AbstractTraceContextCallback.runInInheritedContextNoUnref(TraceContext.java:336)
at com.google.tracing.TraceContext$AbstractTraceContextCallback.runInInheritedContext(TraceContext.java:328)
at com.google.tracing.TraceContext$TraceContextRunnable.run(TraceContext.java:456)
at com.google.apphosting.runtime.ThreadGroupPool$PoolEntry.run(ThreadGroupPool.java:251)
at java.lang.Thread.run(Thread.java:679)
Caused by: java.io.IOException: Too many redirects at URL: http://www.jumbo.pt:80/Frontoffice/ContentPages/BrowseCatalog.aspx?C=050301 with redirect=true
at com.google.appengine.api.urlfetch.URLFetchServiceImpl.convertApplicationException(URLFetchServiceImpl.java:126)
at com.google.appengine.api.urlfetch.URLFetchServiceImpl.fetch(URLFetchServiceImpl.java:43)
at com.google.apphosting.utils.security.urlfetch.URLFetchServiceStreamHandler$Connection.fetchResponse(URLFetchServiceStreamHandler.java:417)
at com.google.apphosting.utils.security.urlfetch.URLFetchServiceStreamHandler$Connection.getInputStream(URLFetchServiceStreamHandler.java:296)
at com.google.apphosting.utils.security.urlfetch.URLFetchServiceStreamHandler$Connection.getResponseCode(URLFetchServiceStreamHandler.java:149)
at com.gargoylesoftware.htmlunit.UrlFetchWebConnection.getResponse(UrlFetchWebConnection.java:115)
... 28 more
I believe the problem lies with the site or with GAE rather than with HtmlUnit: googling for "google app engine Too many redirects at URL" gives a lot of results.
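To see where the redirects actually lead, it may help to follow them one hop at a time instead of letting the fetch service follow them. Below is a rough sketch using plain java.net.HttpURLConnection with automatic redirects switched off (Java 8 or later). RedirectTracer, trace and httpNextLocation are names invented for this example. Whether GAE's URL Fetch backend honors setInstanceFollowRedirects(false) in exactly this way is an assumption worth verifying in your environment.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.HttpURLConnection;
import java.net.URL;
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

final class RedirectTracer {

    /**
     * Follows a redirect chain hop by hop and returns every URL visited,
     * in order. Stops when there is no further redirect, when a URL
     * repeats (a loop), or after maxHops redirects have been followed.
     */
    static List<String> trace(String start, int maxHops,
                              Function<String, String> nextLocation) {
        List<String> visited = new ArrayList<>();
        String current = start;
        while (current != null && visited.size() <= maxHops) {
            boolean loop = visited.contains(current);
            visited.add(current);
            if (loop) {
                break; // the same URL came up twice: redirect loop
            }
            current = nextLocation.apply(current);
        }
        return visited;
    }

    /**
     * Issues one request without following redirects and returns the
     * absolute redirect target, or null if the response is not a 3xx
     * redirect or carries no Location header.
     */
    static String httpNextLocation(String url) {
        try {
            HttpURLConnection conn =
                    (HttpURLConnection) new URL(url).openConnection();
            conn.setInstanceFollowRedirects(false);
            int status = conn.getResponseCode();
            String location = conn.getHeaderField("Location");
            conn.disconnect();
            if (status < 300 || status >= 400 || location == null) {
                return null;
            }
            // Location may be relative, so resolve it against the request URL.
            return new URL(new URL(url), location).toString();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Calling RedirectTracer.trace(url, 10, RedirectTracer::httpNextLocation) for the failing URL and logging the result should show whether the site sends the request in a loop or simply redirects more times than the fetch service allows.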