Proxy doesn't seem to work

Help
David
2011-11-25
2012-09-04
  • David

    David - 2011-11-25

    Hi,

    I've only just begun to use Web-Harvest and it's really cool, but I have a big
    issue with the proxy. I have written a script that harvests data from a web
    site. If I run this script while the site is running on my local web server,
    everything is fine. But when I try to access the same site outside our
    company (behind the firewall), it doesn't work. I have set up the proxy in the
    GUI with all the details (username, password and port), but still nothing. It
    does not work from my Java program either (yes, I did set up all the proxy
    details there as well). I know that the proxy address and all the details are
    correct, because I use the same details in another scraping tool and it works
    fine.

    When I display everything between the HTML body tags, here is what I get
    returned:

    The page cannot be displayed

    There is a problem with the page you are trying to reach and it cannot be
    displayed.

    Please try the following:

    Click the Refresh button, or try again later.

    Open the home page, and then look for links to the information you want.

    If you typed the page address in the Address bar, make sure that it is
    spelled correctly.

    Verify that the Internet access policy on your network allows you to view
    this page.

    If you believe you should be able to view this directory or page, please
    contact the Web site administrator by using the e-mail address or phone
    number listed on the home page.

    HTTP 407 Proxy Authentication Required - The ISA Server requires authorization
    to fulfill the request. Access to the Web Proxy service is denied. (12209)

    Internet Security and Acceleration Server

    Technical Information (for support personnel)

    Background:

    The gateway could not retrieve the requested page.

    ISA Server: blahblah.emea.ourcompanyname.net

    Via:

    Time: 25/11/2011 14:34:53 GMT

    Any thoughts on what could be wrong?

    Thanks for any input.

    Dave

     
  • Alex Wajda

    Alex Wajda - 2011-11-26

    Unfortunately, I cannot test it, as I don't have an HTTP proxy set up in
    any of the networks I have access to.

    From the source code perspective everything looks good, and if you provided
    all the proxy details correctly in the WH IDE, it should work. I'll try to
    test proxy support somehow.

     
  • Anonymous

    Anonymous - 2011-11-27

    I used the proxy parameters and they work fine (without authentication), but
    I suppose your ISA server is requesting authentication in a mode different
    from the (I suppose) plain-text mode you are providing.

    I don't have an ISA Server to confirm this either.
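
One way to check the guess above is to look at the Proxy-Authenticate header(s) that come back with the 407 response: each value starts with the name of an authentication scheme the proxy is willing to accept. The following is only a minimal sketch; the class and method names are made up for illustration, and it assumes the proxy sends one challenge per header value.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

// Hypothetical helper, not part of Web-Harvest.
class ProxyAuthSchemes {

    // Returns the lower-cased scheme name (the first token) of each
    // Proxy-Authenticate header value, e.g. "Basic realm=\"x\"" -> "basic".
    // Assumption: one challenge per header value.
    static List<String> offeredSchemes(List<String> headerValues) {
        List<String> schemes = new ArrayList<>();
        for (String value : headerValues) {
            String trimmed = value.trim();
            if (trimmed.isEmpty()) {
                continue;
            }
            String scheme = trimmed.split("\\s+", 2)[0];
            schemes.add(scheme.toLowerCase(Locale.ROOT));
        }
        return schemes;
    }

    // True if the proxy accepts plain username/password (Basic) authentication.
    static boolean offersBasic(List<String> headerValues) {
        return offeredSchemes(headerValues).contains("basic");
    }

    public static void main(String[] args) {
        // Sample values for illustration only, not taken from a real ISA response.
        List<String> sample = Arrays.asList("Negotiate", "NTLM", "Basic realm=\"proxy\"");
        System.out.println(offeredSchemes(sample));
        System.out.println(offersBasic(sample));
    }
}
```

If "basic" is not among the offered schemes, a plain username and password will keep being rejected with 407 no matter how correct they are, which would match the behaviour described in this thread.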

     
  • alosada

    alosada - 2011-11-28

    Same issue with our ISA server....

     
  • David

    David - 2011-11-29

    Mmm, I didn't manage to get it to work, so what I basically did was cheat. In
    my Java program, I download the web site to my local hard drive first and then
    web-scrape this file. It's probably not the best solution, but hey, it works
    :)
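
For anyone wanting to try the same workaround, here is a rough sketch of the download step using only the standard java.net classes. It is not David's actual code; the class name, method names and parameters are all assumptions for illustration, and whether the proxy accepts the credentials still depends on the authentication scheme it requires.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.Authenticator;
import java.net.InetSocketAddress;
import java.net.PasswordAuthentication;
import java.net.Proxy;
import java.net.URI;
import java.net.URLConnection;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Hypothetical helper, not part of Web-Harvest.
class PageDownloader {

    // Registers credentials that java.net will offer when a proxy asks for them.
    static void useProxyCredentials(final String user, final char[] password) {
        Authenticator.setDefault(new Authenticator() {
            @Override
            protected PasswordAuthentication getPasswordAuthentication() {
                // Only answer proxy challenges, not challenges from the target site.
                if (getRequestorType() == RequestorType.PROXY) {
                    return new PasswordAuthentication(user, password);
                }
                return null;
            }
        });
    }

    // Copies the stream to the target file, overwriting it; returns bytes written.
    static long save(InputStream in, Path target) throws IOException {
        return Files.copy(in, target, StandardCopyOption.REPLACE_EXISTING);
    }

    // Fetches pageUrl through the given HTTP proxy and stores the body in target.
    static long download(String pageUrl, String proxyHost, int proxyPort, Path target)
            throws IOException {
        Proxy proxy = new Proxy(Proxy.Type.HTTP, new InetSocketAddress(proxyHost, proxyPort));
        URLConnection connection = URI.create(pageUrl).toURL().openConnection(proxy);
        try (InputStream in = connection.getInputStream()) {
            return save(in, target);
        }
    }
}
```

A caller would invoke useProxyCredentials once, then download for each page, and point the scraping script at the saved file instead of the remote URL.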

     
