From: netbeans <net...@fl...> - 2002-08-26 04:50:09
I want to use HTMLUnit to fetch information from websites that I log in to, build a personal page of that information laid out just the way I want it, and save it to disk for offline viewing.

When I hit some of these sites I get errors back saying the page is not available. I believe they are detecting whether I am connecting with a recognized browser or not. I changed all the HTTP request headers in HTMLUnit to match those of my browser but was still unable to get through.

What can I do to log in to the web site with my information and get the data I need? Does the Mozilla browser, for example, connect in such a way that these sites can tell? Am I going to have to dig into the Mozilla code to do this, or is there a way to do it with HTMLUnit?

The headers I sent matched those of Mozilla exactly except for two things:

1. Order: I could not get HTMLUnit to send the headers in the order they were added.
2. Content-Length: HTMLUnit adds "Content-Length" even when I requested that it be removed.

Besides these two items, the headers I sent were identical to the Mozilla headers, and I am able to hit the site using Mozilla 1.0.

How do I specify the exact order and get "Content-Length" removed? Something tells me this may not make a difference, but I should at least be able to do these two things and try. If a browser gets through, there has to be a way to get through with HTMLUnit or some other program.

Thanks,
netbeans
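To test whether header order or "Content-Length" matters at all, one option is to bypass HTMLUnit for a single request and write the request by hand over a plain socket. The following is only a rough sketch under several assumptions, not HTMLUnit API: the class and method names (RawRequest, build, send) are made up for illustration, example.com and the header values are placeholders to be replaced with what Mozilla actually sends, and it handles plain HTTP on port 80 only (no HTTPS, cookies, or redirects). A LinkedHashMap keeps the headers in insertion order, and no "Content-Length" is written because the code never adds one.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.Socket;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of HTMLUnit: builds and sends a raw HTTP request.
final class RawRequest {

    // Builds the request text. Headers are written in the map's iteration
    // order, so pass a LinkedHashMap to keep the order they were added in.
    static String build(String method, String path, Map<String, String> headers) {
        StringBuilder sb = new StringBuilder();
        sb.append(method).append(' ').append(path).append(" HTTP/1.1\r\n");
        for (Map.Entry<String, String> h : headers.entrySet()) {
            sb.append(h.getKey()).append(": ").append(h.getValue()).append("\r\n");
        }
        sb.append("\r\n"); // blank line ends the header section; no body follows
        return sb.toString();
    }

    // Sends the request over a plain (non-TLS) socket and returns the raw
    // response text. It reads until the server closes the connection, so the
    // request should carry "Connection: close".
    static String send(String host, int port, String request) throws IOException {
        try (Socket socket = new Socket(host, port)) {
            OutputStream out = socket.getOutputStream();
            out.write(request.getBytes(StandardCharsets.ISO_8859_1));
            out.flush();
            InputStream in = socket.getInputStream();
            return new String(in.readAllBytes(), StandardCharsets.ISO_8859_1);
        }
    }

    public static void main(String[] args) throws IOException {
        // Placeholder host; pass the real one as the first argument.
        String host = args.length > 0 ? args[0] : "example.com";

        // Placeholder header values: copy the real ones from the browser.
        Map<String, String> headers = new LinkedHashMap<>();
        headers.put("Host", host);
        headers.put("User-Agent", "Mozilla/5.0 (placeholder)");
        headers.put("Accept", "*/*");
        headers.put("Connection", "close");

        String request = build("GET", "/", headers);
        System.out.print(request);

        // Only touch the network when a host was given explicitly.
        if (args.length > 0) {
            System.out.println(send(host, 80, request));
        }
    }
}
```

Run without arguments, this just prints the request so the order can be compared against the browser's. Run with a host name, it sends the request and prints the raw response. If the site still refuses a request whose headers match the browser's, the check is probably based on something other than header order or "Content-Length".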