Re: [Htmlparser-user] help on parser's constructor
From: Derrick O. <der...@ro...> - 2007-11-16 12:15:46
Method 1 goes through the ConnectionManager class, which conditions the connection in several ways besides handling proxies (which I assume you aren't using). The relevant code looks like this:
    HttpURLConnection http;

    if (getRedirectionProcessingEnabled ())
        http.setInstanceFollowRedirects (false);

    // set the fixed request properties
    properties = getRequestProperties ();
    if (null != properties)
        for (enumeration = properties.keys ();
                enumeration.hasMoreElements ();)
        {
            key = (String)enumeration.nextElement ();
            value = (String)properties.get (key);
            http.setRequestProperty (key, value);
        }

    // set the proxy name and password
    if ((null != getProxyUser ())
        && (null != getProxyPassword ()))
    {
        auth = getProxyUser () + ":" + getProxyPassword ();
        encoded = encode (auth.getBytes("ISO-8859-1"));
        http.setRequestProperty ("Proxy-Authorization", "Basic " + encoded);
    }

    // set the URL name and password
    if ((null != getUser ()) && (null != getPassword ()))
    {
        auth = getUser () + ":" + getPassword ();
        encoded = encode (auth.getBytes("ISO-8859-1"));
        http.setRequestProperty ("Authorization",
            "Basic " + encoded);
    }

    if (getCookieProcessingEnabled ())
        // set the cookies based on the url
        addCookies (http);
Of these steps, the request properties supplied by default are the most likely reason the returned page differs (unless you're doing something else different yourself). There are only two default request properties:
"User-Agent", "HTMLParser/2.0"
"Accept-Encoding", "gzip, deflate"
You can add these to your own URLConnection and see if that changes the returned page.
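As a rough sketch of that experiment, the snippet below sets the same two headers on a plain URLConnection before it is used. The class name DefaultHeaders and its helper methods are made up for illustration and are not part of htmlparser; only the header names and values come from the defaults listed above.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.URL;
import java.net.URLConnection;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper (not part of htmlparser): copies the two default
// request properties quoted above onto a connection you open yourself.
class DefaultHeaders {
    static final Map<String, String> DEFAULTS = new LinkedHashMap<>();
    static {
        DEFAULTS.put("User-Agent", "HTMLParser/2.0");
        DEFAULTS.put("Accept-Encoding", "gzip, deflate");
    }

    // Request properties must be set before the connection is actually made.
    static URLConnection apply(URLConnection connection) {
        for (Map.Entry<String, String> entry : DEFAULTS.entrySet()) {
            connection.setRequestProperty(entry.getKey(), entry.getValue());
        }
        return connection;
    }

    // openConnection() only creates the object; no network traffic happens here.
    static URLConnection open(String urlString) {
        try {
            return apply(new URL(urlString).openConnection());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        URLConnection connection = open("http://example.com/");
        System.out.println(connection.getRequestProperty("User-Agent"));
        System.out.println(connection.getRequestProperty("Accept-Encoding"));
    }
}
```

The connection returned by DefaultHeaders.open(urlString) could then be handed to new Parser(...) as in method 2, to check whether the page now matches what method 1 returns. Since Accept-Encoding advertises gzip and deflate, the server may send a compressed body, so any code that reads the stream directly would need to decompress it.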
----- Original Message ----
From: Marcel <ta...@gm...>
To: htmlparser user list <htm...@li...>
Sent: Friday, November 16, 2007 12:33:44 AM
Subject: [Htmlparser-user] help on parser's constructor
Hi, I used htmlparser to parse certain web pages, and I found something odd about the Parser's two constructors.
Say I have a urlString:
----------- method 1 ------------
Parser parser = new Parser(urlString);
-------- method 2 ------------
URL url = new URL(urlString);
Parser parser = new Parser(url.openConnection());
These two methods return different page contents for the same urlString. Does anybody know the reason? What is the difference between the two constructors?
Thanks
-marcel