Re: [Htmlparser-user] Cookie handling
From: Gavin G. <ga...@br...> - 2007-01-30 13:16:38
Hi Derrick,

Cheers for the quick reply. I knew I'd regret posting the code I did, because the actual code I'm using does in fact have redirection processing enabled. (I had managed to mix it up with similar code elsewhere that actually needs redirection processing disabled.)

Here's a quick example of the sort of problem I'm having (including the URL of a page causing issues):

---
import org.htmlparser.*;
import org.htmlparser.http.*;
import org.htmlparser.util.*;

public class TestParser {
    public static void main(String[] argv) {
        try {
            // Enable both redirection and cookie processing on the shared manager.
            ConnectionManager manager = Parser.getConnectionManager();
            manager.setRedirectionProcessingEnabled(true);
            manager.setCookieProcessingEnabled(true);

            Parser parser = new Parser("http://www3.interscience.wiley.com/cgi-bin/abstract/68504762/ABSTRACT?CRETRY=1&SRETRY=0");
            // Dump the whole parsed page back out as HTML.
            System.out.print(((NodeList)parser.parse(null)).toHtml());
        } catch (ParserException pe) {
            pe.printStackTrace();
        }
    }
}

sokar:~/junk% javac TestParser.java
sokar:~/junk% java TestParser | grep -i --color Cookie
<title>Wiley InterScience :: Session Cookies</title>
<h2>Session Cookie Error</h2>
<strong>An error has occured because we were unable to send a cookie to your web browser.</strong>
Session cookies are commonly used to facilitate improved site navigation.
In order to use Wiley InterScience you must have your browser set to accept cookies.
Once you have logged in to Wiley InterScience, our Web server uses a temporary cookie to help us manage your visit.
This <strong>Session Cookie</strong> is deleted when you logoff Wiley InterScience, or when you quit your browser.
The cookie allows us to quickly determine your access control rights and your personal preferences during your online session.
The Session Cookie is set out of necessity and not out of convenience.
--

I'm getting the feeling that maybe the site's using some sort of strange technique for restricting access, but I've got similar examples of other sites doing the same kind of thing.

Cheers again,

Gavin.

On Tue, Jan 30, 2007 at 04:36:49AM -0800, Derrick Oswald wrote:
> Hello,
>
> In my experience, you will need both redirection processing and cookie processing.
> The general scenario is that a page is returned with cookies and a redirect.
> If the redirect is taken, the server looks for the cookie it tried to set with the first page.
> You can monitor the traffic in the header by implementing ConnectionMonitor (like the Parser).
>
> Derrick
>
> ----- Original Message ----
> From: Gavin Gilmour <ga...@br...>
> To: htm...@li...
> Sent: Tuesday, January 30, 2007 5:59:41 AM
> Subject: [Htmlparser-user] Cookie handling
>
> Hi, I hope this is the correct place to ask; the mailing list is looking pretty
> quiet according to sourceforge's mail archives :)
>
> I've run into difficulty parsing certain sites that attempt to set cookies in
> an effort to read them later. The sites in question generally spit back pages
> saying that your browser or whatever is misconfigured and deny access. I assume
> this is simply because they can't read the cookies that should've been set, and
> I wondered if there was an option to allow that somewhere.
>
> I'm not trying to set *actual* cookies, which most of the examples I've seen
> show how to do, but just to allow pages to dump cookies and read them later.
> Is this possible or (possibly) unrealistic and difficult to implement?
>
> The current code I've got is:
>
> ...
> ConnectionManager manager = Parser.getConnectionManager();
> manager.setRedirectionProcessingEnabled(false);
> manager.setCookieProcessingEnabled(true);
> Parser parser = new Parser(...);
> ----
>
> etc.
>
> Cheers for any responses!
>
> Gavin.
>
> -------------------------------------------------------------------------
> Take Surveys. Earn Cash. Influence the Future of IT
> Join SourceForge.net's Techsay panel and you'll get the chance to share your
> opinions on IT & business topics through brief surveys - and earn cash
> http://www.techsay.com/default.php?page=join.php&p=sourceforge&CID=DEVDEV
> _______________________________________________
> Htmlparser-user mailing list
> Htm...@li...
> https://lists.sourceforge.net/lists/listinfo/htmlparser-user
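
For anyone reading this thread later, the scenario Derrick describes (a first response that both sets a cookie and redirects, then a second request on which the server expects that cookie back) can be reproduced locally with the JDK alone. The following is only a minimal sketch and does not use HTMLParser at all: the class name `CookieRedirectDemo`, the helpers `startServer` and `fetch`, the cookie `session=abc`, and the `/page` path are all made up for illustration, and the toy server is an assumption about how such sites behave, not a description of the Wiley site.

```java
import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.io.InputStream;
import java.net.HttpURLConnection;
import java.net.InetSocketAddress;
import java.net.URI;
import java.nio.charset.StandardCharsets;

public class CookieRedirectDemo {

    /** Hypothetical toy server: "/" sets a cookie and redirects to "/page",
     *  and "/page" only serves content if the cookie comes back. */
    public static HttpServer startServer() throws IOException {
        HttpServer server = HttpServer.create(new InetSocketAddress("127.0.0.1", 0), 0);
        server.createContext("/", exchange -> {
            if (exchange.getRequestURI().getPath().equals("/page")) {
                String cookie = exchange.getRequestHeaders().getFirst("Cookie");
                boolean ok = cookie != null && cookie.contains("session=abc");
                byte[] body = (ok ? "content" : "Session Cookie Error")
                        .getBytes(StandardCharsets.UTF_8);
                exchange.sendResponseHeaders(200, body.length);
                exchange.getResponseBody().write(body);
            } else {
                // Cookie and redirect arrive together in the same response.
                exchange.getResponseHeaders().add("Set-Cookie", "session=abc; Path=/");
                exchange.getResponseHeaders().add("Location", "/page");
                exchange.sendResponseHeaders(302, -1);
            }
            exchange.close();
        });
        server.start();
        return server;
    }

    /** Follows redirects by hand, optionally resending the last cookie received. */
    public static String fetch(String url, boolean resendCookies) throws IOException {
        String cookie = null;
        for (int hop = 0; hop < 5; hop++) {
            HttpURLConnection conn =
                    (HttpURLConnection) URI.create(url).toURL().openConnection();
            conn.setInstanceFollowRedirects(false);
            if (resendCookies && cookie != null) {
                conn.setRequestProperty("Cookie", cookie);
            }
            int status = conn.getResponseCode();
            String setCookie = conn.getHeaderField("Set-Cookie");
            if (setCookie != null) {
                cookie = setCookie.split(";", 2)[0]; // keep only "name=value"
            }
            if (status / 100 == 3) {
                String location = conn.getHeaderField("Location");
                if (location == null) {
                    throw new IOException("redirect without Location header");
                }
                url = URI.create(url).resolve(location).toString();
                conn.disconnect();
                continue;
            }
            try (InputStream in = conn.getInputStream()) {
                return new String(in.readAllBytes(), StandardCharsets.UTF_8);
            }
        }
        throw new IOException("too many redirects");
    }

    public static void main(String[] args) throws IOException {
        HttpServer server = startServer();
        try {
            String base = "http://127.0.0.1:" + server.getAddress().getPort() + "/";
            System.out.println("without cookies: " + fetch(base, false));
            System.out.println("with cookies:    " + fetch(base, true));
        } finally {
            server.stop(0);
        }
    }
}
```

Against this toy server, the fetch that drops the cookie gets the error page and the one that resends it gets the content. That is the behaviour to compare against when a real site still returns a cookie error with both options enabled: it suggests the cookie from the redirecting response is not reaching the follow-up request, or that the site checks something else as well.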