From: Bruce D. <bd...@qa...> - 2005-01-26 19:29:17
I added the cookies_input_file and disable_cookies elements to the config file for this site as follows:

# enable cookies support
disable_cookies: false
cookies_input_file: /apps/www/htdig/conf/cookies.txt

I then added a single line to the cookies.txt file as follows:

testintranetportal TRUE /portal FALSE 0 epicentric d97692f64e7b540b0f504e238790307a

I got this info after clearing my cookies and then logging in to the site. There is also a session id cookie (JSESSIONID) which gets set, but it is only persistent for the session. The "epicentric" cookie seems to be the one which contains persistent login information and the value in the line above is what was stored in my browser for my login.

However, this still does not get me past the login redirect....

Here's some info from the htdig verbose output:

rundig: Start time: Wed Jan 26 14:13:43 EST 2005
ht://dig Start Time: Wed Jan 26 14:13:43 2005
Importing Cookies input file /apps/www/htdig/conf/cookies.txt
Cookies that have been correctly imported from: /apps/www/htdig/conf/cookies.txt
1. epicentric: d97692f64e7b540b0f504e238790307a (Domain: testintranetportal)

......This tells me the htdig cookies file is being read correctly.......
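
The single cookies.txt line above follows the Netscape cookie-file layout of seven fields (domain, subdomain flag, path, secure flag, expiry, name, value). As a rough cross-check, and not a description of how htdig itself parses or matches cookies, a minimal Python sketch can split that line and test whether the cookie would apply to the document URL seen in the verbose output. It assumes the fields are tab-separated in the real file; the names `Cookie`, `parse_cookie_line` and `applies_to` are hypothetical.

```python
from typing import NamedTuple
from urllib.parse import urlsplit


class Cookie(NamedTuple):
    domain: str
    include_subdomains: bool
    path: str
    secure: bool
    expires: int
    name: str
    value: str


def parse_cookie_line(line: str) -> Cookie:
    """Split one Netscape-format cookie line into its seven fields."""
    fields = line.rstrip("\r\n").split("\t")
    if len(fields) != 7:
        raise ValueError(f"expected 7 tab-separated fields, got {len(fields)}")
    domain, subdomains, path, secure, expires, name, value = fields
    return Cookie(domain, subdomains == "TRUE", path, secure == "TRUE",
                  int(expires), name, value)


def applies_to(cookie: Cookie, url: str) -> bool:
    """Simplified domain/path/secure match; real clients apply stricter rules."""
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()
    domain = cookie.domain.lstrip(".").lower()
    host_ok = host == domain or (
        cookie.include_subdomains and host.endswith("." + domain))
    path_ok = (parts.path or "/").startswith(cookie.path)
    scheme_ok = parts.scheme == "https" or not cookie.secure
    return host_ok and path_ok and scheme_ok


# The cookie line from this message, with tabs assumed between fields.
LINE = ("testintranetportal\tTRUE\t/portal\tFALSE\t0\t"
        "epicentric\td97692f64e7b540b0f504e238790307a")
cookie = parse_cookie_line(LINE)
print(cookie.name, applies_to(
    cookie, "http://testintranetportal/portal/site/inside-test/index.jsp"))
```

Under these simplified rules the domain and path both match, so a mismatch there is probably not what triggers the redirect. The expiry field of 0 is worth a second look, since different clients treat it differently.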

Try to get through to host testintranetportal (port 80)
2 - Open of the connection ok
Assigning the server (testintranetportal) to the TCP connection
Assigned the remote host testintranetportal
Assigning the port (80) to the TCP connection
Assigned the port 80
Connecting via TCP to (testintranetportal:80)
New connection open successfully
Header line: HTTP/1.1 302 Found
Header line: Server: Microsoft-IIS/5.0
Header line: Date: Wed, 26 Jan 2005 19:09:14 GMT
Header line: X-Powered-By: ASP.NET
Discarded header line: X-Powered-By: ASP.NET
Header line: Connection: close
Header line: Server: WebSphere Application Server/5.0
Header line: Set-Cookie: JSESSIONID=00003WHZUFZ25GHUYKKT0E5Y5LI:-1;Path=/

........This tells me that a TCP connection can be made to the server...and in fact the server sets JSESSIONID.......

Retrieving document /portal/site/inside-test/index.jsp on host: testintranetportal:80
Http version : HTTP/1.1
Server : HTTP/1.1
Status Code : 302
Reason : Found
Access Time : Wed, 26 Jan 2005 19:09:14 EST
Modification Time : Wed, 26 Jan 2005 19:13:44 EST
Content-type : text/html; charset=UTF-8
Content-Language : en-US
Connection : close
Persistent connection: not accepted
Body not retrieved
2 - Connection closed (No persistent connection)
Request time: 0 secs
Contents:
Content Type: text/html; charset=UTF-8
Content Length: -1
Modification Time: 2005-01-26 19:13:44 EST
redirect
redirect: http://testintranetportal/portal/site/inside-test/index.jsp?epi-content=LOGIN
resolving 'http://testintranetportal/portal/site/inside-test/index.jsp?epi-content=LOGIN'
pushing http://testintranetportal/portal/site/inside-test/index.jsp?epi-content=LOGIN

......This tells me that htdig is trying to retrieve the correct document (index.jsp)....I'm not sure what the deal is with the persistent connection error??.......

......Then you can see that there is a redirect to the LOGIN page......
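
The 302 to the LOGIN page can also be reproduced outside htdig, which helps separate "the cookie is not being sent" from "the server does not accept this cookie". The Python sketch below sends a single GET with a chosen set of cookies and reports the status and Location header without following redirects. It is an illustration only and not part of htdig; the helper names `cookie_header` and `probe` and the default user-agent string are assumptions.

```python
import http.client
from typing import Dict, Optional, Tuple


def cookie_header(cookies: Dict[str, str]) -> str:
    """Join name/value pairs into a single Cookie request header value."""
    return "; ".join(f"{name}={value}" for name, value in cookies.items())


def probe(host: str, path: str, cookies: Dict[str, str], port: int = 80,
          user_agent: str = "cookie-probe/0.1") -> Tuple[int, Optional[str]]:
    """Send one GET and return (status, Location); redirects are not followed."""
    headers = {"User-Agent": user_agent}
    if cookies:
        headers["Cookie"] = cookie_header(cookies)
    conn = http.client.HTTPConnection(host, port, timeout=10)
    try:
        conn.request("GET", path, headers=headers)
        response = conn.getresponse()
        return response.status, response.getheader("Location")
    finally:
        conn.close()


# Example calls, to be run from a machine that can reach the intranet host:
#   path = "/portal/site/inside-test/index.jsp"
#   probe("testintranetportal", path, {})
#   probe("testintranetportal", path,
#         {"epicentric": "d97692f64e7b540b0f504e238790307a"})
```

If both calls come back as 302 with the LOGIN location, the stored epicentric value alone is probably not enough for the server (it might also want the JSESSIONID cookie or a particular user agent). If only the cookie-less call redirects, the problem is more likely on the htdig side.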

Thereafter there are quite a few lines of similar content since each URL I'm trying to dig gets redirected in the same manner as above. At the end, the login page is actually read and indexed....but that's the only page.

Perhaps this info provides some detail that might be helpful in further diagnosing the problem. I still haven't heard back from my colleague who is supposed to be contacting Vignette.

Is there any way to tell where/how/if htdig is attempting to set the cookie or pass it to the host/server? I didn't see anything in the log file about that.

Thanks for your help.

Bruce


Neal Richter <ne...@ri...>
01/24/2005 08:51 PM

To
Bruce DeYoung <bd...@qa...>
cc
htd...@li...
Subject
Re: [htdig-dev] htDig and Vignette??

On Mon, 24 Jan 2005, Bruce DeYoung wrote:

> OK. Here's the URL at the login page:
>
> http://testintranetportal/portal/site/insideQAD/index.jsp?epi-content=LOGIN
>
> Then, after logging in, here are a couple of URLs of content pages:
>
> http://testintranetportal/portal/site/insideQAD/index.jsp?front_door=true&epi_menuItemID=17b4d03e0ebb0d03c0bc8ed22890307a&epi_menuID=557c013f162725a5c2046e478790307a&epi_baseMenuID=557c013f162725a5c2046e478790307a
>
> and
>
> http://testintranetportal/portal/site/insideQAD/index.jsp?front_door=true&epi_menuItemID=8853e4e036d9d40ecfd048922890307a&epi_menuID=b65bac56c452abf6aeda32202890307a&epi_baseMenuID=557c013f162725a5c2046e478790307a

ha ha.. this is almost as opaque as it gets.

I've solved your issue before via the cookies file and rewriting the URL.. but those URLs are not very informative.

For those URLs you need information on how they tell the CGI what to do and 'is there a sessionid buried in there'? Can you get this from Vignette or the people that connected Vignette to whatever CGI/ASP/JSP software produces the website?

The main question you want to answer is this: Do I need to do anything to those URLs so that after a user clicks on a search result they are able to view that page without screwing up my reporting?

A simple test would be to log into the site with one browser and 'cut' a link URL. Then open up a second browser (a different one, not two IE windows) like Firefox (with the cookies all cleared) and paste the URL into it. What happens?

Will the search box be 'behind' the login screen? i.e. will the users already be logged in before they do their first search?

Anyway, things to think about.

Thanks

> Thanks again,
>
> Bruce
>
>
> Neal Richter <ne...@ri...>
> 01/24/2005 12:59 PM
>
> To
> Bruce DeYoung <bd...@qa...>
> cc
> htd...@li...
> Subject
> Re: [htdig-dev] htDig and Vignette??
>
>
> On Sun, 23 Jan 2005, Bruce DeYoung wrote:
>
>> Thanks Neal for the reply. Unfortunately, I cannot provide a link to the
>> site since it is an intranet site only....at this time.
>
> Post it anyway so I can take a look at its structure. Post the login
> URL then the first URL you see after a successful login.
>
>> My suspicion about this is that Vignette security is handled differently
>> than, say, standard Apache security. Using the -u option with htdig and
>> supplying an authenticated user for our Apache-based sites works fine. I'm
>> not sure how Vignette authentication works, but I do know that when you
>> attempt to access the site, if your login cookie is not set, it will
>> redirect to a login page and request authentication information.
>
> Open your cookies file in the browser and clear anything associated with
> this website, then log back in to the website and check the cookies.
>
>> I've asked our Vignette developer to request some assistance from Vignette
>> support as well.
>>
>> When you say "make sure cookie support is enabled", are you referring to
>> something in Vignette or in htDig?
>
> I assume you are using HtDig 3.2B6
>
> Look at the cookies_input_file & disable_cookies settings in HtDig.
> The disable_cookies setting is 'true' by default.
>
> My gut feeling is that it's setting a cookie. You can take the
> contents of the cookie that the browser stores and load it in to the
> HtDig indexer via the cookies_input_file.
>
> It may also be that the software checks the 'user_agent' string supplied
> by the browser/indexer and may disallow access if you aren't running a
> certain version of browser.
>
> You can fake this by setting the user_agent in HtDig to be the string
> supplied by IE. Get it from your apache server weblogs.
>
> I've seen both of these problems and worked around them this way.
>
>> And, I understand what you're saying about using the rewrite rules...and I
>> think you're right about that one. So, once I'm able to dig the site, I
>> will look at the URL references and create a url_rewrite rule to remove
>> the session information.
>
> Thanks.
>
> --
Neal Richter
Knowledgebase Developer
RightNow Technologies, Inc.
Customer Service for Every Web Site
Office: 406-522-1485
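
The rewrite rule discussed in the quoted exchange would strip session data from URLs before they are stored in the index. What that session data looks like on this site is not known from the thread. As an assumption, the sketch below handles only the generic Java servlet case, where the container appends `;jsessionid=...` to the path when cookies are unavailable. It is written in Python purely to illustrate the pattern; `strip_jsessionid` is a hypothetical helper, and the sample URL is made up by combining the document URL with the session value from the Set-Cookie line in the log.

```python
import re

# Matches a servlet-style path parameter such as ";jsessionid=ABC123:-1",
# stopping at the query string, fragment, or next path parameter.
_JSESSIONID = re.compile(r";jsessionid=[^?#;]*", re.IGNORECASE)


def strip_jsessionid(url: str) -> str:
    """Remove any ;jsessionid=... path parameter, leaving the rest intact."""
    return _JSESSIONID.sub("", url)


# Hypothetical URL: the thread does not show session ids embedded in URLs.
sample = ("http://testintranetportal/portal/site/inside-test/index.jsp"
          ";jsessionid=00003WHZUFZ25GHUYKKT0E5Y5LI:-1?epi-content=LOGIN")
print(strip_jsessionid(sample))
```

In htdig itself the same idea would be expressed as a regex and replacement pair in the rewrite configuration; the exact attribute name and regex syntax should be checked against the attribute reference for the installed version.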