Re: [Phplib-users] Google, Session4 and block_alien_sid
From: Nathaniel P. <np...@te...> - 2004-12-07 17:50:25
Virilo Tejedor wrote:

> After several attempts, Google bot finally visits my whole site (also
> the dynamic pages).
>
> The problem is that Google has indexed something like:
> mysite.com/article.php?article_id=111&Mi_Session=843e8bd410a726f15f63d0dfcc7da532
>
> I'm using phplib 7.4 with the 'session4.inc' session class. I have noticed
> that there is no block_alien_sid flag like in 'session.inc'. So all
> visitors linked by Google are using the same session.

Since session4.inc uses PHP's built-in session handling, the problem is with that, not with PHPlib. You'll want to take a good look at PHP's documentation on it:

http://us2.php.net/manual/en/ref.session.php

In your case, as long as your sessions expire within a reasonable amount of time (i.e. a few hours at most), it shouldn't be a widespread problem. PHP will simply discard the session in the URL if it has expired and create a new one.

> I have thought of blocking alien sessions by clearing this string from the
> URL. But I can't manage a list of forbidden session ids, because there are
> many bots, and they use a new session each time.
>
> One possible solution could be "ip-blocking". I have read that this
> isn't the best solution for session hijacking, due to proxies, but it can
> solve my problem with Google.
>
> Is there a better solution? Or any implementation for ip-blocking?

There are a couple of possibilities. (Be aware, I have no experience using these settings; you might want to ask one of the general PHP lists about this for a firsthand account.)

First, if your site doesn't require sessions to work properly, or you're willing to limit sessions to only clients that support cookies, you can fix this problem by setting session.use_only_cookies to true, which disables the session ID in the URL. This is probably one of the better solutions, as most users who have cookies turned off are likely aware of the issues involved in not accepting cookies.
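As a rough sketch of the cookie-only approach, the settings could be applied at runtime before the session is started. This assumes PHP 4.3.0 or later (where session.use_only_cookies exists) and that nothing has started the session yet. Turning off session.use_trans_sid as well is an assumption beyond what is described above; it is meant to stop PHP from rewriting links to carry the ID. Depending on the PHP version, that second setting may only be changeable in php.ini or .htaccess rather than through ini_set().

```php
<?php
// Must run before session_start() (with phplib, before session4.inc starts the session).

// Ignore any session ID that arrives in the URL; accept it from cookies only.
ini_set('session.use_only_cookies', '1');

// Assumption: also stop PHP from appending the session ID to generated links.
ini_set('session.use_trans_sid', '0');
```

The same two values can be placed in php.ini instead, which avoids any ordering issues with the code that starts the session.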
However, if googlebot (which doesn't care about cookies, AFAIK) needs to have its own session in order to properly index your site, this will cause problems. (However, one might argue that you have a flawed design if that is the case.)

Another possibility would be to use session.referer_check, set to your website address. However, this would likely keep sessions from working on clients that either send an empty referrer or spoof it for privacy reasons. I don't know if googlebot is such a client, so again, this may cause problems if googlebot must have a session to index your site.

You could also disable or destroy sessions when the client looks like googlebot or some other search bot (identified by IP or user agent string, perhaps).

Another possibility is to store the user's IP address with the session when it is first created (is that what you meant by 'IP blocking'?), then make sure that IP address matches each time the session is called back up. However, this can cause problems if you can't depend on the user of the site to keep a single IP address for the duration of the session (not uncommon with large ISPs that use proxies, such as AOL). This can be mitigated to some degree by only matching the first 2 or 3 octets of the address.

A related topic is also discussed in this thread on the PHP-General mailing list:

http://marc.theaimsgroup.com/?t=102722998300003&r=1&w=2

Again, I'd recommend posing the question to one of the PHP mailing lists for more specific answers.

Hope this helps.

--
___________________________
Nathaniel Price
http://www.tesserportal.net
Webmaster
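The IP-matching idea described above (store the address when the session is created, then compare only the first 2 or 3 octets on later requests) might look roughly like the following. This is only a sketch under assumptions: ip_prefix() and session_ip_matches() are hypothetical helper names, the 'ip_prefix' key is made up for illustration, only dotted IPv4 addresses are handled, and the session data is assumed to be a plain array such as $_SESSION.

```php
<?php
// Hypothetical helper: keep only the first $octets octets of a dotted IPv4 address.
function ip_prefix($ip, $octets = 3)
{
    return implode('.', array_slice(explode('.', $ip), 0, $octets));
}

// Hypothetical helper: returns true if the session may be used from $ip.
// The first request stores the address prefix in the session data;
// every later request must present the same prefix.
function session_ip_matches(&$session, $ip, $octets = 3)
{
    $prefix = ip_prefix($ip, $octets);
    if (!isset($session['ip_prefix'])) {
        $session['ip_prefix'] = $prefix;
        return true;
    }
    return $session['ip_prefix'] === $prefix;
}
```

After session_start(), one could call session_ip_matches($_SESSION, $_SERVER['REMOTE_ADDR']) and, on a mismatch, clear the session data and issue a fresh ID (for example with session_regenerate_id()), so that a visitor arriving through an indexed link gets a session of their own.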