I have a Google API key (SOAP), and from trawling through the code I could not see how to enter or use it, but I found a bit from blootbot.
Here is my svn diff to current on how I entered it:
--- -- ---
Index: Modules/W3Search.pl
===================================================================
--- Modules/W3Search.pl (revision 1832)
+++ Modules/W3Search.pl (working copy)
@@ -29,7 +29,16 @@
     return unless &::loadPerlModule("WWW::Search");
-    eval { $Search = new WWW::Search( $where, agent_name => 'Mozilla/4.5' ); };
+    eval {
+        if ($where eq 'Google') {
+            # key is your Google API key.
+            # Get it from http://api.google.com/createkey
+            $Search = new WWW::Search('Google', key => 'GOOGLE_SOAP_API_KEY');
+            #$Search = new WWW::Search('Google', key => &::IsParam('googleAPIkey') );
+        } else {
+            $Search = new WWW::Search( $where, agent_name => 'Mozilla/4.5' );
+        }
+    };
     if ( !defined $Search ) {
         &::msg( $::who, "$where is invalid search." );
--- -- ---
I wanted to make it a parameter, but the search came back as a bad key. If there are any suggestions on that point, I would appreciate them.
-D
Unfortunately, none of us have those ancient keys anymore to test with. See:
http://groups.google.com/group/Google-AJAX-Search-API/browse_thread/thread/5b332a153a3cb99b
That said, I have doubts about Google searches using that interface. The bot module would likely be better rewritten to parse the HTML itself, or to use a Perl module like WWW::Scraper, but seeing as that seems to require a forced install from CPAN, I have my doubts about that as well.
Another important thing to consider is the Google terms of service:
http://www.google.com/terms_of_service.html
As well, the author of WWW::Scraper::Google has this to say:
-----
Please note that using the Google Scraper module (may) be a violation of Google’s "Terms of Service", of which your humble author has been repeatedly reminded. The TOS is not as easy to locate as some of these correspondents have suggested (without a smile), but you can find the TOS at http://www.google.com/terms_of_service.html
Briefly, the relevant part is the "No Automated Querying" section. It’s a kind of "do as I say, not as I do" dictum. Your author has tried to divine exactly what it means. On the surface it’s pretty clear, but if you follow the thread you will realize that it doesn’t lead to a place any of us want to be. However, Google Inc’s desire is clear enough. They do not want to be *abused* for the exclusive benefit of someone else.
Scraper is not a tool well suited for this kind of abuse. It is designed to be generally configurable and, as such, it is not particularly efficient. It obeys the "robot.txt" rules published by the web-server. It would require some effort on a user’s part to circumvent this feature. The Google.pm does not do a "meta-search" on Google. Even if your humble author removed Google.pm from the Scraper suite, it would be trivially easy for someone to build a Google module for Scraper (their format is very simple compared to others).
I believe that Google Inc. understands a little interloping (in moderation) is beneficial to all. I should note that Google Inc. has not notified your author of any concern on their part. This has been done by third parties who, for whatever reasons of their own, feel it necessary to interject themselves in others’ disputes, even when no such dispute exists.
Keep in mind that this is Google’s livelihood. Should your use of Scraper be your hobby, or even part of your livelihood, remember it never helps to hit someone where they live. They will defend themselves to the death (even if that death is yours).
Scraper is a handy little tool for getting to stuff you can’t get to otherwise. Let’s keep it that way!
-----
All that being said, if you can demonstrate a working patch against svn trunk@HEAD, drop by IRC and show us, and it might be accepted as a patch. Otherwise, you might be better off waiting until we sit down and discuss the fate of Google searches in the bot.
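On the "bad key" result when reading the key from a parameter: one possible cause, and this is an assumption rather than something tested here, is that &::IsParam only reports whether a parameter is set and returns a true/false flag, so the commented-out line would send that flag as the key instead of the key itself. The untested sketch below shows the idea in isolation. The %param hash, the IsParam stand-in, and the search_args helper are all hypothetical stand-ins for illustration, not the bot's real code; in the module itself the value would presumably be read as $::param{'googleAPIkey'}.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Stand-in for the bot's configuration hash (hypothetical value).
our %param = ( googleAPIkey => 'GOOGLE_SOAP_API_KEY' );

# Stand-in for the bot's IsParam helper. ASSUMPTION: it only reports
# whether a parameter is set and never returns the value itself.
sub IsParam {
    my ($name) = @_;
    return ( defined $param{$name} && length $param{$name} ) ? 1 : 0;
}

# Hypothetical helper: build the argument list that would be handed
# to WWW::Search->new for a given backend name.
sub search_args {
    my ($where) = @_;
    if ( $where eq 'Google' && IsParam('googleAPIkey') ) {
        # Pass the stored value, not IsParam's true/false result.
        return ( 'Google', key => $param{'googleAPIkey'} );
    }
    # Every other backend keeps the original arguments from the module.
    return ( $where, agent_name => 'Mozilla/4.5' );
}

print join( ' ', search_args('Google') ), "\n";
```

If the parameter is unset, this falls back to the plain agent_name form, so a missing key degrades to the old behaviour rather than sending an empty key.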