From: ted p. <tpederse@d.umn.edu> - 2005-07-23 03:18:27
|
I am happy to report that the final version of Prath's MS thesis has been submitted, and he is now officially graduated! Yippee! :)

Prath is of course the founding developer of GoogleHack, which is available here:

http://google-hack.sourceforge.net

You can find Prath's thesis on "Identifying Sets of Related Words from the World Wide Web" here:

http://www.d.umn.edu/~tpederse/Pubs/prath-thesis.pdf

Prath is now employed at Thomson/West Publishing in the Twin Cities, but will still be involved in the future of GoogleHack.

Congratulations!
Ted

--
Ted Pedersen
http://www.d.umn.edu/~tpederse |
From: ted p. <tpederse@d.umn.edu> - 2005-06-06 13:02:36
|
COMPUTER SCIENCE COLLOQUIUM

Identifying Sets of Related Words on the World Wide Web

PRATHEEPAN RAVEENDRANATHAN
Computer Science Graduate Student

Thursday, June 9, 2005
1:00 p.m.
HELLER HALL 306

ABSTRACT

As the Internet keeps growing, the number of Web pages indexed by commercial search engines such as Google increases rapidly. Currently, Google reports that it indexes over 8 billion Web pages. The information available through the Web is very diverse, from publications to electronic encyclopedias to information about products. In short, the Web is vast.

Until recently, the Web has not been used to acquire information about words in order to better understand Natural Language. However, we believe that there is a need to develop methods that take advantage of the huge amount of information on the Web. Hence, this thesis focuses on finding sets of related words by using the World Wide Web.

This thesis presents three new methods for using Web search results to find sets of related words. We rely on the Google API to obtain search engine results, but in principle these methods can be used with any search engine. They rely on pattern matching techniques in addition to various measures of relatedness that we have developed.

In addition to finding sets of related words, we also explore the problem of Sentiment Classification. This was motivated by a desire to find a practical application for the sets of related words we discover. As such, we extend the Pointwise Mutual Information - Information Retrieval (PMI-IR) measure described in (Turney, 2002) to be used with Google in order to discover sets of related words. These sets are then used as seeds in our Sentiment Classification algorithm. |
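For readers unfamiliar with PMI-IR, the idea behind the measure in (Turney, 2002) can be sketched from search-engine hit counts alone. The sketch below is only an illustration under stated assumptions: the function names, the argument names, and the example counts are hypothetical, the counts are passed in as plain numbers rather than fetched from any API, and it does not reproduce the specific extension developed in the thesis.

```python
import math


def pmi(hits_xy, hits_x, hits_y, total_pages):
    """Pointwise mutual information (in bits) estimated from page counts.

    hits_xy     -- pages on which both terms occur (hypothetical input)
    hits_x/y    -- pages on which each term occurs on its own
    total_pages -- assumed size of the search engine's index
    """
    if min(hits_xy, hits_x, hits_y, total_pages) <= 0:
        raise ValueError("all counts must be positive")
    # p(x, y) / (p(x) * p(y)) with p(.) = hits / total_pages
    return math.log2((hits_xy * total_pages) / (hits_x * hits_y))


def semantic_orientation(near_pos, near_neg, hits_pos, hits_neg, smoothing=0.01):
    """Orientation of a phrase as PMI(phrase, positive) - PMI(phrase, negative).

    near_pos/near_neg -- co-occurrence counts of the phrase with a positive
                         and a negative seed word
    hits_pos/hits_neg -- counts of the seed words on their own
    smoothing         -- small constant added to the co-occurrence counts so
                         that a zero count does not divide by zero; applying
                         it only to these two counts is an assumption here
    """
    return math.log2(
        ((near_pos + smoothing) * hits_neg) / ((near_neg + smoothing) * hits_pos)
    )


if __name__ == "__main__":
    # Made-up counts, for illustration only.
    print(pmi(200, 1000, 1000, 10000))
    print(semantic_orientation(30, 10, 1000, 1000))
```

A positive orientation suggests the phrase co-occurs more with the positive seed than with the negative one. The index size cancels out of the orientation score, which is why `semantic_orientation` does not take it as an argument.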
From: ted p. <tpederse@d.umn.edu> - 2005-04-08 01:09:03
|
Greetings all,

There have been a number of improvements made to Google-Hack over the last few weeks, and we'd encourage you to make sure you are using the most recent version (which is now 0.12).

There is some new functionality, and we have also been working on improving the documentation. Several different measures are now supported, and we show results using both the frequencies and the scores of these measures. We also now support bigrams as input and output.

Find the new version on CPAN:

http://search.cpan.org/dist/WebService-GoogleHack/

And links to the web interface, CPAN, SourceForge, etc. at:

http://www.d.umn.edu/~tpederse/googlehack.html

Enjoy, and let us know if you have any questions.

--
Ted Pedersen
http://www.d.umn.edu/~tpederse |
From: ted p. <tpederse@d.umn.edu> - 2005-01-03 20:50:35
|
We are happy to announce the release of Google-Hack version 0.06, available from CPAN or SourceForge. You can find links to both here:

http://www.d.umn.edu/~tpederse/googlehack.html

or at

http://google-hack.sourceforge.net

This release features a number of improvements, including a ranking score that is displayed for the related words that are returned, and updated README and installation instructions.

You can also try the new version via the web interface. That can be found at:

http://www.d.umn.edu/~rave0029/cgi-bin/ghack/google_hack.cgi

Please check this out, and let us know what you think!

Thanks,
Ted and Prath

--
Ted Pedersen
http://www.d.umn.edu/~tpederse |
From: ted p. <tpederse@d.umn.edu> - 2005-01-01 18:20:41
|
I have updated

http://www.d.umn.edu/~tpederse/googlehack.html

to include a few more links and information about the package. I'll use this when I announce new releases, as it points to both SourceForge and CPAN.

Let me know if there is any additional information I should include on the page!

Ted

--
Ted Pedersen
http://www.d.umn.edu/~tpederse |