| Name | Modified | Size | Downloads / Week |
|---|---|---|---|
| README.txt | 2010-02-06 | 1.4 kB | |
| parserd.php | 2010-02-06 | 2.7 kB | |
| logger.php | 2010-02-06 | 2.0 kB | |
| Totals: 3 Items | | 6.1 kB | 0 |
The project aims to do two things. First, track the total transit time it takes a news story to travel from site A to site Z across the internet (total transit time = t3, get it? :P). Second, index the content of the article at each hop and track how the text of the article (or its summaries) morphs as it traverses the web. The goal is to gather some interesting data sets on how quickly news flows across the net and how it changes based on the genre of news site it hits (or whether it changes at all).

The technologies used are as follows:

- Main parsing system: PHP-CLI interfacing with a TokyoCabinet / TokyoTyrant database
- Main front end (web): PHP
- Controller scripts (add/remove/edit the news sites we watch): PHP-CLI / Perl

If you are interested in helping the project out with code, database design, or anything else that may be of use, please visit our SourceForge page at http://www.sourceforge.net/projects/t3study and get in touch!

Files:

parserd.php - The main application. It creates a listening daemon on the server for the (soon to come) fetching+dumping app to dump its data to. The daemon parses the required information (author, title, body, post date) out of the dumped articles and writes it to the DB. A rough sketch of this flow appears at the bottom of this README.

logger.php - Serves as the function-includes file for the project (a small logging sketch also appears at the bottom of this README).

README.txt - This file :-)

- Mike <mb1689@gmail.com>
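As a rough illustration of the parserd.php flow described above, here is a minimal sketch of a PHP-CLI daemon that accepts a dumped article over TCP, pulls out a few fields, and stores the record through the PECL tokyo_tyrant extension. The port number, the "field: value" dump format, and the key scheme are assumptions made for illustration only; they are not the project's actual protocol.

```php
<?php
// Minimal sketch only: assumes the PECL tokyo_tyrant extension is installed
// and a ttserver instance is listening on localhost:1978. Port 9090 and the
// "field: value" line format are illustrative assumptions, not t3's protocol.

$tt = new TokyoTyrant('localhost', 1978);

$server = stream_socket_server('tcp://0.0.0.0:9090', $errno, $errstr);
if ($server === false) {
    fwrite(STDERR, "listen failed: $errstr ($errno)\n");
    exit(1);
}

while ($conn = stream_socket_accept($server, -1)) {
    // Read the whole dump sent by the fetching+dumping app.
    $raw = stream_get_contents($conn);
    fclose($conn);

    // Crude extraction of the fields parserd.php is meant to capture.
    $article = array('author' => '', 'title' => '', 'body' => '', 'post_date' => '');
    foreach (array_keys($article) as $field) {
        if (preg_match('/^' . $field . ':\s*(.*)$/mi', $raw, $m)) {
            $article[$field] = trim($m[1]);
        }
    }

    // Key the record on title + post date and store the serialized article.
    $key = md5($article['title'] . '|' . $article['post_date']);
    $tt->put($key, serialize($article));
}
```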
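And a sketch of the kind of shared helper logger.php could provide to the other scripts. The function name and log file path here are guesses used for illustration, not the project's actual code.

```php
<?php
// Sketch of a shared logging helper of the sort logger.php provides.
// The function name and log file location are illustrative assumptions.

function t3_log($message, $level = 'INFO')
{
    $line = sprintf("[%s] %s: %s\n", date('Y-m-d H:i:s'), $level, $message);
    // Append to a project log file; adjust the path as needed.
    file_put_contents('/var/log/t3study.log', $line, FILE_APPEND);
}

// Example usage from parserd.php or the controller scripts:
t3_log('parserd started, waiting for article dumps');
t3_log('failed to parse post date', 'WARN');
```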