newsscrape scrapes news headlines from the web and analyses how each headline relates to its news category.

- It extracts headlines from the Google News RSS feed.
- Each news headline is matched against a Google News category, such as Entertainment or Sports.
- A scheduler calls it every 5 minutes, and the collected data accumulates in a database.
- It includes R scripts that learn which words in a headline point to a particular category.
- To test how accurately it predicts the category of a headline, take a news title from another source (e.g. http://rss.news.yahoo.com/rss/entertainment) and feed it into the R script, which outputs the category it predicts for that title.
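
The steps listed above can be sketched roughly as follows. This is only an illustration, not the project's actual code: it is written in Python rather than R, the `headlines` table and every function name are hypothetical, and the naive Bayes word-count classifier is an assumption, since the description does not say which statistical method the R scripts use. Fetching the feed over the network and the 5-minute scheduling are left out; the sketch starts from RSS text that has already been downloaded.

```python
import math
import re
import sqlite3
import xml.etree.ElementTree as ET
from collections import Counter, defaultdict


def parse_headlines(rss_xml):
    """Return the <title> text of every <item> in an RSS 2.0 document."""
    root = ET.fromstring(rss_xml)
    return [item.findtext("title", default="").strip() for item in root.iter("item")]


def store(conn, category, headlines):
    """Accumulate (category, headline) rows; repeated headlines are ignored."""
    # Hypothetical schema: the real database layout is not described.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS headlines ("
        "category TEXT, title TEXT, UNIQUE(category, title))"
    )
    conn.executemany(
        "INSERT OR IGNORE INTO headlines VALUES (?, ?)",
        [(category, h) for h in headlines],
    )
    conn.commit()


def tokenize(text):
    """Lowercase a headline and split it into words."""
    return re.findall(r"[a-z']+", text.lower())


def train(rows):
    """Count words per category from (category, headline) rows."""
    word_counts = defaultdict(Counter)
    doc_counts = Counter()
    for category, title in rows:
        doc_counts[category] += 1
        word_counts[category].update(tokenize(title))
    vocab = {w for counts in word_counts.values() for w in counts}
    return word_counts, doc_counts, vocab


def predict(model, headline):
    """Return the category with the highest naive Bayes log score."""
    word_counts, doc_counts, vocab = model
    total_docs = sum(doc_counts.values())
    best, best_score = None, -math.inf
    for category in doc_counts:
        counts = word_counts[category]
        # Laplace (add-one) smoothing so unseen words do not zero the score.
        denom = sum(counts.values()) + len(vocab)
        score = math.log(doc_counts[category] / total_docs)
        for word in tokenize(headline):
            score += math.log((counts[word] + 1) / denom)
        if score > best_score:
            best, best_score = category, score
    return best
```

In use, each scheduled run would call `parse_headlines` on one category's feed and pass the result to `store`; the rows read back with `SELECT category, title FROM headlines` go to `train`, and `predict` is then given a headline taken from another source.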

Categories

Web Scrapers

Additional Project Details

Registered

2016-07-15