#1 Ciao.es Web Scraping

Status: open
Owner: Jesica F
Labels: None
Priority: 7
Updated: 2011-10-20
Created: 2011-10-20
Creator: Jesica F
Private: No

Implement a web scraper that can download reviews in batch from the ciao.es site.

Functional:
* The scraper can be configured manually through configuration files.
* Download the review text and clean it. The scraper should also iterate automatically over a category page, downloading every opinion it finds.
* Extract the product rating so that each review can be assigned to a class. This is needed to train a classifier afterwards.
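
One possible shape for the configuration, cleaning, and labelling steps above is sketched below. Everything in it is an assumption made for illustration: the config section and key names, the placeholder category URL, the 1 to 5 rating scale, and the three class names are hypothetical and not taken from the ciao.es site. Tag stripping is done with a simple regular expression, which is enough for a sketch but not a full HTML parser.

```python
import configparser
import html
import re

# Hypothetical configuration; section name, keys and URL are placeholders.
EXAMPLE_CONFIG = """
[scraper]
category_url = http://www.ciao.es/some-category
max_pages = 5
positive_min_rating = 4
negative_max_rating = 2
"""


def load_config(text):
    """Parse the INI-style configuration text into a plain dict."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    section = parser["scraper"]
    return {
        "category_url": section["category_url"],
        "max_pages": section.getint("max_pages"),
        "positive_min_rating": section.getint("positive_min_rating"),
        "negative_max_rating": section.getint("negative_max_rating"),
    }


TAG_RE = re.compile(r"<[^>]+>")


def clean_review_text(raw_html):
    """Strip tags, decode HTML entities and collapse runs of whitespace."""
    text = TAG_RE.sub(" ", raw_html)
    text = html.unescape(text)
    return re.sub(r"\s+", " ", text).strip()


def rating_to_class(rating, config):
    """Map a numeric rating (assumed to be 1 to 5) to a class label."""
    if rating >= config["positive_min_rating"]:
        return "positive"
    if rating <= config["negative_max_rating"]:
        return "negative"
    return "neutral"
```

Keeping the thresholds in the configuration file means the class boundaries can be changed without touching the scraper code, which may matter once the classifier is being trained.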

Non-functional:
Performance requirements:
* Can run multiple scrape tasks asynchronously, to speed up data extraction.
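
The asynchronous requirement could be met in several ways; the sketch below uses Python's `asyncio` with a semaphore to cap the number of simultaneous downloads. The function names, the default limit of 4, and the UTF-8 decoding are assumptions for illustration, and `asyncio.to_thread` requires Python 3.9 or later. The blocking download is passed in as a parameter so that it can be swapped out, for example for a stub when testing.

```python
import asyncio
import urllib.request


def fetch_page(url):
    """Blocking download of one page; returns the decoded body.

    UTF-8 is an assumption here; the real site may use another encoding.
    """
    with urllib.request.urlopen(url, timeout=30) as response:
        return response.read().decode("utf-8", errors="replace")


async def scrape_all(urls, fetch=fetch_page, max_concurrent=4):
    """Download all URLs concurrently, at most max_concurrent at a time.

    Returns a list of (url, body) tuples in the same order as urls.
    """
    semaphore = asyncio.Semaphore(max_concurrent)

    async def scrape_one(url):
        async with semaphore:
            # Run the blocking fetch in a worker thread so that other
            # tasks can make progress while this one waits on the network.
            body = await asyncio.to_thread(fetch, url)
            return url, body

    return await asyncio.gather(*(scrape_one(u) for u in urls))
```

A caller would run it with something like `asyncio.run(scrape_all(list_of_review_urls))`, where `list_of_review_urls` is a hypothetical list collected from a category page. Limiting concurrency also keeps the load on the target site modest.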
