WebHarvest - web data extraction tool / Discussion / Help: How to scrape a website which prevents scraping

How to scrape a website which prevents scraping

Forum: Help

Creator: tarandeep sawhney

Created: 2013-09-11

Updated: 2013-09-22

tarandeep sawhney - 2013-09-11

Hi All

I am going to use WebHarvest to scrape around 3000 websites. I must congratulate all the committers on creating such a great library. Can you please help me understand the options WebHarvest offers for dealing with a website that blocks crawlers/scrapers? I anticipate that some of the websites will do this, so I want to understand how to handle it with the WebHarvest library.

Can you please provide your valuable help?

best regards
tarandeep

Maciej Czapiewski - 2013-09-22

Hi,

Can you give an example of a website you are talking about? What mechanism exactly do you mean by "blocked crawlers/scrapers"?

Cheers,
MC
