Name | Modified | Size | Downloads / Week |
---|---|---|---|
RoboBrowser.zip | 2016-09-18 | 47.2 MB | |
readme.txt | 2016-09-18 | 1.8 kB | |
Totals: 2 Items | | 47.2 MB | 0 |
RoboBrowser
Scrape the web while browsing

This is a WebKit-powered browser built for web scraping. It loads the requested web page, saves the page source to disk, and passes the saved file's path to a PHP script as the first parameter. You have to write your own script to collect data from each page. A sample is included with the application.

The application also supports JavaScript injection into the loaded web page to automate clicks. You can use the preloaded jQuery framework to simulate clicks and mouse events to approximate a headless setup.

##################################################

Pros:
+ You can watch the scraper and take action when it stalls.
+ You can solve CAPTCHAs yourself or enter your account credentials when a server asks for them.
+ It browses web pages like any regular visitor, using all available web standards. Servers can't tell whether you're a bot or a human.

Cons:
- Somewhat slow compared to headless spiders
- Uses more system resources
- Not designed for headless execution

Known Issues:
- Due to a renderer bug in CEF, scrollbars are drawn red. This is purely visual and doesn't affect the application's behaviour.
- The application sometimes crashes while shutting down. In that case you have to terminate it manually from the process manager.
- If the application won't start, make sure no other instance is running and delete the contents of the "cache" folder.

##################################################

Might be useful when scraping results from highly secured data sources such as search engines, live stock information, sports betting results, etc.

##################################################

It's designed for personal use. If you need extra upgrades, contact me at root@psychip.net

Armagan Corlu aka Psychip
http://psychip.net
Aug 2016
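
As an illustration of the JavaScript injection mentioned above, here is a minimal sketch of a snippet that uses the preloaded jQuery to simulate a click. It is not taken from the application or its bundled sample: the helper name `clickFirst` and the selector `#load-more` are made up for this example, and it assumes the page exposes jQuery as `window.jQuery`.

```javascript
// Hypothetical helper (not part of RoboBrowser): trigger a click on the
// first element matching `selector`, using the jQuery function passed as `$`.
function clickFirst($, selector) {
  var matches = $(selector);
  if (matches.length === 0) {
    // Nothing matched, so report it and let the caller decide what to do.
    return false;
  }
  matches.first().trigger("click");
  return true;
}

// When injected into a loaded page, call the helper with the page's jQuery.
// Assumption: jQuery is reachable as window.jQuery; "#load-more" is a
// placeholder selector and must be replaced with one from the target page.
if (typeof window !== "undefined" && window.jQuery) {
  clickFirst(window.jQuery, "#load-more");
}
```

Passing jQuery in as an argument keeps the helper easy to try outside the browser, and the `false` return value gives the injected script a way to notice that the expected element is missing instead of failing silently.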