File                  Date        Author  Commit
Windows-Version-1.1   2015-07-12  AJBBB   [d455bf] Updated Windows Version
LICENSE               2015-07-11  AJBBB   [80c344] Initial commit
README.md             2015-07-12  AJBBB   [bb2f48] Minor Changes
forumlog.txt          2015-07-11  AJBBB   [d5bdec] Minor Changes
mturklinks.html       2015-07-11  AJBBB   [d5bdec] Minor Changes
style.css             2015-07-11  AJBBB   [200da7] Update style.css
vB-mTurk-Scraper.py   2015-07-12  AJBBB   [f4fb64] Minor Changes

Read Me

vB-mTurk-Scraper

Scrapes vBulletin forums for links to mTurk HITs.

Written in: Python 2.x (not compatible with, or tested on, Python 3)


Requires: BeautifulSoup and requests

To install BeautifulSoup run:
pip install beautifulsoup4
OR: easy_install beautifulsoup4

To install requests run:
pip install requests
OR: easy_install requests

Information required to run program:

  • HITs Thread Number (changes daily; a 5-digit number found in the thread URL)
  • Forum URL
  • Page To Start From
  • Number Of Pages To Scrape

This is a very simple command-line script. Run python vB-mTurk-Scraper.py and it will guide you through setting the forum you want to scrape, entering today's thread number, choosing the page to start from, and choosing how many pages to go through.

Note: When entering the address to the forum, enter only the domain, ex: forum.com
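As a rough illustration of how the four answers could be combined into the list of pages to fetch, here is a minimal sketch. The function name build_page_urls is hypothetical, and the showthread.php?t=...&page=... layout is an assumption based on classic vBulletin URLs; the real script may build its URLs differently. It is written to run on both Python 2 and Python 3.

```python
def build_page_urls(domain, thread_number, start_page, page_count):
    """Return one URL per thread page, beginning at start_page."""
    # Only the bare domain is expected, but tolerate a pasted scheme or slash.
    domain = domain.strip()
    for prefix in ("http://", "https://"):
        if domain.startswith(prefix):
            domain = domain[len(prefix):]
    domain = domain.rstrip("/")

    # ASSUMPTION: classic vBulletin thread URLs; not taken from the script.
    template = "http://{0}/showthread.php?t={1}&page={2}"

    first = int(start_page)
    return [template.format(domain, thread_number, page)
            for page in range(first, first + int(page_count))]


# Example: thread 12345 on forum.com, starting at page 2, three pages.
for url in build_page_urls("forum.com", "12345", 2, 3):
    print(url)
```

Stripping the scheme here is only a convenience for the sketch; entering the bare domain, as the note says, avoids the question entirely.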

It outputs all HITs to the HTML file mturklinks.html, which it overwrites on every run, so there is no need to clear the file yourself.

The final question asks whether to write to forumlog.txt. If you answer True, it writes just the links to forumlog.txt so you can share them as plain text. Otherwise answer False. This file is also overwritten on every run, so there is no need to clear it yourself.
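The overwrite behaviour of both output files can be sketched as follows. This is an illustration under assumptions, not the script's actual code: write_outputs and its parameters are hypothetical names, and the HTML markup is a placeholder. The key point is that opening a file in "w" mode truncates it, which is why nothing needs clearing between runs.

```python
import io


def write_outputs(links, html_path="mturklinks.html",
                  log_path="forumlog.txt", write_log=False):
    """Write links to the HTML file and, optionally, to the plain-text log."""
    # Mode "w" truncates an existing file, so each run replaces the last.
    with io.open(html_path, "w", encoding="utf-8") as html_file:
        html_file.write(u"<html><body>\n")
        for link in links:
            # Escape characters that would break the href attribute.
            safe = link.replace("&", "&amp;").replace('"', "&quot;")
            html_file.write(u'<p><a href="{0}">{0}</a></p>\n'.format(safe))
        html_file.write(u"</body></html>\n")

    # The log is only touched when the user answered True.
    if write_log:
        with io.open(log_path, "w", encoding="utf-8") as log_file:
            for link in links:
                log_file.write(u"{0}\n".format(link))
```

Calling write_outputs(links, write_log=True) twice in a row leaves only the second run's links in both files.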

This is perfect for when you wake up! Run it on the HITs thread of any vBulletin mTurk forum you like, and you get an HTML file full of all the HITs you missed. (Note: it will include duplicates if people post HITs and others then quote them.)
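The script itself does not remove duplicates. If they get in the way, the collected links could be filtered afterwards with something like the sketch below. The helper dedupe_links is a hypothetical name and not part of vB-mTurk-Scraper.py; it keeps the first occurrence of each link and preserves posting order.

```python
def dedupe_links(links):
    """Return links with repeats removed, keeping first-seen order."""
    seen = set()
    unique = []
    for link in links:
        if link not in seen:
            seen.add(link)
            unique.append(link)
    return unique


# Example: the second link was quoted in a later post.
collected = [
    "https://www.mturk.com/one",
    "https://www.mturk.com/two",
    "https://www.mturk.com/two",
]
print(dedupe_links(collected))
```

This compares links as exact strings, so two URLs that differ only in parameter order or trailing characters would still both be kept.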

This has only been tested on Mturkforum.com and Turkernation.com, but any vBulletin forum that allows you to read posts without being logged in should work.

Note To Windows Users: Click "Download ZIP" on the right side of the page. Windows versions will be updated less frequently than the vB-mTurk-Scraper.py file.

MIT Licensed
