Showing 131 open source projects for "scraping"

  • 1
    DSTK - DataScience ToolKit

    DSTK - DataScience ToolKit for All of Us

    DSTK - DataScience ToolKit is free, open source software for statistical analysis, data visualization, text analysis, and predictive analytics. A newer version with a smaller file size is available at: https://sourceforge.net/projects/dstk3/ It is designed to be straightforward and easy to use, and familiar to SPSS users. While JASP offers more statistical features, DSTK tends to be a broad solution workbench, including text analysis and predictive analytics features. Of course you may specify...
    Downloads: 0 This Week
  • 2
    JAWS - Just Another Web Scraper

    A simple web scraper using regular expressions or HTML Agility Pack

    JAWS, or Just Another Web Scraper, is part of the data scraping software developed by SVbook, alongside JATI (Image to Text) and JAVT (Video to Text). JAWS offers an easy interface for scraping data from websites using regular expressions, text preprocessing, or HTML Agility Pack.
    Downloads: 0 This Week
  • 3
    Simple-Scrape is a simple web-scraping library that allows for programmatic access to HTML code. No further techniques are needed and the library is very compact and thus easy to use.
    Downloads: 0 This Week
  • 4
    RoboBrowser

    On-the-fly web scraper

    RoboBrowser is a WebKit-powered browser built for web scraping. It loads the requested webpage, saves the page source to disk, and passes its path to a PHP script as the first parameter.
    Downloads: 2 This Week
  • 5

    newsscrape

    Collects news headlines for analysis to determine their category

    newsscrape scrapes news headlines to analyse how they relate to a news category. - It extracts RSS feeds from Google News. - Each news headline is matched against a Google News category such as Entertainment, Sports, etc. - It is called from a scheduler to collect this data at 5-minute intervals and accumulate it in a database. - It contains R statistical computing scripts to learn the pattern of words in a headline that results in a particular category. - To test its accuracy in predicting...
    Downloads: 0 This Week
  • 6

    PAMIE

    A Python class to allow the user to automate Internet Explorer

    Python Automation Module (class) for Internet Explorer (PAM.py). Originally written as a simple Python module. Starting with 2.0, this new Python class allows the user to automate the Internet Explorer browser for QA testing, development testing, or web scraping. This Python class runs only on Windows and automates Internet Explorer using the COM object; there is no support for Firefox, Chrome, Safari or Flex at this time. This is not an application. Also check out the original...
    Downloads: 3 This Week
  • 7

    x3270if

    .NET DLL for 3270 Screen Scraping

    x3270if is a .NET DLL for 3270 host screen scraping. It is based on (and requires) a compatible copy of ws3270.exe.
    Downloads: 0 This Week
  • 8

    p2p proxy network

    P2P Proxy Network by Proxies.online

    There are a lot of great features of the P2P Proxy Network. The most vital is the one that we have been discussing, getting access to the servers that we have available so that you are always anonymous. You send data to one of our proxy servers and it exits out the other side via a completely different proxy server with a completely different IP address. That means that you always have a new IP address. Keep in mind, we have anywhere from 1,000 to 2,000 active proxy servers running at any...
    Downloads: 0 This Week
  • 9
    pastebin poster v1.0

    A small program to paste on pastebin.com and use external data

    VISIT YOUTUBE. This program has 3 functions: - Grab elements by tag name from a website and populate a text area. - Send a post to Pastebin as a guest or as an authenticated user with the help of the site API. - Shorten URLs using the Adfly API when an API key and uid are provided for a registered user.
    Downloads: 0 This Week
  • 10
    Node Crawler

    Web Crawler/Spider for NodeJS + server-side jQuery

    Most powerful, popular and production crawling/scraping package for Node, happy hacking.
    Downloads: 1 This Week
  • 11

    TXR

    Text scraping and data munging language.

    NOTE: TXR used SourceForge for hosting binary downloads only. As of July 26, 2016, TXR uses the site Bintray (bintray.com) for hosting binary downloads. Do not look for new releases here! TXR combines a text scraping language with an innovative Lisp dialect geared toward data munging. TXR cribs ideas from modern scripting languages, multiple Lisp dialects, functional languages, and Unix tools.
    Downloads: 0 This Week
  • 12
    A Microsoft(tm) RDP session recorder and/or child-safe browsing enforcer. Has a screen saver, keyboard logger, and screen scraping ability. RautorViewer replays sessions and can do textual searches on scraped content.
    Downloads: 3 This Week
  • 13

    ScraperEdit for XBMC

    XML bindings and a GUI for creating and editing XBMC Scrapers

    This program is an editor for creating XBMC Scrapers. It is similar to ScraperEditor, another editor using ScraperXML, which runs under the .NET environment. This program runs under Sun/Oracle's Java Runtime. HELP WANTED! I am looking for someone who would help me write documentation, such as a user's manual and online help. Also, if someone wants to help, translated language files are always welcome...
    Downloads: 0 This Week
  • 14
    ... to master. The standard webStraktor output format is XML based, either in ASCII, UTF-8 or ISO-8859-1 (Latin1) code pages. webStraktor relies on the Apache HttpClient for retrieving content via the HTTP protocol. It adheres to the Robots Exclusion Protocol and it can be configured to operate in an anonymous way by connecting to the predominant types of web proxy servers. webStraktor extends the functionality of web crawlers, spiders or bots by integrating scraping and crawling capabilities.
    Downloads: 0 This Week
  • 15

    scraping - CodeIgniter and SimpleHtmlDom

    Scraping with CodeIgniter, with cURL and SimpleHtmlDom

    This tutorial is about how to build a scraping library [based on cURL] for your CodeIgniter [CI] MVC Framework.
    Downloads: 0 This Week
  • 16
    IP Proxy Scraper

    IP Proxy Scraper lets you extract multiple proxies

    This lightweight yet powerful application extracts IPs and ports from a list of specified websites. If you are in need of multiple proxies, simply insert the desired website URLs and, with a single click, your proxies are gathered and presented to you in the output window, ready to be copied and saved. IP Proxy Scraper is also available for Linux; check it out here: https://sourceforge.net/projects/ipproxyscraperlinux/
    Downloads: 4 This Week
  • 17
    autolunch

    tools to parse and score lunch restaurant weekly menus

    Autolunch aims to be a set of tools to automate the recurring task of selecting the best place to have lunch. It will achieve this by various ways of scraping lunch menus from the web and scoring them on a set of metrics to determine the ideal place to eat based on the particular taste and geographical location of the user. This project is far from done at this point, and is not usable yet.
    Downloads: 0 This Week
  • 18

    Strigil

    Distributed web scraping tool

    Strigil is a software project started by students of the Faculty of Mathematics and Physics at Charles University in Prague as part of the OpenData initiative.
    Downloads: 0 This Week
  • 19
    SBDev

    Open torrent tracker

    MOVED: https://github.com/EdisonLV/sbdev_ci
    Downloads: 0 This Week
  • 20

    SQLDOM

    HTML parser and DOM-related procedures for Microsoft T-SQL

    SQLDOM is an easy and robust way to parse HTML directly into SQL tables, manipulate DOM nodes in a JQuery-like manner, and to render HTML from the SQL-based DOM. SQLDOM is written entirely in native T-SQL, and uses only temporary database objects (tempdb). No changes to user databases are required.
    Downloads: 1 This Week
  • 21
    pyShowRename

    Sensible batch renaming of downloaded TV files

    pyShowRename is a semi-automated batch file renamer which allows for controlled selection and maintenance/sanitisation of downloaded TV programs. The project was initially just meant to help the author get to grips with Python, but as of version 1.0 it is stable enough to be used regularly and has proven to be a very useful tool. pyShowRename interfaces directly with the free epguides.com website using HTML scraping and retains a local cache of available TV shows. This can be manually updated...
    Downloads: 0 This Week
  • 22

    HTML DOM Parser

    HTML parser which can be used for screen-scraping applications

    htmldom parses the HTML file and provides methods for iterating over and searching the parse tree in a way similar to jQuery. To report bugs, please mail me at bhimsen.pes@gmail.com
    Downloads: 0 This Week
  • 23

    VIT Marks Display

    A small program that accesses VIT marks of a specific student

    A small attempt, made while learning Python and web interfacing, to get the marks of a specific valid VIT student using basic web scraping techniques.
    Downloads: 0 This Week
  • 24

    Mangaligator

    Download manga or extract links from popular online manga readers

    Mangaligator is an open source manga downloader. It supports the sites mangafox.com, mangareader.net, mangashare.com, mangavolume.com, goodmanga.net, manga.animea.net and mangahere.com. The URLs of the pages are also saved separately and can be used for other purposes.
    Downloads: 0 This Week
  • 25
    Dusting off mpeg4ip. Scraping away rust. Getting it to compile on gcc4 (initially just Fedora, but possibly more *nix if I have time or help). Will prefix the command line apps with "re" to avoid name conflicts with other mpeg4ip forks.
    Downloads: 0 This Week