Showing 94 open source projects for "internet dump spider"

  • 1
    DB Browser for SQLite

    The DB Browser for SQLite

    ... users, and must remain as simple to use as possible in order to achieve these goals. Import and export records as text, import and export tables from/to CSV files, import and export databases from/to SQL dump files, issue SQL queries and inspect the results, examine a log of all SQL commands issued by the application, plot simple graphs based on table or query data.
    Downloads: 137 This Week
  • 2
    EasySpider

    A visual no-code/code-free web crawler/spider

    A visual code-free/no-code web crawler/spider, supporting both Chinese and English.
    Downloads: 5 This Week
  • 3
    Crawlab

    Distributed web crawler admin platform for spider management

    ... with each other via gRPC (an RPC framework). Tasks are scheduled by the task scheduler module in the master node and received by the task handler module in worker nodes, which executes these tasks in task runners. Task runners are processes running spider or crawler programs, and can also send data through gRPC (integrated in the SDK) to other data sources, e.g. MongoDB.
    Downloads: 1 This Week
  • 4
    Scrapy-Redis

    Redis-based components for Scrapy

    You can start multiple spider instances that share a single Redis queue. Best suited for broad multi-domain crawls. Scraped items get pushed into a Redis queue, meaning that you can start as many post-processing processes as needed, all sharing the items queue. Scheduler + Duplication Filter, Item Pipeline, Base Spiders. The default requests serializer is pickle, but it can be changed to any module with loads and dumps functions. Note that pickle is not compatible between Python versions. Version 0.3...
    Downloads: 0 This Week
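The Scrapy-Redis entry above notes that the serializer can be any module with loads and dumps functions. A minimal sketch of that interface, assuming the request data is JSON-encodable; the `JsonSerializer` name and the sample payload are hypothetical illustrations and are not part of Scrapy-Redis itself:

```python
import json


class JsonSerializer:
    """Hypothetical stand-in exposing the loads/dumps pair described above."""

    @staticmethod
    def dumps(obj):
        # Encode a request-like dict to bytes, the form a queue value would take.
        return json.dumps(obj, sort_keys=True).encode("utf-8")

    @staticmethod
    def loads(data):
        # Decode the stored bytes back into the original dict.
        return json.loads(data.decode("utf-8"))


# Hypothetical request payload; real Scrapy requests carry more fields.
request = {"url": "https://example.com/", "method": "GET", "meta": {"depth": 1}}
encoded = JsonSerializer.dumps(request)
decoded = JsonSerializer.loads(encoded)
print(decoded == request)
```

Unlike pickle, a JSON round trip like this does not depend on the Python version, but it only works for data that JSON can represent.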
  • 5
    Spider-Search

    Search multiple engines for a specific string

    Downloads: 0 This Week
  • 6
    req

    Simple Go HTTP client with Black Magic

    Simple and easy to use, providing rich client-level and request-level settings, all of which are intuitive and chainable methods. Provides powerful and convenient debug utilities, including debug logs, performance traces, and even dumps of the complete request and response content. API testing can be done with minimal code, with no need to explicitly create any Request or Client, or even to handle errors. Detects and decodes to UTF-8 automatically if possible to avoid garbled characters (See Auto Decode...
    Downloads: 0 This Week
  • 7
    GitHub Actions for Firebase

    GitHub Action for interacting with Firebase

    This Action for firebase-tools enables arbitrary actions with the firebase command-line client. Starting with version v2.1.2, each version release will point to a versioned Docker image, allowing for hardening our pipeline (so things don't break when I do something dumb). On top of this, you can also point to a master version if you would like to test out what might not be deployed into a release yet. If you want to add a message to a deployment (e.g. the Git commit message) you need to take...
    Downloads: 0 This Week
  • 8
    Web Spider, Web Crawler, Email Extractor

    Free tool that extracts emails, phone numbers, and custom text from the web using Java regex

    In Files there is WebCrawlerMySQL.jar, which supports MySQL connections. Free web spider and crawler. Extracts information from the web by parsing millions of pages. Data is stored in a Derby database and is not lost after force-closing the spider. - Free web spider, parser, extractor, crawler - Extraction of emails, phones, and custom text from the web - Export to Excel file - Data saved into Derby and MySQL databases - Written in Java, cross-platform Also see Free email Sender...
    Downloads: 79 This Week
  • 9
    Web Spider, Web Crawler, Email Extractor

    Free tool that extracts emails, phone numbers, and custom text from the web using Java regex

    In Files there is WebCrawlerMySQL.jar, which supports MySQL connections. Please follow this link to get the latest version: https://sourceforge.net/projects/web-spider-web-crawler-extract/ Free web spider and crawler. Extracts information from the web by parsing millions of pages. Data is stored in a Derby or MySQL database and is not lost after force-closing the spider. - Free web spider, parser, extractor, crawler - Extraction of emails, phones, and custom text from the web - Export...
    Downloads: 4 This Week
  • 10
    AutoWikiBrowser is a semi-automated Wikipedia editor, designed to make tedious, repetitive tasks quicker and easier. For more information, see the project homepage at http://en.wikipedia.org/wiki/Wikipedia:AutoWikiBrowser.
    Downloads: 55 This Week
  • 11

    ahCrawler

    A PHP search engine for your website and web analytics tool. GNU GPL3

    ahCrawler is a toolset for implementing your own search on your website and an analyzer for your web content. It can be used on shared hosting. It consists of * crawler (spider) and indexer * search for your website(s) * search statistics * website analyzer (HTTP headers, short titles and keywords, link checker, ...) You need to install it on your own server, so all crawled data stays in your environment. You never know when an external web spider last updated your content. Trigger a rescan...
    Downloads: 2 This Week
  • 12
    Orao Basket

    Programming tools for an emulator of the eight-bit ORAO computer

    Smederevo, 5 August 2018. A long time ago, around 1986, I became the proud owner of an eight-bit ORAO computer based on the MOS 6502 processor. It was the first and, for me, the best home computer at that time. All my knowledge of computer programming began with that computer. Recently, for some unknown reason, I became interested in old eight-bit computers again. After a short search on the Internet, I found an emulator of my favorite computer. It literally emulates every piece of hardware installed...
    Downloads: 3 This Week
  • 13
    Easyspider - Distributed Web Crawler

    Easy Spider is a distributed Perl Web Crawler Project from 2006

    Easy Spider is a distributed Perl Web Crawler Project from 2006. It features code for crawling webpages, distributing the results to a server, and generating XML files from them. The client side can be any computer (Windows or Linux), and the server stores all data. Websites that use EasySpider Crawling for Article Writing Software: https://www.artikelschreiber.com/en/ https://www.unaique.net/en/ https://www.unaique.com/ https://www.artikelschreiber.com/marketing/ https://www.paraphrasingtool1.com...
    Downloads: 1 This Week
  • 14
    溫度日記 Hearty Journal

    A soothing mood journal app

    ... your memories safe, entries are encrypted and synced to the cloud for backup. A sanctuary for your mind and soul, Hearty Journal helps you pour out your feelings in a brain dump, increase your positive energy, and become more grateful and calmer by building healthy thinking through journaling.
    Downloads: 0 This Week
  • 15
    ReconSpider

    Most Advanced Open Source Intelligence (OSINT) Framework

    ... the capabilities of Wave, Photon and Recon Dog to do a comprehensive enumeration of attack surfaces. Reconnaissance is a mission to obtain information by various detection methods, about the activities and resources of an enemy or potential enemy, or geographic characteristics of a particular area. A Web crawler, sometimes called a spider or spiderbot and often shortened to crawler, is an Internet bot that systematically browses the World Wide Web, typically for the purpose of Web indexing (web spidering).
    Downloads: 16 This Week
  • 16
    sposkpat2

    sposkpat2, Single Purpose Operating System Kpat Live Distro

    Distraction-free Patience card game (also known as klondike, solitaire, пасьянс, त्यागी, kesabaran, türelem, ẩn sỉ, ソリティア, 接龍, سوليتير). The sposk series is pure Linux propaganda. Please give it a try. 12 card games are included: Aces Up, Forty & Eight, Freecell, Golf, Grandfather, Grandfather's Clock, Gypsy, Klondike, Mod3, Simple Simon, Spider, Yukon. A safe and silent way to play a card game: blocked from all networks, including the internet. Discs are spun down...
    Downloads: 0 This Week
  • 17
    GoSpider

    Gospider - Fast web spider written in Go

    GoSpider - Fast web spider written in Go. Fast web crawling. Brute-force and parse sitemap.xml. Parse robots.txt. Generate and verify links from JavaScript files. Link Finder. Find AWS-S3 from the response source. Find subdomains from the response source. Get URLs from Wayback Machine, Common Crawl, Virus Total, Alien Vault. Output format is easy to grep. Supports Burp input. Crawl multiple sites in parallel.
    Downloads: 2 This Week
  • 18

    rubywebcrawler

    Web spider software written in Ruby

    Downloads: 0 This Week
  • 19
    kohttp

    Kotlin DSL HTTP client.
    Downloads: 0 This Week
  • 20
    DumpItBlue

    Dump Facebook stuff for analysis or reporting purposes.

    Investigators, researchers, or other analysts often have to get local copies of Facebook data. This can be necessary for many reasons, such as submitting Facebook data as evidence or doing advanced offline analysis. But the Facebook interface was not designed for that and does not provide printing or saving functions. DumpItBlue has been designed to help people extract data from Facebook. It provides useful functions to automate many tasks that would otherwise have to be done manually....
    Downloads: 3 This Week
  • 21
    GOPA

    GOPA, a spider written in Golang, for Elasticsearch

    GOPA, a spider written in Golang, for Elasticsearch. Lightweight, low footprint, memory requirement should be 100MB. Easy to deploy, no runtime or dependency required. Easy to use, no programming or scripting ability needed, out-of-the-box features. First of all, get it; there are two options: download the pre-built package or compile it yourself. Besides Elasticsearch, Gopa doesn't require any other dependencies; simply run ./gopa to start the program. It's safe to press ctrl+c to stop the current...
    Downloads: 0 This Week
  • 22
    Grab Framework Project

    Web Scraping Framework

    ... on top of the urllib3 and lxml libraries. The Spider API to build asynchronous web crawlers. You write classes that define handlers for each type of network request. Each handler is able to spawn new network requests. Network requests are processed concurrently with a pool of asynchronous web sockets. Grab provides an interface called Spider for developing multithreaded website scrapers.
    Downloads: 0 This Week
  • 23
    pyspider

    A powerful Spider(Web Crawler) system in Python

    pyspider is a powerful Spider (Web Crawler) system in Python. Components are connected by a message queue. Every component, including the message queue, runs in its own process/thread and is replaceable. That means that when processing is slow, you can run many instances of the processor and make full use of multiple CPUs, or deploy to multiple machines. This architecture makes pyspider really fast. Since pyspider has various components, you can just run pyspider to start a standalone...
    Downloads: 0 This Week
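The pyspider entry above describes components connected by a message queue, with several processor instances running side by side. A rough sketch of that pattern using only the Python standard library; this is not pyspider's own code, and the `processor` and `run_pipeline` names, the sentinel convention, and the toy "processing" step (measuring string length) are all assumptions made for illustration:

```python
import queue
import threading


def processor(inbox, outbox):
    # Each processor instance pulls tasks until it receives the None sentinel.
    while True:
        task = inbox.get()
        if task is None:
            break
        # Toy stand-in for real page processing: record the task's length.
        outbox.put((task, len(task)))


def run_pipeline(tasks, workers=3):
    inbox, outbox = queue.Queue(), queue.Queue()
    threads = [
        threading.Thread(target=processor, args=(inbox, outbox))
        for _ in range(workers)
    ]
    for t in threads:
        t.start()
    for task in tasks:
        inbox.put(task)
    for _ in threads:
        inbox.put(None)  # one sentinel per processor so every thread exits
    for t in threads:
        t.join()
    results = []
    while not outbox.empty():
        results.append(outbox.get())
    # Sort because completion order across threads is not deterministic.
    return sorted(results)


print(run_pipeline(["a", "bb", "ccc"]))
```

Because the producers and processors only share the queues, the number of processor instances can be changed through the `workers` argument without touching the rest of the pipeline.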
  • 24
    OpenWebSpider
    OpenWebSpider is an Open Source multi-threaded Web Spider (robot, crawler) and search engine with a lot of interesting features!
    Downloads: 22 This Week
  • 25
    NiceShaper - Dynamic Traffic Shaper

    NiceShaper provides dynamic traffic shaping for Linux routers

    ... with static rates. While constantly monitoring the traffic flowing through the router, in response to the changing load, it dynamically adjusts the rate and ceil parameter values of enabled HTB classes to the values which enable the fullest possible utilization of Internet connection throughput. NiceShaper protects each host which uses a reasonable amount of shared throughput while watching over the configured optimal utilization of the Internet connection. Therefore, at the asymmetric Internet connectio
    Downloads: 0 This Week