Showing 129 open source projects for "python web crawler"

View related business solutions
  • Our Free Plans just got better! | Auth0 Icon
    Our Free Plans just got better! | Auth0

    With up to 25k MAUs and unlimited Okta connections, our Free Plan lets you focus on what you do best—building great apps.

    You asked, we delivered! Auth0 is excited to expand our Free and Paid plans to include more options so you can focus on building, deploying, and scaling applications without having to worry about your security. Auth0 now, thank yourself later.
    Try free now
  • Turn Your Content into Interactive Magic - For Free Icon
    Turn Your Content into Interactive Magic - For Free

    From Canva to Slides, Desmos to YouTube, Lumio works with the tech tools you are already using.

    Transform anything you share into an engaging digital experience - for free. Instantly convert your PDFs, slides, and files into dynamic, interactive sessions with built-in collaboration tools, activities, and real-time assessment. From teaching to training to team building, make every presentation unforgettable. Used by millions for education, business, and professional development.
    Start Free Forever
  • 1
    Easyspider - Distributed Web Crawler

    Easyspider - Distributed Web Crawler

    Easy Spider is a distributed Perl Web Crawler Project from 2006

    Easy Spider is a distributed Perl Web Crawler Project from 2006. It features code from crawling webpages, distributing it to a server and generating xml files from it. The client site can be any computer (Windows or Linux) and the Server stores all data. Websites that use EasySpider Crawling for Article Writing Software: https://www.artikelschreiber.com/en/ https://www.unaique.net/en/ https://www.unaique.com/ https://www.artikelschreiben.com/ https://www.buzzerstar.com/ https...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 2
    Goutte

    Goutte

    Goutte, a simple PHP Web Scraper

    Goutte is a screen scraping and web crawling library for PHP. Goutte provides a nice API to crawl websites and extract data from the HTML/XML responses. Goutte depends on PHP 7.1+. Add fabpot/goutte as a require dependency in your composer.json file. Create a Goutte Client instance (which extends Symfony\Component\BrowserKit\HttpBrowser). Make requests with the request() method. The method returns a Crawler object (Symfony\Component\DomCrawler\Crawler). To use your own HTTP settings, you may...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 3
    Web Link Collector 1000

    Web Link Collector 1000

    Automatically collect all links from websites to a clean txt file

    ## About Easily and automatically collect all your links into a neat txt list from a particular website or an entire section of a multi-page website network! Web Link Collector 1000 is a simple tool for gathering links from websites with minimal effort. It helps you collect resources for research, create reference lists, or save useful links without manual copying and pasting. ## Features - Two Collection Modes: Single page or multiple pages of specific website section, or even...
    Downloads: 1 This Week
    Last Update:
    See Project
  • 4
    Scrapyd

    Scrapyd

    A service daemon to run Scrapy spiders

    Scrapyd can manage multiple projects and each project can have multiple versions uploaded, but only the latest one will be used for launching new spiders. A common (and useful) convention to use for the version name is the revision number of the version control tool you’re using to track your Scrapy project code. For example: r23. The versions are not compared alphabetically but using a smarter algorithm (the same packaging uses) so r10 compares greater to r9, for example. Scrapyd is an...
    Downloads: 1 This Week
    Last Update:
    See Project
  • Deliver secure remote access with OpenVPN. Icon
    Deliver secure remote access with OpenVPN.

    Trusted by nearly 20,000 customers worldwide, and all major cloud providers.

    OpenVPN's products provide scalable, secure remote access — giving complete freedom to your employees to work outside the office while securely accessing SaaS, the internet, and company resources.
    Get started — no credit card required.
  • 5
    KemonoDownloader

    KemonoDownloader

    Kemono Downloader - A cross-platform Python app built with PyQt6

    Welcome to Kemono Downloader, a versatile Python-based desktop application built with PyQt6, designed to download content from Kemono.su. This tool enables users to archive individual posts or entire creator profiles from services like Patreon, Fanbox, and more, supporting a wide range of file types with customizable settings and advanced features.
    Leader badge
    Downloads: 694 This Week
    Last Update:
    See Project
  • 6
    Web Spider, Web Crawler, Email Extractor

    Web Spider, Web Crawler, Email Extractor

    Free Extracts Emails, Phones and custom text from Web using JAVA Regex

    In Files there is WebCrawlerMySQL.jar which supports MySql Connection Free Web Spider & Crawler. Extracts Information from Web by parsing millions of pages. Store data into Derby Database and data are not being lost after force closing the spider. - Free Web Spider , Parser, Extractor, Crawler - Extraction of Emails , Phones and Custom Text from Web - Export to Excel File - Data Saved into Derby and MySQL Database - Written in Java Cross Platform Also See Free email Sender...
    Downloads: 16 This Week
    Last Update:
    See Project
  • 7
    WFDownloader App

    WFDownloader App

    Free batch downloader for image, wallpaper, video, audio, document,

    Use as an image gallery, wallpaper, audio/music, video, document, and other media bulk downloader from supported websites. Also use to download sequential website urls that have a certain pattern (e.g. image01.png to image100.png). Also use app's built-in site crawler for advanced link search or extraction. There is also special support for forum media and open directory downloading. It's a programmable downloader and also works with password protected sites. Say goodbye to downloading one...
    Leader badge
    Downloads: 280 This Week
    Last Update:
    See Project
  • 8
    Catbird Linux

    Catbird Linux

    Linux for content creation, web scraping, coding, and data analysis.

    Catbird Linux is a USB pluggable Live Linux operating system built for media creation, web scraping, and software coding. It is the daily driver you want for retrieving data, making videos or podcasts, and making software tools to automate the repetitive tasks. It is ready for work in Python, Lua, and Go languages, with numerous packages for web scraping or downloading data via API calls. Using Catbird Linux, it is possible to accomplish in depth stock market analysis, track weather trends...
    Downloads: 35 This Week
    Last Update:
    See Project
  • 9
    YouTube video web scraper 2 [ISA]

    YouTube video web scraper 2 [ISA]

    YouTube video web scraper 2 [Improved.Simplified.Alternative]

    'YouTube video web scraper 2' is an desktop application developed using python 3.11.4 and other add-on libaries. Finds YouTube video based on user request and view as table. Export the table as excel. Compatible only for windows OS.
    Downloads: 0 This Week
    Last Update:
    See Project
  • Test your software product anywhere in the world Icon
    Test your software product anywhere in the world

    Get feedback from real people across 190+ countries with the devices, environments, and payment instruments you need for your perfect test.

    Global App Testing is a managed pool of freelancers used by Google, Meta, Microsoft, and other world-beating software companies.
    Try us today.
  • 10
    WallPaper (alias crawlpaper)
    WallPaper (alias crawlpaper) is a desktop changer (NOT a screensaver) which includes a web crawler for picture download, an audio stream ripper, an audio player, a mini mp3 tag editor,etc. Also included support for .zip and .rar files and an interface to the BerkleyDB code for small databases.
    Downloads: 2 This Week
    Last Update:
    See Project
  • 11
    ScrapBot 1.40 64bits

    ScrapBot 1.40 64bits

    Task automation software for accessing and manipulating website data.

    ScrapBot is a task automation software that allows you to access, authenticate, extract, and insert data on any website. The software utilizes JavaScript to execute tasks, eliminating the need for server or additional software installations. The system can control the accessed webpage through JavaScript, and the entire navigation can be viewed in the program window. The main.js script runs in a separate frame from the navigation frame but can access all page content without any restrictions.
    Downloads: 2 This Week
    Last Update:
    See Project
  • 12
    scraper-with-chatgpt
    It is a powerful data scraping tool that helps you extract information from various online sources. Easily collect data from Google SERP, Maps, Shopify, Zillow, and more. With a user-friendly interface, you can scrape and save data in JSON or Excel formats. Unlock insights from the web effortlessly with scrape-it.cloud API.
    Downloads: 2 This Week
    Last Update:
    See Project
  • 13
    Provas aplicadas - Concursos

    Provas aplicadas - Concursos

    Baixar provas aplicadas de bancas

    Um aplicativo desktop que permite o download de provas aplicadas em concursos públicos no Brasil. O aplicativo oferece filtros personalizáveis para pesquisa de provas e gabaritos de acordo com a banca organizadora, ano e nível de escolaridade.
    Downloads: 1 This Week
    Last Update:
    See Project
  • 14

    dorker-py

    Descubre archivos, rutas escondidas realizando busquedas avanzadas

    Dorking Google - Dorker Py Descubre archivos, rutas escondidas realizando busquedas avanzadas (ES) Discover files, hidden paths by performing advanced searches (EN)
    Downloads: 1 This Week
    Last Update:
    See Project
  • 15
    Web Spider, Web Crawler, Email Extractor

    Web Spider, Web Crawler, Email Extractor

    Free Extracts Emails, Phones and custom text from Web using JAVA Regex

    In Files there is WebCrawlerMySQL.jar which supports MySql Connection Please follow this link to get latest version https://sourceforge.net/projects/web-spider-web-crawler-extract/ Free Web Spider & Crawler. Extracts Information from Web by parsing millions of pages. Store data into Derby OR MySQL Database and data are not being lost after force closing the spider. - Free Web Spider , Parser, Extractor, Crawler - Extraction of Emails , Phones and Custom Text from Web - Export...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 16
    AutoScraper

    AutoScraper

    A Smart, Automatic, Fast and Lightweight Web Scraper for Python

    This project is made for automatic web scraping to make scraping easy. It gets a URL or the HTML content of a web page and a list of sample data that we want to scrape from that page. This data can be text, URL or any HTML tag value of that page. It learns the scraping rules and returns similar elements. Then you can use this learned object with new URLs to get similar content or the exact same element of those new pages.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 17

    webotron

    Using industrial automation techniques for creating web scraping tools

    ... relies on smart pallets and autonomous workcells. Each workcell itself will consist of smaller autonomous workcells with very simple signalling used between workcells. All hardware must extremely robust and well sealed from the surrounding environment. The webotron project is a program environment which emulates the above for building web scraping tools using temporary Add-on programs running in the Firefox browser. The browser is used to handle all internet interaction and also as a user...
    Downloads: 1 This Week
    Last Update:
    See Project
  • 18
    JSSoup

    JSSoup

    JavaScript + BeautifulSoup = JSSoup

    I'm a fan of Python library BeautifulSoup. It's feature-rich and very easy to use. But when I am working on a small react-native project, and I tried to find a HTML parser library like BeautifulSoup, I failed. So I want to write a HTML parser library that can be so easy to use just like BeautifulSoup in Javascript. JSSoup uses tautologistics/node-htmlparser as HTML dom parser, and creates a series of BeautifulSoup like API on top of it. JSSoup supports both node and react-native. JSSoup tries...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 19
    CEF Python

    CEF Python

    Python bindings for the Chromium Embedded Framework (CEF)

    ... use cases for CEF. Use it as a modern HTML5 based rendering engine that can act as a replacement for classic desktop GUI frameworks. Think of it as Electron for Python. Embed a web browser widget in a classic Qt / GTK / wxPython desktop application. Use it for automated testing of web applications with more advanced capabilities than Selenium web browser automation due to CEF low level programming APIs.
    Downloads: 3 This Week
    Last Update:
    See Project
  • 20

    Mowglee

    Mowglee - The Geo Crawler!

    Mowglee is a distributed, multi-threaded, asynchronous task execution based web crawler in Java.It is designed for geographic affinity and is highly modular.
    Downloads: 2 This Week
    Last Update:
    See Project
  • 21
    yabasta

    yabasta

    Yet Another BAsic Scraper and Text Analysis

    YA BASTA! is a Python/R application for Lyrics Web Scraper and Text Analysis. Web scraping is developed in Python, text analysis in R as Python subprocesses. YA BASTA! is only tested on windows OS. To run YA BASTA! just type on window command prompt: python.exe yabasta.py
    Downloads: 0 This Week
    Last Update:
    See Project
  • 22
    X-RAY

    X-RAY

    The next web scraper, see through the <html> noise

    Supports strings, arrays, arrays of objects, and nested object structures. The schema is not tied to the structure of the page you're scraping, allowing you to pull the data in the structure of your choosing. The API is entirely composable, giving you great flexibility in how you scrape each page. Paginate through websites, scraping each page. X-ray also supports a request delay and a pagination limit. Scraped pages can be streamed to a file, so if there's an error on one page, you won't...
    Downloads: 1 This Week
    Last Update:
    See Project
  • 23
    Requests-HTML

    Requests-HTML

    Pythonic HTML Parsing for Humans

    This library intends to make parsing HTML (e.g. scraping the web) as simple and intuitive as possible. When using this library you automatically get full JavaScript support! (Using Chromium, thanks to puppeteer) CSS Selectors (a.k.a jQuery-style, thanks to PyQuery). XPath Selectors, for the faint of heart. Mocked user-agent (like a real web browser). Automatic following of redirects. Connection–pooling and cookie persistence. The Requests experience you know and love, with magical parsing...
    Downloads: 1 This Week
    Last Update:
    See Project
  • 24
    django-dynamic-scraper

    django-dynamic-scraper

    Creating Scrapy scrapers via the Django admin interface

    Django Dynamic Scraper (DDS) is an app for Django build on top of the scraping framework Scrapy. While preserving many of the features of Scrapy it lets you dynamically create and manage spiders via the Django admin interface. With Django Dynamic Scraper (DDS) you can define your Scrapy scrapers dynamically via the Django admin interface and save your scraped items in the database you defined for your Django project. Since it simplifies things DDS is not usable for all kinds of scrapers, but...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 25
    YouTube Video Downloader

    YouTube Video Downloader

    Allows you to download youtube videos into a video/audio format.

    YouTube Video Downloader By Chase, This is a tool developed in python, by web scraping I can get the videos from YouTube and download it on my machine in a video/audio format, easy-to-use GUI for your needs, dark theme.
    Downloads: 1 This Week
    Last Update:
    See Project
Want the latest updates on software, tech news, and AI?
Get latest updates about software, tech news, and AI from SourceForge directly in your inbox once a month.