Showing 681 open source projects for "python web crawler"

  • 1
    Gerapy

    Distributed Crawler Management Framework Based on Scrapy

    Distributed Crawler Management Framework Based on Scrapy, Scrapyd, Scrapyd-Client, Scrapyd-API, Django and Vue.js. Anyone who has written crawlers in Python has probably used Scrapy. Scrapy is a powerful crawler framework with high crawling efficiency and good scalability, and it is practically an essential tool for developing crawlers in Python.
    Downloads: 0 This Week
  • 2
    Synapse Machine Learning

    Simple and distributed Machine Learning

    SynapseML (previously MMLSpark) is an open source library to simplify the creation of scalable machine learning pipelines. SynapseML builds on Apache Spark and SparkML to enable new kinds of machine learning, analytics, and model deployment workflows. SynapseML adds many deep learning and data science tools to the Spark ecosystem, including seamless integration of Spark Machine Learning pipelines with the Open Neural Network Exchange (ONNX), LightGBM, The Cognitive Services, Vowpal Wabbit,...
    Downloads: 0 This Week
  • 3
    Crawlab

    Distributed web crawler admin platform for spiders management

    Golang-based distributed web crawler management platform, supporting various languages including Python, NodeJS, Go, Java and PHP, and various web crawler frameworks including Scrapy, Puppeteer and Selenium. Use docker-compose for a one-click startup; that way you don't even have to configure the MongoDB database. The frontend app interacts with the master node, which communicates with other components such as MongoDB, SeaweedFS and worker nodes. ...
    Downloads: 0 This Week
  • 4
    Requests for PHP

    Requests for PHP is a humble HTTP request library

    Requests is an HTTP library written in PHP, for human beings. It is roughly based on the API from the excellent Requests Python library. Requests is ISC Licensed (similar to the new BSD license) and has no dependencies, except for PHP 5.6+. Despite PHP’s use as a language for the web, its tools for sending HTTP requests are severely lacking. cURL has an interesting API, to say the least, and you can’t always rely on it being available. Sockets provide only low-level access and require you to build most of the HTTP response parsing yourself. ...
    Downloads: 2 This Week
  • 5
    WFDownloader App

    Free batch downloader for image, wallpaper, video, audio, document,

    Use as an image gallery, wallpaper, audio/music, video, document, and other media bulk downloader from supported websites. Also use it to download sequential website URLs that follow a certain pattern (e.g. image01.png to image100.png), or use the app's built-in site crawler for advanced link search or extraction. There is also special support for forum media and open directory downloading. It's a programmable downloader and also works with password-protected sites. Say goodbye to downloading one...
    Downloads: 232 This Week
  • 6
    Eric Integrated Development Environment

    Python Development Environment with all batteries included

    Eric is a Python IDE written using PyQt and QScintilla. It provides various features such as any number of open editors, an integrated (remote) debugger, project management facilities, unit test, refactoring and much more.
    Downloads: 229 This Week
  • 7
    KemonoDownloader

    Kemono Downloader - A cross-platform Python app built with PyQt6

    Welcome to Kemono Downloader, a versatile Python-based desktop application built with PyQt6, designed to download content from Kemono.su. This tool enables users to archive individual posts or entire creator profiles from services like Patreon, Fanbox, and more, supporting a wide range of file types with customizable settings and advanced features.
    Downloads: 250 This Week
  • 8
    RPA for Python

    Python package for doing RPA

    Python package for doing RPA. RPA for Python's simple and powerful API makes robotic process automation fun! You can use it to quickly automate away repetitive time-consuming tasks on websites, desktop applications, or the command line. See sample Python script, the RPA Challenge solution, and RedMart groceries example. To send a Telegram app notification, simply look up @rpapybot to allow receiving messages. To automate Chrome browser invisibly, use headless mode. To run 10X faster instead...
    Downloads: 3 This Week
  • 9
    Whakerexa

    A minimalist and lightweight web kit for accessible contents

    `Whakerexa` provides a lightweight, modular set of CSS and JavaScript tools for building accessible, consistent, and customizable web interfaces. It is intended to be as simple as possible to make **accessible web content**, and to minimize the use of CSS classes for enhancing the readability of HTML code. It was designed to be easily customizable, allowing users to adjust properties such as fonts, colors, borders, etc., effortlessly. Most of the properties are stored into variables...
    Downloads: 17 This Week
  • 10
    WhakerKit

    A seamless toolkit to manage dynamic websites and shared documents

    WhakerKit is a versatile toolkit for building websites with both static and dynamic HTML pages, developed by Brigitte Bigi, CNRS. WhakerKit offers seamless management of public and authenticated access, and simplifies document sharing for collaborative environments. It is based on the following technologies:
      * python >= 3.9
      * (optional) PyJWT and ldap3 for authentication (install with pip)
      * WhakerPy >= 1.3: <https://whakerpy.sourceforge.io> (install with pip)
      * Whakerexa >=...
    Downloads: 0 This Week
  • 11
    WhakerPy

    Whakerpy - A light web application framework

    Whakerpy is a simple library useful to create dynamic HTML content; it's a light web application framework. Create and manipulate HTML from the power of Python:
      - Easy to learn. Consistent, simple syntax.
      - Flexible and easy usage.
      - Create HTML pages dynamically
      - Can save as static files, and/or
      - Run locally with its httpd server and response "bakery" system.
    Access the documentation: <https://whakerpy.sourceforge.io>.
    Downloads: 0 This Week
  • 12

    TimerMiddleware

    Timing & instrumentation for python web apps

    Docs: https://pythonhosted.org/TimerMiddleware/ PYPI: https://pypi.python.org/pypi/TimerMiddleware
    Downloads: 0 This Week
  • 13

    apache-logs-to-mysql

    Apache Log Parser and Data Normalization Application

    Apache Log Parser and Data Normalization Application. Python handles file processing and MySQL handles data processing. ApacheLogs2MySQL consists of two Python modules and one MySQL schema to automate importing access and error files and normalizing the data into a database designed for reports and data analysis. Runs on Windows, Linux and macOS, and is tested with MySQL versions 8.0.39, 8.4.3, 9.0.0 and 9.1.0. Four LogFormats and two ErrorLogFormats can be loaded and five MySQL stored procedures can be processed in a single Python `ProcessLogs function` execution. ...
    Downloads: 1 This Week
  • 14
    Automated Bluesky Poster

    Free, easy, infinite Bluesky posting automation.

    Part of a series of programs to fully automate Twitter/Bluesky marketing:
      1. https://reactorcore.itch.io/web-link-collector-1000 (Collect links to your stuff)
      2. https://reactorcore.itch.io/links-into-social-media-posts (Make those links into a social media posts spreadsheet)
      3a. https://reactorcore.itch.io/automated-twitter-poster (Post them automatically from that spreadsheet, to Twitter)
      3b. https://reactorcore.itch.io/automated-bluesky-poster (Post them automatically from that...
    Downloads: 1 This Week
  • 15
    Automated Twitter Poster

    Free, easy, infinite Twitter posting automation.

    Part of a series of programs to fully automate Twitter/Bluesky marketing:
      1. https://reactorcore.itch.io/web-link-collector-1000 (Collect links to your stuff)
      2. https://reactorcore.itch.io/links-into-social-media-posts (Make those links into a social media posts spreadsheet)
      3a. https://reactorcore.itch.io/automated-twitter-poster (Post them automatically from that spreadsheet, to Twitter)
      3b. https://reactorcore.itch.io/automated-bluesky-poster (Post them automatically from that...
    Downloads: 1 This Week
  • 16
    Plum Cave Twofish

    A version of Plum Cave that uses the ChaCha20 and Twofish ciphers

    A version of Plum Cave that employs the "ChaCha20 + Twofish-256 CBC + HMAC-SHA3-512" authenticated encryption scheme for data encryption and ML-KEM-1024 for quantum-resistant key exchange.
    Downloads: 0 This Week
  • 17
    elFinder
    elFinder is a file manager for the web, similar to the one you use on your computer. Written in JavaScript using jQuery UI, it just works in any modern browser. Its creation is inspired by the simplicity and convenience of the Finder.app program used in Mac OS X.
    Downloads: 1 This Week
  • 18

    uweb browser: unlimited power

    minimal suckless android web browser with unlimited power

    - AI bot as search engine; append file content as input for complex query.
    - Powerful: html5 enhancement; any urls to host a website; javascript and shell scripting for general processing; and more with Termux.
    - Customizable: user-defined menus, (new) buttons and gestures for user agents, bookmarklets, url services, shell commands, internal functionality links and text processing etc.
    - Convenient: book/dictionary/txt/command line/app can be search engine.
    - Tiny: less than 200k -...
    Downloads: 5 This Week
  • 19

    pySocketHTTPserver

    HTTP server developed with Python and socket as the only web module.

    # pySocketHTTPserver 1.0 by CHEN Guang (Chin Hikaru)
    # Using only one web module: socket, thus allowing the user to see and test every detail of the HTTP server.
    # Run this script and visit http://127.0.0.1:880/ with a browser and you will see a picture.
    # Double click the picture for full screen,
    # move the mouse cursor to the top of the screen to get the "X" button for exiting full screen.
    # You can drag the picture with the left mouse button.
    # You can change to other pictures by rolling the mouse wheel.
    #...
    Downloads: 0 This Week
  • 20
    Buku

    Powerful command-line bookmark manager. Your mini web!

    buku is a powerful bookmark manager written in Python3 and SQLite3. buku fetches the title of a bookmarked web page and stores it along with any additional comments and tags. You can use your favourite editor to compose and update bookmarks. With multiple search options, including regex and a deep scan mode (particularly for URLs), it can find any bookmark instantly. Multiple search results can be opened in the browser at once. Though a terminal utility, it's possible to add bookmarks...
    Downloads: 0 This Week
  • 21
    ddgr

    DuckDuckGo from the terminal

    ddgr is a cmdline utility to search DuckDuckGo from the terminal. While googler is highly popular among cmdline users, in many forums the need of a similar utility for privacy-aware DuckDuckGo came up. DuckDuckGo Bangs are super-cool too! So here's ddgr for you! Unlike the web interface, you can specify the number of search results you would like to see per page. It's more convenient than skimming through 30-odd search results per page. The default interface is carefully designed to use...
    Downloads: 1 This Week
  • 22
    Endian Firewall Community
    Endian Firewall Community (EFW) is a "turn-key" linux security distribution that makes your system a full featured security appliance with Unified Threat Management (UTM) functionalities. The software has been designed for the best usability: very easy to install, use and manage and still greatly flexible. The feature suite includes stateful packet inspection firewall, application-level proxies for various protocols (HTTP, FTP, POP3, SMTP) with antivirus support, virus and spam-filtering...
    Downloads: 623 This Week
  • 23
    barcraft

    A simple QrCode / barcode generator in python

    A simple QrCode / barcode generator that you can also use from this website version: https://secret-guest.github.io/barcraft/ Interface made with PyQt5, packaged as an MSI installer with Inno Setup.
    Downloads: 0 This Week
  • 24
    CerberusCMS5

    Cerberus Content Management System

    Cerberus Content Management System is a dynamic, secure and infinitely expandable CMS designed after a Unix-like model. It is a custom-written Web Application Framework (W.A.F.) with a consistent, custom-written Pre-Hyper-Text-Post-Processor Programming Code Framework (P.C.F.). This Web Application Software Project's aim is to be the fastest and most secure Web Application Framework, Web Application Programming Code Framework, Text, Voice and Video Communications Platform and Content...
    Downloads: 0 This Week
  • 25
    Easyspider - Distributed Web Crawler

    Easy Spider is a distributed Perl Web Crawler Project from 2006

    Easy Spider is a distributed Perl Web Crawler Project from 2006. It features code for crawling webpages, distributing them to a server and generating XML files from them. The client side can be any computer (Windows or Linux) and the server stores all data. Websites that use EasySpider Crawling for Article Writing Software: https://www.artikelschreiber.com/en/ https://www.unaique.net/en/ https://www.unaique.com/ https://www.artikelschreiben.com/ https://www.buzzerstar.com/ https://easyperlspider.sourceforge.io/ https://www.sebastianenger.com/ https://www.artikelschreiber.com/opensource/ It is fun to look at code written some years ago and to see how one has improved. ...
    Downloads: 0 This Week