Showing 739 open source projects for "python web crawler"

  • 1
    django-viewflow

    Reusable workflow library for Django

    Viewflow is a lightweight, reusable workflow library that helps organize the business logic of people's collaboration in Django applications. Together with Django-material, it can be used as a framework to build ready-to-use business applications in minutes. The Django web framework solves only the technical problems of client-server interaction on top of the stateless HTTP protocol. The Model-View-Template separation pattern helps maintain simple CRUD-based logic. Viewflow...
    Downloads: 0 This Week
  • 2
    Flask-Caching

    A caching extension for Flask

    Flask-Caching is an extension to Flask that adds caching support for various backends to any Flask application. By running on top of cachelib, it supports all of werkzeug’s original caching backends through a unified API. It is also possible to develop your own caching backend by subclassing the flask_caching.backends.base.BaseCache class. Flask’s pluggable view classes are also supported. To cache them, use the same cached() decorator on the dispatch_request method. Using the same @cached...
    Downloads: 0 This Week
  • 3
    dynaconf

    Configuration Management for Python

    ... Vault and Redis as settings and secrets storage. Built-in extensions for Django and Flask web frameworks. CLI for common operations such as init, list, write, validate, export. In your own code, you import and use the settings object from your config.py file. Dynaconf prioritizes the use of environment variables, and you can optionally store settings in settings files using any of the toml|yaml|json|ini|py extensions.
    Downloads: 0 This Week
  • 4
    Slack Machine

    A simple, yet powerful and extendable Slack bot

    Slack Machine is a simple, yet powerful and extendable Slack bot framework. More than just a bot, Slack Machine is a framework that helps you develop your Slack workspace into a ChatOps powerhouse. Slack Machine is built with an intuitive plugin system that lets you build bots quickly but also allows for easy code organization. A plugin can look as simple as this:
    Downloads: 0 This Week
  • 5
    AutoPkg

    Automating packaging and software distribution on macOS

    AutoPkg is a system that automatically prepares software for distribution to managed clients. Recipes let you specify a series of simple actions which, combined, can perform complex tasks, similar to Automator workflows or Unix pipes.
    Downloads: 0 This Week
  • 6
    SickChill

    Less rage, more chill

    Automatic video library manager for TV shows. Select the show you want to grab, add it, and let SickChill handle the rest. See what SickChill holds in store for you. SickChill has a nice calendar that lets you know what you will see next. It watches for new episodes of your favorite shows, and when they are posted it does its magic: automatic torrent/nzb searching, downloading, and processing...
    Downloads: 0 This Week
  • 7
    HTTPie CLI

    Modern, user-friendly command-line HTTP client for the API era

    HTTPie (pronounced aitch-tee-tee-pie) is a command-line HTTP client. Its goal is to make CLI interaction with web services as human-friendly as possible. HTTPie is designed for testing, debugging, and generally interacting with APIs & HTTP servers. The http & https commands allow for creating and sending arbitrary HTTP requests. They use simple and natural syntax and provide formatted and colorized output.
    Downloads: 0 This Week
  • 8
    OpenWPM

    A web privacy measurement framework

    OpenWPM is a web privacy measurement framework that makes it easy to collect data for privacy studies on a scale of thousands to millions of websites. OpenWPM is built on top of Firefox, with automation provided by Selenium. It includes several hooks for data collection. Check out the instrumentation section below for more details. OpenWPM is tested on Ubuntu 18.04 via TravisCI and is commonly used via the docker container that this repo builds, which is also based on Ubuntu. Although we don't...
    Downloads: 0 This Week
  • 9
    Tencent Cloud Code Analysis

    Static code analysis

    Tencent Cloud Code Analysis (TCA for short, known internally by the R&D code name CodeDog) is a cloud-native, distributed, high-performance platform for comprehensive code analysis and tracking that integrates many analysis tools. It comprises three components: server, web, and client. It has integrated a number of self-developed tools and also supports the dynamic integration of analysis tools for various programming languages in the industry. Obtain the Tencent Cloud code analysis platform by deploying...
    Downloads: 0 This Week
  • 10
    Mezzanine

    CMS framework for Django

    Mezzanine is a powerful open source content management platform built using the Django framework. In many ways it is like many other content management tools, offering an intuitive interface for managing all of your content. But Mezzanine is different in that it provides most of its functionality by default. While other platforms rely heavily on modules or reusable applications, Mezzanine comes ready with all the functionality you need, making it the more efficient choice. Mezzanine has a...
    Downloads: 0 This Week
  • 11
    Healthchecks

    A cron monitoring tool written in Python & Django

    We notify you when your nightly backups, weekly reports, cron jobs, and scheduled tasks don't run on time. Healthchecks is a cron job monitoring service. It listens for HTTP requests and email messages ("pings") from your cron jobs and scheduled tasks ("checks"). When a ping does not arrive on time, Healthchecks sends out alerts. Healthchecks comes with a web dashboard, API, 25+ integrations for delivering notifications, monthly email reports, WebAuthn 2FA support, and team management features...
    Downloads: 0 This Week
  • 12
    Graph Notebook

    Library extending Jupyter notebooks to integrate with Apache TinkerPop

    The graph notebook provides an easy way to interact with graph databases using Jupyter notebooks. Using this open-source Python package, you can connect to any graph database that supports the Apache TinkerPop, openCypher or the RDF SPARQL graph models. These databases could be running locally on your desktop or in the cloud. Graph databases can be used to explore a variety of use cases including knowledge graphs and identity graphs. This project includes many examples of Jupyter notebooks...
    Downloads: 0 This Week
  • 13
    CapRover

    Scalable PaaS (automated Docker+nginx), aka Heroku on Steroids

    CapRover is an extremely easy-to-use app/database deployment & web server manager for your NodeJS, Python, PHP, ASP.NET, Ruby, MySQL, MongoDB, Postgres, WordPress (etc.) applications! It's blazingly fast and very robust, as it uses Docker, Nginx, LetsEncrypt and NetData under the hood behind its simple-to-use interface. For a developer who does not like spending hours and days setting up a server, building tools, sending code to the server, building it, getting an SSL certificate...
    Downloads: 0 This Week
  • 14
    Papis

    Powerful and highly extensible command-line based document

    Papis is a powerful and highly extensible CLI document and bibliography manager. With Papis, you can search your library for books and papers, add documents and notes, import and export to and from other formats, and much much more. Papis uses a human-readable and easily hackable .yaml file to store each entry's bibliographical data. It strives to be easy to use while providing a wide range of features. And for those who still want more, Papis makes it easy to write scripts that extend its...
    Downloads: 0 This Week
  • 15
    Siddhi Core Libraries

    Stream Processing and Complex Event Processing Engine

    ... to various endpoints in real time. Agile development experience with SQL-like query language and graphical drag-and-drop editor supporting event simulation. Lightweight runtime that can natively run on Kubernetes, Docker, VM, or bare metal, and embedded in any Java or Python application. Scalable, and highly available distributed event processing on Kubernetes, with NATS Streaming and Siddhi Kubernetes Operator.
    Downloads: 0 This Week
  • 16
    Apprise

    Apprise - Push Notifications that work with just about every platform!

    Take advantage of Apprise through your network with a user-friendly API. Apprise API was designed to easily fit into existing (and new) ecosystems that are looking for a simple notification solution. There is a small built-in Configuration Manager that can optionally be accessed through your web browser, allowing you to create and save as many configurations as you'd like. Each configuration is differentiated by a unique {KEY} that you decide on. Once you've saved your configuration, you'll...
    Downloads: 0 This Week
  • 17
    Framework Benchmarks

    Source for the TechEmpower Framework Benchmarks project

    If you're new to the project, welcome! Please feel free to ask questions here. We encourage new frameworks and contributors to ask questions. We're here to help! This project provides representative performance measures across a wide field of web application frameworks. With much help from the community, coverage is quite broad and we are happy to broaden it further with contributions. The project presently includes frameworks in many languages including Go, Python, Java, Ruby, PHP, C#, F...
    Downloads: 0 This Week
  • 18
    SiteOne Crawler (desktop app)

    A free, feature-rich web analyzer and exporter/cloner you will love!

    A free in-depth website analyzer providing audits of security, performance, SEO, accessibility and other technical aspects. Available as a desktop application for Windows/macOS/Linux and as a CLI tool for advanced users and CI/CD processes. It also includes an offline web page exporter (website clone, mirror).
    Downloads: 1 This Week
  • 19
    Atri Framework

    Open-source full-stack web development framework built on top of React

    The web framework to build stunning web apps. Build frontend easily using Atri visual builder or React code. Build backend using FastAPI. Atri framework is not just limited to the JavaScript world. You can use this framework with many languages such as Python, NodeJS (upcoming), etc. Atri framework comes with a suite of productivity tools such as visual editor, asset management tools, etc. that significantly reduce development time from months to hours. Using Atri framework, developers do...
    Downloads: 0 This Week
  • 20
    Headscale-WebUI

    A simple Headscale web UI for small-scale deployments

    A simple Headscale web UI for small-scale deployments.
    Downloads: 0 This Week
  • 21
    Portainer Templates

    500+ 1-click Portainer app templates

    A compiled list of 500+ ready-to-go Portainer App templates. In Portainer, App Templates enable you to easily deploy services with a predetermined configuration, while allowing you to customize options through the web UI. While Portainer ships with some default templates, it's often helpful to have 1-click access to many more apps + stacks, without having to constantly switch template sources. This repo combines app templates from several sources, to create a ready-to-go template file...
    Downloads: 0 This Week
  • 22
    Goutte

    Goutte, a simple PHP Web Scraper

    Goutte is a screen scraping and web crawling library for PHP. Goutte provides a nice API to crawl websites and extract data from the HTML/XML responses. Goutte depends on PHP 7.1+. Add fabpot/goutte as a require dependency in your composer.json file. Create a Goutte Client instance (which extends Symfony\Component\BrowserKit\HttpBrowser). Make requests with the request() method. The method returns a Crawler object (Symfony\Component\DomCrawler\Crawler). To use your own HTTP settings, you may...
    Downloads: 0 This Week
  • 23
    ProxyPool

    An Efficient ProxyPool with Getter, Tester and Server

    A simple and efficient proxy pool that provides the following functions. It regularly crawls free proxy websites and is easy to extend. It uses Redis to store proxies and rank them by availability. It regularly tests and screens proxies, eliminating unavailable ones and keeping available ones. It provides a proxy API that randomly selects an available proxy that has passed the test. The principles behind the proxy pool are analyzed in "How to Build an Efficient Proxy Pool". It is recommended to read it before using it....
    Downloads: 0 This Week
  • 24
    s3cmd

    Command line tool for managing Amazon S3 and CloudFront services

    Open-source tool to access Amazon S3 file storage. S3cmd is a free command line tool and client for uploading, retrieving and managing data in Amazon S3 and other cloud storage service providers that use the S3 protocol, such as Google Cloud Storage. Lots of features and options have been added to s3cmd since its very first release in 2008. We recently counted more than 60 command line options, including multipart uploads, encryption, incremental backup, s3 sync, ACL and Metadata...
    Downloads: 1,335 This Week
  • 25
    GnuCOBOL

    A free COBOL compiler

    GnuCOBOL (formerly OpenCOBOL) is a free, modern COBOL compiler. GnuCOBOL implements a substantial part of the COBOL 85, X/Open COBOL and newer ISO COBOL standards (2002, 2014, 2023), as well as many extensions included in other COBOL compilers (IBM COBOL, MicroFocus COBOL, ACUCOBOL-GT and others). GnuCOBOL translates COBOL into C and internally compiles the translated code using a native C compiler. Build COBOL programs on various platforms, including GNU/Linux, Unix, Mac OS X, and...
    Downloads: 603 This Week