Search Results for "python web crawler" - Page 6

Showing 3141 open source projects for "python web crawler"

  • 1
    BentoCache

    Bentocache is a robust multi-tier caching library for Node.js apps

    Bentocache is a flexible caching library for Node.js that supports multiple backends like memory, disk, and Redis. It offers decorators for easy function-level caching and is designed to be lightweight, extensible, and developer-friendly. Bentocache is well-suited for performance optimization in web apps, scripts, and data pipelines.
    Downloads: 3 This Week
  • 2
    OpenAdapt

    Open Source Generative Process Automation

    OpenAdapt is the open source software adapter between Large Multimodal Models (LMMs) and traditional desktop and web Graphical User Interfaces (GUIs). OpenAdapt learns to automate your desktop and web workflows by observing your demonstrations. Spend less time on repetitive tasks and more on work that truly matters. Boost team productivity in HR operations. Automate candidate sourcing using LinkedIn Recruiter, LinkedIn Talent Solutions, GetProspect, Reply.io, outreach.io, Gmail/Outlook...
    Downloads: 3 This Week
  • 3
    Flask App Builder

    Simple and rapid application development framework

    Simple and rapid application development framework, built on top of Flask. Includes detailed security, auto CRUD generation for your models, Google charts, and much more. Automatic permissions lookup, based on exposed methods. Inserts into the database all the detailed permissions possible in your application. Public (no authentication needed) and private permissions. Role-based permissions. Authentication support for OpenID, database, and LDAP. Support for self-user registration. Automatic,...
    Downloads: 3 This Week
  • 4
    Scout Suite

    Multi-cloud security auditing tool

    Scout Suite is an open-source multi-cloud security-auditing tool, which enables security posture assessment of cloud environments. Using the APIs exposed by cloud providers, Scout Suite gathers configuration data for manual inspection and highlights risk areas. Rather than going through dozens of pages on the web consoles, Scout Suite presents a clear view of the attack surface automatically. Scout Suite was designed by security consultants/auditors. It is meant to provide a point-in-time...
    Downloads: 3 This Week
  • 5
    crawley

    The unix-way web crawler

    Crawls web pages and prints any link it can find. Fast HTML SAX parser (powered by golang.org/x/net/html). Small (below 1500 SLOC), idiomatic, 100% test-covered codebase. Grabs most useful resource URLs (pics, videos, audios, forms, etc.). Found URLs are streamed to stdout and guaranteed to be unique (with fragments omitted). Scan depth (limited by starting host and path; 0 by default) can be configured. Can crawl rules and sitemaps from robots.txt. Brute mode - scan HTML comments for URLs...
    Downloads: 1 This Week
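The behavior described for crawley (collect links from a page, drop fragments, emit each URL once) can be illustrated with a short Python sketch. This is not crawley's code, which is written in Go; it is a minimal approximation using only the Python standard library, and the names `LinkCollector` and `unique_links` are hypothetical, chosen for this example.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urldefrag


class LinkCollector(HTMLParser):
    """Collects href/src attribute values in document order (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.found = []

    def handle_starttag(self, tag, attrs):
        # Assumption: only href and src attributes are treated as links here.
        for name, value in attrs:
            if name in ("href", "src") and value:
                self.found.append(value)


def unique_links(html, base_url):
    """Return absolute URLs found in html, fragments dropped, in first-seen order."""
    collector = LinkCollector()
    collector.feed(html)
    seen = set()
    result = []
    for raw in collector.found:
        # Resolve relative links against the page URL, then strip "#fragment".
        absolute, _fragment = urldefrag(urljoin(base_url, raw))
        if absolute and absolute not in seen:
            seen.add(absolute)
            result.append(absolute)
    return result
```

Calling `unique_links(page_html, page_url)` on a fetched page and printing each result on its own line would roughly mimic the stream-to-stdout behavior; fetching, depth limits, and robots.txt handling are deliberately left out of this sketch.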
  • 6
    Reflex Dev

    Web apps in pure Python

    Reflex is a Python framework for building full-stack web apps entirely in Python—without writing JavaScript for the frontend. It provides fast live reloads, built-in state management, deployment tooling, and optional AI-powered scaffolding to accelerate development.
    Downloads: 2 This Week
  • 7
    ChatGPT UI

    A ChatGPT web client that supports multiple users, and databases

    A ChatGPT web client that supports multiple users, multiple database connections for persistent data storage, and i18n. Provides Docker images and quick deployment scripts. Supports the gpt-4 model, which you can select in the "Model Parameters" of the front-end. The GPT-4 model requires whitelist access from OpenAI. Adds web search capability to generate more relevant and up-to-date answers from ChatGPT. This feature is off by default; you can turn it on in `Chat->Settings` in the admin...
    Downloads: 3 This Week
  • 8
    django CMS

    Easy-to-use and developer-friendly enterprise CMS powered by Django

    Create modern websites that content editors love. django CMS was originally conceived by web developers frustrated with the technical and security limitations of other systems. Its lightweight core makes it easy to integrate with other software and put to use immediately, while its ease of use makes it the go-to choice for content managers, content editors and website admins. Developers can integrate other existing Django applications rapidly, or build brand new compatible apps that take...
    Downloads: 3 This Week
  • 9
    Synapse Machine Learning

    Simple and distributed Machine Learning

    ..., and OpenCV. These tools enable powerful and highly-scalable predictive and analytical models for a variety of data sources. SynapseML also brings new networking capabilities to the Spark Ecosystem. With the HTTP on Spark project, users can embed any web service into their SparkML models. For production-grade deployment, the Spark Serving project enables high throughput, sub-millisecond latency web services, backed by your Spark cluster.
    Downloads: 3 This Week
  • 10
    jQuery Terminal

    JavaScript library for creating web-based terminals

    jQuery Terminal is a JavaScript library for creating command-line interpreters in your applications. You can use it to create interactive web-based terminal applications on your website, with commands that you define yourself, either on the server or in the browser's JavaScript. It can automatically call a JSON-RPC service when the user types a command. Alternatively, you can provide an object with methods; each method will be invoked on the user's command...
    Downloads: 3 This Week
  • 11
    Confluent's .NET Client for Apache Kafka

    Confluent's Apache Kafka .NET client

    confluent-kafka-dotnet is Confluent's .NET client for Apache Kafka and the Confluent Platform. Confluent-kafka-dotnet is a lightweight wrapper around librdkafka, a finely tuned C client. There are a lot of details to get right when writing an Apache Kafka client. We get them right in one place (librdkafka) and leverage this work across all of our clients (also confluent-kafka-python and confluent-kafka-go). Confluent, founded by the creators of Kafka, is building a streaming platform...
    Downloads: 3 This Week
  • 12
    Substra

    Low-level Python library used to interact with a Substra network

    An open-source framework supporting privacy-preserving, traceable federated learning and machine learning orchestration. Offers a Python SDK, high-level FL library (SubstraFL), and web UI to define datasets, models, tasks, and orchestrate secure, auditable collaborations.
    Downloads: 2 This Week
  • 13
    DrissionPage

    Python based web automation tool. Powerful and elegant

    DrissionPage is a Python-based automation framework that blends the capabilities of Selenium for browser automation with Requests-HTML for fast, headless web data extraction. It enables seamless switching between browser-controlled and headless HTTP sessions within the same interface. Ideal for web scraping, testing, and automation, DrissionPage is lightweight and highly efficient, offering more flexibility than standard Selenium or Requests usage alone.
    Downloads: 2 This Week
  • 14
    redis-py

    Redis Python client

    redis-py is the official Python client for interacting with Redis, the in-memory data structure store. It supports all Redis commands and data types, making it easy to build caching, messaging, or real-time analytics features in Python applications. With both synchronous and asyncio support, redis-py is suited for modern Python projects and integrates smoothly into web frameworks, task queues, and backend services.
    Downloads: 2 This Week
  • 15
    ngx_waf

    Handy, High performance, ModSecurity compatible Nginx firewall module

    Handy, high-performance Nginx firewall module. Features include blacklists and whitelists for IPs or IP ranges, URI blacklists and whitelists, request body blacklists, and more. Directives and rules are easy to write and read. IP detection is a constant-time operation. Most of the remaining inspections use caching to improve performance. It is compatible with ModSecurity's rules, so you can use the OWASP ModSecurity Core Rule Set. Supports verifying Google, Bing, Baidu and Yandex crawlers and allowing them...
    Downloads: 1 This Week
  • 16
    Notte

    Open-source browser-using agents

    Notte is an open-source browser framework that enables the development and deployment of web-based AI agents. It introduces a perception layer that transforms web pages into structured, navigable maps described in natural language, allowing agents to interact with the internet more effectively. Notte is designed for building scalable and efficient browser-based AI applications.
    Downloads: 2 This Week
  • 17
    Browser Use MCP Server

    Browse the web, directly from Cursor etc.

    A browser automation server implementing the Model Context Protocol, designed to allow AI assistants to browse the web directly from applications like Cursor. It supports natural language commands for web navigation and interaction.
    Downloads: 2 This Week
  • 18
    TinyStatus

    Tiny status page generated by a Python script

    TinyStatus is a simple, customizable status page generator that allows you to monitor the status of various services and display them on a clean, responsive web page.
    Downloads: 2 This Week
  • 19
    OAuthLib

    A generic, spec-compliant, thorough implementation of the OAuth

    A generic, spec-compliant, thorough implementation of the OAuth request-signing logic for Python 3.8+. OAuthLib is a framework that implements the logic of OAuth1 or OAuth2 without assuming a specific HTTP request object or web framework. Use it to graft OAuth client support onto your favorite HTTP library, or to add support to your favorite web framework. If you're a maintainer of such a library, write a thin veneer on top of OAuthLib and get OAuth support for very little effort.
    Downloads: 2 This Week
  • 20
    openvpn-monitor

    openvpn-monitor is a web based OpenVPN monitor

    openvpn-monitor is a simple Python program that generates HTML displaying the status of an OpenVPN server, including all current connections. It uses the OpenVPN management console. It typically runs on the same host as the OpenVPN server, although it does not need to. The generated page shows current connection information, such as users, location, and data transferred.
    Downloads: 2 This Week
  • 21
    Basketball Reference

    NBA Stats API via Basketball Reference

    Basketball Reference is a great site (especially for a basketball stats nut like me), and hopefully, they don't get too pissed off at me for creating this. I initially wrote this library as an exercise for creating my first PyPi package, hope you find it valuable! This library was created for another Python project where I was trying to estimate an NBA player's productivity. A lot of sports-related APIs are expensive - luckily, Basketball Reference provides a free service which can be scraped...
    Downloads: 2 This Week
  • 22
    Auto Bangumi

    Automated Bangumi episode downloader and organizer with Web UI

    Auto_Bangumi is a fully automated tool for downloading, organizing, and tracking anime (Bangumi) episodes using RSS feeds and download clients like qBittorrent. It offers a modern Web UI for managing subscriptions, custom filtering rules, automatic file renaming, and subtitle matching. Designed for anime fans, it streamlines the process of staying up-to-date with seasonal shows by integrating feed parsing, downloading, and library organization.
    Downloads: 2 This Week
  • 23
    AGiXT

    AGiXT is a dynamic AI Automation Platform

    AGiXT is a dynamic Artificial Intelligence Automation Platform engineered to orchestrate efficient AI instruction management and task execution across a multitude of providers. Our solution infuses adaptive memory handling with a broad spectrum of commands to enhance AI's understanding and responsiveness, leading to improved task completion. The platform's smart features, like Smart Instruct and Smart Chat, seamlessly integrate web search, planning strategies, and conversation continuity...
    Downloads: 2 This Week
  • 24
    SeleniumBase

    A framework for browser automation and testing with Selenium

    SeleniumBase automatically handles common WebDriver actions such as launching web browsers before tests, saving screenshots during failures, and closing web browsers after tests. SeleniumBase lets you customize test runs from the command line. SeleniumBase uses simple syntax for commands. pytest includes automatic test discovery. If you don't specify a specific file or folder to run, pytest will automatically search through all subdirectories for tests to run. No More Flaky Tests! SeleniumBase...
    Downloads: 2 This Week
  • 25
    Full Stack FastAPI and PostgreSQL

    Full stack, modern web application generator

    Generate a backend and frontend stack using Python, including interactive API documentation. Production ready Python web server using Uvicorn and Gunicorn. Very high performance, on par with NodeJS and Go (thanks to Starlette and Pydantic). Great editor support. Completion everywhere. Less time debugging. Designed to be easy to use and learn. Less time reading docs. Minimize code duplication. Multiple features from each parameter declaration. Get production-ready code. With automatic...
    Downloads: 2 This Week