Search Results for "python web crawler" - Page 8

Showing 2703 open source projects for "python web crawler"

  • 1
    Gunicorn

    WSGI HTTP Server for UNIX, fast clients and sleepy applications

    Gunicorn 'Green Unicorn' is a Python WSGI HTTP Server for UNIX. It uses a pre-fork worker model. The Gunicorn server is broadly compatible with various web frameworks, simply implemented, light on server resources, and fairly speedy. You can run Gunicorn from the command line or integrate it with popular frameworks like Django, Pyramid, or TurboGears. For deploying Gunicorn in production see Deploying Gunicorn. After installing Gunicorn you will have access to the command line script gunicorn. Gunicorn...
    Downloads: 1 This Week
  • 2
    Online Boutique

    Sample cloud-first application with 10 microservices

    Online Boutique is a cloud-first microservices demo application. The application is a web-based e-commerce app where users can browse items, add them to the cart, and purchase them. Google uses this application to demonstrate the use of technologies like Kubernetes, GKE, Istio, Stackdriver, and gRPC. This application works on any Kubernetes cluster, like Google Kubernetes Engine (GKE). It’s easy to deploy with little to no configuration.
    Downloads: 1 This Week
  • 3
    LinkChecker

    Check links in web documents or full websites

    LinkChecker is a free, GPL-licensed website validator. LinkChecker checks links in web documents or full websites. It runs on Python 3 systems, requiring Python 3.8 or later. The version in the pip repository may be old; to find out how to get the latest code, plus platform-specific information and other advice, see doc/install.txt in the source code archive. If you do not want to install any additional libraries or dependencies, you can use the Docker image, which is published on GitHub Packages.
    Downloads: 1 This Week
  • 4
    Promgen

    Promgen is a configuration file generator for Prometheus

    Promgen is a configuration file generator for Prometheus. Promgen is a web application written with Django and can help you do several jobs. The primary management UI is a Django application and many of the concepts that apply to a typical Django application will apply to Promgen. Configure Prometheus to load the target file from Promgen and configure AlertManager to send notifications back to Promgen. Arbitrary Django settings can be set for the Promgen web app by adding those under...
    Downloads: 1 This Week
  • 5
    Scrapy-Redis

    Redis-based components for Scrapy

    You can start multiple spider instances that share a single Redis queue. Best suited for broad multi-domain crawls. Scraped items get pushed into a Redis queue, meaning that you can start as many post-processing processes as needed, all sharing the items queue. Scheduler + Duplication Filter, Item Pipeline, Base Spiders. The default requests serializer is pickle, but it can be changed to any module with loads and dumps functions. Note that pickle is not compatible between Python versions. Version 0.3...
    Downloads: 1 This Week
  • 6
    You-Get

    Dumb downloader that scrapes the web

    You-Get is a small command-line utility for downloading media (video, audio and images) from the Web when there are no other means to do so. It can download video and audio files from such popular web sites as YouTube, Twitter, Niconico, Vimeo, Flickr, Instagram and a whole lot more. You-Get is a great option for when you want to enjoy your favorite videos, audio or images from the internet without having to open any web browsers or get interrupted by ads. It’s also a good choice for when...
    Downloads: 1 This Week
  • 7
    buku

    Personal mini-web in text

    buku is a powerful bookmark manager and a personal textual mini-web. For those who prefer the GUI, bukuserver exposes a browsable front-end on a local web host server. When I started writing it, I couldn't find a flexible command-line solution with a private, portable, merge-able database along with seamless GUI integration. Hence, buku. buku can import bookmarks from the browser(s) or fetch the title, tags and description of a URL from the web. Use your favorite editor to add, compose...
    Downloads: 1 This Week
  • 8
    Modoboa

    Mail hosting made simple

    Modoboa is a mail hosting and management platform including a modern and simplified Web User Interface. It provides useful components such as an administration panel and webmail. Modoboa integrates with well-known software such as Postfix or Dovecot. A SQL database (MySQL, PostgreSQL or SQLite) is used as a central point of communication between all components. Modoboa is developed with modularity in mind, so expanding it is really easy. Actually, all current features are extensions. It is written...
    Downloads: 1 This Week
  • 9
    GRR

    GRR Rapid Response, remote live forensics for incident response

    GRR Rapid Response is an incident response framework focused on remote live forensics. It consists of a Python client (agent) that is installed on target systems, and Python server infrastructure that can manage and talk to clients. The goal of GRR is to support forensics and investigations in a fast, scalable manner to allow analysts to quickly triage attacks and perform analysis remotely. The GRR client is deployed on systems that one might want to investigate. On every such system, once deployed...
    Downloads: 1 This Week
  • 10
    FastUI

    Build better UIs faster

    FastUI is a library that lets developers build interactive user interfaces for FastAPI applications using Pydantic models. It automatically generates frontend components based on data schemas and endpoint logic, reducing the need for manual UI development. Designed to be type-safe, reactive, and fast, FastUI streamlines the creation of web dashboards, admin panels, and internal tools within a FastAPI backend.
    Downloads: 1 This Week
  • 11
    Hyperledger Cello

    Operating System for Enterprise Blockchain

    Hyperledger Cello is a blockchain operation and provisioning system designed to automate the deployment, management, and scaling of Hyperledger Fabric networks. As part of the Hyperledger project under the Linux Foundation, Cello aims to offer Blockchain-as-a-Service (BaaS) by abstracting the complexity of infrastructure setup for consortiums and enterprises. It provides a dashboard, APIs, and orchestration tools to help users create, monitor, and manage blockchain nodes, ledgers, and...
    Downloads: 1 This Week
  • 12
    Alerta

    Alerta monitoring system

    ... can be queried from the command line or viewed in a slick web console optimized for desktop, tablet, and mobile. User logins can be added using Google, GitHub or GitLab OAuth and programmatic access is managed using API keys.
    Downloads: 1 This Week
  • 13
    Slack Machine

    A simple, yet powerful and extendable Slack bot

    Slack Machine is a simple, yet powerful and extendable Slack bot framework. More than just a bot, Slack Machine is a framework that helps you develop your Slack workspace into a ChatOps powerhouse. Slack Machine is built with an intuitive plugin system that lets you build bots quickly but also allows for easy code organization.
    Downloads: 1 This Week
  • 14
    Recap

    Recap tracks and transforms schemas across your whole application

    Recap is a schema language and multi-language toolkit to track and transform schemas across your whole application. Your data passes through web services, databases, message brokers, and object stores. Recap describes these schemas in a single language, regardless of which system your data passes through. Recap schemas can be defined in YAML, TOML, JSON, XML, or any other compatible language.
    Downloads: 1 This Week
  • 15
    CodeChecker

    CodeChecker is an analyzer tooling, defect database

    ... configuration and forming the corresponding clang analyzer invocations. Incremental analysis: Only the changed files and their dependencies need to be reanalyzed. False positive suppression with a possibility to add review comments. Result visualization in command line or in static HTML. Web application for viewing discovered code defects with a streamlined, easy experience (with PostgreSQL, or SQLite backend).
    Downloads: 1 This Week
  • 16
    HTTPie

    A CLI, cURL-like tool for humans

    HTTPie is a modern command-line HTTP client that makes CLI interaction with web services as human-friendly as possible. It offers a plethora of friendly features that make it an excellent curl alternative. It is equipped with an intuitive UI, JSON support, syntax highlighting and so much more. HTTPie gives a single http command for sending arbitrary HTTP requests with a simple, natural syntax, and displays the results in formatted, colorized terminal output. HTTPie can be installed on macOS, Windows...
    Downloads: 1 This Week
  • 17
    PyTorch Geometric Temporal

    Spatiotemporal Signal Processing with Neural Machine Learning Models

    The library consists of various dynamic and temporal geometric deep learning, embedding, and spatio-temporal regression methods from a variety of published research papers. Moreover, it comes with an easy-to-use dataset loader, train-test splitter and temporal snapshot iterator for dynamic and temporal graphs. The framework naturally provides GPU support. It also comes with a number of benchmark datasets from the epidemiological forecasting, sharing economy, energy production and web traffic...
    Downloads: 1 This Week
  • 18
    Maltrail

    Malicious traffic detection system

    Maltrail is a malicious traffic detection system, utilizing publicly available (black)lists containing malicious and/or generally suspicious trails, along with static trails compiled from various AV reports and custom user-defined lists, where a trail can be anything from a domain name, URL, or IP address (e.g. 185.130.5.231 for the known attacker) to an HTTP User-Agent header value (e.g. sqlmap for automatic SQL injection and database takeover tool). Also, it uses (optional) advanced heuristic...
    Downloads: 1 This Week
  • 19
    bookdown

    Authoring Books and Technical Documents with R Markdown

    ... for languages other than R, including C/C++, Python, and SQL, etc. LaTeX equations, theorems, and proofs work for all output formats. Can be published to GitHub, bookdown.org, and any web servers. Integrated with the RStudio IDE. The easiest way to start a new Bookdown project is from within RStudio IDE. Go to File, New Project, New Directory, Book project using bookdown.
    Downloads: 1 This Week
  • 20
    Letterboxd Recommendations

    Scraping publicly-accessible Letterboxd data for movie recommendations

    Scraping publicly-accessible Letterboxd data and creating a movie recommendation model with it that can generate recommendations when provided with a Letterboxd username. A user's "star" ratings are scraped from their Letterboxd profile and assigned numerical ratings from 1 to 10 (accounting for half stars). Their ratings are then combined with a sample of ratings from the top 4000 most active users on the site to create a collaborative filtering recommender model using singular value...
    Downloads: 1 This Week
  • 21
    Flask-Limiter

    Rate Limiting extension for Flask

    Flask-Limiter provides rate-limiting features to Flask applications. It allows configuring various backends to persist the rate limits, which are provided by the limits library. Sponsored by Zuplo - fully-managed API Gateway with rate limiting, authentication, and more. Add rate limiting to your API in minutes, try it at zuplo.com. Test it out. The fast endpoint respects the default rate limit while the slow endpoint uses the decorated one. ping has no rate limit associated with it. By adding...
    Downloads: 1 This Week
  • 22
    Weblate

    Web based localization tool with tight version control integration

    Weblate is a copylefted libre software web-based continuous localization system, used by over 2,500 libre projects and companies in more than 165 countries. Hosted service and standalone tool with tight version control integration. Simple and clean user interface, propagation of translations across components, quality checks and automatic linking to source files. There is infrastructure...
    Downloads: 1 This Week
  • 23
    Werkzeug

    The comprehensive WSGI web application library

    Werkzeug is a comprehensive WSGI web application library. It began as a simple collection of various utilities for WSGI applications and has become one of the most advanced WSGI utility libraries. Werkzeug doesn’t enforce any dependencies. It is up to the developer to choose a template engine, database adapter, and even how to handle requests. Includes an interactive debugger that allows inspecting stack traces and source code in the browser with an interactive interpreter for any frame...
    Downloads: 1 This Week
  • 24
    CWhy

    Explains and suggests fixes for compile-time errors for C, C++, C#, Go

    Explains and suggests fixes for compiler error messages for a wide range of programming languages, including C, C++, C#, Go, Java, LaTeX, PHP, Python, Ruby, Rust, Swift, and TypeScript. CWhy needs to be connected to an OpenAI account or an Amazon Web Services account. Your account will need to have a positive balance for this to work (check your OpenAI balance). CWhy currently defaults to GPT-4, and falls back to GPT-3.5-turbo if a request error occurs. For the newest and best model (GPT-4...
    Downloads: 1 This Week
  • 25
    Marvin

    A batteries-included library for building AI-powered software

    Meet Marvin: a batteries-included library for building AI-powered software. Marvin's job is to integrate AI directly into your codebase by making it look and feel like any other function. Marvin introduces a new concept called AI Functions. These functions differ from conventional ones in that they don’t rely on source code, but instead generate their outputs on-demand through AI. With AI functions, you don't have to write complex code for tasks like extracting entities from web pages, scoring...
    Downloads: 1 This Week