Search Results for "python web crawler" - Page 4

Showing 3141 open source projects for "python web crawler"

  • 1
    Listen 1

    One for all free music in China (Chrome extension)

    ... Download the Windows zip file and choose the 32-bit or 64-bit version to match your system. The original web player uses a web server written in Python. It can run directly on a server, or you can use the packaged Windows and Mac versions to run the web server locally. The desktop version for Windows, Mac, and Linux uses the Electron framework and is built on the JS library from the Listen 1 Chrome plug-in.
    Downloads: 7 This Week
  • 2
    Posting

    The modern API client that lives in your terminal

    Posting is an HTTP API client that runs in the terminal. It provides a keyboard-driven text interface for composing and sending requests, and it stores requests as local files that can be kept under version control.
    Downloads: 6 This Week
  • 3
    Indico

    A feature-rich event management system

    The effortless open-source tool for event organization, archival, and collaboration. Event-organization workflow that fits lectures, meetings, workshops, and conferences. A feature-rich event management system, made @ CERN, the place where the Web was born. A powerful and flexible hierarchical content management system for events, a full-blown conference organization workflow with call for Abstracts and abstract reviewing modules; flexible registration form creation and configuration...
    Downloads: 6 This Week
  • 4
    Pyodide

    Pyodide is a Python distribution for the browser and Node.js

    Pyodide brings the Python runtime to the browser by compiling Python and its scientific libraries to WebAssembly. It allows developers to run Python code directly in web browsers without a server, supporting packages like NumPy, Pandas, and Matplotlib. Pyodide opens up new possibilities for interactive data analysis, scientific computing, and educational tools in web environments, all while integrating seamlessly with JavaScript.
    Downloads: 4 This Week
  • 5
    CyberScraper 2077

    A powerful web scraper powered by LLMs | OpenAI, Gemini & Ollama

    CyberScraper 2077 is not just another web scraping tool – it's a glimpse into the future of data extraction. Born from the neon-lit streets of a cyberpunk world, this AI-powered scraper uses OpenAI, Gemini and LocalLLM Models to slice through the web's defenses, extracting the data you need with unparalleled precision and style.
    Downloads: 5 This Week
  • 6
    Maltrail

    Malicious traffic detection system

    Maltrail is a malicious traffic detection system, utilizing publicly available (black)lists containing malicious and/or generally suspicious trails, along with static trails compiled from various AV reports and custom user-defined lists, where a trail can be anything from a domain name, URL, or IP address (e.g. 185.130.5.231 for a known attacker) to an HTTP User-Agent header value (e.g. sqlmap for the automatic SQL injection and database takeover tool). Also, it uses (optional) advanced heuristic...
    Downloads: 6 This Week
  • 7
    GPT Researcher

    LLM based autonomous agent that does online comprehensive research

    Say Hello to GPT Researcher, your AI agent for rapid insights and comprehensive research. GPT Researcher is the leading autonomous agent that takes care of everything from accurate source gathering to organization of research results.
    Downloads: 6 This Week
  • 8
    CKAN

    CKAN is an open-source DMS for powering data hubs

    CKAN is the world’s leading open-source data portal platform. CKAN makes it easy to publish, share and work with data. It's a data management system that provides a powerful platform for cataloging, storing and accessing datasets with a rich front-end, full API (for both data and catalog), visualization tools and more. CKAN is used by national and regional government organizations throughout the European Union, the Americas, Asia, and Oceania to power a variety of official and community data...
    Downloads: 6 This Week
  • 9
    AWX

    A web-based user interface built on top of Ansible

    AWX provides a web-based user interface, REST API, and task engine built on top of Ansible. It is one of the upstream projects for Red Hat Ansible Automation Platform. Starting in version 18.0, the AWX Operator is the preferred way to install AWX. AWX can also alternatively be installed and run in Docker, but this install path is only recommended for development/test-oriented deployments, and has no official published release. Uses naming and structure consistent with the AWX HTTP API. Provides...
    Downloads: 6 This Week
  • 10
    crwlr

    Library for Rapid (Web) Crawler and Scraper Development

    This library provides a framework and many ready-to-use "steps" that you can use as building blocks for your own crawlers and scrapers. Before diving into the library, let's have a look at the terms crawling and scraping. For most real-world use cases, those two things go hand in hand, which is why this library helps with and combines both. A (web) crawler is a program that (down)loads documents and follows the links in them to load those documents as well. A crawler could...
    Downloads: 1 This Week
  • 11
    BlackSheep

    Fast ASGI web framework for Python

    BlackSheep is an asynchronous web framework to build event-based web applications with Python. A rich code API, based on dependency injection and inspired by Flask and ASP.NET Core. A typing-friendly codebase, which enables a comfortable development experience thanks to hints when coding with IDEs. Built-in generation of OpenAPI Documentation, supporting version 3, YAML, and JSON. A cross-platform framework, using the most modern versions of Python. BlackSheep supports automatic binding...
    Downloads: 4 This Week
  • 12
    spaCy models

    Models for the spaCy Natural Language Processing (NLP) library

    spaCy is designed to help you do real work, to build real products, or gather real insights. The library respects your time, and tries to avoid wasting it. It's easy to install, and its API is simple and productive. spaCy excels at large-scale information extraction tasks. It's written from the ground up in carefully memory-managed Cython. If your application needs to process entire web dumps, spaCy is the library you want to be using. Since its release in 2015, spaCy has become an industry...
    Downloads: 6 This Week
  • 13
    X-Crawl

    Flexible Node.js AI-assisted crawler library

    A high-performance web crawling and scraping framework for Node.js, designed for large-scale data extraction.
    Downloads: 2 This Week
  • 14
    uvicorn

    An ASGI web server, for Python

    Uvicorn is an ASGI web server implementation for Python. Until recently Python has lacked a minimal low-level server/application interface for async frameworks. The ASGI specification fills this gap, and means we're now able to start building a common set of tooling usable across all async frameworks. Uvicorn currently supports HTTP/1.1 and WebSockets.
    Downloads: 4 This Week
  • 15
    Klavis AI

    Open Source MCP integration for AI applications

    Klavis AI is an open-source platform that simplifies integrating Model Context Protocol (MCP) servers into AI applications. It provides hosted, secure MCP servers with built-in OAuth support, eliminating the need for complex authentication management and client-side code. Klavis AI enables developers to connect AI agents to various tools and services efficiently, facilitating scalable and secure AI deployments.
    Downloads: 5 This Week
  • 16
    Gerapy

    Distributed Crawler Management Framework Based on Scrapy

    Distributed Crawler Management Framework Based on Scrapy, Scrapyd, Scrapyd-Client, Scrapyd-API, Django and Vue.js. Anyone who has written crawlers in Python has probably used Scrapy. Scrapy is indeed a very powerful crawler framework, with high crawling efficiency and good scalability, and it is practically an essential tool for developing crawlers in Python. If you use Scrapy, you can of course crawl from your own host, but when the crawl is very large, we can’t run...
    Downloads: 0 This Week
  • 17
    FastHTML

    The fastest way to create an HTML app

    Built on solid web foundations, not the latest fads - with FastHTML you can get started on anything from simple dashboards to scalable web applications in minutes.
    Downloads: 4 This Week
  • 18
    microStudio

    Free, open source game engine online

    ... as a guest. microStudio includes all you need to write code, create sprites and maps for your 2D game. All from your web browser. Your project is stored in the cloud, and accessible from anywhere. Write your game code in microScript, a simple language inspired by Lua. The documentation is always there to help. Create cool demos in just a few lines of code. microScript shines by its simplicity and interactivity. But you can also code in JavaScript, Python, or Lua if you prefer.
    Downloads: 6 This Week
  • 19
    Eel

    A Python library for making simple Electron-like HTML/JS GUI apps

    Eel is a little Python library for making simple Electron-like offline HTML/JS GUI apps, with full access to Python capabilities and libraries. Eel hosts a local webserver, then lets you annotate functions in Python so that they can be called from JavaScript, and vice versa. Eel is designed to take the hassle out of writing short and simple GUI applications. If you are familiar with Python and web development, probably just jump to this example which picks random file names out of the given...
    Downloads: 4 This Week
  • 20
    JumpServer

    Manage assets on different clouds at the same time

    The JumpServer bastion host complies with the 4A specification for operations and maintenance security auditing. It has a low barrier to entry, with quick online download and installation. All you need is a browser for a full Web Terminal experience. It easily supports massive concurrent access. One system manages assets on different clouds at the same time. Audit recordings are stored in the cloud and will never be lost. One system can be used by multiple subsidiaries and departments at the same time. Prevent identity fraud...
    Downloads: 5 This Week
  • 21
    Khoj

    An AI personal assistant for your digital brain

    Get more done with your open-source AI personal assistant. Khoj is a desktop application to search and chat with your notes, documents, and images. It is an offline-first, open-source AI personal assistant that is accessible from Emacs, Obsidian or your Web browser. Khoj is a thinking tool that is transparent, fun, and easy to engage with. You can build faster and better by using Khoj to search and reason across all your data sources. Khoj learns from your notes and documents to function...
    Downloads: 5 This Week
  • 22
    WAFW00F

    WAFW00F allows one to identify and fingerprint Web App Firewall

    The Web Application Firewall Fingerprinting Tool. Sends a normal HTTP request and analyses the response; this identifies a number of WAF solutions. If that is not successful, it sends a number of (potentially malicious) HTTP requests and uses simple logic to deduce which WAF it is. If that is also not successful, it analyses the responses previously returned and uses another simple algorithm to guess if a WAF or security solution is actively responding to our attacks. For further details, check...
    Downloads: 4 This Week
  • 23
    SimpleLogin

    The SimpleLogin back-end

    With email aliases, you can be anonymous online and protect your inbox against spam and phishing. Open-source. Made and hosted in Europe. Receive and send emails anonymously. Next time a website asks for your email address, give an alias instead of your real email. Emails sent to an alias are instantly forwarded to your inbox without the sender knowing anything. Just hit "Reply" if you want to reply to a forwarded email: the reply is sent from your alias and your real email stays hidden...
    Downloads: 5 This Week
  • 24
    Flagsmith

    Open source feature flagging and remote config service

    Release features with confidence; manage feature flags across web, mobile, and server-side applications. Use our hosted API, deploy to your own private cloud, or run on-premises. Flagsmith provides an all-in-one platform for developing, implementing, and managing your feature flags. Whether you are moving off an in-house solution or using toggles for the first time, you will be amazed by the power and efficiency gained by using Flagsmith. Flagsmith makes it easy to create and manage feature...
    Downloads: 5 This Week
  • 25
    Memvid

    Video-based AI memory library. Store millions of text chunks in MP4

    Memvid encodes text chunks as QR codes within MP4 frames to build a portable “video memory” for AI systems. This innovative approach uses standard video containers and offers millisecond-level semantic search across large corpora with dramatically less storage than vector DBs. It's self-contained—no DB needed—and supports features like PDF indexing, chat integration, and cloud dashboards.
    Downloads: 4 This Week
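
For readers comparing the crawler projects listed above, the definition quoted in the crwlr entry (a program that loads documents and follows the links in them to load those as well) can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the implementation of any listed project: the names `LinkExtractor`, `crawl`, `fetch`, and `PAGES` are hypothetical, and the "site" is an in-memory dictionary so the sketch runs without network access.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkExtractor(HTMLParser):
    """Collects the href value of every <a> tag in a document."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def crawl(start_url, fetch, max_pages=100):
    """Breadth-first crawl: load a document, then follow its links.

    `fetch` is a hypothetical callable that takes a URL and returns the
    page HTML as a string, or None if the page cannot be loaded.
    """
    seen = {start_url}          # URLs already queued, to avoid loops
    queue = deque([start_url])  # URLs waiting to be loaded
    visited = []                # URLs successfully loaded, in order
    while queue and len(visited) < max_pages:
        url = queue.popleft()
        html = fetch(url)
        if html is None:
            continue
        visited.append(url)
        parser = LinkExtractor()
        parser.feed(html)
        for href in parser.links:
            absolute = urljoin(url, href)  # resolve relative links
            if absolute not in seen:
                seen.add(absolute)
                queue.append(absolute)
    return visited


# Hypothetical in-memory pages standing in for a real website.
PAGES = {
    "http://example.test/": '<a href="/a">A</a> <a href="/b">B</a>',
    "http://example.test/a": '<a href="/b">B</a>',
    "http://example.test/b": '<a href="/">home</a> <a href="/missing">x</a>',
}

order = crawl("http://example.test/", PAGES.get)
print(order)
```

To crawl a real site, `fetch` would be replaced by a function that performs an HTTP request. A production crawler would also need to respect robots.txt, rate limits, and domain restrictions, all of which this sketch leaves out.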