Showing 166 open source projects for "python web crawler"

View related business solutions
  • Our Free Plans just got better! | Auth0 Icon
    Our Free Plans just got better! | Auth0

    With up to 25k MAUs and unlimited Okta connections, our Free Plan lets you focus on what you do best—building great apps.

    You asked, we delivered! Auth0 is excited to expand our Free and Paid plans to include more options so you can focus on building, deploying, and scaling applications without having to worry about your security. Auth0 now, thank yourself later.
    Try free now
  • Gen AI apps are built with MongoDB Atlas Icon
    Gen AI apps are built with MongoDB Atlas

    Build gen AI apps with an all-in-one modern database: MongoDB Atlas

    MongoDB Atlas provides built-in vector search and a flexible document model so developers can build, scale, and run gen AI apps without stitching together multiple databases. From LLM integration to semantic search, Atlas simplifies your AI architecture—and it’s free to get started.
    Start Free
  • 1
    ChatGLM-6B

    ChatGLM-6B

    ChatGLM-6B: An Open Bilingual Dialogue Language Model

    ChatGLM-6B is an open bilingual (Chinese + English) conversational language model based on the GLM architecture, with approximately 6.2 billion parameters. The project provides inference code, demos (command line, web, API), quantization support for lower memory deployment, and tools for finetuning (e.g., via P-Tuning v2). It is optimized for dialogue and question answering with a balance between performance and deployability in consumer hardware settings. Support for quantized inference...
    Downloads: 7 This Week
    Last Update:
    See Project
  • 2
    OpenAdapt

    OpenAdapt

    Open Source Generative Process Automation

    OpenAdapt is the open source software adapter between Large Multimodal Models (LMMs) and traditional desktop and web Graphical User Interfaces (GUIs). OpenAdapt learns to automate your desktop and web workflows by observing your demonstrations. Spend less time on repetitive tasks and more on work that truly matters. Boost team productivity in HR operations. Automate candidate sourcing using LinkedIn Recruiter, LinkedIn Talent Solutions, GetProspect, Reply.io, outreach.io, Gmail/Outlook, and...
    Downloads: 7 This Week
    Last Update:
    See Project
  • 3
    gpt-oss

    gpt-oss

    gpt-oss-120b and gpt-oss-20b are two open-weight language models

    gpt-oss is OpenAI’s open-weight family of large language models designed for powerful reasoning, agentic workflows, and versatile developer use cases. The series includes two main models: gpt-oss-120b, a 117-billion parameter model optimized for general-purpose, high-reasoning tasks that can run on a single H100 GPU, and gpt-oss-20b, a lighter 21-billion parameter model ideal for low-latency or specialized applications on smaller hardware. Both models use a native MXFP4 quantization for...
    Downloads: 8 This Week
    Last Update:
    See Project
  • 4
    Yandex Smart Home

    Yandex Smart Home

    Adds support for Yandex Smart Home (Alice voice assistant)

    Adds support for Yandex Smart Home (Alice voice assistant) into Home Assistant. The component allows you to add devices from Home Assistant to the Yandex smart home platform and manage them from any device with Alice. The component runs on Home Assistant version 2023.2 or later.
    Downloads: 4 This Week
    Last Update:
    See Project
  • Simple, Secure Domain Registration Icon
    Simple, Secure Domain Registration

    Get your domain at wholesale price. Cloudflare offers simple, secure registration with no markups, plus free DNS, CDN, and SSL integration.

    Register or renew your domain and pay only what we pay. No markups, hidden fees, or surprise add-ons. Choose from over 400 TLDs (.com, .ai, .dev). Every domain is integrated with Cloudflare's industry-leading DNS, CDN, and free SSL to make your site faster and more secure. Simple, secure, at-cost domain registration.
    Sign up for free
  • 5
    CogVideo

    CogVideo

    text and image to video generation: CogVideoX (2024) and CogVideo

    CogVideo is an open source text-/image-/video-to-video generation project that hosts the CogVideoX family of diffusion-transformer models and end-to-end tooling. The repo includes SAT and Diffusers implementations, turnkey demos, and fine-tuning pipelines (including LoRA) designed to run across a wide range of NVIDIA GPUs, from desktop cards (e.g., RTX 3060) to data-center hardware (A100/H100). Current releases cover CogVideoX-2B, CogVideoX-5B, and the upgraded CogVideoX1.5-5B variants, plus...
    Downloads: 10 This Week
    Last Update:
    See Project
  • 6
    ChatGLM3

    ChatGLM3

    ChatGLM3 series: Open Bilingual Chat LLMs | Open Source Bilingual Chat

    ...It keeps the series’ smooth dialog and low deployment cost while adding native tool use (function calling), a built-in code interpreter, and agent-style workflows. The family includes base and long-context variants (8K/32K/128K). The repo ships Python APIs, CLI and web demos (Gradio/Streamlit), an OpenAI-format API server, and a compact fine-tuning kit. Quantization (4/8-bit), CPU/MPS support, and accelerator backends (TensorRT-LLM, OpenVINO, chatglm.cpp) enable lightweight local or edge deployment.
    Downloads: 1 This Week
    Last Update:
    See Project
  • 7
    Composio

    Composio

    Composio equip's your AI agents & LLMs

    Empower your AI agents with Composio - a platform for managing and integrating tools with LLMs & AI agents using Function Calling. Equip your agent with high-quality tools & integrations without worrying about authentication, accuracy, and reliability in a single line of code.
    Downloads: 4 This Week
    Last Update:
    See Project
  • 8
    LLaVA

    LLaVA

    Visual Instruction Tuning: Large Language-and-Vision Assistant

    Visual instruction tuning towards large language and vision models with GPT-4 level capabilities.
    Downloads: 4 This Week
    Last Update:
    See Project
  • 9
    spaCy models

    spaCy models

    Models for the spaCy Natural Language Processing (NLP) library

    spaCy is designed to help you do real work, to build real products, or gather real insights. The library respects your time, and tries to avoid wasting it. It's easy to install, and its API is simple and productive. spaCy excels at large-scale information extraction tasks. It's written from the ground up in carefully memory-managed Cython. If your application needs to process entire web dumps, spaCy is the library you want to be using. Since its release in 2015, spaCy has become an industry...
    Downloads: 12 This Week
    Last Update:
    See Project
  • Build Securely on Azure with Proven Frameworks Icon
    Build Securely on Azure with Proven Frameworks

    Lay a foundation for success with Tested Reference Architectures developed by Fortinet’s experts. Learn more in this white paper.

    Moving to the cloud brings new challenges. How can you manage a larger attack surface while ensuring great network performance? Turn to Fortinet’s Tested Reference Architectures, blueprints for designing and securing cloud environments built by cybersecurity experts. Learn more and explore use cases in this white paper.
    Download Now
  • 10
    GPT All Star

    GPT All Star

    AI-powered code generation tool for scratch development of web apps

    AI-powered code generation tool for scratch development of web applications with a team collaboration of autonomous AI agents. This is a research project, and its primary value is to explore the possibility of autonomous AI agents.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 11
    Quadratic

    Quadratic

    Data science spreadsheet with Python & SQL

    Quadratic enables your team to work together on data analysis to deliver better results, faster. You already know how to use a spreadsheet, but you’ve never had this much power before. Quadratic is a Web-based spreadsheet application that runs in the browser and as a native app (via Electron). Our goal is to build a spreadsheet that enables you to pull your data from its source (SaaS, Database, CSV, API, etc) and then work with that data using the most popular data science tools today (Python, Pandas, SQL, JS, Excel Formulas, etc). ...
    Downloads: 5 This Week
    Last Update:
    See Project
  • 12
    nanochat

    nanochat

    The best ChatGPT that $100 can buy

    nanochat is a from-scratch, end-to-end “mini ChatGPT” that shows the entire path from raw text to a chatty web app in one small, dependency-lean codebase. The repository stitches together every stage of the lifecycle: tokenizer training, pretraining a Transformer on a large web corpus, mid-training on dialogue and multiple-choice tasks, supervised fine-tuning, optional reinforcement learning for alignment, and finally efficient inference with caching. Its north star is approachability and...
    Downloads: 1 This Week
    Last Update:
    See Project
  • 13
    GPT Researcher

    GPT Researcher

    LLM based autonomous agent that does online comprehensive research

    Say Hello to GPT Researcher, your AI agent for rapid insights and comprehensive research. GPT Researcher is the leading autonomous agent that takes care of everything from accurate source gathering to organization of research results.
    Downloads: 2 This Week
    Last Update:
    See Project
  • 14
    Scrapling

    Scrapling

    An undetectable, powerful, flexible, high-performance Python library

    Scrapling is a Python scraping framework built for the modern web, combining high-performance fetchers with a rapid parsing engine to handle dynamic sites and anti-bot countermeasures. It emphasizes being “undetectable,” flexible, and fast, offering an approachable API for both experienced scrapers and newcomers. The library targets the full scraping pipeline: session handling, fetching, rendering when needed, parsing, and export—while keeping ergonomics front and center. ...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 15
    Klavis AI

    Klavis AI

    MCP integration platforms for AI agents to use tools at any scale

    Klavis AI is a Y Combinator X25-backed open-source infrastructure platform that enables AI agents to reliably connect with external tools and services at scale through Model Context Protocol (MCP). Founded by ex-Google DeepMind and ex-Lyft engineers, Klavis provides 50+ production-ready MCP servers with enterprise OAuth support for GitHub, Slack, Gmail, Salesforce, Linear, Notion, and more. The flagship product Strata solves tool overload through progressive discovery, achieving +13% higher...
    Downloads: 9 This Week
    Last Update:
    See Project
  • 16
    Aider

    Aider

    Aider is AI pair programming in your terminal

    Aider is an open-source AI pair programming tool that runs directly in your terminal, allowing developers to collaborate with LLMs as if they were coding alongside a senior engineer. It maps your entire codebase to provide deep context, making it effective for both small scripts and large, complex projects. Aider supports leading cloud models like Claude 3.7 Sonnet, DeepSeek R1, OpenAI’s o1/o3/GPT-4o, as well as local LLMs for privacy-conscious workflows. Its Git integration ensures every...
    Downloads: 7 This Week
    Last Update:
    See Project
  • 17
    OpenHands

    OpenHands

    Open-source autonomous AI software engineer

    Welcome to OpenHands (formerly OpenDevin), an open-source autonomous AI software engineer who is capable of executing complex engineering tasks and collaborating actively with users on software development projects. Use AI to tackle the toil in your backlog, so you can focus on what matters: hard problems, creative challenges, and over-engineering your dotfiles We believe agentic technology is too important to be controlled by a few corporations. So we're building all our agents in the...
    Downloads: 5 This Week
    Last Update:
    See Project
  • 18
    Magentic UI

    Magentic UI

    A research prototype of a human-centered web agent

    Magentic-UI is a research prototype developed by Microsoft that serves as a human-centered interface powered by a multi-agent system. It enables users to automate complex web tasks, such as browsing, form filling, and data analysis, while maintaining control over the process. The system emphasizes transparency and user involvement, making it suitable for tasks requiring both automation and human oversight.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 19
    AskUI Vision Agent

    AskUI Vision Agent

    Enable AI to control your desktop, mobile and HMI devices

    AskUI’s Vision Agent is an automation framework that allows you—and AI agents—to control real desktops, mobile devices, and HMI systems by perceiving the UI and performing actions like clicking, typing, scrolling, and drag-and-drop. It is designed for multi-platform compatibility and supports multiple AI models so you can tailor perception and decision-making to your workload. The repository presents a feature overview, sample media, and frequent release notes, which show ongoing...
    Downloads: 4 This Week
    Last Update:
    See Project
  • 20
    langrocks

    langrocks

    Tools like web browser, computer access and code runner for LLMs

    Langrocks is a programming language experimentation toolkit that enables developers to create, test, and optimize custom programming languages.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 21
    MagicTime

    MagicTime

    Time-lapse Video Generation Models as Metamorphic Simulators

    This repository is the official implementation of MagicTime, a metamorphic video generation pipeline based on the given prompts. The main idea is to enhance the capacity of video generation models to accurately depict the real world through our proposed methods and dataset. Compared to general videos, metamorphic videos contain physical knowledge, long persistence, and strong variation, making them difficult to generate.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 22
    HunyuanWorld 1.0

    HunyuanWorld 1.0

    Generating Immersive, Explorable, and Interactive 3D Worlds

    HunyuanWorld-1.0 is an open-source, simulation-capable 3D world generation model developed by Tencent Hunyuan that creates immersive, explorable, and interactive 3D environments from text or image inputs. It combines the strengths of video-based diversity and 3D-based geometric consistency through a novel framework using panoramic world proxies and semantically layered 3D mesh representations. This approach enables 360° immersive experiences, seamless mesh export for graphics pipelines, and...
    Downloads: 11 This Week
    Last Update:
    See Project
  • 23
    ClearML

    ClearML

    Streamline your ML workflow

    ...It is designed as an end-to-end MLOps suite allowing you to focus on developing your ML code & automation, while ClearML ensures your work is reproducible and scalable. The ClearML Python Package for integrating ClearML into your existing scripts by adding just two lines of code, and optionally extending your experiments and other workflows with ClearML powerful and versatile set of classes and methods. The ClearML Server storing experiment, model, and workflow data, and supports the Web UI experiment manager, and ML-Ops automation for reproducibility and tuning. ...
    Downloads: 2 This Week
    Last Update:
    See Project
  • 24
    Stable Diffusion Version 2

    Stable Diffusion Version 2

    High-Resolution Image Synthesis with Latent Diffusion Models

    Stable Diffusion (the stablediffusion repo by Stability-AI) is an open-source implementation and reference codebase for high-resolution latent diffusion image models that power many text-to-image systems. The repository provides code for training and running Stable Diffusion-style models, instructions for installing dependencies (with notes about performance libraries like xformers), and guidance on hardware/driver requirements for efficient GPU inference and training. It’s organized as a...
    Downloads: 10 This Week
    Last Update:
    See Project
  • 25
    ChatGLM2-6B

    ChatGLM2-6B

    ChatGLM2-6B: An Open Bilingual Chat LLM

    ...It upgrades the base model with GLM’s hybrid pretraining objective, 1.4 TB bilingual data, and preference alignment—delivering big gains on MMLU, CEval, GSM8K, and BBH. The context window extends up to 32K (FlashAttention), and Multi-Query Attention improves speed and memory use. The repo includes Python APIs, CLI & web demos, OpenAI-style/FASTAPI servers, and quantized checkpoints for lightweight local deployment on GPUs or CPU/MPS.
    Downloads: 1 This Week
    Last Update:
    See Project