Open Source Python Large Language Models (LLM)

Python Large Language Models (LLM)

View 396 business solutions

Browse free open source Python Large Language Models (LLM) and projects below. Use the toggles on the left to filter open source Python Large Language Models (LLM) by OS, license, language, programming language, and project status.

  • $300 Free Credits for Your Google Cloud Projects Icon
    $300 Free Credits for Your Google Cloud Projects

    Start building on Google Cloud with $300 in free credits. No commitment, no credit card required until you're ready to scale.

    Launch your next project with $300 in free Google Cloud credits—no strings attached. Test, build, and deploy without risk. Use your credits across the entire Google Cloud platform to find what works best for your needs. After your credits are used, continue with always-free tier services. Only pay when you're ready to scale. Sign up in minutes and start exploring.
    Start Free Trial
  • Build Agents and Models on One Platform Icon
    Build Agents and Models on One Platform

    Everything you need to build production-ready agents and models. Access 200+ Google and third-party AI models and tools.

    Gemini Enterprise Agent Platform is Google Cloud's comprehensive platform for developers to build, scale, govern, and optimize agents and models. Choose from Google's most advanced models and third-party models like Anthropic's Claude Model Family.
    Try It Free
  • 1
    MiroFish

    MiroFish

    A Simple and Universal Swarm Intelligence Engine

    MiroFish is a next-generation artificial intelligence prediction engine that leverages multi-agent technology and swarm-intelligence simulation to model, simulate, and forecast complex real-world scenarios. The system extracts “seed” information from sources such as breaking news, policy documents, and market signals to construct a high-fidelity digital parallel world populated by thousands of virtual agents with independent memory and behavior rules. Users can inject variables or conditions into this simulated environment from a “god’s eye view,” enabling iterative prediction of future trends under different assumptions, which can be useful for decision support, scenario planning, or creative exploration. The engine includes both backend and frontend components, with configuration and deployment instructions for local and containerized setups, and is designed to produce detailed predictive reports based on interactions and emergent patterns within the simulated world.
    Downloads: 340 This Week
    Last Update:
    See Project
  • 2
    GPT4All

    GPT4All

    Run Local LLMs on Any Device. Open-source

    GPT4All is an open-source project that allows users to run large language models (LLMs) locally on their desktops or laptops, eliminating the need for API calls or GPUs. The software provides a simple, user-friendly application that can be downloaded and run on various platforms, including Windows, macOS, and Ubuntu, without requiring specialized hardware. It integrates with the llama.cpp implementation and supports multiple LLMs, allowing users to interact with AI models privately. This project also supports Python integrations for easy automation and customization. GPT4All is ideal for individuals and businesses seeking private, offline access to powerful LLMs.
    Downloads: 106 This Week
    Last Update:
    See Project
  • 3
    DeepSeek R1

    DeepSeek R1

    Open-source, high-performance AI model with advanced reasoning

    DeepSeek-R1 is an open-source large language model developed by DeepSeek, designed to excel in complex reasoning tasks across domains such as mathematics, coding, and language. DeepSeek R1 offers unrestricted access for both commercial and academic use. The model employs a Mixture of Experts (MoE) architecture, comprising 671 billion total parameters with 37 billion active parameters per token, and supports a context length of up to 128,000 tokens. DeepSeek-R1's training regimen uniquely integrates large-scale reinforcement learning (RL) without relying on supervised fine-tuning, enabling the model to develop advanced reasoning capabilities. This approach has resulted in performance comparable to leading models like OpenAI's o1, while maintaining cost-efficiency. To further support the research community, DeepSeek has released distilled versions of the model based on architectures such as LLaMA and Qwen.
    Downloads: 91 This Week
    Last Update:
    See Project
  • 4
    DeepSeek-V3

    DeepSeek-V3

    Powerful AI language model (MoE) optimized for efficiency/performance

    DeepSeek-V3 is a robust Mixture-of-Experts (MoE) language model developed by DeepSeek, featuring a total of 671 billion parameters, with 37 billion activated per token. It employs Multi-head Latent Attention (MLA) and the DeepSeekMoE architecture to enhance computational efficiency. The model introduces an auxiliary-loss-free load balancing strategy and a multi-token prediction training objective to boost performance. Trained on 14.8 trillion diverse, high-quality tokens, DeepSeek-V3 underwent supervised fine-tuning and reinforcement learning to fully realize its capabilities. Evaluations indicate that it outperforms other open-source models and rivals leading closed-source models, achieving this with a training duration of 55 days on 2,048 Nvidia H800 GPUs, costing approximately $5.58 million.
    Downloads: 56 This Week
    Last Update:
    See Project
  • Auth0 B2B Essentials: SSO, MFA, and RBAC Built In Icon
    Auth0 B2B Essentials: SSO, MFA, and RBAC Built In

    Unlimited organizations, 3 enterprise SSO connections, role-based access control, and pro MFA included. Dev and prod tenants out of the box.

    Auth0's B2B Essentials plan gives you everything you need to ship secure multi-tenant apps. Unlimited orgs, enterprise SSO, RBAC, audit log streaming, and higher auth and API limits included. Add on M2M tokens, enterprise MFA, or additional SSO connections as you scale.
    Sign Up Free
  • 5
    GLM-4.6

    GLM-4.6

    Agentic, Reasoning, and Coding (ARC) foundation models

    GLM-4.6 is the latest iteration of Zhipu AI’s foundation model, delivering significant advancements over GLM-4.5. It introduces an extended 200K token context window, enabling more sophisticated long-context reasoning and agentic workflows. The model achieves superior coding performance, excelling in benchmarks and practical coding assistants such as Claude Code, Cline, Roo Code, and Kilo Code. Its reasoning capabilities have been strengthened, including improved tool usage during inference and more effective integration within agent frameworks. GLM-4.6 also enhances writing quality, producing outputs that better align with human preferences and role-playing scenarios. Benchmark evaluations demonstrate that it not only outperforms GLM-4.5 but also rivals leading global models such as DeepSeek-V3.1-Terminus and Claude Sonnet 4.
    Downloads: 47 This Week
    Last Update:
    See Project
  • 6
    GLM-4.7

    GLM-4.7

    Advanced language and coding AI model

    GLM-4.7 is an advanced agent-oriented large language model designed as a high-performance coding and reasoning partner. It delivers significant gains over GLM-4.6 in multilingual agentic coding, terminal-based workflows, and real-world developer benchmarks such as SWE-bench and Terminal Bench 2.0. The model introduces stronger “thinking before acting” behavior, improving stability and accuracy in complex agent frameworks like Claude Code, Cline, and Roo Code. GLM-4.7 also advances “vibe coding,” producing cleaner, more modern UIs, better-structured webpages, and visually improved slide layouts. Its tool-use capabilities are substantially enhanced, with notable improvements in browsing, search, and tool-integrated reasoning tasks. Overall, GLM-4.7 shows broad performance upgrades across coding, reasoning, chat, creative writing, and role-play scenarios.
    Downloads: 42 This Week
    Last Update:
    See Project
  • 7
    GLM-4.5

    GLM-4.5

    GLM-4.5: Open-source LLM for intelligent agents by Z.ai

    GLM-4.5 is a cutting-edge open-source large language model designed by Z.ai for intelligent agent applications. The flagship GLM-4.5 model has 355 billion total parameters with 32 billion active parameters, while the compact GLM-4.5-Air version offers 106 billion total parameters and 12 billion active parameters. Both models unify reasoning, coding, and intelligent agent capabilities, providing two modes: a thinking mode for complex reasoning and tool usage, and a non-thinking mode for immediate responses. They are released under the MIT license, allowing commercial use and secondary development. GLM-4.5 achieves strong performance on 12 industry-standard benchmarks, ranking 3rd overall, while GLM-4.5-Air balances competitive results with greater efficiency. The models support FP8 and BF16 precision, and can handle very large context windows of up to 128K tokens. Flexible inference is supported through frameworks like vLLM and SGLang with tool-call and reasoning parsers included.
    Downloads: 33 This Week
    Last Update:
    See Project
  • 8
    MLC LLM

    MLC LLM

    Universal LLM Deployment Engine with ML Compilation

    MLC LLM is a machine learning compiler and deployment framework designed to enable efficient execution of large language models across a wide range of hardware platforms. The project focuses on compiling models into optimized runtimes that can run natively on devices such as GPUs, mobile processors, browsers, and edge hardware. By leveraging machine learning compilation techniques, mlc-llm produces high-performance inference engines that maintain consistent APIs across platforms. The system supports deployment on environments including Linux, macOS, Windows, iOS, Android, and web browsers while utilizing different acceleration technologies such as CUDA, Vulkan, Metal, and WebGPU. It also provides OpenAI-compatible APIs that allow developers to integrate locally deployed models into existing AI applications without major code changes.
    Downloads: 27 This Week
    Last Update:
    See Project
  • 9
    BruteForceAI

    BruteForceAI

    Advanced LLM-powered brute-force tool combining AI intelligence

    BruteForceAI is an open-source security testing tool that applies large language models to the analysis of login forms and authentication flows in web applications. At a high level, the project uses AI to inspect HTML content, identify the relevant form elements, and automate selector discovery so that a tester does not need to hand-map every field before evaluation. It combines that analysis layer with automated credential testing workflows, framing itself as a more adaptive alternative to older brute-force tooling that depends heavily on manual configuration. The repository emphasizes features such as threaded execution, logging, and notification integrations, which position it as an automation-oriented project for controlled security assessment environments. From a software design perspective, its distinguishing idea is the use of language models as a front-end analysis layer that interprets a target page before the rest of the workflow proceeds.
    Downloads: 26 This Week
    Last Update:
    See Project
  • Stop Cyber Threats with VM-Series Next-Gen Firewall on Azure Icon
    Stop Cyber Threats with VM-Series Next-Gen Firewall on Azure

    Native application identity and user-based security for your Azure cloud

    Gain integrated visibility across all traffic in a single pass. Deploy Palo Alto Networks VM-Series to determine application identity and content while automating security policy updates via rich APIs.
    Get a free trial
  • 10
    SD.Next

    SD.Next

    All-in-one WebUI for AI generative image and video creation

    SD.Next is an all-in-one web user interface for generative image creation that expands beyond basic Stable Diffusion workflows to cover broader image and video generation, captioning, and processing tasks. It is designed as a power-user environment where model management, generation features, and workflow controls are centralized in a single UI rather than spread across separate scripts and utilities. The project emphasizes broad model support and includes mechanisms for discovering, downloading, and configuring models through integrated tooling, lowering the setup burden for experimentation. It also provides documentation and an ecosystem of guides that help users move from basic generation to more advanced usage patterns, including API-based automation. SD.Next is built to run across common desktop platforms and focuses on practicality: install, generate, iterate, and automate with minimal friction.
    Downloads: 24 This Week
    Last Update:
    See Project
  • 11
    PrivateGPT

    PrivateGPT

    Interact with your documents using the power of GPT

    PrivateGPT is a production-ready, privacy-first AI system that allows querying of uploaded documents using LLMs, operating completely offline in your own environment. It provides contextual generative AI capabilities without sending data externally. Now maintained under Zylon.ai with enterprise deployment options (air gapped, cloud, or on-prem).
    Downloads: 21 This Week
    Last Update:
    See Project
  • 12
    Qwen3

    Qwen3

    Qwen3 is the large language model series developed by Qwen team

    Qwen3 is a cutting-edge large language model (LLM) series developed by the Qwen team at Alibaba Cloud. The latest updated version, Qwen3-235B-A22B-Instruct-2507, features significant improvements in instruction-following, reasoning, knowledge coverage, and long-context understanding up to 256K tokens. It delivers higher quality and more helpful text generation across multiple languages and domains, including mathematics, coding, science, and tool usage. Various quantized versions, tools/pipelines provided for inference using quantized formats (e.g. GGUF, etc.). Coverage for many languages in training and usage, alignment with human preferences in open-ended tasks, etc.
    Downloads: 20 This Week
    Last Update:
    See Project
  • 13
    Alpa

    Alpa

    Training and serving large-scale neural networks

    Alpa is a system for training and serving large-scale neural networks. Scaling neural networks to hundreds of billions of parameters has enabled dramatic breakthroughs such as GPT-3, but training and serving these large-scale neural networks require complicated distributed system techniques. Alpa aims to automate large-scale distributed training and serving with just a few lines of code.
    Downloads: 19 This Week
    Last Update:
    See Project
  • 14
    CodeGeeX

    CodeGeeX

    CodeGeeX: An Open Multilingual Code Generation Model (KDD 2023)

    CodeGeeX is a large-scale multilingual code generation model with 13 billion parameters, trained on 850B tokens across more than 20 programming languages. Developed with MindSpore and later made PyTorch-compatible, it is capable of multilingual code generation, cross-lingual code translation, code completion, summarization, and explanation. It has been benchmarked on HumanEval-X, a multilingual program synthesis benchmark introduced alongside the model, and achieves state-of-the-art performance compared to other open models like InCoder and CodeGen. CodeGeeX also powers IDE plugins for VS Code and JetBrains, offering features like code completion, translation, debugging, and annotation. The model supports Ascend 910 and NVIDIA GPUs, with optimizations like quantization and FasterTransformer acceleration for faster inference.
    Downloads: 18 This Week
    Last Update:
    See Project
  • 15
    Strix

    Strix

    Open-source AI hackers to find and fix your app’s vulnerabilities

    Strix is an open source agent-driven security platform that uses autonomous AI agents to identify, investigate, and validate vulnerabilities in software applications. The system is designed to mimic the behavior of real attackers by executing dynamic testing and verifying findings through proof-of-concept exploitation. Unlike traditional vulnerability scanners that rely heavily on static analysis, Strix agents actively run code, probe systems, and attempt exploitation to confirm whether vulnerabilities are genuinely exploitable. The platform is intended for developers and security teams that need rapid security assessments without the overhead of manual penetration testing engagements. Strix can orchestrate multiple cooperating agents that divide investigation tasks and collaboratively analyze complex applications or infrastructure.
    Downloads: 17 This Week
    Last Update:
    See Project
  • 16
    vLLM

    vLLM

    A high-throughput and memory-efficient inference and serving engine

    vLLM is a fast and easy-to-use library for LLM inference and serving. High-throughput serving with various decoding algorithms, including parallel sampling, beam search, and more.
    Downloads: 17 This Week
    Last Update:
    See Project
  • 17
    LLaMA 3

    LLaMA 3

    The official Meta Llama 3 GitHub site

    This repository is the former home for Llama 3 model artifacts and getting-started code, covering pre-trained and instruction-tuned variants across multiple parameter sizes. It introduced the public packaging of weights, licenses, and quickstart examples that helped developers fine-tune or run the models locally and on common serving stacks. As the Llama stack evolved, Meta consolidated repositories and marked this one deprecated, pointing users to newer, centralized hubs for models, utilities, and docs. Even as a deprecated repo, it documents the transition path and preserves references that clarify how Llama 3 releases map into the current ecosystem. Practically, it functioned as a bridge between Llama 2 and later Llama releases by standardizing distribution and starter code for inference and fine-tuning. Teams still treat it as historical reference material for version lineage and migration notes.
    Downloads: 15 This Week
    Last Update:
    See Project
  • 18
    WhisperJAV

    WhisperJAV

    Uses Qwen3-ASR, local LLM, Whisper, TEN-VAD

    WhisperJAV is an open-source speech transcription pipeline designed specifically for generating subtitles for Japanese adult video content. The project addresses challenges that standard speech recognition models face when transcribing this type of audio, which often includes low signal-to-noise ratios and large numbers of non-verbal vocalizations. Traditional automatic speech recognition systems can misinterpret these sounds as words, leading to inaccurate transcripts. WhisperJAV introduces a specialized pipeline that separates text generation from timestamp alignment, allowing the system to generate transcripts and then align them with audio using forced alignment techniques. The framework supports several speech recognition models, including Qwen-based ASR systems and fine-tuned Whisper models trained on domain-specific dialogue.
    Downloads: 14 This Week
    Last Update:
    See Project
  • 19
    Free LLM API resources

    Free LLM API resources

    A list of free LLM inference resources accessible via API

    Free LLM API resources repository curated by cheahjs is a community-driven index of free and open API endpoints, tools, datasets, runtimes, and utilities for working with large language models (LLMs) without cost-barriers. It collects a wide range of resources including hosted free-tier LLM APIs, documentation links, public model endpoints, open datasets useful for training or evaluation, tooling integrations, and examples showing how to interact with these services in real applications. This list helps developers, hobbyists, and researchers quickly find models they can use for prototyping, experimentation, or production proofs-of-concept without needing paid subscriptions, reducing friction for innovation. The repository typically categorizes offerings by provider, type of service (text, embeddings, vision), availability conditions (open without key, free tier with key), and usage examples to make discovery and adoption easier.
    Downloads: 13 This Week
    Last Update:
    See Project
  • 20
    Qwen3-Coder

    Qwen3-Coder

    Qwen3-Coder is the code version of Qwen3

    Qwen3-Coder is the latest and most powerful agentic code model developed by the Qwen team at Alibaba Cloud. Its flagship version, Qwen3-Coder-480B-A35B-Instruct, features a massive 480 billion-parameter Mixture-of-Experts architecture with 35 billion active parameters, delivering top-tier performance on coding and agentic tasks. This model sets new state-of-the-art benchmarks among open models for agentic coding, browser-use, and tool-use, matching performance comparable to leading models like Claude Sonnet. Qwen3-Coder supports an exceptionally long context window of 256,000 tokens, extendable to 1 million tokens using Yarn, enabling repository-scale code understanding and generation. It is capable of handling 358 programming languages, from common to niche, making it versatile for a wide range of development environments. The model integrates a specially designed function call format and supports popular platforms such as Qwen Code and CLINE for agentic coding workflows.
    Downloads: 13 This Week
    Last Update:
    See Project
  • 21
    Ludwig AI

    Ludwig AI

    Low-code framework for building custom LLMs, neural networks

    Declarative deep learning framework built for scale and efficiency. Ludwig is a low-code framework for building custom AI models like LLMs and other deep neural networks. Declarative YAML configuration file is all you need to train a state-of-the-art LLM on your data. Support for multi-task and multi-modality learning. Comprehensive config validation detects invalid parameter combinations and prevents runtime failures. Automatic batch size selection, distributed training (DDP, DeepSpeed), parameter efficient fine-tuning (PEFT), 4-bit quantization (QLoRA), and larger-than-memory datasets. Retain full control of your models down to the activation functions. Support for hyperparameter optimization, explainability, and rich metric visualizations. Experiment with different model architectures, tasks, features, and modalities with just a few parameter changes in the config. Think building blocks for deep learning.
    Downloads: 12 This Week
    Last Update:
    See Project
  • 22
    Qwen-2.5-VL

    Qwen-2.5-VL

    Qwen2.5-VL is the multimodal large language model series

    Qwen2.5 is a series of large language models developed by the Qwen team at Alibaba Cloud, designed to enhance natural language understanding and generation across multiple languages. The models are available in various sizes, including 0.5B, 1.5B, 3B, 7B, 14B, 32B, and 72B parameters, catering to diverse computational requirements. Trained on a comprehensive dataset of up to 18 trillion tokens, Qwen2.5 models exhibit significant improvements in instruction following, long-text generation (exceeding 8,000 tokens), and structured data comprehension, such as tables and JSON formats. They support context lengths up to 128,000 tokens and offer multilingual capabilities in over 29 languages, including Chinese, English, French, Spanish, and more. The models are open-source under the Apache 2.0 license, with resources and documentation available on platforms like Hugging Face and ModelScope.
    Downloads: 12 This Week
    Last Update:
    See Project
  • 23
    ChatGLM-6B

    ChatGLM-6B

    ChatGLM-6B: An Open Bilingual Dialogue Language Model

    ChatGLM-6B is an open bilingual (Chinese + English) conversational language model based on the GLM architecture, with approximately 6.2 billion parameters. The project provides inference code, demos (command line, web, API), quantization support for lower memory deployment, and tools for finetuning (e.g., via P-Tuning v2). It is optimized for dialogue and question answering with a balance between performance and deployability in consumer hardware settings. Support for quantized inference (INT4, INT8) to reduce GPU memory requirements. Automatic mode switching between precision/memory tradeoffs (full/quantized).
    Downloads: 11 This Week
    Last Update:
    See Project
  • 24
    Heretic

    Heretic

    Fully automatic censorship removal for language models

    Heretic is an open-source Python tool that automatically removes the built-in censorship or “safety alignment” from transformer-based language models so they respond to a broader range of prompts with fewer refusals. It works by applying directional ablation techniques and a parameter optimization strategy to adjust internal model behaviors without expensive post-training or altering the core capabilities. Designed for researchers and advanced users, Heretic makes it possible to study and experiment with uncensored model responses in a reproducible, automated way. The project can decensor many popular dense and some mixture-of-experts (MoE) models, supporting workflows that would otherwise require manual tuning. Beyond simple decensoring, Heretic includes research-oriented options for analyzing model internals and interpretability data.
    Downloads: 11 This Week
    Last Update:
    See Project
  • 25
    LLaMA-Factory

    LLaMA-Factory

    Unified Efficient Fine-Tuning of 100+ LLMs & VLMs (ACL 2024)

    LLaMA-Factory is a fine-tuning and training framework for Meta's LLaMA language models. It enables researchers and developers to train and customize LLaMA models efficiently using advanced optimization techniques.
    Downloads: 11 This Week
    Last Update:
    See Project
  • Previous
  • You're on page 1
  • 2
  • 3
  • 4
  • 5
  • Next
Auth0 Logo