Open Source Linux Large Language Models (LLM)

Browse free, open source large language model (LLM) projects for Linux below.

  • 1
    Ollama

    Run models like Kimi-K2.5, GLM-5, DeepSeek, gpt-oss, Gemma, Qwen etc.

    Ollama is an open-source platform that lets developers run large language models locally on their own machines. It provides a unified interface to download, manage, and interact with models such as Llama, Gemma, and Qwen, either from the command line or through APIs. Ollama also integrates with popular developer tools and AI agents across coding environments and applications. It offers a REST API along with Python and JavaScript SDKs, making it easy to build AI-powered features into software projects. Overall, Ollama focuses on privacy, local-first AI execution, and developer-friendly tooling for building with open models.
    Downloads: 741 This Week
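
As a rough illustration of the REST interface mentioned above, the sketch below builds a non-streaming request for Ollama's `/api/generate` endpoint using only the Python standard library. It assumes a local server on the default port 11434; the helper names and the model name in the usage comment are placeholders chosen for this example, not part of the listing.

```python
import json
import urllib.request

# Assumption: an Ollama server is running locally on its default port.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_generate_request(model: str, prompt: str) -> urllib.request.Request:
    """Build a non-streaming POST request for the /api/generate endpoint."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(model: str, prompt: str) -> str:
    """Send the request and return the generated text (needs a running server)."""
    with urllib.request.urlopen(build_generate_request(model, prompt)) as resp:
        return json.loads(resp.read())["response"]


# Usage (placeholder model name; it must already be pulled locally):
# print(generate("gemma3", "Why is the sky blue?"))
```

Separating request construction from the network call keeps the payload easy to inspect without a server running.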
  • 2
    SillyTavern

    LLM Frontend for Power Users

    Mobile-friendly, Multi-API (KoboldAI/CPP, Horde, NovelAI, Ooba, OpenAI, OpenRouter, Claude, Scale), VN-like Waifu Mode, Horde SD, System TTS, WorldInfo (lorebooks), customizable UI, auto-translate, and more prompt options than you'd ever want or need. Optional Extras server for more SD/TTS options + ChromaDB/Summarize. SillyTavern is a user interface you can install on your computer (and Android phones) that allows you to interact with text generation AIs and chat/roleplay with characters you or the community create. SillyTavern is a fork of TavernAI 1.2.8 which is under more active development and has added many major features. At this point, they can be thought of as completely independent programs.
    Downloads: 556 This Week
  • 3
    MiroFish

    A Simple and Universal Swarm Intelligence Engine

    MiroFish is a next-generation artificial intelligence prediction engine that leverages multi-agent technology and swarm-intelligence simulation to model, simulate, and forecast complex real-world scenarios. The system extracts “seed” information from sources such as breaking news, policy documents, and market signals to construct a high-fidelity digital parallel world populated by thousands of virtual agents with independent memory and behavior rules. Users can inject variables or conditions into this simulated environment from a “god’s eye view,” enabling iterative prediction of future trends under different assumptions, which can be useful for decision support, scenario planning, or creative exploration. The engine includes both backend and frontend components, with configuration and deployment instructions for local and containerized setups, and is designed to produce detailed predictive reports based on interactions and emergent patterns within the simulated world.
    Downloads: 515 This Week
  • 4
    llama.cpp

    Port of Facebook's LLaMA model in C/C++

    The llama.cpp project enables the inference of Meta's LLaMA model (and other models) in pure C/C++ without requiring a Python runtime. It is designed for efficient and fast model execution, offering easy integration for applications needing LLM-based capabilities. The repository focuses on providing a highly optimized and portable implementation for running large language models directly within C/C++ environments.
    Downloads: 382 This Week
  • 5
    WeChatMsg

    Project aimed at extracting, exporting, and analyzing chat records

    The WeChatMsg repository hosts an open-source project for extracting, exporting, and analyzing chat records from the WeChat messaging platform. It provides tools that read local WeChat database files and convert chat data into readable formats such as HTML, Word, and CSV, making it possible to inspect conversations outside the mobile app. Beyond simple export, the project can analyze chat histories and generate annual reports or visual summaries of messaging trends and interaction patterns. The original README states a guiding philosophy of owning personal data and using it responsibly to train personalized AI agents or preserve memories. Although the repository has seen periods of inactivity and may not receive frequent updates, its widespread use indicates community interest in preserving chat logs and understanding conversation data outside of the WeChat interface.
    Downloads: 285 This Week
  • 6
    AnythingLLM

    The all-in-one Desktop & Docker AI application with full RAG and AI

    A full-stack application that enables you to turn any document, resource, or piece of content into a context that any LLM can use as references during chatting. This application allows you to pick and choose which LLM or Vector Database you want to use as well as supporting multi-user management and permissions. AnythingLLM is a full-stack application where you can use commercial off-the-shelf LLMs or popular open-source LLMs and vectorDB solutions to build a private ChatGPT with no compromises that you can run locally as well as host remotely and be able to chat intelligently with any documents you provide it. AnythingLLM divides your documents into objects called workspaces. A Workspace functions a lot like a thread, but with the addition of containerization of your documents. Workspaces can share documents, but they do not talk to each other so you can keep your context for each workspace clean.
    Downloads: 148 This Week
  • 7
    llamafile

    Distribute and run LLMs with a single file

    llamafile lets you distribute and run LLMs with a single file. Our goal is to make open LLMs much more accessible to both developers and end users. We're doing that by combining llama.cpp with Cosmopolitan Libc into one framework that collapses all the complexity of LLMs down to a single-file executable (called a "llamafile") that runs locally on most computers, with no installation. The easiest way to try it for yourself is to download our example llamafile for the LLaVA model (license: LLaMA 2, OpenAI). LLaVA is a new LLM that can do more than just chat; you can also upload images and ask it questions about them. With llamafile, this all happens locally; no data ever leaves your computer.
    Downloads: 135 This Week
  • 8
    GPT4All

    Run Local LLMs on Any Device. Open-source

    GPT4All is an open-source project that allows users to run large language models (LLMs) locally on their desktops or laptops, eliminating the need for API calls or GPUs. The software provides a simple, user-friendly application that can be downloaded and run on various platforms, including Windows, macOS, and Ubuntu, without requiring specialized hardware. It integrates with the llama.cpp implementation and supports multiple LLMs, allowing users to interact with AI models privately. This project also supports Python integrations for easy automation and customization. GPT4All is ideal for individuals and businesses seeking private, offline access to powerful LLMs.
    Downloads: 118 This Week
  • 9
    DeepSeek R1

    Open-source, high-performance AI model with advanced reasoning

    DeepSeek-R1 is an open-source large language model developed by DeepSeek, designed to excel in complex reasoning tasks across domains such as mathematics, coding, and language. DeepSeek-R1 is available for both commercial and academic use. The model employs a Mixture of Experts (MoE) architecture, comprising 671 billion total parameters with 37 billion active parameters per token, and supports a context length of up to 128,000 tokens. Its training relies heavily on large-scale reinforcement learning (RL): the precursor model, DeepSeek-R1-Zero, was trained with RL alone and no supervised fine-tuning, while DeepSeek-R1 adds a small amount of cold-start data before RL. This approach has resulted in performance comparable to leading models like OpenAI's o1, while maintaining cost-efficiency. To further support the research community, DeepSeek has released distilled versions of the model based on architectures such as LLaMA and Qwen.
    Downloads: 98 This Week
  • 10
    Hands-On Large Language Models

    Official code repo for the O'Reilly Book

    Hands-On-Large-Language-Models is the official GitHub code repository accompanying the book Hands-On Large Language Models by Jay Alammar and Maarten Grootendorst. It provides example notebooks, code labs, and supporting materials that illustrate the core concepts and real-world applications of large language models. The repository is organized into chapters that follow the progression of the book, from foundational topics like tokens, embeddings, and transformer architecture to advanced techniques such as prompt engineering, semantic search, retrieval-augmented generation (RAG), multimodal LLMs, and fine-tuning. Each chapter contains executable Jupyter notebooks designed to run in environments like Google Colab, making it easy for learners to experiment interactively with models, visualize attention patterns, and implement classification and generation tasks.
    Downloads: 92 This Week
  • 11
    FreeLLMAPI

    OpenAI-compatible proxy that aggregates free-tier keys from ~14 AI

    FreeLLMAPI is an OpenAI-compatible proxy that aggregates free-tier API keys from multiple AI providers into one unified endpoint. It is designed for personal experimentation, testing, and lightweight development workflows where users want to route requests through several providers without rewriting client code for each one. The project can automatically fail over between configured providers when one is unavailable or exhausted. Its OpenAI-compatible design makes it easier to use with existing tools, SDKs, and applications that already expect that API shape. It is not positioned as an enterprise-grade service or a way to bypass provider terms, but as a local coordination layer for personally owned free-tier credentials. FreeLLMAPI is useful for developers who want a practical testing proxy for comparing models, managing limits, and improving request continuity.
    Downloads: 78 This Week
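
The failover behavior described above can be sketched in a few lines. This is a generic illustration of the pattern and not FreeLLMAPI's actual code: the `Provider` type, the function names, and the two stand-in providers are all hypothetical.

```python
from typing import Callable, Sequence

# Hypothetical shape: a provider is a (name, callable) pair.
Provider = tuple[str, Callable[[str], str]]


class AllProvidersFailed(Exception):
    """Raised when every configured provider fails for a request."""


def complete_with_failover(prompt: str, providers: Sequence[Provider]) -> tuple[str, str]:
    """Try providers in order and return (provider_name, reply) from the first success."""
    errors = []
    for name, call in providers:
        try:
            return name, call(prompt)
        except Exception as exc:  # e.g. quota exhausted or provider unavailable
            errors.append(f"{name}: {exc}")
    raise AllProvidersFailed("; ".join(errors))


# Stand-in providers used only for this illustration.
def exhausted(prompt: str) -> str:
    raise RuntimeError("quota exhausted")


def echo(prompt: str) -> str:
    return f"echo: {prompt}"
```

Calling `complete_with_failover("hi", [("a", exhausted), ("b", echo)])` skips the exhausted provider and returns the reply from the second one. A real proxy would also need rate-limit tracking and request translation, which this sketch leaves out.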
  • 12
    DeepSeek-V3

    Powerful AI language model (MoE) optimized for efficiency/performance

    DeepSeek-V3 is a robust Mixture-of-Experts (MoE) language model developed by DeepSeek, featuring a total of 671 billion parameters, with 37 billion activated per token. It employs Multi-head Latent Attention (MLA) and the DeepSeekMoE architecture to enhance computational efficiency. The model introduces an auxiliary-loss-free load balancing strategy and a multi-token prediction training objective to boost performance. Trained on 14.8 trillion diverse, high-quality tokens, DeepSeek-V3 underwent supervised fine-tuning and reinforcement learning to fully realize its capabilities. Evaluations indicate that it outperforms other open-source models and rivals leading closed-source models, achieving this with a training duration of 55 days on 2,048 Nvidia H800 GPUs, costing approximately $5.58 million.
    Downloads: 64 This Week
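
The figures quoted above can be cross-checked with simple arithmetic. The sketch below computes the share of parameters active per token and an upper bound on GPU-hours, assuming (as a simplification of this example, not a claim from the listing) that all 2,048 GPUs ran for the full 55 days.

```python
# Figures quoted in the listing above.
TOTAL_PARAMS_B = 671        # total parameters, in billions
ACTIVE_PARAMS_B = 37        # parameters activated per token, in billions
TRAINING_DAYS = 55
GPU_COUNT = 2048
TRAINING_COST_USD = 5.58e6


def active_fraction(active_b: float, total_b: float) -> float:
    """Share of the model's parameters used for each token."""
    return active_b / total_b


def gpu_hours(days: int, gpus: int) -> int:
    """GPU-hours if every GPU runs for the whole period (an upper bound)."""
    return days * 24 * gpus


hours = gpu_hours(TRAINING_DAYS, GPU_COUNT)
print(f"active share: {active_fraction(ACTIVE_PARAMS_B, TOTAL_PARAMS_B):.1%}")
print(f"GPU-hours:    {hours:,}")
print(f"implied rate: ${TRAINING_COST_USD / hours:.2f} per GPU-hour")
```

Under that assumption, roughly 5.5% of the parameters are active per token, the run totals 2,703,360 GPU-hours, and the quoted cost implies about $2.06 per GPU-hour.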
  • 13
    LLPlayer

    The media player for language learning, with dual subtitles

    LLPlayer is an open-source media player designed specifically for language learning through video content. Unlike traditional media players, the application focuses on advanced subtitle-related features that help learners understand and interact with foreign language media more effectively. The player supports dual subtitles so users can simultaneously view text in both the original language and their native language while watching videos. It can also automatically generate subtitles in real time using speech-to-text systems such as Whisper, allowing subtitles to be created even when none are available. Real-time translation capabilities enable subtitles to be translated using multiple translation engines and language models. Additional tools such as instant word lookup, contextual translation, and subtitle search allow learners to interact with the text while watching videos.
    Downloads: 59 This Week
  • 14
    GLM-4.6

    Agentic, Reasoning, and Coding (ARC) foundation models

    GLM-4.6 is the latest iteration of Zhipu AI’s foundation model, delivering significant advancements over GLM-4.5. It introduces an extended 200K token context window, enabling more sophisticated long-context reasoning and agentic workflows. The model achieves superior coding performance, excelling in benchmarks and practical coding assistants such as Claude Code, Cline, Roo Code, and Kilo Code. Its reasoning capabilities have been strengthened, including improved tool usage during inference and more effective integration within agent frameworks. GLM-4.6 also enhances writing quality, producing outputs that better align with human preferences and role-playing scenarios. Benchmark evaluations demonstrate that it not only outperforms GLM-4.5 but also rivals leading global models such as DeepSeek-V3.1-Terminus and Claude Sonnet 4.
    Downloads: 52 This Week
  • 15
    GLM-5

    From Vibe Coding to Agentic Engineering

    GLM-5 is a next-generation open-source large language model (LLM) developed by the Z.ai team under the zai-org organization that pushes the boundaries of reasoning, coding, and long-horizon agentic intelligence. Building on earlier GLM series models, GLM-5 scales the parameter count to roughly 744 billion and expands pre-training data to improve performance on complex tasks such as multi-step reasoning, software engineering workflows, and agent orchestration compared to predecessors like GLM-4.5. It incorporates DeepSeek Sparse Attention (DSA) to reduce deployment costs while preserving long-context capacity, which is crucial for detailed plans and agent tasks.
    Downloads: 50 This Week
  • 16
    OmniRoute

    OmniRoute is an AI gateway for multi-provider LLM

    OmniRoute is a routing and orchestration framework designed to simplify the handling of requests, workflows, or data flows across multiple services or endpoints in a unified manner. It focuses on providing a flexible abstraction layer where developers can define routing logic that dynamically directs traffic based on conditions, context, or predefined rules. The project emphasizes modularity and extensibility, allowing users to plug in different services or handlers without tightly coupling components. It is particularly useful in distributed systems where requests need to be intelligently routed between APIs, microservices, or processing pipelines. OmniRoute aims to reduce boilerplate by centralizing routing logic and providing reusable patterns for managing complex flows. Its architecture supports scalability and maintainability, making it suitable for both small applications and larger systems with multiple integrations.
    Downloads: 50 This Week
  • 17
    GLM-4.7

    Advanced language and coding AI model

    GLM-4.7 is an advanced agent-oriented large language model designed as a high-performance coding and reasoning partner. It delivers significant gains over GLM-4.6 in multilingual agentic coding, terminal-based workflows, and real-world developer benchmarks such as SWE-bench and Terminal Bench 2.0. The model introduces stronger “thinking before acting” behavior, improving stability and accuracy in complex agent frameworks like Claude Code, Cline, and Roo Code. GLM-4.7 also advances “vibe coding,” producing cleaner, more modern UIs, better-structured webpages, and visually improved slide layouts. Its tool-use capabilities are substantially enhanced, with notable improvements in browsing, search, and tool-integrated reasoning tasks. Overall, GLM-4.7 shows broad performance upgrades across coding, reasoning, chat, creative writing, and role-play scenarios.
    Downloads: 47 This Week
  • 18
    rtk

    CLI proxy that reduces LLM token consumption

    rtk is an open-source command-line proxy designed to optimize interactions between AI coding agents and the terminal by reducing unnecessary token consumption. When AI assistants execute shell commands during software development tasks, the resulting terminal output often contains large amounts of repetitive or irrelevant information that can overwhelm the model's context window. rtk intercepts these command outputs and compresses them into concise summaries before sending them to the language model. This preserves important information while removing redundant data such as boilerplate logs, long directory listings, or repetitive test outputs. By minimizing the noise sent to the AI model, the tool improves reasoning quality and allows longer development sessions within the same context window. The system is implemented as a lightweight Rust binary that runs locally and integrates easily with common AI coding environments.
    Downloads: 44 This Week
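
To give a feel for the kind of compression described above, here is a minimal sketch that collapses runs of identical lines and truncates long output. It is an assumption-laden illustration in Python, not rtk's actual algorithm (rtk itself is a Rust binary); the function name and the `max_lines` parameter are hypothetical.

```python
from itertools import groupby


def compress_output(text: str, max_lines: int = 20) -> str:
    """Collapse consecutive duplicate lines, then cap the number of lines kept."""
    collapsed = []
    for line, run in groupby(text.splitlines()):
        count = sum(1 for _ in run)
        # Mark repeated lines with a counter instead of repeating them.
        collapsed.append(line if count == 1 else f"{line}  [x{count}]")
    if len(collapsed) > max_lines:
        omitted = len(collapsed) - max_lines
        collapsed = collapsed[:max_lines] + [f"... {omitted} more lines omitted"]
    return "\n".join(collapsed)
```

For example, three consecutive `ok` lines followed by `fail` become `ok  [x3]` and `fail`, so the one informative line is not buried in repetition.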
  • 19
    GLM-4.5

    GLM-4.5: Open-source LLM for intelligent agents by Z.ai

    GLM-4.5 is a cutting-edge open-source large language model designed by Z.ai for intelligent agent applications. The flagship GLM-4.5 model has 355 billion total parameters with 32 billion active parameters, while the compact GLM-4.5-Air version offers 106 billion total parameters and 12 billion active parameters. Both models unify reasoning, coding, and intelligent agent capabilities, providing two modes: a thinking mode for complex reasoning and tool usage, and a non-thinking mode for immediate responses. They are released under the MIT license, allowing commercial use and secondary development. GLM-4.5 achieves strong performance on 12 industry-standard benchmarks, ranking 3rd overall, while GLM-4.5-Air balances competitive results with greater efficiency. The models support FP8 and BF16 precision, and can handle very large context windows of up to 128K tokens. Flexible inference is supported through frameworks like vLLM and SGLang with tool-call and reasoning parsers included.
    Downloads: 37 This Week
  • 20
    LangGraph Studio

    Desktop app for prototyping and debugging LangGraph applications

    LangGraph Studio offers a new way to develop LLM applications by providing a specialized agent IDE that enables visualization, interaction, and debugging of complex agentic applications. With visual graphs and the ability to edit state, you can better understand agent workflows and iterate faster. LangGraph Studio integrates with LangSmith so you can collaborate with teammates to debug failure modes. While in Beta, LangGraph Studio is available for free to all LangSmith users on any plan tier. LangGraph Studio requires docker-compose version 2.22.0 or higher. Make sure you have Docker installed and running before continuing. When you open the LangGraph Studio desktop app for the first time, you need to log in via LangSmith. Once you have authenticated, you can choose the LangGraph application folder to use, either by dragging and dropping it or by selecting it manually in the file picker.
    Downloads: 37 This Week
  • 21
    llmfit

    157 models, 30 providers, one command to find what runs on hardware

    llmfit is a terminal-based utility that helps developers determine which large language models can realistically run on their local hardware by analyzing system resources and model requirements. The tool automatically detects CPU, RAM, GPU, and VRAM specifications, then ranks available models based on performance factors such as speed, quality, and memory fit. It provides both an interactive terminal user interface and a traditional CLI mode, enabling flexible workflows for different user preferences. llmfit also supports advanced configurations including multi-GPU setups, mixture-of-experts architectures, and dynamic quantization recommendations. By presenting clear performance estimates and compatibility guidance, the project reduces the trial-and-error typically involved in local LLM experimentation. Overall, llmfit serves as a practical decision assistant for developers who want to run language models efficiently on their own machines.
    Downloads: 37 This Week
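
The memory-fit idea described above can be approximated with back-of-the-envelope arithmetic: weight memory is roughly parameter count times bits per weight divided by eight. The sketch below is a hypothetical illustration and not llmfit's actual logic; the function names, the 1.2 overhead factor, and the candidate bit widths are assumptions made for this example.

```python
from typing import Optional, Sequence


def weight_memory_gb(params_billion: float, bits_per_weight: int) -> float:
    """Approximate weight size in GB (1e9 bytes) for a given quantization."""
    return params_billion * bits_per_weight / 8


def best_fitting_bits(
    params_billion: float,
    available_gb: float,
    overhead: float = 1.2,                    # assumed headroom for KV cache and runtime
    candidates: Sequence[int] = (16, 8, 4),   # ordered from highest precision down
) -> Optional[int]:
    """Return the highest-precision bit width that fits, or None if none does."""
    for bits in candidates:
        if weight_memory_gb(params_billion, bits) * overhead <= available_gb:
            return bits
    return None
```

Under these assumptions a 7-billion-parameter model needs about 3.5 GB of weights at 4 bits, so `best_fitting_bits(7, 8)` selects 4-bit quantization for an 8 GB GPU, while a 70-billion-parameter model does not fit at any of the candidate widths.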
  • 22
    Clippy

    Clippy, now with some AI

    Clippy is an open-source desktop assistant that allows users to run modern large language models locally while presenting them through a nostalgic interface inspired by Microsoft’s classic Clippy assistant from the 1990s. The project serves as both a playful homage to the early days of personal computing and a practical demonstration of local AI inference. Clippy integrates with the llama.cpp runtime to run models directly on a user’s computer without requiring cloud-based AI services. It supports models in the GGUF format, which allows it to run many publicly available open-source LLMs efficiently on consumer hardware. Users interact with the system through a simple animated assistant interface that can answer questions, generate text, and perform conversational tasks. The application includes one-click installation support for several popular models such as Meta’s Llama, Google’s Gemma, and other open models.
    Downloads: 35 This Week
  • 23
    Dify

    One API for plugins and datasets, one interface for prompt engineering

    Dify is an easy-to-use LLMOps platform designed to empower more people to create sustainable, AI-native applications. With visual orchestration for various application types, Dify offers out-of-the-box, ready-to-use applications that can also serve as Backend-as-a-Service APIs. Unify your development process with one API for plugins and datasets integration, and streamline your operations using a single interface for prompt engineering, visual analytics, and continuous improvement. Features include out-of-the-box websites supporting form mode and chat conversation mode; a single API encompassing plugin capabilities, context enhancement, and more, saving you backend coding effort; and visual data analysis, log review, and annotation for applications.
    Downloads: 34 This Week
  • 24
    tt-metal

    TT-NN operator library, and TT-Metalium low level kernel programming

    tt-metal, also referred to in its documentation as TT-Metalium, is Tenstorrent’s low-level software development kit for programming applications on Tenstorrent AI accelerators. The project is designed for developers who need direct access to the company’s Tensix processor architecture, exposing a programming model that is closer to hardware control than high-level inference frameworks. Instead of following a traditional GPU model centered on massive thread parallelism, the platform is built around a grid of specialized compute nodes called Tensix cores, each with local SRAM, dedicated compute units, and multiple RISC-V control processors. The SDK provides the abstractions and APIs needed to manage data movement, compute kernels, memory coordination, and execution flow across this architecture.
    Downloads: 34 This Week
  • 25
    Kimi K2

    Kimi K2 is the large language model series developed by Moonshot AI

    Kimi K2 is Moonshot AI's advanced open-source large language model built on a scalable Mixture-of-Experts (MoE) architecture that combines a trillion total parameters with a subset of ~32 billion active parameters to deliver powerful and efficient performance on diverse tasks. It was trained on a corpus of over 15.5 trillion tokens to push frontier capabilities in coding, reasoning, and general agentic tasks while addressing training stability through novel optimizer and architecture design strategies. The model family includes a foundational base model that researchers can fine-tune for specific use cases and an instruct-optimized variant primed for general-purpose chat and agent-style interactions, offering flexibility for both experimentation and deployment. With its high-dimensional attention mechanisms and expert routing, Kimi K2 excels across benchmarks in live coding, math reasoning, and problem solving.
    Downloads: 32 This Week