Open Source Linux Large Language Models (LLM) - Page 2

Large Language Models (LLM) for Linux

View 101 business solutions
  • Try Google Cloud Risk-Free With $300 in Credit Icon
    Try Google Cloud Risk-Free With $300 in Credit

    No hidden charges. No surprise bills. Cancel anytime.

    Use your credit across every product. Compute, storage, AI, analytics. When it runs out, 20+ products stay free. You only pay when you choose to.
    Start Free
  • Full-stack observability with actually useful AI | Grafana Cloud Icon
    Full-stack observability with actually useful AI | Grafana Cloud

    Our generous forever free tier includes the full platform, including the AI Assistant, for 3 users with 10k metrics, 50GB logs, and 50GB traces.

    Built on open standards like Prometheus and OpenTelemetry, Grafana Cloud includes Kubernetes Monitoring, Application Observability, Incident Response, plus the AI-powered Grafana Assistant. Get started with our generous free tier today.
    Create free account
  • 1
    OmniRoute

    OmniRoute

    OmniRoute is an AI gateway for multi-provider LLM

    OmniRoute is a routing and orchestration framework designed to simplify the handling of requests, workflows, or data flows across multiple services or endpoints in a unified manner. It focuses on providing a flexible abstraction layer where developers can define routing logic that dynamically directs traffic based on conditions, context, or predefined rules. The project emphasizes modularity and extensibility, allowing users to plug in different services or handlers without tightly coupling components. It is particularly useful in distributed systems where requests need to be intelligently routed between APIs, microservices, or processing pipelines. OmniRoute aims to reduce boilerplate by centralizing routing logic and providing reusable patterns for managing complex flows. Its architecture supports scalability and maintainability, making it suitable for both small applications and larger systems with multiple integrations.
    Downloads: 23 This Week
    Last Update:
    See Project
  • 2
    LocalAI

    LocalAI

    The free, Open Source alternative to OpenAI, Claude and others

    LocalAI is an open-source platform that allows users to run large language models and other AI systems locally on their own hardware. It acts as a drop-in replacement for APIs such as OpenAI, enabling developers to build AI-powered applications without relying on external cloud services. The platform supports a wide range of model types, including text generation, image creation, speech processing, and embeddings. LocalAI can run on consumer-grade hardware and does not necessarily require a GPU, making it accessible for local development and private deployments. It integrates with multiple backends like llama.cpp, transformers, and diffusers to support different AI workloads. With its self-hosted architecture and OpenAI-compatible API, LocalAI enables developers to build secure, local-first AI applications.
    Downloads: 20 This Week
    Last Update:
    See Project
  • 3
    MLC LLM

    MLC LLM

    Universal LLM Deployment Engine with ML Compilation

    MLC LLM is a machine learning compiler and deployment framework designed to enable efficient execution of large language models across a wide range of hardware platforms. The project focuses on compiling models into optimized runtimes that can run natively on devices such as GPUs, mobile processors, browsers, and edge hardware. By leveraging machine learning compilation techniques, mlc-llm produces high-performance inference engines that maintain consistent APIs across platforms. The system supports deployment on environments including Linux, macOS, Windows, iOS, Android, and web browsers while utilizing different acceleration technologies such as CUDA, Vulkan, Metal, and WebGPU. It also provides OpenAI-compatible APIs that allow developers to integrate locally deployed models into existing AI applications without major code changes.
    Downloads: 20 This Week
    Last Update:
    See Project
  • 4
    vLLM

    vLLM

    A high-throughput and memory-efficient inference and serving engine

    vLLM is a fast and easy-to-use library for LLM inference and serving. High-throughput serving with various decoding algorithms, including parallel sampling, beam search, and more.
    Downloads: 20 This Week
    Last Update:
    See Project
  • Go From AI Idea to AI App Fast Icon
    Go From AI Idea to AI App Fast

    One platform to build, fine-tune, and deploy ML models. No MLOps team required.

    Access Gemini 3 and 200+ models. Build chatbots, agents, or custom models with built-in monitoring and scaling.
    Try Free
  • 5
    WhisperJAV

    WhisperJAV

    Uses Qwen3-ASR, local LLM, Whisper, TEN-VAD

    WhisperJAV is an open-source speech transcription pipeline designed specifically for generating subtitles for Japanese adult video content. The project addresses challenges that standard speech recognition models face when transcribing this type of audio, which often includes low signal-to-noise ratios and large numbers of non-verbal vocalizations. Traditional automatic speech recognition systems can misinterpret these sounds as words, leading to inaccurate transcripts. WhisperJAV introduces a specialized pipeline that separates text generation from timestamp alignment, allowing the system to generate transcripts and then align them with audio using forced alignment techniques. The framework supports several speech recognition models, including Qwen-based ASR systems and fine-tuned Whisper models trained on domain-specific dialogue.
    Downloads: 18 This Week
    Last Update:
    See Project
  • 6
    FinGPT

    FinGPT

    Open-Source Financial Large Language Models

    FinGPT is an open-source, finance-specialized large language model framework that blends the capabilities of general LLMs with real-time financial data feeds, domain-specific knowledge bases, and task-oriented agents to support market analysis, research automation, and decision support. It extends traditional GPT-style models by connecting them to live or historical financial datasets, news APIs, and economic indicators so that outputs are grounded in relevant and recent market conditions rather than generic knowledge alone. The platform typically includes tools for fine-tuning, context engineering, and prompt templating, enabling users to build specialized assistants for tasks like sentiment analysis, earnings summary generation, risk profiling, trading signal interpretation, and document extraction from financial reports.
    Downloads: 16 This Week
    Last Update:
    See Project
  • 7
    GLM-4

    GLM-4

    GLM-4 series: Open Multilingual Multimodal Chat LMs

    GLM-4 is a family of open models from ZhipuAI that spans base, chat, and reasoning variants at both 32B and 9B scales, with long-context support and practical local-deployment options. The GLM-4-32B-0414 models are trained on ~15T high-quality data (including substantial synthetic reasoning data), then post-trained with preference alignment, rejection sampling, and reinforcement learning to improve instruction following, coding, function calling, and agent-style behaviors. The GLM-Z1-32B-0414 line adds deeper mathematical, coding, and logical reasoning via extended reinforcement learning and pairwise ranking feedback, while GLM-Z1-Rumination-32B-0414 introduces a “rumination” mode that performs longer, tool-using deep research for complex, open-ended tasks. A lightweight GLM-Z1-9B-0414 brings many of these techniques to a smaller model, targeting strong reasoning under tight resource budgets.
    Downloads: 15 This Week
    Last Update:
    See Project
  • 8
    Wiseflow

    Wiseflow

    Enhance any agent's browser use skill

    Wiseflow is an open-source information extraction and knowledge discovery system designed to collect, filter, and organize valuable information from large volumes of online content. The platform continuously monitors specified sources such as websites, social platforms, and other digital channels to identify relevant data according to user-defined interests or topics. By combining web crawling, content parsing, and large language model analysis, the system extracts concise insights from raw information streams and converts them into structured data that can be stored or analyzed. This automated workflow helps reduce the noise associated with large information ecosystems and highlights the most important insights for users. Wiseflow can automatically categorize extracted content, assign tags, and upload processed results into databases or knowledge systems for further use.
    Downloads: 15 This Week
    Last Update:
    See Project
  • 9
    Qwen-2.5-VL

    Qwen-2.5-VL

    Qwen2.5-VL is the multimodal large language model series

    Qwen2.5 is a series of large language models developed by the Qwen team at Alibaba Cloud, designed to enhance natural language understanding and generation across multiple languages. The models are available in various sizes, including 0.5B, 1.5B, 3B, 7B, 14B, 32B, and 72B parameters, catering to diverse computational requirements. Trained on a comprehensive dataset of up to 18 trillion tokens, Qwen2.5 models exhibit significant improvements in instruction following, long-text generation (exceeding 8,000 tokens), and structured data comprehension, such as tables and JSON formats. They support context lengths up to 128,000 tokens and offer multilingual capabilities in over 29 languages, including Chinese, English, French, Spanish, and more. The models are open-source under the Apache 2.0 license, with resources and documentation available on platforms like Hugging Face and ModelScope.
    Downloads: 14 This Week
    Last Update:
    See Project
  • Custom VMs From 1 to 96 vCPUs With 99.95% Uptime Icon
    Custom VMs From 1 to 96 vCPUs With 99.95% Uptime

    General-purpose, compute-optimized, or GPU/TPU-accelerated. Built to your exact specs.

    Live migration and automatic failover keep workloads online through maintenance. One free e2-micro VM every month.
    Try Free
  • 10
    Dramatron

    Dramatron

    Dramatron uses large language models to generate coherent scripts

    Dramatron is an interactive co-writing tool developed by Google DeepMind that leverages large language models to help authors create screenplays and theatre scripts. It uses a hierarchical story generation approach to maintain coherence and structure across multiple levels of a narrative, from a single logline to detailed character descriptions, locations, plot points, and dialogue. Dramatron operates as a creative assistant rather than a fully autonomous system, offering human writers material to edit, adapt, and reinterpret. It was evaluated through user studies with professional playwrights and screenwriters, who found it particularly valuable for world-building, idea generation, and exploring alternative plotlines. The system can be run locally or in Google Colab, where users can integrate their own large language models by implementing sampling functions.
    Downloads: 13 This Week
    Last Update:
    See Project
  • 11
    FreedomGPT

    FreedomGPT

    React and Electron-based app that executes the FreedomGPT LLM locally

    FreedomGPT is a locally executed large language model (LLM) application built using React and Electron, allowing users to interact with AI models privately on their Mac or Windows devices. The app enables offline operation, ensuring privacy and security while providing a chat-based interface for seamless communication with the AI. It supports integration with models like Liberty Edge and offers an open-source solution for those seeking more control over their AI interactions. The app's setup is simple, and it includes clear installation guides for both macOS and Windows platforms, as well as detailed instructions for building necessary libraries like llama.cpp.
    Downloads: 12 This Week
    Last Update:
    See Project
  • 12
    Awesome LLM Apps

    Awesome LLM Apps

    Collection of awesome LLM apps with AI Agents and RAG using OpenAI

    Awesome LLM Apps is a community-curated directory of interesting, practical, and innovative applications built on or around large language models, serving as a discovery hub for developers, researchers, and enthusiasts. The list spans a wide range of categories including productivity tools, creative assistants, utilities, education platforms, research frameworks, and niche vertical apps, showcasing how generative models are being used across domains. Each entry includes a brief description, language model dependencies, technology stack notes, and sometimes links to demos or source code, making it easy to explore ideas and reuse concepts for your own projects. Because the landscape of LLM-powered applications changes quickly, the repository is designed to be updated regularly through community contributions, ensuring it stays current with new tools and releases.
    Downloads: 11 This Week
    Last Update:
    See Project
  • 13
    PrivateGPT

    PrivateGPT

    Interact with your documents using the power of GPT

    PrivateGPT is a production-ready, privacy-first AI system that allows querying of uploaded documents using LLMs, operating completely offline in your own environment. It provides contextual generative AI capabilities without sending data externally. Now maintained under Zylon.ai with enterprise deployment options (air gapped, cloud, or on-prem).
    Downloads: 11 This Week
    Last Update:
    See Project
  • 14
    Qwen-Image

    Qwen-Image

    Qwen-Image is a powerful image generation foundation model

    Qwen-Image is a powerful 20-billion parameter foundation model designed for advanced image generation and precise editing, with a particular strength in complex text rendering across diverse languages, especially Chinese. Built on the MMDiT architecture, it achieves remarkable fidelity in integrating text seamlessly into images while preserving typographic details and layout coherence. The model excels not only in text rendering but also in a wide range of artistic styles, including photorealistic, impressionist, anime, and minimalist aesthetics. Qwen-Image supports sophisticated editing tasks such as style transfer, object insertion and removal, detail enhancement, and even human pose manipulation, making it suitable for both professional and casual users. It also includes advanced image understanding capabilities like object detection, semantic segmentation, depth and edge estimation, and novel view synthesis.
    Downloads: 11 This Week
    Last Update:
    See Project
  • 15
    SD.Next

    SD.Next

    All-in-one WebUI for AI generative image and video creation

    SD.Next is an all-in-one web user interface for generative image creation that expands beyond basic Stable Diffusion workflows to cover broader image and video generation, captioning, and processing tasks. It is designed as a power-user environment where model management, generation features, and workflow controls are centralized in a single UI rather than spread across separate scripts and utilities. The project emphasizes broad model support and includes mechanisms for discovering, downloading, and configuring models through integrated tooling, lowering the setup burden for experimentation. It also provides documentation and an ecosystem of guides that help users move from basic generation to more advanced usage patterns, including API-based automation. SD.Next is built to run across common desktop platforms and focuses on practicality: install, generate, iterate, and automate with minimal friction.
    Downloads: 11 This Week
    Last Update:
    See Project
  • 16
    CodeGeeX

    CodeGeeX

    CodeGeeX: An Open Multilingual Code Generation Model (KDD 2023)

    CodeGeeX is a large-scale multilingual code generation model with 13 billion parameters, trained on 850B tokens across more than 20 programming languages. Developed with MindSpore and later made PyTorch-compatible, it is capable of multilingual code generation, cross-lingual code translation, code completion, summarization, and explanation. It has been benchmarked on HumanEval-X, a multilingual program synthesis benchmark introduced alongside the model, and achieves state-of-the-art performance compared to other open models like InCoder and CodeGen. CodeGeeX also powers IDE plugins for VS Code and JetBrains, offering features like code completion, translation, debugging, and annotation. The model supports Ascend 910 and NVIDIA GPUs, with optimizations like quantization and FasterTransformer acceleration for faster inference.
    Downloads: 10 This Week
    Last Update:
    See Project
  • 17
    LLaMA 3

    LLaMA 3

    The official Meta Llama 3 GitHub site

    This repository is the former home for Llama 3 model artifacts and getting-started code, covering pre-trained and instruction-tuned variants across multiple parameter sizes. It introduced the public packaging of weights, licenses, and quickstart examples that helped developers fine-tune or run the models locally and on common serving stacks. As the Llama stack evolved, Meta consolidated repositories and marked this one deprecated, pointing users to newer, centralized hubs for models, utilities, and docs. Even as a deprecated repo, it documents the transition path and preserves references that clarify how Llama 3 releases map into the current ecosystem. Practically, it functioned as a bridge between Llama 2 and later Llama releases by standardizing distribution and starter code for inference and fine-tuning. Teams still treat it as historical reference material for version lineage and migration notes.
    Downloads: 10 This Week
    Last Update:
    See Project
  • 18
    LangChain

    LangChain

    ⚡ Building applications with LLMs through composability ⚡

    Large language models (LLMs) are emerging as a transformative technology, enabling developers to build applications that they previously could not. But using these LLMs in isolation is often not enough to create a truly powerful app - the real power comes when you can combine them with other sources of computation or knowledge. This library is aimed at assisting in the development of those types of applications.
    Downloads: 10 This Week
    Last Update:
    See Project
  • 19
    LiteLLM

    LiteLLM

    lightweight package to simplify LLM API calls

    Call all LLM APIs using the OpenAI format [Anthropic, Huggingface, Cohere, Azure OpenAI etc.] liteLLM supports streaming the model response back, pass stream=True to get a streaming iterator in response. Streaming is supported for OpenAI, Azure, Anthropic, and Huggingface models.
    Downloads: 10 This Week
    Last Update:
    See Project
  • 20
    MetaGPT

    MetaGPT

    The Multi-Agent Framework

    The Multi-Agent Framework: Given one line Requirement, return PRD, Design, Tasks, Repo. Assign different roles to GPTs to form a collaborative software entity for complex tasks. MetaGPT takes a one-line requirement as input and outputs user stories / competitive analysis/requirements/data structures / APIs / documents, etc. Internally, MetaGPT includes product managers/architects/project managers/engineers. It provides the entire process of a software company along with carefully orchestrated SOPs.
    Downloads: 10 This Week
    Last Update:
    See Project
  • 21
    oh my PI

    oh my PI

    AI Coding agent for the terminal

    Oh-My-Pi is an open-source AI agent toolkit focused on creating intelligent coding assistants that operate directly from the terminal environment. The project provides a command-line coding agent capable of analyzing repositories, generating commits, editing code, and interacting with development tools through an integrated tool system. Instead of functioning as a simple prompt-based assistant, the system includes an agent architecture that can inspect Git repositories, analyze changes, and perform development actions with fine-grained control. The platform also supports tool-based workflows where the agent can run shell commands, read files, modify code, and stage changes during development tasks. It includes infrastructure for integrating different AI providers and models through a unified API layer, allowing developers to switch between models while keeping the same agent interface.
    Downloads: 10 This Week
    Last Update:
    See Project
  • 22
    DeepSeek LLM

    DeepSeek LLM

    DeepSeek LLM: Let there be answers

    The DeepSeek-LLM repository hosts the code, model files, evaluations, and documentation for DeepSeek’s LLM series (notably the 67B Chat variant). Its tagline is “Let there be answers.” The repo includes an “evaluation” folder (with results like math benchmark scores) and code artifacts (e.g. pre-commit config) that support model development and deployment. According to the evaluation files, DeepSeek LLM 67B Chat achieves strong performance on math benchmarks under both chain-of-thought (CoT) and tool-assisted reasoning modes. The model is trained from scratch, reportedly on a vast multilingual + code + reasoning dataset, and competes with other open or open-weight models. The architecture mirrors established decoder-only transformer families: pre-norm structure, rotational embeddings (RoPE), grouped query attention (GQA), and mixing in languages and tasks. It supports both “Base” (foundation model) and “Chat” (instruction / conversation tuned) variants.
    Downloads: 9 This Week
    Last Update:
    See Project
  • 23
    Emscripten

    Emscripten

    Emscripten: An LLVM-to-WebAssembly Compiler

    Emscripten is a complete open-source compiler toolchain that transforms C, C++, and other LLVM-based source code into WebAssembly (and JavaScript), enabling native‑like applications to run in web browsers, Node.js, and other Wasm environments. While Emscripten mostly focuses on compiling C and C++ using Clang, it can be integrated with other LLVM-using compilers (for example, Rust has Emscripten integration, with the wasm32-unknown-emscripten and asmjs-unknown-emscripten targets). Emscripten provides Web support for popular portable APIs such as OpenGL and SDL2, allowing complex graphical native applications to be ported, such as the Unity game engine and Google Earth. It can probably port your codebase, too.
    Downloads: 9 This Week
    Last Update:
    See Project
  • 24
    Khoj

    Khoj

    An AI personal assistant for your digital brain

    Get more done with your open-source AI personal assistant. Khoj is a desktop application to search and chat with your notes, documents, and images. It is an offline-first, open-source AI personal assistant that is accessible from Emacs, Obsidian or your Web browser. Khoj is a thinking tool that is transparent, fun, and easy to engage with. You can build faster and better by using Khoj to search and reason across all your data sources. Khoj learns from your notes and documents to function as an extension of your brain. So that you can stay focused on doing what matters. Khoj started with the founding principle that a personal assistant be understandable, accessible and hackable. This means you can always customize and self-host your Khoj on your own machines.
    Downloads: 9 This Week
    Last Update:
    See Project
  • 25
    LandPPT

    LandPPT

    An LLM-based presentation generation platform

    LandPPT is an open-source AI platform that automatically generates professional presentation slides using large language models. The system allows users to create complete PowerPoint presentations simply by entering a topic or uploading source documents such as PDFs, Word files, or Markdown notes. Using natural language processing and structured content generation, the platform produces presentation outlines and converts them into fully formatted slide decks. The application integrates multiple AI models from providers such as OpenAI, Anthropic, Google, and locally hosted models to generate text, images, and structured presentation layouts. It also includes template systems and style options that allow presentations to be customized for different industries, visual themes, or storytelling formats.
    Downloads: 9 This Week
    Last Update:
    See Project
MongoDB Logo MongoDB