Showing 100 open source projects for "llm local"

  • 1
    WhisperJAV

    Uses Qwen3-ASR, local LLM, Whisper, TEN-VAD

    WhisperJAV is an open-source speech transcription pipeline designed specifically for generating subtitles for Japanese adult video content. The project addresses challenges that standard speech recognition models face when transcribing this type of audio, which often includes low signal-to-noise ratios and large numbers of non-verbal vocalizations. Traditional automatic speech recognition systems can misinterpret these sounds as words, leading to inaccurate transcripts. WhisperJAV introduces...
    Downloads: 19 This Week
  • 2
    LangChain-ChatGLM-Webui

    Automatic question answering for local knowledge bases based on LLMs

    LangChain-ChatGLM-Webui is an open-source web interface that integrates the ChatGLM large language model with the LangChain framework to create an interactive conversational AI platform. The project provides a graphical interface that allows users to interact with language models through chat sessions while also connecting those models to external knowledge sources. It supports retrieval-augmented generation workflows that enable the system to answer questions based on local documents or...
    Downloads: 0 This Week
  • 3
    dstack

    Open-source tool designed to enhance the efficiency of workloads

    dstack is an open-source tool designed to enhance the efficiency of running ML workloads in any cloud (AWS, GCP, Azure, Lambda, etc.). It streamlines development and deployment, reduces cloud costs, and frees users from vendor lock-in.
    Downloads: 0 This Week
  • 4
    Qwen3

    Qwen3 is the large language model series developed by the Qwen team

    Qwen3 is a cutting-edge large language model (LLM) series developed by the Qwen team at Alibaba Cloud. The latest updated version, Qwen3-235B-A22B-Instruct-2507, features significant improvements in instruction-following, reasoning, knowledge coverage, and long-context understanding up to 256K tokens. It delivers higher quality and more helpful text generation across multiple languages and domains, including mathematics, coding, science, and tool usage. Various quantized versions,...
    Downloads: 19 This Week
  • 5
    METATRON

    AI-powered penetration testing assistant using a local LLM on Linux

    METATRON is a multi-agent AI orchestration framework designed to coordinate complex workflows across multiple intelligent agents. It provides a structured system for task delegation, communication, and collaboration between agents. The framework emphasizes scalability, allowing multiple agents to work together on large or complex problems. It includes mechanisms for managing context, memory, and execution flow across tasks. METATRON is particularly useful for building advanced AI systems...
    Downloads: 0 This Week
  • 6
    ChatGPT Clone

    ChatGPT interface with better UI

    ...While it illustrates how to hook into third-party LLM endpoints, it is typically positioned as an educational, self-hosted starter that you should operate responsibly and within the provider's terms of use.
    Downloads: 12 This Week
  • 7
    yt-fts

    Search all of YouTube from the command line

    yt-fts, short for YouTube Full Text Search, is an open-source command-line tool that enables users to search the spoken content of YouTube videos by indexing their subtitles. The program automatically downloads subtitles from a specified YouTube channel using the yt-dlp utility and stores them in a local SQLite database. Once indexed, users can perform full-text searches across all transcripts to quickly locate keywords or phrases mentioned within the videos. The tool returns search results...
    Downloads: 2 This Week
  • 8
    PocketFlow Tutorial Codebase Knowledge
    ...It supports both GitHub URL crawling and local directory analysis, and can tailor output tutorials to different languages, making it accessible for international developers.
    Downloads: 2 This Week
  • 9
    OpenAI Forward

    An efficient forwarding service designed for LLMs

    ...Its main purpose is to make model access more manageable and efficient by adding operational controls such as request rate limiting, token rate limiting, caching, logging, routing, and key management around existing LLM endpoints. The project can proxy both local and cloud-hosted language model services, which makes it useful for teams that want a single control layer regardless of whether they are using something like LocalAI or a hosted provider compatible with OpenAI-style APIs. A major emphasis of the repository is asynchronous performance, using tools such as uvicorn, aiohttp, and asyncio to support high-throughput forwarding workloads.
    Downloads: 0 This Week
  • 10
    QA-Pilot

    QA-Pilot is an interactive chat project that leverages LLMs

    ...It enables users to clone repositories locally and then interact with them through a conversational interface, allowing for rapid exploration of codebases without manual searching. The system supports both local and cloud-based LLM providers, making it flexible for different environments and privacy requirements. It includes features such as chat session storage, search functionality for retrieving previous interactions, and multi-session management for working on multiple repositories simultaneously. QA-Pilot also integrates code visualization tools such as code graphs, helping users better understand file structures and relationships within projects. ...
    Downloads: 0 This Week
  • 11
    RamaLama

    Simplifies the local serving of AI models from any source

    RamaLama is an open-source developer tool that simplifies working with and serving AI models locally or in production by leveraging container technologies like Docker, Podman, and OCI registries, allowing AI inference workflows to be treated like standard container deployments. It abstracts away much of the complexity of configuring AI runtimes, dependencies, and hardware optimizations by detecting available GPUs (or falling back to CPU) and automatically pulling a container image...
    Downloads: 1 This Week
  • 12
    Self-hosted AI Package

    Run all your local AI together in one package

    Self-hosted AI Package is an open-source Docker Compose-based starter kit that makes it easy to bootstrap a full local AI and low-code development environment with commonly used open tools, empowering developers to run LLMs and AI workflows entirely on their infrastructure. The stack typically includes Ollama for running local large language models, n8n as a low-code workflow automation platform, Supabase for database and vector storage, Open WebUI for interacting with models, Flowise for...
    Downloads: 1 This Week
  • 13
    Unsloth-MLX

    Bringing the Unsloth experience to Mac users via Apple's MLX framework

    Unsloth-MLX offers developers the power of Unsloth’s efficient large language model fine-tuning experience on Apple Silicon Macs by wrapping Apple’s native MLX framework with an API fully compatible with Unsloth workflows. This project removes traditional barriers that prevent Mac users from prototyping and experimenting with LLM training locally by allowing the same code used in cloud GPU environments to run on M-series hardware, improving workflow continuity and reducing iteration costs....
    Downloads: 2 This Week
  • 14
    Unstructured.IO

    Open source libraries and APIs to build custom preprocessing pipelines

    The unstructured library provides open-source components for ingesting and pre-processing images and text documents, such as PDFs, HTML, Word docs, and many more. Its use cases revolve around streamlining and optimizing the data processing workflow for LLMs. The library's modular bricks and connectors form a cohesive system that simplifies data ingestion and pre-processing, making it adaptable to different platforms and efficient at transforming unstructured data into...
    Downloads: 2 This Week
  • 15
    Pathway AI Pipelines

    Ready-to-run cloud templates for RAG

    Pathway AI Pipelines is a collection of ready-to-deploy AI pipeline templates designed to help developers rapidly build production-grade retrieval-augmented generation and enterprise search applications. The project provides end-to-end examples that connect live data sources to LLM workflows, enabling applications to stay synchronized with continuously changing information. It supports numerous connectors including local files, Google Drive, SharePoint, Kafka, PostgreSQL, and real-time APIs, making it suitable for enterprise data environments. The templates include built-in indexing, vector search, hybrid search, and caching capabilities that remove the need to assemble separate infrastructure components. ...
    Downloads: 0 This Week
  • 16
    LMCache

    Supercharge Your LLM with the Fastest KV Cache Layer

    LMCache is an extension layer for LLM serving engines that accelerates inference, especially with long contexts, by storing and reusing key-value (KV) attention caches across requests. Instead of rebuilding KV states for repeated or shared text segments, LMCache persists them across multiple tiers (GPU memory, CPU DRAM, and local disk), retrieves them, and injects them into subsequent requests to reduce time to first token (TTFT) and increase throughput.
    Downloads: 1 This Week
  • 17
    Elia

    Terminal-based LLM chat tool with multi-model and local support

    Elia is an open-source terminal-based interface for interacting with large language models quickly and efficiently. It runs entirely in the command line, offering a keyboard-driven experience that reduces the need to switch between apps. Users can chat with both proprietary models such as ChatGPT and Claude and local models such as Llama 3, Mistral, and Gemma. Elia stores conversations in a local SQLite database, making it easy to revisit past interactions. It supports...
    Downloads: 0 This Week
  • 18
    OneFileLLM

    Specify a GitHub or local repo, GitHub pull request

    OneFileLLM is an open-source project designed to simplify the distribution and execution of large language model applications by packaging them into a single portable file. The concept behind the project is to eliminate the complexity normally associated with deploying AI systems, which often require multiple dependencies, frameworks, and configuration steps. Instead, the entire runtime environment, model interface, and application logic are bundled together into a single executable...
    Downloads: 0 This Week
  • 19
    Chat with LLMs Everywhere

    Run PyTorch LLMs locally on servers, desktop and mobile

    TorchChat is an open-source project from the PyTorch ecosystem designed to demonstrate how large language models can be executed efficiently across different computing environments. The project provides a compact codebase that illustrates how to run conversational AI systems using PyTorch models on laptops, servers, and mobile devices. It is intended primarily as a reference implementation that shows developers how to integrate large language models into applications without requiring a...
    Downloads: 1 This Week
  • 20
    Mistral Inference

    Official inference library for Mistral models

    Open and portable generative AI for devs and businesses. We release open-weight models for everyone to customize and deploy wherever they want. Our super-efficient model Mistral Nemo is available under Apache 2.0, while Mistral Large 2 is available under both a free non-commercial license and a commercial license.
    Downloads: 0 This Week
  • 21
    Sage Chat

    Chat with any codebase in under two minutes | Fully local

    Sage is an open-source AI developer assistant designed to help engineers understand and work with complex codebases more effectively. The tool functions similarly to an intelligent research agent that can analyze a repository and answer questions about how the software works. Instead of focusing solely on code generation, Sage emphasizes code comprehension, system architecture analysis, and integration guidance. Developers can ask natural language questions about a project, and the system...
    Downloads: 0 This Week
  • 22
    SeaGOAT

    Local-first semantic code search engine

    SeaGOAT is an open-source semantic code search engine designed to help developers explore and understand large codebases more efficiently. Instead of relying solely on traditional keyword search, it uses vector embeddings to represent the meaning of code and queries, allowing users to perform semantic searches that find relevant code even when the exact keywords are not present. The tool runs locally on a developer’s machine and processes repositories using a combination of embedding models...
    Downloads: 0 This Week
  • 23
    text-extract-api

    Document (PDF, Word, PPTX ...) extraction and parse API

    text-extract-api is an open-source service designed to extract readable text from a wide variety of document formats through a simple API interface. The project focuses on converting complex files such as PDFs, images, scanned documents, and office files into structured plain text that can be processed by downstream applications or language models. Instead of requiring developers to integrate multiple document parsing libraries individually, the system centralizes text extraction...
    Downloads: 0 This Week
  • 24
    DataChain

    AI-data warehouse to enrich, transform and analyze unstructured data

    DataChain enables multimodal API calls and local AI inference to run in parallel over many samples as chained operations. The resulting datasets can be saved, versioned, and sent directly to PyTorch and TensorFlow for training. DataChain can persist features of Python objects returned by AI models and enables vectorized analytical operations over them. Typical use cases are data curation, LLM analytics and validation, image segmentation, pose detection, and GenAI alignment. ...
    Downloads: 1 This Week
  • 25
    TradingAgents

    Chinese Financial Trading Framework Based on Multi-Agent LLM

    TradingAgents-CN is a Chinese-enhanced, multi-agent LLM framework aimed at building financial analysis and trading-oriented workflows, with an emphasis on collaboration between specialized agents rather than a single monolithic prompt. It organizes market-related tasks into roles and stages so different agents can contribute research, reasoning, aggregation, and decision support in a structured pipeline. The project is oriented toward practical usage, including a stack that can be run in a...
    Downloads: 4 This Week