Showing 63 open source projects for "llama"

  • 1
    Axolotl

    Go ahead and axolotl questions

    Axolotl is a powerful and flexible framework for fine-tuning large language models on custom datasets. Built for researchers and developers, Axolotl simplifies the process of adapting LLMs for specific tasks, including chat, code generation, and instruction following. It supports a wide variety of model architectures and offers out-of-the-box optimization strategies for efficient training.
    Downloads: 4 This Week
  • 2
    kotaemon

    An open-source RAG-based tool for chatting with your documents

    An open-source clean & customizable RAG UI for chatting with your documents. Built with both end users and developers in mind. This project serves as a functional RAG UI for both end users who want to do QA on their documents and developers who want to build their own RAG pipeline.
    Downloads: 6 This Week
  • 3
    OpenLLM

    Operating LLMs in production

    ...With OpenLLM, you can run inference with any open-source large language model, deploy to the cloud or on-premises, and build powerful AI apps. It has built-in support for a wide range of open-source LLMs and model runtimes, including Llama 2, StableLM, Falcon, Dolly, Flan-T5, ChatGLM, StarCoder, and more. Serve LLMs over a RESTful API or gRPC with one command, and query via the WebUI, the CLI, our Python/JavaScript client, or any HTTP client.
    Downloads: 13 This Week
  • 4
    LLaMA-MoE

    Building Mixture-of-Experts from LLaMA with Continual Pre-training

    LLaMA-MoE is an open-source project that builds mixture-of-experts language models from LLaMA through expert partitioning and continual pre-training. The repository is centered on making MoE research more accessible by offering smaller and more affordable models with only about 3.0 to 3.5 billion activated parameters, which helps reduce deployment and experimentation costs.
    Downloads: 4 This Week
  • 5
    SGLang

    SGLang is a fast serving framework for large language models

    SGLang is a fast serving framework for large language models and vision language models. It makes your interaction with models faster and more controllable by co-designing the backend runtime and frontend language.
    Downloads: 7 This Week
  • 6
    Lepton AI

    A Pythonic framework to simplify AI service building

    A Pythonic framework to simplify AI service building. Cutting-edge AI inference and training, unmatched cloud-native experience, and top-tier GPU infrastructure. Ensure 99.9% uptime with comprehensive health checks and automatic repairs.
    Downloads: 4 This Week
  • 7
    Curated Transformers

    PyTorch library of curated Transformer models and their components

    ...It provides state-of-the-art models that are composed of a set of reusable components. Supports state-of-the-art transformer models, including LLMs such as Falcon, Llama, and Dolly v2. Implementing a feature or bugfix benefits all models. For example, all models support 4/8-bit inference through the bitsandbytes library and each model can use the PyTorch meta device to avoid unnecessary allocations and initialization.
    Downloads: 7 This Week
  • 8
    QuivrHQ

    Opinionated RAG for integrating GenAI in your apps

    Quivr is an open-source platform that leverages Retrieval-Augmented Generation (RAG) to integrate Generative AI into applications. It serves as a "second brain," enabling users to build powerful AI-driven assistants that can process and retrieve information efficiently. Quivr supports various large language models and vector stores, providing flexibility and customization for developers.
    Downloads: 0 This Week
  • 9
    Elia

    Terminal-based LLM chat tool with multi-model and local support

    ...It runs entirely in the command line, offering a keyboard-driven experience that reduces the need for switching between apps. Users can chat with proprietary models like ChatGPT and Claude as well as local models such as Llama 3, Mistral, and Gemma. Elia stores conversations in a local SQLite database, making it easy to revisit past interactions. It supports flexible usage with inline and full-screen chat modes, along with simple configuration through a single file. Installation is straightforward via pipx, and users can customize themes, system prompts, and model settings. ...
    Downloads: 9 This Week
  • 10
    Cosmos-RL

    Cosmos-RL is a flexible and scalable Reinforcement Learning framework

    ...The framework supports multiple parallelism strategies, including tensor, pipeline, and data parallelism, allowing it to leverage large GPU clusters effectively. It is built with compatibility in mind, supporting popular model families such as LLaMA, Qwen, and diffusion-based world models, as well as integration with Hugging Face ecosystems. cosmos-rl also includes support for advanced RL algorithms, low-precision training, and fault-tolerant execution, making it suitable for large-scale production workloads.
    Downloads: 7 This Week
  • 11
    Pruna AI

    Pruna is a model optimization framework built for developers

    Pruna is an open-source, self-hostable AI inference engine designed to help teams deploy and manage large language models (LLMs) efficiently across private or hybrid infrastructures. Built with performance and developer ergonomics in mind, Pruna simplifies inference workflows by enabling multi-model orchestration, autoscaling, GPU resource allocation, and compatibility with popular open-source models. It is ideal for companies or teams looking to reduce reliance on external APIs while...
    Downloads: 3 This Week
  • 12
    EmoLLM

    Pre & Post-training & Dataset & Evaluation & Deploy & RAG

    ...Its repository includes multiple model variants and training configurations spanning several underlying model families, including InternLM, Qwen, DeepSeek, Mixtral, LLaMA, and others, which shows that the initiative is structured as a broad ecosystem rather than a single release. The project also covers more than just model weights, with material for datasets, fine-tuning, evaluation, deployment, demos, RAG, and related subprojects such as its psychological digital assistant work.
    Downloads: 2 This Week
  • 13
    JAX Toolbox

    Public CI, Docker images for popular JAX libraries

    ...It provides prebuilt Docker images, continuous integration pipelines, and optimized example implementations that help developers quickly set up and run JAX workloads without complex configuration. The project supports popular JAX-based frameworks and models, including architectures used for large-scale pretraining such as GPT and LLaMA variants. By offering curated environments and tested configurations, it reduces compatibility issues and accelerates development workflows for both research and production. The repository also includes performance-optimized examples that demonstrate best practices for leveraging NVIDIA hardware effectively. Its integration with container-based workflows makes it suitable for reproducible experiments and scalable deployments across different environments.
    Downloads: 1 This Week
  • 14
    LLM-Pruner

    On the Structural Pruning of Large Language Models

    LLM-Pruner is an open-source framework designed to compress large language models through structured pruning techniques while maintaining their general capabilities. Large language models often require enormous computational resources, making them expensive to deploy and inefficient for many practical applications. LLM-Pruner addresses this issue by identifying and removing non-essential components within transformer architectures, such as redundant attention heads or feed-forward...
    Downloads: 1 This Week
  • 15
    Intel LLM Library for PyTorch

    Accelerate local LLM inference and finetuning

    ...The framework provides hardware-aware optimizations and low-precision computation techniques that significantly improve the performance of large language models while reducing memory consumption. IPEX-LLM supports a wide range of popular models, including architectures such as LLaMA, Mistral, Qwen, and other transformer-based systems. The library can integrate with common AI frameworks and serving tools such as Hugging Face Transformers, LangChain, and vLLM, allowing developers to incorporate optimized inference into existing pipelines.
    Downloads: 1 This Week
  • 16
    Speech-AI-Forge

    Speech-AI-Forge is a project developed around TTS generation models

    ...It is model-agnostic and advertises support for a variety of TTS and speech models such as ChatTTS, CosyVoice, Fish-Speech, FireredTTS and others, as well as Whisper-based ASR, giving you a flexible playground for experimenting with different speech stacks. The project also integrates with general-purpose LLMs (for example GPT- or LLaMA-style models), which can be used to pre-process text and manage conversations.
    Downloads: 2 This Week
  • 17
    TAME LLM

    Traditional Mandarin LLMs for Taiwan

    TAME LLM is an open-source initiative focused on building and releasing large language models optimized for Traditional Mandarin and the linguistic context of Taiwan. The project includes models such as Llama-3-Taiwan-70B, which are fine-tuned versions of large transformer architectures trained on extensive corpora containing both Traditional Mandarin and English text. These models are designed to support applications such as conversational AI, knowledge retrieval, and domain-specific reasoning in fields like manufacturing, law, healthcare, and electronics. ...
    Downloads: 0 This Week
  • 18
    LongWriter

    Unleashing 10,000+ Word Generation from Long Context LLMs

    LongWriter is an open-source framework and set of large language models designed to enable ultra-long text generation that can exceed 10,000 words while maintaining coherence and structure. Traditional large language models can process large inputs but often struggle to generate long outputs due to limitations in training data and alignment strategies. LongWriter addresses this challenge by introducing a specialized dataset and training approach that encourages models to produce longer...
    Downloads: 0 This Week
  • 19
    Chat with LLMs Everywhere

    Run PyTorch LLMs locally on servers, desktop and mobile

    ...TorchChat supports running models through Python interfaces as well as integrating them directly into native applications written in languages such as C or C++. The project also demonstrates how modern LLMs like LLaMA-style models can be deployed locally while maintaining good performance across different hardware platforms.
    Downloads: 0 This Week
  • 20
    Penzai

    A JAX research toolkit to build, edit, & visualize neural networks

    Penzai, developed by Google DeepMind, is a JAX-based library for representing, visualizing, and manipulating neural network models as functional pytree data structures. It is designed to make machine learning research more interpretable and interactive, particularly for tasks like model surgery, ablation studies, architecture debugging, and interpretability research. Unlike conventional neural network libraries, Penzai exposes the full internal structure of models, enabling fine-grained...
    Downloads: 0 This Week
  • 21
    Synthetic Data Kit

    Tool for generating high quality Synthetic datasets

    Synthetic Data Kit is a CLI-centric toolkit for generating high-quality synthetic datasets to fine-tune Llama models, with an emphasis on producing reasoning traces and QA pairs that line up with modern instruction-tuning formats. It ships an opinionated, modular workflow that covers ingesting heterogeneous sources (documents, transcripts), prompting models to create labeled examples, and exporting to fine-tuning schemas with minimal glue code.
    Downloads: 0 This Week
  • 22
    h2oGPT

    Private chat with local GPT with document, images, video, etc.

    h2oGPT is an open-source platform that allows users to interact with local GPT models in a completely private environment. It supports a variety of document types, including PDFs, Word files, images, video frames, and even audio, enabling users to query and analyze their documents or engage in a private chat with AI. The platform is designed to be secure and offline, ensuring that all data remains private and under the user's control. h2oGPT supports several AI models, including Ollama and...
    Downloads: 0 This Week
  • 23
    Chinese-LLaMA-Alpaca 2

    Chinese LLaMA-2 & Alpaca-2 Large Model Phase II Project

    This project is built on Llama-2, the commercially usable large model released by Meta. It is the second phase of the Chinese LLaMA & Alpaca large model project. The Chinese LLaMA-2 base model and the Alpaca-2 instruction-tuned model are open-sourced. These models extend and optimize the Chinese vocabulary of the original Llama-2, use large-scale Chinese data for incremental pre-training, and further improve the models' basic semantic and instruction understanding of Chinese. ...
    Downloads: 0 This Week
  • 24
    CogVLM2

    GPT4V-level open-source multi-modal model based on Llama3-8B

    CogVLM2 is the second generation of the CogVLM vision-language model series, developed by ZhipuAI and released in 2024. Built on Meta-Llama-3-8B-Instruct, CogVLM2 significantly improves over its predecessor by providing stronger performance across multimodal benchmarks such as TextVQA, DocVQA, and ChartQA, while introducing extended context length support of up to 8K tokens and high-resolution image input up to 1344×1344. The series includes models for both image understanding and video understanding, with CogVLM2-Video supporting up to 1-minute videos by analyzing keyframes. ...
    Downloads: 0 This Week
  • 25
    CSM (Conversational Speech Model)

    A Conversational Speech Generation Model

    The CSM (Conversational Speech Model) is a speech generation model developed by Sesame AI that creates RVQ audio codes from text and audio inputs. It uses a Llama backbone and a smaller audio decoder to produce audio codes for realistic speech synthesis. The model has been fine-tuned for interactive voice demos and is hosted on platforms like Hugging Face for testing. CSM offers a flexible setup and is compatible with CUDA-enabled GPUs for efficient execution.
    Downloads: 2 This Week