Showing 248 open source projects for "memory"

View related business solutions
  • Build Agents and Models on One Platform Icon
    Build Agents and Models on One Platform

    Everything you need to build production-ready agents and models. Access 200+ Google and third-party AI models and tools.

    Gemini Enterprise Agent Platform is Google Cloud's comprehensive platform for developers to build, scale, govern, and optimize agents and models. Choose from Google's most advanced models and third-party models like Anthropic's Claude Model Family.
    Try It Free
  • Our Free Plans just got better! | Auth0 Icon
    Our Free Plans just got better! | Auth0

    With up to 25k MAUs and unlimited Okta connections, our Free Plan lets you focus on what you do best—building great apps.

    You asked, we delivered! Auth0 is excited to expand our Free and Paid plans to include more options so you can focus on building, deploying, and scaling applications without having to worry about your security. Auth0 now, thank yourself later.
    Try free now
  • 1
    Open-LLM-VTuber

    Open-LLM-VTuber

    Open source AI VTuber platform with voice chat and Live2D avatars

    Open-LLM-VTuber is an open source platform designed to create AI-powered VTuber characters that can interact with users through voice and animated avatars. It enables hands-free conversations with large language models by combining speech recognition, language processing, and text-to-speech synthesis into a single system. Users can speak directly to the AI character, and the system can respond with a generated voice while animating a Live2D avatar to simulate a talking virtual personality....
    Downloads: 20 This Week
    Last Update:
    See Project
  • 2
    Bailing

    Bailing

    Bailing is a voice dialogue robot similar to GPT-4o

    ...It aims to be light enough to run without a GPU, making it usable on modest hardware or edge devices, while still maintaining low latency and smooth interaction. Bailing includes a memory system, giving the assistant the ability to remember user preferences and context across sessions, which enables more personalized and context-aware conversations.
    Downloads: 1 This Week
    Last Update:
    See Project
  • 3
    Whisper-WebUI

    Whisper-WebUI

    A Web UI for easy subtitle using whisper model

    ...Built with Gradio, it allows users to upload audio or video files, process them locally, and generate accurate text outputs without relying on command-line tools. The platform integrates optimized implementations such as faster-whisper, significantly improving transcription speed and reducing memory usage compared to standard models. It supports multiple input sources including local files, YouTube content, and microphone input, making it versatile for different workflows. Whisper WebUI also includes advanced preprocessing and postprocessing features such as voice activity detection, background music separation, and speaker diarization, enabling more accurate and structured outputs.
    Downloads: 11 This Week
    Last Update:
    See Project
  • 4
    verl-agent

    verl-agent

    Designed for training LLM/VLM agents via RL

    ...This step-wise interaction model makes it possible to train agents to operate in long-horizon scenarios where decisions depend on cumulative context and previous outcomes. Developers can configure memory modules that determine how historical information is stored and incorporated into each step of the reasoning process.
    Downloads: 0 This Week
    Last Update:
    See Project
  • $300 Free Credits to Build on Google Cloud Icon
    $300 Free Credits to Build on Google Cloud

    New to Google Cloud? Get $300 in credits to explore Compute Engine, BigQuery, Cloud Run, Gemini Enterprise Agent Platform, and more.

    Start your next project with $300 in free Google Cloud credit. Spin up VMs, run containers, query petabytes in BigQuery, or build agents with Gemini Enterprise Agent Platform. Once your credits are used, keep building with 20+ always-free tier products including Compute Engine, Cloud Storage, GKE, and Cloud Run functions. No commitment required—just sign up and start building.
    Claim $300 Free
  • 5
    NagaAgent

    NagaAgent

    A simple yet powerful agent framework for personal assistants

    ...It provides abstractions for representing goals, context, and state so that agents can plan sequences of actions, evaluate outcomes, and adjust behavior over time. The project includes mechanisms for semantic memory, reasoning pipelines, and integration points with external data sources and language models so that agents can interpret natural language instructions and produce coherent multi-step outputs. Rather than being a simple chatbot, NagaAgent emphasizes persistent thought cycles, context retention, and the ability to decompose complex tasks into smaller executable units, earning it a place in research explorations of agent design. ...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 6
    DLRM

    DLRM

    An implementation of a deep learning recommendation model (DLRM)

    ...It includes data loaders for standard benchmarks (like Criteo), training scripts, evaluation tools, and capabilities like mixed precision, gradient compression, and memory fusion to maximize throughput.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 7
    Ring

    Ring

    Ring is a reasoning MoE LLM provided and open-sourced by InclusionAI

    ...Its architectures and training approaches are tuned to enable efficient and capable reasoning performance. Reasoning-optimized model with reinforcement learning enhancements. Efficient architecture and memory design for large-scale reasoning. If you are located in mainland China, we also provide the model on ModelScope.cn to speed up the download process.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 8
    Tencent-Hunyuan-Large

    Tencent-Hunyuan-Large

    Open-source large language model family from Tencent Hunyuan

    ...It is designed with long-context capabilities, quantization support, and high performance on benchmarks across general reasoning, mathematics, language understanding, and Chinese / multilingual tasks. It aims to provide competitive capability with efficient deployment and inference. FP8 quantization support to reduce memory usage (~50%) while maintaining precision. High benchmarking performance on tasks like MMLU, MATH, CMMLU, C-Eval, etc.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 9
    Mem0

    Mem0

    The Memory layer for AI Agents

    Mem0 is a self-improving memory layer designed for Large Language Model (LLM) applications, enabling personalized AI experiences that save costs and delight users. It remembers user preferences, adapts to individual needs, and continuously improves over time. Key features include enhancing future conversations by building smarter AI that learns from every interaction, reducing LLM costs by up to 80% through intelligent data filtering, delivering more accurate and personalized AI outputs by leveraging historical context, and offering easy integration compatible with platforms like OpenAI and Claude. ...
    Downloads: 0 This Week
    Last Update:
    See Project
  • MongoDB Atlas runs apps anywhere Icon
    MongoDB Atlas runs apps anywhere

    Deploy in 115+ regions with the modern database for every enterprise.

    MongoDB Atlas gives you the freedom to build and run modern applications anywhere—across AWS, Azure, and Google Cloud. With global availability in over 115 regions, Atlas lets you deploy close to your users, meet compliance needs, and scale with confidence across any geography.
    Start Free
  • 10
    how-to-optim-algorithm-in-cuda

    how-to-optim-algorithm-in-cuda

    How to optimize some algorithm in cuda

    ...The project combines technical notes, code examples, and practical experiments that demonstrate how common computational kernels can be optimized to improve speed and memory efficiency. Instead of presenting only theoretical explanations, the repository includes hand-written CUDA implementations of fundamental operations such as reductions, element-wise computations, softmax, and attention mechanisms. These examples show how different optimization techniques influence performance on modern GPU hardware and allow readers to experiment with real implementations. ...
    Downloads: 9 This Week
    Last Update:
    See Project
  • 11
    HunyuanVideo-Avatar

    HunyuanVideo-Avatar

    Tencent Hunyuan Multimodal diffusion transformer (MM-DiT) model

    HunyuanVideo-Avatar is a multimodal diffusion transformer (MM-DiT) model by Tencent Hunyuan for animating static avatar images into dynamic, emotion-controllable, and multi-character dialogue videos, conditioned on audio. It addresses challenges of motion realism, identity consistency, and emotional alignment. Innovations include a character image injection module, an Audio Emotion Module for transferring emotion cues, and a Face-Aware Audio Adapter to isolate audio effects on faces,...
    Downloads: 12 This Week
    Last Update:
    See Project
  • 12
    TurboQuant PyTorch

    TurboQuant PyTorch

    From-scratch PyTorch implementation of Google's TurboQuant

    TurboQuant PyTorch is a specialized deep learning optimization framework designed to accelerate neural network inference and training through advanced quantization techniques within the PyTorch ecosystem. The project focuses on reducing the computational and memory footprint of models by converting floating-point representations into lower-precision formats while preserving performance. It provides tools for experimenting with different quantization strategies, enabling developers to balance accuracy and efficiency depending on their application. The framework integrates directly with PyTorch workflows, making it accessible for researchers and engineers already familiar with the ecosystem. ...
    Downloads: 3 This Week
    Last Update:
    See Project
  • 13
    SDGym

    SDGym

    Benchmarking synthetic data generation methods

    ...Choose from any of the SDV synthesizers and baselines. Or write your own custom machine learning model. In addition to performance and memory usage, you can also measure synthetic data quality and privacy through a variety of metrics. Install SDGym using pip or conda. We recommend using a virtual environment to avoid conflicts with other software on your device.
    Downloads: 1 This Week
    Last Update:
    See Project
  • 14
    spaCy models

    spaCy models

    Models for the spaCy Natural Language Processing (NLP) library

    ...It's easy to install, and its API is simple and productive. spaCy excels at large-scale information extraction tasks. It's written from the ground up in carefully memory-managed Cython. If your application needs to process entire web dumps, spaCy is the library you want to be using. Since its release in 2015, spaCy has become an industry standard with a huge ecosystem. Choose from a variety of plugins, integrate with your machine learning stack and build custom components and workflows.
    Downloads: 8 This Week
    Last Update:
    See Project
  • 15
    mergekit

    mergekit

    Tools for merging pretrained large language models

    ...This approach allows researchers to combine specialized models into a more versatile system capable of performing multiple tasks. mergekit implements a variety of merging algorithms and strategies that control how model parameters are blended together during the merging process. The library is designed to operate efficiently even in environments with limited hardware resources by using memory-efficient processing methods that can run entirely on CPUs. It also provides configuration-driven workflows that allow users to experiment with different merging strategies without modifying source code.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 16
    Continuous Claude v3

    Continuous Claude v3

    Context management for Claude Code. Hooks maintain state via ledgers

    Continuous Claude v3 is a persistent, multi-agent development environment built around the Claude Code CLI that aims to overcome the limitations of standard LLM context windows. Rather than relying on a single session’s context, Continuous Claude uses mechanisms like ledgers, YAML handoffs, and a memory system to preserve and recall state across multiple sessions, ensuring that learned insights and plans are not lost when context compaction occurs. The project orchestrates many specialized agents and skills—109 skills and 32 agents—so that complex coding tasks can be broken down, analyzed, and executed collaboratively by different components. ...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 17
    Learn Claude Code

    Learn Claude Code

    Bash is all you need, write a claude code with only 16 line code

    ...It emphasizes a hands-on learning path where each version (from v0 to v4) adds conceptual building blocks like the core agent loop, todo planning, task decomposition, and domain knowledge skills, illuminating the patterns behind what makes a true AI agent tick. The goal is to demystify agent architectures like Claude Code by having learners build simplified versions themselves and observe how tools, memory management, planning constraints, and context isolation contribute to reliable agent behavior. Along the way, the project teaches fundamentals such as how to let models call external tools, maintain clean memory for long tasks, and inject domain expertise without retraining the model.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 18
    TensorRT LLM

    TensorRT LLM

    TensorRT LLM provides users with an easy-to-use Python API

    ...It provides a Python-based API built on top of PyTorch that allows developers to define, customize, and deploy LLMs efficiently across a variety of hardware configurations, from single GPUs to large multi-node clusters. The library focuses on maximizing throughput and minimizing latency through advanced techniques such as quantization, custom attention kernels, and optimized memory management strategies. It includes support for cutting-edge inference methods like speculative decoding and inflight batching, enabling real-time and large-scale AI applications. TensorRT-LLM integrates seamlessly with NVIDIA’s broader inference ecosystem, including Triton Inference Server and distributed deployment frameworks, making it suitable for production environments.
    Downloads: 7 This Week
    Last Update:
    See Project
  • 19
    LLM Vision

    LLM Vision

    Visual intelligence for your home.

    LLM Vision is an open-source integration for Home Assistant that adds multimodal large language model capabilities to smart home environments. The project enables Home Assistant to analyze images, video files, and live camera feeds using vision-capable AI models. Instead of relying only on traditional object detection pipelines, it allows users to send prompts about visual content and receive contextual descriptions or answers about what is happening in camera footage. The system can process...
    Downloads: 7 This Week
    Last Update:
    See Project
  • 20
    DeepSeed

    DeepSeed

    Deep learning optimization library making distributed training easy

    DeepSpeed is a deep learning optimization library that makes distributed training easy, efficient, and effective. DeepSpeed delivers extreme-scale model training for everyone, from data scientists training on massive supercomputers to those training on low-end clusters or even on a single GPU. Using current generation of GPU clusters with hundreds of devices, 3D parallelism of DeepSpeed can efficiently train deep learning models with trillions of parameters. With just a single GPU,...
    Downloads: 7 This Week
    Last Update:
    See Project
  • 21
    ChatRWKV

    ChatRWKV

    ChatRWKV is like ChatGPT but powered by RWKV

    ...ChatRWKV is useful for developers who want to test RWKV as a practical conversational model rather than only as a research architecture. Its main value is showing how RWKV can be applied to chatbot-style interfaces while preserving the speed and memory benefits of its model design.
    Downloads: 3 This Week
    Last Update:
    See Project
  • 22
    Portia SDK Python

    Portia SDK Python

    Portia Labs Python SDK for building agentic workflows

    ...Designed for production environments, the SDK integrates with local or cloud LLMs (e.g. OpenAI, Anthropic, Mistral, Gemini) and supports extensive tool registries, session handling, and memory management.
    Downloads: 4 This Week
    Last Update:
    See Project
  • 23
    DeepSeek-V3.2-Exp

    DeepSeek-V3.2-Exp

    An experimental version of DeepSeek model

    DeepSeek-V3.2-Exp is an experimental release of the DeepSeek model family, intended as a stepping stone toward the next generation architecture. The key innovation in this version is DeepSeek Sparse Attention (DSA), a sparse attention mechanism that aims to optimize training and inference efficiency in long-context settings without degrading output quality. According to the authors, they aligned the training setup of V3.2-Exp with V3.1-Terminus so that benchmark results remain largely...
    Downloads: 10 This Week
    Last Update:
    See Project
  • 24
    LuxTTS

    LuxTTS

    A high-quality rapid TTS voice cloning model

    ...Intended for developers, hobbyists, and creators, the repository includes installation instructions, usage examples, and Python APIs that make it feasible to integrate the model in local workflows, web demos, or production systems. Its design emphasizes efficiency and practicality, fitting within modest GPU memory footprints.
    Downloads: 7 This Week
    Last Update:
    See Project
  • 25
    LangGraph

    LangGraph

    Build resilient language agents as graphs

    ...As a very low-level framework, it provides fine-grained control over both the flow and state of your application, crucial for creating reliable agents. Additionally, LangGraph includes built-in persistence, enabling advanced human-in-the-loop and memory features.
    Downloads: 7 This Week
    Last Update:
    See Project
Auth0 Logo