Page 6 | memory free download

Showing 248 open source projects for "memory"

View related business solutions

Artificial Intelligence Python Clear Filters & Widen Search

$300 Free Credits for Your Google Cloud Projects
Start building on Google Cloud with $300 in free credits. No commitment, no credit card required until you're ready to scale.

Launch your next project with $300 in free Google Cloud credits—no strings attached. Test, build, and deploy without risk. Use your credits across the entire Google Cloud platform to find what works best for your needs. After your credits are used, continue with always-free tier services. Only pay when you're ready to scale. Sign up in minutes and start exploring.

Start Free Trial
Build Agents and Models on One Platform
Everything you need to build production-ready agents and models. Access 200+ Google and third-party AI models and tools.

Gemini Enterprise Agent Platform is Google Cloud's comprehensive platform for developers to build, scale, govern, and optimize agents and models. Choose from Google's most advanced models and third-party models like Anthropic's Claude Model Family.

Try It Free
1

whisper-timestamped

Multilingual Automatic Speech Recognition with word-level timestamps

Multilingual Automatic Speech Recognition with word-level timestamps and confidence. Whisper is a set of multi-lingual, robust speech recognition models trained by OpenAI that achieve state-of-the-art results in many languages. Whisper models were trained to predict approximate timestamps on speech segments (most of the time with 1-second accuracy), but they cannot originally predict word timestamps. This repository proposes an implementation to predict word timestamps and provide a more...

Downloads: 5 This Week

Last Update: 2025-09-09
See Project
2

LingBot-World

Advancing Open-source World Models

LingBot-World is an open-source, high-fidelity world simulator designed to advance the state of world models through video generation. Built on top of Wan2.2, it enables realistic, dynamic environment simulation across diverse styles, including real-world, scientific, and stylized domains. LingBot-World supports long-term temporal consistency, maintaining coherent scenes and interactions over minute-level horizons. With real-time interactivity and sub-second latency at 16 FPS, it is...

Downloads: 8 This Week

Last Update: 2026-05-22
See Project
3

LMCache

Supercharge Your LLM with the Fastest KV Cache Layer

LMCache is an extension layer for LLM serving engines that accelerates inference, especially with long contexts, by storing and reusing key-value (KV) attention caches across requests. Instead of rebuilding KV states for repeated or shared text segments, LMCache persists and retrieves them from multiple tiers—GPU memory, CPU DRAM, and local disk—then injects them into subsequent requests to reduce TTFT and increase throughput. Its design supports reuse beyond strict prefix matching and enables sharing across serving instances, improving efficiency under real multi-tenant traffic. The broader project includes examples, tests, a server component, and public posts describing cross-engine sharing and inter-GPU KV transfers. ...

Downloads: 6 This Week

Last Update: 2 days ago
See Project
4

ChatGLM2-6B

ChatGLM2-6B: An Open Bilingual Chat LLM

...It upgrades the base model with GLM’s hybrid pretraining objective, 1.4 TB bilingual data, and preference alignment—delivering big gains on MMLU, CEval, GSM8K, and BBH. The context window extends up to 32K (FlashAttention), and Multi-Query Attention improves speed and memory use. The repo includes Python APIs, CLI & web demos, OpenAI-style/FASTAPI servers, and quantized checkpoints for lightweight local deployment on GPUs or CPU/MPS.

Downloads: 4 This Week

Last Update: 22 hours ago
See Project
Build Securely on AWS with Proven Frameworks
Lay a foundation for success with Tested Reference Architectures developed by Fortinet’s experts. Learn more in this white paper.

Moving to the cloud brings new challenges. How can you manage a larger attack surface while ensuring great network performance? Turn to Fortinet’s Tested Reference Architectures, blueprints for designing and securing cloud environments built by cybersecurity experts. Learn more and explore use cases in this white paper.

Download Now
5

SpeechRecognition

Speech recognition module for Python

...This is required to use the library. PyAudio is required if and only if you want to use microphone input (Microphone). PyAudio version 0.2.11+ is required, as earlier versions have known memory management bugs when recording from microphones in certain situations. To hack on this library, first make sure you have all the requirements listed in the "Requirements" section.

Downloads: 5 This Week

Last Update: 2026-04-24
See Project
6

Burr

Build applications that make decisions. Chatbots, agents, simulations

...Burr works well for any application that uses LLMs and can integrate with any of your favorite frameworks. Burr includes a UI that can track/monitor/trace your system in real-time, along with pluggable persisters (e.g. for memory) to save & load application state.

Downloads: 1 This Week

Last Update: 2026-05-10
See Project
7

gpt-oss

gpt-oss-120b and gpt-oss-20b are two open-weight language models

...The series includes two main models: gpt-oss-120b, a 117-billion parameter model optimized for general-purpose, high-reasoning tasks that can run on a single H100 GPU, and gpt-oss-20b, a lighter 21-billion parameter model ideal for low-latency or specialized applications on smaller hardware. Both models use a native MXFP4 quantization for efficient memory use and support OpenAI’s Harmony response format, enabling transparent full chain-of-thought reasoning and advanced tool integrations such as function calling, browsing, and Python code execution. The repository provides multiple reference implementations—including PyTorch, Triton, and Metal—for educational and experimental use, as well as example clients and tools like a terminal chat app and a Responses API server.

1 Review

Downloads: 10 This Week

Last Update: 2026-01-13
See Project
8

AIMET

AIMET is a library that provides advanced quantization and compression

...For example, models that we’ve run on the Qualcomm® Hexagon™ DSP rather than on the Qualcomm® Kryo™ CPU have resulted in a 5x to 15x speedup. Plus, an 8-bit model also has a 4x smaller memory footprint relative to a 32-bit model. However, often when quantizing a machine learning model (e.g., from 32-bit floating point to an 8-bit fixed point value), the model accuracy is sacrificed.

Downloads: 10 This Week

Last Update: 2026-06-04
See Project
9

CogVideo

Text and image to video generation: CogVideoX and CogVideo

CogVideo is an open-source family of advanced video generation models that can create videos from text, images, or existing video inputs. Built on large-scale Transformer and diffusion architectures, it enables multimodal generation across text-to-video, image-to-video, and video continuation tasks. The latest CogVideoX models offer higher resolution outputs, longer video durations, and improved controllability through prompt engineering. The project includes tools for inference,...

Downloads: 6 This Week

Last Update: 2025-10-04
See Project
Our Free Plans just got better! | Auth0
With up to 25k MAUs and unlimited Okta connections, our Free Plan lets you focus on what you do best—building great apps.

You asked, we delivered! Auth0 is excited to expand our Free and Paid plans to include more options so you can focus on building, deploying, and scaling applications without having to worry about your security. Auth0 now, thank yourself later.

Try free now
10

Qwen

The official repo of Qwen chat & pretrained large language model

Qwen is a series of large language models developed by Alibaba Cloud, consisting of various pretrained versions like Qwen-1.8B, Qwen-7B, Qwen-14B, and Qwen-72B. These models, which range from smaller to larger configurations, are designed for a wide range of natural language processing tasks. They are openly available for research and commercial use, with Qwen's code and model weights shared on GitHub. Qwen's capabilities include text generation, comprehension, and conversation, making it a...

1 Review

Downloads: 6 This Week

Last Update: 2026-03-05
See Project
11

Swarms

Enterprise multi-agent orchestration framework for scalable AI apps

...It supports integration with multiple model providers and existing ecosystems, allowing developers to combine different AI tools and frameworks within a unified system. Swarms also includes mechanisms for agent lifecycle management, memory handling, and dynamic composition, making it adaptable to evolving workloads. Additionally, it focuses on developer productivity through APIs, CLI tools, and templates that simplify building and deploying agent-based applications.

Downloads: 3 This Week

Last Update: 2026-03-17
See Project
12

OpenViking

Context database designed specifically for AI Agents

...It’s primarily designed to serve as a high-performance, scalable backend for storing app context, embeddings, conversational histories, and other textual artifacts that need rapid lookup and semantic search, which makes it especially useful for systems like chatbots or memory-augmented agents. The project is implemented with performance in mind, often leveraging optimized data structures that balance fast reads and writes with minimal resource consumption. Developers can integrate OpenViking into modern AI stacks to unify context storage across services, enabling consistent session history, personalized responses, and richer search experiences.

Downloads: 3 This Week

Last Update: 2026-06-05
See Project
13

cognee

Deterministic LLMs Outputs for AI Applications and AI Agents

Cognee implements scalable, modular data pipelines that allow for creating the LLM-enriched data layer using graph and vector stores. Cognee acts a semantic memory layer, unveiling hidden connections within your data and infusing it with your company's language and principles. This self-optimizing process ensures ultra-relevant, personalized, and contextually aware LLM retrievals. Any kind of data works; unstructured text or raw media files, PDFs, tables, presentations, JSON files, and so many more. ...

Downloads: 3 This Week

Last Update: 2026-05-30
See Project
14

Core ML Tools

Core ML tools contain supporting tools for Core ML model conversion

...Your app uses Core ML APIs and user data to make predictions, and to fine-tune models, all on the user’s device. Core ML optimizes on-device performance by leveraging the CPU, GPU, and Neural Engine while minimizing its memory footprint and power consumption. Running a model strictly on the user’s device removes any need for a network connection, which helps keep the user’s data private and your app responsive.

Downloads: 3 This Week

Last Update: 2025-11-10
See Project
15

Torch Pruning

DepGraph: Towards Any Structural Pruning

Torch-Pruning is an open-source toolkit designed to optimize deep neural networks by performing structural pruning directly within PyTorch models. The library focuses on reducing the size and computational cost of neural networks by removing redundant parameters and channels while maintaining model performance. It introduces a graph-based algorithm called DepGraph that automatically identifies dependencies between layers, allowing parameters to be pruned safely across complex architectures....

Downloads: 4 This Week

Last Update: 2026-03-05
See Project
16

nanobot

🐈 nanobot: The Ultra-Lightweight Clawdbot / OpenClaw

nanobot is an ultra-lightweight personal AI assistant designed to deliver powerful agent capabilities without unnecessary complexity. Built in just ~4,000 lines of clean, readable code, it offers a minimalist alternative to heavyweight agent frameworks while retaining core intelligence and extensibility. nanobot is optimized for speed and efficiency, enabling fast startup times and low resource usage across environments. Its research-ready architecture makes it easy for developers to...

Downloads: 5 This Week

Last Update: 2026-06-01
See Project
17

Deep Lake

Data Lake for Deep Learning. Build, manage, and query datasets

...Deeplake automatically decompresses them to raw data only when needed, e.g., when training a model. Treat your cloud datasets as if they are a collection of NumPy arrays in your system's memory. Slice them, index them, or iterate through them.

Downloads: 5 This Week

Last Update: 2026-02-12
See Project
18

AGiXT

AGiXT is a dynamic AI Automation Platform

AGiXT is a dynamic Artificial Intelligence Automation Platform engineered to orchestrate efficient AI instruction management and task execution across a multitude of providers. Our solution infuses adaptive memory handling with a broad spectrum of commands to enhance AI's understanding and responsiveness, leading to improved task completion. The platform's smart features, like Smart Instruct and Smart Chat, seamlessly integrate web search, planning strategies, and conversation continuity, transforming the interaction between users and AI. ...

Downloads: 1 This Week

Last Update: 2026-04-08
See Project
19

Agentic Context Engine

Make your agents learn from experience

...In this workflow, one component generates solutions, another reflects on outcomes, and a third curates useful knowledge so it can be reused in future interactions. This architecture allows agents to gradually build persistent operational memory without requiring additional training datasets or model retraining.

Downloads: 3 This Week

Last Update: 2026-05-07
See Project
20

Chitu

High-performance inference framework for large language models

...Chitu is designed to scale from small single-machine deployments to large distributed clusters that handle high volumes of concurrent inference requests. The system also includes performance optimizations for large models, including support for quantized formats and efficient computation operators that reduce memory usage and latency. Its architecture aims to support enterprise adoption by ensuring stable long-term operation under production workloads.

Downloads: 3 This Week

Last Update: 2026-06-04
See Project
21

Tiktoken

tiktoken is a fast BPE tokeniser for use with OpenAI's models

...It handles encoding and decoding text to token IDs efficiently, with minimal overhead. Because tokenization is a fundamental step in preparing text for models, tiktoken is optimized for speed, memory, and correctness in model contexts (e.g. matching OpenAI’s internal tokenization). The repo supports multiple encodings (e.g. “cl100k_base”) and lets users switch encoding names to match different model contexts. It also offers extension mechanisms so that custom encodings can be registered. Internally, it includes the core tokenizer logic (often implemented in Rust or efficient lower-level code), APIs for encoding, decoding, and counting tokens, and binding layers to Python (and sometimes other languages) for easy use.

Downloads: 3 This Week

Last Update: 2026-05-15
See Project
22

PySpur

Visual tool for building, testing, and deploying AI agent workflows

PySpur is a visual development environment designed to help AI engineers build, test, and iterate on agent-based workflows more efficiently. It provides a structured playground where users can define test cases, construct agents either through Python code or a graphical interface, and continuously refine their behavior. It addresses common challenges in AI agent development such as prompt tuning difficulties and lack of visibility into workflow execution. By offering a visual representation...

Downloads: 2 This Week

Last Update: 2026-03-17
See Project
23

SageAttention

NeurIPS2025 Spotlight] Quantized Attention

...The system achieves this by using low-precision numerical formats such as INT4, FP8, or INT8 to represent key matrices within the attention computation. These optimizations allow models to perform matrix operations faster and consume less memory during inference. SageAttention is designed to function as a plug-and-play replacement for standard attention implementations, enabling developers to accelerate existing models without modifying their architecture.

Downloads: 2 This Week

Last Update: 2026-03-08
See Project
24

Pandas Profiling

Create HTML profiling reports from pandas DataFrame objects

...File sizes, creation dates, dimensions, indication of truncated images and existance of EXIF metadata. Mostly global details about the dataset (number of records, number of variables, overall missigness and duplicates, memory footprint). Comprehensive and automatic list of potential data quality issues (high correlation, skewness, uniformity, zeros, missing values, constant values, between others).

Downloads: 4 This Week

Last Update: 2026-04-22
See Project
25

OpenRecall

OpenRecall is a fully open-source, privacy-first alternative

OpenRecall is an open-source, privacy-first system designed to capture, index, and make searchable a user’s entire digital activity history, effectively acting as a personal memory layer for computing environments. It works by taking periodic screenshots of a user’s screen and applying local AI processing, including OCR and semantic analysis, to extract and structure information from both text and images. This data is then indexed into a searchable database, allowing users to retrieve past information quickly using natural language queries. ...

Downloads: 1 This Week

Last Update: 2026-03-18
See Project