Page 7 | model-builder free download

LLM CLI

Access large language models from the command-line

A CLI utility and Python library for interacting with Large Language Models, both via remote APIs and models that can be installed and run on your own machine.

Downloads: 0 This Week

Last Update: 2026-04-24

SageAttention

[NeurIPS2025 Spotlight] Quantized Attention

...Since attention operations are often the most computationally expensive component of modern AI models, SageAttention introduces quantization techniques that significantly reduce computational overhead while preserving model accuracy. The system achieves this by using low-precision numerical formats such as INT4, FP8, or INT8 to represent key matrices within the attention computation. These optimizations allow models to perform matrix operations faster and consume less memory during inference. SageAttention is designed to function as a plug-and-play replacement for standard attention implementations, enabling developers to accelerate existing models without modifying their architecture.
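As a rough illustration of low-precision attention scores, the sketch below quantizes a query vector and a key vector to INT8 with one symmetric scale each, takes the dot product in integers, and rescales the result. It is a minimal pure-Python sketch under those assumptions: the helper names `quantize_int8` and `int8_dot` are hypothetical, and it does not reproduce SageAttention's actual kernels or API.

```python
def quantize_int8(values):
    """Symmetric per-vector quantization: map floats to integers in [-127, 127]."""
    max_abs = max(abs(v) for v in values)
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    quantized = [int(round(v / scale)) for v in values]
    return quantized, scale


def int8_dot(query, key):
    """Approximate query . key using integer arithmetic plus one rescale."""
    q_int, q_scale = quantize_int8(query)
    k_int, k_scale = quantize_int8(key)
    # The accumulation uses only integers; the two scales restore the magnitude.
    acc = sum(a * b for a, b in zip(q_int, k_int))
    return acc * q_scale * k_scale


# Hypothetical example vectors, chosen only for illustration.
query = [0.12, -0.50, 0.33, 0.90]
key = [0.40, 0.25, -0.70, 0.10]
exact = sum(a * b for a, b in zip(query, key))
approx = int8_dot(query, key)
print(f"exact={exact:.4f} approx={approx:.4f}")
```

The integer result tracks the floating-point one closely because each value is rounded to within half a quantization step; production kernels typically use finer-grained scales (per block or per channel) to keep that error small on real activations.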

Downloads: 2 This Week

Last Update: 2026-03-08

LangBot

Production-grade platform for building agentic IM bots

LangBot is an open source platform designed to build and deploy AI-powered chatbots across multiple instant messaging ecosystems. The system allows developers to integrate large language models into messaging platforms so that bots can perform tasks, answer questions, and automate workflows directly within everyday communication tools. It supports numerous messaging services including Discord, Slack, Telegram, WeChat, and other enterprise communication systems, making it a flexible solution...

Downloads: 2 This Week

Last Update: 4 days ago

WeClone

One-stop solution for creating your digital avatar from chat history

...By processing large volumes of conversation data, WeClone can build a profile of an individual’s writing tone, vocabulary preferences, and conversational tendencies. Developers can use the resulting model to create chatbots that simulate a specific user’s communication patterns for testing or research purposes. Overall, WeClone explores the idea of digital identity replication through machine learning and conversational modeling.

Downloads: 2 This Week

Last Update: 2026-03-04

HumanEval

Code for the paper "Evaluating Large Language Models Trained on Code"

...It consists of hand-written programming problems with unit tests, designed to assess functional correctness rather than superficial metrics like text similarity. Each task includes a natural language prompt and a function signature, requiring the model to generate an implementation that passes all provided tests. The benchmark has become a standard for evaluating code generation models, including those in the Codex and GPT families. Researchers can use the dataset to run reproducible comparisons across models and track improvements in functional code synthesis. By focusing on correctness through execution, human-eval provides a rigorous and practical way to evaluate programming capabilities in AI systems.
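The execution-based check described above can be sketched roughly as follows. This is a minimal illustration, not the repository's harness: the toy task, its field names, the `passes_tests` helper, and the sample completions are all hypothetical, and a real harness would run untrusted model output in a sandbox with timeouts rather than calling `exec` directly.

```python
# Hypothetical toy task in the spirit of the benchmark: a prompt with a
# function signature, plus unit tests that call the generated function.
TASK = {
    "prompt": 'def add(a, b):\n    """Return the sum of a and b."""\n',
    "test": (
        "def check(candidate):\n"
        "    assert candidate(1, 2) == 3\n"
        "    assert candidate(-1, 1) == 0\n"
    ),
    "entry_point": "add",
}


def passes_tests(task, completion):
    """Return True if prompt + completion passes every test (illustrative only)."""
    namespace = {}
    try:
        # Build the full program: signature/docstring, model-written body, tests.
        exec(task["prompt"] + completion + "\n" + task["test"], namespace)
        namespace["check"](namespace[task["entry_point"]])
        return True
    except Exception:
        # Any failed assertion, runtime error, or syntax error counts as a failure.
        return False


good = "    return a + b\n"
bad = "    return a - b\n"
print(passes_tests(TASK, good), passes_tests(TASK, bad))
```

Scoring by execution means a completion that looks nothing like a reference solution still passes if it behaves correctly, while one that is textually similar but wrong fails.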

Downloads: 2 This Week

Last Update: 3 days ago

Qwen3-Omni

Qwen3-omni is a natively end-to-end, omni-modal LLM

Qwen3-Omni is a natively end-to-end multilingual omni-modal foundation model that processes text, images, audio, and video and delivers real-time streaming responses in text and natural speech. It uses a Thinker-Talker architecture with a Mixture-of-Experts (MoE) design, early text-first pretraining, and mixed multimodal training to support strong performance across all modalities without sacrificing text or image quality.

Downloads: 0 This Week

Last Update: 2026-04-23

OpenLLM

Operating LLMs in production

...With OpenLLM, you can run inference with any open-source large language model, deploy to the cloud or on-premises, and build powerful AI apps. It has built-in support for a wide range of open-source LLMs and model runtimes, including Llama 2, StableLM, Falcon, Dolly, Flan-T5, ChatGLM, StarCoder, and more. Serve LLMs over a RESTful API or gRPC with one command, and query via the WebUI, CLI, our Python/JavaScript client, or any HTTP client.

Downloads: 1 This Week

Last Update: 2025-04-21

Megatron

Ongoing research training transformer models at scale

Megatron is a large, powerful transformer developed by the Applied Deep Learning Research team at NVIDIA. This repository is for ongoing research on training large transformer language models at scale. We developed efficient model-parallel (tensor, sequence, and pipeline) and multi-node pre-training of transformer-based models such as GPT, BERT, and T5 using mixed precision. Megatron is also used in NeMo Megatron, a framework to help enterprises overcome the challenges of building and training sophisticated natural language processing models with billions and trillions of parameters. ...

Downloads: 0 This Week

Last Update: 2026-05-25

hCaptcha Challenger

Gracefully face hCaptcha challenge with multimodal llms

...The framework includes support for multiple types of captcha challenges such as object selection, drag-and-drop puzzles, and image labeling tasks. It implements an agent-style workflow where the system interprets the challenge prompt, selects the appropriate vision model, and generates the required interaction automatically.

Downloads: 3 This Week

Last Update: 2026-03-06

OpenPlanter

Language-model investigation agent with a terminal UI

OpenPlanter is an open-source Python project focused on building an intelligent automated planting or gardening system powered by software control and data processing. The repository is designed to help developers and hobbyists create programmable plant management workflows that can monitor, schedule, and optimize growing conditions. It emphasizes automation and extensibility, allowing integration with sensors, environmental data, and control logic for smart cultivation setups. The system is...

Downloads: 1 This Week

Last Update: 2026-03-06

Qwen2.5-Omni

Capable of understanding text, audio, vision, video

Qwen2.5-Omni is an end-to-end multimodal flagship model in the Qwen series by Alibaba Cloud, designed to process multiple modalities (text, images, audio, video) and generate responses as both text and natural speech in real-time streaming. It uses a “Thinker-Talker” architecture and introduces innovations for aligning modalities over time (for example, synchronizing video and audio), robust speech generation, and low-VRAM/quantized versions that make the model more accessible.

Downloads: 0 This Week

Last Update: 2025-09-23

Shell-AI

LangChain powered shell command generator and runner CLI

Shell-AI is an open-source command-line interface utility that allows users to generate and execute shell commands using natural language prompts. Instead of requiring users to remember complex command syntax, the tool lets them describe their intent in plain English and automatically suggests commands that accomplish the task. The system is powered by large language models and integrates with frameworks such as LangChain to interpret user requests and translate them into executable shell...

Downloads: 2 This Week

Last Update: 2026-03-09

MemoryOS

MemoryOS is designed to provide a memory operating system

MemoryOS is an open-source framework designed to provide a structured memory management system for AI agents and large language model applications. The project addresses one of the major limitations of modern language models: their inability to maintain long-term context beyond the limits of their prompt window. MemoryOS introduces a hierarchical memory architecture inspired by operating system memory management principles, allowing agents to store, update, retrieve, and generate information from multiple layers of memory. ...

Downloads: 2 This Week

Last Update: 2026-03-09

LongWriter

Unleashing 10,000+ Word Generation from Long Context LLMs

...LongWriter addresses this challenge by introducing a specialized dataset and training approach that encourages models to produce longer responses. The system uses an agent-based pipeline called AgentWrite that decomposes large writing tasks into smaller subtasks, allowing the model to produce long documents section by section. Researchers also created the LongWriter-6k dataset containing thousands of examples with outputs ranging from a few thousand to tens of thousands of words.

Downloads: 2 This Week

Last Update: 2026-03-06

Lagent

A lightweight framework for building LLM-based agents

Lagent is a lightweight open-source framework designed to help developers build autonomous agents powered by large language models. The framework provides tools and abstractions that allow language models to interact with external tools, execute tasks, and perform multi-step reasoning processes. Instead of using LLMs only for text generation, Lagent enables developers to transform models into agents capable of performing actions such as retrieving data, executing code, or interacting with...

Downloads: 2 This Week

Last Update: 2026-05-13

In-The-Wild Jailbreak Prompts on LLMs

A dataset consisting of 15,140 ChatGPT prompts from Reddit

In-The-Wild Jailbreak Prompts on LLMs is an open-source research repository that provides datasets and analytical tools for studying jailbreak prompts used to bypass safety restrictions in large language models. The project is part of a research effort to understand how users attempt to circumvent alignment and safety mechanisms built into modern AI systems. The repository includes a large collection of prompts gathered from real-world platforms such as Reddit, Discord, prompt-sharing...

Downloads: 2 This Week

Last Update: 2026-03-05

Gemini Fullstack LangGraph Quickstart

Get started w/ building Fullstack Agents using Gemini 2.5 & LangGraph

...The backend agent dynamically generates search queries based on user input, retrieves information via the Google Search API, and performs reflective reasoning to identify knowledge gaps. It then iteratively refines its search until it produces a comprehensive, well-cited answer synthesized by the Gemini model. The repository provides both a browser-based chat interface and a command-line script (cli_research.py) for executing research queries directly. For production deployment, the backend integrates with Redis and PostgreSQL to manage persistent memory, streaming outputs, and background task coordination.

Downloads: 3 This Week

Last Update: 3 days ago

RAPTOR

The official implementation of RAPTOR

RAPTOR is a retrieval architecture designed to improve retrieval-augmented generation systems by organizing documents into hierarchical structures that enable more effective context retrieval. Traditional RAG systems typically retrieve small text chunks independently, which can limit a model’s ability to understand broader document context. RAPTOR addresses this limitation by recursively embedding, clustering, and summarizing documents to create a tree-structured hierarchy of information....

Downloads: 1 This Week

Last Update: 2026-03-06

nano-graphrag

A simple, easy-to-hack GraphRAG implementation

nano-graphrag is a lightweight implementation of the GraphRAG approach designed to simplify experimentation with graph-based retrieval-augmented generation systems. GraphRAG expands traditional RAG pipelines by constructing knowledge graphs from documents and using relationships between entities to improve the quality and reasoning of AI responses. The nano-GraphRAG project focuses on reducing complexity by providing a compact and readable codebase that preserves the core functionality of...

Downloads: 1 This Week

Last Update: 2026-03-05

Context Engineering

A frontier, first-principles handbook

...With extensive materials drawn from research, surveys, and visual explanations, the project acts as both a learning resource and a reference for practitioners looking to improve model behavior by engineering richer inputs.

Downloads: 1 This Week

Last Update: 2026-02-27

Xorbits Inference

Replace OpenAI GPT with another LLM in your app

Replace OpenAI GPT with another LLM in your app by changing a single line of code. Xinference gives you the freedom to use any LLM you need. With Xinference, you're empowered to run inference with any open-source language models, speech recognition models, and multimodal models, whether in the cloud, on-premises, or even on your laptop. Xorbits Inference (Xinference) is a powerful and versatile library designed to serve language, speech recognition, and multimodal models. With Xorbits...

Downloads: 2 This Week

Last Update: 2026-06-05

KVCache-Factory

Unified KV Cache Compression Methods for Auto-Regressive Models

...In large language models, the key-value cache stores intermediate attention states that enable efficient token generation during inference, but these caches can consume large amounts of GPU memory when handling long contexts. KVCache-Factory provides a platform for implementing and evaluating multiple compression strategies that reduce memory usage while preserving model performance. The framework integrates several state-of-the-art methods such as PyramidKV, SnapKV, H2O, and StreamingLLM, allowing researchers to compare and experiment with different approaches within the same environment. It also supports advanced inference configurations such as Flash Attention v2 and multi-GPU inference setups for very large models.

Downloads: 1 This Week

Last Update: 2026-03-09

ControlFlow

Take control of your AI agents

ControlFlow is an open-source Python framework developed to help engineers design and orchestrate agentic workflows powered by large language models. The framework provides a structured approach for building AI systems by breaking complex tasks into smaller units called tasks that can be assigned to specialized AI agents. Developers can combine these tasks into flows that define how work is executed, enabling the creation of multi-step reasoning pipelines and collaborative agent systems....

Downloads: 1 This Week

Last Update: 2026-03-09

AgentBench

A Comprehensive Benchmark to Evaluate LLMs as Agents (ICLR'24)

AgentBench is an open-source benchmark designed to evaluate the capabilities of large language models when used as autonomous agents. Unlike traditional language model benchmarks that focus on static text tasks, AgentBench measures how models perform in interactive environments that require planning, reasoning, and decision-making. The benchmark includes multiple environments that simulate realistic scenarios such as web interaction, database querying, and problem-solving tasks. These environments require agents to interpret instructions, take actions, and adapt their strategies based on feedback from the environment. ...

Downloads: 1 This Week

Last Update: 2026-03-05

FastDeploy

High-performance Inference and Deployment Toolkit for LLMs and VLMs

FastDeploy is an open-source inference and deployment toolkit designed to simplify the process of running and serving deep learning models across a wide range of hardware platforms. Developed within the PaddlePaddle ecosystem, the toolkit focuses on providing high-performance deployment capabilities for modern AI models including large language models and vision-language systems. The platform enables developers to deploy trained models quickly using optimized inference pipelines that support...

Downloads: 1 This Week

Last Update: 2026-04-08

Search Results for "model-builder" - Page 7

Showing 266 open source projects for "model-builder"

