Page 2 | deploy free download

Showing 140 open source projects for "deploy"



1

A.I.G

Full-stack AI Red Teaming platform

...It brings together AI infrastructure vulnerability scanning, MCP server risk analysis, and jailbreak evaluation into a unified workflow so that enterprises and individuals can identify critical security issues without relying on external services. Users can deploy it via Docker or scripts to get a modern web UI that guides them through tasks like scanning third-party frameworks for known CVEs and experimenting with prompt security against attack vectors. The tool provides both a visual interface and a comprehensive API, making integration with internal security systems or CI/CD pipelines practical for ongoing risk management.

Downloads: 3 This Week

Last Update: 4 days ago
2

ChatGLM2-6B

ChatGLM2-6B: An Open Bilingual Chat LLM

ChatGLM2-6B is the second-generation Chinese-English conversational LLM from ZhipuAI/Tsinghua. It upgrades the base model with GLM’s hybrid pretraining objective, 1.4T tokens of bilingual data, and preference alignment, delivering large gains on MMLU, C-Eval, GSM8K, and BBH. The context window extends up to 32K (FlashAttention), and Multi-Query Attention improves speed and memory use. The repo includes Python APIs, CLI & web demos, OpenAI-style/FastAPI servers, and quantized checkpoints for lightweight local...

Downloads: 3 This Week

Last Update: 22 hours ago
3

Agent Framework

Framework for building, orchestrating, and deploying AI agents

Microsoft Agent Framework is an open source framework designed to help developers build, orchestrate, and deploy AI agents and multi-agent systems. It provides a unified programming model that supports both Python and .NET implementations, allowing developers to create agent-driven applications in multiple programming environments. It includes tools and abstractions for constructing simple conversational agents as well as complex workflows where multiple agents collaborate to complete tasks. ...

Downloads: 4 This Week

Last Update: 4 days ago
4

Eigent

The Open Source Cowork Desktop to Unlock Your Exceptional Productivity

Eigent is an open-source cowork desktop application designed to help you build, manage, and deploy a custom AI workforce. It enables multiple specialized AI agents to collaborate in parallel, turning complex workflows into automated, end-to-end tasks. Built on the CAMEL-AI multi-agent framework, Eigent emphasizes productivity, flexibility, and transparent system design. You can run Eigent fully locally for maximum privacy and data control, or choose a cloud-connected experience for quick access. ...

Downloads: 4 This Week

Last Update: 2026-05-19
5

FAY

Framework for building AI-powered interactive digital humans and agents

Fay is an open source framework designed to build and deploy interactive digital humans powered by large language models. It acts as a middleware layer that connects digital character technologies with conversational AI systems and business applications. Fay supports various types of digital humans, including 2.5D and 3D avatars, and can be integrated with applications running on mobile devices, PCs, web platforms, and embedded systems.

Downloads: 5 This Week

Last Update: 12 hours ago
6

TensorRT LLM

TensorRT LLM provides users with an easy-to-use Python API

TensorRT-LLM is an open-source, high-performance inference library designed to optimize and accelerate large language model deployment on NVIDIA GPUs. It provides a Python API built on top of PyTorch that allows developers to define, customize, and deploy LLMs efficiently across a variety of hardware configurations, from single GPUs to large multi-node clusters. The library focuses on maximizing throughput and minimizing latency through techniques such as quantization, custom attention kernels, and optimized memory management. It supports inference methods like speculative decoding and in-flight batching, enabling real-time and large-scale AI applications. ...

Downloads: 2 This Week

Last Update: 2026-04-16
7

CoPaw

Your Personal AI Assistant; easy to install, deploy locally or in the cloud

CoPaw is a personal AI assistant designed to run on your own machine or in the cloud, giving you full control over memory, models, and data. Built by the AgentScope team, it connects to multiple chat platforms—including DingTalk, Feishu, QQ, Discord, iMessage, and more—through a single unified assistant. CoPaw supports both cloud-based LLM providers and fully local models such as llama.cpp, MLX, and Ollama, allowing you to operate without API keys if preferred. It includes a browser-based...

1 Review

Downloads: 8 This Week

Last Update: 5 days ago
8

Vocode

Build voice-based LLM agents. Modular + open source

Vocode is an open source library that makes it easy to build voice-based LLM apps. Using Vocode, you can build real-time streaming conversations with LLMs and deploy them to phone calls, Zoom meetings, and more. You can also build personal assistants or apps like voice-based chess. Vocode provides easy abstractions and integrations so that everything you need is in a single library.

Downloads: 0 This Week

Last Update: 2025-02-05
9

Mistral Inference

Official inference library for Mistral models

Open and portable generative AI for devs and businesses. We release open-weight models for everyone to customize and deploy wherever they want. Our super-efficient model Mistral Nemo is available under Apache 2.0, while Mistral Large 2 is available through both a free non-commercial license and a commercial license.

Downloads: 0 This Week

Last Update: 2025-03-20
10

TFX

TFX is an end-to-end platform for deploying production ML pipelines

TensorFlow Extended (TFX) is a Google-production-scale machine learning platform based on TensorFlow. It provides a configuration framework to express ML pipelines consisting of TFX components. TFX pipelines can be orchestrated using Apache Airflow and Kubeflow Pipelines. Both the components themselves and the integrations with orchestration systems can be extended. TFX components interact with an ML Metadata backend that keeps a record of component runs, input and output artifacts, and...

Downloads: 0 This Week

Last Update: 2026-04-09
11

LitGPT

20+ high-performance LLMs with recipes to pretrain, finetune at scale

LitGPT is a collection of over 20 high-performance large language models (LLMs) accompanied by recipes to pretrain, finetune, and deploy them at scale. It provides implementations without abstractions, making it beginner-friendly while offering advanced features like flash attention and support for various precision levels. LitGPT is designed to run efficiently across multiple GPUs or TPUs, catering to both small-scale and large-scale deployments.

Downloads: 0 This Week

Last Update: 2025-12-18
12

Prompt flow

Build high-quality LLM apps

Prompt flow is a suite of development tools designed to streamline the end-to-end development cycle of LLM-based AI applications, from ideation, prototyping, testing, and evaluation to production deployment and monitoring. It makes prompt engineering much easier and enables you to build LLM apps with production quality.

Downloads: 0 This Week

Last Update: 2025-01-09
13

Chitu

High-performance inference framework for large language models

Chitu is a high-performance inference engine designed to deploy and run large language models efficiently in production environments. The framework focuses on improving efficiency, flexibility, and scalability for organizations that need to run LLM inference workloads across different hardware platforms. It supports heterogeneous computing environments, including CPUs, GPUs, and various specialized AI accelerators, allowing models to run across a wide range of infrastructure configurations. ...

Downloads: 2 This Week

Last Update: 2026-05-21
14

GPUStack

Performance-optimized AI inference on your GPUs

...The platform supports GPUs from a wide range of vendors and can run on laptops, workstations, and servers across operating systems such as macOS, Windows, and Linux. It also enables developers to deploy models from common repositories like Hugging Face and access them through APIs similar to cloud-based AI services.

Downloads: 2 This Week

Last Update: 2026-04-21
15

Cognita

Open source RAG framework for building scalable modular AI apps

Cognita is an open source framework designed to help developers build, organize, and deploy Retrieval-Augmented Generation (RAG) applications in a structured and production-ready way. It addresses the gap between quick experimentation in notebooks and the complexity of deploying scalable AI systems by introducing a modular and API-driven architecture. Cognita provides reusable components such as parsers, data loaders, embedders, retrievers, and query controllers, allowing teams to customize each stage of the RAG pipeline independently. ...

Downloads: 1 This Week

Last Update: 2 days ago
16

LangBot

Production-grade platform for building agentic IM bots

LangBot is an open source platform designed to build and deploy AI-powered chatbots across multiple instant messaging ecosystems. The system allows developers to integrate large language models into messaging platforms so that bots can perform tasks, answer questions, and automate workflows directly within everyday communication tools. It supports numerous messaging services including Discord, Slack, Telegram, WeChat, and other enterprise communication systems, making it a flexible solution for both personal projects and organizational deployments. ...

Downloads: 1 This Week

Last Update: 2026-05-12
17

agentic-stack

One brain, many harnesses. Portable .agent/ folder

agentic-stack is a framework or toolkit designed to build, orchestrate, and deploy AI agents in a structured and scalable way. It likely provides components for managing agent workflows, communication, and task execution across different systems. The project emphasizes modularity, enabling developers to assemble custom pipelines using various AI models, tools, and APIs. It may include abstractions for memory, planning, and tool usage, reflecting modern agentic AI design patterns. ...

Downloads: 0 This Week

Last Update: 2026-05-10
18

smolagents

Agents write Python code to call tools and orchestrate other agents

...We provide our definition on this page, where you’ll also find tips on when to use them or not (spoiler: you’ll often be better off without agents). smolagents is a lightweight framework for building AI agents using large language models (LLMs). It simplifies the development of AI-driven applications by providing tools to create, train, and deploy language model-based agents.

Downloads: 0 This Week

Last Update: 3 days ago
19

Triton Inference Server

The Triton Inference Server provides an optimized cloud

Triton Inference Server is open-source inference serving software that streamlines AI inferencing. Triton enables teams to deploy any AI model from multiple deep learning and machine learning frameworks, including TensorRT, TensorFlow, PyTorch, ONNX, OpenVINO, Python, RAPIDS FIL, and more. Triton supports inference across cloud, data center, edge, and embedded devices on NVIDIA GPUs, x86 and ARM CPUs, or AWS Inferentia. Triton delivers optimized performance for many query types, including real-time, batched, ensembles, and audio/video streaming. ...

Downloads: 2 This Week

Last Update: 2026-04-28
20

stt

Voice Recognition to Text Tool

...It leverages open-source speech models such as Faster-Whisper to recognize and transcribe human speech into plain text, structured JSON objects, or subtitle files with time codes, making it suitable for both personal and professional transcription tasks. The project is designed to be easy to deploy: you can run a local Python server that exposes an HTTP API for uploading audio/video files and retrieving transcriptions in different formats. It supports GPU acceleration where available, which speeds up processing on compatible hardware, and still offers reliable performance on CPUs alone.

Downloads: 0 This Week

Last Update: 2026-02-17
21

OpenMLSys-ZH

Machine Learning Systems: Design and Implementation

...The repository includes scripts or tooling to keep translation synchronized with upstream changes, versioning, and possibly translation metadata (contributors, timestamp). Users can browse or clone the translated documentation to follow along with the original content, deploy examples, or understand system internals in their preferred language.

Downloads: 0 This Week

Last Update: 2026-03-15
22

Portia SDK Python

Portia Labs Python SDK for building agentic workflows

portia‑sdk‑python is an open-source Python SDK by Portia Labs for creating reliable, stateful, authenticated multi-agent AI workflows. It supports tool-backed agents capable of real-world interactions—like web browsing, API access, and human-in-the-loop clarifications—while maintaining transparency and auditability through structured plans and execution hooks. Designed for production environments, the SDK integrates with local or cloud LLMs (e.g. OpenAI, Anthropic, Mistral, Gemini) and...

Downloads: 0 This Week

Last Update: 2025-09-09
23

Pruna AI

Pruna is a model optimization framework built for developers

Pruna is an open-source, self-hostable AI inference engine designed to help teams deploy and manage large language models (LLMs) efficiently across private or hybrid infrastructures. Built with performance and developer ergonomics in mind, Pruna simplifies inference workflows by enabling multi-model orchestration, autoscaling, GPU resource allocation, and compatibility with popular open-source models. It is ideal for companies or teams looking to reduce reliance on external APIs while maintaining speed, cost-efficiency, and full control over their data and AI stack. ...

Downloads: 0 This Week

Last Update: 2026-04-22
24

LLM-Pruner

On the Structural Pruning of Large Language Models

LLM-Pruner is an open-source framework designed to compress large language models through structured pruning techniques while maintaining their general capabilities. Large language models often require enormous computational resources, making them expensive to deploy and inefficient for many practical applications. LLM-Pruner addresses this issue by identifying and removing non-essential components within transformer architectures, such as redundant attention heads or feed-forward structures. The framework relies on gradient-based analysis to determine which parameters contribute least to model performance, enabling targeted structural pruning rather than simple weight removal. ...

Downloads: 1 This Week

Last Update: 2026-03-09
25

Stable Diffusion WebUI Docker

Easy Docker setup for Stable Diffusion with user-friendly UI

Stable Diffusion WebUI Docker is a Docker-based repository that simplifies running Stable Diffusion with rich user interfaces by packaging multiple popular web UIs into an easy-to-deploy containerized solution. It integrates leading community UIs like AUTOMATIC1111 and ComfyUI into a Docker Compose setup that can be started with a single command, abstracting away dependency installation and environment configuration. Users can choose which UI profile they want to run (for example, full-featured AUTOMATIC1111, CPU-only AUTOMATIC1111 builds, or ComfyUI workflows) and launch it in a consistent, isolated container environment with automatic model and data caching. ...

Downloads: 1 This Week

Last Update: 2026-02-03