fastapi free download

Showing 27 open source projects for "fastapi"

1

FastAPI-MCP

Expose your FastAPI endpoints as Model Context Protocol (MCP) tools

fastapi_mcp lets you expose existing FastAPI endpoints as Model Context Protocol (MCP) tools with minimal setup, so AI agents can call your app as first-class tools. Rather than acting as a thin converter, it’s built as a native FastAPI extension that understands dependency injection, so you can reuse Depends() for authentication and authorization across your MCP tools. The server speaks directly to your app over its ASGI interface, avoiding extra HTTP hops between the MCP layer and your API, which reduces latency and simplifies deployment. ...

Downloads: 0 This Week

Last Update: 2025-10-08
See Project
2

FastKoko

Dockerized FastAPI wrapper for Kokoro-82M text-to-speech model

FastKoko is a self-hosted text-to-speech server built around the Kokoro-82M model and exposed through a FastAPI backend. It is designed to be easy to deploy via Docker, with separate CPU and GPU images so that users can choose between pure CPU inference and NVIDIA GPU acceleration. The project exposes an OpenAI-compatible speech endpoint, which means existing code that talks to the OpenAI audio API can often be pointed at a Kokoro-FastAPI instance with minimal changes. ...

Downloads: 2 This Week

Last Update: 2025-12-13
See Project
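The FastKoko entry above mentions an OpenAI-compatible speech endpoint. As a rough sketch of what "pointing existing code at it" could look like, the snippet below builds (but does not send) a request shaped like an OpenAI audio speech call using only the standard library. The base URL, port, model name, and voice name are placeholders assumed for illustration, and the `/v1/audio/speech` path mirrors OpenAI's API rather than anything confirmed by this listing.

```python
import json
import urllib.request

# Assumption: host and port of a locally running server; adjust to your deployment.
BASE_URL = "http://localhost:8880"


def build_speech_request(text: str, voice: str = "example_voice") -> urllib.request.Request:
    """Build a POST request in the shape of OpenAI's audio speech API."""
    payload = {
        "model": "kokoro",         # assumed model identifier
        "input": text,             # text to synthesize
        "voice": voice,            # hypothetical voice name
        "response_format": "mp3",  # desired audio container
    }
    return urllib.request.Request(
        url=f"{BASE_URL}/v1/audio/speech",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


req = build_speech_request("Hello from a self-hosted TTS server.")

# Sending requires a running server, so it is left commented out:
# with urllib.request.urlopen(req) as resp:
#     with open("speech.mp3", "wb") as f:
#         f.write(resp.read())
```

Because only the base URL differs from a hosted API call, switching back and forth is in principle a one-line configuration change.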
3

LitServe

Minimal Python framework for building scalable AI inference servers fast

...Unlike traditional serving tools that enforce rigid abstractions, LitServe focuses on flexibility by letting users control request handling, batching strategies, and output processing directly in Python. LitServe is built on top of FastAPI and extends it with AI-specific optimizations such as efficient multi-worker execution, which can significantly improve throughput. It includes built-in capabilities for batching, streaming responses, and automatic scaling across CPUs and GPUs, enabling high-performance deployments.

Downloads: 2 This Week

Last Update: 2026-03-18
See Project
4

ChatGLM2-6B

ChatGLM2-6B: An Open Bilingual Chat LLM

...The context window extends up to 32K (FlashAttention), and Multi-Query Attention improves speed and memory use. The repo includes Python APIs, CLI and web demos, OpenAI-style/FastAPI servers, and quantized checkpoints for lightweight local deployment on GPUs or CPU/MPS.

Downloads: 2 This Week

Last Update: 17 hours ago
See Project
5

Habit Tracker

Habit Tracker for the AI Coding Workshop

Habit Tracker is a personal habit-tracking web application designed to help users build and maintain daily habits through intuitive UI and analytics that visualize progress over time. It runs locally with a FastAPI backend (Python) and a React frontend, storing all data in a lightweight SQLite database so there’s no need for user accounts or cloud storage, which keeps habit data fully private and self-contained. The app provides streak tracking and completion rates for each habit, giving users feedback on consistency and motivation by showing how often habits are completed and where they may be lagging. ...

Downloads: 1 This Week

Last Update: 2026-01-28
See Project
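The Habit Tracker entry above refers to streak tracking and completion rates. One plausible way to define those two metrics is sketched below; this is not taken from the project's code, and the function names and the exact definitions (a streak ends at the first missed day counting back from today, and the rate is completed days over elapsed days) are assumptions.

```python
from datetime import date, timedelta


def current_streak(done: set, today: date) -> int:
    """Count consecutive completed days ending at `today`."""
    streak = 0
    day = today
    while day in done:
        streak += 1
        day -= timedelta(days=1)
    return streak


def completion_rate(done: set, start: date, today: date) -> float:
    """Fraction of days in [start, today] on which the habit was completed."""
    total_days = (today - start).days + 1
    completed = sum(1 for d in done if start <= d <= today)
    return completed / total_days


today = date(2026, 1, 10)
# Completed on Jan 8, 9, 10 (a three-day run) and once earlier on Jan 5.
done = {date(2026, 1, 10), date(2026, 1, 9), date(2026, 1, 8), date(2026, 1, 5)}

streak = current_streak(done, today)                    # 3
rate = completion_rate(done, date(2026, 1, 1), today)   # 4 of 10 days
```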
6

MLX-Audio

A text-to-speech, speech-to-text and speech-to-speech library

...It includes examples such as audiobook generation to demonstrate long-form synthesis and joined audio segments. On top of that, MLX-Audio offers a modern web interface powered by FastAPI, with real-time waveform and 3D visualizations, file upload, and audio management.

Downloads: 3 This Week

Last Update: 2026-03-14
See Project
7

supabase-py

Python Client for Supabase. Query Postgres from Flask, Django

Python Client for Supabase. Query Postgres from Flask, Django, FastAPI. Python user authentication, security policies, edge functions, file storage, and realtime data streaming.

Downloads: 3 This Week

Last Update: 2026-03-20
See Project
8

rag-search

RAG Search API

rag-search is a lightweight Retrieval-Augmented Generation API service designed to provide structured semantic search and answer generation through a simple FastAPI backend. The project integrates web search, vector embeddings, and reranking logic to retrieve relevant context before passing it to a language model for response generation. It is built to be easily deployable, requiring only environment configuration and dependency installation to run a functional RAG service. The system supports configurable filtering, scoring thresholds, and reranking options, allowing developers to fine-tune retrieval quality. ...

Downloads: 0 This Week

Last Update: 2026-03-03
See Project
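The rag-search entry above mentions configurable scoring thresholds and reranking. The toy sketch below shows the general shape of such a step: drop hits under a minimum score, then reorder the survivors. It is not the project's implementation; the `Hit` class, the `filter_and_rerank` function, and the term-overlap reranker (standing in for a real reranking model) are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Hit:
    text: str
    score: float  # similarity score from the initial retrieval step


def filter_and_rerank(hits, query, min_score=0.5, top_k=2):
    """Drop low-scoring hits, then rerank by query-term overlap."""
    kept = [h for h in hits if h.score >= min_score]
    terms = set(query.lower().split())

    def rerank_key(hit):
        overlap = len(terms & set(hit.text.lower().split()))
        # Primary key: term overlap; tie-breaker: original retrieval score.
        return (overlap, hit.score)

    return sorted(kept, key=rerank_key, reverse=True)[:top_k]


hits = [
    Hit("FastAPI is a Python web framework", 0.82),
    Hit("Bananas are rich in potassium", 0.91),
    Hit("Deploying a FastAPI service with Docker", 0.64),
    Hit("FastAPI tutorial", 0.30),  # below the threshold, filtered out
]
top = filter_and_rerank(hits, "deploy fastapi service")
```

Note how the highest-scoring hit (the unrelated one) falls out of the top results once reranking is applied, which is the point of adding a second pass after retrieval.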
9

LangChain Extract

Did you say you like data?

...The project implements a lightweight web service that allows developers to define extraction schemas and apply them to various sources such as plain text, HTML, or PDF documents. Built using FastAPI and the LangChain framework, the application exposes a REST API that can process documents and return structured outputs that match user-defined JSON schemas. Developers can create reusable “extractors” that define what type of information should be pulled from a document, along with example prompts that improve extraction quality through in-context learning.

Downloads: 0 This Week

Last Update: 2026-03-09
See Project
10

OAGI Python SDK

Python SDK for the Computer Use model Lux, developed by OpenAGI

...It provides high-level asynchronous agents (like AsyncDefaultAgent and AsyncActor) that encapsulate the loop of capturing screenshots, sending them to Lux, interpreting responses, and executing UI actions with PyAutoGUI. Multiple installation flavors let you choose between a minimal oagi-core package or variants that bundle desktop automation and FastAPI/Socket.IO server capabilities.

Downloads: 0 This Week

Last Update: 2026-02-22
See Project
11

LLaMA Efficient Tuning

Easy-to-use LLM fine-tuning framework (LLaMA-2, BLOOM, Falcon

Easy-to-use LLM fine-tuning framework (LLaMA-2, BLOOM, Falcon, Baichuan, Qwen, ChatGLM2)

Downloads: 0 This Week

Last Update: 2025-12-31
See Project
12

RAG API

ID-based RAG FastAPI: Integration with LangChain and PostgreSQL

rag_api is an open-source REST API for building Retrieval-Augmented Generation (RAG) systems using LLMs like GPT. It lets users index documents, search semantically, and retrieve relevant content for use in generative AI workflows. Designed for rapid prototyping, it is ideal for chatbot development, document assistants, and knowledge-based LLM apps.

Downloads: 0 This Week

Last Update: 2026-03-20
See Project
13

Hamilton DAGWorks

Helps scientists define testable, modular, self-documenting dataflow

Hamilton is a lightweight Python library for directed acyclic graphs (DAGs) of data transformations. Your DAG is portable; it runs anywhere Python runs, whether it's a script, notebook, Airflow pipeline, FastAPI server, etc. Your DAG is expressive; Hamilton has extensive features to define and modify the execution of a DAG (e.g., data validation, experiment tracking, remote execution). To create a DAG, write regular Python functions that specify their dependencies with their parameters. As shown below, it results in readable code that can always be visualized. ...

Downloads: 0 This Week

Last Update: 2025-10-11
See Project
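The Hamilton entry above describes building a DAG from regular Python functions whose parameters name their dependencies. The sketch below illustrates that idea with the standard library only; it does not use Hamilton's own API, and the node functions and the `resolve` helper are hypothetical names chosen for this example.

```python
import inspect


# Each function is a node; its parameter names are the nodes it depends on.
def spend_mean(spend: list) -> float:
    return sum(spend) / len(spend)


def signups_total(signups: list) -> int:
    return sum(signups)


def spend_per_signup(spend: list, signups_total: int) -> float:
    return sum(spend) / signups_total


NODES = {f.__name__: f for f in (spend_mean, signups_total, spend_per_signup)}


def resolve(name, inputs, cache=None):
    """Compute node `name`, resolving its parameters (dependencies) first."""
    cache = {} if cache is None else cache
    if name in inputs:  # raw input supplied by the caller
        return inputs[name]
    if name not in cache:
        func = NODES[name]
        params = inspect.signature(func).parameters
        kwargs = {p: resolve(p, inputs, cache) for p in params}
        cache[name] = func(**kwargs)
    return cache[name]


inputs = {"spend": [10.0, 20.0, 30.0], "signups": [1, 2, 3]}
result = resolve("spend_per_signup", inputs)  # 60.0 / 6
```

Because the dependency graph is implied by the function signatures, each node can be unit-tested as an ordinary function, which is the testability benefit the description alludes to.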
14

Gemini Fullstack LangGraph Quickstart

Get started w/ building Fullstack Agents using Gemini 2.5 & LangGraph

gemini-fullstack-langgraph-quickstart is a fullstack reference application from Google DeepMind’s Gemini team that demonstrates how to build a research-augmented conversational AI system using LangGraph and Google Gemini models. The project features a React (Vite) frontend and a LangGraph/FastAPI backend designed to work together seamlessly for real-time research and reasoning tasks. The backend agent dynamically generates search queries based on user input, retrieves information via the Google Search API, and performs reflective reasoning to identify knowledge gaps. It then iteratively refines its search until it produces a comprehensive, well-cited answer synthesized by the Gemini model. ...

Downloads: 4 This Week

Last Update: 17 hours ago
See Project
15

BasedHardware

Open source AI wearable platform for recording and summarizing speech

...Users can connect the wearable device to a mobile phone and automatically record and transcribe meetings, conversations, and voice memos. Omi includes firmware for wearable hardware, a Flutter-based mobile companion application, backend services built with Python and FastAPI, and various SDKs for developers. These components work together to process audio, perform speech recognition, and integrate AI features such as summaries and automated actions. Developers can extend the platform by building plugins, integrations, and custom applications using provided SDKs and APIs. The repository also supports experimental hardware implementations.

Downloads: 0 This Week

Last Update: 47 minutes ago
See Project
16

Infinity

Low-latency REST API for serving text-embeddings

Infinity is a high-throughput, low-latency REST API for serving vector embeddings, supporting all sentence-transformer models and frameworks. Infinity is developed under the MIT License. Infinity powers inference behind Gradient.ai and other embedding API providers.

Downloads: 0 This Week

Last Update: 2025-08-22
See Project
17

Style-Bert-VITS2

Style-Bert-VITS2: Bert-VITS2 with more controllable voice styles

Style-Bert-VITS2 is a text-to-speech system based on Bert-VITS2 that focuses on highly controllable voice styles and emotional expression. It takes the original Bert-VITS2 v2.1 and its Japanese-Extra variant and extends them so you can control emotion and speaking style with fine-grained intensity, not just choose a generic tone. The project targets both power users and beginners: Windows users without Git or Python can install and run it using bundled .bat scripts, while advanced users can...

Downloads: 2 This Week

Last Update: 2025-11-28
See Project
18

PySyft

Data science on data without acquiring a copy

Most software libraries let you compute over the information you own and see inside of machines you control. However, this means that you cannot compute on information without first obtaining (at least partial) ownership of that information. It also means that you cannot compute using machines without first obtaining control over those machines. This is very limiting to human collaboration and systematically drives the centralization of data, because you cannot work with a bunch of data...

Downloads: 0 This Week

Last Update: 2025-02-13
See Project
19

autollm

Ship RAG based LLM web apps in seconds

...The project focuses on simplifying the usual stack of model selection, document ingestion, vector storage, querying, and API deployment into a more unified developer experience. Its core idea is that a developer can create a query engine from a document set in just a few lines and then turn that same engine into a FastAPI application almost instantly. AutoLLM supports a broad range of language models and vector databases, which makes it useful for teams that want flexibility without rewriting their application architecture every time they switch providers. The framework also includes built-in readers for multiple content sources such as PDFs, DOCX files, notebooks, websites, and other document types, which helps shorten the time between raw data and a working knowledge application.

Downloads: 0 This Week

Last Update: 2026-03-10
See Project
20

Langcorn

Serving LangChain LLM apps automagically with FastAPI

LangCorn is an API server that enables you to serve LangChain models and pipelines with ease, leveraging the power of FastAPI for a robust and efficient experience.

Downloads: 0 This Week

Last Update: 2023-11-06
See Project
21

LangChain Apps on Production with Jina

LangChain Apps on Production with Jina & FastAPI

...And if you prefer, you can also deploy your LangChain apps on your own infrastructure to ensure data privacy. With langchain-serve, you can craft REST/WebSocket APIs, spin up LLM-powered conversational Slack bots, or wrap your LangChain apps into FastAPI packages on the cloud or on-premises.

Downloads: 0 This Week

Last Update: 2023-08-25
See Project
22

RasaGPT

Headless Rasa chatbot platform with LLM integration and APIs

RasaGPT is a headless chatbot platform that combines Rasa with modern LLM tooling such as LangChain and LlamaIndex. It serves as a reference implementation and boilerplate for building conversational AI systems with retrieval and context injection. RasaGPT includes a FastAPI backend for creating custom bot endpoints, along with document ingestion and a training pipeline. It simplifies integration challenges between Rasa and LLM libraries, including metadata handling and library conflicts. RasaGPT supports multi-tenant deployments, session management, and custom schemas using pgvector. It also enables Telegram bot integration and remote access via ngrok. ...

Downloads: 1 This Week

Last Update: 2 days ago
See Project
23

Hugging Face Transformer

CPU/GPU inference server for Hugging Face transformer models

...In that setup, latency is key to providing a good user experience, and relevancy inference is done online for hundreds of snippets per user query. Most tutorials on Transformer deployment in production are built on PyTorch and FastAPI. Both are great tools but not very performant for inference. With some extra effort, you can build something on ONNX Runtime and Triton Inference Server, which usually gives 2x to 4x faster inference than vanilla PyTorch. However, if you want best-in-class performance on GPU, there is only a single possible combination: Nvidia TensorRT and Triton. ...

Downloads: 0 This Week

Last Update: 2022-08-22
See Project
24

gpt-j-api

API for the GPT-J language model. Including a FastAPI backend

An API to interact with the GPT-J language model and variants! You can use and test the model in two different ways. These are the endpoints of the public API and require no authentication. Just SSH into a TPU VM. This code was tested on both the v2-8 and v3-8 variants.

Downloads: 0 This Week

Last Update: 2023-03-23
See Project
25

Neuro-comma

Punctuation restoration production-ready model for Russian language

...Feel free to fork this repo and edit the model or dataset classes for your purposes. Our team always uses the latest version and features of Python. We started with Python 3.9, but realized that there was no FastAPI image for Python 3.9. There are several PRs in the image repositories, but no response from the maintainers. So we decided to change the code we use in production to work with Python 3.8. Some functions still contain 3.9 code, but we keep them because they are needed only for development purposes.

Downloads: 0 This Week

Last Update: 2023-04-21
See Project