Open Source Python Artificial Intelligence Software - Page 8

Python Artificial Intelligence Software

View 14358 business solutions

Browse free open source Python Artificial Intelligence Software and projects below. Use the toggles on the left to filter open source Python Artificial Intelligence Software by OS, license, language, programming language, and project status.

  • Build Agents and Models on One Platform Icon
    Build Agents and Models on One Platform

    Everything you need to build production-ready agents and models. Access 200+ Google and third-party AI models and tools.

    Gemini Enterprise Agent Platform is Google Cloud's comprehensive platform for developers to build, scale, govern, and optimize agents and models. Choose from Google's most advanced models and third-party models like Anthropic's Claude Model Family.
    Try It Free
  • Forever Free Full-Stack Observability | Grafana Cloud Icon
    Forever Free Full-Stack Observability | Grafana Cloud

    Our generous forever free tier includes the full platform, including the AI Assistant, for 3 users with 10k metrics, 50GB logs, and 50GB traces.

    Built on open standards like Prometheus and OpenTelemetry, Grafana Cloud includes Kubernetes Monitoring, Application Observability, Incident Response, plus the AI-powered Grafana Assistant. Get started with our generous free tier today.
    Create free account
  • 1
    Fast3R

    Fast3R

    Fast3R: Towards 3D Reconstruction of 1000+ Images in One Forward Pass

    Fast3R is Meta AI’s official CVPR 2025 release for “Towards 3D Reconstruction of 1000+ Images in One Forward Pass.” It represents a next-generation feedforward 3D reconstruction model capable of producing dense point clouds and camera poses for hundreds to thousands of images or video frames in a single inference pass—eliminating the need for slow, iterative structure-from-motion pipelines. Built on PyTorch Lightning and extending concepts from DUSt3R and Spann3r, Fast3R unifies multi-view geometry, depth estimation, and camera registration within a single transformer-based architecture. It outputs high-quality 3D scene representations from unordered or sequential views, scaling to large datasets and varied camera intrinsics. The repository includes pretrained models, Gradio-based demos, and modular APIs for direct integration into research or production workflows.
    Downloads: 9 This Week
    Last Update:
    See Project
  • 2
    IndexTTS2

    IndexTTS2

    Industrial-level controllable zero-shot text-to-speech system

    IndexTTS is a modern, zero-shot text-to-speech (TTS) system engineered to deliver high-quality, natural-sounding speech synthesis with few requirements and strong voice-cloning capabilities. It builds on state-of-the-art models such as XTTS and other modern neural TTS backbones, improving them with a conformer-based speech conditional encoder and upgrading the decoder to a high-quality vocoder (BigVGAN2), leading to clearer and more natural audio output. The system supports zero-shot voice cloning — meaning it can mimic a target speaker’s voice from a short reference sample — making it versatile for multi-voice uses. Compared to many open-source TTS tools, IndexTTS emphasizes efficiency and controllability: it offers faster inference, simpler training pipelines, and controllable speech parameters (like duration, pitch, and prosody), which is critical for production use.
    Downloads: 9 This Week
    Last Update:
    See Project
  • 3
    LiteLLM

    LiteLLM

    lightweight package to simplify LLM API calls

    Call all LLM APIs using the OpenAI format [Anthropic, Huggingface, Cohere, Azure OpenAI etc.] liteLLM supports streaming the model response back, pass stream=True to get a streaming iterator in response. Streaming is supported for OpenAI, Azure, Anthropic, and Huggingface models.
    Downloads: 9 This Week
    Last Update:
    See Project
  • 4
    Memobase

    Memobase

    Fast backend for long-term AI user memory via structured profiles

    Memobase is an open source backend system that enables long-term user memory functionality for AI applications by capturing and structuring information about users across interactions. Its design centers on creating user profiles and recording event timelines, allowing AI systems to remember, understand, and evolve in their behaviour toward individual users over time. Instead of relying purely on traditional embedding-based retrieval or RAG systems, Memobase uses profile and timeline structures to deliver memory that reflects user context efficiently and meaningfully. The system focuses on three principal performance metrics: high search performance, reduced large language model (LLM) costs through batch processing techniques, and low latency with minimal SQL operations. Memobase supports integration with existing LLM workflows via APIs and SDKs (including Python, Node, and Go), making it easy to adopt within diverse application stacks.
    Downloads: 9 This Week
    Last Update:
    See Project
  • MongoDB Atlas runs apps anywhere Icon
    MongoDB Atlas runs apps anywhere

    Deploy in 115+ regions with the modern database for every enterprise.

    MongoDB Atlas gives you the freedom to build and run modern applications anywhere—across AWS, Azure, and Google Cloud. With global availability in over 115 regions, Atlas lets you deploy close to your users, meet compliance needs, and scale with confidence across any geography.
    Start Free
  • 5
    Mistral Vibe CLI

    Mistral Vibe CLI

    Minimal CLI coding agent by Mistral

    Mistral Vibe is an AI-powered “vibe-coding” command-line interface (CLI) and coding-assistant framework built by Mistral AI to let developers write, refactor, search, and manage code through natural language and context-aware automation, rather than manual typing only. It aims to take developers out of repetitive boilerplate and let them stay “in the flow”: you can ask the tool to generate functions, refactor code, search across the codebase, manipulate files, commit changes via Git, or run commands — all from a unified CLI interface. Behind the scenes, it leverages Mistral’s coding-optimized LLM stack (including models tuned for code understanding and generation), with project-wide context awareness: it scans your file structure, Git status, and recent history to inform suggestions so that generated code aligns with existing context.
    Downloads: 9 This Week
    Last Update:
    See Project
  • 6
    Open-Sora

    Open-Sora

    Open-Sora: Democratizing Efficient Video Production for All

    Open-Sora is an open-source initiative aimed at democratizing high-quality video production. It offers a user-friendly platform that simplifies the complexities of video generation, making advanced video techniques accessible to everyone. The project embraces open-source principles, fostering creativity and innovation in content creation. Open-Sora provides tools, models, and resources to create high-quality videos, aiming to lower the entry barrier for video production and support diverse content creators.
    Downloads: 9 This Week
    Last Update:
    See Project
  • 7
    Oumi

    Oumi

    Everything you need to build state-of-the-art foundation models

    Oumi is an open-source framework that provides everything needed to build state-of-the-art foundation models, end-to-end. It aims to simplify the development of large-scale machine-learning models.
    Downloads: 9 This Week
    Last Update:
    See Project
  • 8
    QwenPaw

    QwenPaw

    A personal AI assistant, easy to install

    QwenPaw is an AI agent framework developed by the AgentScope ecosystem to provide a desktop-style intelligent assistant powered by Qwen language models and modular agent orchestration. The project combines conversational AI, memory systems, tool usage, workflow automation, and multimodal interaction into a unified assistant environment designed for daily productivity and experimentation. It supports structured reasoning, autonomous task execution, and integration with external tools and APIs, allowing the assistant to perform actions beyond standard chatbot conversations. QwenPaw emphasizes extensibility through modular components that developers can customize for research, personal assistants, automation systems, or enterprise AI workflows. The architecture integrates agent collaboration, planning systems, and long-term contextual memory to create more persistent and adaptive interactions.
    Downloads: 9 This Week
    Last Update:
    See Project
  • 9
    StableSwarmUI

    StableSwarmUI

    Multi-user UI for managing and running Stable Diffusion workflows tool

    StableSwarmUI is a web-based interface designed to manage and coordinate Stable Diffusion image generation workflows in a multi-user environment. It focuses on enabling multiple users to interact with shared resources, making it suitable for collaborative or server-based deployments. It provides a centralized system where users can submit, monitor, and manage generation tasks through a browser interface. It abstracts much of the complexity involved in running diffusion models by offering a structured environment for handling prompts, outputs, and processing queues. StableSwarmUI is built to work alongside backend systems that execute the actual image generation, allowing separation between user interaction and compute workloads. It also emphasizes scalability, making it useful for setups where multiple jobs need to be processed efficiently. Overall, it serves as a coordination layer for Stable Diffusion usage rather than a standalone model implementation.
    Downloads: 9 This Week
    Last Update:
    See Project
  • Custom VMs From 1 to 96 vCPUs With 99.95% Uptime Icon
    Custom VMs From 1 to 96 vCPUs With 99.95% Uptime

    General-purpose, compute-optimized, or GPU/TPU-accelerated. Built to your exact specs.

    Live migration and automatic failover keep workloads online through maintenance. One free e2-micro VM every month.
    Try Free
  • 10
    TokenSpeed

    TokenSpeed

    TokenSpeed is a speed-of-light LLM inference engine

    TokenSpeed is an LLM inference engine designed for high-performance production agent workloads. It aims to combine TensorRT-LLM-level speed with vLLM-level usability, making it relevant for teams that need fast generation without sacrificing developer ergonomics. The project is focused on the specific needs of agentic systems, where latency, throughput, and efficient scheduling matter across many short or tool-heavy requests. It builds on ideas and components from the broader open-source inference ecosystem while presenting its own execution stack. TokenSpeed is useful for developers building local or server-side LLM infrastructure for agents, coding systems, and high-volume AI applications. Its main value is providing an inference layer optimized for fast token generation under practical agent workloads.
    Downloads: 9 This Week
    Last Update:
    See Project
  • 11
    Ultroid

    Ultroid

    Telegram UserBot, Built in Python Using Telethon lib

    Ultroid, a pluggable telegram userbot, made in python using Telethon! Ultroid has been written from scratch, making it more stable and less crashes. Ultroid warns you when you try to install/execute dangerous stuff (people nowadays make plugins to hack user accounts, Ultroid is safe). Unlike many others userbots that are being suspended by Heroku, Ultroid doesn't get suspended. Ultroid has been written from scratch, making it more stable and less of crashes. Error handling been done in the best way possible, such that the bot doesn't crash and stop all of a sudden. Ultroid has minimal amount of plugins (just the necessary ones) in the main repository, and all the other less-useful stuff in the addons repository. This facilitates quick deployments and lag-free use. Ultroid can install any plugin from the most of the other 'userbots' without any issue.
    Downloads: 9 This Week
    Last Update:
    See Project
  • 12
    ViMax

    ViMax

    Director, Screenwriter, Producer, and Video Generator All-in-One

    ViMax is an open-source framework for performing large-scale multi-modal vision-language modeling and reasoning by combining powerful image encoders with advanced language models to solve complex visual tasks. It integrates components like visual encoders, cross-modal fusion techniques, and reasoning modules so that users can go beyond simple captioning or classification to perform tasks such as visual question answering, multi-image inference, and structured scene understanding. ViMax’s design accommodates large image sets and supports retrieval augmentation, enabling it to work with external image databases, supplementary metadata, and semantic search to enhance context awareness. The system aims to bridge foundational vision backbones and generative language models through adapters and fusion layers that maximize both signal integration and reasoning depth, and includes utility pipelines for training, evaluation, and deployment.
    Downloads: 9 This Week
    Last Update:
    See Project
  • 13
    VibeVoice

    VibeVoice

    Open-source multi-speaker long-form text-to-speech model

    VibeVoice-1.5B is Microsoft’s frontier open-source text-to-speech (TTS) model designed for generating expressive, long-form, multi-speaker conversational audio such as podcasts. Unlike traditional TTS systems, it excels in scalability, speaker consistency, and natural turn-taking for up to 90 minutes of continuous speech with as many as four distinct speakers. A key innovation is its use of continuous acoustic and semantic speech tokenizers operating at an ultra-low frame rate of 7.5 Hz, enabling high audio fidelity with efficient processing of long sequences. The model integrates a Qwen2.5-based large language model with a diffusion head to produce realistic acoustic details and capture conversational context. Training involved curriculum learning with increasing sequence lengths up to 65K tokens, allowing VibeVoice to handle very long dialogues effectively. Safety mechanisms include an audible disclaimer and imperceptible watermarking in all generated audio to mitigate misuse risks.
    Downloads: 9 This Week
    Last Update:
    See Project
  • 14
    vLLM

    vLLM

    A high-throughput and memory-efficient inference and serving engine

    vLLM is a fast and easy-to-use library for LLM inference and serving. High-throughput serving with various decoding algorithms, including parallel sampling, beam search, and more.
    Downloads: 9 This Week
    Last Update:
    See Project
  • 15
    AI YouTube Shorts Generator

    AI YouTube Shorts Generator

    A python tool that uses GPT-4, FFmpeg, and OpenCV

    AI-YouTube-Shorts-Generator is a Python-based tool that automates the creation of short-form vertical video clips (“shorts”) from longer source videos — ideal for adapting content for platforms like YouTube Shorts, Instagram Reels, or TikTok. It analyzes input video (whether a local file or a YouTube URL), transcribes audio (with optional GPU-accelerated speech-to-text), uses an AI model to identify the most compelling or engaging segments, and then crops/resizes the video and applies subtitle overlays, producing a polished short video without manual editing. The tool streamlines multiple steps of the tedious short-form video workflow: highlight detection, clipping, subtitle generation, cropping to vertical 9:16 format, and final rendering — reducing hours of editing to a mostly automated pipeline. Because it supports both local and online video sources, it's flexible whether you're working with your own recorded content or repurposing existing longer-form videos.
    Downloads: 8 This Week
    Last Update:
    See Project
  • 16
    Agent Skills for Context Engineering

    Agent Skills for Context Engineering

    A comprehensive collection of Agent Skills for context engineering

    Agent Skills for Context Engineering is a curated collection of reusable “agent skills” focused on helping AI agents perform better on long-horizon, multi-step work by managing context deliberately. Rather than being a single application, it packages practical guidance into skill modules that agents can load to improve planning, retrieval, memory usage, and overall reliability in real workflows. The repository emphasizes context engineering as a discipline, covering why agents fail when context gets too large, too noisy, or poorly structured, and how to mitigate those failure modes with repeatable patterns. It is designed to be used across modern agent environments that support skill folders and structured instructions, so teams can standardize how agents operate instead of relying on ad-hoc prompting.
    Downloads: 8 This Week
    Last Update:
    See Project
  • 17
    AudioCraft

    AudioCraft

    Audiocraft is a library for audio processing and generation

    AudioCraft is a PyTorch library for text-to-audio and text-to-music generation, packaging research models and tooling for training and inference. It includes MusicGen for music generation conditioned on text (and optionally melody) and AudioGen for text-conditioned sound effects and environmental audio. Both models operate over discrete audio tokens produced by a neural codec (EnCodec), which acts like a tokenizer for waveforms and enables efficient sequence modeling. The repo provides inference scripts, checkpoints, and simple Python APIs so you can generate clips from prompts or incorporate the models into applications. It also contains training code and recipes, so researchers can fine-tune on custom data or explore new objectives without building infrastructure from scratch. Example notebooks, CLI tools, and audio utilities help with prompt design, conditioning on reference audio, and post-processing to produce ready-to-share outputs.
    Downloads: 8 This Week
    Last Update:
    See Project
  • 18
    Cheat on Content

    Cheat on Content

    Workflow that turns every post into a calibrated experiment

    Cheat on Content is an AI-assisted workflow for creators who want to make content performance measurable instead of relying on instinct alone. It turns every post into a structured experiment by asking creators to score ideas, make blind predictions, publish, review results after a defined time window, and evolve their own content rubric. Rather than generating posts for the creator, it focuses on sharpening judgment and helping users understand why certain content performs better. The project is built around the loop of score, predict, publish, retro, and improve. It is aimed at creators, marketers, and operators who want to build a repeatable system for learning from every published piece. Its value is strongest for people who already create consistently but need a better way to extract insight from their output.
    Downloads: 8 This Week
    Last Update:
    See Project
  • 19
    ChemCrow

    ChemCrow

    Chemcrow

    ChemCrow is an AI-powered framework designed to assist in chemical research and discovery. It integrates AI models with chemical knowledge bases to provide intelligent recommendations for synthesis planning, reaction prediction, and material discovery. This tool helps automate and accelerate research in computational chemistry and drug development.
    Downloads: 8 This Week
    Last Update:
    See Project
  • 20
    CodeGeeX

    CodeGeeX

    CodeGeeX: An Open Multilingual Code Generation Model (KDD 2023)

    CodeGeeX is a large-scale multilingual code generation model with 13 billion parameters, trained on 850B tokens across more than 20 programming languages. Developed with MindSpore and later made PyTorch-compatible, it is capable of multilingual code generation, cross-lingual code translation, code completion, summarization, and explanation. It has been benchmarked on HumanEval-X, a multilingual program synthesis benchmark introduced alongside the model, and achieves state-of-the-art performance compared to other open models like InCoder and CodeGen. CodeGeeX also powers IDE plugins for VS Code and JetBrains, offering features like code completion, translation, debugging, and annotation. The model supports Ascend 910 and NVIDIA GPUs, with optimizations like quantization and FasterTransformer acceleration for faster inference.
    Downloads: 8 This Week
    Last Update:
    See Project
  • 21
    FID score for PyTorch

    FID score for PyTorch

    Compute FID scores with PyTorch

    This is a port of the official implementation of Fréchet Inception Distance to PyTorch. FID is a measure of similarity between two datasets of images. It was shown to correlate well with human judgement of visual quality and is most often used to evaluate the quality of samples of Generative Adversarial Networks. FID is calculated by computing the Fréchet distance between two Gaussians fitted to feature representations of the Inception network. The weights and the model are exactly the same as in the official Tensorflow implementation, and were tested to give very similar results (e.g. .08 absolute error and 0.0009 relative error on LSUN, using ProGAN generated images). However, due to differences in the image interpolation implementation and library backends, FID results still differ slightly from the original implementation. In difference to the official implementation, you can choose to use a different feature layer of the Inception network instead of the default pool3 layer.
    Downloads: 8 This Week
    Last Update:
    See Project
  • 22
    Flyte
    Build production-grade data and ML workflows, hassle-free The infinitely scalable and flexible workflow orchestration platform that seamlessly unifies data, ML and analytics stacks. Don’t let friction between development and production slow down the deployment of new data/ML workflows and cause an increase in production bugs. Flyte enables rapid experimentation with production-grade software. Debug in the cloud by iterating on the workflows locally to achieve tighter feedback loops. As your data and ML workflows expand and demand more computing power, your workflow orchestration platform must keep up. If it’s not designed to scale, your platform will require constant monitoring and maintenance. Flyte was built with scalability in mind, ready to handle changing workloads and resource needs.
    Downloads: 8 This Week
    Last Update:
    See Project
  • 23
    Free LLM API resources

    Free LLM API resources

    A list of free LLM inference resources accessible via API

    Free LLM API resources repository curated by cheahjs is a community-driven index of free and open API endpoints, tools, datasets, runtimes, and utilities for working with large language models (LLMs) without cost-barriers. It collects a wide range of resources including hosted free-tier LLM APIs, documentation links, public model endpoints, open datasets useful for training or evaluation, tooling integrations, and examples showing how to interact with these services in real applications. This list helps developers, hobbyists, and researchers quickly find models they can use for prototyping, experimentation, or production proofs-of-concept without needing paid subscriptions, reducing friction for innovation. The repository typically categorizes offerings by provider, type of service (text, embeddings, vision), availability conditions (open without key, free tier with key), and usage examples to make discovery and adoption easier.
    Downloads: 8 This Week
    Last Update:
    See Project
  • 24
    Freqtrade

    Freqtrade

    Free, open source crypto trading bot

    Freqtrade is a free and open-source crypto trading bot written in Python. It is designed to support all major exchanges and be controlled via Telegram or WebUI. It contains backtesting, plotting, and money management tools as well as strategy optimization by machine learning. Always start by running a trading bot in Dry-run and do not engage money before you understand how it works and what profit/loss you should expect. We strongly recommend you have basic coding skills and Python knowledge. Do not hesitate to read the source code and understand the mechanisms of this bot, algorithms, and techniques implemented in it. Write your strategy in python, using pandas. Example strategies to inspire you are available in the strategy repository. Download historical data of the exchange and the markets you may want to trade with. Find the best parameters for your strategy using hyper optimization which employs machining learning methods.
    Downloads: 8 This Week
    Last Update:
    See Project
  • 25
    GPT-2

    GPT-2

    Code for the paper Language Models are Unsupervised Multitask Learners

    This repository contains the code and model weights for GPT-2, a large-scale unsupervised language model described in the OpenAI paper “Language Models are Unsupervised Multitask Learners.” The intent is to provide a starting point for researchers and engineers to experiment with GPT-2: generate text, fine‐tune on custom datasets, explore model behavior, or study its internal phenomena. The repository includes scripts for sampling, training, downloading pre-trained models, and utilities for tokenization and model handling. Support for memory-saving gradient techniques/optimizations during training. Sampling/generation scripts (conditional, unconditional, interactive).
    Downloads: 8 This Week
    Last Update:
    See Project