Open Source Python Artificial Intelligence Software - Page 9

Python Artificial Intelligence Software

Browse free open source Python Artificial Intelligence Software and projects below.

  • 1
    OpenCLIP

    An open source implementation of CLIP

    The goal of this repository is to enable training models with contrastive image-text supervision and to investigate their properties such as robustness to distribution shift. Our starting point is an implementation of CLIP that matches the accuracy of the original CLIP models when trained on the same dataset. Specifically, a ResNet-50 model trained with our codebase on OpenAI's 15 million image subset of YFCC achieves 32.7% top-1 accuracy on ImageNet. OpenAI's CLIP model reaches 31.3% when trained on the same subset of YFCC. For ease of experimentation, we also provide code for training on the 3 million images in the Conceptual Captions dataset, where a ResNet-50x4 trained with our codebase reaches 22.2% top-1 ImageNet accuracy. This codebase is a work in progress, and we invite all to contribute to making it more accessible and useful. In the future, we plan to add support for TPU training and release larger models. We hope this codebase facilitates and promotes further research.
    Downloads: 8 This Week
  • 2
    OpenHands

    Open-source autonomous AI software engineer

    Welcome to OpenHands (formerly OpenDevin), an open-source autonomous AI software engineer who is capable of executing complex engineering tasks and collaborating actively with users on software development projects. Use AI to tackle the toil in your backlog, so you can focus on what matters: hard problems, creative challenges, and over-engineering your dotfiles. We believe agentic technology is too important to be controlled by a few corporations. So we're building all our agents in the open on GitHub, under the MIT license. Our agents can do anything a human developer can: they write code, run commands, and use the web. We're partnering with AI safety experts like Invariant Labs to balance innovation with security.
    Downloads: 8 This Week
  • 3
    SenseVoice

    Multilingual speech recognition and audio understanding model

    SenseVoice is a speech foundation model designed to perform multiple voice understanding tasks from audio input. It provides capabilities such as automatic speech recognition, spoken language identification, speech emotion recognition, and audio event detection within a single system. SenseVoice is trained on more than 400,000 hours of speech data and supports over 50 languages for multilingual recognition tasks. It is built to achieve high transcription accuracy while maintaining efficient inference performance. It includes different model variants optimized for either speed or accuracy, allowing developers to choose a configuration suitable for their use case. In addition to speech transcription, SenseVoice can detect emotional cues in speech and identify common sound events such as applause, laughter, or coughing. It also provides tools for running inference, exporting models to formats like ONNX or LibTorch, and deploying the system through APIs.
    Downloads: 8 This Week
  • 4
    Solace Agent Mesh

    An event-driven framework designed to build multi-agent AI systems

    Solace Agent Mesh is an event-driven framework designed to build, orchestrate, and scale multi-agent AI systems where specialized agents collaborate to solve complex tasks across distributed environments. It addresses one of the main challenges in modern AI systems, which is connecting isolated agents, data sources, and enterprise systems into a cohesive and interoperable ecosystem. The framework uses an asynchronous messaging architecture powered by an event broker, enabling agents to communicate reliably without tight coupling, which significantly improves scalability and fault tolerance. It introduces a standardized agent-to-agent communication protocol that allows different agents, regardless of their implementation or location, to exchange tasks, share data, and coordinate workflows efficiently. Solace Agent Mesh also includes orchestration mechanisms that dynamically break down user requests into smaller tasks and assign them to the most appropriate agents in real time.
    Downloads: 8 This Week
  • 5
    StyleTTS 2

    Towards Human-Level Text-to-Speech through Style Diffusion

    StyleTTS2 is a state-of-the-art text-to-speech system that aims for human-level naturalness by combining style diffusion, adversarial training, and large speech language models. It extends the original StyleTTS idea by introducing a style diffusion model that can sample rich, realistic speaking styles conditioned on reference speech, allowing highly expressive and diverse prosody. The architecture uses a two-stage training process and leverages an auxiliary speech language model to guide generation toward more natural and coherent utterances. StyleTTS2 supports both single-speaker and multi-speaker configurations, with the ability to sample or transfer styles from reference audio, making it powerful for expressive TTS and character voices. The repository includes training scripts, configuration files, and pre-trained auxiliary modules such as a text aligner, pitch extractor, and PL-BERT-based linguistic encoder.
    Downloads: 8 This Week
  • 6
    Sunfish

    Sunfish: a Python Chess Engine in 111 lines of code

    sunfish is a minimalist yet surprisingly strong chess engine written in Python, designed to demonstrate how powerful algorithms can be implemented in a highly compact codebase. Despite being only around a hundred lines of core logic, the engine achieves competitive performance, reaching ratings above 2000 on online platforms. It implements classic chess engine techniques such as alpha-beta pruning and efficient board representation while maintaining readability and simplicity. The project is often used as an educational tool for understanding game AI, search algorithms, and evaluation functions without the complexity of larger engines. It includes a simple UCI-compatible interface, allowing it to be integrated with graphical chess interfaces or used in terminal-based gameplay. The codebase is intentionally minimal, making it ideal for experimentation, modification, and learning purposes.
    Downloads: 8 This Week
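The alpha-beta pruning mentioned in the Sunfish description can be illustrated with a generic textbook sketch. This is not Sunfish's own code: the toy tree `TREE` and the helper names `children`, `evaluate`, and `visited` are assumptions made up for this example, and a real engine would generate chess moves rather than walk nested lists.

```python
from math import inf

visited = []  # leaves that were actually evaluated (hypothetical bookkeeping)

def children(node):
    # In this toy tree, inner nodes are lists and leaves are plain scores.
    return node if isinstance(node, list) else []

def evaluate(node):
    # Static evaluation; here a leaf simply is its own score.
    visited.append(node)
    return node

def alphabeta(node, depth, alpha, beta, maximizing):
    """Return the minimax value of node, skipping branches that cannot change it."""
    kids = children(node)
    if depth == 0 or not kids:
        return evaluate(node)
    if maximizing:
        best = -inf
        for child in kids:
            best = max(best, alphabeta(child, depth - 1, alpha, beta, False))
            alpha = max(alpha, best)
            if alpha >= beta:
                break  # the minimizing side already has a better option elsewhere
        return best
    best = inf
    for child in kids:
        best = min(best, alphabeta(child, depth - 1, alpha, beta, True))
        beta = min(beta, best)
        if alpha >= beta:
            break  # the maximizing side already has a better option elsewhere
    return best

# Three candidate moves, each followed by two possible replies.
TREE = [[3, 5], [2, 9], [0, 1]]
best_score = alphabeta(TREE, 2, -inf, inf, True)
```

Once the first move guarantees a score of 3, the search abandons each later move as soon as one reply scores below 3, so the leaves 9 and 1 are never evaluated.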
  • 7
    SurfSense

    Connect any LLM to your internal knowledge sources

    SurfSense is an open-source AI research and knowledge assistant platform that connects any large language model to internal knowledge sources so teams and individuals can explore, query, and collaborate on insights in real time. Built as an alternative to proprietary tools like NotebookLM, Perplexity, and Glean, SurfSense allows integrations with a wide range of external data sources including Slack, Notion, Google Drive, GitHub, YouTube, and many enterprise systems, making it possible to interact with documents, chat logs, and structured data using natural language. Team collaboration is a core focus, with real-time shared chats, role-based access control, and comment threads enabling organized workflows. The platform also supports advanced retrieval augmented generation (RAG) capabilities, enabling powerful search and citation features that help answer questions with contextually relevant data.
    Downloads: 8 This Week
  • 8
    Transformers

    State-of-the-art Machine Learning for PyTorch, TensorFlow, and JAX

    Hugging Face Transformers provides APIs and tools to easily download and train state-of-the-art pre-trained models. Using pre-trained models can reduce your compute costs and carbon footprint, and save you the time and resources required to train a model from scratch. These models support common tasks in different modalities: text (text classification, information extraction, question answering, summarization, translation, and text generation, in over 100 languages), images (image classification, object detection, and segmentation), and audio (speech recognition and audio classification). Transformers provides APIs to quickly download and use those pretrained models on a given text, fine-tune them on your own datasets, and then share them with the community on our model hub. At the same time, each Python module defining an architecture is fully standalone and can be modified to enable quick research experiments.
    Downloads: 8 This Week
  • 9
    VideoCaptioner

    AI-powered tool for generating, optimizing, and translating subtitles

    VideoCaptioner is an open source AI-powered subtitle processing tool designed to simplify the workflow of creating subtitles for videos. It integrates speech recognition, language processing, and translation technologies to automatically generate and refine subtitles from video or audio sources. VideoCaptioner uses speech-to-text engines such as Whisper variants to transcribe spoken content and convert it into subtitle text with accurate timestamps. After transcription, large language models are used to intelligently restructure subtitles into natural sentences, correct wording, and improve readability for viewers. It can also translate subtitles into other languages while preserving the original timing, making it suitable for multilingual video publishing and accessibility. In addition to generating subtitles, it supports editing, formatting, and embedding subtitles into videos as either hard or soft subtitles.
    Downloads: 8 This Week
  • 10
    Whisper-WebUI

    A web UI for easy subtitle generation using the Whisper model

    Whisper WebUI is an open-source browser-based interface that simplifies the use of Whisper speech recognition models by providing an intuitive graphical environment for transcription, translation, and subtitle generation. Built with Gradio, it allows users to upload audio or video files, process them locally, and generate accurate text outputs without relying on command-line tools. The platform integrates optimized implementations such as faster-whisper, significantly improving transcription speed and reducing memory usage compared to standard models. It supports multiple input sources including local files, YouTube content, and microphone input, making it versatile for different workflows. Whisper WebUI also includes advanced preprocessing and postprocessing features such as voice activity detection, background music separation, and speaker diarization, enabling more accurate and structured outputs.
    Downloads: 8 This Week
  • 11
    nanobot

    🐈 nanobot: The Ultra-Lightweight Clawdbot / OpenClaw

    nanobot is an ultra-lightweight personal AI assistant designed to deliver powerful agent capabilities without unnecessary complexity. Built in just ~4,000 lines of clean, readable code, it offers a minimalist alternative to heavyweight agent frameworks while retaining core intelligence and extensibility. nanobot is optimized for speed and efficiency, enabling fast startup times and low resource usage across environments. Its research-ready architecture makes it easy for developers to understand, customize, and extend for experimentation or production use. With simple one-click deployment and a straightforward CLI, users can get a working AI assistant running in minutes. Inspired by Clawdbot but radically simplified, nanobot proves that capable AI agents don’t need massive codebases.
    Downloads: 8 This Week
  • 12
    Axolotl

    Go ahead and axolotl questions

    Axolotl is a powerful and flexible framework for fine-tuning large language models on custom datasets. Built for researchers and developers, Axolotl simplifies the process of adapting LLMs for specific tasks, including chat, code generation, and instruction following. It supports a wide variety of model architectures and offers out-of-the-box optimization strategies for efficient training.
    Downloads: 7 This Week
  • 13
    ChatGLM-6B

    ChatGLM-6B: An Open Bilingual Dialogue Language Model

    ChatGLM-6B is an open bilingual (Chinese + English) conversational language model based on the GLM architecture, with approximately 6.2 billion parameters. The project provides inference code, demos (command line, web, API), quantization support for lower memory deployment, and tools for finetuning (e.g., via P-Tuning v2). It is optimized for dialogue and question answering with a balance between performance and deployability in consumer hardware settings. It supports quantized inference (INT4, INT8) to reduce GPU memory requirements, as well as automatic mode switching between precision/memory tradeoffs (full/quantized).
    Downloads: 7 This Week
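To illustrate why INT8 quantization of the kind mentioned for ChatGLM-6B lowers memory use, here is a minimal sketch of symmetric per-tensor quantization in plain Python. It is an assumption-laden toy, not the project's actual scheme: the function names `quantize_int8` and `dequantize` and the sample weights are invented for illustration.

```python
def quantize_int8(weights):
    """Map floats to integers in [-127, 127] using one shared scale factor."""
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127 if max_abs else 1.0
    return [round(w / scale) for w in weights], scale

def dequantize(quantized, scale):
    """Recover approximate floats from the stored integers."""
    return [v * scale for v in quantized]

weights = [0.5, -1.27, 0.0, 1.0]      # hypothetical weight values
q, scale = quantize_int8(weights)     # each entry now fits in one signed byte
restored = dequantize(q, scale)       # close to the originals, not identical
```

Each weight is stored in one byte instead of two or four, at the cost of a rounding error of at most half the scale per value. That is the precision/memory tradeoff the description refers to.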
  • 14
    Chatterbox

    SoTA open-source TTS

    Chatterbox is Resemble AI's first production-grade open source TTS model. Licensed under MIT, Chatterbox has been benchmarked against leading closed-source systems like ElevenLabs and is consistently preferred in side-by-side evaluations. Whether you're working on memes, videos, games, or AI agents, Chatterbox brings your content to life. It's also the first open source TTS model to support emotion exaggeration control, a powerful feature that makes your voices stand out. Try it now on our Hugging Face Gradio app. If you like the model but need to scale or tune it for higher accuracy, check out our competitively priced TTS service. It delivers reliable performance with ultra-low latency of sub-200ms, ideal for production use in agents, applications, or interactive media.
    Downloads: 7 This Week
  • 15
    ChemCrow

    ChemCrow is an AI-powered framework designed to assist in chemical research and discovery. It integrates AI models with chemical knowledge bases to provide intelligent recommendations for synthesis planning, reaction prediction, and material discovery. This tool helps automate and accelerate research in computational chemistry and drug development.
    Downloads: 7 This Week
  • 16
    Claude Agent SDK for Python

    Python SDK for Claude Agent

    Claude Agent SDK (Python) is the official Python counterpart to the TypeScript Agent SDK from Anthropic, designed to let Python developers build powerful autonomous AI agents with Claude Code under the hood. The SDK wraps the core functionality of Claude Code and exposes high-level asynchronous and synchronous interfaces to query prompts, manage sessions, and orchestrate tool use — so you can build agents that understand code, make edits, run bash commands, interact with files, and handle workflows without writing low-level agent loop logic yourself. It ships with a bundled Claude Code CLI for convenience, though you can also point it to a custom installation, and supports defining custom tools and hooks directly in Python, which become callable by the agent during execution.
    Downloads: 7 This Week
  • 17
    Claude Code Tools

    Practical productivity tools for Claude Code, Codex-CLI

    Claude Code Tools is an open-source collection of command-line utilities and productivity plugins designed to enhance developer workflows when using AI coding agents such as Claude Code and Codex-CLI. The project focuses on solving common problems encountered in AI-assisted development environments, including managing session history, automating terminal interactions, and maintaining context across multiple coding sessions. It includes tools that allow developers to search conversation logs quickly, manage environment variables securely, and execute interactive terminal workflows that AI agents can control. Some components enable Claude Code to interact with terminal multiplexers such as tmux so that it can run programs, debug applications, and interact with scripts that require user input. The toolkit also provides safety mechanisms that prevent potentially dangerous shell commands from being executed automatically by AI agents.
    Downloads: 7 This Week
  • 18
    GPT Computer Assistant

    GPT-4o for Windows, macOS, and Linux

    This project is an alternative that brings the functionality of the ChatGPT macOS app to Windows and Linux, and it is presented as a fresh and stable implementation. For now you can install it as a Python library, but we plan to prepare a pipeline that provides native install scripts (.exe).
    Downloads: 7 This Week
  • 19
    Generative Models

    Collection of generative models, e.g. GAN, VAE in PyTorch

    This project is a comprehensive open-source collection of implementations of various generative machine learning models designed to help researchers and developers experiment with deep generative techniques. The repository contains practical implementations of well-known architectures such as Generative Adversarial Networks (GANs), Variational Autoencoders (VAEs), Restricted Boltzmann Machines, and Helmholtz Machines, implemented primarily using modern deep learning frameworks like PyTorch and TensorFlow. These models are widely used in artificial intelligence to generate new data that resembles the training data, such as images, text, or other structured outputs. The repository serves as an educational and experimental environment where users can study how generative models work internally and replicate results from academic research papers.
    Downloads: 7 This Week
  • 20
    Genv

    GPU environment management and cluster orchestration

    Genv is an open-source environment and cluster management system for GPUs. Genv lets you easily control, configure, monitor, and enforce the GPU resources that you are using in a GPU machine or cluster. It is intended to ease the process of GPU allocation for data scientists without code changes.
    Downloads: 7 This Week
  • 21
    IndexTTS2

    Industrial-level controllable zero-shot text-to-speech system

    IndexTTS is a modern, zero-shot text-to-speech (TTS) system engineered to deliver high-quality, natural-sounding speech synthesis with few requirements and strong voice-cloning capabilities. It builds on state-of-the-art models such as XTTS and other modern neural TTS backbones, improving them with a conformer-based speech conditional encoder and upgrading the decoder to a high-quality vocoder (BigVGAN2), leading to clearer and more natural audio output. The system supports zero-shot voice cloning — meaning it can mimic a target speaker’s voice from a short reference sample — making it versatile for multi-voice uses. Compared to many open-source TTS tools, IndexTTS emphasizes efficiency and controllability: it offers faster inference, simpler training pipelines, and controllable speech parameters (like duration, pitch, and prosody), which is critical for production use.
    Downloads: 7 This Week
  • 22
    Letta

    Letta (formerly MemGPT) is a framework for creating LLM services

    Letta is an AI-powered task automation framework designed to handle workflow automation, natural language commands, and AI-driven decision-making.
    Downloads: 7 This Week
  • 23
    LightRAG

    "LightRAG: Simple and Fast Retrieval-Augmented Generation"

    LightRAG is a lightweight Retrieval-Augmented Generation (RAG) framework designed for efficient document retrieval and response generation. It is optimized for speed and lower resource consumption, making it ideal for real-time applications.
    Downloads: 7 This Week
  • 24
    Luna AI

    Virtual AI anchor that combines state-of-the-art technology

    Luna AI is a virtual AI streamer framework designed to power an interactive VTuber that can go live on major platforms and chat with viewers in real time. It is built around a core assistant persona called “Luna AI,” which can be driven by a wide range of large language models and platforms, including GPT-style APIs, Claude, LangChain-based backends, ChatGLM, Kimi, Ollama, and many others. The project supports multiple rendering backends for the avatar, such as Live2D, Unreal Engine (UE), and “xuniren,” and can output to streaming platforms like Bilibili, Douyin, Kuaishou, WeChat Channels, Pinduoduo, Douyu, YouTube, Twitch, and TikTok. For voice, it integrates with numerous TTS engines (Edge-TTS, VITS-Fast, ElevenLabs, VALL-E-X, OpenVoice, GPT-SoVITS, Azure TTS, fish-speech, ChatTTS, CosyVoice, F5-TTS, MultiTTS, MeloTTS, and others), and can optionally pass the output through voice conversion systems like so-vits-svc or DDSP-SVC to change timbre.
    Downloads: 7 This Week
  • 25
    MLX-Audio

    A text-to-speech, speech-to-text and speech-to-speech library

    MLX-Audio is a speech library built on Apple’s MLX framework and optimized for Apple Silicon machines (M-series Macs). It focuses on text-to-speech and speech-to-speech workflows, with APIs and a command-line interface that make it easy to generate high-quality audio from text. Because it uses MLX and targets Apple Silicon, inference is fast and can take advantage of hardware acceleration and quantization for efficient on-device performance. The project provides a straightforward CLI (mlx_audio.tts.generate) as well as a Python API for programmatic generation of audio, including parameters for voice choice, speed, language hints, output format, and sample rate. It includes examples such as audiobook generation to demonstrate long-form synthesis and joined audio segments. On top of that, MLX-Audio offers a modern web interface powered by FastAPI, with real-time waveform and 3D visualizations, file upload, and audio management.
    Downloads: 7 This Week