Open Source Python Artificial Intelligence Software - Page 7

Python Artificial Intelligence Software

View 13511 business solutions

Browse free open source Python Artificial Intelligence Software and projects below. Use the toggles on the left to filter open source Python Artificial Intelligence Software by OS, license, language, programming language, and project status.

  • Earn up to 16% annual interest with Nexo. Icon
    Earn up to 16% annual interest with Nexo.

    Let your crypto work for you

    Put idle assets to work with competitive interest rates, borrow without selling, and trade with precision. All in one platform. Geographic restrictions, eligibility, and terms apply.
    Get started with Nexo.
  • $300 in Free Credit Towards Top Cloud Services Icon
    $300 in Free Credit Towards Top Cloud Services

    Build VMs, containers, AI, databases, storage—all in one place.

    Start your project in minutes. After credits run out, 20+ products include free monthly usage. Only pay when you're ready to scale.
    Get Started
  • 1
    Real-Time Voice Cloning

    Real-Time Voice Cloning

    Clone a voice in 5 seconds to generate arbitrary speech in real-time

    Real-Time Voice Cloning is an influential deep-learning repository that demonstrates how to clone a voice from just a few seconds of audio and then generate arbitrary speech in that voice in near real time. It implements the SV2TTS pipeline (“Transfer Learning from Speaker Verification to Multispeaker Text-To-Speech Synthesis”) in three stages: a speaker encoder, a synthesizer, and a vocoder. In the first stage, short audio clips are converted into a fixed-dimensional speaker embedding that captures voice characteristics; this embedding is then used by a Tacotron-style synthesizer to generate spectrograms from text, which a WaveRNN-based vocoder finally turns into audio. The repo includes both a command-line demo and a graphical “toolbox” application where you can load reference voices, type text, and hear the synthesized results interactively. It also provides scripts for preprocessing datasets (such as LibriSpeech), training each of the three components.
    Downloads: 11 This Week
    Last Update:
    See Project
  • 2
    WhisperLive

    WhisperLive

    A nearly-live implementation of OpenAI's Whisper

    WhisperLive is a “nearly live” implementation of OpenAI’s Whisper model focused on real-time transcription. It runs as a server–client system in which the server hosts a Whisper backend and clients stream audio to be transcribed with very low delay. The project supports multiple inference backends, including Faster-Whisper, NVIDIA TensorRT, and OpenVINO, allowing you to target GPUs and different CPU architectures efficiently. It can handle microphone input, pre-recorded audio files, and network streams such as RTSP and HLS, making it flexible for live events, monitoring, or accessibility workflows. Configuration options let you control the number of clients, maximum connection time, and threading behavior so the server can be tuned for different deployment environments. On the client side, you can set the language, whether to translate into English, model size, voice activity detection, and output recording behavior.
    Downloads: 11 This Week
    Last Update:
    See Project
  • 3
    X-AnyLabeling

    X-AnyLabeling

    Effortless data labeling with AI support from Segment Anything

    X-AnyLabeling is an open-source data annotation platform designed to streamline the process of labeling datasets for computer vision and multimodal AI applications. The software integrates an AI-powered labeling engine that allows users to generate annotations automatically with the assistance of modern vision models such as Segment Anything and various object detection frameworks. It supports labeling tasks across images and videos and enables developers to prepare training datasets for tasks such as object detection, segmentation, classification, tracking, and pose estimation. The tool is built with an interactive graphical interface that simplifies annotation workflows and allows users to draw and edit labels directly on visual data. It also supports a wide range of export formats compatible with popular machine learning pipelines, making it easier to integrate with training frameworks.
    Downloads: 11 This Week
    Last Update:
    See Project
  • 4
    XHS-Downloader

    XHS-Downloader

    GUI/CLI tool for downloading Xiaohongshu

    XHS-Downloader is a GUI/CLI tool for downloading Xiaohongshu (Little Red Book) content without watermarks, supporting both graphics and video posts. Prebuilt packages for Windows and macOS are available from Releases and GitHub Actions artifacts, so most users can run it by unzipping and launching the included executable. The project offers two execution paths—run the compiled app or run from source—and documents default download and configuration paths to simplify first use. Recent releases add format support like JPEG and HEIC, clipboard-listening mode improvements, author-based archiving, SOCKS/HTTP proxy options, and the ability to set the file’s modification time to the post’s publish time for cleaner library organization. There is an active issues/discussions area with community tips, including approaches that use Selenium to acquire cookies and user agents for more reliable downloads.
    Downloads: 11 This Week
    Last Update:
    See Project
  • Gemini 3 and 200+ AI Models on One Platform Icon
    Gemini 3 and 200+ AI Models on One Platform

    Access Google's best plus Claude, Llama, and Gemma. Fine-tune and deploy from one console.

    Build generative AI apps with Vertex AI. Switch between models without switching platforms.
    Start Free
  • 5
    Agent S

    Agent S

    Agent S: an open agentic framework that uses computers like a human

    Agent S is an open-source agentic framework designed to enable autonomous computer use through an Agent-Computer Interface (ACI). Built to operate graphical user interfaces like a human, it allows AI agents to perceive screens, reason about tasks, and execute actions across macOS, Windows, and Linux systems. The latest version, Agent S3, surpasses human-level performance on the OSWorld benchmark, demonstrating state-of-the-art results in complex multi-step computer tasks. Agent S combines powerful foundation models (such as GPT-5) with grounding models like UI-TARS to translate visual inputs into precise executable actions. It supports flexible deployment via CLI, SDK, or cloud, and integrates with multiple model providers including OpenAI, Anthropic, Gemini, Azure, and Hugging Face endpoints. With optional local code execution, reflection mechanisms, and compositional planning, Agent S provides a scalable and research-driven framework for building advanced computer-use agents.
    Downloads: 10 This Week
    Last Update:
    See Project
  • 6
    BruteForceAI

    BruteForceAI

    Advanced LLM-powered brute-force tool combining AI intelligence

    BruteForceAI is an open-source security testing tool that applies large language models to the analysis of login forms and authentication flows in web applications. At a high level, the project uses AI to inspect HTML content, identify the relevant form elements, and automate selector discovery so that a tester does not need to hand-map every field before evaluation. It combines that analysis layer with automated credential testing workflows, framing itself as a more adaptive alternative to older brute-force tooling that depends heavily on manual configuration. The repository emphasizes features such as threaded execution, logging, and notification integrations, which position it as an automation-oriented project for controlled security assessment environments. From a software design perspective, its distinguishing idea is the use of language models as a front-end analysis layer that interprets a target page before the rest of the workflow proceeds.
    Downloads: 10 This Week
    Last Update:
    See Project
  • 7
    DeepSeek Coder

    DeepSeek Coder

    DeepSeek Coder: Let the Code Write Itself

    DeepSeek-Coder is a series of code-specialized language models designed to generate, complete, and infill code (and mixed code + natural language) with high fluency in both English and Chinese. The models are trained from scratch on a massive corpus (~2 trillion tokens), of which about 87% is code and 13% is natural language. This dataset covers project-level code structure (not just line-by-line snippets), using a large context window (e.g. 16K) and a secondary fill-in-the-blank objective to encourage better contextual completions and infilling. Multiple sizes of the model are offered (e.g. 1B, 5.7B, 6.7B, 33B) so users can trade off inference cost vs capability. The repo provides model weights, documentation on training setup, evaluation results on common benchmarks (HumanEval, MultiPL-E, APPS, etc.), and inference tools.
    Downloads: 10 This Week
    Last Update:
    See Project
  • 8
    Generative AI for Beginners (Version 3)

    Generative AI for Beginners (Version 3)

    21 Lessons, Get Started Building with Generative AI

    Generative AI for Beginners is a 21-lesson course by Microsoft Cloud Advocates that teaches the fundamentals of building generative AI applications in a practical, project-oriented way. Lessons are split into “Learn” modules for core concepts and “Build” modules with hands-on code in Python and TypeScript, so you can jump in at any point that matches your goals. The course covers everything from model selection, prompt engineering, and chat/text/image app patterns to secure development practices and UX for AI. It also walks through modern application techniques such as function calling, RAG with vector databases, working with open source models, agents, fine-tuning, and using SLMs. Each lesson includes a short video, a written guide, runnable samples for Azure OpenAI, the GitHub Marketplace Model Catalog, and the OpenAI API, plus a “Keep Learning” section for deeper study.
    Downloads: 10 This Week
    Last Update:
    See Project
  • 9
    HexStrike AI MCP Agents

    HexStrike AI MCP Agents

    HexStrike AI MCP Agents is an advanced MCP server

    HexStrike AI is an MCP server that lets LLM agents autonomously operate a large catalog of offensive-security tools. Its goal is to bridge “language models” and practical pentest workflows—enumeration, exploitation, vulnerability discovery, and bug bounty reconnaissance—under safe, auditable controls. The server exposes typed tools and guardrails so agent prompts translate to concrete, parameterized actions rather than brittle shell strings. It ships with curated tool adapters, task orchestration, and guidance for connecting popular agent clients (Claude, GPT, Copilot) to a hardened execution environment. Documentation highlights the breadth of supported utilities and positions HexStrike as a research and red-team aid, not a point-and-click exploit kit. A public site and active repository activity signal an expanding community around autonomous security research agents.
    Downloads: 10 This Week
    Last Update:
    See Project
  • Full-stack observability with actually useful AI | Grafana Cloud Icon
    Full-stack observability with actually useful AI | Grafana Cloud

    Our generous forever free tier includes the full platform, including the AI Assistant, for 3 users with 10k metrics, 50GB logs, and 50GB traces.

    Built on open standards like Prometheus and OpenTelemetry, Grafana Cloud includes Kubernetes Monitoring, Application Observability, Incident Response, plus the AI-powered Grafana Assistant. Get started with our generous free tier today.
    Create free account
  • 10
    Lemonade

    Lemonade

    Lemonade helps users run local LLMs with the highest performance

    Lemonade is a local LLM runtime that aims to deliver the highest possible performance on your own hardware by auto-configuring state-of-the-art inference engines for both NPUs and GPUs. The project positions itself as a “local LLM server” you can run on laptops and workstations, abstracting away backend differences while giving you a single place to serve and manage models. Its README emphasizes real-world adoption across startups, research groups, and large companies, signaling a focus on practical deployments rather than toy demos. The repository highlights easy onboarding with downloads, docs, and a Discord for support, suggesting an active user community. Messaging centers on squeezing maximum throughput/latency from modern accelerators without users having to hand-tune kernels or flags. Releases further reinforce the “server” framing, pointing developers toward a service that can be integrated into apps and tools.
    Downloads: 10 This Week
    Last Update:
    See Project
  • 11
    Mistral Vibe CLI

    Mistral Vibe CLI

    Minimal CLI coding agent by Mistral

    Mistral Vibe is an AI-powered “vibe-coding” command-line interface (CLI) and coding-assistant framework built by Mistral AI to let developers write, refactor, search, and manage code through natural language and context-aware automation, rather than manual typing only. It aims to take developers out of repetitive boilerplate and let them stay “in the flow”: you can ask the tool to generate functions, refactor code, search across the codebase, manipulate files, commit changes via Git, or run commands — all from a unified CLI interface. Behind the scenes, it leverages Mistral’s coding-optimized LLM stack (including models tuned for code understanding and generation), with project-wide context awareness: it scans your file structure, Git status, and recent history to inform suggestions so that generated code aligns with existing context.
    Downloads: 10 This Week
    Last Update:
    See Project
  • 12
    MoneyPrinter V2

    MoneyPrinter V2

    Automate the process of making money online

    MoneyPrinter V2 is an open-source automation platform designed to streamline and scale online income generation workflows by combining content creation, social media automation, and marketing strategies into a single system. It is a complete rewrite of the original MoneyPrinter project, focusing on modularity, extensibility, and broader functionality across multiple monetization channels. The platform operates primarily through Python-based scripts that automate tasks such as generating and publishing YouTube Shorts, posting on social media platforms like Twitter, and executing affiliate marketing campaigns. It integrates scheduling mechanisms that allow users to run automated workflows at defined intervals, enabling continuous content production and distribution without manual intervention.
    Downloads: 10 This Week
    Last Update:
    See Project
  • 13
    Nerfstudio

    Nerfstudio

    A collaboration friendly studio for NeRFs

    Nerfstudio provides a simple API that allows for a simplified end-to-end process of creating, training, and testing NeRFs. The library supports a more interpretable implementation of NeRFs by modularizing each component. With more modular NeRFs, we hope to create a more user-friendly experience in exploring the technology. This is a contributor-friendly repo with the goal of building a community where users can more easily build upon each other’s contributions. Nerfstudio initially launched as an opensource project by Berkeley students in KAIR lab at Berkeley AI Research (BAIR) in October 2022 as a part of a research project (paper). It is currently developed by Berkeley students and community contributors. We are committed to providing learning resources to help you understand the basics of (if you’re just getting started), and keep up-to-date with (if you’re a seasoned veteran) all things NeRF.
    Downloads: 10 This Week
    Last Update:
    See Project
  • 14
    Open-Sora

    Open-Sora

    Open-Sora: Democratizing Efficient Video Production for All

    Open-Sora is an open-source initiative aimed at democratizing high-quality video production. It offers a user-friendly platform that simplifies the complexities of video generation, making advanced video techniques accessible to everyone. The project embraces open-source principles, fostering creativity and innovation in content creation. Open-Sora provides tools, models, and resources to create high-quality videos, aiming to lower the entry barrier for video production and support diverse content creators.
    Downloads: 10 This Week
    Last Update:
    See Project
  • 15
    OpenViking

    OpenViking

    Context database designed specifically for AI Agents

    OpenViking is an open-source context database engineered for efficient indexing and retrieval of large amounts of unstructured or semi-structured context data used by AI applications. It’s primarily designed to serve as a high-performance, scalable backend for storing app context, embeddings, conversational histories, and other textual artifacts that need rapid lookup and semantic search, which makes it especially useful for systems like chatbots or memory-augmented agents. The project is implemented with performance in mind, often leveraging optimized data structures that balance fast reads and writes with minimal resource consumption. Developers can integrate OpenViking into modern AI stacks to unify context storage across services, enabling consistent session history, personalized responses, and richer search experiences.
    Downloads: 10 This Week
    Last Update:
    See Project
  • 16
    RAGFlow

    RAGFlow

    RAGFlow is an open-source RAG (Retrieval-Augmented Generation) engine

    RAGFlow is an open-source RAG (Retrieval-Augmented Generation) engine based on deep document understanding. It offers a streamlined RAG workflow for businesses of any scale, combining LLM (Large Language Models) to provide truthful question-answering capabilities, backed by well-founded citations from various complex formatted data.
    Downloads: 10 This Week
    Last Update:
    See Project
  • 17
    Transformer Engine

    Transformer Engine

    A library for accelerating Transformer models on NVIDIA GPUs

    Transformer Engine (TE) is a library for accelerating Transformer models on NVIDIA GPUs, including using 8-bit floating point (FP8) precision on Hopper GPUs, to provide better performance with lower memory utilization in both training and inference. TE provides a collection of highly optimized building blocks for popular Transformer architectures and an automatic mixed precision-like API that can be used seamlessly with your framework-specific code. TE also includes a framework-agnostic C++ API that can be integrated with other deep-learning libraries to enable FP8 support for Transformers. As the number of parameters in Transformer models continues to grow, training and inference for architectures such as BERT, GPT, and T5 become very memory and compute-intensive. Most deep learning frameworks train with FP32 by default. This is not essential, however, to achieve full accuracy for many deep learning models.
    Downloads: 10 This Week
    Last Update:
    See Project
  • 18
    abogen

    abogen

    Generate audiobooks from EPUBs, PDFs and text with captions

    abogen is a tool designed to generate audiobooks (or speech narrations) from textual sources such as EPUBs, PDFs, or plain text, with synchronized captions. In other words, it automates the pipeline of reading a digital book (or document), converting its text into speech via a TTS engine, and packaging the result into an audiobook format — likely along with timestamped captions or subtitles that align with the spoken audio. This can be very useful for accessibility, content consumption on the go, or for users who prefer audio over reading. The repository supports handling common ebook formats and generating outputs that combine audio plus caption metadata. By automating text-to-speech for arbitrary documents, abogen reduces the friction of producing audiobooks and could be integrated into larger workflows (e.g., batch converting a library of texts).
    Downloads: 10 This Week
    Last Update:
    See Project
  • 19
    AI-Tutorials/Implementations Notebooks

    AI-Tutorials/Implementations Notebooks

    Codes/Notebooks for AI Projects

    AI-Tutorials/Implementations Notebooks repository is a comprehensive collection of artificial intelligence tutorials and implementation examples intended for developers, students, and researchers who want to learn by building practical AI projects. The repository contains numerous Jupyter notebooks and code samples that demonstrate modern techniques in machine learning, deep learning, data science, and large language model workflows. It includes implementations for a wide range of AI topics such as computer vision, agent systems, federated learning, distributed systems, adversarial attacks, and generative AI. Many of the tutorials focus on building AI agents, multi-agent systems, and workflows that integrate language models with external tools or APIs. The codebase acts as a hands-on learning resource, allowing users to experiment with new frameworks, architectures, and machine learning workflows through guided examples.
    Downloads: 9 This Week
    Last Update:
    See Project
  • 20
    ChatTTS_colab

    ChatTTS_colab

    One-click deployment (including offline integration package)

    ChatTTS_colab is a wrapper project around the ChatTTS model that focuses on “one-click” deployment, especially in Google Colab. It provides an integrated offline bundle and scripts for Windows and macOS so users can run ChatTTS locally without wrestling with complex environment setup. The repository includes Colab notebooks that launch a Gradio-based web UI and expose streaming TTS, making it possible to listen to generated audio as it is produced. A distinctive feature is the “voice gacha” system, which batch-generates many distinct voice timbres and allows users to save the ones they like into a curated voice library. It has first-class support for long-form audio generation, making it suitable for audiobooks, podcasts, or long narration tasks. The project also implements multi-speaker or role-based reading, letting users assign different voices to different characters in a script and even use a large language model to generate that script in one step.
    Downloads: 9 This Week
    Last Update:
    See Project
  • 21
    Claude Code Security Reviewer

    Claude Code Security Reviewer

    An AI-powered security review GitHub Action using Claude

    The claude-code-security-review repository implements a GitHub Action that uses Claude (via the Anthropic API) to perform semantic security audits of code changes in pull requests. Rather than relying purely on pattern matching or static analysis, this action feeds diffs and surrounding context to Claude to reason about potential vulnerabilities (e.g. injection, misconfigurations, secrets exposure, etc). When a PR is opened, the action analyzes only the changed files (diff-aware scanning), generates findings (with explanations, severity, and remediation suggestions), filters false positives using custom prompt logic, and posts comments directly on the PR. It supports configuration inputs (which files/directories to skip, model timeout, whether to comment on the PR, etc). The tool is language-agnostic (it doesn’t need language-specific parsers), uses contextual understanding rather than simplistic rules, and aims to reduce noise with smarter filtering.
    Downloads: 9 This Week
    Last Update:
    See Project
  • 22
    DeepCode

    DeepCode

    DeepCode: Open Agentic Coding

    DeepCode is an agentic coding platform built around a multi-agent architecture that turns high-level inputs, including research papers, documents, and natural-language requirements, into working software artifacts. It positions itself as an “open agentic coding” system that can handle tasks like paper-to-code reproduction, frontend generation, and backend implementation by decomposing problems into structured steps and coordinating specialized agents. The system description highlights an orchestration layer that plans, assigns subtasks, and adapts strategies as complexity changes, rather than relying on a single monolithic prompt. It also describes document parsing capabilities aimed at extracting algorithmic and mathematical details from technical materials, translating them into implementable specifications and code.
    Downloads: 9 This Week
    Last Update:
    See Project
  • 23
    Dream Textures

    Dream Textures

    Stable Diffusion built-in to Blender

    Create textures, concept art, background assets, and more with a simple text prompt. Use the 'Seamless' option to create textures that tile perfectly with no visible seam. Texture entire scenes with 'Project Dream Texture' and depth to image. Re-style animations with the Cycles render pass. Run the models on your machine to iterate without slowdowns from a service. Create textures, concept art, and more with text prompts. Learn how to use the various configuration options to get exactly what you're looking for. Texture entire models and scenes with depth to image. Inpaint to fix up images and convert existing textures into seamless ones automatically. Outpaint to increase the size of an image by extending it in any direction. Perform style transfer and create novel animations with Stable Diffusion as a post processing step. Dream Textures has been tested with CUDA and Apple Silicon GPUs. Over 4GB of VRAM is recommended.
    Downloads: 9 This Week
    Last Update:
    See Project
  • 24
    Fish Speech

    Fish Speech

    SOTA Open Source TTS

    Fish Speech is a state-of-the-art open-source text-to-speech project that has evolved into the OpenAudio series of advanced TTS models. The repository hosts the code and tooling for training, fine-tuning, and serving high-quality TTS, while the current flagship models (OpenAudio-S1 and S1-mini) are distributed via Fish Audio’s playground and Hugging Face. The models are evaluated with Seed TTS metrics and achieve exceptionally low word and character error rates, indicating strong intelligibility and alignment between text and audio. Fish Speech emphasizes expressive and controllable voices: it supports a long list of emotion tags, tone markers, and special audio effect markers that can be embedded in the text to drive prosody and vocal style, from basic emotions to nuanced states like sarcastic, conciliative, or hysterical. The system is multilingual and cross-lingual, handling multiple languages in a single input without explicit phoneme markup, and is trained on large-scale datasets.
    Downloads: 9 This Week
    Last Update:
    See Project
  • 25
    GLM-OCR

    GLM-OCR

    Accurate × Fast × Comprehensive

    GLM-OCR is an open-source multimodal optical character recognition (OCR) model built on a GLM-V encoder–decoder foundation that brings robust, accurate document understanding to complex real-world layouts and modalities. Designed to handle text recognition, table parsing, formula extraction, and general information retrieval from documents containing mixed content, GLM-OCR excels across major benchmarks while remaining highly efficient with a relatively compact parameter size (~0.9B), enabling deployment in high-concurrency services and edge environments. The model’s multimodal capabilities allow it to reason across image and text content holistically, capturing structured and unstructured information from pages that include dense tables, seals, code snippets, and varied document graphics. GLM-OCR integrates a comprehensive SDK and inference toolchain that makes it easy for developers to install, invoke, and embed into production pipelines with simple commands or APIs.
    Downloads: 9 This Week
    Last Update:
    See Project
MongoDB Logo MongoDB