Open Source Python Artificial Intelligence Software - Page 6

Python Artificial Intelligence Software

View 11946 business solutions

Browse free open source Python Artificial Intelligence Software and projects below. Use the toggles on the left to filter open source Python Artificial Intelligence Software by OS, license, language, programming language, and project status.

  • Gen AI apps are built with MongoDB Atlas Icon
    Gen AI apps are built with MongoDB Atlas

    The database for AI-powered applications.

    MongoDB Atlas is the developer-friendly database used to build, scale, and run gen AI and LLM-powered apps—without needing a separate vector database. Atlas offers built-in vector search, global availability across 115+ regions, and flexible document modeling. Start building AI apps faster, all in one place.
    Start Free
  • Our Free Plans just got better! | Auth0 Icon
    Our Free Plans just got better! | Auth0

    With up to 25k MAUs and unlimited Okta connections, our Free Plan lets you focus on what you do best—building great apps.

    You asked, we delivered! Auth0 is excited to expand our Free and Paid plans to include more options so you can focus on building, deploying, and scaling applications without having to worry about your security. Auth0 now, thank yourself later.
    Try free now
  • 1
    DeepSeek Math

    DeepSeek Math

    Pushing the Limits of Mathematical Reasoning in Open Language Models

    DeepSeek-Math is DeepSeek’s specialized model (or dataset + evaluation) focusing on mathematical reasoning, symbolic manipulation, proof steps, and advanced quantitative problem solving. The repository is likely to include fine-tuning routines or task datasets (e.g. MATH, GSM8K, ARB), demonstration notebooks, prompt templates, and evaluation results on math benchmarks. The goal is to push DeepSeek’s performance in domains that require rigorous symbolic steps, calculus, linear algebra, number theory, or multi-step derivations. The repo may also include modules that integrate external computational tools (e.g. a CAS / computer algebra system) or calculator assistance backends to enhance correctness. Because math reasoning is a high bar for LLMs, DeepSeek-Math aims to showcase their model’s ability not just in natural text but in precise formal reasoning.
    Downloads: 7 This Week
    Last Update:
    See Project
  • 2
    DeepSparse

    DeepSparse

    Sparsity-aware deep learning inference runtime for CPUs

    A sparsity-aware enterprise inferencing system for AI models on CPUs. Maximize your CPU infrastructure with DeepSparse to run performant computer vision (CV), natural language processing (NLP), and large language models (LLMs).
    Downloads: 7 This Week
    Last Update:
    See Project
  • 3
    EPUB to Audiobook Converter

    EPUB to Audiobook Converter

    EPUB to audiobook converter, optimized for Audiobookshelf

    EPUB to Audiobook Converter is a tool designed to convert EPUB ebooks into chaptered audiobooks, optimized specifically for Audiobookshelf servers. It reads each chapter from an EPUB file, generates audio using a chosen text-to-speech backend, and outputs separate MP3 files with chapter titles preserved as metadata to make navigation easier. The project supports multiple TTS providers, including Microsoft Azure TTS, EdgeTTS, OpenAI TTS, local Piper, and Kokoro via an OpenAI-compatible endpoint, allowing users to choose between cloud and self-hosted voices. A recent addition is a Gradio-based WebUI, which wraps all configuration options in a graphical interface for users who prefer not to work with the command line. The tool offers advanced options such as controlling chapter ranges, handling paragraph detection via newline modes, removing endnote markers, and using regex-based search-and-replace files to tweak pronunciations. It can be run directly with Python or via Docker.
    Downloads: 7 This Week
    Last Update:
    See Project
  • 4

    Face Recognition

    World's simplest facial recognition api for Python & the command line

    Face Recognition is the world's simplest face recognition library. It allows you to recognize and manipulate faces from Python or from the command line using dlib's (a C++ toolkit containing machine learning algorithms and tools) state-of-the-art face recognition built with deep learning. Face Recognition is highly accurate and is able to do a number of things. It can find faces in pictures, manipulate facial features in pictures, identify faces in pictures, and do face recognition on a folder of images from the command line. It could even do real-time face recognition and blur faces on videos when used with other Python libraries.
    Downloads: 7 This Week
    Last Update:
    See Project
  • Level Up Your Cyber Defense with External Threat Management Icon
    Level Up Your Cyber Defense with External Threat Management

    See every risk before it hits. From exposed data to dark web chatter. All in one unified view.

    Move beyond alerts. Gain full visibility, context, and control over your external attack surface to stay ahead of every threat.
    Try for Free
  • 5
    Gemini-API

    Gemini-API

    Reverse-engineered Python API for Google Gemini web app

    Gemini-API is a community-created asynchronous Python wrapper for the web interface of Google’s Gemini models (formerly Bard). It is the result of reverse-engineering the Gemini web app and exposing its functionality through a programmatic API. This enables developers to incorporate Gemini into Python applications, scripts, bots, or tools without relying solely on official SDKs. The wrapper supports streaming responses, model selection, and handling of the web-based authentication/session mechanisms used by Google’s interface. While the project offers a powerful integration, users should note that the API is reverse-engineered (not officially supported by Google) and may face changes or rate-limits. The project is licensed under AGPL-3.0, emphasizing the “open” nature but also requiring derivative works to remain open. It has a strong community following and active discussions/issue tracking around model support, error handling, and new features.
    Downloads: 7 This Week
    Last Update:
    See Project
  • 6
    Hands-on Unsupervised Learning

    Hands-on Unsupervised Learning

    Code for Hands-on Unsupervised Learning Using Python (O'Reilly Media)

    This repo contains the code for the O'Reilly Media, Inc. book "Hands-on Unsupervised Learning Using Python: How to Build Applied Machine Learning Solutions from Unlabeled Data" by Ankur A. Patel. Many industry experts consider unsupervised learning the next frontier in artificial intelligence, one that may hold the key to the holy grail in AI research, the so-called general artificial intelligence. Since the majority of the world's data is unlabeled, conventional supervised learning cannot be applied; this is where unsupervised learning comes in. Unsupervised learning can be applied to unlabeled datasets to discover meaningful patterns buried deep in the data, patterns that may be near impossible for humans to uncover. Author Ankur Patel provides practical knowledge on how to apply unsupervised learning using two simple, production-ready Python frameworks - scikit-learn and TensorFlow. With the hands-on examples and code provided, you will identify difficult-to-find patterns in data.
    Downloads: 7 This Week
    Last Update:
    See Project
  • 7
    IndexTTS2

    IndexTTS2

    Industrial-level controllable zero-shot text-to-speech system

    IndexTTS is a modern, zero-shot text-to-speech (TTS) system engineered to deliver high-quality, natural-sounding speech synthesis with few requirements and strong voice-cloning capabilities. It builds on state-of-the-art models such as XTTS and other modern neural TTS backbones, improving them with a conformer-based speech conditional encoder and upgrading the decoder to a high-quality vocoder (BigVGAN2), leading to clearer and more natural audio output. The system supports zero-shot voice cloning — meaning it can mimic a target speaker’s voice from a short reference sample — making it versatile for multi-voice uses. Compared to many open-source TTS tools, IndexTTS emphasizes efficiency and controllability: it offers faster inference, simpler training pipelines, and controllable speech parameters (like duration, pitch, and prosody), which is critical for production use.
    Downloads: 7 This Week
    Last Update:
    See Project
  • 8
    LLM Action

    LLM Action

    Technical principles related to large models

    LLM-Action is a knowledge/tutorial/repository that shares principles, techniques, and real-world experience related to large language models (LLMs), focusing on LLM engineering, deployment, optimization, inference, compression, and tooling. It organizes content in domains like training, inference, compression, alignment, evaluation, pipelines, and applications. Sections covering infrastructure, engineering, and deployment. Repository templates, sample code, and resource links. Articles/code on LLM compression (quantization, pruning).
    Downloads: 7 This Week
    Last Update:
    See Project
  • 9
    MONAI

    MONAI

    AI Toolkit for Healthcare Imaging

    The MONAI framework is the open-source foundation being created by Project MONAI. MONAI is a freely available, community-supported, PyTorch-based framework for deep learning in healthcare imaging. It provides domain-optimized foundational capabilities for developing healthcare imaging training workflows in a native PyTorch paradigm. Project MONAI also includes MONAI Label, an intelligent open source image labeling and learning tool that helps researchers and clinicians collaborate, create annotated datasets, and build AI models in a standardized MONAI paradigm. MONAI is an open-source project. It is built on top of PyTorch and is released under the Apache 2.0 license. Aiming to capture best practices of AI development for healthcare researchers, with an immediate focus on medical imaging. Providing user-comprehensible error messages and easy to program API interfaces. Provides reproducibility of research experiments for comparisons against state-of-the-art implementations.
    Downloads: 7 This Week
    Last Update:
    See Project
  • Keep company data safe with Chrome Enterprise Icon
    Keep company data safe with Chrome Enterprise

    Protect your business with AI policies and data loss prevention in the browser

    Make AI work your way with Chrome Enterprise. Block unapproved sites and set custom data controls that align with your company's policies.
    Download Chrome
  • 10
    MoneyPrinterTurbo

    MoneyPrinterTurbo

    Generate short videos with one click using AI LLM

    MoneyPrinterTurbo is an AI-driven tool that enables users to generate high-definition short videos with minimal input. By providing a topic or keyword, the system automatically creates video scripts, sources relevant media assets, adds subtitles, and incorporates background music, resulting in a polished video ready for distribution.
    Downloads: 7 This Week
    Last Update:
    See Project
  • 11
    PIFuHD

    PIFuHD

    High-Resolution 3D Human Digitization from A Single Image

    PIFuHD (Pixel-Aligned Implicit Function for 3D human reconstruction at high resolution) is a method and codebase to reconstruct high-fidelity 3D human meshes from a single image. It extends prior PIFu work by increasing resolution and detail, enabling fine geometry in cloth folds, hair, and subtle surface features. The method operates by learning an implicit occupancy / surface function conditioned on the image and camera projection; at inference time it queries dense points to reconstruct a mesh via marching cubes. It also uses a two-stage architecture: a coarse global model followed by local refinement patches to capture fine detail, balancing global consistency and local detail. The repo includes training pipelines, dataset loaders (for Multi-POP, etc.), and inference scripts for mesh output including depth maps for postprocessing. To help practical use, there are utilities for normal estimation, texture back-projection, mesh cleanup, and integration with rendering pipelines.
    Downloads: 7 This Week
    Last Update:
    See Project
  • 12
    Pyrogram

    Pyrogram

    Elegant, modern and asynchronous Telegram MTProto API framework

    Pyrogram is a modern, elegant and asynchronous MTProto API framework. It enables you to easily interact with the main Telegram API through a user account (custom client) or a bot identity (bot API alternative) using Python. Ready: Install Pyrogram with pip and start building your applications right away. Easy: Makes the Telegram API simple and intuitive, while still allowing advanced usages. Elegant: Low-level details are abstracted and re-presented in a more convenient way. Fast: Boosted up by TgCrypto, a high-performance cryptography library written in C. Type-hinted: Types and methods are all type-hinted, enabling excellent editor support. Async: Fully asynchronous (also usable synchronously if wanted, for convenience). Powerful: Full access to Telegram's API to execute any official client action and more.
    Downloads: 7 This Week
    Last Update:
    See Project
  • 13
    RAGFlow

    RAGFlow

    RAGFlow is an open-source RAG (Retrieval-Augmented Generation) engine

    RAGFlow is an open-source RAG (Retrieval-Augmented Generation) engine based on deep document understanding. It offers a streamlined RAG workflow for businesses of any scale, combining LLM (Large Language Models) to provide truthful question-answering capabilities, backed by well-founded citations from various complex formatted data.
    Downloads: 7 This Week
    Last Update:
    See Project
  • 14
    VibeThinker

    VibeThinker

    Diversity-driven optimization and large-model reasoning ability

    VibeThinker is a compact but high-capability open-source language model released by WeiboAI (Sina AI Lab). It contains about 1.5 billion parameters, far smaller than many “frontier” models, yet it is explicitly optimized for reasoning, mathematics, and code generation tasks rather than general open-domain chat. The innovation lies in its training methodology: the team uses what they call the Spectrum-to-Signal Principle (SSP), where a first stage emphasizes diversity of reasoning paths (the “spectrum” phase) and a second stage uses reinforcement techniques (the “signal” phase) to refine toward correctness and strong reasoning. The result is a model that outpaces many much larger models on domain-specific benchmarks, demonstrating that smaller models, if trained carefully and with the right objectives, can achieve high performance in reasoning-centric tasks.
    Downloads: 7 This Week
    Last Update:
    See Project
  • 15
    abogen

    abogen

    Generate audiobooks from EPUBs, PDFs and text with captions

    abogen is a tool designed to generate audiobooks (or speech narrations) from textual sources such as EPUBs, PDFs, or plain text, with synchronized captions. In other words, it automates the pipeline of reading a digital book (or document), converting its text into speech via a TTS engine, and packaging the result into an audiobook format — likely along with timestamped captions or subtitles that align with the spoken audio. This can be very useful for accessibility, content consumption on the go, or for users who prefer audio over reading. The repository supports handling common ebook formats and generating outputs that combine audio plus caption metadata. By automating text-to-speech for arbitrary documents, abogen reduces the friction of producing audiobooks and could be integrated into larger workflows (e.g., batch converting a library of texts).
    Downloads: 7 This Week
    Last Update:
    See Project
  • 16
    audio-diffusion-pytorch

    audio-diffusion-pytorch

    Audio generation using diffusion models, in PyTorch

    A fully featured audio diffusion library, for PyTorch. Includes models for unconditional audio generation, text-conditional audio generation, diffusion autoencoding, upsampling, and vocoding. The provided models are waveform-based, however, the U-Net (built using a-unet), DiffusionModel, diffusion method, and diffusion samplers are both generic to any dimension and highly customizable to work on other formats. Note: no pre-trained models are provided here, this library is meant for research purposes.
    Downloads: 7 This Week
    Last Update:
    See Project
  • 17
    KoboldAI

    KoboldAI

    Your gateway to GPT writing

    This is a browser-based front-end for AI-assisted writing with multiple local & remote AI models. It offers the standard array of tools, including Memory, Author's Note, World Info, Save & Load, adjustable AI settings, formatting options, and the ability to import existing AI Dungeon adventures. You can also turn on Adventure mode and play the game like AI Dungeon Unleashed. Stories can be played like a Novel, a text adventure game or used as a chatbot with an easy toggles to change between the multiple gameplay styles. This makes KoboldAI both a writing assistant, a game and a platform for so much more. The way you play and how good the AI will be depends on the model or service you decide to use. No matter if you want to use the free, fast power of Google Colab, your own high end graphics card, an online service you have an API key for (Like OpenAI or Inferkit) or if you rather just run it slower on your CPU you will be able to find a way to use KoboldAI that works for you.
    Leader badge
    Downloads: 169 This Week
    Last Update:
    See Project
  • 18
    Screen Translate

    Screen Translate

    An OCR translator tool made by utilizing tesseract & python-opencv

    STL is an easy to use and light OCR translator tool that can be use to translate your screen. Made with python by utilizing Tesseract and opencv-python. For full view of the project you can check the Github repository: https://github.com/Dadangdut33/Screen-Translate REQUIREMENTS - Tesseract : https://github.com/UB-Mannheim/tesseract/wiki. Needed for the ocr. Install it with all the language pack. - Libretranslate (Optional for offline translation support) - Internet connection for translation if not using libretranslate # Tutorial on How To Setup https://github.com/Dadangdut33/Screen-Translate#installation-and-setup
    Leader badge
    Downloads: 46 This Week
    Last Update:
    See Project
  • 19
    Agentic Security

    Agentic Security

    Agentic LLM Vulnerability Scanner / AI red teaming kit

    The open-source Agentic LLM Vulnerability Scanner.
    Downloads: 6 This Week
    Last Update:
    See Project
  • 20
    Algobot

    Algobot

    Cryptocurrency trading bot with a graphical user interface

    Cryptocurrency trading bot that allows users to create strategies and then backtest, optimize, simulate, or run live bots using them. Telegram integration has been added to support easier and remote trading. Please note that Algobot requires TA-LIB. You can view instructions on how to download TA-LIB. For Windows users, it's best to download the .whl package for your Python install and pip install it. For Linux and MacOS users, there's excellent documentation available. Create graphs with real time data and/or moving averages. Run simulations with parameters configured. Run custom backtests with parameters configured. Run live bots with parameters configured. Telegram integration that allows users to trade or view statistics. Create custom, trailing, or limit stop losses.
    Downloads: 6 This Week
    Last Update:
    See Project
  • 21
    Applio

    Applio

    A simple, high-quality voice conversion tool focused on ease of use

    Applio is a high-quality voice conversion toolkit designed to make modern RVC/VITS-based voice cloning accessible to non-experts. It focuses strongly on ease of use: installation scripts for Windows, Linux, and macOS set up dependencies and then launch a browser-based Gradio interface. Within that interface, users can train and run voice conversion models for tasks like singing conversion, speech-to-speech transformation, and voice cloning. The project is structured to be flexible through plugins and configurations so users can extend functionality without touching the core code. Applio is considered stable and mature; ongoing development is now centered on security patches, dependency maintenance, and occasional improvements, which makes it attractive for production or repeatable workflows. It also includes TensorBoard helper scripts so people training custom models can monitor metrics and experiment more systematically.
    Downloads: 6 This Week
    Last Update:
    See Project
  • 22
    AudioLM - Pytorch

    AudioLM - Pytorch

    Implementation of AudioLM audio generation model in Pytorch

    Implementation of AudioLM, a Language Modeling Approach to Audio Generation out of Google Research, in Pytorch It also extends the work for conditioning with classifier free guidance with T5. This allows for one to do text-to-audio or TTS, not offered in the paper. Yes, this means VALL-E can be trained from this repository. It is essentially the same. This repository now also contains a MIT licensed version of SoundStream. It is also compatible with EnCodec, however, be aware that it has a more restrictive non-commercial license, if you choose to use it.
    Downloads: 6 This Week
    Last Update:
    See Project
  • 23
    BackgroundMattingV2

    BackgroundMattingV2

    Real-Time High-Resolution Background Matting

    Official repository for the paper Real-Time High-Resolution Background Matting. Our model requires capturing an additional background image and produces state-of-the-art matting results at 4K 30fps and HD 60fps on an Nvidia RTX 2080 TI GPU.
    Downloads: 6 This Week
    Last Update:
    See Project
  • 24
    BioNeMo

    BioNeMo

    BioNeMo Framework: For building and adapting AI models

    BioNeMo is an AI-powered framework developed by NVIDIA for protein and molecular generation using deep learning models. It provides researchers and developers with tools to design, analyze, and optimize biological molecules, aiding in drug discovery and synthetic biology applications.
    Downloads: 6 This Week
    Last Update:
    See Project
  • 25
    Chatterbox

    Chatterbox

    SoTA open-source TTS

    Chatterbox is Resemble AI's first production-grade open source TTS model. Licensed under MIT, Chatterbox has been benchmarked against leading closed-source systems like ElevenLabs and is consistently preferred in side-by-side evaluations. Whether you're working on memes, videos, games, or AI agents, Chatterbox brings your content to life. It's also the first open source TTS model to support emotion exaggeration control, a powerful feature that makes your voices stand out. Try it now on our Hugging Face Gradio app. If you like the model but need to scale or tune it for higher accuracy, check out our competitively priced TTS service (link). It delivers reliable performance with ultra-low latency of sub-200ms—ideal for production use in agents, applications, or interactive media.
    Downloads: 6 This Week
    Last Update:
    See Project