Open Source Artificial Intelligence Software - Page 4

Artificial Intelligence Software

1

Voicebox

The open-source voice synthesis studio powered by Qwen3-TTS

Voicebox is a local-first voice synthesis studio that aims to bring professional, DAW-like voice generation workflows to a desktop app while keeping models and voice data entirely on your machine. It positions itself as an open-source alternative to cloud voice platforms by emphasizing privacy, offline use, and freedom from subscriptions or usage caps. The tool supports downloading voice models, cloning voices from short audio samples, and generating speech locally, then organizing the results using studio-oriented editing concepts. A standout capability is its multi-track timeline editor and supporting audio tools (like trimming and conversation mixing), which let creators compose multi-voice scenes instead of generating single clips in isolation. It is API-first, meaning you can use it as an app for production work or integrate its speech generation into your own software via an API layer.

Downloads: 90 This Week

Last Update: 2026-04-25
See Project
2

Applio

A simple, high-quality voice conversion tool focused on ease of use

Applio is a high-quality voice conversion toolkit designed to make modern RVC/VITS-based voice cloning accessible to non-experts. It focuses strongly on ease of use: installation scripts for Windows, Linux, and macOS set up dependencies and then launch a browser-based Gradio interface. Within that interface, users can train and run voice conversion models for tasks like singing conversion, speech-to-speech transformation, and voice cloning. The project is structured to be flexible through plugins and configurations so users can extend functionality without touching the core code. Applio is considered stable and mature; ongoing development is now centered on security patches, dependency maintenance, and occasional improvements, which makes it attractive for production or repeatable workflows. It also includes TensorBoard helper scripts so people training custom models can monitor metrics and experiment more systematically.

Downloads: 89 This Week

Last Update: 2026-02-18
See Project
3

eGuideDog free software for the blind

The eGuideDog project develops free software for the blind. Currently, we focus on WebSpeech, Ekho TTS, and WebAnywhere.

16 Reviews

Downloads: 410 This Week

Last Update: 5 days ago
See Project
4

Robocode

Robocode is a programming tank game for Java

Robocode is a programming game where the goal is to develop a robot battle tank in Java to fight other tanks. The robot battles run in real time and on-screen. The motto of Robocode is: Build the best, destroy the rest!

29 Reviews

Downloads: 407 This Week

Last Update: 2025-12-09
See Project
5

Netron

Visualizer for neural network, deep learning, machine learning models

Netron is a viewer for neural network, deep learning, and machine learning models. Netron supports ONNX, Keras, TensorFlow Lite, Caffe, Darknet, Core ML, MNN, MXNet, ncnn, PaddlePaddle, Caffe2, Barracuda, Tengine, TNN, RKNN, MindSpore Lite, and UFF. Netron has experimental support for TensorFlow, PyTorch, TorchScript, OpenVINO, Torch, Arm NN, BigDL, Chainer, CNTK, Deeplearning4j, MediaPipe, ML.NET, scikit-learn, and TensorFlow.js. An extensive variety of sample model files is available to download or to open in the browser version. Netron runs on macOS, Windows, and Linux, as a Python server, and in the browser.

Downloads: 86 This Week

Last Update: 3 days ago
See Project
6

Screen Translator

Screen capture, OCR and translation tool

This software allows you to translate any text on screen. Basically it is a combination of screen capture, OCR and translation tools. More info and the latest release on the homepage (https://github.com/OneMoreGres/ScreenTranslator)

20 Reviews

Downloads: 731 This Week

Last Update: 2022-02-05
See Project
7

ChatGPT Desktop Application

🔮 ChatGPT Desktop Application (Mac, Windows and Linux)

Downloads: 85 This Week

Last Update: 2023-08-03
See Project
8

COLMAP

Structure-from-Motion and Multi-View Stereo

COLMAP is a general-purpose Structure-from-Motion (SfM) and Multi-View Stereo (MVS) pipeline with a graphical and command-line interface. It offers a wide range of features for the reconstruction of ordered and unordered image collections. The software is licensed under the new BSD license.

Downloads: 82 This Week

Last Update: 2026-04-27
See Project
9

GLM-4.6

Agentic, Reasoning, and Coding (ARC) foundation models

GLM-4.6 is the latest iteration of Zhipu AI’s foundation model, delivering significant advancements over GLM-4.5. It introduces an extended 200K token context window, enabling more sophisticated long-context reasoning and agentic workflows. The model achieves superior coding performance, excelling in benchmarks and practical coding assistants such as Claude Code, Cline, Roo Code, and Kilo Code. Its reasoning capabilities have been strengthened, including improved tool usage during inference and more effective integration within agent frameworks. GLM-4.6 also enhances writing quality, producing outputs that better align with human preferences and role-playing scenarios. Benchmark evaluations demonstrate that it not only outperforms GLM-4.5 but also rivals leading global models such as DeepSeek-V3.1-Terminus and Claude Sonnet 4.

Downloads: 81 This Week

Last Update: 2026-02-01
See Project
10

LabelImg

Graphical image annotation tool for labeling object bounding boxes

LabelImg is a graphical image annotation tool. It is written in Python and uses Qt for its graphical interface. Annotations are saved as XML files in PASCAL VOC format, the format used by ImageNet; YOLO and CreateML formats are also supported. On Linux, Ubuntu, and macOS, it requires at least Python 2.6 and has been tested with PyQt 4.8, but Python 3 or above and PyQt5 are strongly recommended. Using a virtualenv can avoid many Qt and Python version issues. To annotate, build and launch the tool using the project's instructions, then click 'Change default saved annotation folder' in Menu/File, click 'Open Dir', and click 'Create RectBox'. Click and release the left mouse button to select the region to annotate. You can drag a rect box with the right mouse button to copy or move it. Annotations are saved to the folder you specify, and the hotkeys can speed up your workflow.
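
The two annotation layouts mentioned above differ mainly in how a box is written: PASCAL VOC stores pixel corners in XML, while YOLO stores a class index plus a center and size normalized by the image dimensions. The following is a minimal sketch of that conversion using only the Python standard library. It is not LabelImg's own code; the sample XML, the class list, and the function name voc_to_yolo are hypothetical and exist only for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample annotation in the PASCAL VOC layout (values invented for illustration).
VOC_XML = """
<annotation>
  <filename>example.jpg</filename>
  <size><width>640</width><height>480</height><depth>3</depth></size>
  <object>
    <name>dog</name>
    <bndbox><xmin>160</xmin><ymin>120</ymin><xmax>480</xmax><ymax>360</ymax></bndbox>
  </object>
</annotation>
"""


def voc_to_yolo(xml_text, class_names):
    """Convert VOC boxes (pixel corners) to YOLO rows:
    (class index, center x, center y, width, height), all normalized to 0..1."""
    root = ET.fromstring(xml_text.strip())
    img_w = float(root.findtext("size/width"))
    img_h = float(root.findtext("size/height"))
    rows = []
    for obj in root.iter("object"):
        # The class index is the position of the label in the caller's class list.
        class_id = class_names.index(obj.findtext("name"))
        box = obj.find("bndbox")
        xmin, ymin, xmax, ymax = (
            float(box.findtext(tag)) for tag in ("xmin", "ymin", "xmax", "ymax")
        )
        rows.append((
            class_id,
            (xmin + xmax) / 2 / img_w,  # normalized box center, x
            (ymin + ymax) / 2 / img_h,  # normalized box center, y
            (xmax - xmin) / img_w,      # normalized box width
            (ymax - ymin) / img_h,      # normalized box height
        ))
    return rows


rows = voc_to_yolo(VOC_XML, ["cat", "dog"])
for row in rows:
    # One line per box, in the order a YOLO label file typically uses.
    print(" ".join(str(value) for value in row))
```

Normalizing by image size is what lets the same label file be reused when images are resized, which is one reason a project may prefer one format over the other.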

Downloads: 81 This Week

Last Update: 2021-05-05
See Project
11

OpenMontage

World's first open-source, agentic video production system

OpenMontage is an open-source, agent-driven video production system that transforms AI coding assistants into fully automated multimedia creation pipelines. Instead of focusing on a single capability such as text-to-video generation, it treats video production as a structured, multi-stage workflow that mirrors how a real production team operates, including research, scripting, asset generation, editing, and final rendering. The system orchestrates a large collection of tools and models through coordinated pipelines, enabling an AI agent to autonomously gather information, write scripts, generate visuals, synthesize voiceovers, and assemble a complete video output. One of its defining characteristics is its modular and extensible architecture, which allows users to mix and match different providers, including both cloud APIs and local models, depending on performance, cost, or privacy needs.

Downloads: 80 This Week

Last Update: 6 days ago
See Project
12

Whisper

Robust Speech Recognition via Large-Scale Weak Supervision

OpenAI Whisper is a general-purpose speech recognition model. It is trained on a large dataset of diverse audio and is also a multitasking model that can perform multilingual speech recognition, speech translation, and language identification. A Transformer sequence-to-sequence model is trained on various speech processing tasks, including multilingual speech recognition, speech translation, spoken language identification, and voice activity detection. These tasks are jointly represented as a sequence of tokens to be predicted by the decoder, allowing a single model to replace many stages of a traditional speech-processing pipeline. The multitask training format uses a set of special tokens that serve as task specifiers or classification targets.
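
As a rough illustration of the task-specifier idea described above, the sketch below builds the sequence of special tokens that would precede the text a decoder predicts. It is a simplified stand-in, not the Whisper library's API: the function name build_decoder_prompt is hypothetical, and the token spellings are an assumption based on the published format that should be checked against the project itself.

```python
def build_decoder_prompt(language, task, timestamps=False):
    """Return the special tokens that tell a multitask decoder what to do.

    Assumed layout: start marker, language tag, task tag, then an optional
    marker that disables timestamp prediction.
    """
    if task not in ("transcribe", "translate"):
        raise ValueError("task must be 'transcribe' or 'translate'")
    tokens = ["<|startoftranscript|>", f"<|{language}|>", f"<|{task}|>"]
    if not timestamps:
        tokens.append("<|notimestamps|>")
    return tokens


# Same model, two different jobs: only the prompt tokens change.
transcribe_prompt = build_decoder_prompt("en", "transcribe")
translate_prompt = build_decoder_prompt("fr", "translate", timestamps=True)
print(transcribe_prompt)
print(translate_prompt)
```

Because the task is selected by tokens rather than by separate networks, switching from transcription to translation is a change to the prompt, not to the model.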

Downloads: 80 This Week

Last Update: 2025-06-26
See Project
13

FlashAttention

Fast and memory-efficient exact attention

FlashAttention is a high-performance deep learning optimization library that reimplements the attention mechanism used in transformer models to be significantly faster and more memory-efficient than standard implementations. It achieves this by using IO-aware algorithms that minimize memory reads and writes, reducing the quadratic memory overhead typically associated with attention operations. The project provides implementations of FlashAttention, FlashAttention-2, and newer iterations optimized for modern GPU architectures such as NVIDIA Hopper and AMD accelerators. By improving both forward and backward pass efficiency, it enables training and inference of large language models with longer sequence lengths and higher throughput. The library integrates with PyTorch and supports various attention configurations, including causal masking, multi-query attention, and rotary embeddings.
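
The memory saving described above rests on computing softmax incrementally, one block of keys and values at a time, so the full score matrix never has to be stored. The following is a plain-Python sketch of that online-softmax idea for a single query vector, checked against a naive version. It is not the library's GPU implementation, and the names naive_attention and tiled_attention and the sample data are hypothetical.

```python
import math


def naive_attention(q, keys, values):
    """Reference: compute every score first, then softmax, then the weighted sum."""
    scale = 1.0 / math.sqrt(len(q))
    scores = [scale * sum(a * b for a, b in zip(q, k)) for k in keys]
    top = max(scores)
    weights = [math.exp(s - top) for s in scores]
    total = sum(weights)
    dim = len(values[0])
    return [sum(w * v[j] for w, v in zip(weights, values)) / total for j in range(dim)]


def tiled_attention(q, keys, values, block_size=2):
    """Same result, but keys/values are visited block by block with running statistics."""
    scale = 1.0 / math.sqrt(len(q))
    running_max = float("-inf")   # largest score seen so far
    running_sum = 0.0             # softmax denominator, relative to running_max
    acc = [0.0] * len(values[0])  # unnormalized output, relative to running_max
    for start in range(0, len(keys), block_size):
        k_block = keys[start:start + block_size]
        v_block = values[start:start + block_size]
        scores = [scale * sum(a * b for a, b in zip(q, k)) for k in k_block]
        new_max = max(running_max, max(scores))
        # Rescale earlier partial results to the new maximum (factor is 0.0 on the first block).
        correction = math.exp(running_max - new_max)
        running_sum *= correction
        acc = [a * correction for a in acc]
        for s, v in zip(scores, v_block):
            w = math.exp(s - new_max)
            running_sum += w
            acc = [a + w * x for a, x in zip(acc, v)]
        running_max = new_max
    return [a / running_sum for a in acc]


q = [1.0, 0.5]
keys = [[0.1, 0.2], [0.3, -0.4], [1.0, 1.0], [-0.5, 0.7], [0.0, 0.9]]
values = [[1.0, 0.0], [0.0, 1.0], [2.0, 2.0], [-1.0, 0.5], [0.5, 0.5]]

reference = naive_attention(q, keys, values)
tiled = tiled_attention(q, keys, values, block_size=2)
max_error = max(abs(a - b) for a, b in zip(reference, tiled))
print(max_error)
```

The tiled version keeps only a running maximum, a running sum, and one output accumulator per query, which is why the result stays exact while the extra memory no longer grows with the square of the sequence length.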

Downloads: 77 This Week

Last Update: 2026-03-18
See Project
14

CMU Sphinx

Speech Recognition Toolkit

Maintenance and improvement work has moved to https://cmusphinx.github.io/. Please go there for the most recent software and documentation. CMUSphinx is a speaker-independent, large-vocabulary continuous speech recognizer released under a BSD-style license. It is also a collection of open source tools and resources that allow researchers and developers to build speech recognition systems.

58 Reviews

Downloads: 325 This Week

Last Update: 2024-01-11
See Project
15

GFPGAN

GFPGAN aims at developing Practical Algorithms

GFPGAN aims at developing a practical algorithm for real-world face restoration. It leverages the rich and diverse priors encapsulated in a pretrained face GAN (e.g., StyleGAN2) for blind face restoration. Demos are available on Colab (including a separate Colab demo for the original paper model), Hugging Face (returns only the cropped face), Replicate.ai (may require sign-in; returns the whole image), and Baseten.co (backed by GPU; returns the whole image). A clean version of GFPGAN is also provided that runs without CUDA extensions, so it can run on Windows or in CPU mode. The V1.3 model produces more natural restoration results and better results on very low-quality or high-quality inputs.

Downloads: 72 This Week

Last Update: 2022-09-16
See Project
16

Handy STT

A free, open source, and extensible speech-to-text application

Handy is a free, open-source, offline speech-to-text application built for privacy, accessibility, and extensibility. Developed using Tauri (Rust + React/TypeScript), it runs natively across Windows, macOS, and Linux while performing local speech recognition without sending any audio to cloud servers. Handy allows users to start transcription instantly using a configurable keyboard shortcut—press to record, release to transcribe—and automatically pastes the resulting text into any active text field. Its backend leverages OpenAI’s Whisper models for GPU-accelerated speech recognition and Parakeet V3 for efficient CPU-only transcription with automatic language detection. To further refine accuracy and responsiveness, Handy integrates Silero’s Voice Activity Detection (VAD) for silence filtering, ensuring only speech segments are processed.

Downloads: 72 This Week

Last Update: 2026-04-27
See Project
17

OpenClaw Installer

ClawdBot one-click deployment tool

OpenClaw Installer is an open-source one-click deployment and configuration tool for installing OpenClaw — a personal AI assistant — onto systems with minimal manual setup, giving users a streamlined path to get their own AI assistant running quickly. The project provides shell scripts and configuration menus that detect the host environment, install dependencies, download OpenClaw, configure core settings like AI models and identity channels, and start the server automatically. It supports multiple platforms, including macOS, Linux distributions (Ubuntu, Debian, CentOS), and Windows environments via compatible shells, and simplifies otherwise complex installation steps into a guided, terminal-based experience. The tool also includes options to test API connections, validate channel integrations like Telegram or Discord bots, and launch persistent services that keep OpenClaw running in the background.

Downloads: 71 This Week

Last Update: 2026-03-22
See Project
18

Telegram File Stream Bot

A Telegram bot that gives instant stream links for Telegram files

A Telegram bot that generates direct links for your Telegram files.

Downloads: 71 This Week

Last Update: 2026-02-16
See Project
19

Open Generative AI

Uncensored, open-source alternative to Higgsfield AI

Open Generative AI is a curated collection of resources, tools, and frameworks related to generative AI, covering a wide range of topics from foundational concepts to advanced applications. The repository organizes information about models, libraries, datasets, and learning materials, making it easier for developers to navigate the rapidly evolving AI landscape. It includes references to tools for natural language processing, computer vision, and multimodal systems. The project is designed as a knowledge hub, helping users discover technologies and best practices for building generative AI applications. It is particularly useful for beginners who need a structured overview as well as for experienced developers looking for new tools. The repository is continuously updated to reflect the latest developments in the field. Overall, it serves as a comprehensive guide to the generative AI ecosystem.

Downloads: 68 This Week

Last Update: 4 days ago
See Project
20

Wan2.1

Wan2.1: Open and Advanced Large-Scale Video Generative Model

Wan2.1 is a foundational open-source large-scale video generative model developed by the Wan team, providing high-quality video generation from text and images. It employs advanced diffusion-based architectures to produce coherent, temporally consistent videos with realistic motion and visual fidelity. Wan2.1 focuses on efficient video synthesis while maintaining rich semantic and aesthetic detail, enabling applications in content creation, entertainment, and research. The model supports text-to-video and image-to-video generation tasks with flexible resolution options suitable for various GPU hardware configurations. Wan2.1’s architecture balances generation quality and inference cost, paving the way for later improvements seen in Wan2.2 such as Mixture-of-Experts and enhanced aesthetics. It was trained on large-scale video and image datasets, providing generalization across diverse scenes and motion patterns.

1 Review

Downloads: 68 This Week

Last Update: 2026-03-05
See Project
21

Umbrel

A beautiful personal server OS for Raspberry Pi or any Linux distro

Run your personal server with a Bitcoin and Lightning node in your home, self-host open source apps like Nextcloud and Matrix to break away from big tech, and take full control of your data. For free. All our interactions on the internet today are mediated by a few companies who offer “free” services in exchange for storing our data on their servers to spy on us. Running a personal server fundamentally changes that. You and your family’s photos, videos, files, notes, passwords, everything, have nothing to do with someone else’s computer. They’re a part of your private life, and now they can all be stored by you, in your home, on your Umbrel. The Bitcoin network is made up of thousands of nodes that verify every single transaction in the blockchain. Some of them mine Bitcoin too, but unlike a mining node, running a non-mining node doesn’t require expensive hardware. Achieve unparalleled privacy by connecting your wallet directly to the Bitcoin node on your Umbrel.

Downloads: 66 This Week

Last Update: 7 days ago
See Project
22

RisuAI

Make your own story. User-friendly software for LLM roleplaying

RisuAI (or Risu) is a cross-platform AI roleplay chat application, available as both a desktop and web solution, that offers creative story-building and character interaction with support for multiple APIs, in-chat assets, regex capabilities, and more. It supports OpenAI, Claude, Gemini, DeepInfra, Ooba, OpenRouter, and more. It can display an image of the current character according to the character's expression, and it can modify the model's output with regex to build a custom GUI and other customizations.

Downloads: 65 This Week

Last Update: 2026-04-18
See Project
23

YOLOv3

Object detection architectures and models pretrained on the COCO data

Fast, precise, and easy to train, YOLOv5 has a long and successful history of real-time object detection. Treat YOLOv5 as a university where you'll feed your model information for it to learn from and grow into one integrated tool. You can get started with fewer than 6 lines of code using YOLOv5 and its PyTorch implementation. Have a go using our API by uploading your own image and watch as YOLOv5 identifies objects using our pretrained models. Start training your model without being an expert. Students love YOLOv5 for its simplicity, and there are many quickstart examples to get you started within seconds. Export and deploy your YOLOv5 model with just 1 line of code. There are also plenty of quickstart guides and tutorials available to get your model where it needs to be. Create state-of-the-art deep learning models with YOLOv5.

Downloads: 64 This Week

Last Update: 2022-08-01
See Project
24

Hands-On Large Language Models

Official code repo for the O'Reilly Book

Hands-On-Large-Language-Models is the official GitHub code repository accompanying the book Hands-On Large Language Models by Jay Alammar and Maarten Grootendorst. It provides example notebooks, code labs, and supporting materials that illustrate the core concepts and real-world applications of large language models. The repository is structured into chapters that follow the progression of the book, from foundational topics like tokens, embeddings, and transformer architecture to advanced techniques such as prompt engineering, semantic search, retrieval-augmented generation (RAG), multimodal LLMs, and fine-tuning. Each chapter contains executable Jupyter notebooks designed to run in environments like Google Colab, making it easy for learners to experiment interactively with models, visualize attention patterns, and implement classification and generation tasks.

Downloads: 63 This Week

Last Update: 2026-04-24
See Project
25

OpenKM Document Management - DMS

Document Management System and Content Management System

OpenKM Community Edition is a free Document Management System (DMS) that helps businesses control the production, storage, management and distribution of electronic documents, boosting effectiveness and productivity. It integrates document management, collaboration and advanced search into one easy-to-use solution, including administration tools for user roles, access control, security levels, activity logs and automation setup. With OpenKM Community Edition you can: Collect information from any digital source. Collaborate with colleagues on documents and projects. Capitalize on accumulated knowledge by locating documents and information sources. Control business processes with an embedded workflow engine. Automate tasks. For a complete feature list visit: http://goo.gl/au8cQy

32 Reviews

Downloads: 300 This Week

Last Update: 2026-04-17
See Project