
About

PyGPT is an open source personal desktop AI assistant for Linux, Windows, and Mac, written in Python. It works similarly to ChatGPT, but locally on a desktop computer, with chat, vision, agents, image and video generation, tools, voice control, and more. PyGPT supports multiple models, including OpenAI GPT-5, GPT-4, o1, o3, o4, Google Gemini, Anthropic Claude, xAI Grok, Perplexity Sonar, DeepSeek, Mistral AI, and models accessible through Ollama and LlamaIndex. It offers 12 modes of operation: chat, chat with files, realtime + audio, research, completion, image and video generation, vision, assistants, experts, computer use, agents, and autonomous mode. Users can chat with their own files and data using integrated LlamaIndex support. PyGPT includes built-in vector database support, automated file and data embedding, full conversation context, short- and long-term memory, internet access through Google, Microsoft Bing, and DuckDuckGo, plus speech synthesis and recognition.

About

Vision Agents is an open source Python framework for building low-latency voice and video AI agents with any model. It lets developers plug in LLM, speech, and vision models from more than 25 providers and ship real-time agents for telehealth, voice support, live coaching, video analysis, interactive avatars, security monitoring, sports commentary, and other multimodal applications. It is designed to help teams build agents that can listen, speak, see, process media, call tools, and respond in real time while running on Stream’s global edge network with sub-500ms latency. Developers can build a first agent in minutes using a small Python setup with Gemini Realtime, OpenAI, Deepgram, ElevenLabs, Stream, or other supported providers. Vision Agents supports both real-time speech-to-speech models and custom STT/LLM/TTS pipelines, giving teams either the fastest path to a working voice agent or full control over speech recognition, language reasoning, and text-to-speech.

Platforms Supported

Windows
Mac
Linux
Cloud
On-Premises
iPhone
iPad
Android
Chromebook

Platforms Supported

Windows
Mac
Linux
Cloud
On-Premises
iPhone
iPad
Android
Chromebook

Audience

Power users and developers seeking a local desktop AI assistant that connects models, files, tools, agents, voice, and automation in one configurable workspace

Audience

AI product engineers and developer teams who need a tool to build real-time voice, video, camera-aware, and multimodal agents with swappable model providers

Support

Phone Support
24/7 Live Support
Online

Support

Phone Support
24/7 Live Support
Online

API

Offers API

API

Offers API

Pricing

Free
Free Version
Free Trial

Pricing

Free
Free Version
Free Trial

Reviews/Ratings

Overall 0.0 / 5
ease 0.0 / 5
features 0.0 / 5
design 0.0 / 5
support 0.0 / 5

Reviews/Ratings

Overall 0.0 / 5
ease 0.0 / 5
features 0.0 / 5
design 0.0 / 5
support 0.0 / 5

Training

Documentation
Webinars
Live Online
In Person

Training

Documentation
Webinars
Live Online
In Person

Company Information

PyGPT
Founded: 2026
Poland
pygpt.net

Company Information

Stream
United States
visionagents.ai/

Alternatives

Jan
Jan.ai

Alternatives

ElevenAgents
ElevenLabs

Merlin AI
Foyer

Integrations

ElevenLabs
GPT-5
Grok
OpenAI
Python
Amazon Nova
Anthropic
Baseten
DALL·E 2
Docker
Fish Audio
Gemini
Moondream
OpenAI o4-mini
Prometheus
Roboflow
Slack
Sonar
Voxtral
gpt-oss-120b

Integrations

ElevenLabs
GPT-5
Grok
OpenAI
Python
Amazon Nova
Anthropic
Baseten
DALL·E 2
Docker
Fish Audio
Gemini
Moondream
OpenAI o4-mini
Prometheus
Roboflow
Slack
Sonar
Voxtral
gpt-oss-120b