
About

HunyuanVision is a vision-language model developed by Tencent’s Hunyuan team. It uses a hybrid Mamba-Transformer architecture aimed at strong performance and efficient inference on multimodal reasoning tasks. The Hunyuan-Vision-1.5 release is designed for “thinking on images”: beyond understanding combined visual and language content, it can reason by manipulating or reflecting on the image input, for example by cropping, zooming, pointing, drawing boxes, or drawing on the image to acquire additional information. It supports a range of vision tasks (image and video recognition, OCR, diagram understanding), visual reasoning, and 3D spatial comprehension within a single multilingual framework. Tencent intends to open-source the model, including checkpoints, a technical report, and inference support, to encourage community experimentation and adoption.

About

Molmo 2 is a suite of open vision-language models released with fully open weights, training data, and training code. It extends the original Molmo family’s grounded image understanding to video and multi-image inputs, enabling video understanding, pointing, tracking, dense captioning, and question answering, all with strong spatial and temporal reasoning across frames. Molmo 2 includes three variants: an 8-billion-parameter model optimized for overall video grounding and QA, a 4-billion-parameter version designed for efficiency, and a 7-billion-parameter Olmo-backed model that offers a fully open end-to-end architecture, including the underlying language model. These models outperform earlier Molmo versions on core benchmarks and set new high-water marks among open models for image and video understanding, often competing with substantially larger proprietary systems while training on a fraction of the data used by comparable closed models.

Platforms Supported

Windows
Mac
Linux
Cloud
On-Premises
iPhone
iPad
Android
Chromebook

Platforms Supported

Windows
Mac
Linux
Cloud
On-Premises
iPhone
iPad
Android
Chromebook

Audience

AI researchers, developers, and teams interested in a solution offering multimodal understanding and reasoning across languages

Audience

Researchers, developers, and AI practitioners who need an open, state-of-the-art video and multi-image understanding model for grounded vision, tracking, and reasoning tasks

Support

Phone Support
24/7 Live Support
Online

Support

Phone Support
24/7 Live Support
Online

API

Offers API

API

Offers API

Pricing

Free
Free Version
Free Trial

Pricing

No information available.
Free Version
Free Trial

Reviews/Ratings

Overall 0.0 / 5
ease 0.0 / 5
features 0.0 / 5
design 0.0 / 5
support 0.0 / 5

Reviews/Ratings

Overall 0.0 / 5
ease 0.0 / 5
features 0.0 / 5
design 0.0 / 5
support 0.0 / 5

Training

Documentation
Webinars
Live Online
In Person

Training

Documentation
Webinars
Live Online
In Person

Company Information

Tencent
Founded: 1998
China
github.com/Tencent-Hunyuan/HunyuanVision

Company Information

Ai2
Founded: 2014
United States
allenai.org/blog/molmo2

Alternatives

HunyuanOCR (Tencent)

Alternatives

GLM-4.1V (Zhipu AI)
Hunyuan T1 (Tencent)
Pixtral Large (Mistral AI)
GLM-4.1V (Zhipu AI)
Devstral 2 (Mistral AI)
Qwen3-VL (Alibaba)
Phi-2 (Microsoft)

Integrations

Ai2 OLMoE
Bluesky
Hugging Face
HunyuanOCR
ImagineX
Olmo 2
Threads

Integrations

Ai2 OLMoE
Bluesky
Hugging Face
HunyuanOCR
ImagineX
Olmo 2
Threads