Holo2H Company
|
Hunyuan-Vision-1.5Tencent
|
|||||
Related Products
|
||||||
About
H Company’s Holo2 model family delivers cost-efficient, high-performance vision-language models tailored for computer-use agents that navigate, localize UI elements, and act across web, desktop, and mobile environments. The series, available in 4 B, 8 B, and 30 B-A3B sizes, builds on their earlier Holo1 and Holo1.5 models, retaining strong UI grounding while significantly enhancing navigation capabilities. Holo2 models use a mixture-of-experts (MoE) architecture, activating only necessary parameters, to optimize efficiency. Trained on curated localization and agent datasets, they can be deployed as drop-in replacements for their predecessors. They support seamless inference in frameworks compatible with Qwen3-VL models and can be integrated into agentic pipelines like Surfer 2. In benchmark testing, Holo2-30B-A3B achieved 66.1% accuracy on ScreenSpot-Pro and 76.1% on OSWorld-G, leading the UI localization category.
|
About
HunyuanVision is a cutting-edge vision-language model developed by Tencent’s Hunyuan team. It uses a mamba-transformer hybrid architecture to deliver strong performance and efficient inference in multimodal reasoning tasks. The version Hunyuan-Vision-1.5 is designed for “thinking on images,” meaning it not only understands vision+language content, but can perform deeper reasoning that involves manipulating or reflecting on image inputs, such as cropping, zooming, pointing, box drawing, or drawing on the image to acquire additional knowledge. It supports a variety of vision tasks (image + video recognition, OCR, diagram understanding), visual reasoning, and even 3D spatial comprehension, all in a unified multilingual framework. The model is built to work seamlessly across languages and tasks and is intended to be open sourced (including checkpoints, technical report, inference support) to encourage the community to experiment and adopt.
|
|||||
Platforms Supported
Windows
Mac
Linux
Cloud
On-Premises
iPhone
iPad
Android
Chromebook
|
Platforms Supported
Windows
Mac
Linux
Cloud
On-Premises
iPhone
iPad
Android
Chromebook
|
|||||
Audience
AI developers and enterprise teams interested in a solution to build autonomous agents that interact with user-interfaces and need efficient, high‐capability foundation models
|
Audience
AI researchers, developers, and teams interested in a solution offering multimodal understanding and reasoning across languages
|
|||||
Support
Phone Support
24/7 Live Support
Online
|
Support
Phone Support
24/7 Live Support
Online
|
|||||
API
Offers API
|
API
Offers API
|
|||||
Screenshots and Videos |
Screenshots and Videos |
|||||
Pricing
No information available.
Free Version
Free Trial
|
Pricing
Free
Free Version
Free Trial
|
|||||
Reviews/
|
Reviews/
|
|||||
Training
Documentation
Webinars
Live Online
In Person
|
Training
Documentation
Webinars
Live Online
In Person
|
|||||
Company InformationH Company
Founded: 2023
France
www.hcompany.ai/blog/holo2
|
Company InformationTencent
Founded: 1998
China
github.com/Tencent-Hunyuan/HunyuanVision
|
|||||
Alternatives |
Alternatives |
|||||
|
|
||||||
|
|
|
|||||
|
|
|
|||||
|
|
||||||
Categories |
Categories |
|||||
|
|
|