HunyuanOCR Integrations

4 Integrations with HunyuanOCR

View a list of HunyuanOCR integrations and software that integrates with HunyuanOCR below. Compare the best HunyuanOCR integrations as well as features, ratings, user reviews, and pricing of software that integrates with HunyuanOCR. Here are the current HunyuanOCR integrations in 2026:

1

GitHub

GitHub

GitHub is the world’s most secure, most scalable, and most loved developer platform. Join millions of developers and businesses building the software that powers the world. Build with the world’s most innovative communities, backed by our best tools, support, and services. If you manage multiple contributors, there’s a free option: GitHub Team for Open Source. We also run GitHub Sponsors, where we help fund your work. The Pack is back. We’ve partnered up to give students and teachers free access to the best developer tools, for the school year and beyond. Work for a government-recognized nonprofit, association, or 501(c)(3)? Get a discounted Organization account on us.

21 Ratings

Starting Price: $7 per month

2

Hugging Face

Hugging Face

Hugging Face is a leading platform for AI and machine learning, offering a vast hub for models, datasets, and tools for natural language processing (NLP) and beyond. The platform supports a wide range of applications, from text, image, and audio to 3D data analysis. Hugging Face fosters collaboration among researchers, developers, and companies by providing open-source tools like Transformers, Diffusers, and Tokenizers. It enables users to build, share, and access pre-trained models, accelerating AI development for a variety of industries.

Starting Price: $9 per month

3

Hunyuan-Vision-1.5

Tencent

HunyuanVision is a vision-language model developed by Tencent’s Hunyuan team. It uses a mamba-transformer hybrid architecture to deliver strong performance and efficient inference in multimodal reasoning tasks. Hunyuan-Vision-1.5 is designed for “thinking on images”: it not only understands combined vision and language content, but can also reason more deeply by manipulating or reflecting on image inputs, for example by cropping, zooming, pointing, drawing boxes, or drawing on the image to acquire additional knowledge. It supports a variety of vision tasks (image and video recognition, OCR, diagram understanding), visual reasoning, and 3D spatial comprehension, all in a unified multilingual framework. The model is built to work across languages and tasks and is intended to be open sourced (including checkpoints, technical report, and inference support) to encourage the community to experiment with and adopt it.

Starting Price: Free

4

arXiv

arXiv

arXiv is a free distribution service and an open-access archive for 2,228,103 scholarly articles in the fields of physics, mathematics, computer science, quantitative biology, quantitative finance, statistics, electrical engineering and systems science, and economics. Materials on this site are not peer-reviewed by arXiv. arXiv provides an article submission portal, a TeX compilation service, search and discovery tools, web distribution for human readers, API access, machine-readable data sets, and community-developed tools. Our emphasis on openness, collaboration, and scholarship provides the strong foundation on which arXiv thrives. The foundation of arXiv is based on open access, transparency, open mindedness, collaboration, and flexibility. Our institutional members, collaborators, moderators, authors, and readers are not passive recipients: they are arXiv.

Starting Price: Free
