Reasoning-powered OCR VLM for converting complex documents to Markdown
Multimodal Transformer for document image understanding and layout
Multimodal 7B model for image, video, and text understanding tasks
Versatile 8B-base multimodal LLM, flexible foundation for custom AI
Lightweight multimodal translation model for 55 languages
Summarization model fine-tuned on CNN/DailyMail articles
Qwen3-Next: 80B instruct LLM with ultra-long context up to 1M tokens
Efficient 13B MoE language model with long context and reasoning modes
Small 3B-base multimodal model ideal for custom AI on edge hardware
Efficient 8B multimodal model tuned for advanced reasoning tasks
VaultGemma: 1B DP-trained Gemma variant for private NLP tasks
Powerful 14B-base multimodal model, flexible base for fine-tuning