CodeQwen
CodeQwen is the code version of Qwen, the large language model series developed by the Qwen team, Alibaba Cloud. It is a transformer-based decoder-only language model pre-trained on a large amount of data of codes. Strong code generation capabilities and competitive performance across a series of benchmarks. Supporting long context understanding and generation with the context length of 64K tokens. CodeQwen supports 92 coding languages and provides excellent performance in text-to-SQL, bug fixes, etc. You can just write several lines of code with transformers to chat with CodeQwen. Essentially, we build the tokenizer and the model from pre-trained methods, and we use the generate method to perform chatting with the help of the chat template provided by the tokenizer. We apply the ChatML template for chat models following our previous practice. The model completes the code snippets according to the given prompts, without any additional formatting.
Learn more
Yi-Large
Yi-Large is a proprietary large language model developed by 01.AI, offering a 32k context length with both input and output costs at $2 per million tokens. It stands out with its advanced capabilities in natural language processing, common-sense reasoning, and multilingual support, performing on par with leading models like GPT-4 and Claude3 in various benchmarks. Yi-Large is designed for tasks requiring complex inference, prediction, and language understanding, making it suitable for applications like knowledge search, data classification, and creating human-like chatbots. Its architecture is based on a decoder-only transformer with enhancements such as pre-normalization and Group Query Attention, and it has been trained on a vast, high-quality multilingual dataset. This model's versatility and cost-efficiency make it a strong contender in the AI market, particularly for enterprises aiming to deploy AI solutions globally.
Learn more
Pixtral Large
Pixtral Large is a 124-billion-parameter open-weight multimodal model developed by Mistral AI, building upon their Mistral Large 2 architecture. It integrates a 123-billion-parameter multimodal decoder with a 1-billion-parameter vision encoder, enabling advanced understanding of documents, charts, and natural images while maintaining leading text comprehension capabilities. With a context window of 128,000 tokens, Pixtral Large can process at least 30 high-resolution images simultaneously. The model has demonstrated state-of-the-art performance on benchmarks such as MathVista, DocVQA, and VQAv2, surpassing models like GPT-4o and Gemini-1.5 Pro. Pixtral Large is available under the Mistral Research License for research and educational use, and under the Mistral Commercial License for commercial applications.
Learn more
Phi-4-mini-flash-reasoning
Phi-4-mini-flash-reasoning is a 3.8 billion‑parameter open model in Microsoft’s Phi family, purpose‑built for edge, mobile, and other resource‑constrained environments where compute, memory, and latency are tightly limited. It introduces the SambaY decoder‑hybrid‑decoder architecture with Gated Memory Units (GMUs) interleaved alongside Mamba state‑space and sliding‑window attention layers, delivering up to 10× higher throughput and a 2–3× reduction in latency compared to its predecessor without sacrificing advanced math and logic reasoning performance. Supporting a 64 K‑token context length and fine‑tuned on high‑quality synthetic data, it excels at long‑context retrieval, reasoning tasks, and real‑time inference, all deployable on a single GPU. Phi-4-mini-flash-reasoning is available today via Azure AI Foundry, NVIDIA API Catalog, and Hugging Face, enabling developers to build fast, scalable, logic‑intensive applications.
Learn more