voyage-code-3
Voyage AI introduces voyage-code-3, a next-generation embedding model optimized for code retrieval. It outperforms OpenAI-v3-large and CodeSage-large by an average of 13.80% and 16.81% on a suite of 32 code retrieval datasets, respectively. It supports embeddings of 2048, 1024, 512, and 256 dimensions and offers multiple embedding quantization options, including float (32-bit), int8 (8-bit signed integer), uint8 (8-bit unsigned integer), binary (bit-packed int8), and ubinary (bit-packed uint8). With a 32 K-token context length, it surpasses OpenAI's 8K and CodeSage Large's 1K context lengths. Voyage-code-3 employs Matryoshka learning to create embeddings with a nested family of various lengths within a single vector. This allows users to vectorize documents into a 2048-dimensional vector and later use shorter versions (e.g., 256, 512, or 1024 dimensions) without re-invoking the embedding model.
Learn more
Mistral AI
Mistral AI is a pioneering artificial intelligence startup specializing in open-source generative AI. The company offers a range of customizable, enterprise-grade AI solutions deployable across various platforms, including on-premises, cloud, edge, and devices. Flagship products include "Le Chat," a multilingual AI assistant designed to enhance productivity in both personal and professional contexts, and "La Plateforme," a developer platform that enables the creation and deployment of AI-powered applications. Committed to transparency and innovation, Mistral AI positions itself as a leading independent AI lab, contributing significantly to open-source AI and policy development.
Learn more
Azure AI Search
Deliver high-quality responses with a vector database built for advanced retrieval augmented generation (RAG) and modern search. Focus on exponential growth with an enterprise-ready vector database that comes with security, compliance, and responsible AI practices built in. Build better applications with sophisticated retrieval strategies backed by decades of research and customer validation. Quickly deploy your generative AI app with seamless platform and data integrations for data sources, AI models, and frameworks. Automatically upload data from a wide range of supported Azure and third-party sources. Streamline vector data processing with built-in extraction, chunking, enrichment, and vectorization, all in one flow. Support for multivector, hybrid, multilingual, and metadata filtering. Move beyond vector-only search with keyword match scoring, reranking, geospatial search, and autocomplete.
Learn more
voyage-3-large
Voyage AI has unveiled voyage-3-large, a cutting-edge general-purpose and multilingual embedding model that leads across eight evaluated domains, including law, finance, and code, outperforming OpenAI-v3-large and Cohere-v3-English by averages of 9.74% and 20.71%, respectively. Enabled by Matryoshka learning and quantization-aware training, it supports embeddings of 2048, 1024, 512, and 256 dimensions, along with multiple quantization options such as 32-bit floating point, signed and unsigned 8-bit integer, and binary precision, significantly reducing vector database costs with minimal impact on retrieval quality. Notably, voyage-3-large offers a 32K-token context length, surpassing OpenAI's 8K and Cohere's 512 tokens. Evaluations across 100 datasets in diverse domains demonstrate its superior performance, with flexible precision and dimensionality options enabling substantial storage savings without compromising quality.
Learn more