Best Large Language Models - Page 7

Compare the Top Large Language Models as of August 2025 - Page 7

  • 1
    LUIS

    LUIS

    Microsoft

    Language Understanding (LUIS): A machine learning-based service to build natural language into apps, bots, and IoT devices. Quickly create enterprise-ready, custom models that continuously improve. Add natural language to your apps. Designed to identify valuable information in conversations, LUIS interprets user goals (intents) and distills valuable information from sentences (entities), for a high quality, nuanced language model. LUIS integrates seamlessly with the Azure Bot Service, making it easy to create a sophisticated bot. Powerful developer tools are combined with customizable pre-built apps and entity dictionaries, such as Calendar, Music, and Devices, so you can build and deploy a solution more quickly. Dictionaries are mined from the collective knowledge of the web and supply billions of entries, helping your model to correctly identify valuable information from user conversations. Active learning is used to continuously improve the quality of the models.
  • 2
    Sparrow

    Sparrow

    DeepMind

    Sparrow is a research model and proof of concept, designed with the goal of training dialogue agents to be more helpful, correct, and harmless. By learning these qualities in a general dialogue setting, Sparrow advances our understanding of how we can train agents to be safer and more useful – and ultimately, to help build safer and more useful artificial general intelligence (AGI). Sparrow is not yet available for public use. Training a conversational AI is an especially challenging problem because it’s difficult to pinpoint what makes a dialogue successful. To address this problem, we turn to a form of reinforcement learning (RL) based on people's feedback, using the study participants’ preference feedback to train a model of how useful an answer is. To get this data, we show our participants multiple model answers to the same question and ask them which answer they like the most.
  • 3
    NVIDIA NeMo
    NVIDIA NeMo LLM is a service that provides a fast path to customizing and using large language models trained on several frameworks. Developers can deploy enterprise AI applications using NeMo LLM on private and public clouds. They can also experience Megatron 530B—one of the largest language models—through the cloud API or experiment via the LLM service. Customize your choice of various NVIDIA or community-developed models that work best for your AI applications. Within minutes to hours, get better responses by providing context for specific use cases using prompt learning techniques. Leverage the power of NVIDIA Megatron 530B, one of the largest language models, through the NeMo LLM Service or the cloud API. Take advantage of models for drug discovery, including in the cloud API and NVIDIA BioNeMo framework.
  • 4
    ERNIE Bot
    ERNIE Bot is an AI-powered conversational assistant developed by Baidu, designed to facilitate seamless and natural interactions with users. Built on the ERNIE (Enhanced Representation through Knowledge Integration) model, ERNIE Bot excels at understanding complex queries and generating human-like responses across various domains. Its capabilities include processing text, generating images, and engaging in multimodal communication, making it suitable for a wide range of applications such as customer support, virtual assistants, and enterprise automation. With its advanced contextual understanding, ERNIE Bot offers an intuitive and efficient solution for businesses seeking to enhance their digital interactions and automate workflows.
    Starting Price: Free
  • 5
    PaLM

    PaLM

    Google

    PaLM API is an easy and safe way to build on top of our best language models. Today, we’re making an efficient model available, in terms of size and capabilities, and we’ll add other sizes soon. The API also comes with an intuitive tool called MakerSuite, which lets you quickly prototype ideas and, over time, will have features for prompt engineering, synthetic data generation and custom-model tuning — all supported by robust safety tools. Select developers can access the PaLM API and MakerSuite in Private Preview today, and stay tuned for our waitlist soon.
  • 6
    Med-PaLM 2

    Med-PaLM 2

    Google Cloud

    Healthcare breakthroughs change the world and bring hope to humanity through scientific rigor, human insight, and compassion. We believe AI can contribute to this, with thoughtful collaboration between researchers, healthcare organizations, and the broader ecosystem. Today, we're sharing exciting progress on these initiatives, with the announcement of limited access to Google’s medical large language model, or LLM, called Med-PaLM 2. It will be available in the coming weeks to a select group of Google Cloud customers for limited testing, to explore use cases and share feedback as we investigate safe, responsible, and meaningful ways to use this technology. Med-PaLM 2 harnesses the power of Google’s LLMs, aligned to the medical domain to more accurately and safely answer medical questions. As a result, Med-PaLM 2 was the first LLM to perform at an “expert” test-taker level performance on the MedQA dataset of US Medical Licensing Examination (USMLE)-style questions.
  • 7
    Gopher

    Gopher

    DeepMind

    Language, and its role in demonstrating and facilitating comprehension - or intelligence - is a fundamental part of being human. It gives people the ability to communicate thoughts and concepts, express ideas, create memories, and build mutual understanding. These are foundational parts of social intelligence. It’s why our teams at DeepMind study aspects of language processing and communication, both in artificial agents and in humans. As part of a broader portfolio of AI research, we believe the development and study of more powerful language models – systems that predict and generate text – have tremendous potential for building advanced AI systems that can be used safely and efficiently to summarise information, provide expert advice and follow instructions via natural language. Developing beneficial language models requires research into their potential impacts, including the risks they pose.
  • 8
    PaLM 2

    PaLM 2

    Google

    PaLM 2 is our next generation large language model that builds on Google’s legacy of breakthrough research in machine learning and responsible AI. It excels at advanced reasoning tasks, including code and math, classification and question answering, translation and multilingual proficiency, and natural language generation better than our previous state-of-the-art LLMs, including PaLM. It can accomplish these tasks because of the way it was built – bringing together compute-optimal scaling, an improved dataset mixture, and model architecture improvements. PaLM 2 is grounded in Google’s approach to building and deploying AI responsibly. It was evaluated rigorously for its potential harms and biases, capabilities and downstream uses in research and in-product applications. It’s being used in other state-of-the-art models, like Med-PaLM 2 and Sec-PaLM, and is powering generative AI features and tools at Google, like Bard and the PaLM API.
  • 9
    Hippocratic AI

    Hippocratic AI

    Hippocratic AI

    Hippocratic AI is the new state of the art (SOTA) model, outperforming GPT-4 on 105 of 114 healthcare exams and certifications. Hippocratic AI has outperformed GPT-4 on 105 out of 114 tests and certifications, outperformed by a margin of five percent or more on 74 of the certifications, and outperformed by a margin of ten percent or more on 43 of the certifications. Most language models pre-train on the common crawl of the Internet, which may include incorrect and misleading information. Unlike these LLMs, Hippocratic AI is investing heavily in legally acquiring evidence-based healthcare content. We’re conducting a unique Reinforcement Learning with Human Feedback process using healthcare professionals to train and validate the model’s readiness for deployment. We call this RLHF-HP. Hippocratic AI will not release the model until a large number of these licensed professionals deem it safe.
  • 10
    YandexGPT
    Take advantage of the capabilities of generative language models to improve and optimize your applications and web services. Get an aggregated result of accumulated textual data whether it be information from work chats, user reviews, or other types of data. YandexGPT will help both summarize and interpret the information. Speed up text creation as you improve their quality and style. Create template texts for newsletters, product descriptions for online stores and other applications. Develop a chatbot for your support service: teach the bot to answer various user questions, both common and more complicated. Use the API to integrate the service with your applications and automate processes.
  • 11
    Ntropy

    Ntropy

    Ntropy

    Ship faster integrating with our Python SDK or Rest API in minutes. No prior setups or data formatting. You can get going straight away as soon as you have incoming data and your first customers. We have built and fine-tuned custom language models to recognize entities, automatically crawl the web in real-time and pick the best match, as well as assign labels with superhuman accuracy in a fraction of the time. Everybody has a data enrichment model that is trying to be good at one thing, US or Europe, business or consumer. These models are poor at generalizing and are not capable of human-level output. With us, you can leverage the power of the world's largest and most performant models embedded in your products, at a fraction of cost and time.
  • 12
    Giga ML

    Giga ML

    Giga ML

    We just launched X1 large series of Models. Giga ML's most powerful model is available for pre-training and fine-tuning with on-prem deployment. Since we are Open AI compatible, your existing integrations with long chain, llama-index, and all others work seamlessly. You can continue pre-training of LLM's with domain-specific data books or docs or company docs. The world of large language models (LLMs) rapidly expanding, offering unprecedented opportunities for natural language processing across various domains. However, some critical challenges have remained unaddressed. At Giga ML, we proudly introduce the X1 Large 32k model, a pioneering on-premise LLM solution that addresses these critical issues.
  • 13
    Martian

    Martian

    Martian

    By using the best-performing model for each request, we can achieve higher performance than any single model. Martian outperforms GPT-4 across OpenAI's evals (open/evals). We turn opaque black boxes into interpretable representations. Our router is the first tool built on top of our model mapping method. We are developing many other applications of model mapping including turning transformers from indecipherable matrices into human-readable programs. If a company experiences an outage or high latency period, automatically reroute to other providers so your customers never experience any issues. Determine how much you could save by using the Martian Model Router with our interactive cost calculator. Input your number of users, tokens per session, and sessions per month, and specify your cost/quality tradeoff.
  • 14
    Phi-2

    Phi-2

    Microsoft

    We are now releasing Phi-2, a 2.7 billion-parameter language model that demonstrates outstanding reasoning and language understanding capabilities, showcasing state-of-the-art performance among base language models with less than 13 billion parameters. On complex benchmarks Phi-2 matches or outperforms models up to 25x larger, thanks to new innovations in model scaling and training data curation. With its compact size, Phi-2 is an ideal playground for researchers, including for exploration around mechanistic interpretability, safety improvements, or fine-tuning experimentation on a variety of tasks. We have made Phi-2 available in the Azure AI Studio model catalog to foster research and development on language models.
  • 15
    Hyperplane

    Hyperplane

    Hyperplane

    Better audiences from the richness of transaction data. Create nuanced personas and effective marketing campaigns based on financial behaviors and consumer interests. Increase user limits, without worrying about default. Leverage user income estimates that are precise and always up-to-date. The Hyperplane platform enables financial institutions to launch personalized consumer experiences through specialized foundation models (LLMs). Upgrade your feature sets with embeddings for credit, collections, and lookalike modeling. Segment users based on various criteria, enabling you to target specific audience groups for personalized marketing campaigns, content delivery, and user analysis. Segmentation is achieved through facets, which are key attributes or characteristics used to categorize users, Hyperplane offers the capability to enrich user segmentation by employing additional attributes to fine-tune the filtering of responses from certain audience segmentation endpoints.
  • 16
    Smaug-72B
    Smaug-72B is a powerful open-source large language model (LLM) known for several key features: High Performance: It currently holds the top spot on the Hugging Face Open LLM leaderboard, surpassing models like GPT-3.5 in various benchmarks. This means it excels at tasks like understanding, responding to, and generating human-like text. Open Source: Unlike many other advanced LLMs, Smaug-72B is freely available for anyone to use and modify, fostering collaboration and innovation in the AI community. Focus on Reasoning and Math: It specifically shines in handling reasoning and mathematical tasks, attributing this strength to unique fine-tuning techniques developed by Abacus AI, the creators of Smaug-72B. Based on Qwen-72B: It's technically a fine-tuned version of another powerful LLM called Qwen-72B, released by Alibaba, further improving upon its capabilities. Overall, Smaug-72B represents a significant step forward in open-source AI.
    Starting Price: Free
  • 17
    Gemma

    Gemma

    Google

    Gemma is a family of lightweight, state-of-the-art open models built from the same research and technology used to create the Gemini models. Developed by Google DeepMind and other teams across Google, Gemma is inspired by Gemini, and the name reflects the Latin gemma, meaning “precious stone.” Accompanying our model weights, we’re also releasing tools to support developer innovation, foster collaboration, and guide the responsible use of Gemma models. Gemma models share technical and infrastructure components with Gemini, our largest and most capable AI model widely available today. This enables Gemma 2B and 7B to achieve best-in-class performance for their sizes compared to other open models. And Gemma models are capable of running directly on a developer laptop or desktop computer. Notably, Gemma surpasses significantly larger models on key benchmarks while adhering to our rigorous standards for safe and responsible outputs.
  • 18
    Eternity AI

    Eternity AI

    Eternity AI

    Eternity AI is building an HTLM-7B, a machine learning model that knows what the internet is and how to access it to generate responses. Humans don't make decisions based on 2-year-old data. For a model to think like a human, it needs to get access to real-time knowledge and everything about how humans behave. Members of our team have previously published white papers and articles on topics related to on-chain vulnerability coordination, GPT database retrieval, decentralized dispute resolution, etc.
  • 19
    Adept

    Adept

    Adept

    Adept is an ML research and product lab building general intelligence by enabling humans and computers to work together creatively. Designed and trained specifically for taking actions on computers in response to your natural language commands. ACT-1 is our first step towards a foundation model that can use every software tool, API and website that exists. Adept is building an entirely new way to get things done. It takes your goals, in plain language, and turns them into actions on the software you use every day. We believe that AI systems should be built with users at the center — where machines work together with people in the driver's seat, discovering new solutions, enabling more informed decisions, and giving us more time for the work we love.
  • 20
    DBRX

    DBRX

    Databricks

    Today, we are excited to introduce DBRX, an open, general-purpose LLM created by Databricks. Across a range of standard benchmarks, DBRX sets a new state-of-the-art for established open LLMs. Moreover, it provides the open community and enterprises building their own LLMs with capabilities that were previously limited to closed model APIs; according to our measurements, it surpasses GPT-3.5, and it is competitive with Gemini 1.0 Pro. It is an especially capable code model, surpassing specialized models like CodeLLaMA-70B in programming, in addition to its strength as a general-purpose LLM. This state-of-the-art quality comes with marked improvements in training and inference performance. DBRX advances the state-of-the-art in efficiency among open models thanks to its fine-grained mixture-of-experts (MoE) architecture. Inference is up to 2x faster than LLaMA2-70B, and DBRX is about 40% of the size of Grok-1 in terms of both total and active parameter counts.
  • 21
    Claude Haiku 3
    Claude Haiku 3 is the fastest and most affordable model in its intelligence class. With state-of-the-art vision capabilities and strong performance on industry benchmarks, Haiku is a versatile solution for a wide range of enterprise applications. The model is now available alongside Sonnet and Opus in the Claude API and on claude.ai for our Claude Pro subscribers.
  • 22
    Gemma 2

    Gemma 2

    Google

    A family of state-of-the-art, light-open models created from the same research and technology that were used to create Gemini models. These models incorporate comprehensive security measures and help ensure responsible and reliable AI solutions through selected data sets and rigorous adjustments. Gemma models achieve exceptional comparative results in their 2B, 7B, 9B, and 27B sizes, even outperforming some larger open models. With Keras 3.0, enjoy seamless compatibility with JAX, TensorFlow, and PyTorch, allowing you to effortlessly choose and change frameworks based on task. Redesigned to deliver outstanding performance and unmatched efficiency, Gemma 2 is optimized for incredibly fast inference on various hardware. The Gemma family of models offers different models that are optimized for specific use cases and adapt to your needs. Gemma models are large text-to-text lightweight language models with a decoder, trained in a huge set of text data, code, and mathematical content.
  • 23
    Moshi

    Moshi

    Kyutai

    Moshi is an experimental conversational AI. Moshi thinks and speaks at the same time. Moshi can listen and talk at all time: maximum flow between you and Moshi.
    Starting Price: Free
  • 24
    Phi-3

    Phi-3

    Microsoft

    A family of powerful, small language models (SLMs) with groundbreaking performance at low cost and low latency. Maximize AI capabilities, lower resource use, and ensure cost-effective generative AI deployments across your applications. Accelerate response times in real-time interactions, autonomous systems, apps requiring low latency, and other critical scenarios. Run Phi-3 in the cloud, at the edge, or on device, resulting in greater deployment and operation flexibility. Phi-3 models were developed in accordance with Microsoft AI principles: accountability, transparency, fairness, reliability and safety, privacy and security, and inclusiveness. Operate effectively in offline environments where data privacy is paramount or connectivity is limited. Generate more coherent, accurate, and contextually relevant outputs with an expanded context window. Deploy at the edge to deliver faster responses.
  • 25
    NVIDIA Nemotron
    NVIDIA Nemotron is a family of open-source models developed by NVIDIA, designed to generate synthetic data for training large language models (LLMs) for commercial applications. The Nemotron-4 340B model, in particular, is a significant release by NVIDIA, offering developers a powerful tool to generate high-quality data and filter it based on various attributes using a reward model.
  • 26
    Jamba

    Jamba

    AI21 Labs

    Jamba is the most powerful & efficient long context model, open for builders and built for the enterprise. Jamba's latency outperforms all leading models of comparable sizes. Jamba's 256k context window is the longest openly available. Jamba's Mamba-Transformer MoE architecture is designed for cost & efficiency gains. Jamba offers key features of OOTB including function calls, JSON mode output, document objects, and citation mode. Jamba 1.5 models maintain high performance across the full length of their context window. Jamba 1.5 models achieve top scores across common quality benchmarks. Secure deployment that suits your enterprise. Seamlessly start using Jamba on our production-grade SaaS platform. The Jamba model family is available for deployment across our strategic partners. We offer VPC & on-prem deployments for enterprises that require custom solutions. For enterprises that have unique, bespoke requirements, we offer hands-on management, continuous pre-training, etc.
  • 27
    DataGemma
    DataGemma represents a pioneering effort by Google to enhance the accuracy and reliability of large language models (LLMs) when dealing with statistical and numerical data. Launched as a set of open models, DataGemma leverages Google's Data Commons, a vast repository of public statistical data—to ground its responses in real-world facts. This initiative employs two innovative approaches: Retrieval Interleaved Generation (RIG) and Retrieval Augmented Generation (RAG). The RIG method integrates real-time data checks during the generation process to ensure factual accuracy, while RAG retrieves relevant information before generating responses, thereby reducing the likelihood of AI hallucinations. By doing so, DataGemma aims to provide users with more trustworthy and factually grounded answers, marking a significant step towards mitigating the issue of misinformation in AI-generated content.
  • 28
    LFM-40B

    LFM-40B

    Liquid AI

    LFM-40B offers a new balance between model size and output quality. It leverages 12B activated parameters at use. Its performance is comparable to models larger than itself, while its MoE architecture enables higher throughput and deployment on more cost-effective hardware.
  • 29
    LFM-3B

    LFM-3B

    Liquid AI

    LFM-3B delivers incredible performance for its size. It positions itself as first place among 3B parameter transformers, hybrids, and RNN models, but also outperforms the previous generation of 7B and 13B models. It is also on par with Phi-3.5-mini on multiple benchmarks, while being 18.4% smaller. LFM-3B is the ideal choice for mobile and other edge text-based applications.
  • 30
    OLMo 2
    OLMo 2 is a family of fully open language models developed by the Allen Institute for AI (AI2), designed to provide researchers and developers with transparent access to training data, open-source code, reproducible training recipes, and comprehensive evaluations. These models are trained on up to 5 trillion tokens and are competitive with leading open-weight models like Llama 3.1 on English academic benchmarks. OLMo 2 emphasizes training stability, implementing techniques to prevent loss spikes during long training runs, and utilizes staged training interventions during late pretraining to address capability deficiencies. The models incorporate state-of-the-art post-training methodologies from AI2's Tülu 3, resulting in the creation of OLMo 2-Instruct models. An actionable evaluation framework, the Open Language Modeling Evaluation System (OLMES), was established to guide improvements through development stages, consisting of 20 evaluation benchmarks assessing core capabilities.