Alternatives to T5

Compare T5 alternatives for your business or organization using the curated list below. SourceForge ranks the best alternatives to T5 in 2024. Compare features, ratings, user reviews, pricing, and more from T5 competitors and alternatives in order to make an informed decision for your business.

  • 1
    GPT-4

    OpenAI

    GPT-4 (Generative Pre-trained Transformer 4) is a large multimodal language model released by OpenAI in March 2023. GPT-4 is the successor to GPT-3 and part of the GPT-n series of natural language processing models; OpenAI has not published the size of its training dataset. Unlike most earlier NLP models, GPT-4 does not require additional training data for specific tasks. Instead, it can generate text or answer questions using only the prompt it is given as context. GPT-4 can perform a wide variety of tasks without any task-specific training data, such as translation, summarization, question answering, sentiment analysis and more.
    Starting Price: $0.0200 per 1000 tokens
  • 2
    BERT

    Google

    BERT is a large language model and a method of pre-training language representations. Pre-training refers to how BERT is first trained on a large source of text, such as Wikipedia. You can then apply the training results to other Natural Language Processing (NLP) tasks, such as question answering and sentiment analysis. With BERT and AI Platform Training, you can train a variety of NLP models in about 30 minutes.
    Starting Price: Free
  • 3
    RoBERTa
    RoBERTa builds on BERT’s language masking strategy, wherein the system learns to predict intentionally hidden sections of text within otherwise unannotated language examples. RoBERTa, which was implemented in PyTorch, modifies key hyperparameters in BERT, including removing BERT’s next-sentence pretraining objective, and training with much larger mini-batches and learning rates. This allows RoBERTa to improve on the masked language modeling objective compared with BERT and leads to better downstream task performance. We also explore training RoBERTa on an order of magnitude more data than BERT, for a longer amount of time. We used existing unannotated NLP datasets as well as CC-News, a novel set drawn from public news articles.
    Starting Price: Free
  • 4
    Cohere

    Cohere AI

    Build natural language understanding and generation into your product with a few lines of code. The Cohere API provides access to models that read billions of web pages and learn to understand the meaning, sentiment, and intent of the words we use. Use the Cohere API to write human-like text by completing a prompt or filling in blanks. You can write copy, generate code, summarize text, and more. Compute the likelihood of text and retrieve representations from the model. Use the likelihood API to filter text based on chosen categories or selected criteria. With representations, you can train your own downstream models on a wide variety of domain-specific natural language tasks. The Cohere API can compute the similarity between pieces of text, and make categorical predictions by comparing the likelihood of different text options. The model has multiple lenses through which to view ideas, so that it can recognize abstract similarities between concepts as distinct as DNA and computers.
    Starting Price: $0.40 / 1M Tokens
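    As a rough illustration of the similarity and representation ideas in the Cohere description, the sketch below compares texts with cosine similarity and labels a new text by its nearest labeled example. It does not call the Cohere API: `embed` is a hypothetical stand-in that uses plain word counts where a real system would use model representations, and `classify` is nearest-example matching, not the likelihood comparison the description mentions.

```python
import math
from collections import Counter


def embed(text):
    """Toy stand-in for a model representation: lowercase word counts (assumption)."""
    return Counter(text.lower().split())


def cosine_similarity(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty text is treated as similar to nothing
    return dot / (norm_a * norm_b)


def classify(text, labeled_examples):
    """Return the label of the example most similar to `text`."""
    query = embed(text)
    best = max(labeled_examples, key=lambda ex: cosine_similarity(query, embed(ex[0])))
    return best[1]


examples = [("great film", "positive"), ("terrible film", "negative")]
print(classify("great movie", examples))
```

    Swapping `embed` for real model representations would leave the rest of the sketch unchanged, which is the point of separating the two steps.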
  • 5
    ALBERT

    Google

    ALBERT is a self-supervised Transformer model that was pretrained on a large corpus of English data. This means it does not require manual labelling, and instead uses an automated process to generate inputs and labels from raw texts. It is trained with two distinct objectives in mind. The first is Masked Language Modeling (MLM), which randomly masks 15% of words in the input sentence and requires the model to predict them. This technique differs from RNNs and autoregressive models like GPT as it allows the model to learn bidirectional sentence representations. The second objective is Sentence Ordering Prediction (SOP), which entails predicting the ordering of two consecutive segments of text during pretraining.
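    The two pretraining objectives described for ALBERT can be sketched in a few lines. This is a simplified illustration, not ALBERT's actual data pipeline: the function names, the `[MASK]` string, and the whitespace tokenization are assumptions, and real implementations typically add refinements that this sketch omits.

```python
import random

MASK = "[MASK]"  # placeholder token; the exact string is an assumption


def mask_tokens(tokens, mask_prob=0.15, seed=0):
    """Masked Language Modeling: hide about `mask_prob` of the tokens.

    Returns (masked_tokens, labels), where labels hold the original token
    at masked positions and None elsewhere.
    """
    if not tokens:
        return [], []
    rng = random.Random(seed)
    n_mask = max(1, round(len(tokens) * mask_prob))
    positions = set(rng.sample(range(len(tokens)), n_mask))
    masked = [MASK if i in positions else t for i, t in enumerate(tokens)]
    labels = [t if i in positions else None for i, t in enumerate(tokens)]
    return masked, labels


def sop_example(segment_a, segment_b, swap):
    """Sentence Ordering Prediction: label 1 for original order, 0 if swapped."""
    if swap:
        return (segment_b, segment_a), 0
    return (segment_a, segment_b), 1


tokens = "the model learns to predict words that were hidden from it".split()
masked, labels = mask_tokens(tokens)
print(masked)
print(sop_example("First segment.", "Second segment.", swap=True))
```

    Because the masked positions can fall anywhere in the sentence, the model sees context on both sides of each hidden word, which is what makes the learned representations bidirectional.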
  • 6
    GPT-4o mini
    A small model with superior textual intelligence and multimodal reasoning. GPT-4o mini enables a broad range of tasks with its low cost and latency, such as applications that chain or parallelize multiple model calls (e.g., calling multiple APIs), pass a large volume of context to the model (e.g., full code base or conversation history), or interact with customers through fast, real-time text responses (e.g., customer support chatbots). Today, GPT-4o mini supports text and vision in the API, with support for text, image, video and audio inputs and outputs coming in the future. The model has a context window of 128K tokens, supports up to 16K output tokens per request, and has knowledge up to October 2023. Thanks to the improved tokenizer shared with GPT-4o, handling non-English text is now even more cost effective.
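    The context and output limits quoted for GPT-4o mini translate into a simple budget check before sending a request. The sketch below is illustrative only: it assumes "128K" and "16K" mean 128,000 and 16,000 tokens, and that prompt and output tokens share the same context window. `plan_request` is a hypothetical helper, not part of any OpenAI library.

```python
CONTEXT_WINDOW = 128_000  # "128K tokens" from the description; exact value assumed
MAX_OUTPUT = 16_000       # "16K output tokens" from the description; exact value assumed


def plan_request(prompt_tokens, desired_output_tokens):
    """Return the largest output-token cap that fits within both limits."""
    if prompt_tokens >= CONTEXT_WINDOW:
        raise ValueError("prompt alone exceeds the context window")
    room_left = CONTEXT_WINDOW - prompt_tokens
    return min(desired_output_tokens, MAX_OUTPUT, room_left)


print(plan_request(prompt_tokens=120_000, desired_output_tokens=16_000))
```

    With a 120,000-token prompt, only 8,000 tokens of room remain, so the output cap is set by the context window and not by the per-request output limit.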
  • 7
    ChatGPT

    OpenAI

    ChatGPT is a language model developed by OpenAI. It is a pre-trained model with a transformer architecture, trained on a large and diverse corpus of internet text, which allows it to generate human-like responses to a wide range of prompts. ChatGPT can be used for various natural language processing tasks, such as question answering, conversation, and text generation. In addition to generating text, ChatGPT can also be fine-tuned for specific NLP tasks such as question answering, text classification, and language translation. This allows developers to build NLP applications that perform specific tasks more accurately. ChatGPT can also process and generate code.
    Starting Price: Free
  • 8
    GPT-4 Turbo
    GPT-4 is a large multimodal model (accepting text or image inputs and outputting text) that can solve difficult problems with greater accuracy than any of our previous models, thanks to its broader general knowledge and advanced reasoning capabilities. GPT-4 is available in the OpenAI API to paying customers. Like gpt-3.5-turbo, GPT-4 is optimized for chat but works well for traditional completions tasks using the Chat Completions API. GPT-4 Turbo is the latest GPT-4 model with improved instruction following, JSON mode, reproducible outputs, parallel function calling, and more. It returns a maximum of 4,096 output tokens. This preview model is not yet suited for production traffic.
    Starting Price: $0.0200 per 1000 tokens
  • 9
    Llama 3.2
    The open-source AI model you can fine-tune, distill and deploy anywhere is now available in more versions. Choose from 1B, 3B, 11B or 90B, or continue building with Llama 3.1. Llama 3.2 is a collection of large language models (LLMs) pretrained and fine-tuned in 1B and 3B sizes that are multilingual text only, and 11B and 90B sizes that take both text and image inputs and output text. Develop highly performant and efficient applications from our latest release. Use our 1B or 3B models for on-device applications such as summarizing a discussion from your phone or calling on-device tools like calendar. Use our 11B or 90B models for image use cases such as transforming an existing image into something new or getting more information from an image of your surroundings.
    Starting Price: Free
  • 10
    Qwen

    Alibaba

    Qwen LLM refers to a family of large language models (LLMs) developed by Alibaba Cloud's Damo Academy. These models are trained on a massive dataset of text and code, allowing them to understand and generate human-like text, translate languages, write different kinds of creative content, and answer your questions in an informative way. Here are some key features of Qwen LLMs: Variety of sizes: The Qwen series ranges from 1.8 billion to 72 billion parameters, offering options for different needs and performance levels. Open source: Some versions of Qwen are open-source, which means their code is publicly available for anyone to use and modify. Multilingual support: Qwen can understand and translate multiple languages, including English, Chinese, and French. Diverse capabilities: Besides generation and translation, Qwen models can be used for tasks like question answering, text summarization, and code generation.
    Starting Price: Free
  • 11
    ERNIE 3.0 Titan
    Pre-trained language models have achieved state-of-the-art results in various Natural Language Processing (NLP) tasks. GPT-3 has shown that scaling up pre-trained language models can further exploit their enormous potential. A unified framework named ERNIE 3.0 was recently proposed for pre-training large-scale knowledge-enhanced models, and a model with 10 billion parameters was trained with it. ERNIE 3.0 outperformed the state-of-the-art models on various NLP tasks. In order to explore the performance of scaling up ERNIE 3.0, we train a hundred-billion-parameter model called ERNIE 3.0 Titan with up to 260 billion parameters on the PaddlePaddle platform. Furthermore, we design a self-supervised adversarial loss and a controllable language modeling loss to make ERNIE 3.0 Titan generate credible and controllable texts.
  • 12
    BLOOM

    BigScience

    BLOOM is an autoregressive Large Language Model (LLM), trained to continue text from a prompt on vast amounts of text data using industrial-scale computational resources. As such, it is able to output coherent text in 46 languages and 13 programming languages that is hardly distinguishable from text written by humans. BLOOM can also be instructed to perform text tasks it hasn't been explicitly trained for, by casting them as text generation tasks.
  • 13
    Gemma 2

    Google

    A family of state-of-the-art, lightweight open models created from the same research and technology that were used to create Gemini models. These models incorporate comprehensive safety measures and help ensure responsible and reliable AI solutions through curated data sets and rigorous tuning. Gemma models achieve strong benchmark results in their 2B, 7B, 9B, and 27B sizes, even outperforming some larger open models. With Keras 3.0, enjoy seamless compatibility with JAX, TensorFlow, and PyTorch, allowing you to effortlessly choose and change frameworks based on task. Redesigned to deliver outstanding performance and unmatched efficiency, Gemma 2 is optimized for incredibly fast inference on various hardware. The Gemma family offers different models that are optimized for specific use cases and adapt to your needs. Gemma models are lightweight, text-to-text, decoder-only large language models, trained on a huge set of text data, code, and mathematical content.
  • 14
    GPT-4o

    OpenAI

    GPT-4o (“o” for “omni”) is a step towards much more natural human-computer interaction: it accepts as input any combination of text, audio, image, and video and generates any combination of text, audio, and image outputs. It can respond to audio inputs in as little as 232 milliseconds, with an average of 320 milliseconds, which is similar to human response time in a conversation. It matches GPT-4 Turbo performance on text in English and code, with significant improvement on text in non-English languages, while also being much faster and 50% cheaper in the API. GPT-4o is especially better at vision and audio understanding compared to existing models.
    Starting Price: $5.00 / 1M tokens
  • 15
    XLNet

    XLNet

    XLNet is a new unsupervised language representation learning method based on a novel generalized permutation language modeling objective. Additionally, XLNet employs Transformer-XL as the backbone model, exhibiting excellent performance for language tasks involving long context. Overall, XLNet achieves state-of-the-art (SOTA) results on various downstream language tasks including question answering, natural language inference, sentiment analysis, and document ranking.
    Starting Price: Free
  • 16
    Samsung Gauss
    Samsung Gauss is a new AI model developed by Samsung Electronics. It is a large language model (LLM) that has been trained on a massive dataset of text and code. Samsung Gauss is able to generate text, translate languages, write different kinds of creative content, and answer your questions in an informative way. Samsung Gauss is still under development, but it has already learned to perform many kinds of tasks, including: Following instructions and completing requests thoughtfully. Answering your questions in a comprehensive and informative way, even if they are open ended, challenging, or strange. Generating different creative text formats, like poems, code, scripts, musical pieces, email, letters, etc. Here are some examples of what Samsung Gauss can do: Translation: Samsung Gauss can translate text between many different languages, including English, French, German, Spanish, Chinese, Japanese, and Korean. Coding: Samsung Gauss can generate code.
  • 17
    OpenAI

    OpenAI

    OpenAI’s mission is to ensure that artificial general intelligence (AGI)—by which we mean highly autonomous systems that outperform humans at most economically valuable work—benefits all of humanity. We will attempt to directly build safe and beneficial AGI, but will also consider our mission fulfilled if our work aids others to achieve this outcome. Apply our API to any language task — semantic search, summarization, sentiment analysis, content generation, translation, and more — with only a few examples or by specifying your task in English. One simple integration gives you access to our constantly-improving AI technology. Explore how you integrate with the API with these sample completions.
  • 18
    GPT-3

    OpenAI

    Our GPT-3 models can understand and generate natural language. We offer four main models with different levels of power suitable for different tasks. Davinci is the most capable model, and Ada is the fastest. The main GPT-3 models are meant to be used with the text completion endpoint. We also offer models that are specifically meant to be used with other endpoints. Davinci is the most capable model family and can perform any task the other models can perform and often with less instruction. For applications requiring a lot of understanding of the content, like summarization for a specific audience and creative content generation, Davinci is going to produce the best results. These increased capabilities require more compute resources, so Davinci costs more per API call and is not as fast as the other models.
    Starting Price: $0.0200 per 1000 tokens
  • 19
    GPT-3.5

    OpenAI

    GPT-3.5 is the next evolution of OpenAI's GPT-3 large language model. GPT-3.5 models can understand and generate natural language. We offer four main models with different levels of power suitable for different tasks. The main GPT-3.5 models are meant to be used with the text completion endpoint. We also offer models that are specifically meant to be used with other endpoints. Davinci is the most capable model family and can perform any task the other models can perform and often with less instruction. For applications requiring a lot of understanding of the content, like summarization for a specific audience and creative content generation, Davinci is going to produce the best results. These increased capabilities require more compute resources, so Davinci costs more per API call and is not as fast as the other models.
    Starting Price: $0.0200 per 1000 tokens
  • 20
    AI21 Studio

    AI21 Studio

    AI21 Studio provides API access to Jurassic-1 large language models. Our models power text generation and comprehension features in thousands of live applications. Take on any language task. Our Jurassic-1 models are trained to follow natural language instructions and require just a few examples to adapt to new tasks. Use our specialized APIs for common tasks like summarization, paraphrasing and more. Access superior results at a lower cost without reinventing the wheel. Need to fine-tune your own custom model? You're just 3 clicks away. Training is fast and affordable, and trained models are deployed immediately. Give your users superpowers by embedding an AI co-writer in your app. Drive user engagement and success with features like long-form draft generation, paraphrasing, repurposing and custom auto-complete.
    Starting Price: $29 per month
  • 21
    Martian

    Martian

    By using the best-performing model for each request, we can achieve higher performance than any single model. Martian outperforms GPT-4 across OpenAI's evals (openai/evals). We turn opaque black boxes into interpretable representations. Our router is the first tool built on top of our model mapping method. We are developing many other applications of model mapping, including turning transformers from indecipherable matrices into human-readable programs. If a provider experiences an outage or high-latency period, requests are automatically rerouted to other providers so your customers never experience any issues. Determine how much you could save by using the Martian Model Router with our interactive cost calculator. Input your number of users, tokens per session, and sessions per month, and specify your cost/quality tradeoff.
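    The rerouting behavior described for Martian can be pictured as a simple fallback loop. The sketch below is a generic illustration under stated assumptions, not Martian's router or API: `route_with_fallback` and the two stub providers are hypothetical, and a real router would also weigh quality, cost, and latency when ordering providers.

```python
from typing import Callable, List, Tuple

Provider = Callable[[str], str]  # takes a prompt, returns a completion


def route_with_fallback(prompt: str, providers: List[Tuple[str, Provider]]) -> Tuple[str, str]:
    """Try providers in priority order; return (name, response) from the first success."""
    errors = []
    for name, call in providers:
        try:
            return name, call(prompt)
        except Exception as exc:  # outage, timeout, rate limit, ...
            errors.append(f"{name}: {exc}")
    raise RuntimeError("all providers failed: " + "; ".join(errors))


# Hypothetical stub providers used only for demonstration.
def primary(prompt: str) -> str:
    raise TimeoutError("simulated outage")


def backup(prompt: str) -> str:
    return "echo: " + prompt


print(route_with_fallback("hello", [("primary", primary), ("backup", backup)]))
```

    Collecting the individual error messages keeps the final exception informative when every provider is down.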
  • 22
    PanGu-α

    Huawei

    PanGu-α is developed under MindSpore and trained on a cluster of 2048 Ascend 910 AI processors. The training parallelism strategy is implemented based on MindSpore Auto-parallel, which composes five parallelism dimensions to scale the training task to 2048 processors efficiently: data parallelism, op-level model parallelism, pipeline model parallelism, optimizer model parallelism and rematerialization. To enhance the generalization ability of PanGu-α, we collect 1.1TB of high-quality Chinese data from a wide range of domains to pretrain the model. We empirically test the generation ability of PanGu-α in various scenarios including text summarization, question answering, dialogue generation, etc. Moreover, we investigate the effect of model scale on few-shot performance across a broad range of Chinese NLP tasks. The experimental results demonstrate the superior capabilities of PanGu-α in performing various tasks under few-shot or zero-shot settings.
  • 23
    Claude 3.5 Sonnet
    Claude 3.5 Sonnet sets new industry benchmarks for graduate-level reasoning (GPQA), undergraduate-level knowledge (MMLU), and coding proficiency (HumanEval). It shows marked improvement in grasping nuance, humor, and complex instructions, and is exceptional at writing high-quality content with a natural, relatable tone. Claude 3.5 Sonnet operates at twice the speed of Claude 3 Opus. This performance boost, combined with cost-effective pricing, makes Claude 3.5 Sonnet ideal for complex tasks such as context-sensitive customer support and orchestrating multi-step workflows. Claude 3.5 Sonnet is now available for free on Claude.ai and the Claude iOS app, while Claude Pro and Team plan subscribers can access it with significantly higher rate limits. It is also available via the Anthropic API, Amazon Bedrock, and Google Cloud’s Vertex AI. The model costs $3 per million input tokens and $15 per million output tokens, with a 200K token context window.
    Starting Price: Free
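    The per-token prices quoted for Claude 3.5 Sonnet make request costs easy to estimate. The sketch below applies the listed rates ($3 per million input tokens, $15 per million output tokens) and the 200K context window. `estimate_cost` is a hypothetical helper, the window is assumed to mean 200,000 tokens shared by input and output, and actual billing may differ.

```python
from decimal import Decimal

INPUT_USD_PER_MTOK = Decimal("3")    # from the listing above
OUTPUT_USD_PER_MTOK = Decimal("15")  # from the listing above
CONTEXT_WINDOW = 200_000             # "200K token context window"; exact value assumed


def estimate_cost(input_tokens, output_tokens):
    """Estimated USD cost of one request at the listed per-million-token rates."""
    if input_tokens + output_tokens > CONTEXT_WINDOW:
        raise ValueError("request would not fit in the context window")
    million = Decimal(1_000_000)
    return (input_tokens * INPUT_USD_PER_MTOK + output_tokens * OUTPUT_USD_PER_MTOK) / million


print(estimate_cost(10_000, 2_000))  # 0.03 for input plus 0.03 for output
```

    `Decimal` is used instead of floats so that small per-request amounts add up without rounding drift.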
  • 24
    PaLM 2

    Google

    PaLM 2 is our next generation large language model that builds on Google’s legacy of breakthrough research in machine learning and responsible AI. It excels at advanced reasoning tasks, including code and math, classification and question answering, translation and multilingual proficiency, and natural language generation better than our previous state-of-the-art LLMs, including PaLM. It can accomplish these tasks because of the way it was built – bringing together compute-optimal scaling, an improved dataset mixture, and model architecture improvements. PaLM 2 is grounded in Google’s approach to building and deploying AI responsibly. It was evaluated rigorously for its potential harms and biases, capabilities and downstream uses in research and in-product applications. It’s being used in other state-of-the-art models, like Med-PaLM 2 and Sec-PaLM, and is powering generative AI features and tools at Google, like Bard and the PaLM API.
  • 25
    Azure OpenAI Service
    Apply advanced coding and language models to a variety of use cases. Leverage large-scale, generative AI models with deep understandings of language and code to enable new reasoning and comprehension capabilities for building cutting-edge applications. Apply these coding and language models to a variety of use cases, such as writing assistance, code generation, and reasoning over data. Detect and mitigate harmful use with built-in responsible AI and access enterprise-grade Azure security. Gain access to generative models that have been pretrained with trillions of words. Apply them to new scenarios including language, code, reasoning, inferencing, and comprehension. Customize generative models with labeled data for your specific scenario using a simple REST API. Fine-tune your model's hyperparameters to increase accuracy of outputs. Use the few-shot learning capability to provide the API with examples and achieve more relevant results.
    Starting Price: $0.0004 per 1000 tokens
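    Few-shot learning, as mentioned in the Azure OpenAI Service description, amounts to placing a handful of worked examples in the prompt ahead of the new input. The sketch below only builds the prompt text and makes no API call. The `Input:`/`Output:` layout and the function name are illustrative assumptions, not a format required by the service.

```python
def build_few_shot_prompt(instruction, examples, query):
    """Assemble an instruction, worked examples, and a new query into one prompt."""
    lines = [instruction, ""]
    for text, label in examples:
        lines += [f"Input: {text}", f"Output: {label}", ""]
    lines += [f"Input: {query}", "Output:"]
    return "\n".join(lines)


prompt = build_few_shot_prompt(
    "Classify the sentiment of each review as positive or negative.",
    [("Loved it, would buy again.", "positive"),
     ("Broke after two days.", "negative")],
    "Works exactly as described.",
)
print(prompt)
```

    The prompt ends with an open `Output:` line so that the model's completion supplies the answer for the final input.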
  • 26
    mT5

    Google

    Multilingual T5 (mT5) is a massively multilingual pretrained text-to-text transformer model, trained following a similar recipe as T5. This repo can be used to reproduce the experiments in the mT5 paper. mT5 is pretrained on the mC4 corpus, covering 101 languages: Afrikaans, Albanian, Amharic, Arabic, Armenian, Azerbaijani, Basque, Belarusian, Bengali, Bulgarian, Burmese, Catalan, Cebuano, Chichewa, Chinese, Corsican, Czech, Danish, Dutch, English, Esperanto, Estonian, Filipino, Finnish, French, Galician, Georgian, German, Greek, Gujarati, Haitian Creole, Hausa, Hawaiian, Hebrew, Hindi, Hmong, Hungarian, Icelandic, Igbo, Indonesian, Irish, Italian, Japanese, Javanese, Kannada, Kazakh, Khmer, Korean, Kurdish, Kyrgyz, Lao, Latin, Latvian, Lithuanian, Luxembourgish, Macedonian, Malagasy, Malay, Malayalam, Maltese, Maori, Marathi, Mongolian, Nepali, Norwegian, Pashto, Persian, Polish, Portuguese, Punjabi, Romanian, Russian, Samoan, Scottish Gaelic, Serbian, Shona, Sindhi, and more.
    Starting Price: Free
  • 27
    LFM-40B

    Liquid AI

    LFM-40B offers a new balance between model size and output quality. It leverages 12B activated parameters at use. Its performance is comparable to models larger than itself, while its MoE architecture enables higher throughput and deployment on more cost-effective hardware.
  • 28
    Gemini Pro
    Gemini is natively multimodal, which gives you the potential to transform any type of input into any type of output. We've built Gemini responsibly from the start, incorporating safeguards and working together with partners to make it safer and more inclusive. Integrate Gemini models into your applications with Google AI Studio and Google Cloud Vertex AI.
  • 29
    YandexGPT
    Take advantage of the capabilities of generative language models to improve and optimize your applications and web services. Get an aggregated summary of accumulated textual data, whether it is information from work chats, user reviews, or other types of data. YandexGPT will help both summarize and interpret the information. Speed up text creation while improving its quality and style. Create template texts for newsletters, product descriptions for online stores and other applications. Develop a chatbot for your support service: teach the bot to answer various user questions, both common and more complicated. Use the API to integrate the service with your applications and automate processes.
  • 30
    VideoPoet
    VideoPoet is a simple modeling method that can convert any autoregressive language model or large language model (LLM) into a high-quality video generator. It contains a few simple components. An autoregressive language model learns across video, image, audio, and text modalities to autoregressively predict the next video or audio token in the sequence. A mixture of multimodal generative learning objectives are introduced into the LLM training framework, including text-to-video, text-to-image, image-to-video, video frame continuation, video inpainting and outpainting, video stylization, and video-to-audio. Furthermore, such tasks can be composed together for additional zero-shot capabilities. This simple recipe shows that language models can synthesize and edit videos with a high degree of temporal consistency.
  • 31
    LFM-3B

    Liquid AI

    LFM-3B delivers incredible performance for its size. It ranks first among 3B-parameter transformers, hybrids, and RNN models, and also outperforms the previous generation of 7B and 13B models. It is also on par with Phi-3.5-mini on multiple benchmarks, while being 18.4% smaller. LFM-3B is the ideal choice for mobile and other edge text-based applications.
  • 32
    GPT-5

    OpenAI

    GPT-5 is the anticipated next iteration of OpenAI's Generative Pre-trained Transformer, a large language model (LLM) still under development. LLMs are trained on massive amounts of text data and are able to generate realistic and coherent text, translate languages, write different kinds of creative content, and answer your questions in an informative way. It's not publicly available yet. OpenAI hasn't announced a release date, but some speculate it could be launched sometime in 2024. It's expected to be even more powerful than its predecessor, GPT-4. GPT-4 is already impressive, capable of generating human-quality text, translating languages, and writing different kinds of creative content. GPT-5 is expected to take these abilities even further, with better reasoning, factual accuracy, and ability to follow instructions.
    Starting Price: $0.0200 per 1000 tokens
  • 33
    GPT-4V (Vision)
    GPT-4 with vision (GPT-4V) enables users to instruct GPT-4 to analyze image inputs provided by the user, and is the latest capability we are making broadly available. Incorporating additional modalities (such as image inputs) into large language models (LLMs) is viewed by some as a key frontier in artificial intelligence research and development. Multimodal LLMs offer the possibility of expanding the impact of language-only systems with novel interfaces and capabilities, enabling them to solve new tasks and provide novel experiences for their users. In this system card, we analyze the safety properties of GPT-4V. Our work on safety for GPT-4V builds on the work done for GPT-4 and here we dive deeper into the evaluations, preparation, and mitigation work done specifically for image inputs.
  • 34
    Pixtral 12B

    Mistral AI

    Pixtral 12B is a pioneering multimodal AI model developed by Mistral AI, designed to process and interpret both text and image data seamlessly. This model marks a significant advancement in the integration of different data types, allowing for more intuitive interactions and enhanced content creation capabilities. With a foundation built upon Mistral's NeMo 12B text model, Pixtral 12B incorporates an additional vision adapter that adds approximately 400 million parameters, expanding its ability to handle visual inputs up to 1024 x 1024 pixels in size. This model supports a variety of applications, from detailed image analysis to answering questions about visual content, showcasing its versatility in real-world applications. Pixtral 12B's architecture not only supports a large context window of 128k tokens but also employs innovative techniques like GeLU activation and 2D RoPE for its vision components, making it a robust tool for developers and enterprises aiming to leverage AI.
    Starting Price: Free
  • 35
    GPT-J

    EleutherAI

    GPT-J is a cutting-edge language model created by the research organization EleutherAI. In terms of performance, GPT-J exhibits a level of proficiency comparable to that of OpenAI's renowned GPT-3 model in a range of zero-shot tasks. Notably, GPT-J has demonstrated the ability to surpass GPT-3 in tasks related to generating code. The latest iteration of this language model, known as GPT-J-6B, is built upon a linguistic dataset referred to as The Pile. This dataset, which is publicly available, encompasses a substantial volume of 825 gibibytes of language data, organized into 22 distinct subsets. While GPT-J shares certain capabilities with ChatGPT, it is important to note that GPT-J is not designed to operate as a chatbot; rather, its primary function is to predict text. In a significant development in March 2023, Databricks introduced Dolly, a model that follows instructions and is licensed under Apache.
    Starting Price: Free
  • 36
    PygmalionAI

    PygmalionAI

    PygmalionAI is a community dedicated to creating open-source projects based on EleutherAI's GPT-J 6B and Meta's LLaMA models. In simple terms, Pygmalion makes AI fine-tuned for chatting and roleplaying purposes. The current actively supported Pygmalion AI model is the 7B variant, based on Meta AI's LLaMA model. With only 18GB (or less) VRAM required, Pygmalion offers better chat capability than much larger language models with relatively minimal resources. Our curated dataset of high-quality roleplaying data ensures that your bot will be the optimal RP partner. Both the model weights and the code used to train it are completely open-source, and you can modify/re-distribute it for whatever purpose you want. Language models, including Pygmalion, generally run on GPUs since they need access to fast memory and massive processing power in order to output coherent text at an acceptable speed.
    Starting Price: Free
  • 37
    InstructGPT
    InstructGPT is a family of language models from OpenAI, fine-tuned from GPT-3 to follow natural language instructions. Training uses reinforcement learning from human feedback (RLHF): human labelers write demonstrations and rank model outputs, a reward model is trained on those rankings, and the language model is then optimized against the reward model. OpenAI reported that labelers preferred outputs from the 1.3B-parameter InstructGPT model over outputs from the 175B-parameter GPT-3, despite the much smaller size. InstructGPT models are offered through the OpenAI API.
    Starting Price: $0.0200 per 1000 tokens
  • 38
    Reka

    Reka

    Our enterprise-grade multimodal assistant is carefully designed with privacy, security, and efficiency in mind. We train Yasa to read text, images, videos, and tabular data, with more modalities to come. Use it to generate ideas for creative tasks, get answers to basic questions, or derive insights from your internal data. Generate, train, compress, or deploy on-premise with a few simple commands. Use our proprietary algorithms to personalize our model to your data and use cases. We design proprietary algorithms involving retrieval, fine-tuning, self-supervised instruction tuning, and reinforcement learning to tune our model on your datasets.
  • 39
    Qwen-7B

    Alibaba

    Qwen-7B is the 7B-parameter version of the large language model series Qwen (abbr. Tongyi Qianwen), proposed by Alibaba Cloud. Qwen-7B is a Transformer-based large language model, which is pretrained on a large volume of data, including web texts, books, code, etc. Additionally, based on the pretrained Qwen-7B, we release Qwen-7B-Chat, a large-model-based AI assistant, which is trained with alignment techniques. The features of the Qwen-7B series include: Trained with high-quality pretraining data. We have pretrained Qwen-7B on a self-constructed large-scale high-quality dataset of over 2.2 trillion tokens. The dataset includes plain text and code, and it covers a wide range of domains, including general domain data and professional domain data. Strong performance. Compared with models of similar size, we outperform the competitors on a series of benchmark datasets, which evaluate natural language understanding, mathematics, coding, etc. And more.
    Starting Price: Free
  • 40
    Gemini Ultra
    Gemini Ultra is a powerful new language model from Google DeepMind. It is the largest and most capable model in the Gemini family, which also includes Gemini Pro and Gemini Nano. Gemini Ultra is designed for highly complex tasks, such as natural language processing, machine translation, and code generation. It is also the first language model to outperform human experts on the Massive Multitask Language Understanding (MMLU) test, obtaining a score of 90%.
  • 41
    Upstage

    Upstage

    Use the Chat API to create a simple conversational agent with Solar. Function calling, a way to connect the LLM to external tools, is now supported. Embedding vectors can be used for tasks such as retrieval and classification. Context-aware English-Korean translation leverages previous dialogue to ensure coherence and continuity in your conversations. A verification step checks whether the answers provided by the LLM are appropriately generated, based on the user's question and search results. Upstage is also developing a healthcare LLM to automate patient communication, personalize treatment plans, aid clinical decision support, and support medical transcription. It aims to enable business owners and companies to deploy generative AI chatbots on websites and mobile apps easily, providing human-like services in customer support and engagement.
    Starting Price: $0.5 per 1M tokens
  • 42
    Granite Code
    We introduce the Granite series of decoder-only code models for code generative tasks (e.g., fixing bugs, explaining code, documenting code), trained with code written in 116 programming languages. A comprehensive evaluation of the Granite Code model family on diverse tasks demonstrates that our models consistently reach state-of-the-art performance among available open source code LLMs. The key advantages of Granite Code models include: All-rounder Code LLM: Granite Code models achieve competitive or state-of-the-art performance on different kinds of code-related tasks, including code generation, explanation, fixing, editing, translation, and more, demonstrating their ability to solve diverse coding tasks. Trustworthy Enterprise-Grade LLM: All our models are trained on license-permissible data collected following IBM's AI Ethics principles and guided by IBM’s Corporate Legal team for trustworthy enterprise usage.
    Starting Price: Free
  • 43
    Code Llama
    Code Llama is a large language model (LLM) that can use text prompts to generate code. Code Llama is state-of-the-art for publicly available LLMs on code tasks, and has the potential to make workflows faster and more efficient for current developers and lower the barrier to entry for people who are learning to code. Code Llama has the potential to be used as a productivity and educational tool to help programmers write more robust, well-documented software. Code Llama is a state-of-the-art LLM capable of generating code, and natural language about code, from both code and natural language prompts. Code Llama is free for research and commercial use. Code Llama is built on top of Llama 2 and is available in three models: Code Llama, the foundational code model; Code Llama - Python, specialized for Python; and Code Llama - Instruct, which is fine-tuned for understanding natural language instructions.
    Starting Price: Free
  • 44
    Medical LLM

    John Snow Labs

    John Snow Labs' Medical LLM is an advanced, domain-specific large language model (LLM) designed to revolutionize the way healthcare organizations harness the power of artificial intelligence. This innovative platform is tailored specifically for the healthcare industry, combining cutting-edge natural language processing (NLP) capabilities with a deep understanding of medical terminology, clinical workflows, and regulatory requirements. The result is a powerful tool that enables healthcare providers, researchers, and administrators to unlock new insights, improve patient outcomes, and drive operational efficiency. At the heart of the Medical LLM is its comprehensive training on vast amounts of healthcare data, including clinical notes, research papers, and regulatory documents. This specialized training allows the model to accurately interpret and generate medical text, making it an invaluable asset for tasks such as clinical documentation, automated coding, and medical research.
  • 45
    MPT-7B

    MosaicML

    Introducing MPT-7B, the latest entry in our MosaicML Foundation Series. MPT-7B is a transformer trained from scratch on 1T tokens of text and code. It is open source, available for commercial use, and matches the quality of LLaMA-7B. MPT-7B was trained on the MosaicML platform in 9.5 days with zero human intervention at a cost of ~$200k. Now you can train, finetune, and deploy your own private MPT models, either starting from one of our checkpoints or training from scratch. For inspiration, we are also releasing three finetuned models in addition to the base MPT-7B: MPT-7B-Instruct, MPT-7B-Chat, and MPT-7B-StoryWriter-65k+, the last of which uses a context length of 65k tokens!
    Starting Price: Free
  • 46
    Smaug-72B
    Smaug-72B is a powerful open-source large language model (LLM) known for several key features: High Performance: It currently holds the top spot on the Hugging Face Open LLM leaderboard, surpassing models like GPT-3.5 in various benchmarks. This means it excels at tasks like understanding, responding to, and generating human-like text. Open Source: Unlike many other advanced LLMs, Smaug-72B is freely available for anyone to use and modify, fostering collaboration and innovation in the AI community. Focus on Reasoning and Math: It shines in particular on reasoning and mathematical tasks, a strength attributed to unique fine-tuning techniques developed by Abacus AI, the creators of Smaug-72B. Based on Qwen-72B: It is a fine-tuned version of another powerful LLM, Qwen-72B, released by Alibaba, and further improves upon its capabilities. Overall, Smaug-72B represents a significant step forward in open-source AI.
    Starting Price: Free
  • 47
    Gemini Nano
    Gemini Nano is the tiny titan of the Gemini family, Google DeepMind's latest generation of multimodal language models. Imagine a super-powered AI shrunk down to fit snugly on your smartphone: that's Nano in a nutshell! ✨ Though the smallest of the bunch (alongside its siblings, Ultra and Pro), Nano packs a mighty punch. It's specifically designed to run on edge devices like your phone, bringing powerful AI capabilities right to your fingertips, even when you're offline. Think of it as your ultimate on-device assistant, whispering smart suggestions and automating tasks with ease. Need a quick summary of that long recorded lecture? Nano's got you covered. Want to craft the perfect reply to a tricky text? Nano will generate options that'll have your friends thinking you're a wordsmith extraordinaire.
  • 48
    NVIDIA NeMo Megatron
    NVIDIA NeMo Megatron is an end-to-end framework for training and deploying LLMs with billions and trillions of parameters. NVIDIA NeMo Megatron, part of the NVIDIA AI platform, offers an easy, efficient, and cost-effective containerized framework to build and deploy LLMs. Designed for enterprise application development, it builds upon the most advanced technologies from NVIDIA research and provides an end-to-end workflow for automated distributed data processing, training large-scale customized GPT-3, T5, and multilingual T5 (mT5) models, and deploying models for inference at scale. Harnessing the power of LLMs is made easy through validated and converged recipes with predefined configurations for training and inference. Customizing models is simplified by the hyperparameter tool, which automatically searches for the best hyperparameter configurations and performance for training and inference on any given distributed GPU cluster configuration.
  • 49
    Amazon Titan
    Exclusive to Amazon Bedrock, the Amazon Titan family of models incorporates Amazon’s 25 years of experience innovating with AI and machine learning across its business. Amazon Titan foundation models (FMs) provide customers with a breadth of high-performing image, multimodal, and text model choices, via a fully managed API. Amazon Titan models are created by AWS and pretrained on large datasets, making them powerful, general-purpose models built to support a variety of use cases, while also supporting the responsible use of AI. Use them as is or privately customize them with your own data. Amazon Titan Text Premier is a powerful and advanced model within the Amazon Titan Text family, designed to deliver superior performance across a wide range of enterprise applications. This model is optimized for integration with Agents and Knowledge Bases for Amazon Bedrock, making it an ideal option for building interactive generative AI applications.
  • 50
    Alpaca

    Stanford Center for Research on Foundation Models (CRFM)

    Instruction-following models such as GPT-3.5 (text-davinci-003), ChatGPT, Claude, and Bing Chat have become increasingly powerful. Many users now interact with these models regularly and even use them for work. However, despite their widespread deployment, instruction-following models still have many deficiencies: they can generate false information, propagate social stereotypes, and produce toxic language. To make maximum progress on addressing these pressing problems, it is important for the academic community to engage. Unfortunately, doing research on instruction-following models in academia has been difficult, as there is no easily accessible model that comes close in capabilities to closed-source models such as OpenAI’s text-davinci-003. We are releasing our findings about an instruction-following language model, dubbed Alpaca, which is fine-tuned from Meta’s LLaMA 7B model.