Alternatives to Llama 3.2
Compare Llama 3.2 alternatives for your business or organization using the curated list below. SourceForge ranks the best alternatives to Llama 3.2 in 2024. Compare features, ratings, user reviews, pricing, and more from Llama 3.2 competitors and alternatives in order to make an informed decision for your business.
1
LFM-3B
Liquid AI
LFM-3B delivers incredible performance for its size. It ranks first among 3B-parameter transformers, hybrids, and RNN models, and also outperforms the previous generation of 7B and 13B models. It is on par with Phi-3.5-mini on multiple benchmarks while being 18.4% smaller. LFM-3B is the ideal choice for mobile and other edge text-based applications.
2
Llama 2
Meta
The next generation of our open source large language model. This release includes model weights and starting code for pretrained and fine-tuned Llama language models, ranging from 7B to 70B parameters. Llama 2 pretrained models are trained on 2 trillion tokens and have double the context length of Llama 1. Llama 2 outperforms other open source language models on many external benchmarks, including reasoning, coding, proficiency, and knowledge tests. Llama 2 was pretrained on publicly available online data sources. The fine-tuned model, Llama-2-chat, leverages publicly available instruction datasets and over 1 million human annotations. We have a broad range of supporters around the world who believe in our open approach to today’s AI: companies that have given early feedback and are excited to build with Llama 2.
Starting Price: Free
3
Code Llama
Meta
Code Llama is a large language model (LLM) that can use text prompts to generate code. Code Llama is state-of-the-art for publicly available LLMs on code tasks, and has the potential to make workflows faster and more efficient for current developers and lower the barrier to entry for people who are learning to code. Code Llama has the potential to be used as a productivity and educational tool to help programmers write more robust, well-documented software. It can generate code, and natural language about code, from both code and natural language prompts. Code Llama is free for research and commercial use. Code Llama is built on top of Llama 2 and is available in three models: Code Llama, the foundational code model; Code Llama - Python, specialized for Python; and Code Llama - Instruct, which is fine-tuned for understanding natural language instructions.
Starting Price: Free
4
Llama 3.1
Meta
The open source AI model you can fine-tune, distill and deploy anywhere. Our latest instruction-tuned model is available in 8B, 70B and 405B versions. Using our open ecosystem, build faster with a selection of differentiated product offerings to support your use cases. Choose from real-time inference or batch inference services. Download model weights to further optimize cost per token. Adapt for your application, improve with synthetic data and deploy on-prem or in the cloud. Use Llama system components and extend the model using zero shot tool use and RAG to build agentic behaviors. Leverage 405B high quality data to improve specialized models for specific use cases.
Starting Price: Free
5
Alpaca
Stanford Center for Research on Foundation Models (CRFM)
Instruction-following models such as GPT-3.5 (text-davinci-003), ChatGPT, Claude, and Bing Chat have become increasingly powerful. Many users now interact with these models regularly and even use them for work. However, despite their widespread deployment, instruction-following models still have many deficiencies: they can generate false information, propagate social stereotypes, and produce toxic language. To make maximum progress on addressing these pressing problems, it is important for the academic community to engage. Unfortunately, doing research on instruction-following models in academia has been difficult, as there is no easily accessible model that comes close in capabilities to closed-source models such as OpenAI’s text-davinci-003. We are releasing our findings about an instruction-following language model, dubbed Alpaca, which is fine-tuned from Meta’s LLaMA 7B model.
6
PygmalionAI
PygmalionAI
PygmalionAI is a community dedicated to creating open-source projects based on EleutherAI's GPT-J 6B and Meta's LLaMA models. In simple terms, Pygmalion makes AI fine-tuned for chatting and roleplaying purposes. The current actively supported Pygmalion AI model is the 7B variant, based on Meta AI's LLaMA model. With only 18GB (or less) VRAM required, Pygmalion offers better chat capability than much larger language models with relatively minimal resources. Our curated dataset of high-quality roleplaying data ensures that your bot will be the optimal RP partner. Both the model weights and the code used to train it are completely open-source, and you can modify/re-distribute it for whatever purpose you want. Language models, including Pygmalion, generally run on GPUs since they need access to fast memory and massive processing power in order to output coherent text at an acceptable speed.
Starting Price: Free
7
StarCoder
BigCode
StarCoder and StarCoderBase are Large Language Models for Code (Code LLMs) trained on permissively licensed data from GitHub, including 80+ programming languages, Git commits, GitHub issues, and Jupyter notebooks. Similar to LLaMA, we trained a ~15B parameter model for 1 trillion tokens. We fine-tuned the StarCoderBase model for 35B Python tokens, resulting in a new model that we call StarCoder. We found that StarCoderBase outperforms existing open Code LLMs on popular programming benchmarks and matches or surpasses closed models such as code-cushman-001 from OpenAI (the original Codex model that powered early versions of GitHub Copilot). With a context length of over 8,000 tokens, the StarCoder models can process more input than any other open LLM, enabling a wide range of interesting applications. For example, by prompting the StarCoder models with a series of dialogues, we enabled them to act as a technical assistant.
Starting Price: Free
8
Aya
Cohere AI
Aya is a new state-of-the-art, open-source, massively multilingual, generative large language research model (LLM) covering 101 different languages, more than double the number of languages covered by existing open-source models. Aya helps researchers unlock the powerful potential of LLMs for dozens of languages and cultures largely ignored by most advanced models on the market today. We are open-sourcing both the Aya model and the largest multilingual instruction fine-tuned dataset to date, with a size of 513 million covering 114 languages. This data collection includes rare annotations from native and fluent speakers all around the world, ensuring that AI technology can effectively serve a broad global audience that has had limited access to date.
9
Vicuna
lmsys.org
Vicuna-13B is an open-source chatbot trained by fine-tuning LLaMA on user-shared conversations collected from ShareGPT. Preliminary evaluation using GPT-4 as a judge shows Vicuna-13B achieves more than 90% quality of OpenAI ChatGPT and Google Bard while outperforming other models like LLaMA and Stanford Alpaca in more than 90% of cases. The cost of training Vicuna-13B is around $300. The code and weights, along with an online demo, are publicly available for non-commercial use.
Starting Price: Free
10
Reka
Reka
Our enterprise-grade multimodal assistant is carefully designed with privacy, security, and efficiency in mind. We train Yasa to read text, images, videos, and tabular data, with more modalities to come. Use it to generate ideas for creative tasks, get answers to basic questions, or derive insights from your internal data. Generate, train, compress, or deploy on-premise with a few simple commands. We design proprietary algorithms involving retrieval, fine-tuning, self-supervised instruction tuning, and reinforcement learning to personalize our model to your datasets and use cases.
11
ChatGPT
OpenAI
ChatGPT is a language model developed by OpenAI. It is a pre-trained model with a transformer architecture, which has been shown to be effective in many NLP tasks, and it was trained on a large, diverse corpus of internet text, allowing it to generate human-like responses to a wide range of prompts. ChatGPT can be used for various natural language processing tasks, such as question answering, conversation, and text generation. It can also be fine-tuned for specific NLP tasks such as question answering, text classification, and language translation. This allows developers to build powerful NLP applications that can perform specific tasks more accurately. ChatGPT can also process and generate code.
Starting Price: Free
12
Azure OpenAI Service
Microsoft
Apply advanced coding and language models to a variety of use cases, such as writing assistance, code generation, and reasoning over data. Leverage large-scale, generative AI models with deep understandings of language and code to enable new reasoning and comprehension capabilities for building cutting-edge applications. Detect and mitigate harmful use with built-in responsible AI and access enterprise-grade Azure security. Gain access to generative models that have been pretrained with trillions of words. Apply them to new scenarios including language, code, reasoning, inferencing, and comprehension. Customize generative models with labeled data for your specific scenario using a simple REST API. Fine-tune your model's hyperparameters to increase accuracy of outputs. Use the few-shot learning capability to provide the API with examples and achieve more relevant results.
Starting Price: $0.0004 per 1000 tokens
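The few-shot capability mentioned above amounts to placing a handful of worked examples in the prompt before the new input. A minimal sketch of that prompt construction follows; the function name, the "Input:"/"Output:" labels, and the sample data are illustrative assumptions, not part of the Azure OpenAI API, and no service call is made.

```python
def build_few_shot_prompt(instruction, examples, query):
    """Assemble a plain-text few-shot prompt.

    `examples` is a list of (input_text, expected_output) pairs.
    The "Input:"/"Output:" labels are a hypothetical convention chosen
    for this sketch; any consistent labeling would work.
    """
    lines = [instruction, ""]
    for text, label in examples:
        lines += [f"Input: {text}", f"Output: {label}", ""]
    # Leave the final "Output:" open so the model completes it.
    lines += [f"Input: {query}", "Output:"]
    return "\n".join(lines)


if __name__ == "__main__":
    prompt = build_few_shot_prompt(
        "Classify the sentiment of each review as positive or negative.",
        [("Great battery life.", "positive"), ("Stopped working in a week.", "negative")],
        "Setup was quick and painless.",
    )
    print(prompt)
```

The resulting string would be sent as the prompt (or user message) of whichever completion endpoint is in use.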
13
FreeWilly
Stability AI
Stability AI and its CarperAI lab are proud to announce FreeWilly1 and its successor FreeWilly2, two powerful new, open access, Large Language Models (LLMs). Both models demonstrate exceptional reasoning ability across varied benchmarks. FreeWilly1 leverages the original LLaMA 65B foundation model and was carefully fine-tuned with a new synthetically-generated dataset using Supervised Fine-Tuning (SFT) in standard Alpaca format. Similarly, FreeWilly2 leverages the LLaMA 2 70B foundation model to reach a performance that compares favorably with GPT-3.5 for some tasks. The training for the FreeWilly models was directly inspired by the methodology pioneered by Microsoft in its paper "Orca: Progressive Learning from Complex Explanation Traces of GPT-4." While our data generation process is similar, we differ in our data sources.
Starting Price: Free
14
LLaMA
Meta
LLaMA (Large Language Model Meta AI) is a state-of-the-art foundational large language model designed to help researchers advance their work in this subfield of AI. Smaller, more performant models such as LLaMA enable others in the research community who don’t have access to large amounts of infrastructure to study these models, further democratizing access in this important, fast-changing field. Training smaller foundation models like LLaMA is desirable in the large language model space because it requires far less computing power and resources to test new approaches, validate others’ work, and explore new use cases. Foundation models train on a large set of unlabeled data, which makes them ideal for fine-tuning for a variety of tasks. We are making LLaMA available at several sizes (7B, 13B, 33B, and 65B parameters) and also sharing a LLaMA model card that details how we built the model in keeping with our approach to Responsible AI practices.
15
Smaug-72B
Abacus
Smaug-72B is a powerful open-source large language model (LLM) known for several key features:
High Performance: It currently holds the top spot on the Hugging Face Open LLM leaderboard, surpassing models like GPT-3.5 in various benchmarks. This means it excels at tasks like understanding, responding to, and generating human-like text.
Open Source: Unlike many other advanced LLMs, Smaug-72B is freely available for anyone to use and modify, fostering collaboration and innovation in the AI community.
Focus on Reasoning and Math: It specifically shines in handling reasoning and mathematical tasks, attributing this strength to unique fine-tuning techniques developed by Abacus AI, the creators of Smaug-72B.
Based on Qwen-72B: It's technically a fine-tuned version of another powerful LLM called Qwen-72B, released by Alibaba, further improving upon its capabilities.
Overall, Smaug-72B represents a significant step forward in open-source AI.
Starting Price: Free
16
AI21 Studio
AI21 Studio
AI21 Studio provides API access to Jurassic-1 large language models. Our models power text generation and comprehension features in thousands of live applications. Take on any language task. Our Jurassic-1 models are trained to follow natural language instructions and require just a few examples to adapt to new tasks. Use our specialized APIs for common tasks like summarization, paraphrasing and more. Access superior results at a lower cost without reinventing the wheel. Need to fine-tune your own custom model? You're just 3 clicks away. Training is fast and affordable, and trained models are deployed immediately. Give your users superpowers by embedding an AI co-writer in your app. Drive user engagement and success with features like long-form draft generation, paraphrasing, repurposing and custom auto-complete.
Starting Price: $29 per month
17
LongLLaMA
LongLLaMA
This repository contains the research preview of LongLLaMA, a large language model capable of handling long contexts of 256k tokens or even more. LongLLaMA is built upon the foundation of OpenLLaMA and fine-tuned using the Focused Transformer (FoT) method. LongLLaMA Code is built upon the foundation of Code Llama. We release a smaller 3B base variant (not instruction tuned) of the LongLLaMA model on a permissive license (Apache 2.0) and inference code supporting longer contexts on Hugging Face. Our model weights can serve as a drop-in replacement for LLaMA in existing implementations (for short contexts up to 2048 tokens). Additionally, we provide evaluation results and comparisons against the original OpenLLaMA models.
Starting Price: Free
18
Giga ML
Giga ML
We just launched the X1 Large series of models. Giga ML's most powerful model is available for pre-training and fine-tuning with on-prem deployment. Since we are OpenAI-compatible, your existing integrations with LangChain, LlamaIndex, and all others work seamlessly. You can continue pre-training LLMs with domain-specific data such as books, docs, or company docs. The world of large language models (LLMs) is rapidly expanding, offering unprecedented opportunities for natural language processing across various domains. However, some critical challenges have remained unaddressed. At Giga ML, we proudly introduce the X1 Large 32k model, a pioneering on-premise LLM solution that addresses these critical issues.
19
Hermes 3
Nous Research
Experiment and push the boundaries of individual alignment, artificial consciousness, open-source software, and decentralization, in ways that monolithic companies and governments are too afraid to try. Hermes 3 is a series of instruct and tool-use models with strong reasoning and creative abilities. It contains advanced long-term context retention and multi-turn conversation capability, complex roleplaying and internal monologue abilities, and enhanced agentic function-calling. Our training data aggressively encourages the model to follow the system and instruction prompts exactly and in an adaptive manner. Hermes 3 was created by fine-tuning Llama 3.1 8B, 70B, and 405B, and training on a dataset of primarily synthetically generated responses. The model boasts comparable or superior performance to Llama 3.1 while unlocking deeper capabilities in reasoning and creativity.
Starting Price: Free
20
Mistral NeMo
Mistral AI
Mistral NeMo is our new best small model: a state-of-the-art 12B model built in collaboration with NVIDIA, with a context window of up to 128k tokens. Its reasoning, world knowledge, and coding accuracy are state-of-the-art in its size category. As it relies on standard architecture, Mistral NeMo is easy to use and a drop-in replacement in any system using Mistral 7B. We have released pre-trained base and instruction-tuned checkpoints under the Apache 2.0 license to promote adoption for researchers and enterprises. Mistral NeMo was trained with quantization awareness, enabling FP8 inference without any performance loss. The model is designed for global, multilingual applications and is trained on function calling. Compared to Mistral 7B, it is much better at following precise instructions, reasoning, and handling multi-turn conversations.
Starting Price: Free
21
Qwen
Alibaba
Qwen LLM refers to a family of large language models (LLMs) developed by Alibaba Cloud's Damo Academy. These models are trained on a massive dataset of text and code, allowing them to understand and generate human-like text, translate languages, write different kinds of creative content, and answer your questions in an informative way. Here are some key features of Qwen LLMs:
Variety of sizes: The Qwen series ranges from 1.8 billion to 72 billion parameters, offering options for different needs and performance levels.
Open source: Some versions of Qwen are open-source, which means their code is publicly available for anyone to use and modify.
Multilingual support: Qwen can understand and translate multiple languages, including English, Chinese, and French.
Diverse capabilities: Besides generation and translation, Qwen models can be used for tasks like question answering, text summarization, and code generation.
Starting Price: Free
22
MPT-7B
MosaicML
Introducing MPT-7B, the latest entry in our MosaicML Foundation Series. MPT-7B is a transformer trained from scratch on 1T tokens of text and code. It is open source, available for commercial use, and matches the quality of LLaMA-7B. MPT-7B was trained on the MosaicML platform in 9.5 days with zero human intervention at a cost of ~$200k. Now you can train, finetune, and deploy your own private MPT models, either starting from one of our checkpoints or training from scratch. For inspiration, we are also releasing three finetuned models in addition to the base MPT-7B: MPT-7B-Instruct, MPT-7B-Chat, and MPT-7B-StoryWriter-65k+, the last of which uses a context length of 65k tokens!
Starting Price: Free
23
Pixtral 12B
Mistral AI
Pixtral 12B is a pioneering multimodal AI model developed by Mistral AI, designed to process and interpret both text and image data seamlessly. This model marks a significant advancement in the integration of different data types, allowing for more intuitive interactions and enhanced content creation capabilities. With a foundation built upon Mistral's NeMo 12B text model, Pixtral 12B incorporates an additional vision adapter that adds approximately 400 million parameters, expanding its ability to handle visual inputs up to 1024 x 1024 pixels in size. This model supports a variety of applications, from detailed image analysis to answering questions about visual content, showcasing its versatility in real-world applications. Pixtral 12B's architecture not only supports a large context window of 128k tokens but also employs innovative techniques like GeLU activation and 2D RoPE for its vision components, making it a robust tool for developers and enterprises aiming to leverage AI.
Starting Price: Free
24
TinyLlama
TinyLlama
The TinyLlama project aims to pretrain a 1.1B Llama model on 3 trillion tokens. With some proper optimization, we can achieve this within a span of "just" 90 days using 16 A100-40G GPUs. We adopted exactly the same architecture and tokenizer as Llama 2. This means TinyLlama can be plugged and played in many open-source projects built upon Llama. TinyLlama is also compact, with only 1.1B parameters, which allows it to serve a multitude of applications that demand a restricted computation and memory footprint.
Starting Price: Free
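The training budget quoted above (3 trillion tokens, 90 days, 16 GPUs) implies a sustained throughput that is easy to work out. The sketch below is plain arithmetic on those three figures; it assumes uninterrupted training with no downtime, which a real run would not achieve exactly.

```python
def required_tokens_per_second(total_tokens, days, n_gpus):
    """Return (cluster_rate, per_gpu_rate) in tokens per second.

    Assumes the run is continuous for the whole period (no restarts,
    no evaluation pauses), so the result is a lower bound on the
    throughput the hardware must sustain.
    """
    seconds = days * 24 * 60 * 60
    cluster_rate = total_tokens / seconds
    return cluster_rate, cluster_rate / n_gpus


if __name__ == "__main__":
    cluster, per_gpu = required_tokens_per_second(3e12, days=90, n_gpus=16)
    print(f"cluster: {cluster:,.0f} tokens/s")   # about 385,802 tokens/s
    print(f"per GPU: {per_gpu:,.0f} tokens/s")   # about 24,113 tokens/s
```

In other words, each A100 would need to process roughly 24k tokens per second for the full 90 days.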
25
NLP Cloud
NLP Cloud
Fast and accurate AI models suited for production. Highly-available inference API leveraging the most advanced NVIDIA GPUs. We selected the best open-source natural language processing (NLP) models from the community and deployed them for you. Train or fine-tune your own AI models, including GPT-J, or upload your in-house custom models from your dashboard, and use them straight away in production without worrying about deployment considerations like RAM usage, high availability, and scalability. You can upload and deploy as many models as you want to production.
Starting Price: $29 per month
26
Phi-2
Microsoft
We are now releasing Phi-2, a 2.7 billion-parameter language model that demonstrates outstanding reasoning and language understanding capabilities, showcasing state-of-the-art performance among base language models with fewer than 13 billion parameters. On complex benchmarks Phi-2 matches or outperforms models up to 25x larger, thanks to new innovations in model scaling and training data curation. With its compact size, Phi-2 is an ideal playground for researchers, including for exploration around mechanistic interpretability, safety improvements, or fine-tuning experimentation on a variety of tasks. We have made Phi-2 available in the Azure AI Studio model catalog to foster research and development on language models.
27
GPT-4
OpenAI
GPT-4 (Generative Pre-trained Transformer 4) is a large-scale language model developed by OpenAI. GPT-4 is the successor to GPT-3 and part of the GPT-n series of natural language processing models, and was trained on a large corpus of text to produce human-like text generation and understanding capabilities. Unlike most other NLP models, GPT-4 does not require additional training data for specific tasks. Instead, it can generate text or answer questions using only the context supplied in its prompt. GPT-4 has been shown to perform a wide variety of tasks without any task-specific training data, such as translation, summarization, question answering, sentiment analysis and more.
Starting Price: $0.0200 per 1000 tokens
28
InstructGPT
OpenAI
InstructGPT is a family of language models from OpenAI, derived from GPT-3 and fine-tuned with reinforcement learning from human feedback (RLHF) to follow natural language instructions. Compared with the base GPT-3 models, InstructGPT models are better at following user intent and produce fewer untruthful or toxic outputs.
Starting Price: $0.0200 per 1000 tokens
29
GPT-4o mini
OpenAI
A small model with superior textual intelligence and multimodal reasoning. GPT-4o mini enables a broad range of tasks with its low cost and latency, such as applications that chain or parallelize multiple model calls (e.g., calling multiple APIs), pass a large volume of context to the model (e.g., full code base or conversation history), or interact with customers through fast, real-time text responses (e.g., customer support chatbots). Today, GPT-4o mini supports text and vision in the API, with support for text, image, video and audio inputs and outputs coming in the future. The model has a context window of 128K tokens, supports up to 16K output tokens per request, and has knowledge up to October 2023. Thanks to the improved tokenizer shared with GPT-4o, handling non-English text is now even more cost effective.
30
Falcon-40B
Technology Innovation Institute (TII)
Falcon-40B is a 40B-parameter causal decoder-only model built by TII and trained on 1,000B tokens of RefinedWeb enhanced with curated corpora. Why use Falcon-40B? It is the best open-source model currently available. Falcon-40B outperforms LLaMA, StableLM, RedPajama, MPT, etc. See the OpenLLM Leaderboard. It features an architecture optimized for inference, with FlashAttention and multiquery. It is made available under a permissive Apache 2.0 license allowing for commercial use, without any royalties or restrictions. ⚠️ This is a raw, pretrained model, which should be further finetuned for most use cases. If you are looking for a version better suited to taking generic instructions in a chat format, we recommend taking a look at Falcon-40B-Instruct.
Starting Price: Free
31
Baichuan-13B
Baichuan Intelligent Technology
Baichuan-13B is an open source, commercially usable large-scale language model with 13 billion parameters, developed by Baichuan Intelligent as the successor to Baichuan-7B. It has achieved the best results among models of the same size on authoritative Chinese and English benchmarks. This release contains two versions: a pre-trained model (Baichuan-13B-Base) and an aligned model (Baichuan-13B-Chat). Larger size, more data: Baichuan-13B expands the number of parameters from Baichuan-7B to 13 billion and is trained on 1.4 trillion tokens of high-quality corpus, 40% more than LLaMA-13B. It is currently the open source model with the largest amount of training data at the 13B size. It supports both Chinese and English, uses ALiBi position encoding, and has a context window length of 4096.
Starting Price: Free
32
Ntropy
Ntropy
Ship faster by integrating with our Python SDK or REST API in minutes. No prior setup or data formatting is needed. You can get going straight away as soon as you have incoming data and your first customers. We have built and fine-tuned custom language models to recognize entities, automatically crawl the web in real-time and pick the best match, as well as assign labels with superhuman accuracy in a fraction of the time. Everybody has a data enrichment model that is trying to be good at one thing: US or Europe, business or consumer. These models are poor at generalizing and are not capable of human-level output. With us, you can leverage the power of the world's largest and most performant models embedded in your products, at a fraction of the cost and time.
33
ALBERT
Google
ALBERT is a self-supervised Transformer model that was pretrained on a large corpus of English data. This means it does not require manual labelling, and instead uses an automated process to generate inputs and labels from raw texts. It is trained with two distinct objectives in mind. The first is Masked Language Modeling (MLM), which randomly masks 15% of words in the input sentence and requires the model to predict them. This technique differs from RNNs and autoregressive models like GPT as it allows the model to learn bidirectional sentence representations. The second objective is Sentence Ordering Prediction (SOP), which entails predicting the ordering of two consecutive segments of text during pretraining.
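The two pretraining objectives described above can be illustrated with a small data-preparation sketch. This is a simplified illustration, not ALBERT's actual preprocessing: it masks whole whitespace-separated words with a placeholder string, whereas real pipelines operate on subword token ids and apply further replacement rules. The function names and the `[MASK]` string are assumptions made for the example.

```python
import random

MASK_TOKEN = "[MASK]"  # placeholder string; real tokenizers use a reserved vocabulary id


def mask_tokens(tokens, mask_prob=0.15, seed=None):
    """Masked Language Modeling inputs: hide about `mask_prob` of the tokens.

    Returns (masked_tokens, labels). `labels` holds the original token at
    each masked position and None elsewhere, so a model is only scored on
    the positions it had to predict.
    """
    if not tokens:
        return [], []
    rng = random.Random(seed)
    n_to_mask = max(1, round(len(tokens) * mask_prob))
    positions = set(rng.sample(range(len(tokens)), n_to_mask))
    masked = [MASK_TOKEN if i in positions else tok for i, tok in enumerate(tokens)]
    labels = [tok if i in positions else None for i, tok in enumerate(tokens)]
    return masked, labels


def make_sop_pair(segment_a, segment_b, swap):
    """Sentence Ordering Prediction example for two consecutive segments.

    Label 1 means the segments are in their original order, 0 means swapped.
    """
    if swap:
        return (segment_b, segment_a), 0
    return (segment_a, segment_b), 1


if __name__ == "__main__":
    words = "the model learns to fill in words that were hidden from its input".split()
    masked, labels = mask_tokens(words, seed=0)
    print(masked)
    print(labels)
    print(make_sop_pair("It rained all night.", "The streets were flooded.", swap=True))
```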
34
ChatGLM
Zhipu AI
ChatGLM-6B is an open-source, Chinese-English bilingual dialogue language model based on the General Language Model (GLM) architecture with 6.2 billion parameters. Combined with model quantization technology, users can deploy locally on consumer-grade graphics cards (only 6GB of video memory is required at the INT4 quantization level). ChatGLM-6B uses technology similar to ChatGPT, optimized for Chinese Q&A and dialogue. After training on about 1T tokens of Chinese and English text, supplemented by supervised fine-tuning, feedback bootstrapping, reinforcement learning from human feedback, and other techniques, ChatGLM-6B has been able to generate answers that are quite in line with human preferences.
Starting Price: Free
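The 6GB figure quoted above is plausible from simple arithmetic on the parameter count. The sketch below estimates the memory taken by the weights alone at a few precisions; it deliberately ignores activations, the attention cache, and framework overhead, which account for the gap between the weight size and the total video memory required.

```python
def weight_memory_gb(n_params, bits_per_param):
    """Approximate size of the model weights in gigabytes (1 GB = 1e9 bytes).

    Covers weights only; runtime buffers and caches are not included.
    """
    return n_params * bits_per_param / 8 / 1e9


if __name__ == "__main__":
    n_params = 6.2e9  # parameter count stated in the listing
    for name, bits in [("FP16", 16), ("INT8", 8), ("INT4", 4)]:
        print(f"{name}: {weight_memory_gb(n_params, bits):.1f} GB")
```

At INT4 the weights come to about 3.1 GB, which leaves headroom within a 6GB card for everything else the model needs at inference time.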
35
T5
Google
With T5, we propose reframing all NLP tasks into a unified text-to-text format where the input and output are always text strings, in contrast to BERT-style models that can only output either a class label or a span of the input. Our text-to-text framework allows us to use the same model, loss function, and hyperparameters on any NLP task, including machine translation, document summarization, question answering, and classification tasks (e.g., sentiment analysis). We can even apply T5 to regression tasks by training it to predict the string representation of a number instead of the number itself.
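To make the text-to-text idea concrete, the sketch below casts three different tasks into (input string, target string) pairs. The task prefixes follow the convention used in the T5 paper; the helper names, the sample sentences, and the 0.2 rounding step for the regression target are assumptions of this sketch rather than a reproduction of the original preprocessing code.

```python
def translation_example(source, target, src_lang="English", tgt_lang="German"):
    """Machine translation as text in, text out."""
    return {"input": f"translate {src_lang} to {tgt_lang}: {source}", "target": target}


def summarization_example(document, summary):
    """Summarization uses the same format with a different prefix."""
    return {"input": f"summarize: {document}", "target": summary}


def regression_example(sentence1, sentence2, score, step=0.2):
    """Similarity regression: the numeric score becomes a string target.

    The score is rounded to the nearest multiple of `step` so that the
    model only has to produce a small, fixed set of strings.
    """
    rounded = round(round(score / step) * step, 1)
    return {
        "input": f"stsb sentence1: {sentence1} sentence2: {sentence2}",
        "target": f"{rounded:.1f}",
    }


if __name__ == "__main__":
    print(translation_example("That is good.", "Das ist gut."))
    print(summarization_example("A long article about storms.", "Storms hit the coast."))
    print(regression_example("A man is running.", "A person runs.", 3.83))
```

Because every task yields the same dictionary shape, one model and one loss function can be trained on all of them.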
36
Hyperplane
Hyperplane
Better audiences from the richness of transaction data. Create nuanced personas and effective marketing campaigns based on financial behaviors and consumer interests. Increase user limits without worrying about default. Leverage user income estimates that are precise and always up-to-date. The Hyperplane platform enables financial institutions to launch personalized consumer experiences through specialized foundation models (LLMs). Upgrade your feature sets with embeddings for credit, collections, and lookalike modeling. Segment users based on various criteria, enabling you to target specific audience groups for personalized marketing campaigns, content delivery, and user analysis. Segmentation is achieved through facets, which are key attributes or characteristics used to categorize users. Hyperplane can also enrich user segmentation by employing additional attributes to fine-tune the filtering of responses from certain audience segmentation endpoints.
37
Amazon Titan
Amazon
Exclusive to Amazon Bedrock, the Amazon Titan family of models incorporates Amazon’s 25 years of experience innovating with AI and machine learning across its business. Amazon Titan foundation models (FMs) provide customers with a breadth of high-performing image, multimodal, and text model choices, via a fully managed API. Amazon Titan models are created by AWS and pretrained on large datasets, making them powerful, general-purpose models built to support a variety of use cases, while also supporting the responsible use of AI. Use them as is or privately customize them with your own data. Amazon Titan Text Premier is a powerful and advanced model within the Amazon Titan Text family, designed to deliver superior performance across a wide range of enterprise applications. This model is optimized for integration with Agents and Knowledge Bases for Amazon Bedrock, making it an ideal option for building interactive generative AI applications.
38
Qwen-7B
Alibaba
Qwen-7B is the 7B-parameter version of the large language model series Qwen (abbr. Tongyi Qianwen), proposed by Alibaba Cloud. Qwen-7B is a Transformer-based large language model, pretrained on a large volume of data, including web texts, books, code, etc. Additionally, based on the pretrained Qwen-7B, we release Qwen-7B-Chat, a large-model-based AI assistant, which is trained with alignment techniques. The features of the Qwen-7B series include:
Trained with high-quality pretraining data. We have pretrained Qwen-7B on a self-constructed large-scale high-quality dataset of over 2.2 trillion tokens. The dataset includes plain text and code, and it covers a wide range of domains, including general domain data and professional domain data.
Strong performance. In comparison with models of similar size, we outperform the competitors on a series of benchmark datasets that evaluate natural language understanding, mathematics, coding, etc.
And more.
Starting Price: Free
39
GPT-4o
OpenAI
GPT-4o (“o” for “omni”) is a step towards much more natural human-computer interaction: it accepts as input any combination of text, audio, image, and video and generates any combination of text, audio, and image outputs. It can respond to audio inputs in as little as 232 milliseconds, with an average of 320 milliseconds, which is similar to human response time in a conversation. It matches GPT-4 Turbo performance on text in English and code, with significant improvement on text in non-English languages, while also being much faster and 50% cheaper in the API. GPT-4o is notably better at vision and audio understanding than existing models.
Starting Price: $5.00 / 1M tokens -
40
Dolly
Databricks
Dolly is a cheap-to-build LLM that exhibits a surprising degree of the instruction-following capabilities exhibited by ChatGPT. Whereas the work from the Alpaca team showed that state-of-the-art models could be coaxed into high-quality instruction-following behavior, we find that even years-old open source models with much earlier architectures exhibit striking behaviors when fine-tuned on a small corpus of instruction training data. Dolly works by taking an existing open source 6-billion-parameter model from EleutherAI and modifying it ever so slightly to elicit instruction-following capabilities, such as brainstorming and text generation, that are not present in the original model, using data from Alpaca.
Starting Price: Free -
41
GPT-4 Turbo
OpenAI
GPT-4 is a large multimodal model (accepting text or image inputs and outputting text) that can solve difficult problems with greater accuracy than any of our previous models, thanks to its broader general knowledge and advanced reasoning capabilities. GPT-4 is available in the OpenAI API to paying customers. Like gpt-3.5-turbo, GPT-4 is optimized for chat but works well for traditional completions tasks using the Chat Completions API. GPT-4 Turbo is the latest GPT-4 model with improved instruction following, JSON mode, reproducible outputs, parallel function calling, and more. It returns a maximum of 4,096 output tokens. This preview model is not yet suited for production traffic.
Starting Price: $0.0200 per 1000 tokens -
42
Gemini
Google
Gemini was created from the ground up to be multimodal, highly efficient at tool and API integrations and built to enable future innovations, like memory and planning. While still early, we’re already seeing impressive multimodal capabilities not seen in prior models. Gemini is also our most flexible model yet, able to efficiently run on everything from data centers to mobile devices. Its state-of-the-art capabilities will significantly enhance the way developers and enterprise customers build and scale with AI. We’ve optimized Gemini 1.0, our first version, for three different sizes: Gemini Ultra, our largest and most capable model for highly complex tasks; Gemini Pro, our best model for scaling across a wide range of tasks; and Gemini Nano, our most efficient model for on-device tasks.
Starting Price: Free -
43
LUIS
Microsoft
Language Understanding (LUIS): A machine learning-based service to build natural language into apps, bots, and IoT devices. Quickly create enterprise-ready, custom models that continuously improve. Add natural language to your apps. Designed to identify valuable information in conversations, LUIS interprets user goals (intents) and distills valuable information from sentences (entities), for a high-quality, nuanced language model. LUIS integrates seamlessly with the Azure Bot Service, making it easy to create a sophisticated bot. Powerful developer tools are combined with customizable pre-built apps and entity dictionaries, such as Calendar, Music, and Devices, so you can build and deploy a solution more quickly. Dictionaries are mined from the collective knowledge of the web and supply billions of entries, helping your model to correctly identify valuable information from user conversations. Active learning is used to continuously improve the quality of the models. -
44
PanGu-α
Huawei
PanGu-α is developed under MindSpore and trained on a cluster of 2048 Ascend 910 AI processors. The training parallelism strategy is implemented based on MindSpore Auto-parallel, which composes five parallelism dimensions to scale the training task to 2048 processors efficiently, including data parallelism, op-level model parallelism, pipeline model parallelism, optimizer model parallelism and rematerialization. To enhance the generalization ability of PanGu-α, we collect 1.1 TB of high-quality Chinese data from a wide range of domains to pretrain the model. We empirically test the generation ability of PanGu-α in various scenarios including text summarization, question answering, dialogue generation, etc. Moreover, we investigate the effect of model scale on few-shot performance across a broad range of Chinese NLP tasks. The experimental results demonstrate the superior capabilities of PanGu-α in performing various tasks under few-shot or zero-shot settings. -
45
GPT-3.5
OpenAI
GPT-3.5 is the next evolution of the GPT-3 large language model from OpenAI. GPT-3.5 models can understand and generate natural language. We offer four main models with different levels of power suitable for different tasks. The main GPT-3.5 models are meant to be used with the text completion endpoint. We also offer models that are specifically meant to be used with other endpoints. Davinci is the most capable model family and can perform any task the other models can perform and often with less instruction. For applications requiring a lot of understanding of the content, like summarization for a specific audience and creative content generation, Davinci is going to produce the best results. These increased capabilities require more compute resources, so Davinci costs more per API call and is not as fast as the other models.
Starting Price: $0.0200 per 1000 tokens -
46
Gemma
Google
Gemma is a family of lightweight, state-of-the-art open models built from the same research and technology used to create the Gemini models. Developed by Google DeepMind and other teams across Google, Gemma is inspired by Gemini, and the name reflects the Latin gemma, meaning “precious stone.” Accompanying our model weights, we’re also releasing tools to support developer innovation, foster collaboration, and guide the responsible use of Gemma models. Gemma models share technical and infrastructure components with Gemini, our largest and most capable AI model widely available today. This enables Gemma 2B and 7B to achieve best-in-class performance for their sizes compared to other open models. And Gemma models are capable of running directly on a developer laptop or desktop computer. Notably, Gemma surpasses significantly larger models on key benchmarks while adhering to our rigorous standards for safe and responsible outputs. -
47
Qwen2-VL
Alibaba
Qwen2-VL is the latest version of the vision language models based on Qwen2 in the Qwen model families. Compared with Qwen-VL, Qwen2-VL has the following capabilities: SoTA understanding of images of various resolutions and aspect ratios: Qwen2-VL achieves state-of-the-art performance on visual understanding benchmarks, including MathVista, DocVQA, RealWorldQA, MTVQA, etc. Understanding videos of 20 min+: Qwen2-VL can understand videos over 20 minutes long for high-quality video-based question answering, dialog, content creation, etc. Agent that can operate your mobile phones, robots, etc.: with the abilities of complex reasoning and decision making, Qwen2-VL can be integrated with devices like mobile phones, robots, etc., for automatic operation based on visual environment and text instructions. Multilingual support: to serve global users, besides English and Chinese, Qwen2-VL now supports the understanding of texts in different languages inside images.
Starting Price: Free -
48
YandexGPT
Yandex
Take advantage of the capabilities of generative language models to improve and optimize your applications and web services. Get an aggregated result of accumulated textual data, whether it be information from work chats, user reviews, or other types of data. YandexGPT will help both summarize and interpret the information. Speed up text creation while improving the quality and style of your texts. Create template texts for newsletters, product descriptions for online stores and other applications. Develop a chatbot for your support service: teach the bot to answer various user questions, both common and more complicated. Use the API to integrate the service with your applications and automate processes. -
49
Stable LM
Stability AI
Stable LM: Stability AI Language Models. The release of Stable LM builds on our experience in open-sourcing earlier language models with EleutherAI, a nonprofit research hub. These language models include GPT-J, GPT-NeoX, and the Pythia suite, which were trained on The Pile open-source dataset. Many recent open-source language models continue to build on these efforts, including Cerebras-GPT and Dolly-2. Stable LM is trained on a new experimental dataset built on The Pile, but three times larger with 1.5 trillion tokens of content. We will release details on the dataset in due course. The richness of this dataset gives Stable LM surprisingly high performance in conversational and coding tasks, despite its small size of 3 to 7 billion parameters (by comparison, GPT-3 has 175 billion parameters). Stable LM 3B is a compact language model designed to operate on portable digital devices like handhelds and laptops, and we’re excited about its capabilities and portability.
Starting Price: Free -
50
Lemonfox.ai
Lemonfox.ai
Our models are deployed around the world to give you the best possible response times. Integrate our OpenAI-compatible API effortlessly into your application. Begin within minutes and seamlessly scale to serve millions of users. Benefit from our extensive scale and performance optimizations, making our API 4 times more affordable than OpenAI's GPT-3.5 API. Generate text and chat with our AI model that delivers ChatGPT-level performance at a fraction of the cost. Getting started just takes a few minutes with our OpenAI-compatible API. Harness the power of one of the most advanced AI image models to craft stunning, high-quality images, graphics, and illustrations in a few seconds.
Starting Price: $5 per month
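
The per-token starting prices in this list are quoted in different units ($5.00 / 1M tokens for GPT-4o, $0.0200 per 1000 tokens for GPT-4 Turbo and GPT-3.5), which makes them hard to compare at a glance. The sketch below rescales them to a common unit. It is only an illustration: the listing does not say whether a price applies to input or output tokens, so a single flat rate per model is assumed, and the names `LISTED_PRICES`, `usd_per_million_tokens`, and `estimate_cost` are hypothetical helpers, not part of any vendor API.

```python
from decimal import Decimal

# Starting prices as quoted in the listing: (price in USD, tokens per price unit).
# Assumption: one flat rate per model; the listing gives no input/output split.
LISTED_PRICES = {
    "GPT-4o": (Decimal("5.00"), 1_000_000),
    "GPT-4 Turbo": (Decimal("0.0200"), 1_000),
    "GPT-3.5": (Decimal("0.0200"), 1_000),
}


def usd_per_million_tokens(price, tokens_per_unit):
    """Rescale a quoted price to USD per one million tokens."""
    return price * Decimal(1_000_000) / Decimal(tokens_per_unit)


def estimate_cost(model, tokens):
    """Estimate the cost in USD of `tokens` tokens at the listed starting price."""
    price, tokens_per_unit = LISTED_PRICES[model]
    return price * Decimal(tokens) / Decimal(tokens_per_unit)


if __name__ == "__main__":
    for name, (price, unit) in LISTED_PRICES.items():
        rate = usd_per_million_tokens(price, unit)
        print(f"{name}: ${rate:.2f} per 1M tokens")
```

Flat-fee entries such as Lemonfox.ai ($5 per month) are left out because the listing gives no token allowance to convert them to a per-token rate.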