StarCoder

BigCode

About

Foundation models such as GPT-4 have driven rapid improvement in AI. However, the most powerful models are closed commercial models or only partially open. RedPajama is a project to create a set of leading, fully open-source models. Today, we are excited to announce the completion of the first step of this project: the reproduction of the LLaMA training dataset of over 1.2 trillion tokens. The most capable foundation models today are closed behind commercial APIs, which limits research, customization, and their use with sensitive data. Fully open-source models hold the promise of removing these limitations, if the open community can close the quality gap between open and closed models. Recently, there has been much progress on this front. In many ways, AI is having its Linux moment. Stable Diffusion showed that open source can not only rival the quality of commercial offerings like DALL-E but can also lead to incredible creativity from broad participation by communities.

About

StarCoder and StarCoderBase are Large Language Models for Code (Code LLMs) trained on permissively licensed data from GitHub, including 80+ programming languages, Git commits, GitHub issues, and Jupyter notebooks. Similar to LLaMA, we trained a ~15B parameter model for 1 trillion tokens. We fine-tuned the StarCoderBase model on 35B Python tokens, resulting in a new model that we call StarCoder. We found that StarCoderBase outperforms existing open Code LLMs on popular programming benchmarks and matches or surpasses closed models such as code-cushman-001 from OpenAI (the original Codex model that powered early versions of GitHub Copilot). With a context length of over 8,000 tokens, the StarCoder models can process more input than any other open LLM, enabling a wide range of interesting applications. For example, by prompting the StarCoder models with a series of dialogues, we enabled them to act as a technical assistant.
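
The dialogue-prompting idea described above can be illustrated with a minimal sketch. It only assembles the prompt text and does not call any model. Everything named here is an assumption made for illustration: the `Human:`/`Assistant:` role labels, the preamble wording, the helper names, and the whitespace-based token estimate (a real tokenizer would count differently). The 8,000-token default simply mirrors the context length mentioned in the description.

```python
from typing import List, Tuple

# Assumption: this preamble and the "Human:"/"Assistant:" labels are
# illustrative only, not an official StarCoder prompt format.
SYSTEM_PREAMBLE = "Below is a dialogue between a human and a technical assistant."


def approx_token_count(text: str) -> int:
    """Rough stand-in for a tokenizer: counts whitespace-separated words."""
    return len(text.split())


def build_dialogue_prompt(
    turns: List[Tuple[str, str]],
    question: str,
    max_tokens: int = 8000,
) -> str:
    """Build a dialogue-style prompt, keeping the most recent turns that fit."""
    tail = f"Human: {question}\nAssistant:"
    budget = (
        max_tokens
        - approx_token_count(SYSTEM_PREAMBLE)
        - approx_token_count(tail)
    )

    kept: List[str] = []
    # Walk backwards so the newest turns are kept when the budget runs out.
    for human, assistant in reversed(turns):
        block = f"Human: {human}\nAssistant: {assistant}"
        cost = approx_token_count(block)
        if cost > budget:
            break
        kept.append(block)
        budget -= cost

    kept.reverse()  # restore chronological order
    return "\n\n".join([SYSTEM_PREAMBLE, *kept, tail])


if __name__ == "__main__":
    history = [
        ("How do I reverse a list in Python?", "Use my_list[::-1]."),
        ("And sort it?", "Use sorted(my_list)."),
    ]
    print(build_dialogue_prompt(history, "How do I read a file?"))
```

The prompt ends with an open `Assistant:` line so that a code model would continue the text as the assistant's answer. Dropping the oldest turns first is one simple way to stay inside a fixed context window; other strategies, such as summarizing earlier turns, are equally plausible.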

Platforms Supported

Windows
Mac
Linux
Cloud
On-Premises
iPhone
iPad
Android
Chromebook

Audience

AI and LLM developers

Audience

Developers interested in an LLM for code generation

Support

Phone Support
24/7 Live Support
Online

API

Offers API

Pricing

Free
Free Version
Free Trial

Training

Documentation
Webinars
Live Online
In Person

Company Information

RedPajama
Founded: 2023
www.together.xyz/blog/redpajama

Company Information

BigCode
Founded: 2023
huggingface.co/blog/starcoder

Alternatives

Alpaca: Stanford Center for Research on Foundation Models (CRFM)
Dolly: Databricks
Falcon-40B: Technology Innovation Institute (TII)
Falcon-7B: Technology Innovation Institute (TII)

Alternatives

CodeGemma: Google
CodeQwen: Alibaba
DeepSeek Coder: DeepSeek

Integrations

ChatGPT
CodeQwen
Git
GitHub
LM Studio
OpenAI
Python
Tabby
Taylor AI
Visual Studio Code
WebLLM
