Audience
Protein science researchers who need transformer-based sequence representations for structure prediction, function annotation, mutation analysis, and protein design.
About ESMC
ESMC is the latest in the ESM family of protein language models, establishing a new frontier in representation learning for protein biology. Trained on up to 6 billion evolutionary protein sequences, it learns representations that reflect a mechanistic reduction of protein structure and function. The model is built on a transformer architecture and takes sequence as its core modality. ESMC is designed for protein science research, including structure prediction, function annotation, protein design, and understanding evolutionary relationships between proteins. It can generate novel proteins from partial sequence, structure, or functional constraints, helping researchers explore new possibilities in protein design and biological discovery. The Biohub Platform provides access to ESMC through the API and the ESM Python package, with quickstart resources for installing the package, creating an API key, and connecting to the platform.