tiktoken is a fast BPE tokeniser for use with OpenAI's models
An Efficient Agentic Model for Computer Use
Generate Any 3D Scene in Seconds
Pretrained time-series foundation model developed by Google Research
GLM-4.6V/4.5V/4.1V-Thinking, towards versatile multimodal reasoning
Large Multimodal Models for Video Understanding and Editing
Open Source Speech Language Model
High-resolution models for human tasks
Video understanding codebase from FAIR for reproducing video models
CLIP (Contrastive Language-Image Pretraining): predict the most relevant text snippet given an image
A Unified Framework for Text-to-3D and Image-to-3D Generation
Official code base for LeWorldModel: Stable End-to-End Joint-Embedding
The official PyTorch implementation of Google's Gemma models
Achieving a 3x+ generation speedup on reasoning tasks
Ling-V2 is a MoE LLM provided and open-sourced by InclusionAI
Ling is a MoE LLM provided and open-sourced by InclusionAI
Phi-3.5 for Mac: Locally-run Vision and Language Models
Revolutionizing Database Interactions with Private LLM Technology
Controllable & emotion-expressive zero-shot TTS
Stable Diffusion WebUI Forge is a platform on top of Stable Diffusion
Language modeling in a sentence representation space
An AI-powered security review GitHub Action using Claude
Provides convenient access to the Anthropic REST API from any Python 3
Generating Immersive, Explorable, and Interactive 3D Worlds
CogView4, CogView3-Plus and CogView3 (ECCV 2024)