A fast, powerful, and simple hierarchical vision transformer
GLM-4.5: Open-source LLM for intelligent agents by Z.ai
Multi-lingual large voice generation model, providing inference
Sample code and notebooks for Generative AI on Google Cloud
When LLM Meets Domain Experts
An experimental version of DeepSeek model
A simple, secure MCP-to-OpenAPI proxy server
Generate Any 3D Scene in Seconds
Collection of Gemma 3 variants that are trained for performance
Reproduction of Poetiq's record-breaking submission to the ARC-AGI-1
Offline inference engine for art, real-time voice conversations
PyTorch code and models for V-JEPA self-supervised learning from video
A Systematic Framework for Interactive World Modeling
Get a ChatGPT plugin up and running in under 5 minutes
LTX-Video Support for ComfyUI
Code to accompany "A Method for Animating Children's Drawings"
"Big Model" trains a visual multimodal VLM with 26M parameters
An open-source text-to-speech system built by inverting Whisper
Build Vision Agents quickly with any model or video provider
Multi-Agent daTa geneRation Infra and eXperimentation framework
Diversity-driven optimization and large-model reasoning ability
LLM powered fuzzing via OSS-Fuzz
Code release for Cut and Learn for Unsupervised Object Detection
CLIP: predict the most relevant text snippet given an image
RL implementations