Motion-controllable Video Generation via Latent Trajectory Guidance
Virtual AI anchor that combines state-of-the-art technology
Open source platform for the machine learning lifecycle
An open source implementation of CLIP
Pretrained time-series foundation model developed by Google Research
Self-learning data agent that grounds its answers in layers of content
Large Audio Language Model built for natural interactions
Specification and documentation for Agent Skills
Build Vision Agents quickly with any model or video provider
Supercharge Your LLM with the Fastest KV Cache Layer
The official Meta Llama 3 GitHub site
MobileLLM: Optimizing Sub-billion Parameter Language Models
A GPT-4o Level MLLM for Vision, Speech and Multimodal Live Streaming
Library for OCR-related tasks powered by Deep Learning
Models and examples built with TensorFlow
The Simple Agent Development Kit
Structured outputs for LLMs
Library for training machine learning models with privacy for data
Synthetic data generators for tabular and time-series data
Real-World Centric Foundation GUI Agents
An open source text-to-speech system built by inverting Whisper
GLM-4.5V and GLM-4.1V-Thinking: Towards Versatile Multimodal Reasoning
Data Lake for Deep Learning. Build, manage, and query datasets
Technical principles related to large models
An API standard for single-agent reinforcement learning environments