Video Object and Interaction Deletion
Hunyuan Translation Model Version 1.5
Ultra-Efficient LLMs on End Devices
Open-source deep-learning framework
The official PyTorch implementation of Google's Gemma models
A GPT-4o Level MLLM for Vision, Speech and Multimodal Live Streaming
Unified Multimodal Understanding and Generation Models
Designed for text embedding and ranking tasks
Generating Immersive, Explorable, and Interactive 3D Worlds
Qwen2.5-VL is the multimodal large language model series
Bidirectional token-classification model for identifiable info
Foundation Models for Time Series
Genome modeling and design across all domains of life
Phi-3.5 for Mac: Locally-run Vision and Language Models
Programmatic access to the AlphaGenome model
Open image model at the forefront of design
MOSS‑TTS Family: open‑source speech and sound generation models
Foundation model for image generation
Implementation of "MobileCLIP" (CVPR 2024)
Video understanding codebase from FAIR for reproducing video models
Python SDK for Claude Agent
Multimodal-Driven Architecture for Customized Video Generation
Achieving 3x+ generation speedup on reasoning tasks
General-purpose image editing model that delivers high-fidelity
FAIR Sequence Modeling Toolkit 2