Awesome multilingual OCR toolkits based on PaddlePaddle
AlphaFold 3 inference pipeline
A theoretical reconstruction of the Claude Mythos architecture
Native and Compact Structured Latents for 3D Generation
Visual Causal Flow
A Multi-Modal World Model for Reconstruction, Generation, and Simulation
From Images to High-Fidelity 3D Assets
Video Object and Interaction Deletion
Bidirectional token-classification model for identifiable information
Qwen3.5 is the large language model series developed by the Qwen team
A multimodal model for brain response prediction
Python SDK for Claude Agent
Industrial-level controllable zero-shot text-to-speech system
Open Source Speech Language Model
Long-form streaming TTS system for multi-speaker dialogue generation
RGBD video generation model conditioned on camera input
Audio foundation model excelling in audio understanding
Contexts Optical Compression
Project Lyra: Open Generative 3D World Models
Controllable & emotion-expressive zero-shot TTS
Reproduction of Poetiq's record-breaking submission to the ARC-AGI-1
State-of-the-art LLM and coding model
Genome modeling and design across all domains of life
FAIR Sequence Modeling Toolkit 2
NVIDIA Isaac GR00T N1.5 is the world's first open foundation model