SOTA discrete acoustic codec models with 40/75 tokens per second
Ultimate meta-skill for generating best-in-class Claude Code skills
End-to-end pipeline converting generative videos
OpenTinker is an RL-as-a-Service infrastructure for foundation models
Motion-controllable Video Generation via Latent Trajectory Guidance
A tool to use the Ai2 Open Coding Agents Soft-Verified Agents
Connect any LLM to your internal knowledge sources
Hunyuan Translation Model Version 1.5
Persistent context and multi-instance coordination
Habit Tracker for the AI Coding Workshop
Frameworks for language model reinforcement learning environments
Simple and easily configurable grid world environments
Fast and accurate AI-powered file content type detection
A simple, secure MCP-to-OpenAPI proxy server
An undetectable, powerful, flexible, high-performance Python library
Implementation of "MobileCLIP" (CVPR 2024)
A fast, powerful, and simple hierarchical vision transformer
Code release for Cut and Learn for Unsupervised Object Detection
High-resolution models for human tasks
CLIP: predict the most relevant text snippet given an image
Ling is a MoE LLM provided and open-sourced by InclusionAI
Lightning-Fast RL for LLM Reasoning and Agents. Made Simple & Flexible
A Unified Framework for Text-to-3D and Image-to-3D Generation
Multimodal-Driven Architecture for Customized Video Generation
Multimodal Diffusion with Representation Alignment