A fast, powerful, and simple hierarchical vision transformer
Towards Real-World Vision-Language Understanding
Multimodal-Driven Architecture for Customized Video Generation
A framework for managing your zsh configuration
The NVIDIA AgentIQ toolkit is an open-source library
Advanced evolutionary computation library built on top of PyTorch
Implementation of RLHF (Reinforcement Learning from Human Feedback)
Dependabot's core logic for creating update PRs
A collection of learning resources for curious software engineers
Unified Multimodal Understanding and Generation Models
Volcano Engine Reinforcement Learning for LLMs
Dataset of GPT-2 outputs for research in detection, biases, and more
Code for running inference and finetuning with the SAM 3 model
Models for object and human mesh reconstruction
Fire up your models with the flame
A neural network that transforms a design mock-up into a static website
Clojure Desktop UI framework
SAPIEN Manipulation Skill Framework
A library to handle Apple Property List format in binary or XML
A unified analytics engine for large-scale data processing
A JAX-native LLM Post-Training Library
Diffusion Transformer with Fine-Grained Chinese Understanding
Tencent Hunyuan Multimodal diffusion transformer (MM-DiT) model
A Customizable Image-to-Video Model based on HunyuanVideo
Central interface to connect your LLMs with external data