High-resolution models for human tasks
Pretrained time-series foundation model developed by Google Research
Inference script for Oasis 500M
Fast and Universal 3D reconstruction model for versatile tasks
Memory-efficient and performant finetuning of Mistral's models
GLM-4.6V/4.5V/4.1V-Thinking, towards versatile multimodal reasoning
High-Fidelity and Controllable Generation of Textured 3D Assets
Reproduction of Poetiq's record-breaking submission to the ARC-AGI-1
Unified Multimodal Understanding and Generation Models
DeepMind model for tracking arbitrary points across videos & robotics
State-of-the-art Image & Video CLIP, Multimodal Large Language Models
A trainable PyTorch reproduction of AlphaFold 3
OCR expert VLM powered by Hunyuan's native multimodal architecture
Official DeiT repository
High-Resolution Image Synthesis with Latent Diffusion Models
StudioOllamaUI is a local, portable interface for Ollama
AI-powered tool to quickly remove watermarks from images flawlessly
Example Discord bot written in Python that uses the completions API
Chinese LLaMA-2 & Alpaca-2 Large Model Phase II Project
Let us control diffusion models
Fine-tuning ChatGLM-6B with PEFT
Chinese LLaMA & Alpaca large language model + local CPU/GPU training
800,000 step-level correctness labels on LLM solutions to MATH problems
Text-to-3D & Image-to-3D & Mesh Export with NeRF + Diffusion
Reference implementation of the Transformer architecture optimized