Code release for Cut and Learn for Unsupervised Object Detection
CLIP: predict the most relevant text snippet given an image
tiktoken is a fast BPE tokeniser for use with OpenAI's models
RL research on Android devices
Documentation for Google's Gen AI site, including the Gemini API and Gemma
MCP integration platforms for AI agents to use tools at any scale
Fundamentals of Machine Learning and Deep Learning
4M: Massively Multimodal Masked Modeling
Guiding Instruction-based Image Editing via Multimodal Large Language
PyTorch code and models for V-JEPA self-supervised learning from video
PyTorch code and models for the DINOv2 self-supervised learning
Towards Ultimate Expert Specialization in Mixture-of-Experts Language
Audiocraft is a library for audio processing and generation
Generate 3D objects conditioned on text or images
Models for object and human mesh reconstruction
Implementation of Vision Transformer, a simple way to achieve SOTA
LLM-powered fuzzing via OSS-Fuzz
The best ChatGPT that $100 can buy
Set of tools to assess and improve LLM security
PPTAgent: Generating and Evaluating Presentations
The official PyTorch implementation of Google's Gemma models
Volcano Engine Reinforcement Learning for LLMs
Diffusion Transformer with Fine-Grained Chinese Understanding
A Customizable Image-to-Video Model based on HunyuanVideo
Learn AI and LLMs from scratch using free resources