The data structure for multimodal data
The best ChatGPT that $100 can buy
PPTAgent: Generating and Evaluating Presentations
Code for the paper "Evaluating Large Language Models Trained on Code"
HunyuanVideo: A Systematic Framework For Large Video Generation Model
Phi-3.5 for Mac: Locally-run Vision and Language Models
Embed images and sentences into fixed-length vectors
Renderer for the harmony response format to be used with gpt-oss
The official PyTorch implementation of Google's Gemma models
Machine Learning Systems: Design and Implementation
Seamlessly integrate LLMs into scikit-learn
Central interface to connect your LLMs with external data
Gemma open-weight LLM library, from Google DeepMind
CLIP + FFT/DWT/RGB = text to image/video
Multimodal Diffusion with Representation Alignment
Integrate ChatGPT into your own Discord bot
MII makes low-latency and high-throughput inference possible
State-of-the-art diffusion models for image and audio generation
A Model Context Protocol server for searching and analyzing arXiv
Refer and Ground Anything Anywhere at Any Granularity
FAIR Sequence Modeling Toolkit 2
ICLR 2024 Spotlight: curation/training code, metadata, distribution
Official implementation of DreamCraft3D
Transformers4Rec is a flexible and efficient library
Deterministic LLM Outputs for AI Applications and AI Agents