Data loaders and abstractions for text and NLP
AI Agent Application Development Framework
Scripts for fine-tuning Meta Llama3 with composable FSDP & PEFT methods
A code-first agent framework for seamlessly planning analytics tasks
AI agent that streamlines the entire process of data analysis
GitLab automatic code review tool based on large models
MapAnything: Universal Feed-Forward Metric 3D Reconstruction
Simplest working implementation of StyleGAN2
Flexible Photo Recrafting While Preserving Your Identity
Photorealistic Synthetic Dataset for Holistic Indoor Scene
Knowledge Graph Generation from Any Text
Neural Network architecture based on ideas of the original LSTM
Run all your local AI together in one package
Graph Neural Network Library for PyTorch
Official Repo For "Sa2VA: Marrying SAM2 with LLaVA"
LLM-based agent for general-purpose software engineering tasks
Large Multimodal Models for Video Understanding and Editing
OCR model for complex documents with layout-aware structured outputs
Open Source Speech Language Model
Build multimodal AI applications with cloud-native stack
Open-source evaluation toolkit of large multi-modality models (LMMs)
End-to-end pipeline converting generative videos
Implementation of "MobileCLIP" CVPR 2024
CLIP: Predict the most relevant text snippet given an image
GLM-4.6V/4.5V/4.1V-Thinking, towards versatile multimodal reasoning