A Python toolbox for gaining geometric insights
Video Object and Interaction Deletion
Master the fundamentals of machine learning, deep learning
Open-source evaluation toolkit of large multi-modality models (LMMs)
A new kind of Progress Bar, with real-time throughput, ETA
[NeurIPS 2023] ImageReward: Learning and Evaluating Human Preferences
Generate audiobooks from e-books
Benchmarking Multimodal Agents for Open-Ended Tasks
Browse the web, directly from Cursor etc.
PDF to Markdown with vision models
Official Repo For "Sa2VA: Marrying SAM2 with LLaVA
A Pioneering Open-Source Alternative to GPT-4o
Pixel-Aligned 3D Generation from Images
Phi-3.5 for Mac: Locally-run Vision and Language Models
Extension of Google Research’s PaperBanana
Detects phishing and lookalike domains using DNS fuzzing techniques
Multimodal Agents as Smartphone Users, an LLM-based multimodal agent
Label Studio is a multi-type data labeling and annotation tool
Open-source and free to self-host
The book "Performance Analysis and Tuning on Modern CPUs"
3D plotting and mesh analysis through a streamlined interface
Static site generator for .NET API documentation
A frontier, first-principles handbook
Modular quant framework
3D Engine with Blender Integration