Extract audio and video content and organize it into a Markdown note
Document Image Parsing via Heterogeneous Anchor Prompting
Framework for building neural networks
StreamSpeech is a seamless model for offline speech recognition
Synchronized Translation for Videos
Advanced techniques for RAG systems
Automatic SSRF fuzzer and exploitation tool
Implementation of Vision Transformer, a simple way to achieve SOTA
A best practices guide for day 2 operations
Mini website for testing general CS knowledge and enforcing coding
Blazing-fast vector DB with similarity search and metadata filtering
Library for reading and writing large multi-dimensional arrays
A JAX-native LLM Post-Training Library
A Model Context Protocol server for searching and analyzing arXiv
4M: Massively Multimodal Masked Modeling
This repository contains the official implementation of FastVLM
Refer and Ground Anything Anywhere at Any Granularity
Set of tools to assess and improve LLM security
Open-source platform for building enterprise-grade agents
TorchMultimodal is a PyTorch library
ICLR 2024 Spotlight: curation/training code, metadata, distribution
A library for differentiable nonlinear optimization
A PyTorch library for implementing flow matching algorithms
An implementation of a deep learning recommendation model (DLRM)
Anthropic's educational courses