Dataset of GPT-2 outputs for research in detection, biases, and more
Mixture-of-Experts Vision-Language Models for Advanced Multimodal
DeepSeek Coder: Let the Code Write Itself
The Telegram Bot Framework
A simple, secure MCP-to-OpenAPI proxy server
The most powerful Android RPA agent framework
Implementation of "MobileCLIP" (CVPR 2024)
A fast, powerful, and simple hierarchical vision transformer
Code release for Cut and Learn for Unsupervised Object Detection
Video understanding codebase from FAIR for reproducing video models
Towards Real-World Vision-Language Understanding
CLIP: predict the most relevant text snippet given an image
An Open-Source Programming Framework for Agentic AI
The official Meta Llama 3 GitHub site
Diffusion Transformer with Fine-Grained Chinese Understanding
Renderer for the harmony response format to be used with gpt-oss
Claude Code is an agentic coding tool that lives in your terminal
This repository provides an advanced RAG
Get GPT, like ChatGPT, in your terminal
Open Source TypeScript AI Agent Framework
Meta Agents Research Environments is a comprehensive platform
Code for the paper "Language Models are Unsupervised Multitask Learners"
Official code for "Style Aligned Image Generation via Shared Attention"
A Model Context Protocol server for searching and analyzing arXiv
4M: Massively Multimodal Masked Modeling