Implementation of Make-A-Video, a new SOTA text-to-video generator
Local-first semantic code search engine
Multilingual Document Layout Parsing in a Single Vision-Language Model
NLTK Source
Build multimodal AI applications with cloud-native stack
The official implementation of RAPTOR
Ready-to-run cloud templates for RAG
Open-source industrial-grade ASR models
Code for the paper "Evaluating Large Language Models Trained on Code"
A Python package for segmenting geospatial data with the Segment Anything Model (SAM)
Integrate ChatGPT into your own Discord bot
GLM-4.6V/4.5V/4.1V-Thinking, towards versatile multimodal reasoning
LLM-based agent for general purpose software engineering tasks
Autonomous LLM agent for end-to-end data science workflows
AI Slack bot for reading, summarizing, and chatting with content
A Personalized LLM-powered Agent Framework
AI framework for automated short video creation and editing tools
Faster and easier training and deployments
Play couplet with seq2seq model
Running large language models on a single GPU
Vertically Unified Agents for Graph Retrieval-Augmented Reasoning
LongBench v2 and LongBench (ACL '25 & '24)
Traditional Mandarin LLMs for Taiwan
Benchmark LLMs by fighting in Street Fighter 3
Skywork-R1V is an advanced multimodal AI model series