A GPT-4o Level MLLM for Vision, Speech and Multimodal Live Streaming
Towards Human-Level Text-to-Speech through Style Diffusion
Models for the spaCy Natural Language Processing (NLP) library
Seamlessly integrate LLMs into scikit-learn
LongBench v2 and LongBench (ACL '25 & '24)
Solve end-to-end problems using the Llama model family
Terminal-based CPU stress and monitoring utility
Virtual AI anchor that combines state-of-the-art technology
Multi-tool for semantic search
GLM-4.6V/4.5V/4.1V-Thinking, towards versatile multimodal reasoning
Shared repository for open-sourced projects from the Google AI Lang
Get Alerts from your Docker Container Logs
Unified Multimodal Understanding and Generation Models
A Python package for segmenting geospatial data with the SAM
AI-Powered Data Processing: Use LOTUS to process all of your datasets
OpenRecall is a fully open-source, privacy-first alternative
Towards Studio-Grade Character Animation via In-Context Learning of 3D
Large Multimodal Models for Video Understanding and Editing
A Model Context Protocol server for searching and analyzing arXiv
Pushing the Limits of Mathematical Reasoning in Open Language Models
Python crawler for collecting and downloading Sina Weibo user data
ChatGPT extension for scientific research work
LLM-based agent for general purpose software engineering tasks
An interactive program for statistical analysis of texts