Open source no-code system for text annotation and building of text
VGGSfM: Visual Geometry Grounded Deep Structure From Motion
OCR expert VLM powered by Hunyuan's native multimodal architecture
Official implementation of DreamCraft3D
Private chat with local GPT with documents, images, video, etc.
An unsupervised and free tool for image and video dataset analysis
Python SDK for the Computer Use model Lux, developed by OpenAGI
Large Multimodal Models for Video Understanding and Editing
Controllable and fast Text-to-Speech for over 7000 languages
Integrate, train and manage any AI models and APIs with your database
Library for serving Transformers models on Amazon SageMaker
Obsei is a low-code, AI-powered automation tool
Scalable data preprocessing and curation toolkit for LLMs
A toolkit to optimize ML models for deployment for Keras & TensorFlow
Document Image Parsing via Heterogeneous Anchor Prompting
A Python library that makes AMR parsing, generation and visualization
Swirl queries any number of data sources with APIs
airda (Air Data Agent)
Spatiotemporal Signal Processing with Neural Machine Learning Models
AutoGluon: AutoML for Image, Text, and Tabular Data
A minimal yet professional single agent demo project
Build cross-modal and multimodal applications on the cloud
A unified framework for scalable computing
Deep learning library
GLM-4.6V/4.5V/4.1V-Thinking, towards versatile multimodal reasoning