Multi-Modal Neural Networks for Semantic Search, based on Mid-Fusion
Tensor search for humans
The data structure for multimodal data
Django-friendly finite state machine support
Open source framework for deep learning satellite and aerial imagery
A python library for self-supervised learning on images
Jittor is a high-performance deep learning framework
Implementation of Imagen, Google's Text-to-Image Neural Network
Hub of ready-to-use datasets for ML models
Build cross-modal and multimodal applications on the cloud
A Python toolbox for scalable outlier detection
Deep learning library
GLM-4.6V/4.5V/4.1V-Thinking, towards versatile multimodal reasoning
Experimental, AI/ML-powered, open-source Marketing Mix Modeling
Official repo for Sa2VA: Marrying SAM2 with LLaVA
UI-TARS-desktop version that can operate on your local personal device
LLM-based agent for general purpose software engineering tasks
Large Multimodal Models for Video Understanding and Editing
Real-time voice interactive digital human
Concatenate a directory full of files into a single prompt
Scalable machine learning for time series forecasting
Benchmarking synthetic data generation methods
Implementation of 'lightweight' GAN, proposed in ICLR 2021
Powering Amazon custom machine learning chips