AI Slack bot for reading, summarizing, and chatting with content
The data structure for multimodal data
Handles all kinds of unstructured data, for tasks such as reverse image search
GLM-4.6V/4.5V/4.1V-Thinking, towards versatile multimodal reasoning
InvokeAI is a leading creative engine for Stable Diffusion models
Build AI-powered semantic search applications
Benchmark LLMs by fighting in Street Fighter 3
Build cross-modal and multimodal applications on the cloud
Overcoming Data Limitations for High-Quality Video Diffusion Models
SoundTranscriber can be used to generate automatic transcription / aut
AI-powered tool to quickly remove watermarks from videos flawlessly
Ainee - AI Notetaking and Learning Companion
CLIP + FFT/DWT/RGB = text to image/video
Multimodal AI Story Teller, built with Stable Diffusion, GPT, etc.
Automatic video transcription and translated subtitle generator
A walk along memory lane
Implementation of NÜWA, attention network for text to video synthesis
Generate images from text, in Russian
A version of the AI art creation software based on Disco Diffusion
Software tool that converts text to video for a more engaging experience
Easy-OCR solution and Tesseract trainer for GNU/Linux
IPTV/NVR/CCTV/Video cloud https://fastocloud.com
Basic Utilities for PyTorch Natural Language Processing (NLP)
Just Another Speech Recognition and Text to Speech software.