A state-of-the-art open visual language model
Generate audiobooks from EPUBs, PDFs and text with captions
Easily turn large sets of image URLs into an image dataset
A robust, efficient, low-latency speech-to-text library
Simple HTML5, YouTube and Vimeo player
Mixture-of-Experts Vision-Language Models for Advanced Multimodal
Towards Real-World Vision-Language Understanding
CLIP, Predict the most relevant text snippet given an image
4M: Massively Multimodal Masked Modeling
Qwen-Image-Layered: Layered Decomposition for Inherent Editability
OpenAI Swift async text-to-image for SwiftUI apps using OpenAI
An enhanced HTML5 file input for Bootstrap 5.x/4.x/3.x
Packages with more than 80 components for all Delphi versions
An open-source framework for training large multimodal models
The ultimate tool to automate custom Telegram message forwarding
Elegant, responsive, flexible and lightweight modal plugin with jQuery
A simple yet powerful jQuery star rating plugin with fractional rating
A lightweight, dependency-free Python library
Official implementation for UniVL video and language training models
Quickly create custom webpages from your content
Touch swipe image slider/slideshow/gallery/carousel/banner mobile
Online test tool for Instagram caption/convert/post image automation
Get Caption, start watching