CLIP + FFT/DWT/RGB = text to image/video
Rename anything
Multimodal Diffusion with Representation Alignment
Integrate ChatGPT into your own discord bot
Web-based localization tool with tight version control integration
An open-source end-to-end VLM-based GUI Agent
MII makes low-latency and high-throughput inference possible
State-of-the-art diffusion models for image and audio generation
A Model Context Protocol server for searching and analyzing arXiv
Refer and Ground Anything Anywhere at Any Granularity
FAIR Sequence Modeling Toolkit 2
TorchMultimodal is a PyTorch library
ICLR2024 Spotlight: curation/training code, metadata, distribution
Official implementation of DreamCraft3D
Transformers4Rec is a flexible and efficient library
Deterministic LLM Outputs for AI Applications and AI Agents
Synthetic data generators for structured and unstructured text
Unlimited text-to-speech conversion
Language modeling in a sentence representation space
Super Tiny Icons are minuscule SVG versions of your favourite website
LLM training code for MosaicML foundation models
The standard data-centric AI package for data quality and ML
Use pretrained transformers like BERT, XLNet and GPT-2 in spaCy
Proofs, cases, concept supplements, and reference explanations
Build cross-modal and multimodal applications on the cloud