Multilingual sentence & image embeddings with BERT
Automatically translates the text of a video based on a subtitle file
HunyuanVideo: A Systematic Framework For Large Video Generation Model
InvokeAI is a leading creative engine for Stable Diffusion models
The data structure for multimodal data
Implementation of "MobileCLIP" CVPR 2024
Towards Real-World Vision-Language Understanding
Scalable data preprocessing and curation toolkit for LLMs
User toolkit for analyzing and interfacing with Large Language Models
A Python tool that uses GPT-4, FFmpeg, and OpenCV
Models for the spaCy Natural Language Processing (NLP) library
Framework for building neural networks
Memory-efficient and performant finetuning of Mistral's models
A GPT-4o Level MLLM for Vision, Speech and Multimodal Live Streaming
Low-latency REST API for serving text-embeddings
Open-source choice to scale, assess and maintain natural language data
The open-source data curation platform for LLMs
Data loaders and abstractions for text and NLP
Simple, Pythonic building blocks to evaluate LLM applications
⚡ Building applications with LLMs through composability ⚡
Implementation of AudioLM audio generation model in PyTorch
Central interface to connect your LLMs with external data
SOTA discrete acoustic codec models with 40/75 tokens per second
One-click deployment (including offline integration package)
Code for the paper "Language Models are Unsupervised Multitask Learners"