Automate browser-based workflows with LLMs and Computer Vision
This repo contains notebooks for the Advanced Solutions Lab
Chat with your documents using local AI
21 Lessons, Get Started Building with Generative AI
Converts text to speech in real time
RGBD video generation model conditioned on camera input
Qwen2.5-VL is the multimodal large language model series
An unsupervised and free tool for image and video dataset analysis
Build AI-powered semantic search applications
An on-premises, OCR-free unstructured data extraction
Unified framework for building enterprise RAG pipelines
Agent S: an open agentic framework that uses computers like a human
Hindsight: Agent Memory That Learns
Build portable, production-ready MLOps pipelines
Replace OpenAI GPT with another LLM in your app
Chat & pretrained large audio language model proposed by Alibaba Cloud
SkyPilot: Run AI and batch jobs on any infra
Flowly is 100x faster than OpenClaw
VGGSfM: Visual Geometry Grounded Deep Structure From Motion
Dockerized FastAPI wrapper for Kokoro-82M text-to-speech model
Tooling for the Common Objects In 3D dataset
An MLOps framework to package, deploy, monitor and manage models
Scripts for fine-tuning Meta Llama3 with composable FSDP & PEFT methods
DeepVariant is an analysis pipeline that uses a deep neural network
OpenRecall is a fully open-source, privacy-first alternative