State-of-the-art Image & Video CLIP, Multimodal Large Language Models
Mixture-of-Experts Vision-Language Models for Advanced Multimodal
Large Multimodal Models for Video Understanding and Editing
Implementation of 'lightweight' GAN, proposed in ICLR 2021
Audio foundation model excelling in audio understanding
A powerful web scraper powered by LLM | OpenAI, Gemini & Ollama
Open-source MCP server that gives your coding agent
Uncommon Objects in 3D dataset
Any model. Any hardware. Zero compromise
AI tool for automating desktop tasks via natural language input
Open source RAG framework for building scalable modular AI apps
Sample applications for Google Kubernetes Engine (GKE)
Agent-ready RPA suite with visual workflow automation tools engine
SpikingJelly is an open-source deep learning framework
Visual intelligence for your home.
An open-source, modern-design AI training tracking and visualization
AI-Powered Wiki Generator for GitHub/GitLab/Bitbucket Repositories
Foundation model for image generation
Hunyuan Translation Model Version 1.5
Tool for exploring and debugging transformer model behaviors
Conditional GAN for generating synthetic tabular data
A Model Context Protocol server that provides network asset info
ChatGPT interface with better UI
DeepMind model for tracking arbitrary points across videos & robotics
Tooling for the Common Objects In 3D dataset