Instructions for using the Realtime API on microcontrollers
A Powerful Native Multimodal Model for Image Generation
Open-source multi-speaker long-form text-to-speech model
Blazeface is a lightweight model that detects faces in images
A CNN model that predicts human joints from RGB images of a person
Detect faces in an image
Ultra-efficient 3B multimodal instruct model built for edge deployment
Compact 8B multimodal instruct model optimized for edge deployment
Efficient 8B multimodal model tuned for advanced reasoning tasks