MiniCPM-o
A GPT-4o Level MLLM for Vision, Speech and Multimodal Live Streaming
...With 8 billion parameters, MiniCPM-o 2.6 surpasses its predecessors in versatility and efficiency, making it one of the most robust models available. It accepts both text and audio inputs, and its output capabilities include voice cloning, emotion control, and interactive role-playing.