audio video streaming server source code free download

Qwen2.5-Omni

Capable of understanding text, audio, vision, video

Qwen2.5-Omni is an end-to-end multimodal flagship model in the Qwen series by Alibaba Cloud, designed to process multiple modalities (text, images, audio, video) and to generate responses as both text and natural speech in real-time streaming. It uses a “Thinker-Talker” architecture and introduces innovations for aligning modalities over time (for example, synchronizing video and audio), robust speech generation, and low-VRAM/quantized versions that make the model more accessible. It holds...

Downloads: 0 This Week

Last Update: 2025-09-23

Qwen3-Omni

Qwen3-Omni is a natively end-to-end, omni-modal LLM

...It achieves state-of-the-art results: across 36 audio and audio-visual benchmarks, it reaches open-source SOTA on 32 and overall SOTA on 22, outperforming or matching strong closed-source models such as Gemini-2.5 Pro and GPT-4o. To reduce latency, especially in audio/video streaming, the Talker predicts discrete speech codecs via a multi-codebook scheme, replacing heavier diffusion approaches.

Downloads: 6 This Week

Last Update: 2026-01-08

NExT-GPT

Code and models for ICML 2024 paper, NExT-GPT

NExT-GPT is an open-source research framework that implements an advanced multimodal large language model capable of understanding and generating content across multiple modalities. Unlike traditional models that primarily handle text, NExT-GPT supports input and output combinations involving text, images, video, and audio in a unified architecture. The system connects a large language model with multimodal encoders and diffusion-based decoders so it can interpret information from different...

Downloads: 0 This Week

Last Update: 2026-03-05

Deep Lake

Data Lake for Deep Learning. Build, manage, and query datasets

Deep Lake (formerly known as Activeloop Hub) is a data lake for deep learning applications. Our open-source dataset format is optimized for rapid streaming and querying of data while training models at scale, and it includes a simple API for creating, storing, and collaborating on AI datasets of any size. It can be deployed locally or in the cloud, and it enables you to store all of your data in one place, ranging from simple annotations to large videos. Deep Lake is used by Google, Waymo,...

Downloads: 0 This Week

Last Update: 2026-02-12

Search Results for "audio video streaming server source code"

Showing 4 open source projects for "audio video streaming server source code"

