Search Results for "video player linux" - Page 3

Showing 795 open source projects for "video player linux"

View related business solutions
  • AI-generated apps that pass security review Icon
    AI-generated apps that pass security review

    Stop waiting on engineering. Build production-ready internal tools with AI—on your company data, in your cloud.

    Retool lets you generate dashboards, admin panels, and workflows directly on your data. Type something like “Build me a revenue dashboard on my Stripe data” and get a working app with security, permissions, and compliance built in from day one. Whether on our cloud or self-hosted, create the internal software your team needs without compromising enterprise standards or control.
    Try Retool free
  • Easily Host LLMs and Web Apps on Cloud Run Icon
    Easily Host LLMs and Web Apps on Cloud Run

    Run everything from popular models with on-demand NVIDIA L4 GPUs to web apps without infrastructure management.

    Run frontend and backend services, batch jobs, host LLMs, and queue processing workloads without the need to manage infrastructure. Cloud Run gives you on-demand GPU access for hosting LLMs and running real-time AI—with 5-second cold starts and automatic scale-to-zero so you only pay for actual usage. New customers get $300 in free credit to start.
    Try Cloud Run Free
  • 1
    AV1 AVIF

    AV1 AVIF

    AV1 Image File Format Specification - ISO-BMFF/HEIF derivative

    AV1 AVIF is the official specification and reference design for the AV1 Image File Format (AVIF), defining how AV1-encoded bitstreams are packaged into the HEIF container format (based on ISOBMFF) to produce AVIF files. The project outlines the syntax and semantics required for AVIF compliance, including support for multiple image profiles, color depths, chroma subsampling modes, HDR/WCG, alpha channels, animation/image sequences, and various color-space/bit-depth combinations — making AVIF...
    Downloads: 9 This Week
    Last Update:
    See Project
  • 2
    Open Vision Agents by Stream

    Open Vision Agents by Stream

    Build Vision Agents quickly with any model or video provider

    Open Vision Agents by Stream is an open source framework from Stream for building real time, multimodal AI agents that watch, listen, and respond to live video streams. It focuses on combining video understanding models, such as YOLO and Roboflow based detectors, with real time large language models like OpenAI Realtime and Gemini Live to create interactive experiences. The framework uses Stream’s ultra low latency edge network so agents can join sessions quickly and maintain very low audio...
    Downloads: 3 This Week
    Last Update:
    See Project
  • 3
    WeMod Launcher

    WeMod Launcher

    Tool made to launch the popular Game Trainer / Cheat tool

    The WeMod Launcher is currently on version 1.499. Tool made to launch the popular Game Trainer / Cheat tool WeMod along with your game (made for steam-runtime version in Linux).
    Downloads: 51 This Week
    Last Update:
    See Project
  • 4
    TikTok-ViewBot

    TikTok-ViewBot

    ViewBot using requests updated 2025

    TikTok-ViewBot explores automated interactions with TikTok’s viewing mechanisms for research and educational purposes. The code demonstrates how scripted traffic might be generated and measured, highlighting the kinds of heuristics a platform could use to validate or discount views. It is often used to study rate limits, signature schemes, request patterns, and the fragility of naïve automation. Because it touches on automation against a third-party service, responsible use and adherence to...
    Downloads: 65 This Week
    Last Update:
    See Project
  • Go from Data Warehouse to Data and AI platform with BigQuery Icon
    Go from Data Warehouse to Data and AI platform with BigQuery

    Build, train, and run ML models with simple SQL. Automate data prep, analysis, and predictions with built-in AI assistance from Gemini.

    BigQuery is more than a data warehouse—it's an autonomous data-to-AI platform. Use familiar SQL to train ML models, run time-series forecasts, and generate AI-powered insights with native Gemini integration. Built-in agents handle data engineering and data science workflows automatically. Get $300 in free credit, query 1 TB, and store 10 GB free monthly.
    Try BigQuery Free
  • 5
    MagicTime

    MagicTime

    Time-lapse Video Generation Models as Metamorphic Simulators

    This repository is the official implementation of MagicTime, a metamorphic video generation pipeline based on the given prompts. The main idea is to enhance the capacity of video generation models to accurately depict the real world through our proposed methods and dataset. Compared to general videos, metamorphic videos contain physical knowledge, long persistence, and strong variation, making them difficult to generate.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 6
    Memvid

    Memvid

    Video-based AI memory library. Store millions of text chunks in MP4

    Memvid encodes text chunks as QR codes within MP4 frames to build a portable “video memory” for AI systems. This innovative approach uses standard video containers and offers millisecond-level semantic search across large corpora with dramatically less storage than vector DBs. It's self-contained—no DB needed—and supports features like PDF indexing, chat integration, and cloud dashboards.
    Downloads: 1 This Week
    Last Update:
    See Project
  • 7
    VideoRAG

    VideoRAG

    "VideoRAG: Chat with Your Videos

    VideoRAG is a retrieval-augmented generation (RAG) framework tailored for video content that enables AI systems to answer questions, summarize, and reason over long videos by combining visual embeddings with contextual search. The system works by first breaking video into clips, extracting visual and audio-textual features, and indexing them into embeddings, then using an LLM with a retriever to pull relevant segments on demand. When a user query is received, VideoRAG locates semantically...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 8
    FeelUOwn

    FeelUOwn

    Trying to be a robust, user-friendly and hackable music player

    FeelUOwn is a user-friendly, and hackable music player.
    Downloads: 1 This Week
    Last Update:
    See Project
  • 9
    FramePack

    FramePack

    Lets make video diffusion practical

    FramePack explores compact representations for sequences of image frames, targeting tasks where many near-duplicate frames carry redundant information. The idea is to “pack” frames by detecting shared structure and storing differences efficiently, which can accelerate training or inference on video-like data. By reducing I/O and memory bandwidth, datasets become lighter to load while models still see the essential temporal variation. The repository demonstrates both packing and unpacking...
    Downloads: 9 This Week
    Last Update:
    See Project
  • Cut Cloud Costs with Google Compute Engine Icon
    Cut Cloud Costs with Google Compute Engine

    Save up to 91% with Spot VMs and get automatic sustained-use discounts. One free VM per month, plus $300 in credits.

    Save on compute costs with Compute Engine. Reduce your batch jobs and workload bill 60-91% with Spot VMs. Compute Engine's committed use offers customers up to 70% savings through sustained use discounts. Plus, you get one free e2-micro VM monthly and $300 credit to start.
    Try Compute Engine
  • 10
    Tauon

    Tauon

    The music player of today

    Tauon is a modern, streamlined music player app that's packed with features! An emphasis on playlists and drag-and-drop importing puts you in control of your music library. Faded volume control, 24-bit FLAC support, and gapless playback provide the ultimate listening experience. Excellent CUE sheet support, an original smart playlist system, and network playback from koel or Airsonic servers. Last.fm, Listenbrainz, and Maloja scribbling. MPRIS2 support for desktop integration. Tauon is a...
    Downloads: 5 This Week
    Last Update:
    See Project
  • 11
    Mihomo

    Mihomo

    A simple Python Pydantic model for Honkai

    Mihomo is a Python client library leveraging Pydantic to model parsed Honkai: Star Rail user data from the Mihomo public API. It provides structured types, type hints, and convenience methods to fetch and transform player profiles, daily stats, and character details efficiently.
    Downloads: 49 This Week
    Last Update:
    See Project
  • 12
    Qwen2.5-Omni

    Qwen2.5-Omni

    Capable of understanding text, audio, vision, video

    Qwen2.5-Omni is an end-to-end multimodal flagship model in the Qwen series by Alibaba Cloud, designed to process multiple modalities (text, images, audio, video) and generate responses both as text and natural speech in streaming real-time. It supports “Thinker-Talker” architecture, and introduces innovations for aligning modalities over time (for example synchronizing video/audio), robust speech generation, and low-VRAM/quantized versions to make usage more accessible. It holds...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 13
    BlogWizard

    BlogWizard

    Generate blog articles from video or audio

    BlogWizard is a demo/utility project built on top of Groq’s LLM infrastructure that converts video or audio content into well-structured blog posts, enabling creators to repurpose multimedia content into text — useful for SEO, accessibility, or reaching audiences that prefer reading. The tool uses transcription (e.g. via Whisper) to extract text from audio/video, then runs an LLM-based generation pipeline to transform that content into coherent, readable blog-format posts — with sections,...
    Downloads: 1 This Week
    Last Update:
    See Project
  • 14
    Perception Models

    Perception Models

    State-of-the-art Image & Video CLIP, Multimodal Large Language Models

    Perception Models is a state-of-the-art framework developed by Facebook Research for advanced image and video perception tasks. It introduces two primary components: the Perception Encoder (PE) for visual feature extraction and the Perception Language Model (PLM) for multimodal decoding and reasoning. The PE module is a family of vision encoders designed to excel in image and video understanding, surpassing models like SigLIP2, InternVideo2, and DINOv2 across multiple benchmarks. Meanwhile,...
    Downloads: 1 This Week
    Last Update:
    See Project
  • 15
    Sa2VA

    Sa2VA

    Official Repo For "Sa2VA: Marrying SAM2 with LLaVA

    Sa2VA is a cutting-edge open-source multi-modal large language model (MLLM) developed by ByteDance that unifies dense segmentation, visual understanding, and language-based reasoning across both images and videos. It merges the segmentation power of a state-of-the-art video segmentation model (based on SAM‑2) with the vision-language reasoning capabilities of a strong LLM backbone (derived from models like InternVL2.5 / Qwen-VL series), yielding a system that can answer questions about...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 16
    Persepolis Download Manager

    Persepolis Download Manager

    Persepolis Download Manager is a GUI for aria2

    Persepolis is a download manager & a GUI for Aria2. It's written in Python. Persepolis is a sample of free and open source software. It's developed for GNU/Linux distributions, BSDs, MacOS, and Microsoft Windows.
    Downloads: 31 This Week
    Last Update:
    See Project
  • 17
    LatentSync

    LatentSync

    Taming Stable Diffusion for Lip Sync

    LatentSync is an open-source framework from ByteDance that produces high-quality lip-synchronization for video by using an audio-conditioned latent diffusion model, bypassing traditional intermediate motion representations. In effect, given a source video (with masked or reference frames) and an audio track, LatentSync directly generates frames whose lip motions and expressions align with the audio, producing convincing talking-head or animated lip-sync output. The system leverages a U-Net...
    Downloads: 1 This Week
    Last Update:
    See Project
  • 18
    CogVLM2

    CogVLM2

    GPT4V-level open-source multi-modal model based on Llama3-8B

    CogVLM2 is the second generation of the CogVLM vision-language model series, developed by ZhipuAI and released in 2024. Built on Meta-Llama-3-8B-Instruct, CogVLM2 significantly improves over its predecessor by providing stronger performance across multimodal benchmarks such as TextVQA, DocVQA, and ChartQA, while introducing extended context length support of up to 8K tokens and high-resolution image input up to 1344×1344. The series includes models for both image understanding and video...
    Downloads: 2 This Week
    Last Update:
    See Project
  • 19
    Douyin TikTok Download API

    Douyin TikTok Download API

    Douyin TikTok Download API

    Use the official interface to capture Douyin|TikTok data, support API calls, Web portals, and batch analysis. Fast, asynchronous, free, open source, ad-free, long-term maintenance. This project is based on PyWebIO , FastAPI , HTTPX , a fast and asynchronous Douyin / TikTok data crawling tool, and realizes online batch parsing and downloading of watermark-free videos or atlases through the web, data crawling API, and iOS shortcut instructions for watermark-free download and other functions....
    Downloads: 0 This Week
    Last Update:
    See Project
  • 20
    Shiptest

    Shiptest

    The Shiptest Codebase

    Shiptest is an open source fork of Space Station 13 that replaces the traditional single-station gameplay with multiple player-controlled ships. Instead of being confined to one station, players can design, operate, and explore with their own ships in a shared space environment. The repository contains full source code, assets, and maps to host or develop servers. Shiptest introduces new mechanics around ship construction, navigation, and resource management, creating a sandbox that...
    Downloads: 3 This Week
    Last Update:
    See Project
  • 21
    Quod Libet

    Quod Libet

    Music player and music library manager for Linux, Windows, and macOS

    Quod Libet is a cross-platform audio/music management program. It provides many ways to view your local library, and supports streaming audio and feeds (podcasts, etc). It has extremely flexible metadata editing and searching capabilities. With over 90 plugins included, you can extend and integrate with almost anything, or write your own! Ex Falso is a bare-bones tag editor with the same editing interface as Quod Libet. Quod Libet is a GTK+-based audio player written in Python, using the...
    Downloads: 2 This Week
    Last Update:
    See Project
  • 22
    LingBot-World

    LingBot-World

    Advancing Open-source World Models

    LingBot-World is an open-source, high-fidelity world simulator designed to advance the state of world models through video generation. Built on top of Wan2.2, it enables realistic, dynamic environment simulation across diverse styles, including real-world, scientific, and stylized domains. LingBot-World supports long-term temporal consistency, maintaining coherent scenes and interactions over minute-level horizons. With real-time interactivity and sub-second latency at 16 FPS, it is...
    Downloads: 28 This Week
    Last Update:
    See Project
  • 23
    MiniMax-MCP

    MiniMax-MCP

    Official MiniMax Model Context Protocol (MCP) server

    MiniMax-MCP is the official Model Context Protocol (MCP) server for accessing MiniMax’s multimodal generative APIs from MCP-compatible clients. It acts as a bridge between tools like Claude Desktop, Cursor, Windsurf, OpenAI Agents, and the MiniMax platform, exposing capabilities such as text-to-speech, voice cloning, image generation, text-to-image, video generation, image-to-video, text-to-video, and music generation. The server is written in Python and distributed under the MIT license,...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 24
    labelme Image Polygonal Annotation

    labelme Image Polygonal Annotation

    Image polygonal annotation with Python

    Labelme is a graphical image annotation tool. It is written in Python and uses Qt for its graphical interface. Image annotation for polygon, rectangle, circle, line and point. Image flag annotation for classification and cleaning. Video annotation. (video annotation). GUI customization (predefined labels / flags, auto-saving, label validation, etc). Exporting VOC-format dataset for semantic/instance segmentation. (semantic segmentation, instance segmentation). Exporting COCO-format dataset...
    Downloads: 8 This Week
    Last Update:
    See Project
  • 25
    SteadyDancer

    SteadyDancer

    Harmonized and Coherent Human Image Animation

    SteadyDancer is a research-oriented motion stabilization and dancer tracking system designed to analyze and correct motion in videos, making captured performances appear smoother and more stable while preserving expressiveness. It employs computer vision and motion modeling to estimate and reduce unwanted jitters, shakes, or camera wobbles — particularly in dance or movement sequences where traditional smoothing would distort intentional motion. By differentiating between intentional...
    Downloads: 2 This Week
    Last Update:
    See Project
MongoDB Logo MongoDB