Showing 1864 open source projects for "linux-gnome"

View related business solutions
  • Our Free Plans just got better! | Auth0 Icon
    Our Free Plans just got better! | Auth0

    With up to 25k MAUs and unlimited Okta connections, our Free Plan lets you focus on what you do best—building great apps.

    You asked, we delivered! Auth0 is excited to expand our Free and Paid plans to include more options so you can focus on building, deploying, and scaling applications without having to worry about your security. Auth0 now, thank yourself later.
    Try free now
  • Cloud tools for web scraping and data extraction Icon
    Cloud tools for web scraping and data extraction

    Deploy pre-built tools that crawl websites, extract structured data, and feed your applications. Reliable web data without maintaining scrapers.

    Automate web data collection with cloud tools that handle anti-bot measures, browser rendering, and data transformation out of the box. Extract content from any website, push to vector databases for RAG workflows, or pipe directly into your apps via API. Schedule runs, set up webhooks, and connect to your existing stack. Free tier available, then scale as you need to.
    Explore 10,000+ tools
  • 1
    BentoML

    BentoML

    Unified Model Serving Framework

    BentoML simplifies ML model deployment and serves your models at a production scale. Support multiple ML frameworks natively: Tensorflow, PyTorch, XGBoost, Scikit-Learn and many more! Define custom serving pipeline with pre-processing, post-processing and ensemble models. Standard .bento format for packaging code, models and dependencies for easy versioning and deployment. Integrate with any training pipeline or ML experimentation platform. Parallelize compute-intense model inference...
    Downloads: 6 This Week
    Last Update:
    See Project
  • 2
    Bolna

    Bolna

    Conversational voice AI agents

    Bolna is an end-to-end open-source platform for building conversational voice AI agents, enabling developers to create voice-first conversational assistants efficiently.
    Downloads: 5 This Week
    Last Update:
    See Project
  • 3
    Instructor

    Instructor

    Structured outputs for llms

    Instructor is a tool that enables developers to extract structured data from natural language using Large Language Models (LLMs). Integrating with Python's Pydantic library allows users to define desired output structures through type hints, facilitating schema validation and seamless integration with IDEs. Instructor supports various LLM providers, including OpenAI, Anthropic, Litellm, and Cohere, offering flexibility in implementation. Its customizable nature permits the definition of...
    Downloads: 6 This Week
    Last Update:
    See Project
  • 4
    supervision

    supervision

    We write your reusable computer vision tools

    We write your reusable computer vision tools. Whether you need to load your dataset from your hard drive, draw detections on an image or video, or count how many detections are in a zone. You can count on us.
    Downloads: 5 This Week
    Last Update:
    See Project
  • Say goodbye to broken revenue funnels and poor customer experiences Icon
    Say goodbye to broken revenue funnels and poor customer experiences

    Connect and coordinate your data, signals, tools, and people at every step of the customer journey.

    LeanData is a Demand Management solution that supports all go-to-market strategies such as account-based sales development, geo-based territories, and more. LeanData features a visual, intuitive workflow native to Salesforce that enables users to view their entire lead flow in one interface. LeanData allows users to access the drag-and-drop feature to route their leads. LeanData also features an algorithms match that uses multiple fields in Salesforce.
    Learn More
  • 5
    MetaGPT

    MetaGPT

    The Multi-Agent Framework

    The Multi-Agent Framework: Given one line Requirement, return PRD, Design, Tasks, Repo. Assign different roles to GPTs to form a collaborative software entity for complex tasks. MetaGPT takes a one-line requirement as input and outputs user stories / competitive analysis/requirements/data structures / APIs / documents, etc. Internally, MetaGPT includes product managers/architects/project managers/engineers. It provides the entire process of a software company along with carefully orchestrated SOPs.
    Downloads: 6 This Week
    Last Update:
    See Project
  • 6
    WhisperLive

    WhisperLive

    A nearly-live implementation of OpenAI's Whisper

    WhisperLive is a “nearly live” implementation of OpenAI’s Whisper model focused on real-time transcription. It runs as a server–client system in which the server hosts a Whisper backend and clients stream audio to be transcribed with very low delay. The project supports multiple inference backends, including Faster-Whisper, NVIDIA TensorRT, and OpenVINO, allowing you to target GPUs and different CPU architectures efficiently. It can handle microphone input, pre-recorded audio files, and...
    Downloads: 6 This Week
    Last Update:
    See Project
  • 7
    StyleTTS 2

    StyleTTS 2

    Towards Human-Level Text-to-Speech through Style Diffusion

    StyleTTS2 is a state-of-the-art text-to-speech system that aims for human-level naturalness by combining style diffusion, adversarial training, and large speech language models. It extends the original StyleTTS idea by introducing a style diffusion model that can sample rich, realistic speaking styles conditioned on reference speech, allowing highly expressive and diverse prosody. The architecture uses a two-stage training process and leverages an auxiliary speech language model to guide...
    Downloads: 6 This Week
    Last Update:
    See Project
  • 8
    Janus

    Janus

    Unified Multimodal Understanding and Generation Models

    Janus is a sophisticated open-source project from DeepSeek AI that aims to unify both visual understanding and image generation in a single model architecture. Rather than having separate systems for “look and describe” and “prompt and generate”, Janus uses an autoregressive transformer framework with a decoupled visual encoder—allowing it to ingest images for comprehension and to produce images from text prompts with shared internal representations. The design tackles long-standing...
    Downloads: 6 This Week
    Last Update:
    See Project
  • 9
    Depth Pro

    Depth Pro

    Sharp Monocular Metric Depth in Less Than a Second

    Depth Pro is a foundation model for zero-shot metric monocular depth estimation, producing sharp, high-frequency depth maps with absolute scale from a single image. Unlike many prior approaches, it does not require camera intrinsics or extra metadata, yet still outputs metric depth suitable for downstream 3D tasks. Apple highlights both accuracy and speed: the model can synthesize a ~2.25-megapixel depth map in around 0.3 seconds on a standard GPU, enabling near real-time applications. The...
    Downloads: 6 This Week
    Last Update:
    See Project
  • Dragonfly | An In-Memory Data Store without Limits Icon
    Dragonfly | An In-Memory Data Store without Limits

    Dragonfly Cloud is engineered to handle the heaviest data workloads with the strictest security requirements.

    Dragonfly is a drop-in Redis replacement that is designed for heavy data workloads running on modern cloud hardware. Migrate in less than a day and experience up to 25X the performance on half the infrastructure.
    Learn More
  • 10
    DeepSeek Coder

    DeepSeek Coder

    DeepSeek Coder: Let the Code Write Itself

    DeepSeek-Coder is a series of code-specialized language models designed to generate, complete, and infill code (and mixed code + natural language) with high fluency in both English and Chinese. The models are trained from scratch on a massive corpus (~2 trillion tokens), of which about 87% is code and 13% is natural language. This dataset covers project-level code structure (not just line-by-line snippets), using a large context window (e.g. 16K) and a secondary fill-in-the-blank objective...
    Downloads: 6 This Week
    Last Update:
    See Project
  • 11
    Stable Diffusion Version 2

    Stable Diffusion Version 2

    High-Resolution Image Synthesis with Latent Diffusion Models

    Stable Diffusion (the stablediffusion repo by Stability-AI) is an open-source implementation and reference codebase for high-resolution latent diffusion image models that power many text-to-image systems. The repository provides code for training and running Stable Diffusion-style models, instructions for installing dependencies (with notes about performance libraries like xformers), and guidance on hardware/driver requirements for efficient GPU inference and training. It’s organized as a...
    Downloads: 6 This Week
    Last Update:
    See Project
  • 12
    LlamaParse

    LlamaParse

    Parse files for optimal RAG

    LlamaParse is a GenAI-native document parser that can parse complex document data for any downstream LLM use case (RAG, agents). Load in 160+ data sources and data formats, from unstructured, and semi-structured, to structured data (API's, PDFs, documents, SQL, etc.) Store and index your data for different use cases. Integrate with 40+ vector stores, document stores, graph stores, and SQL db providers.
    Downloads: 5 This Week
    Last Update:
    See Project
  • 13
    gpt-engineer

    gpt-engineer

    Full stack AI software engineer

    gpt-engineer is an open-source platform designed to help developers automate the software development process using natural language. The platform allows users to specify software requirements in plain language, and the AI generates and executes the corresponding code. It can also handle improvements and iterative development, giving users more control over the software they’re building. Built with a terminal-based interface, gpt-engineer is customizable, enabling developers to experiment...
    Downloads: 6 This Week
    Last Update:
    See Project
  • 14
    Diffusers

    Diffusers

    State-of-the-art diffusion models for image and audio generation

    Diffusers is the go-to library for state-of-the-art pretrained diffusion models for generating images, audio, and even 3D structures of molecules. Whether you're looking for a simple inference solution or training your own diffusion models, Diffusers is a modular toolbox that supports both. Our library is designed with a focus on usability over performance, simple over easy, and customizability over abstractions. State-of-the-art diffusion pipelines that can be run in inference with just a...
    Downloads: 6 This Week
    Last Update:
    See Project
  • 15
    Transformers

    Transformers

    State-of-the-art Machine Learning for Pytorch, TensorFlow, and JAX

    Transformers provides APIs and tools to easily download and train state-of-the-art pre-trained models. Using pre-trained models can reduce your compute costs, carbon footprint, and save you the time and resources required to train a model from scratch. These models support common tasks in different modalities. Text, for tasks like text classification, information extraction, question answering, summarization, translation, text generation, in over 100 languages. Images, for tasks like image...
    Downloads: 6 This Week
    Last Update:
    See Project
  • 16
    Llama Stack

    Llama Stack

    Composable building blocks to build Llama Apps

    Llama-Stack is an open-source framework designed to facilitate the deployment and fine-tuning of large language models (LLMs) for various natural language processing tasks.
    Downloads: 4 This Week
    Last Update:
    See Project
  • 17
    Real-Time Voice Cloning

    Real-Time Voice Cloning

    Clone a voice in 5 seconds to generate arbitrary speech in real-time

    Real-Time Voice Cloning is an influential deep-learning repository that demonstrates how to clone a voice from just a few seconds of audio and then generate arbitrary speech in that voice in near real time. It implements the SV2TTS pipeline (“Transfer Learning from Speaker Verification to Multispeaker Text-To-Speech Synthesis”) in three stages: a speaker encoder, a synthesizer, and a vocoder. In the first stage, short audio clips are converted into a fixed-dimensional speaker embedding that...
    Downloads: 6 This Week
    Last Update:
    See Project
  • 18
    CodeGeeX2

    CodeGeeX2

    CodeGeeX2: A More Powerful Multilingual Code Generation Model

    CodeGeeX2 is the second-generation multilingual code generation model from ZhipuAI, built upon the ChatGLM2-6B architecture and trained on 600B code tokens. Compared to the first generation, it delivers a significant boost in programming ability across multiple languages, outperforming even larger models like StarCoder-15B in some benchmarks despite having only 6B parameters. The model excels at code generation, translation, summarization, debugging, and comment generation, and it supports...
    Downloads: 6 This Week
    Last Update:
    See Project
  • 19
    AlphaGenome

    AlphaGenome

    Programmatic access to the AlphaGenome model

    The AlphaGenome API provides access to AlphaGenome, Google DeepMind’s unifying model for deciphering the regulatory code within DNA sequences. This repository contains client-side code, examples, and documentation to help you use the AlphaGenome API. AlphaGenome offers multimodal predictions, encompassing diverse functional outputs such as gene expression, splicing patterns, chromatin features, and contact maps. The model analyzes DNA sequences of up to 1 million base pairs in length and can...
    Downloads: 5 This Week
    Last Update:
    See Project
  • 20
    MedGemma

    MedGemma

    Collection of Gemma 3 variants that are trained for performance

    MedGemma is a collection of specialized open-source AI models created by Google as part of its Health AI Developer Foundations initiative, built on the Gemma 3 family of transformer models and trained for medical text and image comprehension tasks that help accelerate the development of healthcare-focused AI applications. It includes multiple variants such as a 4 billion-parameter multimodal model that can process both medical images and text and a 27 billion-parameter text-only (and...
    Downloads: 5 This Week
    Last Update:
    See Project
  • 21
    LTX-Video

    LTX-Video

    Official repository for LTX-Video

    LTX-Video is a sophisticated multimedia processing framework from Lightricks designed to handle high-quality video editing, compositing, and transformation tasks with performance and scalability. It provides runtime components that efficiently decode, encode, and manipulate video streams, frame buffers, and audio tracks while exposing a rich API for building customized editing features like transitions, effects, color grading, and keyframe automation. The toolkit is built with both real-time...
    Downloads: 5 This Week
    Last Update:
    See Project
  • 22
    HunyuanOCR

    HunyuanOCR

    OCR expert VLM powered by Hunyuan's native multimodal architecture

    HunyuanOCR is an open-source, end-to-end OCR (optical character recognition) Vision-Language Model (VLM) developed by Tencent‑Hunyuan. It’s designed to unify the entire OCR pipeline, detection, recognition, layout parsing, information extraction, translation, and even subtitle or structured output generation, into a single model inference instead of a cascade of separate tools. Despite being fairly lightweight (about 1 billion parameters), it delivers state-of-the-art performance across a...
    Downloads: 6 This Week
    Last Update:
    See Project
  • 23
    HexStrike AI MCP Agents

    HexStrike AI MCP Agents

    HexStrike AI MCP Agents is an advanced MCP server

    HexStrike AI is an MCP server that lets LLM agents autonomously operate a large catalog of offensive-security tools. Its goal is to bridge “language models” and practical pentest workflows—enumeration, exploitation, vulnerability discovery, and bug bounty reconnaissance—under safe, auditable controls. The server exposes typed tools and guardrails so agent prompts translate to concrete, parameterized actions rather than brittle shell strings. It ships with curated tool adapters, task...
    Downloads: 5 This Week
    Last Update:
    See Project
  • 24
    VMZ (Video Model Zoo)

    VMZ (Video Model Zoo)

    VMZ: Model Zoo for Video Modeling

    The codebase was designed to help researchers and practitioners quickly reproduce FAIR’s results and leverage robust pre-trained backbones for downstream tasks. It also integrates Gradient Blending, an audio-visual modeling method that fuses modalities effectively (available in the Caffe2 implementation). Although VMZ is now archived and no longer actively maintained, it remains a valuable reference for understanding early large-scale video model training, transfer learning, and multimodal...
    Downloads: 5 This Week
    Last Update:
    See Project
  • 25
    HunyuanVideo

    HunyuanVideo

    HunyuanVideo: A Systematic Framework For Large Video Generation Model

    HunyuanVideo is a cutting-edge framework designed for large-scale video generation, leveraging advanced AI techniques to synthesize videos from various inputs. It is implemented in PyTorch, providing pre-trained model weights and inference code for efficient deployment. The framework aims to push the boundaries of video generation quality, incorporating multiple innovative approaches to improve the realism and coherence of the generated content. Release of FP8 model weights to reduce GPU...
    Downloads: 5 This Week
    Last Update:
    See Project