Showing 228 open source projects for "open source kms"

View related business solutions
  • Secure File Transfer for Windows with Cerberus by Redwood Icon
    Secure File Transfer for Windows with Cerberus by Redwood

    Protect and share files over FTP/S, SFTP, HTTPS and SCP with the #1 rated Windows file transfer server.

    Cerberus supports unlimited users and connections on a single IP, with built-in encryption, 2FA, and a browser-based web client — all deployable in under 15 minutes with a 25-day free trial.
    Try for Free
  • Build Agents and Models on One Platform Icon
    Build Agents and Models on One Platform

    Everything you need to build production-ready agents and models. Access 200+ Google and third-party AI models and tools.

    Gemini Enterprise Agent Platform is Google Cloud's comprehensive platform for developers to build, scale, govern, and optimize agents and models. Choose from Google's most advanced models and third-party models like Anthropic's Claude Model Family.
    Try It Free
  • 1
    Step-Video-T2V

    Step-Video-T2V

    State-of-the-art (SoTA) text-to-video pre-trained model

    ...Its training and generation pipeline includes techniques like flow-matching, full 3D attention for temporal consistency, and fine-tuning approaches (e.g. video-based DPO) to improve fidelity and reduce artifacts. As a result, Step-Video-T2V aims to push the frontier of open-source video generation.
    Downloads: 3 This Week
    Last Update:
    See Project
  • 2
    Perception Models

    Perception Models

    State-of-the-art Image & Video CLIP, Multimodal Large Language Models

    Perception Models is a state-of-the-art framework developed by Facebook Research for advanced image and video perception tasks. It introduces two primary components: the Perception Encoder (PE) for visual feature extraction and the Perception Language Model (PLM) for multimodal decoding and reasoning. The PE module is a family of vision encoders designed to excel in image and video understanding, surpassing models like SigLIP2, InternVideo2, and DINOv2 across multiple benchmarks. Meanwhile,...
    Downloads: 3 This Week
    Last Update:
    See Project
  • 3
    Phi-3-MLX

    Phi-3-MLX

    Phi-3.5 for Mac: Locally-run Vision and Language Models

    Phi-3-Vision-MLX is an Apple MLX (machine learning on Apple silicon) implementation of Phi-3 Vision, a lightweight multi-modal model designed for vision and language tasks. It focuses on running vision-language AI efficiently on Apple hardware like M1 and M2 chips.
    Downloads: 2 This Week
    Last Update:
    See Project
  • 4
    Fara-7B

    Fara-7B

    An Efficient Agentic Model for Computer Use

    Fara-7B is a Microsoft initiative aimed at bringing rigor, transparency, and structured evaluation to AI systems through automated and customizable assessment frameworks. It provides stakeholders with a way to benchmark and evaluate models across dimensions such as fairness, robustness, security, privacy, and ethical considerations. Rather than relying on ad-hoc or manual review processes, FARA enables organizations to profile AI behavior using standardized tests, metrics, and reporting...
    Downloads: 3 This Week
    Last Update:
    See Project
  • Full-stack observability with actually useful AI | Grafana Cloud Icon
    Full-stack observability with actually useful AI | Grafana Cloud

    Our generous forever free tier includes the full platform, including the AI Assistant, for 3 users with 10k metrics, 50GB logs, and 50GB traces.

    Built on open standards like Prometheus and OpenTelemetry, Grafana Cloud includes Kubernetes Monitoring, Application Observability, Incident Response, plus the AI-powered Grafana Assistant. Get started with our generous free tier today.
    Create free account
  • 5
    Qwen-Image-Layered

    Qwen-Image-Layered

    Qwen-Image-Layered: Layered Decomposition for Inherent Editablity

    Qwen-Image-Layered is an extension of the Qwen series of multimodal models that introduces layered image understanding, enabling the model to reason about hierarchical visual structures — such as separating foreground, background, objects, and contextual layers within an image. This architecture allows richer semantic interpretation, enabling use cases such as scene decomposition, object-level editing, layered captioning, and more fine-grained multimodal reasoning than with flat image...
    Downloads: 4 This Week
    Last Update:
    See Project
  • 6
    IndexTTS2

    IndexTTS2

    Industrial-level controllable zero-shot text-to-speech system

    ...The system supports zero-shot voice cloning — meaning it can mimic a target speaker’s voice from a short reference sample — making it versatile for multi-voice uses. Compared to many open-source TTS tools, IndexTTS emphasizes efficiency and controllability: it offers faster inference, simpler training pipelines, and controllable speech parameters (like duration, pitch, and prosody), which is critical for production use.
    Downloads: 4 This Week
    Last Update:
    See Project
  • 7
    Anthropic SDK Python

    Anthropic SDK Python

    Provides convenient access to the Anthropic REST API from any Python 3

    The anthropic-sdk-python repository is the official Python client library for interacting with the Anthropic (Claude) REST API. It is designed to provide a user-friendly, type-safe, and asynchronous/synchronous capable interface for making chat/completion requests to models like Claude. The library includes definitions for all request and response parameters using Python typed objects, automatically handles serialization and deserialization, and wraps HTTP logic (timeouts, retries, error...
    Downloads: 4 This Week
    Last Update:
    See Project
  • 8
    Depth Anything 3

    Depth Anything 3

    Recovering the Visual Space from Any Views

    Depth Anything 3 is a research-driven project that brings accurate and dense depth estimation to any input image or video, enabling foundational understanding of 3D structure from 2D visual content. Designed to work across diverse scenes, lighting conditions, and image types, it uses advanced neural networks trained on large, heterogeneous datasets, producing depth maps that reveal scene depth relationships and object surfaces with strong fidelity. The model can be applied to photography,...
    Downloads: 3 This Week
    Last Update:
    See Project
  • 9
    Claude Code SDK Python

    Claude Code SDK Python

    Python SDK for Claude Agent

    claude-code-sdk-python is the Python SDK for Claude Code, Anthropic’s agentic coding system. It provides abstractions to easily query Claude Code (with streaming support) and conduct interactive sessions. The SDK includes core client classes, asynchronous query functions, and support for custom tools and hooks within Claude sessions. It is designed to integrate with local Python workflows and allow developers to embed Claude Code capabilities directly in their applications or scripts. The...
    Downloads: 3 This Week
    Last Update:
    See Project
  • Go from Code to Production URL in Seconds Icon
    Go from Code to Production URL in Seconds

    Cloud Run deploys apps in any language instantly. Scales to zero. Pay only when code runs.

    Skip the Kubernetes configs. Cloud Run handles HTTPS, scaling, and infrastructure automatically. Two million requests free per month.
    Try it free
  • 10
    Protenix

    Protenix

    A trainable PyTorch reproduction of AlphaFold 3

    Protenix is an open-source, trainable PyTorch reimplementation of AlphaFold 3, developed by ByteDance with the goal of democratizing high-accuracy protein structure prediction for computational biology and drug-discovery research. Protenix provides a complete pipeline for turning protein sequences (with optional MSA / sequence alignment) or structural inputs (e.g.
    Downloads: 2 This Week
    Last Update:
    See Project
  • 11
    CodeGeeX2

    CodeGeeX2

    CodeGeeX2: A More Powerful Multilingual Code Generation Model

    CodeGeeX2 is the second-generation multilingual code generation model from ZhipuAI, built upon the ChatGLM2-6B architecture and trained on 600B code tokens. Compared to the first generation, it delivers a significant boost in programming ability across multiple languages, outperforming even larger models like StarCoder-15B in some benchmarks despite having only 6B parameters. The model excels at code generation, translation, summarization, debugging, and comment generation, and it supports...
    Downloads: 4 This Week
    Last Update:
    See Project
  • 12
    DINOv2

    DINOv2

    PyTorch code and models for the DINOv2 self-supervised learning

    DINOv2 is a self-supervised vision learning framework that produces strong, general-purpose image representations without using human labels. It builds on the DINO idea of student–teacher distillation and adapts it to modern Vision Transformer backbones with a carefully tuned recipe for data augmentation, optimization, and multi-crop training. The core promise is that a single pretrained backbone can transfer well to many downstream tasks—from linear probing on classification to retrieval,...
    Downloads: 3 This Week
    Last Update:
    See Project
  • 13
    Tiktoken

    Tiktoken

    tiktoken is a fast BPE tokeniser for use with OpenAI's models

    tiktoken is a high-performance, tokenizer library (based on byte-pair encoding, BPE) designed for use with OpenAI’s models. It handles encoding and decoding text to token IDs efficiently, with minimal overhead. Because tokenization is a fundamental step in preparing text for models, tiktoken is optimized for speed, memory, and correctness in model contexts (e.g. matching OpenAI’s internal tokenization). The repo supports multiple encodings (e.g. “cl100k_base”) and lets users switch encoding...
    Downloads: 3 This Week
    Last Update:
    See Project
  • 14
    CogVLM2

    CogVLM2

    GPT4V-level open-source multi-modal model based on Llama3-8B

    ...The series includes models for both image understanding and video understanding, with CogVLM2-Video supporting up to 1-minute videos by analyzing keyframes. It supports bilingual interaction (Chinese and English) and has open-source versions optimized for dialogue and video comprehension. Notably, the Int4 quantized version allows efficient inference on GPUs with only 16GB of memory. The repository offers demos, API servers, fine-tuning examples, and integration with OpenAI API-compatible endpoints, making it accessible for both researchers and developers.
    Downloads: 1 This Week
    Last Update:
    See Project
  • 15
    MOSS-TTS Family

    MOSS-TTS Family

    MOSS‑TTS Family opensource speech and sound generation model

    MOSS-TTS is an open-source speech and sound generation model family built for high-fidelity, expressive, and production-oriented audio workflows. It covers long-form speech, voice cloning, multi-speaker dialogue, voice design, environmental sound effects, and real-time streaming TTS. The project is designed for complex real-world use cases where a single speech model may not be enough.
    Downloads: 1 This Week
    Last Update:
    See Project
  • 16
    MedGemma

    MedGemma

    Collection of Gemma 3 variants that are trained for performance

    MedGemma is a collection of specialized open-source AI models created by Google as part of its Health AI Developer Foundations initiative, built on the Gemma 3 family of transformer models and trained for medical text and image comprehension tasks that help accelerate the development of healthcare-focused AI applications. It includes multiple variants such as a 4 billion-parameter multimodal model that can process both medical images and text and a 27 billion-parameter text-only (and multimodal) model that offers deeper clinical reasoning and understanding at higher capacity, making it suitable for complex tasks like medical question answering, summarization of clinical notes, or generating reports from radiology images. ...
    Downloads: 1 This Week
    Last Update:
    See Project
  • 17
    HunyuanWorld 1.0

    HunyuanWorld 1.0

    Generating Immersive, Explorable, and Interactive 3D Worlds

    HunyuanWorld-1.0 is an open-source, simulation-capable 3D world generation model developed by Tencent Hunyuan that creates immersive, explorable, and interactive 3D environments from text or image inputs. It combines the strengths of video-based diversity and 3D-based geometric consistency through a novel framework using panoramic world proxies and semantically layered 3D mesh representations.
    Downloads: 3 This Week
    Last Update:
    See Project
  • 18
    MiMo-V2.5-ASR

    MiMo-V2.5-ASR

    Robust Speech Recognition Across Languages, Dialects

    MiMo-V2.5-ASR is an advanced automatic speech recognition system developed as part of Xiaomi’s MiMo AI ecosystem. It is designed to handle complex acoustic environments, including noisy conditions and diverse speaker variations. The model supports multiple languages and dialects, enabling robust transcription across global use cases. It leverages modern deep learning architectures to improve accuracy and adaptability in real-world scenarios. The system is built to integrate with broader AI...
    Downloads: 2 This Week
    Last Update:
    See Project
  • 19
    GLM-OCR

    GLM-OCR

    Accurate × Fast × Comprehensive

    GLM-OCR is an open-source multimodal optical character recognition (OCR) model built on a GLM-V encoder–decoder foundation that brings robust, accurate document understanding to complex real-world layouts and modalities. Designed to handle text recognition, table parsing, formula extraction, and general information retrieval from documents containing mixed content, GLM-OCR excels across major benchmarks while remaining highly efficient with a relatively compact parameter size (~0.9B), enabling deployment in high-concurrency services and edge environments. ...
    Downloads: 1 This Week
    Last Update:
    See Project
  • 20
    HY-Motion 1.0

    HY-Motion 1.0

    HY-Motion model for 3D character animation generation

    HY-Motion 1.0 is an open-source, large-scale AI model suite developed by Tencent’s Hunyuan team that generates high-quality 3D human motion from simple text prompts, enabling the automatic production of fluid, diverse, and semantically accurate animations without manual keyframing or rigging. Built on advanced deep learning architectures that combine Diffusion Transformer (DiT) and flow matching techniques, HY-Motion scales these approaches to the billion-parameter level, resulting in strong instruction-following capabilities and richer motion outputs compared to existing open-source models. ...
    Downloads: 1 This Week
    Last Update:
    See Project
  • 21
    Step-Audio

    Step-Audio

    Open-source framework for intelligent speech interaction

    Step-Audio is a unified, open-source framework aimed at building intelligent speech systems that combine both comprehension and generation: it integrates large language models (LLMs) with speech input/output to handle not only semantic understanding but also rich vocal characteristics like tone, style, dialect, emotion, and prosody. The design moves beyond traditional separate-component pipelines (ASR → text model → TTS), instead offering a multimodal model that ingests speech or audio and produces speech accordingly, enabling natural dialogue, voice cloning, and expressive speech synthesis. ...
    Downloads: 1 This Week
    Last Update:
    See Project
  • 22
    SeedVR

    SeedVR

    Repo for SeedVR2 & SeedVR

    SeedVR (from the ByteDance-Seed organization) is an open-source research and implementation repository focused on cutting-edge video restoration using diffusion transformer architectures. The project includes both the original SeedVR and its successor SeedVR2 models, which are designed to restore degraded or low-quality video content by learning to reconstruct high-fidelity frames with temporal coherence.
    Downloads: 1 This Week
    Last Update:
    See Project
  • 23
    OpenAI Harmony

    OpenAI Harmony

    Renderer for the harmony response format to be used with gpt-oss

    Harmony is a response format developed by OpenAI for use with the gpt-oss model series. It defines a structured way for language models to produce outputs, including regular text, reasoning traces, tool calls, and structured data. By mimicking the OpenAI Responses API, Harmony provides developers with a familiar interface while enabling more advanced capabilities such as multiple output channels, instruction hierarchies, and tool namespaces. The format is essential for ensuring gpt-oss...
    Downloads: 2 This Week
    Last Update:
    See Project
  • 24
    Surya

    Surya

    Implementation of the Surya Foundation Model for Heliophysics

    Surya is an opensource, AI‑based foundation model for heliophysics developed collaboratively by NASA (via the IMPACT AI team) and IBM. Named after the Sanskrit word for “sun,” Surya is trained on nine years of high‑resolution solar imagery from NASA’s Solar Dynamics Observatory (SDO). It is designed to forecast solar phenomena—such as flares, solar wind, irradiance, and active region behavior—by predicting future solar images with a sophisticated long–short vision transformer architecture, thereby enabling improved space weather forecasting. ...
    Downloads: 1 This Week
    Last Update:
    See Project
  • 25
    Lyra 2

    Lyra 2

    Project Lyra: Open Generative 3D World Models

    The Lyra 2 project is a research-driven framework developed by NVIDIA that focuses on building open generative 3D world models using advanced diffusion-based techniques. It enables the creation of fully explorable 3D environments from minimal inputs such as a single image or video, leveraging self-distillation methods to generate consistent spatial representations. The system evolves across versions, with newer iterations introducing long-horizon generation and improved 3D consistency across...
    Downloads: 1 This Week
    Last Update:
    See Project
MongoDB Logo MongoDB