Showing 136 open source projects for "space"

  • 1
    whisper.cpp

    Port of OpenAI's Whisper model in C/C++

    ...The command downloads the base.en model, converted to a custom ggml format, and runs inference on all .wav samples in the samples folder. whisper.cpp supports integer quantization of the Whisper ggml models. Quantized models require less memory and disk space and, depending on the hardware, can be processed more efficiently.
    Downloads: 446 This Week
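
The integer quantization mentioned in the whisper.cpp entry can be illustrated with a minimal sketch. This is not the ggml file format or whisper.cpp's actual scheme; it assumes a simple symmetric 8-bit quantization of one block of weights, and the names `quantize_block` and `dequantize_block` are hypothetical.

```python
def quantize_block(values, bits=8):
    """Symmetric integer quantization of one block of floats (illustrative only)."""
    qmax = 2 ** (bits - 1) - 1                    # 127 for 8 bits
    peak = max(abs(v) for v in values)
    scale = peak / qmax if peak > 0 else 1.0      # one float scale per block
    quants = [round(v / scale) for v in values]   # small integers in [-qmax, qmax]
    return scale, quants


def dequantize_block(scale, quants):
    """Recover approximate floats from the stored integers and the block scale."""
    return [q * scale for q in quants]


# Made-up weights standing in for one block of model parameters.
weights = [0.12, -0.5, 0.33, 0.0, 0.91, -0.77, 0.05, -0.02]
scale, quants = quantize_block(weights)
restored = dequantize_block(scale, quants)
max_error = max(abs(a - b) for a, b in zip(weights, restored))
print(quants, max_error)
```

Storing one 8-bit integer per weight plus a single scale per block takes roughly a quarter of the space of 32-bit floats, at the cost of a rounding error of at most half a quantization step per weight.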
  • 2
    CLIP

    CLIP, Predict the most relevant text snippet given an image

    CLIP (Contrastive Language-Image Pretraining) is a neural model that links images and text in a shared embedding space, allowing zero-shot image classification, similarity search, and multimodal alignment. It was trained on large sets of (image, caption) pairs using a contrastive objective: images and their matching text are pulled together in embedding space, while mismatches are pushed apart. Once trained, you can give it any text labels and ask it to pick which label best matches a given image—even without explicit training for that classification task. ...
    Downloads: 2 This Week
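
The zero-shot step described in the CLIP entry (pick the text label whose embedding best matches the image embedding) can be sketched as follows. This does not call the CLIP model; the three-dimensional vectors are made-up stand-ins for real embeddings, and `zero_shot_classify` is a hypothetical helper name.

```python
import math


def normalize(v):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def cosine(a, b):
    """Cosine similarity between two vectors."""
    return sum(x * y for x, y in zip(normalize(a), normalize(b)))


def zero_shot_classify(image_embedding, label_embeddings):
    """Return the label whose text embedding is closest to the image embedding."""
    scores = {label: cosine(image_embedding, emb)
              for label, emb in label_embeddings.items()}
    best = max(scores, key=scores.get)
    return best, scores


# Toy embeddings; a real system would get these from the image and text encoders.
label_embeddings = {
    "a photo of a dog": [0.9, 0.1, 0.0],
    "a photo of a cat": [0.1, 0.9, 0.0],
    "a photo of a car": [0.0, 0.1, 0.9],
}
image_embedding = [0.8, 0.3, 0.1]

best_label, scores = zero_shot_classify(image_embedding, label_embeddings)
print(best_label)
```

Because the labels are ordinary text, the candidate set can be changed at query time without retraining anything.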
  • 3
    vJEPA-2

    PyTorch code and models for VJEPA2 self-supervised learning from video

    VJEPA2 is a next-generation self-supervised learning framework for video that extends the “predict in representation space” idea from i-JEPA to the temporal domain. Instead of reconstructing pixels, it predicts the missing high-level embeddings of masked space-time regions using a context encoder and a slowly updated target encoder. This objective encourages the model to learn semantics, motion, and long-range structure without the shortcuts that pixel-level losses can invite. ...
    Downloads: 0 This Week
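
The "slowly updated target encoder" in the vJEPA-2 entry is commonly implemented as an exponential moving average (EMA) of the context encoder's weights. The sketch below assumes that reading; it uses flat lists of numbers in place of real network parameters, and the squared-error loss is chosen for illustration rather than taken from the project.

```python
def ema_update(target_params, context_params, momentum=0.99):
    """Move each target parameter a small step toward the context parameter."""
    return [momentum * t + (1.0 - momentum) * c
            for t, c in zip(target_params, context_params)]


def embedding_loss(predicted, target):
    """Mean squared error between predicted and target embeddings (assumed loss)."""
    return sum((p - t) ** 2 for p, t in zip(predicted, target)) / len(target)


# Toy parameters: the target encoder lags behind the context encoder.
target = [0.0, 0.0]
context = [1.0, -1.0]
for _ in range(3):
    target = ema_update(target, context, momentum=0.9)

# The loss compares embeddings, not pixels.
loss = embedding_loss([1.0, 2.0], [1.0, 0.0])
print(target, loss)
```

A momentum close to 1 keeps the prediction targets stable from step to step, since the target encoder changes much more slowly than the encoder being trained.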
  • 4
    VoxCPM

    TTS for Context-Aware Speech Generation and True-to-Life Voice Cloning

    VoxCPM is a tokenizer-free text-to-speech system that models speech in a continuous space, aiming for extremely realistic, context-aware synthesis and true-to-life zero-shot voice cloning. Instead of converting speech into discrete tokens, it uses an end-to-end diffusion-autoregressive architecture built on the MiniCPM-4 backbone, combining hierarchical language modeling, finite scalar quantization (FSQ), and local Diffusion Transformers.
    Downloads: 24 This Week
  • 5
    stable-diffusion-videos

    Create videos with Stable Diffusion

    Create videos with Stable Diffusion by exploring the latent space and morphing between text prompts. Try it yourself in Colab.
    Downloads: 5 This Week
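
Morphing between prompts by exploring the latent space amounts to interpolating between two latent vectors and decoding each intermediate point into a frame. The listing does not say which interpolation the project uses; the sketch below assumes spherical linear interpolation (slerp), a common choice for this, and uses two-dimensional toy vectors instead of real latents.

```python
import math


def slerp(t, v0, v1):
    """Spherical linear interpolation between two vectors, for t in [0, 1]."""
    dot = sum(a * b for a, b in zip(v0, v1))
    n0 = math.sqrt(sum(a * a for a in v0))
    n1 = math.sqrt(sum(b * b for b in v1))
    cos_omega = max(-1.0, min(1.0, dot / (n0 * n1)))
    omega = math.acos(cos_omega)
    if omega < 1e-6:
        # Nearly parallel vectors: fall back to plain linear interpolation.
        return [(1.0 - t) * a + t * b for a, b in zip(v0, v1)]
    s = math.sin(omega)
    w0 = math.sin((1.0 - t) * omega) / s
    w1 = math.sin(t * omega) / s
    return [w0 * a + w1 * b for a, b in zip(v0, v1)]


# Toy latents for two prompts; each interpolated point would become one frame.
start = [1.0, 0.0]
end = [0.0, 1.0]
frames = [slerp(i / 4, start, end) for i in range(5)]
for frame in frames:
    print(frame)
```

In a real pipeline each interpolated latent would be passed to the diffusion model's decoder, which is omitted here.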
  • 6
    UForm

    Multi-Modal Neural Networks for Semantic Search, based on Mid-Fusion

    UForm is a Multi-Modal Inference package designed to encode Multi-Lingual Texts, Images, and, soon, Audio, Video, and Documents into a shared vector space. It comes with a set of homonymous pre-trained networks available on the HuggingFace portal and extends the transformers package to support Mid-fusion Models. Late-fusion models encode each modality independently, but into one shared vector space. Because of this independent encoding, late-fusion models are good at capturing coarse-grained features but often neglect fine-grained ones. ...
    Downloads: 0 This Week
  • 7
    SkillOpt

    Text-space optimizer that trains reusable natural-language skills

    ...The system learns from agent rollouts, reflection, bounded edits, and validation gates to produce better instructions over time. Its output is a deployable best_skill.md artifact that can be reused across agent tasks. The project is focused on making agents more effective through text-space optimization rather than traditional fine-tuning. It is most useful for AI researchers and agent developers studying self-improving workflows, skill libraries, and evaluation-driven prompt refinement.
    Downloads: 0 This Week
  • 8
    JEPA

    PyTorch code and models for V-JEPA self-supervised learning from video

    ...The repository provides training recipes, data pipelines, and evaluation utilities for image JEPA variants and often includes ablations that illuminate which masking and architectural choices matter. Because the objective is non-autoregressive and operates in embedding space, JEPA tends to be compute-efficient and stable at scale. The approach has become a strong alternative to contrastive or pixel-reconstruction methods for representation learning.
    Downloads: 1 This Week
  • 9
    HeartMuLa

    A Family of Open Sourced Music Foundation Models

    ...For text extraction from audio, it provides HeartTranscriptor, a Whisper-based model tuned specifically for lyrics transcription, which helps bridge generated or recorded audio back into structured text. It also introduces HeartCLAP, which aligns audio and text into a shared embedding space.
    Downloads: 14 This Week
  • 10
    Depth Anything 3

    Recovering the Visual Space from Any Views

    Depth Anything 3 is a research-driven project that brings accurate and dense depth estimation to any input image or video, enabling foundational understanding of 3D structure from 2D visual content. Designed to work across diverse scenes, lighting conditions, and image types, it uses advanced neural networks trained on large, heterogeneous datasets, producing depth maps that reveal scene depth relationships and object surfaces with strong fidelity. The model can be applied to photography,...
    Downloads: 8 This Week
  • 11
    UI-TARS

    UI-TARS-desktop version that can operate on your local personal device

    UI-TARS is an open-source multimodal “GUI agent” created by ByteDance: a model designed to perceive raw screenshots (or rendered UI frames), reason about what needs to be done, and then perform real interactions with graphical user interfaces (GUIs) — like clicking, typing, navigating menus — across desktop, browser, mobile, or game environments. Rather than relying on rigid, manually scripted UI automation, UI-TARS uses a unified vision-language model (VLM) that integrates perception,...
    Downloads: 15 This Week
  • 12
    Semantic Kernel

    Integrate cutting-edge LLM technology quickly and easily into your app

    Semantic Kernel is an open-source SDK that lets you easily combine AI services like OpenAI, Azure OpenAI, and Hugging Face with conventional programming languages like C# and Python. By doing so, you can create AI apps that combine the best of both worlds. To help developers build their own Copilot experiences on top of AI plugins, we have released Semantic Kernel, a lightweight open-source SDK that allows you to orchestrate AI plugins. With Semantic Kernel, you can leverage the same AI...
    Downloads: 6 This Week
  • 13
    KerasTuner

    A Hyperparameter Tuning Library for Keras

    KerasTuner is an easy-to-use, scalable hyperparameter optimization framework that solves the pain points of hyperparameter search. Easily configure your search space with a define-by-run syntax, then leverage one of the available search algorithms to find the best hyperparameter values for your models. KerasTuner comes with Bayesian Optimization, Hyperband, and Random Search algorithms built-in, and is also designed to be easy for researchers to extend in order to experiment with new search algorithms.
    Downloads: 0 This Week
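
The idea of declaring a search space and letting a search algorithm sample it can be shown with a plain-Python random search. This is not KerasTuner's API and it uses a static dictionary rather than define-by-run syntax; `toy_objective` is a hypothetical stand-in for training a model and returning its validation loss.

```python
import random

# Candidate values for each hyperparameter (illustrative choices).
search_space = {
    "units": [32, 64, 128],
    "learning_rate": [1e-2, 1e-3, 1e-4],
}


def toy_objective(config):
    """Made-up score to minimize; a real objective would train and evaluate a model."""
    return (abs(config["units"] - 64) / 64
            + abs(config["learning_rate"] - 1e-3) * 100)


def random_search(space, objective, trials, seed=0):
    """Sample configurations uniformly at random and keep the best one seen."""
    rng = random.Random(seed)
    best_config, best_score = None, float("inf")
    for _ in range(trials):
        config = {name: rng.choice(choices) for name, choices in space.items()}
        score = objective(config)
        if score < best_score:
            best_config, best_score = config, score
    return best_config, best_score


best_config, best_score = random_search(search_space, toy_objective, trials=20)
print(best_config, best_score)
```

Bayesian optimization and Hyperband follow the same sample-and-evaluate loop but choose which configurations to try, and how long to train each, more selectively than uniform sampling.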
  • 14
    Linfa

    A Rust machine learning framework

    linfa aims to provide a comprehensive toolkit to build Machine Learning applications with Rust. Kin in spirit to Python's scikit-learn, it focuses on common preprocessing tasks and classical ML algorithms for your everyday ML tasks.
    Downloads: 0 This Week
  • 15
    Physical Symbolic Optimization (Φ-SO)

    Physical Symbolic Optimization

    Physical Symbolic Optimization (Φ-SO) is a symbolic optimization package built for physics. Its symbolic regression module uses deep reinforcement learning to infer analytical physical laws that fit data points, searching in the space of functional forms.
    Downloads: 0 This Week
  • 16
    gensim

    Topic Modelling for Humans

    Gensim is a Python library for topic modeling, document indexing, and similarity retrieval with large corpora. The target audience is the natural language processing (NLP) and information retrieval (IR) community.
    Downloads: 0 This Week
  • 17
    LangKit

    An open-source toolkit for monitoring Large Language Models (LLMs)

    ...Productionizing language models, including LLMs, comes with a range of risks due to the infinite number of input combinations, which can elicit an infinite number of outputs. The unstructured nature of text poses a challenge in the ML observability space, a challenge worth solving, since the lack of visibility into the model's behavior can have serious consequences.
    Downloads: 0 This Week
  • 18
    HRM-Text

    1B text generation model based on the HRM architecture

    ...It is designed to make foundation model pretraining more accessible by reducing compute and data requirements compared with traditional scaling-heavy approaches. The system combines hierarchical recurrent design, task-completion strengthening, and latent-space reasoning. Its training stack includes PrefixLM sequence packing, FlashAttention 3 kernels, PyTorch FSDP2, evaluation scripts, and checkpoint conversion tools. The repository supports reference pretraining runs for smaller and larger configurations, with Hopper-class GPUs expected for the attention path. It is useful for researchers and engineers exploring efficient language model pretraining, reasoning-focused architectures, and reproducible foundation model experiments.
    Downloads: 1 This Week
  • 19
    Story Flicks

    Generate high-definition story short videos with one click using AI

    Story Flicks is an open-source project in the AI-assisted video generation and editing space, focused on creating short, story-style videos from script or prompt inputs. It aims to let users generate high-definition short movies or video stories with minimal manual effort, using AI models under the hood to assemble visuals, timing, and possibly narration or subtitles. For creators who want to produce narrative short-form content, whether for social media, storytelling, or prototyping video ideas, story-flicks offers a lightweight, code-backed alternative to complex video editing suites. ...
    Downloads: 5 This Week
  • 20
    InvokeAI

    InvokeAI is a leading creative engine for Stable Diffusion models

    InvokeAI is an implementation of Stable Diffusion, the open source text-to-image and image-to-image generator. It provides a streamlined process with various new features and options to aid the image generation process. It runs on Windows, Mac and Linux machines, and on GPU cards with as little as 4 GB of RAM. InvokeAI is a leading creative engine built to empower professionals and enthusiasts alike. Generate and create stunning visual media using the latest AI-driven technologies....
    Downloads: 10 This Week
  • 21
    ESP32-CAM_MJPEG2SD

    ESP32 Camera motion capture application to record JPEGs to SD card

    ...The AVI format allows recordings to replay at the correct frame rate on media players. If a microphone is installed, a WAV file is also created and stored in the AVI file. The ESP32 cannot support all of the features, as it will run out of heap space. For better functionality and performance, use one of the new ESP32S3 camera boards, e.g. Freenove ESP32S3 Cam or ESP32S3 XIAO Sense, but avoid no-name boards marked ESPS3 RE:1.0.
    Downloads: 3 This Week
  • 22
    AppAgent

    Multimodal Agents as Smartphone Users, an LLM-based multimodal agent

    AppAgent is an open-source multimodal agent framework designed to enable large language models to operate smartphone applications through natural interactions with graphical user interfaces. The system allows an AI agent to interpret visual information from the screen and translate natural language instructions into actions such as tapping, swiping, and navigating between application screens. Instead of requiring backend access to application APIs, the framework interacts with apps the same...
    Downloads: 2 This Week
  • 23
    zclaw

    Your personal AI assistant at all-in 888KiB

    ...It includes support for GPIO control, scheduled tasks, memory handling, and other embedded automation features that enable real-world device interaction. The architecture is optimized for efficiency, allowing the full assistant stack to run in under one megabyte of space. By targeting low-power hardware, zclaw explores the future of edge AI assistants that operate independently of large cloud systems. Overall, the project showcases how lightweight autonomous assistants can be embedded directly into IoT devices.
    Downloads: 0 This Week
  • 24
    LeWorldModel

    Official code base for LeWorldModel: Stable End-to-End Joint-Embedding

    LeWorldModel is a minimalist tiling window manager designed for the X11 windowing system, focusing on simplicity, performance, and efficient use of screen space. It provides automatic window tiling behavior, organizing application windows into structured layouts without requiring manual resizing or positioning. The project emphasizes a lightweight design, minimizing resource usage while maintaining responsiveness and stability. It is highly configurable through source code or configuration files, allowing users to tailor behavior, keybindings, and layouts to their preferences. le-wm is intended for users who prefer keyboard-driven workflows and a distraction-free desktop environment. ...
    Downloads: 0 This Week
  • 25
    Ant Design X

    Experimental Ant Design extensions for advanced UI patterns

    Ant Design X is an experimental extension project built around the Ant Design ecosystem, focusing on exploring advanced user interface patterns and next-generation component designs. It serves as a space for prototyping and validating ideas that go beyond the core Ant Design library, allowing developers to experiment with more complex interactions and layouts. Ant Design X emphasizes flexibility and extensibility, enabling developers to adapt components to modern application needs. It often includes higher-level abstractions and enhanced UI capabilities that are not yet part of the stable design system. ...
    Downloads: 0 This Week