30 projects for "system" with 2 filters applied:

  • Atera all-in-one platform IT management software with AI agents Icon
    Atera all-in-one platform IT management software with AI agents

    Ideal for internal IT departments or managed service providers (MSPs)

    Atera’s AI agents don’t just assist, they act. From detection to resolution, they handle incidents and requests instantly, taking your IT management from automated to autonomous.
    Learn More
  • Find Hidden Risks in Windows Task Scheduler Icon
    Find Hidden Risks in Windows Task Scheduler

    Free diagnostic script reveals configuration issues, error patterns, and security risks. Instant HTML report.

    Windows Task Scheduler might be hiding critical failures. Download the free JAMS diagnostic tool to uncover problems before they impact production—get a color-coded risk report with clear remediation steps in minutes.
    Download Free Tool
  • 1
    HunyuanWorld-Voyager

    HunyuanWorld-Voyager

    RGBD video generation model conditioned on camera input

    HunyuanWorld-Voyager is a next-generation video diffusion framework developed by Tencent-Hunyuan for generating world-consistent 3D scene videos from a single input image. By leveraging user-defined camera paths, it enables immersive scene exploration and supports controllable video synthesis with high realism. The system jointly produces aligned RGB and depth video sequences, making it directly applicable to 3D reconstruction tasks. At its core, Voyager integrates a world-consistent video diffusion model with an efficient long-range world exploration engine powered by auto-regressive inference. To support training, the team built a scalable data engine that automatically curates large video datasets with camera pose estimation and metric depth prediction. ...
    Downloads: 68 This Week
    Last Update:
    See Project
  • 2
    IndexTTS2

    IndexTTS2

    Industrial-level controllable zero-shot text-to-speech system

    IndexTTS is a modern, zero-shot text-to-speech (TTS) system engineered to deliver high-quality, natural-sounding speech synthesis with few requirements and strong voice-cloning capabilities. It builds on state-of-the-art models such as XTTS and other modern neural TTS backbones, improving them with a conformer-based speech conditional encoder and upgrading the decoder to a high-quality vocoder (BigVGAN2), leading to clearer and more natural audio output.
    Downloads: 9 This Week
    Last Update:
    See Project
  • 3
    Hunyuan3D-2.1

    Hunyuan3D-2.1

    From Images to High-Fidelity 3D Assets

    Hunyuan3D-2.1 is Tencent Hunyuan’s advanced 3D asset generation system that produces high-fidelity 3D models with Physically Based Rendering (PBR) textures. It is fully open-source with released model weights, training, and inference code. It improves on prior versions by using a PBR texture pipeline (enabling realistic material effects like reflections and subsurface scattering) and allowing community fine-tuning and extension.
    Downloads: 21 This Week
    Last Update:
    See Project
  • 4
    GLM-TTS

    GLM-TTS

    Controllable & emotion-expressive zero-shot TTS

    ...The system introduces a multi-reward reinforcement learning framework that jointly optimizes for voice similarity, emotional expressiveness, pronunciation, and intelligibility, yielding output that can rival commercial options in naturalness and expressiveness. GLM-TTS also supports phoneme-level control and hybrid text + phoneme input, giving developers precise control over pronunciation critical for multilingual or polyphone­-rich languages.
    Downloads: 4 This Week
    Last Update:
    See Project
  • AI-generated apps that pass security review Icon
    AI-generated apps that pass security review

    Stop waiting on engineering. Build production-ready internal tools with AI—on your company data, in your cloud.

    Retool lets you generate dashboards, admin panels, and workflows directly on your data. Type something like “Build me a revenue dashboard on my Stripe data” and get a working app with security, permissions, and compliance built in from day one. Whether on our cloud or self-hosted, create the internal software your team needs without compromising enterprise standards or control.
    Try Retool free
  • 5
    Poetiq

    Poetiq

    Reproduction of Poetiq's record-breaking submission to the ARC-AGI-1

    ...The repository allows others to reproduce their results, experiment with different LLM backends (e.g. the user may supply keys for supported models), and observe how their adaptive meta-system handles the logic and abstraction challenges.
    Downloads: 4 This Week
    Last Update:
    See Project
  • 6
    DeepSeek-OCR

    DeepSeek-OCR

    Contexts Optical Compression

    ...It is designed to extract text from images, PDFs, and scanned documents, and integrates with multimodal capabilities that understand layout, context, and visual elements beyond raw character recognition. The system treats OCR not simply as “read the text” but as “understand what the text is doing in the image”—for example distinguishing captions from body text, interpreting tables, or recognizing handwritten versus printed words. It supports local deployment, enabling organizations concerned about privacy or latency to run the pipeline on-premises rather than send sensitive documents to third-party cloud services. ...
    Downloads: 11 This Week
    Last Update:
    See Project
  • 7
    Claude Code SDK Python

    Claude Code SDK Python

    Python SDK for Claude Agent

    claude-code-sdk-python is the Python SDK for Claude Code, Anthropic’s agentic coding system. It provides abstractions to easily query Claude Code (with streaming support) and conduct interactive sessions. The SDK includes core client classes, asynchronous query functions, and support for custom tools and hooks within Claude sessions. It is designed to integrate with local Python workflows and allow developers to embed Claude Code capabilities directly in their applications or scripts. ...
    Downloads: 5 This Week
    Last Update:
    See Project
  • 8
    PokeeResearch-7B

    PokeeResearch-7B

    Pokee Deep Research Model Open Source Repo

    ...It is built to operate end-to-end: planning a research strategy, gathering sources, reasoning over conflicting claims, and writing a grounded response. The repository includes evaluation results on multi-step QA and research benchmarks, illustrating how web-time context boosts accuracy. Because the system is modular, you can swap the search component, reader, or policy to fit private deployments or different data domains. It’s aimed at developers who want a transparent, hackable research agent they can run locally or wire into existing workflows.
    Downloads: 4 This Week
    Last Update:
    See Project
  • 9
    DeepSeek-OCR 2

    DeepSeek-OCR 2

    Visual Causal Flow

    ...The repository provides model code and inference scripts that let researchers and developers run and benchmark the system on both images and PDFs, with support for batch evaluation and optimized pipelines leveraging vLLM and transformers.
    Downloads: 0 This Week
    Last Update:
    See Project
  • Say goodbye to broken revenue funnels and poor customer experiences Icon
    Say goodbye to broken revenue funnels and poor customer experiences

    Connect and coordinate your data, signals, tools, and people at every step of the customer journey.

    LeanData is a Demand Management solution that supports all go-to-market strategies such as account-based sales development, geo-based territories, and more. LeanData features a visual, intuitive workflow native to Salesforce that enables users to view their entire lead flow in one interface. LeanData allows users to access the drag-and-drop feature to route their leads. LeanData also features an algorithms match that uses multiple fields in Salesforce.
    Learn More
  • 10
    Kimi-Audio

    Kimi-Audio

    Audio foundation model excelling in audio understanding

    ...Instead of fragmenting work across specialized models, Kimi-Audio handles automatic speech recognition (ASR), audio question answering, automatic audio captioning, speech emotion recognition, and audio-to-text chat in one system, enabling developers to build rich, multimodal audio applications without stitching together disparate components. It uses a novel model setup that combines continuous acoustic features with discrete semantic tokens to richly capture sound and meaning across speech, music, and environmental audio.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 11
    Oasis

    Oasis

    Inference script for Oasis 500M

    Open-Oasis provides inference code and released weights for Oasis 500M, an interactive world model that generates gameplay frames conditioned on user keyboard input. Instead of rendering a pre-built game world, the system produces the next visual state via a diffusion-transformer approach, effectively “imagining” the world response to your actions in real time. The project focuses on enabling action-conditional frame generation so developers can experiment with interactive, model-generated environments rather than static video generation alone. Because it’s an inference-focused repository, it’s especially useful as a practical reference for running the model, wiring inputs, and producing the autoregressive sequence of gameplay frames. ...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 12
    Seamless Communication

    Seamless Communication

    Foundational Models for State-of-the-Art Speech and Text Translation

    ...The motivation is to move beyond “text in, text out” and enable direct, live, multi-turn exchange involving language, gesture, gaze, vision, and modality switching without user friction. The system architecture includes a real-time multimodal signal pipeline for audio, video, and sensor data, a dialog manager that can decide when to act (speak, gesture, point) or query, and a cross-modal reasoning layer that fuses perception with semantic context. The research prototype includes components for visual grounding (understanding when a user references something in view), gesture recognition and synthesis, and turn-taking mechanisms that mirror human conversational timing. ...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 13
    Profile Data

    Profile Data

    Analyze computation-communication overlap in V3/R1

    profile-data is a repository that publishes profiling traces and metrics from DeepSeek’s training and inference infrastructure (especially during DeepSeek-V3 / R1 experiments). The profiling data targets insights into computation-communication overlap, pipeline scheduling (e.g. DualPipe), and how MoE / EP / parallelism strategies interact in real systems. The repository contains JSON trace files like train.json, prefill.json, decode.json, and associated assets. Users can load them into tools...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 14
    DeepSeek Math

    DeepSeek Math

    Pushing the Limits of Mathematical Reasoning in Open Language Models

    ...The goal is to push DeepSeek’s performance in domains that require rigorous symbolic steps, calculus, linear algebra, number theory, or multi-step derivations. The repo may also include modules that integrate external computational tools (e.g. a CAS / computer algebra system) or calculator assistance backends to enhance correctness. Because math reasoning is a high bar for LLMs, DeepSeek-Math aims to showcase their model’s ability not just in natural text but in precise formal reasoning.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 15
    NVIDIA Isaac GR00T

    NVIDIA Isaac GR00T

    NVIDIA Isaac GR00T N1.5 is the world's first open foundation model

    NVIDIA Isaac‑GR00T N1.5 is an open-source foundation model engineered for generalized humanoid robot reasoning and manipulation skills. It accepts multimodal inputs—such as language and images—and uses a diffusion transformer architecture built upon vision-language encoders, enabling adaptive robot behaviors across diverse environments. It is designed to be customizable via post-training with real or synthetic data. The vision-language model remains frozen during both pretraining and...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 16
    Vidi2

    Vidi2

    Large Multimodal Models for Video Understanding and Editing

    ...Vidi targets applications like intelligent video editing, automated video search, content analysis, and editing assistance, enabling users to efficiently locate relevant segments and objects in hours-long footage. The system is built with open-source release in mind, giving developers access to model code, inference scripts, and evaluation pipelines so they can reproduce research results or integrate Vidi into their own video-processing workflows.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 17
    MoveNet

    MoveNet

    A CNN model that predicts human joints from RGB images of a person

    The MoveNet model is an efficient, real-time human pose estimation system designed for detecting and tracking keypoints of human bodies. It utilizes deep learning to accurately locate 17 key points across the body, providing precise tracking even with fast movements. Optimized for mobile and embedded devices, MoveNet can be integrated into applications for fitness tracking, augmented reality, and interactive systems.
    Downloads: 4 This Week
    Last Update:
    See Project
  • 18
    ControlNet

    ControlNet

    Let us control diffusion models

    ...Rather than training from scratch, ControlNet “locks” the weights of a pre-trained diffusion model and introduces a parallel trainable branch that learns additional conditions—like edges, depth maps, segmentation, human pose, scribbles, or other guidance signals. This allows the system to control where and how the model should focus during generation, enabling users to steer layout, structure, and content more precisely than prompt text alone. The project includes many trained model variants that accept different types of conditioning (e.g., canny edge input, normal maps, skeletal pose) and produce improved fidelity in stable diffusion outputs. ...
    Downloads: 1 This Week
    Last Update:
    See Project
  • 19
    PyTorch-BigGraph

    PyTorch-BigGraph

    Generate embeddings from large-scale graph-structured data

    PyTorch-BigGraph (PBG) is a system for learning embeddings on massive graphs—think billions of nodes and edges—using partitioning and distributed training to keep memory and compute tractable. It shards entities into partitions and buckets edges so that each training pass only touches a small slice of parameters, which drastically reduces peak RAM and enables horizontal scaling across machines.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 20
    Mistral Large 3 675B Instruct 2512 Eagle

    Mistral Large 3 675B Instruct 2512 Eagle

    Speculative-decoding accelerator for the 675B Mistral Large 3

    Mistral Large 3 675B Instruct 2512 Eagle is the dedicated speculative-decoding draft model for the full Mistral Large 3 Instruct system, designed to significantly speed up generation while preserving high output quality. It works alongside the primary 675B instruct model, enabling faster response times by predicting several tokens ahead using Mistral’s Eagle speculative method. Built on the same frontier-scale multimodal Mixture-of-Experts architecture, it complements a system featuring 41B active parameters and a 2.5B-parameter vision encoder. ...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 21
    Ministral 3 3B Instruct 2512

    Ministral 3 3B Instruct 2512

    Ultra-efficient 3B multimodal instruct model built for edge deployment

    ...It includes a 3.4B-parameter language model paired with a 0.4B vision encoder, enabling it to understand both text and visual inputs. As an FP8 instruct-fine-tuned model, it is optimized for chat, instruction following, and compact agentic tasks while maintaining strong adherence to system prompts. Despite its small size, it delivers efficient real-time performance and can run locally on a single 8GB GPU, with further memory reductions through quantization. It supports dozens of languages across major global regions, making it well-suited for multilingual and embedded applications. The model also provides function calling, clean JSON output, and stable tool-use behavior, enabling it to serve as a small but effective agentic system.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 22
    Ministral 3 8B Instruct 2512

    Ministral 3 8B Instruct 2512

    Compact 8B multimodal instruct model optimized for edge deployment

    ...Designed for edge deployment, the model can run on a wide range of hardware and fits locally on a single 12GB GPU, with the option for even smaller quantized configurations. Its multilingual support covers dozens of major languages, allowing it to work across diverse global environments and applications. The model adheres reliably to system prompts, supports native function calling, and outputs clean JSON, giving it strong tool-use behavior.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 23
    Ministral 3 8B Reasoning 2512

    Ministral 3 8B Reasoning 2512

    Efficient 8B multimodal model tuned for advanced reasoning tasks.

    ...Despite its reasoning-focused training, the model remains edge-optimized and can run locally on a single 24GB GPU in BF16, or under 12GB when quantized. It supports dozens of languages, adheres reliably to system prompts, and provides native function calling and structured JSON output—key capabilities for agentic and automation workflows. The model also includes a 256k context window, allowing it to handle long documents and extended reasoning chains.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 24
    Ministral 3 14B Reasoning 2512

    Ministral 3 14B Reasoning 2512

    High-precision 14B multimodal model built for advanced reasoning tasks

    ...This version is specifically post-trained for reasoning tasks, making it highly effective for math, coding, STEM workloads, and complex multi-step problem-solving. Despite its scale, the model is engineered for practical deployment and can run locally on 32GB of VRAM in BF16 or under 24GB when quantized. It maintains robust system-prompt adherence, supports dozens of languages, and provides native function calling with clean JSON output for agentic workflows. The model's architecture also delivers a 256k context window, unlocking large-document analysis and long-form reasoning.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 25
    Ministral 3 14B Instruct 2512

    Ministral 3 14B Instruct 2512

    Efficient 14B multimodal instruct model with edge deployment and FP8

    ...It combines a 13.5B-parameter language model with a 0.4B-parameter vision encoder, enabling strong multimodal understanding in both text and image tasks. This FP8 instruct-tuned variant is designed specifically for chat, instruction following, and agentic workflows with robust system-prompt adherence. Despite its size, the model is engineered for practical deployment, capable of running locally on a single 24GB GPU when served in FP8 and even less with further quantization. Its multilingual support spans dozens of major languages, making it suitable for global, multilingual, and localized AI applications. The model’s architecture provides native function calling, structured JSON outputs, and reliable tool-use behavior essential for agentic automation. ...
    Downloads: 0 This Week
    Last Update:
    See Project
  • Previous
  • You're on page 1
  • 2
  • Next