156 projects for "input-leap" with 2 filters applied:

  • 1
    VoxCPM2

    Tokenizer-Free TTS for Multilingual Speech Generation

    ...Built on top of the MiniCPM model family, it enables highly natural, expressive, and context-aware speech generation that adapts tone, emotion, and pacing directly from input text. The system is trained on massive multilingual datasets, enabling support for dozens of languages and dialects while maintaining high fidelity and realism in generated audio. VoxCPM stands out for its ability to perform voice cloning with minimal input, capturing not only the speaker’s timbre but also nuanced features such as rhythm, accent, and emotional delivery. ...
    Downloads: 16 This Week
  • 2
    Everywhere

    Context-aware desktop AI assistant that understands screen content

    ...It integrates with multiple large language model providers and supports various tools, enabling flexible and extensible AI-powered workflows. Everywhere features a modern design with interactive elements such as markdown rendering, keyboard shortcuts, and voice input capabilities. Additionally, the project emphasizes seamless workflow integration by operating alongside existing applications rather than requiring users to switch.
    Downloads: 8 This Week
  • 3
    LTX-2

    Python inference and LoRA trainer package for the LTX-2 audio–video

    ...Beyond basic rendering scaffolding, LTX-2 includes optimized math libraries, resource loaders, utilities for texture and buffer handling, and integration points for native event loops and input systems. The framework targets both interactive graphical applications and media-rich experiences, making it a solid foundation for games, creative tools, or visualization systems that demand both performance and flexibility. While being low-level, it also provides sensible defaults and helper abstractions that reduce boilerplate and help teams maintain clear, maintainable code.
    Downloads: 47 This Week
  • 4
    Claude Autoresearch

    Claude Autoresearch Skill, autonomous goal-directed iteration

    ...Its iterative loop enables deeper exploration of topics over time, making it particularly useful for complex or open-ended research questions. The architecture emphasizes autonomy, reducing the need for constant user input while still producing meaningful insights. It may also include summarization and reporting capabilities to present findings in a digestible format. Overall, autoresearch represents a step toward self-directed knowledge discovery systems that continuously improve their outputs through iteration.
    Downloads: 19 This Week
  • 5
    Whisper-WebUI

    A Web UI for easy subtitle generation using the Whisper model

    ...The platform integrates optimized implementations such as faster-whisper, significantly improving transcription speed and reducing memory usage compared to standard models. It supports multiple input sources including local files, YouTube content, and microphone input, making it versatile for different workflows. Whisper WebUI also includes advanced preprocessing and postprocessing features such as voice activity detection, background music separation, and speaker diarization, enabling more accurate and structured outputs.
    Downloads: 9 This Week
  • 6
    PentestGPT

    Automated Penetration Testing Agentic Framework Powered by LLMs

    PentestGPT is an AI-powered autonomous penetration testing agent designed to perform intelligent, end-to-end security assessments using large language models. Published at USENIX Security 2024, it combines advanced reasoning with an agentic workflow to automate tasks traditionally handled by human pentesters. The platform supports multiple penetration testing categories, including web security, cryptography, reversing, forensics, privilege escalation, and binary exploitation. PentestGPT runs...
    Downloads: 272 This Week
  • 7
    Nexent

    Zero-code platform for building AI agents from natural language input

    ...Nexent supports multi-agent collaboration, enabling multiple intelligent agents to interact and coordinate tasks within complex workflows. It also includes capabilities for data processing, knowledge tracing, and multimodal interaction, allowing agents to work with different input and output formats. Nexent provides built-in agents for common scenarios such as productivity, travel, and daily assistance.
    Downloads: 1 This Week
  • 8
    SenseVoice

    Multilingual speech recognition and audio understanding model

    SenseVoice is a speech foundation model designed to perform multiple voice understanding tasks from audio input. It provides capabilities such as automatic speech recognition, spoken language identification, speech emotion recognition, and audio event detection within a single system. SenseVoice is trained on more than 400,000 hours of speech data and supports over 50 languages for multilingual recognition tasks. It is built to achieve high transcription accuracy while maintaining efficient inference performance. ...
    Downloads: 8 This Week
  • 9
    autoMate

    AI tool for automating desktop tasks via natural language input

    ...It combines large language models with computer vision techniques to interpret user intent and understand on-screen content, allowing it to interact with graphical interfaces similarly to a human user. autoMate follows an observe-decide-act workflow, where it analyzes the screen, plans actions, and executes them through simulated input such as mouse clicks and keyboard events. Unlike conventional RPA tools that require predefined workflows, autoMate dynamically adapts to tasks by making autonomous decisions based on the current interface state. autoMate emphasizes local execution, meaning all processing happens on the user’s machine to maintain privacy and data security.
    Downloads: 4 This Week
  • 10
    WhisperLive

    A nearly-live implementation of OpenAI's Whisper

    ...The project supports multiple inference backends, including Faster-Whisper, NVIDIA TensorRT, and OpenVINO, allowing you to target GPUs and different CPU architectures efficiently. It can handle microphone input, pre-recorded audio files, and network streams such as RTSP and HLS, making it flexible for live events, monitoring, or accessibility workflows. Configuration options let you control the number of clients, maximum connection time, and threading behavior so the server can be tuned for different deployment environments. On the client side, you can set the language, whether to translate into English, model size, voice activity detection, and output recording behavior.
    Downloads: 7 This Week
  • 11
    HunyuanWorld-Voyager

    RGBD video generation model conditioned on camera input

    HunyuanWorld-Voyager is a next-generation video diffusion framework developed by Tencent-Hunyuan for generating world-consistent 3D scene videos from a single input image. By leveraging user-defined camera paths, it enables immersive scene exploration and supports controllable video synthesis with high realism. The system jointly produces aligned RGB and depth video sequences, making it directly applicable to 3D reconstruction tasks. At its core, Voyager integrates a world-consistent video diffusion model with an efficient long-range world exploration engine powered by auto-regressive inference. ...
    Downloads: 8 This Week
  • 12
    TypeChat

    Library for building type-safe natural language interfaces with LLMs

    ...Instead of writing complex prompts, developers define types that represent the intents supported by their applications. It then uses those type definitions to construct prompts for language models and translate user input into structured data that follows the defined schema.
    Downloads: 3 This Week
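
    The type-driven flow described for TypeChat can be illustrated with a minimal Python sketch. This is not TypeChat's API: `OrderIntent`, `build_prompt`, and `parse_response` are hypothetical names, and the model call itself is left out. The sketch only shows the idea of deriving a prompt from a type definition and validating the model's JSON reply against that same type.

```python
import json
from dataclasses import dataclass, fields


@dataclass
class OrderIntent:
    """Hypothetical intent type; an application would define its own."""
    item: str
    quantity: int


def build_prompt(schema: type, user_input: str) -> str:
    """Describe the schema's fields so a model can reply with matching JSON."""
    field_lines = "\n".join(f"  {f.name}: {f.type.__name__}" for f in fields(schema))
    return (
        f"Translate the request into a JSON object of type {schema.__name__}:\n"
        f"{field_lines}\n"
        f"Request: {user_input}"
    )


def parse_response(schema: type, raw: str):
    """Validate a JSON reply against the schema and build a typed instance."""
    data = json.loads(raw)
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    expected = {f.name: f.type for f in fields(schema)}
    if set(data) != set(expected):
        raise ValueError(f"expected keys {sorted(expected)}, got {sorted(data)}")
    for name, typ in expected.items():
        if not isinstance(data[name], typ):
            raise TypeError(f"field {name!r} should be {typ.__name__}")
    return schema(**data)
```

    A reply such as `{"item": "latte", "quantity": 2}` would parse into `OrderIntent(item="latte", quantity=2)`, while a reply with missing or mistyped fields raises an error that an application could feed back to the model.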
  • 13
    Vulnhuntr

    AI tool for detecting complex vulnerabilities in Python codebases

    Vulnhuntr is an open source security tool that uses large language models to analyze codebases and identify remotely exploitable vulnerabilities. It focuses on Python projects and applies static code analysis combined with LLM reasoning to trace how user input flows through an application. Instead of scanning entire repositories at once, it builds call chains step by step, allowing deeper inspection of complex, multi-stage issues that traditional tools may miss. Vulnhuntr can generate detailed findings, including vulnerability explanations and potential exploit paths, helping developers and security teams understand risks faster. ...
    Downloads: 2 This Week
  • 14
    Purple Llama

    Set of tools to assess and improve LLM security

    Purple Llama is an umbrella safety initiative that aggregates tools, benchmarks, and mitigations to help developers build responsibly with open generative AI. Its scope spans input and output safeguards, cybersecurity-focused evaluations, and reference shields that can be inserted at inference time. The project evolves as a hub for safety research artifacts like Llama Guard and Code Shield, along with dataset specs and how-to guides for integrating checks into applications. CyberSecEval, one of its flagship components, provides repeatable evaluations for security risk, including agent-oriented tasks such as automated patching benchmarks. ...
    Downloads: 2 This Week
  • 15
    Qwen3-TTS

    Qwen3-TTS is an open-source series of TTS models

    Qwen3-TTS is an open-source text-to-speech (TTS) project built around the Qwen3 large language model family, focused on generating high-quality, natural-sounding speech from plain text input. It provides researchers and developers with tools to transform text into expressive, intelligible audio, supporting multiple languages and voice characteristics tuned for clarity and fluidity. The project includes pre-trained models and inference scripts that let users synthesize speech locally or integrate TTS into larger pipelines such as voice assistants, accessibility tools, or multimedia generation workflows. ...
    Downloads: 32 This Week
  • 16
    Groq TypeScript / Node.js

    The official Node.js / TypeScript library for the Groq API

    ...It exports strongly-typed interfaces for models, chat completions, file uploads (e.g. for audio transcription), and other endpoints, allowing for better type safety and developer experience when using Groq from TypeScript. The library also supports passing different input types (file streams, blobs, fetch responses) for media-related endpoints, making it flexible for diverse environments (backend, browser, serverless). With this SDK, developers can call Groq’s models, transcribe audio, perform file uploads — all with minimal boilerplate — which streamlines creation of AI-enabled applications in the JavaScript/TypeScript ecosystem.
    Downloads: 3 This Week
  • 17
    Guardrails

    Framework for validating and controlling LLM outputs in AI apps

    ...Guardrails also supports generating structured data from language models, allowing developers to enforce schemas or type constraints on responses. A companion ecosystem known as a hub provides reusable validators that can be combined into input and output guards to address different reliability and safety concerns.
    Downloads: 1 This Week
  • 18
    FastVLM

    This repository contains the official implementation of FastVLM

    FastVLM is an efficiency-focused vision-language modeling stack that introduces FastViTHD, a hybrid vision encoder engineered to emit fewer visual tokens and slash encoding time, especially for high-resolution images. Instead of elaborate pruning stages, the design trades off resolution and token count through input scaling, simplifying the pipeline while maintaining strong accuracy. Reported results highlight dramatic speedups in time-to-first-token and competitive quality versus contemporary open VLMs, including comparisons across small and larger variants. The repository documents model variants, showcases head-to-head numbers against known baselines, and explains how the encoder integrates with common LLM backbones. ...
    Downloads: 1 This Week
  • 19
    OpenAI.fm

    Code for openai.fm, a demo for the OpenAI Speech API

    ...Developed using Next.js and the OpenAI Speech API, this demo illustrates how the latest neural voice models can produce natural, expressive speech with adjustable styles and voices, highlighting features like emotional range, tone, and real-time playback. Users can experiment with different input text and voice options directly in their browser, gaining a sense of how high-fidelity AI audio can be integrated into applications ranging from podcasts and narration to accessibility tools and interactive agents. Although the web demo is free to explore, production use of the underlying API requires an OpenAI API key and may incur costs based on usage.
    Downloads: 16 This Week
  • 20
    Feynman

    The open source AI research agent

    Feynman is a command-line AI research agent designed to automate complex research workflows by orchestrating multiple specialized agents that collaborate to gather, analyze, and synthesize information into structured outputs. It operates as a “Claude Code for research,” allowing users to input natural language queries and receive fully developed, source-grounded research briefs, literature reviews, or experimental analyses. The system is built around a multi-agent architecture that includes roles such as researcher, reviewer, writer, and verifier, each responsible for a specific stage of the research pipeline. It supports advanced workflows like deep research investigations, paper replication, peer review simulation, and autonomous experimentation, enabling users to go beyond simple question answering into full research automation.
    Downloads: 11 This Week
  • 21
    Streamdown

    Streaming markdown renderer for AI apps with smooth updates

    ...It focuses on providing a smooth and visually stable experience while content is being appended, avoiding layout shifts that can disrupt readability. Streamdown is built to handle partial Markdown input gracefully, progressively enhancing the output as more text becomes available. It is especially relevant for chat interfaces, coding assistants, and any environment where responses are streamed token by token. Streamdown emphasizes performance and simplicity, ensuring that developers can integrate it without unnecessary complexity. ...
    Downloads: 0 This Week
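
    Handling partial Markdown during streaming, as the Streamdown entry describes, can be illustrated with a small Python sketch. This is not Streamdown's implementation or API; `close_open_markers` is a hypothetical helper that assumes only two constructs matter (fenced code blocks and `**bold**`) and temporarily closes whichever one is still open so the partial text can be rendered without broken formatting.

```python
FENCE = "`" * 3  # three backticks, built indirectly to keep this example readable


def close_open_markers(partial: str) -> str:
    """Return the partial Markdown with an unterminated fence or bold run closed."""
    # An odd number of fences means a code block is still open: close it
    # and leave its contents untouched.
    if partial.count(FENCE) % 2 == 1:
        return partial + "\n" + FENCE

    # Otherwise only look at text outside code blocks. After splitting on the
    # fence, even-indexed segments are outside any code block.
    outside = "".join(partial.split(FENCE)[0::2])
    if outside.count("**") % 2 == 1:
        return partial + "**"

    return partial
```

    A renderer could call such a helper on every streamed chunk and re-render the result, so that text arriving mid-emphasis or mid-code-block is displayed in its final style from the start.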
  • 22
    SkillForge

    Ultimate meta-skill for generating best-in-class Claude Code skills

    SkillForge is a systematic methodology and tooling framework for creating high-quality AI “skills” specifically optimized for Claude Code integrations, treating skill creation as an engineering discipline rather than an ad-hoc art form. It introduces a multi-phase architecture where every input or request is triaged intelligently, analyzed deeply through structured lenses, specified formally, synthesized with automated generation, and finally subjected to multi-agent review before being considered complete. The system includes tooling that routes natural language inputs to existing skills, augments them, or generates new ones using autonomous phases, enforcing quality, extensibility, security, and timelessness. ...
    Downloads: 0 This Week
  • 23
    FLUX.2

    Official inference repo for FLUX.2 models

    FLUX.2 is a state-of-the-art open-weight image generation and editing model released by Black Forest Labs aimed at bridging the gap between research-grade capabilities and production-ready workflows. The model offers both text-to-image generation and powerful image editing, including editing of multiple reference images, with fidelity, consistency, and realism that push the limits of what open-source generative models have achieved. It supports high-resolution output (up to ~4 megapixels),...
    Downloads: 41 This Week
  • 24
    Deep Chat

    Customizable AI chat component for websites with API support

    ...It is built as a framework-agnostic solution, meaning it can work across various frontend environments, with additional support provided for React through a dedicated wrapper. Deep Chat includes advanced interaction capabilities such as speech input and output, file handling, and multimedia communication, making it suitable for rich conversational experiences. Internally, it uses a structured architecture that manages input, message handling, and service communication, allowing developers to intercept and customize requests and responses.
    Downloads: 0 This Week
  • 25
    Hugging Face - Speech To Speech

    Open speech-to-speech models and pipelines by Hugging Face

    This project from Hugging Face focuses on enabling direct speech-to-speech processing using modern machine learning models. It provides tools and reference implementations that allow audio input to be transformed into audio output without requiring an intermediate text representation. Hugging Face - Speech To Speech builds on recent advances in speech modeling, combining components such as speech recognition, translation, and synthesis into unified pipelines. It is designed to help researchers and developers experiment with multilingual and cross-lingual voice applications. ...
    Downloads: 0 This Week