Showing 48 open source projects for "direct"

  • 1
    ACI.dev

    Open platform connecting AI agents to tools via unified MCP server

    ...It focuses on simplifying tool integration by connecting hundreds of pre-built services into agentic environments, allowing developers to avoid building custom API clients and authentication flows for each service. ACI provides intent-aware tool access, meaning agents can dynamically discover and use tools based on context rather than rigid configurations. It supports both direct function calling and a unified Model Context Protocol (MCP) server, offering flexibility in how integrations are exposed to AI systems. ACI also includes multi-tenant authentication and permission controls, ensuring that tool usage is secure and scoped appropriately for different users and agents. Additionally, it is framework-agnostic, making it compatible with various large language model setups and agent architectures.
    Downloads: 6 This Week
  • 2
    Browser Harness

    Self-healing browser harness that enables LLMs to complete any task

    Browser Harness is a self-healing browser control system built to give language models direct and flexible access to a real Chrome browser through the Chrome DevTools Protocol. Its main philosophy is minimalism: instead of imposing a rigid framework, it exposes a very thin bridge so the agent can perform browser tasks with almost no abstraction in the way. A defining part of the project is that the agent can write or extend missing helper functions during a task, which is why the repository describes it as self-healing. ...
    Downloads: 0 This Week
  • 3
    Fli

    Google Flights MCP and Python Library

    Fli is a powerful Python library and command-line tool that provides direct programmatic access to Google Flights data through reverse-engineered API interactions rather than traditional web scraping. This approach enables faster, more reliable, and more stable access to flight information, avoiding the fragility associated with HTML parsing and UI changes. The library supports a wide range of flight search capabilities, including filtering by airline, departure time, number of stops, cabin class, and sorting by price or duration, making it suitable for both casual queries and advanced travel analysis. ...
    Downloads: 0 This Week
  • 4
    RLHF-Reward-Modeling

    Recipes to train reward model for RLHF

    ...It supports multiple optimization strategies commonly used in alignment pipelines, including reinforcement learning with PPO, iterative supervised fine-tuning using rejection sampling, and direct preference optimization methods. The project also includes evaluation results showing that the trained reward models can achieve competitive performance compared with other open-source alignment systems.
    Downloads: 0 This Week
  • 5
    WhisperLive

    A nearly-live implementation of OpenAI's Whisper

    WhisperLive is a “nearly live” implementation of OpenAI’s Whisper model focused on real-time transcription. It runs as a server–client system in which the server hosts a Whisper backend and clients stream audio to be transcribed with very low delay. The project supports multiple inference backends, including Faster-Whisper, NVIDIA TensorRT, and OpenVINO, allowing you to target GPUs and different CPU architectures efficiently. It can handle microphone input, pre-recorded audio files, and...
    Downloads: 27 This Week
  • 6
    GPT-API-free

    Free ChatGPT&DeepSeek API Key

    GPT-API-free is a project that provides access to GPT-style APIs without requiring direct integration with paid official endpoints, focusing on accessibility and ease of experimentation. It offers a proxy-based approach that allows developers to interact with language models through a simplified interface, often requiring minimal configuration. The system is designed to lower barriers for developers who want to test or build applications using conversational AI without managing billing or complex authentication flows. ...
    Downloads: 10 This Week
  • 7
    ScrapeGraphAI

    Python scraper based on AI

    Extracts content from websites and local documents using LLMs. ScrapeGraphAI is a Python web-scraping library that uses LLMs and direct graph logic to build scraping pipelines for websites and local documents (XML, HTML, JSON, Markdown, etc.). Specify the information you want to extract, and the library does the rest.
    Downloads: 1 This Week
  • 8
    edge-tts

    Use Microsoft Edge's online text-to-speech service from Python

    edge-tts is a Python module and command-line tool that gives you direct access to Microsoft Edge’s online text-to-speech service without needing the Edge browser, Windows, or any API key. It wraps the same cloud voices used by Edge, exposing them through a simple CLI (edge-tts, edge-playback) and a Python API, so you can script high-quality speech generation in your own applications. The tool lets you list available voices, specify locale and voice name, and generate audio files in common formats like MP3 or WAV. ...
    Downloads: 25 This Week
  • 9
    pycm

    Multi-class confusion matrix library in Python

    PyCM is a multi-class confusion matrix library written in Python that accepts both input data vectors and a direct matrix. It is a tool for post-classification model evaluation that supports most class-level and overall statistics. PyCM is the Swiss Army knife of confusion matrices, aimed mainly at data scientists who need a broad array of metrics for predictive models and an accurate evaluation of a large variety of classifiers.
    Downloads: 0 This Week
  • 10
    HiDream-I1

    Open-source image generative foundation model

    ...It is designed to produce high-quality images from text prompts while keeping inference practical through efficient model design. The project provides full, dev, and fast model variants with different inference step counts. It supports direct Python inference scripts, an interactive Gradio demo, and integration through the Hugging Face Diffusers library. The model uses a Llama 3.1 text encoder path and requires the proper Hugging Face access setup for automatic downloads. It is useful for researchers, developers, and creative AI builders who want an open text-to-image model with strong benchmark performance and multiple deployment options.
    Downloads: 2 This Week
  • 11
    PPTAgent

    PPTAgent: Generating and Evaluating Presentations

    ...The repository highlights the EMNLP 2025 paper and provides links to resources for replication and study. The approach reflects human presentation practice—plan, draft, then refine with edits—yielding more coherent decks than direct one-shot generation. Community interest and stars suggest strong uptake for research and tooling around presentation automation.
    Downloads: 2 This Week
  • 12
    OpenRLHF

    An Easy-to-use, Scalable and High-performance RLHF Framework

    OpenRLHF is an easy-to-use, scalable, and high-performance framework for Reinforcement Learning from Human Feedback (RLHF). It supports various training techniques and model architectures.
    Downloads: 0 This Week
  • 13
    AutoAgent AI

    Autonomous harness engineering

    AutoAgent is an experimental AI framework focused on autonomous agent engineering, where a meta-agent iteratively improves another agent’s architecture without direct human intervention. Instead of manually tuning prompts or workflows, developers define high-level goals in a configuration file, and the system continuously modifies its own tools, orchestration, and logic based on benchmark performance. It operates through a loop of testing, analyzing failures, and refining the agent’s configuration to maximize a scoring metric. ...
    Downloads: 0 This Week
  • 14
    The AI Scientist-v2

    Workshop-Level Automated Scientific Discovery via Agentic Tree Search

    ...The platform is capable of generating original research ideas, designing and executing experiments, analyzing and visualizing results, and producing full academic papers without direct human intervention. It introduces a generalized framework that removes reliance on predefined templates, enabling broader applicability across multiple machine learning domains and more open-ended exploration of research problems. A key innovation is its progressive agentic tree search, which systematically explores experimental paths and is coordinated by an experiment manager agent that guides decision-making. ...
    Downloads: 0 This Week
  • 15
    KIS Open API

    Korea Investment & Securities Open API Github

    ...It includes example scripts that demonstrate how to authenticate with the service, retrieve financial data, and execute trading operations through REST and WebSocket interfaces. The repository organizes its examples into two main groups: code designed for direct user implementation and simplified examples intended for large language model agents or automation workflows.
    Downloads: 0 This Week
  • 16
    yt-fts

    Search all of YouTube from the command line

    ...Once indexed, users can perform full-text searches across all transcripts to quickly locate keywords or phrases mentioned within the videos. The tool returns search results with timestamps and direct links to the exact moment in the video where the phrase occurs. In addition to traditional keyword search, the system supports experimental semantic search capabilities using embeddings from AI services and vector databases. This allows users to search videos by meaning rather than only exact keywords.
    Downloads: 0 This Week
  • 17
    Spark TTS

    Spark-TTS Inference Code

    Spark TTS is an open-source, PyTorch-based text-to-speech inference system that leverages large language models to produce highly natural, intelligible speech from text input. It uses an efficient single-stream architecture where speech tokens are directly reconstructed from the predictions of an LLM, removing the need for external acoustic models or complex vocoders and making the generation pipeline cleaner and faster. The project supports zero-shot voice cloning, meaning it can imitate a...
    Downloads: 0 This Week
  • 18
    HY-MT

    Hunyuan Translation Model Version 1.5

    HY-MT (Hunyuan Translation) is a high-quality multilingual machine translation model suite developed to support mutual translation across dozens of languages with strong performance even at smaller model scales. It ships with both a 1.8B-parameter model and a larger 7B model, the latter optimized not only for direct translation but also for formatted and contextualized output, allowing better handling of terminology and mixed-language content. The project emphasizes both speed and quality: the smaller model can be quantized and deployed on edge devices for real-time translation tasks without requiring large server infrastructure. Terminology intervention and contextual translation features give users control over how specific terms or styles are rendered, which is important for technical or domain-specific content.
    Downloads: 0 This Week
  • 19
    llm.c

    LLM training in simple, raw C/CUDA

    ...By stripping away heavy frameworks, it exposes the core math and memory flows of embeddings, attention, and feed-forward layers. The code illustrates how to wire forward passes, losses, and simple training or inference loops with direct control over arrays and buffers. Its compact design makes it easy to trace execution, profile hotspots, and understand the cost of each operation. Portability is a goal: it aims to compile with common toolchains and run on modest hardware for small experiments. Rather than delivering a production-grade stack, it serves as a reference and learning scaffold for people who want to “see the metal” behind LLMs.
    Downloads: 0 This Week
  • 20
    HunyuanWorld-Voyager

    RGBD video generation model conditioned on camera input

    HunyuanWorld-Voyager is a next-generation video diffusion framework developed by Tencent-Hunyuan for generating world-consistent 3D scene videos from a single input image. By leveraging user-defined camera paths, it enables immersive scene exploration and supports controllable video synthesis with high realism. The system jointly produces aligned RGB and depth video sequences, making it directly applicable to 3D reconstruction tasks. At its core, Voyager integrates a world-consistent video...
    Downloads: 2 This Week
  • 21
    BlenderMCP

    Blender Model Context Protocol Integration

    BlenderMCP is a bridge that connects Blender, a 3D modeling and rendering software, with AI systems like Claude through the Model Context Protocol, enabling direct AI-driven interaction with 3D environments. It allows users to control Blender using natural language prompts, effectively turning AI into a co-creator for 3D modeling, scene construction, and asset manipulation. The system establishes a two-way communication channel between Blender and the AI, where commands can be sent and results retrieved in real time. ...
    Downloads: 0 This Week
  • 22
    Hugging Face - Speech To Speech

    Open speech-to-speech models and pipelines by Hugging Face

    This project from Hugging Face focuses on enabling direct speech-to-speech processing using modern machine learning models. It provides tools and reference implementations that allow audio input to be transformed into audio output without requiring an intermediate text representation. Hugging Face - Speech To Speech builds on recent advances in speech modeling, combining components such as speech recognition, translation, and synthesis into unified pipelines.
    Downloads: 0 This Week
  • 23
    TensorFlow Quantum

    Open-source Python framework for hybrid quantum-classical machine learning

    ...TensorFlow Quantum integrates with the Cirq quantum computing framework to define and manipulate quantum circuits, while leveraging TensorFlow’s infrastructure for optimization, automatic differentiation, and large-scale computation. The library also supports high-performance simulation of quantum circuits, enabling researchers to test and evaluate quantum models even without direct access to quantum hardware.
    Downloads: 0 This Week
  • 24
    The Alignment Handbook

    Robust recipes to align language models with human and AI preferences

    The Alignment Handbook is an open-source resource created to provide practical guidance for aligning large language models with human preferences and safety requirements. The project focuses on the post-training stage of model development, where models are refined after pre-training to behave more helpfully, safely, and reliably in real-world applications. It provides detailed training recipes that explain how to perform tasks such as supervised fine-tuning, preference modeling, and...
    Downloads: 0 This Week
  • 25
    Fast3R

    Fast3R: Towards 3D Reconstruction of 1000+ Images in One Forward Pass

    ...It outputs high-quality 3D scene representations from unordered or sequential views, scaling to large datasets and varied camera intrinsics. The repository includes pretrained models, Gradio-based demos, and modular APIs for direct integration into research or production workflows.
    Downloads: 0 This Week