Showing 122 open source projects for "direct"

  • 1
    openclaw-kapso-whatsapp

    Give your OpenClaw AI agent a WhatsApp number

    openclaw-kapso-whatsapp is a plugin repository designed to extend the OpenClaw AI agent by giving it a dedicated WhatsApp phone number using the official Meta Cloud API via Kapso, enabling direct interaction through one of the most widely used messaging platforms. This integration allows the autonomous AI assistant to send and receive messages on WhatsApp, turning the agent into a real-world task performer accessible through text conversations. The plugin is built in Go and handles communication entirely through cloud APIs, avoiding the risk of bans that come with unofficial or reverse-engineered interfaces. ...
    Downloads: 1 This Week
  • 2
    AutoAgent AI

    Autonomous harness engineering

    AutoAgent is an experimental AI framework focused on autonomous agent engineering, where a meta-agent iteratively improves another agent’s architecture without direct human intervention. Instead of manually tuning prompts or workflows, developers define high-level goals in a configuration file, and the system continuously modifies its own tools, orchestration, and logic based on benchmark performance. It operates through a loop of testing, analyzing failures, and refining the agent’s configuration to maximize a scoring metric. ...
    Downloads: 0 This Week
  • 3
    KIS Open API

    Korea Investment & Securities Open API Github

    ...It includes example scripts that demonstrate how to authenticate with the service, retrieve financial data, and execute trading operations through REST and WebSocket interfaces. The repository organizes its examples into two main groups: code designed for direct user implementation and simplified examples intended for large language model agents or automation workflows.
    Downloads: 0 This Week
  • 4
    Lobe Icons

    Brings AI/LLM brand logos to your React & React Native apps

    Lobe Icons is an open-source icon library designed to provide developers with a comprehensive collection of logos and visual assets representing popular artificial intelligence platforms, language models, and related technologies. The project focuses on making it easy for developers to include recognizable AI brand icons in applications such as dashboards, AI tools, documentation sites, or developer portals. The library includes icons for a wide range of AI providers and models, allowing...
    Downloads: 0 This Week
  • 5
    yt-fts

    Search all of YouTube from the command line

    ...Once indexed, users can perform full-text searches across all transcripts to quickly locate keywords or phrases mentioned within the videos. The tool returns search results with timestamps and direct links to the exact moment in the video where the phrase occurs. In addition to traditional keyword search, the system supports experimental semantic search capabilities using embeddings from AI services and vector databases. This allows users to search videos by meaning rather than only exact keywords.
    Downloads: 0 This Week
  • 6
    ANE Training

    Training neural networks on Apple Neural Engine via APIs

    ANE Training is an experimental research project that demonstrates how to train neural networks directly on Apple’s Neural Engine by leveraging reverse-engineered private APIs that are normally inaccessible to developers. The repository implements a from-scratch transformer training pipeline capable of running both forward and backward passes on ANE hardware without relying on CoreML, Metal, or GPU acceleration. It explores the internal software stack of the Apple Neural Engine by...
    Downloads: 0 This Week
  • 7
    Spark TTS

    Spark-TTS Inference Code

    Spark TTS is an open-source, PyTorch-based text-to-speech inference system that leverages large language models to produce highly natural, intelligible speech from text input. It uses an efficient single-stream architecture where speech tokens are directly reconstructed from the predictions of an LLM, removing the need for external acoustic models or complex vocoders and making the generation pipeline cleaner and faster. The project supports zero-shot voice cloning, meaning it can imitate a...
    Downloads: 0 This Week
  • 8
    HY-MT

    Hunyuan Translation Model Version 1.5

    HY-MT (Hunyuan Translation) is a multilingual machine translation model suite that supports mutual translation across dozens of languages, with strong performance even at smaller model scales. It ships with both a 1.8B-parameter model and a larger 7B model, the latter optimized not only for direct translation but also for formatted and contextualized output, allowing better handling of terminology and mixed-language content. The project emphasizes both speed and quality: the smaller model can be quantized and deployed on edge devices for real-time translation tasks without requiring large server infrastructure. Terminology intervention and contextual translation features give users control over how specific terms or styles are rendered, which is important for technical or domain-specific content.
    Downloads: 0 This Week
  • 9
    Sled

    Teleport Claude Code, Codex or Gemini CLI to your phone

    Sled, hosted under the layercodedev account, appears to be a lightweight web and mobile UI for interacting with local coding agents, likely including AI-assisted coding models or remote execution integrations. It is written in TypeScript/JavaScript and intended to let developers use or control coding agents from various devices. The repository offers few specifics, but context and related online mentions indicate that it functions as a local interface layer that abstracts development agent workflows and Teleport-style interactions, bringing parts of modern assistant capabilities to phone or web UIs. The project resembles modern agent front ends where developers can test, iterate, and prompt their local models or backends without complex setup. ...
    Downloads: 0 This Week
  • 10
    llm.c

    LLM training in simple, raw C/CUDA

    ...By stripping away heavy frameworks, it exposes the core math and memory flows of embeddings, attention, and feed-forward layers. The code illustrates how to wire forward passes, losses, and simple training or inference loops with direct control over arrays and buffers. Its compact design makes it easy to trace execution, profile hotspots, and understand the cost of each operation. Portability is a goal: it aims to compile with common toolchains and run on modest hardware for small experiments. Rather than delivering a production-grade stack, it serves as a reference and learning scaffold for people who want to “see the metal” behind LLMs.
    Downloads: 0 This Week
  • 11
    HunyuanWorld-Voyager

    RGBD video generation model conditioned on camera input

    HunyuanWorld-Voyager is a next-generation video diffusion framework developed by Tencent-Hunyuan for generating world-consistent 3D scene videos from a single input image. By leveraging user-defined camera paths, it enables immersive scene exploration and supports controllable video synthesis with high realism. The system jointly produces aligned RGB and depth video sequences, making it directly applicable to 3D reconstruction tasks. At its core, Voyager integrates a world-consistent video...
    Downloads: 2 This Week
  • 12
    Ars Contexta

    Claude Code plugin that generates individualized knowledge systems

    Ars Contexta is a Claude Code plugin designed to automatically transform conversations into structured, personalized knowledge systems that function as a “second brain.” Instead of leaving insights scattered across chat sessions, the tool captures how a user thinks, works, and solves problems, then converts those interactions into organized markdown files that the user fully owns. The system emphasizes long-term knowledge retention by structuring information into reusable and evolving...
    Downloads: 0 This Week
  • 13
    webclaw

    Fast, local-first web content extraction for LLMs

    ...The tool addresses a major inefficiency in AI workflows by removing irrelevant elements like navigation menus, ads, and scripts, significantly reducing token usage when feeding data into language models. It supports multiple modes of operation, including CLI usage, REST API access, and an MCP server for direct integration with agent-based systems. Webclaw also provides advanced capabilities such as recursive crawling, structured JSON extraction, summarization, and content comparison, making it suitable for research and data pipelines. Its local-first architecture ensures privacy and eliminates the need for API keys.
    Downloads: 0 This Week
  • 14
    Hugging Face - Speech To Speech

    Open speech-to-speech models and pipelines from Hugging Face

    This project from Hugging Face focuses on enabling direct speech-to-speech processing using modern machine learning models. It provides tools and reference implementations that allow audio input to be transformed into audio output without requiring an intermediate text representation. Hugging Face - Speech To Speech builds on recent advances in speech modeling, combining components such as speech recognition, translation, and synthesis into unified pipelines.
    Downloads: 0 This Week
  • 15
    TensorFlow Quantum

    Open-source Python framework for hybrid quantum-classical machine learning

    ...TensorFlow Quantum integrates with the Cirq quantum computing framework to define and manipulate quantum circuits, while leveraging TensorFlow’s infrastructure for optimization, automatic differentiation, and large-scale computation. The library also supports high-performance simulation of quantum circuits, enabling researchers to test and evaluate quantum models even without direct access to quantum hardware.
    Downloads: 0 This Week
  • 16
    RAG from Scratch

    Demystify RAG by building it from scratch

    ...Each example is written with detailed explanations so that developers can understand the internal mechanics of semantic search and context-aware language generation. The repository emphasizes learning through direct implementation, allowing users to see how each component of the RAG architecture functions independently.
    Downloads: 0 This Week
  • 17
    The Alignment Handbook

    Robust recipes to align language models with human and AI preferences

    The Alignment Handbook is an open-source resource created to provide practical guidance for aligning large language models with human preferences and safety requirements. The project focuses on the post-training stage of model development, where models are refined after pre-training to behave more helpfully, safely, and reliably in real-world applications. It provides detailed training recipes that explain how to perform tasks such as supervised fine-tuning, preference modeling, and...
    Downloads: 0 This Week
  • 18
    Fast3R

    Fast3R: Towards 3D Reconstruction of 1000+ Images in One Forward Pass

    ...It outputs high-quality 3D scene representations from unordered or sequential views, scaling to large datasets and varied camera intrinsics. The repository includes pretrained models, Gradio-based demos, and modular APIs for direct integration into research or production workflows.
    Downloads: 0 This Week
  • 19
    Seamless Communication

    Foundational Models for State-of-the-Art Speech and Text Translation

    Seamless Communication is a research project focused on building more integrated, low-latency multimodal communication between humans and AI agents. The motivation is to move beyond “text in, text out” and enable direct, live, multi-turn exchange involving language, gesture, gaze, vision, and modality switching without user friction. The system architecture includes a real-time multimodal signal pipeline for audio, video, and sensor data, a dialog manager that can decide when to act (speak, gesture, point) or query, and a cross-modal reasoning layer that fuses perception with semantic context. ...
    Downloads: 0 This Week
  • 20
    MODMAIL

    A feature-rich Discord Modmail bot

    ...This bot is free for everyone and always will be. If you like this project and would like to show your appreciation, you can support us on Patreon, cool benefits included! When a member sends a direct message to the bot, Modmail will create a channel or "thread" in a designated category. All further DMs are automatically relayed to that channel; any available staff can respond within the channel. Schedule tasks in human time, e.g. ?close in 2 hours silently. Message edits and deletions are synced. A diverse range of message content (multiple images, files) is supported. ...
    Downloads: 0 This Week
  • 21
    ConfiChat

    Lightweight, standalone, multi-platform, and privacy-focused local LLM

    ...A key differentiator is its optional encryption of chat history and assets, ensuring that sensitive data can remain secure even when stored locally. Conversations are managed as local JSON files, giving users transparency and direct control over their data. Overall, ConfiChat is designed for users who prioritize privacy, flexibility, and independence from complex infrastructure while still maintaining access.
    Downloads: 0 This Week
  • 22
    Coinbase Agentic Wallet Skills

    npx skills add coinbase/agentic-wallet-skills

    The Coinbase Agentic Wallet Skills project is a modular skill library developed by Coinbase as part of its Agentic Wallet ecosystem, designed to give AI agents direct access to on-chain financial operations through a standardized and reusable interface. It provides a set of pre-built “skills” that abstract complex blockchain interactions into simple, callable capabilities, allowing agents to authenticate, manage funds, and execute transactions without requiring developers to implement low-level logic. ...
    Downloads: 0 This Week
  • 23
    Slack MCP Server

    The most powerful MCP Slack Server with no permission requirements

    Slack MCP Server is an open-source server implementation that connects Slack workspaces to AI systems through the Model Context Protocol (MCP). MCP is a standardized protocol that allows large language models and AI agents to securely interact with external tools and data sources such as messaging platforms, databases, or file systems. The slack-mcp-server acts as an intermediary layer that exposes Slack data and messaging functionality to AI clients while enforcing access rules and...
    Downloads: 0 This Week
  • 24
    ERNIE

    The official repository for ERNIE 4.5 and ERNIEKit

    ERNIE is an open-source large-model toolkit and model family from the PaddlePaddle ecosystem that focuses on training, fine-tuning, compression, and practical application of ERNIE large language models. The repository positions ERNIEKit as an industrial-grade development toolkit, emphasizing end-to-end workflows that span high-performance pre-training, supervised fine-tuning, and alignment. It supports both full-parameter training and parameter-efficient approaches so teams can choose...
    Downloads: 0 This Week
  • 25
    MAI-UI

    Real-World Centric Foundation GUI Agents

    MAI-UI is a cutting-edge open-source project that implements a family of foundation GUI (Graphical User Interface) agent models capable of interpreting natural language and performing real-world GUI navigation and control tasks across mobile and desktop environments. Developed by Tongyi-MAI (Alibaba’s research initiative), the MAI-UI models are multimodal agents trained to understand user instructions and corresponding screenshots, grounding those instructions to on-screen elements and...
    Downloads: 0 This Week