55 projects for "tools" with 2 filters applied:

  • $300 Free Credits for Your Google Cloud Projects Icon
    $300 Free Credits for Your Google Cloud Projects

    Start building on Google Cloud with $300 in free credits. No commitment, no credit card required until you're ready to scale.

    Launch your next project with $300 in free Google Cloud credits—no strings attached. Test, build, and deploy without risk. Use your credits across the entire Google Cloud platform to find what works best for your needs. After your credits are used, continue with always-free tier services. Only pay when you're ready to scale. Sign up in minutes and start exploring.
    Start Free Trial
  • Stop Cyber Threats with VM-Series Next-Gen Firewall on Azure Icon
    Stop Cyber Threats with VM-Series Next-Gen Firewall on Azure

    Native application identity and user-based security for your Azure cloud

    Gain integrated visibility across all traffic in a single pass. Deploy Palo Alto Networks VM-Series to determine application identity and content while automating security policy updates via rich APIs.
    Get a free trial
  • 1
    LTX-2.3

    LTX-2.3

    Official Python inference and LoRA trainer package

    LTX-2.3 is an open-source multimodal artificial intelligence foundation model developed by Lightricks for generating synchronized video and audio from prompts or other inputs. Unlike most earlier video generation systems that only produced silent clips, LTX-2 combines video and audio generation in a unified architecture capable of producing coherent audiovisual scenes. The model uses a diffusion-transformer-based architecture designed to generate high-fidelity visual frames while...
    Downloads: 127 This Week
    Last Update:
    See Project
  • 2
    Claude Code SDK Python

    Claude Code SDK Python

    Python SDK for Claude Agent

    ...It provides abstractions to easily query Claude Code (with streaming support) and conduct interactive sessions. The SDK includes core client classes, asynchronous query functions, and support for custom tools and hooks within Claude sessions. It is designed to integrate with local Python workflows and allow developers to embed Claude Code capabilities directly in their applications or scripts. The repo is MIT-licensed and includes documentation and installation instructions (requires Python 3.10+, Node installation of Claude Code). Example usage shows how to stream responses, parse structured message blocks, or create persistent client sessions.
    Downloads: 11 This Week
    Last Update:
    See Project
  • 3
    Depth Anything 3

    Depth Anything 3

    Recovering the Visual Space from Any Views

    ...The model can be applied to photography, AR/VR content creation, robotics perception, and 3D reconstruction workflows, making it versatile across industries and research domains. It includes support for high-resolution inputs and post-processing tools that refine depth predictions, helping downstream tasks like segmentation, bounding volume estimation, and mixed reality layering.
    Downloads: 10 This Week
    Last Update:
    See Project
  • 4
    FLUX.1

    FLUX.1

    Official inference repo for FLUX.1 models

    ...This repo focuses on running the open-source model variants efficiently, providing scripts, model loading logic, and examples for local installations, and supports integration with Python toolchains like PyTorch and popular generative pipelines. Users can launch CLI tools to generate images, experiment with different FLUX variants, and extend the base code for research-oriented applications.
    Downloads: 44 This Week
    Last Update:
    See Project
  • Build Agents and Models on One Platform Icon
    Build Agents and Models on One Platform

    Everything you need to build production-ready agents and models. Access 200+ Google and third-party AI models and tools.

    Gemini Enterprise Agent Platform is Google Cloud's comprehensive platform for developers to build, scale, govern, and optimize agents and models. Choose from Google's most advanced models and third-party models like Anthropic's Claude Model Family.
    Try It Free
  • 5
    Qwen3-TTS

    Qwen3-TTS

    Qwen3-TTS is an open-source series of TTS models

    Qwen3-TTS is an open-source text-to-speech (TTS) project built around the Qwen3 large language model family, focused on generating high-quality, natural-sounding speech from plain text input. It provides researchers and developers with tools to transform text into expressive, intelligible audio, supporting multiple languages and voice characteristics tuned for clarity and fluidity. The project includes pre-trained models and inference scripts that let users synthesize speech locally or integrate TTS into larger pipelines such as voice assistants, accessibility tools, or multimedia generation workflows. ...
    Downloads: 7 This Week
    Last Update:
    See Project
  • 6
    GLM-4.5

    GLM-4.5

    GLM-4.5: Open-source LLM for intelligent agents by Z.ai

    GLM-4.5 is a cutting-edge open-source large language model designed by Z.ai for intelligent agent applications. The flagship GLM-4.5 model has 355 billion total parameters with 32 billion active parameters, while the compact GLM-4.5-Air version offers 106 billion total parameters and 12 billion active parameters. Both models unify reasoning, coding, and intelligent agent capabilities, providing two modes: a thinking mode for complex reasoning and tool usage, and a non-thinking mode for...
    Downloads: 70 This Week
    Last Update:
    See Project
  • 7
    Anthropic SDK Python

    Anthropic SDK Python

    Provides convenient access to the Anthropic REST API from any Python 3

    ...Importantly, it also supports streaming responses via Server-Sent Events (SSE) so that large outputs can be consumed incrementally rather than waiting for the full response. The client offers helper abstractions for tools (function-style “tools”) and streaming utilities for building interactive agents.
    Downloads: 4 This Week
    Last Update:
    See Project
  • 8
    Qwen3.5

    Qwen3.5

    Qwen3.5 is the large language model series developed by Qwen team

    Qwen3.5 is part of Alibaba’s Qwen family of large language and multimodal foundation models, designed to power advanced AI applications such as chatbots, coding assistants, and autonomous agents. The project represents a significant step toward “agentic AI,” meaning models that can reason through multi-step tasks and interact with external tools or environments rather than only generating text. Qwen3.5 builds on earlier Qwen generations by improving multilingual understanding, reasoning ability, and efficiency, while also introducing native multimodal capabilities that allow the model to work with both language and visual inputs. Architecturally, the system leverages modern large-scale training techniques and mixture-of-experts style efficiency so that very large parameter counts can be used while keeping inference practical.
    Downloads: 17 This Week
    Last Update:
    See Project
  • 9
    ChatGLM-6B

    ChatGLM-6B

    ChatGLM-6B: An Open Bilingual Dialogue Language Model

    ChatGLM-6B is an open bilingual (Chinese + English) conversational language model based on the GLM architecture, with approximately 6.2 billion parameters. The project provides inference code, demos (command line, web, API), quantization support for lower memory deployment, and tools for finetuning (e.g., via P-Tuning v2). It is optimized for dialogue and question answering with a balance between performance and deployability in consumer hardware settings. Support for quantized inference (INT4, INT8) to reduce GPU memory requirements. Automatic mode switching between precision/memory tradeoffs (full/quantized).
    Downloads: 16 This Week
    Last Update:
    See Project
  • One App to Replace Your Entire SaaS Stack Icon
    One App to Replace Your Entire SaaS Stack

    Projects, docs, chat, and AI in one workspace. Work faster, not across 10 tabs.

    ClickUp replaces your scattered tool stack with one AI-powered platform. Stop paying for project management, docs, chat, and time tracking separately when they all live in one place. Teams that consolidate into ClickUp cut software costs and move faster because everything is connected, not siloed across apps that don't talk to each other.
    Try ClickUp Free
  • 10
    HRM-Text

    HRM-Text

    1B text generation model based on the HRM architecture

    ...The system combines hierarchical recurrent design, task-completion strengthening, and latent-space reasoning. Its training stack includes PrefixLM sequence packing, FlashAttention 3 kernels, PyTorch FSDP2, evaluation scripts, and checkpoint conversion tools. The repository supports reference pretraining runs for smaller and larger configurations, with Hopper-class GPUs expected for the attention path. It is useful for researchers and engineers exploring efficient language model pretraining, reasoning-focused architectures, and reproducible foundation model experiments.
    Downloads: 1 This Week
    Last Update:
    See Project
  • 11
    FinGPT

    FinGPT

    Open-Source Financial Large Language Models

    ...It extends traditional GPT-style models by connecting them to live or historical financial datasets, news APIs, and economic indicators so that outputs are grounded in relevant and recent market conditions rather than generic knowledge alone. The platform typically includes tools for fine-tuning, context engineering, and prompt templating, enabling users to build specialized assistants for tasks like sentiment analysis, earnings summary generation, risk profiling, trading signal interpretation, and document extraction from financial reports.
    Downloads: 2 This Week
    Last Update:
    See Project
  • 12
    LTX-2

    LTX-2

    Python inference and LoRA trainer package for the LTX-2 audio–video

    ...Beyond basic rendering scaffolding, LTX-2 includes optimized math libraries, resource loaders, utilities for texture and buffer handling, and integration points for native event loops and input systems. The framework targets both interactive graphical applications and media-rich experiences, making it a solid foundation for games, creative tools, or visualization systems that demand both performance and flexibility. While being low-level, it also provides sensible defaults and helper abstractions that reduce boilerplate and help teams maintain clear, maintainable code.
    Downloads: 17 This Week
    Last Update:
    See Project
  • 13
    DeepSeek-OCR 2

    DeepSeek-OCR 2

    Visual Causal Flow

    DeepSeek-OCR-2 is the second-generation optical character recognition system developed to improve document understanding by introducing a “visual causal flow” mechanism, enabling the encoder to reorder visual tokens in a way that better reflects semantic structure rather than strict raster scan order. It is designed to handle complex layouts and noisy documents by giving the model causal reasoning capabilities that mimic human visual scanning behavior, enhancing OCR performance on documents...
    Downloads: 6 This Week
    Last Update:
    See Project
  • 14
    Open Infra Index

    Open Infra Index

    Production-tested AI infrastructure tools

    open-infra-index is a central “infrastructure index” repository maintained by DeepSeek AI that acts as a catalog and hub for a collection of production-tested AI infrastructure tools and internal building blocks they have open-sourced. Instead of a single monolithic codebase, it functions more like an index or launching point: linking and documenting a set of library repos (e.g. FlashMLA, DeepEP, DeepGEMM, 3FS, etc.) that together form DeepSeek’s infrastructure stack. The repo's README describes the project as sharing “humble building blocks” of their online service—code that is documented, deployed, and battle-tested in production. ...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 15
    DeepSeek Coder

    DeepSeek Coder

    DeepSeek Coder: Let the Code Write Itself

    ...Multiple sizes of the model are offered (e.g. 1B, 5.7B, 6.7B, 33B) so users can trade off inference cost vs capability. The repo provides model weights, documentation on training setup, evaluation results on common benchmarks (HumanEval, MultiPL-E, APPS, etc.), and inference tools.
    Downloads: 9 This Week
    Last Update:
    See Project
  • 16
    IndexTTS2

    IndexTTS2

    Industrial-level controllable zero-shot text-to-speech system

    ...The system supports zero-shot voice cloning — meaning it can mimic a target speaker’s voice from a short reference sample — making it versatile for multi-voice uses. Compared to many open-source TTS tools, IndexTTS emphasizes efficiency and controllability: it offers faster inference, simpler training pipelines, and controllable speech parameters (like duration, pitch, and prosody), which is critical for production use.
    Downloads: 8 This Week
    Last Update:
    See Project
  • 17
    Cactus Needle

    Cactus Needle

    26m function call model that runs on incredibly small devices

    ...It is based on a Simple Attention Network architecture and was distilled from a much larger model to focus on fast, compact tool-use behavior. The project provides open weights, training details, dataset generation resources, and a playground for testing the model with custom tools. Needle is optimized for single-shot function calling rather than broad conversational ability, so its core use case is selecting the right tool and producing structured arguments. It can be fine-tuned locally, including on consumer machines, which makes it useful for experimentation with small personalized agents. The project is best suited for researchers and developers exploring tiny AI models, edge inference, and lightweight tool-calling systems.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 18
    Qwen3-ASR

    Qwen3-ASR

    Qwen3-ASR is an open-source series of ASR models

    ...The architecture combines advanced neural acoustic modeling with context-aware language prediction so that outputs maintain both fidelity to the original speech and grammatical coherence. This makes Qwen3-ASR suitable for voice-driven applications like AI assistants, dictation tools, speech analytics pipelines, and accessibility features, where accurate and fluid transcription is critical.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 19
    Hunyuan3D-1

    Hunyuan3D-1

    A Unified Framework for Text-to-3D and Image-to-3D Generation

    ...(Note: less detailed public documentation was found for Hunyuan3D-1 compared to 2.1.). Community and ecosystem support (e.g. usage via Blender addon for geometry/texture). Integration into user-friendly tools/platforms.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 20
    DeepSeek-V3.2-Exp

    DeepSeek-V3.2-Exp

    An experimental version of DeepSeek model

    ...In public evaluations across a variety of reasoning, code, and question-answering benchmarks (e.g. MMLU, LiveCodeBench, AIME, Codeforces, etc.), V3.2-Exp shows performance very close to or in some cases matching that of V3.1-Terminus. The repository includes tools and kernels to support the new sparse architecture—for instance, CUDA kernels, logit indexers, and open-source modules like FlashMLA and DeepGEMM are invoked for performance.
    Downloads: 6 This Week
    Last Update:
    See Project
  • 21
    Profile Data

    Profile Data

    Analyze computation-communication overlap in V3/R1

    ...DualPipe), and how MoE / EP / parallelism strategies interact in real systems. The repository contains JSON trace files like train.json, prefill.json, decode.json, and associated assets. Users can load them into tools like Chrome tracing to inspect GPU idle times, overlapping operations, and scheduling alignment. The idea is to bring transparency to internal efficiency tradeoffs, enabling researchers to reproduce, analyze, or improve on DeepSeek’s parallelism strategies. The README explains how trace data corresponds to forward/backward chunks, settings (e.g. ...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 22
    Stable Diffusion WebUI Forge

    Stable Diffusion WebUI Forge

    Stable Diffusion WebUI Forge is a platform on top of Stable Diffusion

    ...The UI surfaces advanced options in a way that remains recognizable to WebUI users, so migration costs are low while gaining experimental features. In practice, Forge serves as a proving ground for ideas that may later influence upstream tools, giving power users early access to cutting-edge techniques.
    Downloads: 5 This Week
    Last Update:
    See Project
  • 23
    MOSS-TTS Family

    MOSS-TTS Family

    MOSS‑TTS Family open‑source speech and sound generation model

    ...The broader family also includes dialogue generation, prompt-based voice creation, streaming voice-agent support, and a unified audio tokenizer. It is especially useful for developers building dubbing, podcasts, audiobooks, voice assistants, character voices, and creative audio tools.
    Downloads: 3 This Week
    Last Update:
    See Project
  • 24
    HunyuanVideo-Foley

    HunyuanVideo-Foley

    Multimodal Diffusion with Representation Alignment

    HunyuanVideo-Foley is a multimodal diffusion model from Tencent Hunyuan for high-fidelity Foley (sound effects) audio generation synchronized to video scenes. It is designed to generate audio that matches both visual content and textual semantic cues, for use in video production, film, advertising, games, etc. The model architecture aligns audio, video, and text representations to produce realistic synchronized soundtracks. Produces high-quality 48 kHz audio output suitable for professional...
    Downloads: 2 This Week
    Last Update:
    See Project
  • 25
    GLM-4.6V

    GLM-4.6V

    GLM-4.6V/4.5V/4.1V-Thinking, towards versatile multimodal reasoning

    ...Unlike many vision-language models that treat images and text separately or require intermediate conversions, GLM-4.6V allows inputs such as images, screenshots or document pages directly as part of its reasoning pipeline — and can output or act via tools seamlessly, bridging perception and execution. Its architecture supports a very large context window (on the order of 128K tokens during training), which lets it handle complex multimodal inputs like long documents, multi-page reports, or video transcripts, while maintaining coherence across extended content. In benchmarks and internal evaluations, GLM-4.6V achieves state-of-the-art (SoTA) performance among models of comparable parameter scale on multimodal reasoning.
    Downloads: 0 This Week
    Last Update:
    See Project
  • Previous
  • You're on page 1
  • 2
  • 3
  • Next
Auth0 Logo