Showing 33 open source projects for "task"

View related business solutions
  • Auth0 B2B Essentials: SSO, MFA, and RBAC Built In Icon
    Auth0 B2B Essentials: SSO, MFA, and RBAC Built In

    Unlimited organizations, 3 enterprise SSO connections, role-based access control, and pro MFA included. Dev and prod tenants out of the box.

    Auth0's B2B Essentials plan gives you everything you need to ship secure multi-tenant apps. Unlimited orgs, enterprise SSO, RBAC, audit log streaming, and higher auth and API limits included. Add on M2M tokens, enterprise MFA, or additional SSO connections as you scale.
    Sign Up Free
  • Build Agents and Models on One Platform Icon
    Build Agents and Models on One Platform

    Everything you need to build production-ready agents and models. Access 200+ Google and third-party AI models and tools.

    Gemini Enterprise Agent Platform is Google Cloud's comprehensive platform for developers to build, scale, govern, and optimize agents and models. Choose from Google's most advanced models and third-party models like Anthropic's Claude Model Family.
    Try It Free
  • 1
    FinGPT

    FinGPT

    Open-Source Financial Large Language Models

    FinGPT is an open-source, finance-specialized large language model framework that blends the capabilities of general LLMs with real-time financial data feeds, domain-specific knowledge bases, and task-oriented agents to support market analysis, research automation, and decision support. It extends traditional GPT-style models by connecting them to live or historical financial datasets, news APIs, and economic indicators so that outputs are grounded in relevant and recent market conditions rather than generic knowledge alone. The platform typically includes tools for fine-tuning, context engineering, and prompt templating, enabling users to build specialized assistants for tasks like sentiment analysis, earnings summary generation, risk profiling, trading signal interpretation, and document extraction from financial reports.
    Downloads: 3 This Week
    Last Update:
    See Project
  • 2
    GLM-5

    GLM-5

    From Vibe Coding to Agentic Engineering

    GLM-5 is a next-generation open-source large language model (LLM) developed by the Z .ai team under the zai-org organization that pushes the boundaries of reasoning, coding, and long-horizon agentic intelligence. Building on earlier GLM series models, GLM-5 dramatically scales the parameter count (to roughly 744 billion) and expands pre-training data to significantly improve performance on complex tasks such as multi-step reasoning, software engineering workflows, and agent orchestration...
    Downloads: 64 This Week
    Last Update:
    See Project
  • 3
    Qwen3 Embedding

    Qwen3 Embedding

    Designed for text embedding and ranking tasks

    ...It builds upon the Qwen3 base/dense models and offers several sizes (0.6B, 4B, 8B parameters), for both embedding and reranking, with high multilingual capability, long‐context understanding, and reasoning. It achieves state-of-the-art performance on benchmarks like MTEB (Multilingual Text Embedding Benchmark) and supports instruction-aware embedding (i.e. embedding task instructions along with queries) and flexible embedding/vector dimension definitions. It is meant for tasks such as text retrieval, classification, clustering, bitext mining, and code retrieval.
    Downloads: 3 This Week
    Last Update:
    See Project
  • 4
    Qwen3.5

    Qwen3.5

    Qwen3.5 is the large language model series developed by Qwen team

    Qwen3.5 is part of Alibaba’s Qwen family of large language and multimodal foundation models, designed to power advanced AI applications such as chatbots, coding assistants, and autonomous agents. The project represents a significant step toward “agentic AI,” meaning models that can reason through multi-step tasks and interact with external tools or environments rather than only generating text. Qwen3.5 builds on earlier Qwen generations by improving multilingual understanding, reasoning...
    Downloads: 25 This Week
    Last Update:
    See Project
  • $300 Free Credits for Your Google Cloud Projects Icon
    $300 Free Credits for Your Google Cloud Projects

    Start building on Google Cloud with $300 in free credits. No commitment, no credit card required until you're ready to scale.

    Launch your next project with $300 in free Google Cloud credits—no strings attached. Test, build, and deploy without risk. Use your credits across the entire Google Cloud platform to find what works best for your needs. After your credits are used, continue with always-free tier services. Only pay when you're ready to scale. Sign up in minutes and start exploring.
    Start Free Trial
  • 5
    4M

    4M

    4M: Massively Multimodal Masked Modeling

    ...Training/inference configs and issues discuss things like depth tokenizers, input masks for generation, and CUDA build questions, signaling active research iteration. The design leans into flexibility and steerability, so prompts and masks can shape behavior without bespoke heads per task. In short, 4M provides a unified recipe to pretrain large multimodal models that generalize broadly while remaining practical to fine-tune.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 6
    MiniMax-M2.5

    MiniMax-M2.5

    State of the art LLM and coding model

    MiniMax-M2.5 is a state-of-the-art foundation model extensively trained with reinforcement learning across hundreds of thousands of real-world environments. It delivers leading performance in coding, agentic tool use, search, and complex office workflows, achieving top benchmark scores such as 80.2% on SWE-Bench Verified and 76.3% on BrowseComp. Designed to reason efficiently and decompose tasks like an experienced architect, M2.5 plans features, structure, and system design before...
    Downloads: 2 This Week
    Last Update:
    See Project
  • 7
    Kimi K2.5

    Kimi K2.5

    Moonshot's most powerful AI model

    ...Based on a 1T-parameter Mixture-of-Experts (MoE) architecture with 32B activated parameters, it integrates advanced language reasoning with strong visual understanding. K2.5 supports both “Thinking” and “Instant” modes, enabling either deep step-by-step reasoning or low-latency responses depending on the task. Designed for agentic workflows, it features an Agent Swarm mechanism that decomposes complex problems into coordinated sub-agents executing in parallel. With a 256K context length and MoonViT vision encoder, the model excels across reasoning, coding, long-context comprehension, image, and video benchmarks. Kimi K2.5 is available via Moonshot’s API (OpenAI/Anthropic-compatible) and supports deployment through vLLM, SGLang, and KTransformers.
    Downloads: 25 This Week
    Last Update:
    See Project
  • 8
    MiMo-V2-Flash

    MiMo-V2-Flash

    MiMo-V2-Flash: Efficient Reasoning, Coding, and Agentic Foundation

    MiMo-V2-Flash is a large Mixture-of-Experts language model designed to deliver strong reasoning, coding, and agentic-task performance while keeping inference fast and cost-efficient. It uses an MoE setup where a very large total parameter count is available, but only a smaller subset is activated per token, which helps balance capability with runtime efficiency. The project positions the model for workflows that require tool use, multi-step planning, and higher throughput, rather than only single-turn chat. ...
    Downloads: 5 This Week
    Last Update:
    See Project
  • 9
    CLIP

    CLIP

    CLIP, Predict the most relevant text snippet given an image

    ...It was trained on large sets of (image, caption) pairs using a contrastive objective: images and their matching text are pulled together in embedding space, while mismatches are pushed apart. Once trained, you can give it any text labels and ask it to pick which label best matches a given image—even without explicit training for that classification task. The repository provides code for model architecture, preprocessing transforms, evaluation pipelines, and example inference scripts. Because it generalizes to arbitrary labels via text prompts, CLIP is a powerful tool for tasks that involve interpreting images in terms of descriptive language.
    Downloads: 2 This Week
    Last Update:
    See Project
  • Secure File Transfer for Windows with Cerberus by Redwood Icon
    Secure File Transfer for Windows with Cerberus by Redwood

    Protect and share files over FTP/S, SFTP, HTTPS and SCP with the #1 rated Windows file transfer server.

    Cerberus supports unlimited users and connections on a single IP, with built-in encryption, 2FA, and a browser-based web client — all deployable in under 15 minutes with a 25-day free trial.
    Try for Free
  • 10
    DINOv2

    DINOv2

    PyTorch code and models for the DINOv2 self-supervised learning

    DINOv2 is a self-supervised vision learning framework that produces strong, general-purpose image representations without using human labels. It builds on the DINO idea of student–teacher distillation and adapts it to modern Vision Transformer backbones with a carefully tuned recipe for data augmentation, optimization, and multi-crop training. The core promise is that a single pretrained backbone can transfer well to many downstream tasks—from linear probing on classification to retrieval,...
    Downloads: 1 This Week
    Last Update:
    See Project
  • 11
    Map-Anything

    Map-Anything

    MapAnything: Universal Feed-Forward Metric 3D Reconstruction

    Map-Anything is a universal, feed-forward transformer for metric 3D reconstruction that predicts a scene’s geometry and camera parameters directly from visual inputs. Instead of stitching together many task-specific models, it uses a single architecture that supports a wide range of 3D tasks—multi-image structure-from-motion, multi-view stereo, monocular metric depth, registration, depth completion, and more. The model flexibly accepts different input combinations (images, intrinsics, poses, sparse or dense depth) and produces a rich set of outputs including per-pixel 3D points, camera intrinsics, camera poses, ray directions, confidence maps, and validity masks. ...
    Downloads: 1 This Week
    Last Update:
    See Project
  • 12
    Ling-V2

    Ling-V2

    Ling-V2 is a MoE LLM provided and open-sourced by InclusionAI

    ...Trained on more than 20 trillion tokens of high-quality data and enhanced through multi-stage supervised fine-tuning and reinforcement learning, Ling-V2’s models demonstrate strong general reasoning, mathematical problem-solving, coding understanding, and knowledge-intensive task performance.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 13
    ralph-loop-agent

    ralph-loop-agent

    Continuous Autonomy for the AI SDK

    ralph-loop-agent is an experimental autonomous agent framework from Vercel Labs that brings continuous autonomy to the AI SDK, enabling AI solutions to perform long-running, iterative tasks without manual stop/start intervention. Rather than simply answering a single request and stopping, Ralph Loop implements a loop control architecture that allows an agent to repeatedly evaluate its progress, adjust its approach, and continue working toward a defined completion criteria until tasks are...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 14
    NVIDIA Isaac GR00T

    NVIDIA Isaac GR00T

    NVIDIA Isaac GR00T N1.5 is the world's first open foundation model

    NVIDIA Isaac‑GR00T N1.5 is an open-source foundation model engineered for generalized humanoid robot reasoning and manipulation skills. It accepts multimodal inputs—such as language and images—and uses a diffusion transformer architecture built upon vision-language encoders, enabling adaptive robot behaviors across diverse environments. It is designed to be customizable via post-training with real or synthetic data. The vision-language model remains frozen during both pretraining and...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 15
    CogVLM

    CogVLM

    A state-of-the-art open visual language model

    ...The repo provides multiple ways to run models (CLI, web demo, and OpenAI-Vision–style APIs), along with quantization options that reduce VRAM needs (e.g., 4-bit). It includes checkpoints for chat, base, and grounding variants, plus recipes for model-parallel inference and LoRA fine-tuning. The documentation covers task prompts for general dialogue, visual grounding (box→caption, caption→box, caption+boxes), and GUI agent workflows that produce structured actions with bounding boxes.
    Downloads: 5 This Week
    Last Update:
    See Project
  • 16
    Qwen-Audio

    Qwen-Audio

    Chat & pretrained large audio language model proposed by Alibaba Cloud

    ...There is also an instruction-tuned version called Qwen-Audio-Chat which supports conversational interaction (multi-round), audio + text input, creative tasks and reasoning over audio. It uses multi-task training over many different audio tasks (30+), and achieves strong multi-benchmarks performance without task-specific fine‐tuning. It includes features such as flexible multi-run chat, audio understanding/reasoning, music appreciation, and also tool usage (e.g. voice editing).
    Downloads: 0 This Week
    Last Update:
    See Project
  • 17
    DeepSeek Math

    DeepSeek Math

    Pushing the Limits of Mathematical Reasoning in Open Language Models

    DeepSeek-Math is DeepSeek’s specialized model (or dataset + evaluation) focusing on mathematical reasoning, symbolic manipulation, proof steps, and advanced quantitative problem solving. The repository is likely to include fine-tuning routines or task datasets (e.g. MATH, GSM8K, ARB), demonstration notebooks, prompt templates, and evaluation results on math benchmarks. The goal is to push DeepSeek’s performance in domains that require rigorous symbolic steps, calculus, linear algebra, number theory, or multi-step derivations. The repo may also include modules that integrate external computational tools (e.g. a CAS / computer algebra system) or calculator assistance backends to enhance correctness. ...
    Downloads: 5 This Week
    Last Update:
    See Project
  • 18
    GLM-4-32B-0414

    GLM-4-32B-0414

    Open Multilingual Multimodal Chat LMs

    GLM-4-32B-0414 is a powerful open-source large language model featuring 32 billion parameters, designed to deliver performance comparable to leading models like OpenAI’s GPT series. It supports multilingual and multimodal chat capabilities with an extensive 32K token context length, making it ideal for dialogue, reasoning, and complex task completion. The model is pre-trained on 15 trillion tokens of high-quality data, including substantial synthetic reasoning datasets, and further enhanced with reinforcement learning and human preference alignment for improved instruction-following and function calling. Variants like GLM-Z1-32B-0414 offer deep reasoning and advanced mathematical problem-solving, while GLM-Z1-Rumination-32B-0414 specializes in long-form, complex research-style writing using scaled reinforcement learning and external search tools. ...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 19
    MuJoCo MPC

    MuJoCo MPC

    Real-time behaviour synthesis with MuJoCo, using Predictive Control

    MuJoCo MPC (MJPC) is an advanced interactive framework for real-time model predictive control (MPC) built on top of the MuJoCo physics engine, developed by Google DeepMind. It allows researchers and roboticists to design, visualize, and execute complex control tasks for simulated or real robotic systems. MJPC integrates a high-performance GUI and multiple predictive control algorithms, including iLQG, gradient descent, and Predictive Sampling — a competitive, derivative-free method that...
    Downloads: 4 This Week
    Last Update:
    See Project
  • 20
    ChatGLM Efficient Tuning

    ChatGLM Efficient Tuning

    Fine-tuning ChatGLM-6B with PEFT

    ...The project exposes practical switches for quantization and mixed precision, allowing bigger models to fit into limited VRAM. It includes examples for instruction tuning and dialogue datasets, making it straightforward to stand up a task-specific assistant. Because the code leans on widely used libraries, you can bring your own datasets and monitoring tools with minimal glue. For builders who want results fast, it’s a pragmatic way to specialize ChatGLM while controlling costs and turnaround time.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 21
    VALL-E

    VALL-E

    PyTorch implementation of VALL-E (Zero-Shot Text-To-Speech)

    ...Specifically, we train a neural codec language model (called VALL-E) using discrete codes derived from an off-the-shelf neural audio codec model, and regard TTS as a conditional language modeling task rather than continuous signal regression as in previous work. During the pre-training stage, we scale up the TTS training data to 60K hours of English speech which is hundreds of times larger than existing systems. VALL-E emerges in-context learning capabilities and can be used to synthesize high-quality personalized speech with only a 3-second enrolled recording of an unseen speaker as an acoustic prompt. ...
    Downloads: 2 This Week
    Last Update:
    See Project
  • 22
    Mask2Former

    Mask2Former

    Code release for "Masked-attention Mask Transformer

    ...Its core idea is to cast segmentation as mask classification: a transformer decoder predicts a set of mask queries, each with an associated class score, eliminating the need for task-specific heads. A pixel decoder fuses multi-scale features and feeds masked attention in the transformer so each query focuses computation on its current spatial support. This leads to accurate masks with sharp boundaries and strong small-object performance while remaining efficient on high-resolution inputs. The project provides extensive configurations and pretrained models across popular benchmarks like COCO, ADE20K, and Cityscapes. ...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 23
    MaskFormer

    MaskFormer

    Per-Pixel Classification is Not All You Need for Semantic Segmentation

    MaskFormer is a unified framework for image segmentation developed by Facebook Research, designed to bridge the gap between semantic, instance, and panoptic segmentation within a single architecture. Unlike traditional segmentation pipelines that treat these tasks separately, MaskFormer reformulates segmentation as a mask classification problem, enabling a consistent and efficient approach across multiple segmentation domains. Built on top of Detectron2, it supports a wide range of datasets...
    Downloads: 1 This Week
    Last Update:
    See Project
  • 24
    StarSpace

    StarSpace

    Learning embeddings for classification, retrieval and ranking

    StarSpace is a general-purpose embedding-based learning framework that trains embeddings for entities (words, sentences, users, items) under various supervision signals (classification, ranking, matching). Instead of focusing on one task, StarSpace supports multi-task and multi-domain setups—for instance, you can train embeddings so that textual queries match item descriptions, sentences map to labels, or users align with liked items in the same embedding space. The training objective is contrastive: for a given query embedding, positive and negative examples are sampled and the model is optimized to score positive higher than negatives. ...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 25
    Kimi K2.6

    Kimi K2.6

    Multimodal agent model for coding, orchestration, and autonomy

    ...One of its most distinctive capabilities is horizontal agent scaling, supporting up to 300 sub-agents and 4,000 coordinated steps in a single run, which enables parallel task decomposition and end-to-end completion of outputs such as documents, websites, and spreadsheets. Architecturally, it uses a 1T-parameter Mixture-of-Experts design with 32B activated parameters, a MoonViT vision encoder, and a 256K context window.
    Downloads: 0 This Week
    Last Update:
    See Project
  • Previous
  • You're on page 1
  • 2
  • Next
Auth0 Logo