Nebius Integrations

5 Integrations with Nebius

Below is a list of software that integrates with Nebius. Compare the features, ratings, user reviews, and pricing of each integration. Here are the current Nebius integrations in 2026:

1

Nebius Token Factory

Nebius

Nebius Token Factory is a scalable AI inference platform designed to run open-source and custom AI models in production without manual infrastructure management. It offers enterprise-ready inference endpoints with predictable performance, autoscaling throughput, and sub-second latency, even at very high request volumes. It delivers 99.9% uptime and supports unlimited or tailored traffic profiles based on workload needs, simplifying the transition from experimentation to global deployment. Nebius Token Factory supports a broad set of open-source models such as Llama, Qwen, DeepSeek, GPT-OSS, Flux, and many others, and lets teams host and fine-tune models through an API or dashboard. Users can upload LoRA adapters or full fine-tuned variants directly, with the same enterprise performance guarantees applied to custom models.

Starting Price: $0.02

2

AI SpendOps

AI SpendOps

We give engineering, finance, and FinOps teams a single platform to track, attribute, and optimise LLM API spend across every provider. Costs are broken down by dimensions you define, matching how your business already reports its financials. Engineering teams get frictionless cost tracking without slowing anything down. CTOs get a single pane of glass to enforce model governance and prevent shadow usage. CFOs get finance-grade reporting for forecasting, budgeting, and chargebacks, attributed using their own reporting structure. FinOps teams get real-time, multi-provider cost data that slots straight into the workflows they already run for cloud. If your organisation uses LLM APIs and the board is asking "what are we spending and why?", we're the answer.

Starting Price: £199

3

NVIDIA DGX Cloud Lepton

NVIDIA

NVIDIA DGX Cloud Lepton is an AI platform that connects developers to a global network of GPU compute across multiple cloud providers. It offers a unified experience to discover and utilize GPU resources, along with integrated AI services to streamline the deployment lifecycle across multiple clouds. Developers can start building with instant access to NVIDIA’s accelerated APIs, including serverless endpoints, prebuilt NVIDIA Blueprints, and GPU-backed compute. When it’s time to scale, DGX Cloud Lepton supports customization and deployment across a global network of GPU cloud providers. AI applications can then be deployed across multi-cloud and hybrid environments with minimal operational burden, using integrated services for inference, testing, and training workloads.

4

NVIDIA DGX Cloud Serverless Inference

NVIDIA

NVIDIA DGX Cloud Serverless Inference is a high-performance, serverless AI inference solution that accelerates AI innovation with auto-scaling, cost-efficient GPU utilization, multi-cloud flexibility, and seamless scalability. With NVIDIA DGX Cloud Serverless Inference, you can scale down to zero instances during periods of inactivity to optimize resource utilization and reduce costs. There's no extra cost for cold-boot start times, and the system is optimized to minimize them. NVIDIA DGX Cloud Serverless Inference is powered by NVIDIA Cloud Functions (NVCF), which offers robust observability features. It allows you to integrate your preferred monitoring tools, such as Splunk, for comprehensive insights into your AI workloads. NVCF offers flexible deployment options for NIM microservices while allowing you to bring your own containers, models, and Helm charts.

5

Shadeform

Shadeform

Shadeform is a GPU cloud marketplace that provides a single platform, unified console, and API for finding, comparing, launching, and managing on-demand GPU instances across numerous cloud providers, making it easier to develop, train, and deploy AI models without juggling multiple accounts or provider interfaces. It lets users view live pricing and availability for GPUs across clouds, launch instances in either their own cloud accounts or in Shadeform-managed accounts, and manage a cross-cloud fleet from one place with standardized tooling such as curl, Python, or Terraform. It aggregates GPU capacity and pricing data so teams can optimize compute spend, deploy containerized workloads with consistent interfaces, centralize billing and account management, and avoid vendor-specific complexity by using a unified API that supports multiple providers. Shadeform also offers scheduling and automated provisioning so that users can secure resources when they become available.

Starting Price: $0.15 per hour
