dstack - Browse /0.20.20 at SourceForge.net



Services

NVIDIA Dynamo

This update adds support for Prefill-Decode (PD) disaggregated inference with NVIDIA Dynamo.

Previously, dstack supported PD disaggregation only with Shepherd Model Gateway as the router and SGLang as the inference engine for workers. With this update, a replica group can declare router: { type: dynamo }, allowing workers to use inference engines such as SGLang, vLLM, or TensorRT-LLM.

:::yaml
type: service
name: dynamo-pd

env:
  - HF_TOKEN
  - MODEL_ID=zai-org/GLM-4.5-Air-FP8

replicas:
  - count: 1
    docker: true
    commands:
      - apt-get update
      - apt-get install -y python3-dev python3-venv
      - python3 -m venv ~/dyn-venv
      - source ~/dyn-venv/bin/activate
      - pip install -U pip
      - pip install "ai-dynamo[sglang]==1.1.1"
      - git clone https://github.com/ai-dynamo/dynamo.git
      # Brings up the NATS / etcd compose stack and runs the Dynamo HTTP frontend.
      - docker compose -f dynamo/deploy/docker-compose.yml up -d
      - |
        python3 -m dynamo.frontend \
          --http-host 0.0.0.0 --http-port 8000 \
          --discovery-backend etcd --router-mode kv \
          --kv-cache-block-size 64
    resources:
      cpu: 4
    router:
      type: dynamo


  - count: 1..4
    scaling:
      metric: rps
      target: 3
    python: "3.12"
    nvcc: true
    commands:
      # dstack injects DSTACK_ROUTER_INTERNAL_IP after the router replica
      # is provisioned. Compose the etcd/NATS endpoints from it.
      - export ETCD_ENDPOINTS="http://$DSTACK_ROUTER_INTERNAL_IP:2379"
      - export NATS_SERVER="nats://$DSTACK_ROUTER_INTERNAL_IP:4222"
      # Set to enable /health endpoint required by dstack probes.
      - export DYN_SYSTEM_PORT="8000"
      # Wait until the router's etcd and NATS ports are actually accepting connections.
      - |
        until (echo > /dev/tcp/$DSTACK_ROUTER_INTERNAL_IP/2379) 2>/dev/null \
           && (echo > /dev/tcp/$DSTACK_ROUTER_INTERNAL_IP/4222) 2>/dev/null; do
          echo "waiting for etcd/NATS on $DSTACK_ROUTER_INTERNAL_IP..."; sleep 3
        done
      - pip install "ai-dynamo[sglang]==1.1.1"
      - |
        python3 -m dynamo.sglang \
          --model-path $MODEL_ID --served-model-name $MODEL_ID \
          --discovery-backend etcd --host 0.0.0.0 \
          --page-size 64 \
          --disaggregation-mode prefill --disaggregation-transfer-backend nixl
    resources:
      gpu: H200


  - count: 1..8
    scaling:
      metric: rps
      target: 2
    python: "3.12"
    nvcc: true
    commands:
      - export ETCD_ENDPOINTS="http://$DSTACK_ROUTER_INTERNAL_IP:2379"
      - export NATS_SERVER="nats://$DSTACK_ROUTER_INTERNAL_IP:4222"
      - export DYN_SYSTEM_PORT="8000"
      - |
        until (echo > /dev/tcp/$DSTACK_ROUTER_INTERNAL_IP/2379) 2>/dev/null \
           && (echo > /dev/tcp/$DSTACK_ROUTER_INTERNAL_IP/4222) 2>/dev/null; do
          echo "waiting for etcd/NATS on $DSTACK_ROUTER_INTERNAL_IP..."; sleep 3
        done
      - pip install "ai-dynamo[sglang]==1.1.1"
      - |
        python3 -m dynamo.sglang \
          --model-path $MODEL_ID --served-model-name $MODEL_ID \
          --discovery-backend etcd --host 0.0.0.0 \
          --page-size 64 \
          --disaggregation-mode decode --disaggregation-transfer-backend nixl
    resources:
      gpu: H200

port: 8000
model: zai-org/GLM-4.5-Air-FP8

# Custom probe is required for PD disaggregation.
probes:
  - type: http
    url: /health
    interval: 15s

dstack provisions the router replica, injects DSTACK_ROUTER_INTERNAL_IP into non-router replicas, and lets Dynamo workers connect directly to the router’s etcd and NATS services.
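
The wait loop in the worker commands above retries forever if the router's etcd or NATS port never opens. As a minimal sketch, a bounded variant might look like the following. The `wait_for_port` function, its arguments, and the 300-second default are assumptions made for illustration and are not part of dstack or Dynamo. It relies on bash's `/dev/tcp` pseudo-device, as the original loop does.

```shell
# Hypothetical helper, not part of dstack or Dynamo: wait until a TCP port
# accepts connections, but give up once a deadline has passed.
wait_for_port() {
  local host="$1" port="$2" timeout="${3:-300}"
  local deadline=$(( $(date +%s) + timeout ))
  # Same probe as the worker commands: opening /dev/tcp/HOST/PORT fails
  # until something is listening on that port.
  until (echo > "/dev/tcp/$host/$port") 2>/dev/null; do
    if [ "$(date +%s)" -ge "$deadline" ]; then
      echo "timed out waiting for $host:$port" >&2
      return 1
    fi
    echo "waiting for $host:$port..."
    sleep 3
  done
}
```

In a worker's commands this could be called as `wait_for_port "$DSTACK_ROUTER_INTERNAL_IP" 2379 && wait_for_port "$DSTACK_ROUTER_INTERNAL_IP" 4222`. The deadline is only checked between attempts, so a single connection attempt to an unreachable host can still take longer than the timeout.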

Refer to the Dynamo example for full deployment instructions.

Replica groups

It's now possible to configure the image, docker, python, nvcc, and privileged properties at the replica group level. This enables complex multi-component services like NVIDIA Dynamo, where different replicas require different runtime environments.

Exports

Gateways

Gateways can now be exported and shared across projects, enabling centralized gateway management in multi-project setups.

:::shell
$ dstack export --project main create my-export --gateway shared-gateway --importer team
 NAME       FLEETS  GATEWAYS        IMPORTERS 
 my-export  -       shared-gateway  team

Now, if you list gateways in the team project, you'll see the exported gateway:

:::shell
$ dstack gateway --project team
 NAME                 BACKEND          HOSTNAME        DOMAIN                 DEFAULT  STATUS  
 main/shared-gateway  aws (eu-west-1)  108.131.126.35  gtw.mycompany.example           running

Additionally, gateway domains now support optional project name interpolation using ${{ run.project_name }}, allowing different projects to use different domains on the same shared gateway.

:::yaml
type: gateway
name: shared-gateway

backend: aws
region: eu-west-1

domain: ${{ run.project_name }}.mycompany.example
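
To illustrate what the interpolation produces, the sketch below mimics the substitution with `sed`. It is not dstack's implementation: the `resolve_gateway_domain` function is hypothetical, it only matches the placeholder exactly as written in the configuration above, and it assumes project names contain only letters, digits, `-`, and `_`.

```shell
# Hypothetical illustration only: replace the project-name placeholder in a
# gateway domain template. This mimics the behavior; it is not dstack code.
resolve_gateway_domain() {
  # $1: domain template, $2: project name
  printf '%s\n' "$1" | sed 's/\${{ run\.project_name }}/'"$2"'/g'
}

resolve_gateway_domain '${{ run.project_name }}.mycompany.example' team
# prints: team.mycompany.example
```

Under this reading, the `main` and `team` projects would resolve to `main.mycompany.example` and `team.mycompany.example` on the same shared gateway.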

Global exports

Users with global admin privileges can now export SSH fleets and gateways to all projects at once, enabling organization-wide resource sharing.

:::bash
$ dstack export create global-export --gateway shared-gateway --global
 NAME           FLEETS  GATEWAYS        IMPORTERS
 global-export  -       shared-gateway  *

AWS

EFA clusters

Previously, fleets that used EFA (Elastic Fabric Adapter) with multiple network interfaces required public_ips: False. With this release, dstack allows creating such fleets with public IPs. This simplifies the use of interconnected clusters on AWS by removing the need to run the dstack server and CLI inside a private VPC.

Kubernetes

Backend configuration

The namespace property of the kubernetes backend configuration is now formally deprecated. It still takes effect and remains the source of truth in this version, but future versions will read the namespace from the current kubeconfig context instead.

Migration guide

- If `namespace` is unset or set to `default` in both the backend config and the kubeconfig, no action is required; `default` continues to be used.
- If `namespace` is set to the same value (e.g. `ns-a`) in both the backend config and the kubeconfig, no action is required.
- If `namespace` is set to `ns-a` in the backend config but the kubeconfig has a different value (or none), set the namespace to `ns-a` in your kubeconfig context to prepare for future versions.
- It is only safe to remove `namespace` from the backend config if its value is `default`.
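
The rules in the migration guide can be sketched as a small decision helper. The `namespace_migration_action` function and its output strings are hypothetical and not part of dstack. An empty argument stands for an unset namespace, which the guide treats the same as `default`.

```shell
# Hypothetical helper, not part of dstack: classify a pair of namespace values
# according to the migration guide. An empty argument means "unset".
# The body runs in a subshell so its variables do not leak into the caller.
namespace_migration_action() (
  backend_ns="${1:-default}"   # value from the dstack backend config
  kube_ns="${2:-default}"      # value from the current kubeconfig context
  if [ "$backend_ns" = "$kube_ns" ]; then
    echo "no action required"
  elif [ "$backend_ns" != "default" ]; then
    echo "set kubeconfig namespace to $backend_ns"
  else
    # The guide does not spell out this combination, so the sketch does not guess.
    echo "not covered by the guide"
  fi
)

namespace_migration_action ns-a ns-b   # prints: set kubeconfig namespace to ns-a
```

With kubectl, the namespace of the current context can be read with `kubectl config view --minify --output 'jsonpath={..namespace}'` and set with `kubectl config set-context --current --namespace=ns-a`.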

What's changed

[Services] Allow to specify image, docker, python, nvcc, privileged at replica group level by @Bihan in https://github.com/dstackai/dstack/pull/3832
[Internal]: Delete some unused classes by @jvstme in https://github.com/dstackai/dstack/pull/3842
[Internal] Fix pyright failing in CI by @jvstme in https://github.com/dstackai/dstack/pull/3846
[Internal] Update RunpodApiClient by @un-def in https://github.com/dstackai/dstack/pull/3847
[Internal] Fix openai SDK failing in tests by @jvstme in https://github.com/dstackai/dstack/pull/3849
[RunPod] Handle deleting non-existent volume by @r4victor in https://github.com/dstackai/dstack/pull/3853
[Runpod] Fix broken registry_auth support by @un-def in https://github.com/dstackai/dstack/pull/3844
[UX] Raise ImportError on Python 3.14 or later by @r4victor in https://github.com/dstackai/dstack/pull/3855
[Exports] Gateway support by @jvstme in https://github.com/dstackai/dstack/pull/3845
[Internal] Rename docs/ to mkdocs/, move examples under /docs/, inline source by @peterschmidt85 in https://github.com/dstackai/dstack/pull/3859
[Kubernetes] Deprecate namespace in backend config by @un-def in https://github.com/dstackai/dstack/pull/3858
[Gateways] Allow setting imported gateway as project default by @jvstme in https://github.com/dstackai/dstack/pull/3860
[Internal] Forbid exporting the built-in dstack Sky gateway by @jvstme in https://github.com/dstackai/dstack/pull/3864
[AWS] Support multi-EFA instances with public IPs by @r4victor in https://github.com/dstackai/dstack/pull/3865
[Internal] Add server-side validation for fleet configuration subtypes by @un-def in https://github.com/dstackai/dstack/pull/3848
[Verda] Optimize terminating Verda instances by @jvstme in https://github.com/dstackai/dstack/pull/3811
[Internal] Introduce GatewayModel.forbid_new_services by @jvstme in https://github.com/dstackai/dstack/pull/3863
[Docs] Introduce CLI & API guide; rework the HTTP API reference page by @peterschmidt85 in https://github.com/dstackai/dstack/pull/3869
[Internal] Add script to set up Kubernetes cluster for dstack backend by @un-def in https://github.com/dstackai/dstack/pull/3866
Fix Pyright errors with requests==2.34.0 by @jvstme in https://github.com/dstackai/dstack/pull/3873
Add project name interpolation in gateway domains by @jvstme in https://github.com/dstackai/dstack/pull/3870
[Bugfix] Fix duplicate headers with in-server proxy by @jvstme in https://github.com/dstackai/dstack/pull/3872
[Docs]: Gateway Exports by @jvstme in https://github.com/dstackai/dstack/pull/3862
[Kubernetes] Fail fast if job pod was not scheduled by @un-def in https://github.com/dstackai/dstack/pull/3874
[Exports] Global exports support by @jvstme in https://github.com/dstackai/dstack/pull/3879
[Services] Support PD with NVIDIA Dynamo by @Bihan in https://github.com/dstackai/dstack/pull/3868
[Internal] Update text regarding billing based on the project type by @peterschmidt85 in https://github.com/dstackai/dstack/pull/3876
[Docs] Add NVIDIA Dynamo docs by @Bihan in https://github.com/dstackai/dstack/pull/3877
[Internal] Fix unreleased global_exports lock on Postgres by @jvstme in https://github.com/dstackai/dstack/pull/3882