Highlights

This release refactors SpeakerKit around a new shared ModelManager base class, unifying the download → load → unload lifecycle across kits and simplifying the SpeakerKit public API.

The top-level entry point is now just:

:::swift
let speakerKit = try await SpeakerKit()

No config object required for the default case.

Architecture Changes

ModelManager (new, ArgmaxCore)

A reusable base class that manages the full model lifecycle and that all kits can now inherit from. It handles state transitions, error recovery, and coalescing of concurrent loads through an internal LoadModelsCoordinator: concurrent callers of ensureModelsLoaded() share a single in-flight task rather than racing.
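The coalescing behaviour can be pictured with a small actor. This is only a minimal sketch under stated assumptions: LoadCoordinatorSketch and its loadOperation closure are hypothetical names invented for illustration, and the real LoadModelsCoordinator in ArgmaxCore may be structured differently.

```swift
// Hypothetical sketch of load coalescing; not ArgmaxCore's actual implementation.
actor LoadCoordinatorSketch {
    // The load currently in progress, if any.
    private var inFlight: Task<Void, Error>?
    private(set) var isLoaded = false
    // Assumed stand-in for the backend-specific load work.
    private let loadOperation: @Sendable () async throws -> Void

    init(loadOperation: @escaping @Sendable () async throws -> Void) {
        self.loadOperation = loadOperation
    }

    func ensureModelsLoaded() async throws {
        // Already loaded: nothing to do.
        if isLoaded { return }
        // A load is running: join it instead of starting a second one.
        if let task = inFlight {
            try await task.value
            return
        }
        // First caller: start the load and publish it for later callers.
        let operation = loadOperation
        let task = Task { try await operation() }
        inFlight = task
        // Clear the slot on success or failure so a failed load can be retried.
        defer { inFlight = nil }
        try await task.value
        isLoaded = true
    }
}
```

Because an actor admits other callers while the first one is suspended on the task, later callers find the in-flight task and await it, so the load work runs once.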

Backend-specific I/O is delegated to a new ModelLoader protocol:

:::swift
public protocol ModelLoader: AnyObject, Sendable {
    var modelFolder: String? { get }
    func resolveModels(downloader: ModelDownloader, progressCallback: ((Progress) -> Void)?) async throws -> String
    func load(from modelPath: String, prewarm: Bool) async throws
    func unload() async
}
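As a rough illustration of what a conforming backend might look like, the sketch below resolves to a local folder and never touches the network. The protocol is repeated from the listing above so the example is self-contained. Everything else is an assumption: the ModelDownloader stub stands in for the real ArgmaxCore type, whose API is not shown in these notes, and LocalFolderLoader, LoaderState, LocalLoaderError, and currentPath() are hypothetical names.

```swift
import Foundation

// Stand-in for ArgmaxCore's ModelDownloader; its real API is not shown in these notes.
public struct ModelDownloader: Sendable {
    public init() {}
}

// Protocol as listed in the release notes.
public protocol ModelLoader: AnyObject, Sendable {
    var modelFolder: String? { get }
    func resolveModels(downloader: ModelDownloader, progressCallback: ((Progress) -> Void)?) async throws -> String
    func load(from modelPath: String, prewarm: Bool) async throws
    func unload() async
}

public enum LocalLoaderError: Error { case missingModelFolder }

// Mutable state lives in an actor so the loader class itself can stay Sendable.
actor LoaderState {
    private(set) var loadedPath: String?
    func set(_ path: String?) { loadedPath = path }
}

// Hypothetical loader that only ever uses a folder already on disk.
public final class LocalFolderLoader: ModelLoader {
    public let modelFolder: String?
    private let state = LoaderState()

    public init(modelFolder: String?) { self.modelFolder = modelFolder }

    public func resolveModels(downloader: ModelDownloader, progressCallback: ((Progress) -> Void)?) async throws -> String {
        guard let folder = modelFolder else { throw LocalLoaderError.missingModelFolder }
        // Nothing to download, so report the work as complete immediately.
        let progress = Progress(totalUnitCount: 1)
        progress.completedUnitCount = 1
        progressCallback?(progress)
        return folder
    }

    public func load(from modelPath: String, prewarm: Bool) async throws {
        // A real backend would load or prewarm its models here.
        await state.set(modelPath)
    }

    public func unload() async {
        await state.set(nil)
    }

    // Helper for inspection only; not part of the ModelLoader protocol.
    public func currentPath() async -> String? { await state.loadedPath }
}
```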

SpeakerKitDiarizer (replaces SpeakerKitModelManager)

SpeakerKitModelManager has been replaced by SpeakerKitDiarizer, which inherits from ModelManager and conforms to the new Diarizer protocol. Create it via the static factory:

:::swift
let diarizer = SpeakerKitDiarizer.pyannote(config: config)

SpeakerKitModelManager is still available as a deprecated typealias pointing to SpeakerKitDiarizer, so existing code compiles with a deprecation warning rather than breaking.
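For readers unfamiliar with the mechanism, a deprecated typealias in Swift generally looks like the sketch below. The empty class is a stand-in for the real type, and the exact attribute arguments WhisperKit uses are an assumption.

```swift
// Stand-in for the real class; only the aliasing mechanism is illustrated here.
public final class SpeakerKitDiarizer {
    public init() {}
}

// Assumed form of the alias: old name still resolves, but the compiler warns and suggests the new one.
@available(*, deprecated, renamed: "SpeakerKitDiarizer")
public typealias SpeakerKitModelManager = SpeakerKitDiarizer
```

Code that still refers to SpeakerKitModelManager gets a SpeakerKitDiarizer instance and a compiler warning pointing at the new name.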

Diarizer protocol (new)

A protocol for plugging in diarization backends. Conforming types expose the model state, the download, load, and unload lifecycle, and a diarize call:

:::swift
public protocol Diarizer: Sendable {
    var modelState: ModelState { get }
    func downloadModels() async throws
    func loadModels() async throws
    func unloadModels() async
    func diarize(audioArray: [Float], options: (any DiarizationOptions)?, progressCallback: ...) async throws -> DiarizationResult
}

ModelDownloadConfig (new, ArgmaxCore)

All download parameters (endpoint, repo, token, revision, background session flag) are now bundled in a ModelDownloadConfig struct rather than passed individually to ModelDownloader. The existing convenience init still works unchanged.
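The shape of this change can be sketched as a parameter struct plus a forwarding convenience initializer. The types below are hypothetical stand-ins, not the real ModelDownloadConfig or ModelDownloader: the property names, their types, the defaults, and the placeholder values in the usage are all assumptions made for illustration.

```swift
// Hypothetical stand-in for ModelDownloadConfig; property names and defaults are assumed.
struct DownloadConfigSketch: Equatable, Sendable {
    var endpoint: String
    var repo: String
    var token: String? = nil
    var revision: String = "main"
    var useBackgroundSession: Bool = false
}

// Hypothetical stand-in for ModelDownloader.
struct DownloaderSketch {
    let config: DownloadConfigSketch

    // New style: all download parameters arrive bundled in one value.
    init(config: DownloadConfigSketch) {
        self.config = config
    }

    // Old style kept as a convenience: individual parameters are forwarded into the config.
    init(endpoint: String, repo: String, token: String? = nil, revision: String = "main", useBackgroundSession: Bool = false) {
        self.init(config: DownloadConfigSketch(
            endpoint: endpoint,
            repo: repo,
            token: token,
            revision: revision,
            useBackgroundSession: useBackgroundSession
        ))
    }
}
```

Both initializers end up with the same configuration value, which is how a convenience init can keep working unchanged while the bundled struct becomes the primary path.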

SpeakerKitConfig (new base class)

PyannoteConfig now inherits from SpeakerKitConfig, which adds a load: Bool flag. When false (the default), models are loaded lazily on the first diarize() call. Set load: true to eager-load during SpeakerKit init.
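The difference between the two modes can be illustrated with a toy model. KitConfigSketch and LazyKitSketch are hypothetical types written for this example. They mimic only the load flag described above and do not reflect how SpeakerKit loads models internally. The sketch is also not thread-safe.

```swift
// Hypothetical config carrying only the flag described above.
struct KitConfigSketch {
    var load: Bool = false
}

// Hypothetical kit that loads either during init or on first use.
final class LazyKitSketch {
    private(set) var modelsLoaded = false

    init(config: KitConfigSketch = KitConfigSketch()) {
        // Eager path: pay the loading cost during initialization.
        if config.load { loadModels() }
    }

    func loadModels() {
        // A real kit would load its models here.
        modelsLoaded = true
    }

    func process(_ samples: [Float]) -> Int {
        // Lazy path: load on first use if nothing has been loaded yet.
        if !modelsLoaded { loadModels() }
        return samples.count
    }
}
```

With the default config, construction is cheap and the first processing call pays the loading cost. With load set to true, that cost moves into the initializer.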

API Changes

SpeakerKit initialization simplified

:::swift
// Before
let config = PyannoteConfig()
let speakerKit = try await SpeakerKit(config)

// After
let speakerKit = try await SpeakerKit()

The local model path is now a String instead of a URL:

:::swift
// Before
let config = PyannoteConfig(modelFolder: URL(filePath: "/path/to/models"))

// After
let config = PyannoteConfig(modelFolder: "/path/to/models")

Lazy loading by default

Models are now loaded on the first diarize() call unless load: true is set in config.

CLI

The diarize command and transcribe --diarization now use the simplified SpeakerKit(config) API internally. The CLI flags are functionally unchanged.

What's Changed

Full Changelog: https://github.com/argmaxinc/WhisperKit/compare/v0.17.0...v0.18.0

Source: README.md, updated 2026-04-01