| Name | Modified | Size |
|---|---|---|
| README.md | 2025-04-15 | 2.3 kB |
| v0.12.0 source code.tar.gz | 2025-04-15 | 1.5 MB |
| v0.12.0 source code.zip | 2025-04-15 | 1.6 MB |

Totals: 3 items, 3.1 MB, 2 downloads / week

This minor release adds multi-channel audio merging, a frequently requested feature. The default audio processing path now merges all channels whenever more than one is detected; previously it used only channel 0. You can also select specific channels when loading audio as part of the config:

:::swift
    let config = WhisperKitConfig(
        ...
        // Merge only the listed channels instead of all detected channels
        audioInputConfig: AudioInputConfig(channelMode: .sumChannels([1, 3, 5]))
    )

This could be used as a simplified form of speaker separation if your audio file has distinct speakers in different channels.

From [#320]:

The audio merging algorithm works like this: we find the peak of each channel, then check whether the peak of the mono (summed) version is higher than any of the per-channel peaks. If it is, we multiply the whole mono track by a constant so that its peak matches the peak of the loudest channel.
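The description above can be sketched in a few lines of Swift. This is a minimal illustration under stated assumptions, not WhisperKit's actual implementation: the names `peak` and `mergeChannels` are hypothetical, channels are assumed to be plain `[Float]` arrays, and "higher than any of the peaks" is read as "higher than the loudest channel's peak".

```swift
// Hypothetical helpers for illustration only; not part of the WhisperKit API.

// Largest absolute sample value in a buffer (0 for an empty buffer).
func peak(_ samples: [Float]) -> Float {
    samples.reduce(0) { max($0, abs($1)) }
}

// Sums all channels into one mono buffer, then scales the result down
// if summing pushed its peak above the loudest individual channel.
func mergeChannels(_ channels: [[Float]]) -> [Float] {
    // Assumption: use the shortest channel length if lengths differ.
    guard let frameCount = channels.map(\.count).min(), frameCount > 0 else {
        return []
    }

    // Peak of the loudest individual channel.
    let loudestChannelPeak = channels
        .map { peak(Array($0.prefix(frameCount))) }
        .max() ?? 0

    // Sample-by-sample sum of all channels.
    var mono = [Float](repeating: 0, count: frameCount)
    for channel in channels {
        for i in 0..<frameCount {
            mono[i] += channel[i]
        }
    }

    // Rescale only when the summed peak exceeds the loudest channel's peak.
    let monoPeak = peak(mono)
    if monoPeak > loudestChannelPeak, monoPeak > 0 {
        let scale = loudestChannelPeak / monoPeak
        mono = mono.map { $0 * scale }
    }
    return mono
}
```

For example, merging `[[0.5, -0.25], [0.5, 0.25]]` sums to a mono peak of 1.0, which is then scaled by 0.5 so the merged peak equals the loudest channel's peak of 0.5. When the channels partly cancel and the summed peak stays below the loudest channel, the sum is returned unscaled.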

Example: top, mono (merged) buffer; bottom, individual channels (pre-merge).

[image]

Here you can see how the merged audio maintains the same loudness as the original multi-channel audio file, and you can see the total merged waveform of all the channels.

This release also updates the recommendedModels function to cover the devices released since the last update, and improves some testing methods.

What's Changed

New Contributors

Full Changelog: https://github.com/argmaxinc/WhisperKit/compare/v0.11.0...v0.12.0

Source: README.md, updated 2025-04-15