WhisperKit - Browse /v0.12.0 at SourceForge.net

Name	Modified	Size	Downloads / Week
README.md	2025-04-15	2.3 kB	1
v0.12.0 source code.tar.gz	2025-04-15	1.5 MB	0
v0.12.0 source code.zip	2025-04-15	1.6 MB	1
Totals: 3 Items		3.1 MB	2

This minor release adds multi-channel audio merging, a frequently requested feature. The default audio processing path now merges all channels when more than one is detected; previously it used only channel 0. You can also select specific channels when loading audio, as part of the config:

:::swift
        let config = WhisperKitConfig(
            ...
            audioInputConfig: AudioInputConfig(channelMode: .sumChannels([1, 3, 5]))
        )

This could be used as a simplified form of speaker separation if your audio file has distinct speakers in different channels.
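To make the per-channel idea concrete, here is a minimal, self-contained sketch that does not use WhisperKit at all. `MultiChannelAudio` and `selecting(_:)` are hypothetical names introduced only for illustration, and the zero-based channel indexing and one-array-per-channel layout are assumptions of the sketch, not a statement about how WhisperKit stores audio.

```swift
/// Hypothetical stand-in for decoded multi-channel audio (not a WhisperKit type):
/// one sample array per channel.
struct MultiChannelAudio {
    var channels: [[Float]]

    /// Returns only the requested channels, skipping indices that are out of range.
    func selecting(_ indices: [Int]) -> [[Float]] {
        indices.filter { channels.indices.contains($0) }.map { channels[$0] }
    }
}

let recording = MultiChannelAudio(channels: [
    [0.1, 0.2, 0.3],    // channel 0: speaker A
    [-0.4, -0.5, -0.6], // channel 1: speaker B
    [0.0, 0.0, 0.0],    // channel 2: silent
])

// One single-channel track per speaker; each could then be transcribed on its own.
let speakerA = recording.selecting([0])
let speakerB = recording.selecting([1])
```

In this sketch, keeping each speaker on a separate track is what makes the "simplified speaker separation" possible: the channels are never mixed, so each transcript can be attributed to one channel.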

From [#320]:

The audio merging algorithm works like this: we find the peak across all channels and check whether the peak of the mono (summed) version is higher than any of the channel peaks. If so, we multiply the whole track so that the peak of the mono channel matches the peak of the loudest channel.
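The description above can be sketched in a few lines of plain Swift. This is an illustrative reading of the algorithm, not WhisperKit's actual implementation: `peak(_:)` and `mergeToMono(_:)` are hypothetical names, the channels are assumed to be equal-length arrays of `Float` samples, and the sketch assumes rescaling happens only when the summed peak exceeds the loudest channel's peak.

```swift
/// Largest absolute sample value in a buffer (0 for an empty buffer).
func peak(_ samples: [Float]) -> Float {
    samples.reduce(0) { max($0, abs($1)) }
}

/// Hypothetical sketch: sums equal-length channels into mono, then scales the
/// result so its peak matches the loudest input channel if the sum overshoots it.
func mergeToMono(_ channels: [[Float]]) -> [Float] {
    guard let frameCount = channels.first?.count, frameCount > 0 else { return [] }
    precondition(channels.allSatisfy { $0.count == frameCount },
                 "channels must have equal length")

    // Peak across all input channels (the loudest channel's peak).
    let channelPeak = channels.map { peak($0) }.max() ?? 0

    // Sum the channels sample by sample.
    var mono = [Float](repeating: 0, count: frameCount)
    for channel in channels {
        for i in 0..<frameCount {
            mono[i] += channel[i]
        }
    }

    // Rescale only when the summed signal peaks above the loudest channel.
    let monoPeak = peak(mono)
    if monoPeak > channelPeak, monoPeak > 0 {
        let gain = channelPeak / monoPeak
        mono = mono.map { $0 * gain }
    }
    return mono
}

let left: [Float] = [0.5, -0.25, 0.0]
let right: [Float] = [0.5, 0.25, 0.5]
let merged = mergeToMono([left, right])
// The raw sum is [1.0, 0.0, 0.5]; a gain of 0.5 brings its peak back to 0.5.
```

Scaling by a single gain factor preserves the shape of the summed waveform while keeping the merged track at the same peak level as the loudest original channel.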

Example (figure): the top track is the merged mono buffer and the bottom tracks are the individual channels before merging. The merged audio maintains the same loudness as the original multi-channel audio file, and its waveform is the combined waveform of all the channels.

This release also updates the recommendedModels function to cover devices released since the last update, and improves some testing methods.

What's Changed

Multi channel audio merging by @flashno in https://github.com/argmaxinc/WhisperKit/pull/320
Platform-specific defaults for concurrentWorkerCount by @ZachNagengast in https://github.com/argmaxinc/WhisperKit/pull/321
Updated fallbackModelSupportConfig with extra device identifiers by @iandundas in https://github.com/argmaxinc/WhisperKit/pull/323
Use remote models for test by @ZachNagengast in https://github.com/argmaxinc/WhisperKit/pull/324

New Contributors

@flashno made their first contribution in https://github.com/argmaxinc/WhisperKit/pull/320

Full Changelog: https://github.com/argmaxinc/WhisperKit/compare/v0.11.0...v0.12.0

Source: README.md, updated 2025-04-15

WhisperKit Files

On-device Speech Recognition for Apple Silicon
