[OMPT-svn-commit] OpenMPT r23192 - in branches/OpenMPT-1.32: . include/SignalsmithStretch include/S

SourceForge Headquarters 225 Broadway Suite 1600 San Diego, CA 92101 +1 (858) 422-6466

Author: sagamusix
Date: Sun May 25 14:04:43 2025
New Revision: 23192
URL: https://source.openmpt.org/browse/openmpt/?op=revision&rev=23192

Log:
Merged revision(s) 23191 from trunk/OpenMPT:
[Var] Update Signalsmith Stretch to commit c78aacf6389fb862978d4ead01cfc71a3d5f8ca8 (2025-05-01)
........

Added:
   branches/OpenMPT-1.32/include/SignalsmithStretch/signalsmith-linear/
      - copied from r23191, trunk/OpenMPT/include/SignalsmithStretch/signalsmith-linear/
Deleted:
   branches/OpenMPT-1.32/include/SignalsmithStretch/SignalsmithStretch/dsp/
Modified:
   branches/OpenMPT-1.32/   (props changed)
   branches/OpenMPT-1.32/include/SignalsmithStretch/OpenMPT.txt
   branches/OpenMPT-1.32/include/SignalsmithStretch/SignalsmithStretch/README.md
   branches/OpenMPT-1.32/include/SignalsmithStretch/SignalsmithStretch/signalsmith-stretch.h
   branches/OpenMPT-1.32/tracklib/TimeStretchPitchShift.cpp

Modified: branches/OpenMPT-1.32/include/SignalsmithStretch/OpenMPT.txt
==============================================================================

--- branches/OpenMPT-1.32/include/SignalsmithStretch/OpenMPT.txt	Sun May 25 14:04:10 2025	(r23191)
+++ branches/OpenMPT-1.32/include/SignalsmithStretch/OpenMPT.txt	Sun May 25 14:04:43 2025	(r23192)
@@ -1,3 +1,7 @@
-Signalsmith Stretch: C++ pitch/time library, commit ffa45981be0b75079b43d1a7a142234f06869254 (2025-02-12)
+Signalsmith Stretch: C++ pitch shift / time stretch library, commit c78aacf6389fb862978d4ead01cfc71a3d5f8ca8 (2025-05-01)
 https://github.com/Signalsmith-Audio/signalsmith-stretch
-Unneeded files (cmd and web folder) have been removed.
\ No newline at end of file
+Unneeded files have been removed.
+
+Signalsmith Linear: C++ audio utilities, commit 94d677be71a50789eb7a33b8927947046c0c2fa2 (2025-04-09)
+https://github.com/Signalsmith-Audio/linear
+Unneeded files have been removed.
\ No newline at end of file

Modified: branches/OpenMPT-1.32/include/SignalsmithStretch/SignalsmithStretch/README.md
==============================================================================
--- branches/OpenMPT-1.32/include/SignalsmithStretch/SignalsmithStretch/README.md	Sun May 25 14:04:10 2025	(r23191)
+++ branches/OpenMPT-1.32/include/SignalsmithStretch/SignalsmithStretch/README.md	Sun May 25 14:04:43 2025	(r23192)
@@ -90,6 +90,12 @@
 
 You should be supplying input samples slightly ahead of the processing time (which is where changes to pitch-shift or stretch rate will be centred), and you'll receive output samples slightly behind that processing time.
 
+### Split computation
+
+All of the `.preset???()` and `.configure()` methods have an optional `splitComputation` flag.  When enabled, this introduces one extra interval of output latency, and uses this to spread the computation out more evenly.
+
+Without this (as is common for spectral processing) the library will occasionally do a bunch of computation all at once, to compute the next spectral block of audio.  This is often fine, when audio's being processed across multiple threads with a decent amount of buffering (like mixing in a DAW), but if you're in a stricter situation then this flag might help. 
+
 #### Automation
 
 To follow pitch/time automation accurately, you should give it automation values from the current processing time (`.outputLatency()` samples ahead of the output), and feed it input from `.inputLatency()` samples ahead of the current processing time.
@@ -114,21 +120,36 @@
 stretch.flush(outputBuffers, outputSamples);
 ``` 
 
-Using `.seek()`/`.flush()` like this, you can perform an exact time-stretch on a fixed-length sound, and your result will have `.outputLatency()` of pre-roll. 
+Using `.seek()`/`.flush()` like this, you can perform an exact time-stretch on a fixed-length sound, and your result will have `.outputLatency()` of pre-roll.
 
-## Compiling
+### Formant compensation
 
-⚠️ This has mostly been tested with Clang.  If you're using another compiler and have any problems, please get in touch.
+```cpp
+stretch.setFormantFactor(1.2); // up ~3 semitones
 
-🚨 It's generally be OK to enable `-ffast-math`, however there's a bug in Apple Clang 16.0.0 which can generate incorrect SIMD code.  If you _have_ to use this version, we advise you don't use `-ffast-math`.
+stretch.setTransposeSemitones(3);
+```
 
-It's much slower (about 10x) if optimisation is disabled altogether, so you might want to enable optimisation where it's used, even in Debug builds.
+Both of those methods both have an optional `compensatePitch` flag.  Enabling this adjust for the pitch-shift (or non-linear map) when correcting/shifting formants.
 
-### DSP Library
+The formant correction is not a sharp as monophonic algorithms (such such as PSOLA).  It also needs you to give it a rough estimate of fundamental frequency (relative to Nyquist):
 
-This uses the Signalsmith DSP library for FFTs and other bits and bobs.
+```cpp
+// if 200Hz is the middle-register of the instrument
+stretch.setFormantBase(200/sampleRate);
+```
 
-For convenience, a copy of the library is included (with its own `LICENSE.txt`) in `dsp/`, but if you're already using this elsewhere then you should remove this copy to avoid versioning issues.
+## Compiling
+
+⚠️ This has mostly been tested with AppleClang (Mac), and MSVC (Windows).  We aim to support other compilers though, so please get in touch if you have any problems.
+
+Enabling `-ffast-math` (or equivalent) is fine, unless you're using Apple Clang 16.0.0 (in which case it will deliberately break, because that version generates incorrect SIMD code).
+
+The algorithm has a lot of number-crunching, so Debug builds are much slower (up to 10x).  If possible, you might want to enable optimisation where Stretch is used, even in Debug builds.
+
+### Dependencies and `#define`s
+
+This uses the [Signalsmith Linear](https://github.com/Signalsmith-Audio/linear) library for FFTs and other speedups.  There are [flags]([Linear repo](https://github.com/Signalsmith-Audio/linear?tab=readme-ov-file#building)) to enable Accelerate (`SIGNALSMITH_USE_ACCELERATE`) or IPP (`SIGNALSMITH_USE_IPP`).
 
 ## License
 
@@ -136,14 +157,18 @@
 
 ## Other environments / languages
 
-There's a Web Audio wrapper in `web/` (using WASM/AudioWorklet).  This will remain in-sync with the C++ library.
+There's a Web Audio release in [`web/`](web/) (WASM/AudioWorklet), also available on [NPM](npmjs.com/package/signalsmith-stretch).  This has its own (higher-level) API, but the stretching algorithm will remain in-sync with the C++ library.
+
+There's a [Python binding](https://pypi.org/project/python-stretch/) written/published by [Gregorio Andrea Giudici](https://github.com/gregogiudici/python-stretch).  This is used as the default pitch/time method in [Audiomentations](https://iver56.github.io/audiomentations/).
 
-There's a [Python binding](https://pypi.org/project/python-stretch/) written/published by [Gregorio Andrea Giudici](https://github.com/gregogiudici/python-stretch), and a [Rust wrapper](https://crates.io/crates/signalsmith-stretch) by [Colin Marc](https://github.com/colinmarc/signalsmith-stretch-rs).
+There's a [Rust wrapper](https://crates.io/crates/signalsmith-stretch) by [Colin Marc](https://github.com/colinmarc/signalsmith-stretch-rs).
 
 ## Thanks
 
 We'd like to particularly thank the following people who sponsored specific features or improvements:
 
+* **Future Audio Workshop**: chunked-computation mode
+* **Jochem de Jong**: formant shifting/compensation
 * **Metronaut**: web audio (JS/WASM) release
 * **Daniel L Bowling** and the Stanford School of Medicine: web audio improvements
 

Modified: branches/OpenMPT-1.32/include/SignalsmithStretch/SignalsmithStretch/signalsmith-stretch.h
==============================================================================
--- branches/OpenMPT-1.32/include/SignalsmithStretch/SignalsmithStretch/signalsmith-stretch.h	Sun May 25 14:04:10 2025	(r23191)
+++ branches/OpenMPT-1.32/include/SignalsmithStretch/SignalsmithStretch/signalsmith-stretch.h	Sun May 25 14:04:43 2025	(r23192)
@@ -1,63 +1,89 @@
 #ifndef SIGNALSMITH_STRETCH_H
 #define SIGNALSMITH_STRETCH_H
 
-#include "dsp/spectral.h"
-#include "dsp/delay.h"
-#include "dsp/perf.h"
-SIGNALSMITH_DSP_VERSION_CHECK(1, 6, 0); // Check version is compatible
+#include "signalsmith-linear/stft.h" // https://github.com/Signalsmith-Audio/linear
+
 #include <vector>
+#include <array>
 #include <algorithm>
 #include <functional>
 #include <random>
+#include <limits>
+#include <type_traits>
 
 namespace signalsmith { namespace stretch {
 
-template<typename Sample=float, class RandomEngine=std::default_random_engine>
+namespace _impl {
+	template<bool conjugateSecond=false, typename V>
+	static std::complex<V> mul(const std::complex<V> &a, const std::complex<V> &b) {
+		return conjugateSecond ? std::complex<V>{
+			b.real()*a.real() + b.imag()*a.imag(),
+				b.real()*a.imag() - b.imag()*a.real()
+		} : std::complex<V>{
+			a.real()*b.real() - a.imag()*b.imag(),
+			a.real()*b.imag() + a.imag()*b.real()
+		};
+	}
+	template<typename V>
+	static V norm(const std::complex<V> &a) {
+		V r = a.real(), i = a.imag();
+		return r*r + i*i;
+	}
+}
+
+template<typename Sample=float, class RandomEngine=void>
 struct SignalsmithStretch {
-	static constexpr size_t version[3] = {1, 1, 1};
+	static constexpr size_t version[3] = {1, 3, 1};
 
 	SignalsmithStretch() : randomEngine(std::random_device{}()) {}
 	SignalsmithStretch(long seed) : randomEngine(seed) {}
-
+	
 	int blockSamples() const {
-		return stft.windowSize();
+		return int(stft.blockSamples());
 	}
 	int intervalSamples() const {
-		return stft.interval();
+		return int(stft.defaultInterval());
 	}
 	int inputLatency() const {
-		return stft.windowSize()/2;
+		return int(stft.analysisLatency());
 	}
 	int outputLatency() const {
-		return stft.windowSize() - inputLatency();
+		return int(stft.synthesisLatency() + _splitComputation*stft.defaultInterval());
 	}
 	
 	void reset() {
-		stft.reset();
-		inputBuffer.reset();
+		stft.reset(0.1);
+		stashedInput = stft.input;
+		stashedOutput = stft.output;
+		
 		prevInputOffset = -1;
 		channelBands.assign(channelBands.size(), Band());
 		silenceCounter = 0;
 		didSeek = false;
-		flushed = true;
+		blockProcess = {};
+		freqEstimateWeighted = freqEstimateWeight = 0;
 	}
 
 	// Configures using a default preset
-	void presetDefault(int nChannels, Sample sampleRate) {
-		configure(nChannels, sampleRate*0.12, sampleRate*0.03);
+	void presetDefault(int nChannels, Sample sampleRate, bool splitComputation=false) {
+		configure(nChannels, sampleRate*0.12, sampleRate*0.03, splitComputation);
 	}
-	void presetCheaper(int nChannels, Sample sampleRate) {
-		configure(nChannels, sampleRate*0.1, sampleRate*0.04);
+	void presetCheaper(int nChannels, Sample sampleRate, bool splitComputation=true) {
+		configure(nChannels, sampleRate*0.1, sampleRate*0.04, splitComputation);
 	}
 
 	// Manual setup
-	void configure(int nChannels, int blockSamples, int intervalSamples) {
+	void configure(int nChannels, int blockSamples, int intervalSamples, bool splitComputation=false) {
+		_splitComputation = splitComputation;
 		channels = nChannels;
-		stft.setWindow(stft.kaiser, true);
-		stft.resize(channels, blockSamples, intervalSamples);
-		bands = stft.bands();
-		inputBuffer.resize(channels, blockSamples + intervalSamples + 1);
-		timeBuffer.assign(stft.fftSize(), 0);
+		stft.configure(channels, channels, blockSamples, intervalSamples + 1);
+		stft.setInterval(intervalSamples, stft.kaiser);
+		stft.reset(0.1);
+		stashedInput = stft.input;
+		stashedOutput = stft.output;
+		tmpBuffer.resize(blockSamples + intervalSamples);
+
+		bands = int(stft.bands());
 		channelBands.assign(bands*channels, Band());
 		
 		peaks.reserve(bands/2);
@@ -65,6 +91,9 @@
 		smoothedEnergy.resize(bands);
 		outputMap.resize(bands);
 		channelPredictions.resize(channels*bands);
+
+		blockProcess = {};
+		formantMetric.resize(bands + 2);
 	}
 
 	/// Frequency multiplier, and optional tonality limit (as multiple of sample-rate)
@@ -79,39 +108,76 @@
 	}
 	void setTransposeSemitones(Sample semitones, Sample tonalityLimit=0) {
 		setTransposeFactor(std::pow(2, semitones/12), tonalityLimit);
-		customFreqMap = nullptr;
 	}
 	// Sets a custom frequency map - should be monotonically increasing
 	void setFreqMap(std::function<Sample(Sample)> inputToOutput) {
 		customFreqMap = inputToOutput;
 	}
 
+	void setFormantFactor(Sample multiplier, bool compensatePitch=false) {
+		formantMultiplier = multiplier;
+		invFormantMultiplier = 1/multiplier;
+		formantCompensation = compensatePitch;
+	}
+	void setFormantSemitones(Sample semitones, bool compensatePitch=false) {
+		setFormantFactor(std::pow(2, semitones/12), compensatePitch);
+	}
+	// Rough guesstimate of the fundamental frequency, used for formant analysis. 0 means attempting to detect the pitch
+	void setFormantBase(Sample baseFreq=0) {
+		formantBaseFreq = baseFreq;
+	}
+
 	// Provide previous input ("pre-roll"), without affecting the speed calculation.  You should ideally feed it one block-length + one interval
 	template<class Inputs>
 	void seek(Inputs &&inputs, int inputSamples, double playbackRate) {
-		inputBuffer.reset();
+		tmpBuffer.resize(0);
+		tmpBuffer.resize(stft.blockSamples() + stft.defaultInterval());
+
+		int startIndex = std::max<int>(0, inputSamples - int(tmpBuffer.size())); // start position in input
+		int padStart = int(tmpBuffer.size() + startIndex) - inputSamples; // start position in tmpBuffer
+
 		Sample totalEnergy = 0;
 		for (int c = 0; c < channels; ++c) {
 			auto &&inputChannel = inputs[c];
-			auto &&bufferChannel = inputBuffer[c];
-			int startIndex = std::max<int>(0, inputSamples - stft.windowSize() - stft.interval());
 			for (int i = startIndex; i < inputSamples; ++i) {
 				Sample s = inputChannel[i];
 				totalEnergy += s*s;
-				bufferChannel[i] = s;
+				tmpBuffer[i - startIndex + padStart] = s;
 			}
+			
+			stft.writeInput(c, tmpBuffer.size(), tmpBuffer.data());
 		}
+		stft.moveInput(tmpBuffer.size());
 		if (totalEnergy >= noiseFloor) {
 			silenceCounter = 0;
 			silenceFirst = true;
 		}
-		inputBuffer += inputSamples;
 		didSeek = true;
-		seekTimeFactor = (playbackRate*stft.interval() > 1) ? 1/playbackRate : stft.interval();
+		seekTimeFactor = (playbackRate*stft.defaultInterval() > 1) ? 1/playbackRate : stft.defaultInterval();
 	}
-
+	
 	template<class Inputs, class Outputs>
 	void process(Inputs &&inputs, int inputSamples, Outputs &&outputs, int outputSamples) {
+#ifdef SIGNALSMITH_STRETCH_PROFILE_PROCESS_START
+		SIGNALSMITH_STRETCH_PROFILE_PROCESS_START(inputSamples, outputSamples);
+#endif
+		int prevCopiedInput = 0;
+		auto copyInput = [&](int toIndex){
+
+			int length = std::min<int>(int(stft.blockSamples() + stft.defaultInterval()), toIndex - prevCopiedInput);
+			tmpBuffer.resize(length);
+			int offset = toIndex - length;
+			for (int c = 0; c < channels; ++c) {
+				auto &&inputBuffer = inputs[c];
+				for (int i = 0; i < length; ++i) {
+					tmpBuffer[i] = inputBuffer[i + offset];
+				}
+				stft.writeInput(c, length, tmpBuffer.data());
+			}
+			stft.moveInput(length);
+			prevCopiedInput = toIndex;
+		};
+
 		Sample totalEnergy = 0;
 		for (int c = 0; c < channels; ++c) {
 			auto &&inputChannel = inputs[c];
@@ -120,10 +186,13 @@
 				totalEnergy += s*s;
 			}
 		}
+
 		if (totalEnergy < noiseFloor) {
-			if (silenceCounter >= 2*stft.windowSize()) {
-				if (silenceFirst) {
+			if (silenceCounter >= 2*stft.blockSamples()) {
+				if (silenceFirst) { // first block of silence processing
 					silenceFirst = false;
+					//stft.reset();
+					blockProcess = {};
 					for (auto &b : channelBands) {
 						b.input = b.prevInput = b.output = 0;
 						b.inputEnergy = 0;
@@ -148,15 +217,7 @@
 				}
 
 				// Store input in history buffer
-				for (int c = 0; c < channels; ++c) {
-					auto &&inputChannel = inputs[c];
-					auto &&bufferChannel = inputBuffer[c];
-					int startIndex = std::max<int>(0, inputSamples - stft.windowSize() - stft.interval());
-					for (int i = startIndex; i < inputSamples; ++i) {
-						bufferChannel[i] = inputChannel[i];
-					}
-				}
-				inputBuffer += inputSamples;
+				copyInput(inputSamples);
 				return;
 			} else {
 				silenceCounter += inputSamples;
@@ -165,119 +226,174 @@
 			silenceCounter = 0;
 			silenceFirst = true;
 		}
-
+		
 		for (int outputIndex = 0; outputIndex < outputSamples; ++outputIndex) {
-			stft.ensureValid(outputIndex, [&](int outputOffset) {
+			bool newBlock = blockProcess.samplesSinceLast >= stft.defaultInterval();
+			if (newBlock) {
+				blockProcess.step = 0;
+				blockProcess.steps = 0; // how many processing steps this block will have
+				blockProcess.samplesSinceLast = 0;
+				
 				// Time to process a spectrum!  Where should it come from in the input?
-				int inputOffset = std::round(outputOffset*Sample(inputSamples)/outputSamples) - stft.windowSize();
+				int inputOffset = std::round(outputIndex*Sample(inputSamples)/outputSamples);
 				int inputInterval = inputOffset - prevInputOffset;
 				prevInputOffset = inputOffset;
+				
+				copyInput(inputOffset);
+				stashedInput = stft.input; // save the input state, since that's what we'll analyse later
+				if (_splitComputation) {
+					stashedOutput = stft.output; // save the current output, and read from it
+					stft.moveOutput(stft.defaultInterval()); // the actual input jumps forward in time by one interval, ready for the synthesis
+				}
+
+				blockProcess.newSpectrum = didSeek || (inputInterval > 0);
+				blockProcess.mappedFrequencies = customFreqMap || freqMultiplier != 1;
+				if (blockProcess.newSpectrum) {
+					// make sure the previous input is the correct distance in the past (give or take 1 sample)
+					blockProcess.reanalysePrev = didSeek || std::abs(inputInterval - int(stft.defaultInterval())) > 1;
+					if (blockProcess.reanalysePrev) blockProcess.steps += stft.analyseSteps() + 1;
 
-				bool newSpectrum = didSeek || (inputInterval > 0);
-				if (newSpectrum) {
-					for (int c = 0; c < channels; ++c) {
-						// Copy from the history buffer, if needed
-						auto &&bufferChannel = inputBuffer[c];
-						for (int i = 0; i < -inputOffset; ++i) {
-							timeBuffer[i] = bufferChannel[i + inputOffset];
+					// analyse a new input
+					blockProcess.steps += stft.analyseSteps() + 1;
+				}
+				
+				blockProcess.processFormants = formantMultiplier != 1 || (formantCompensation && blockProcess.mappedFrequencies);
+
+				blockProcess.timeFactor = didSeek ? seekTimeFactor : stft.defaultInterval()/std::max<Sample>(1, inputInterval);
+				didSeek = false;
+
+				updateProcessSpectrumSteps();
+				blockProcess.steps += processSpectrumSteps;
+
+				blockProcess.steps += stft.synthesiseSteps() + 1;
+			}
+			
+			size_t processToStep = newBlock ? blockProcess.steps : 0;
+			if (_splitComputation) {
+				Sample processRatio = Sample(blockProcess.samplesSinceLast + 1)/stft.defaultInterval();
+				processToStep = std::min<size_t>(blockProcess.steps, (blockProcess.steps + 0.999f)*processRatio);
+			}
+			
+			while (blockProcess.step < processToStep) {
+				size_t step = blockProcess.step++;
+#ifdef SIGNALSMITH_STRETCH_PROFILE_PROCESS_STEP
+				SIGNALSMITH_STRETCH_PROFILE_PROCESS_STEP(step, blockProcess.steps);
+#endif
+				if (blockProcess.newSpectrum) {
+					if (blockProcess.reanalysePrev) {
+						// analyse past input
+						if (step < stft.analyseSteps()) {
+							stashedInput.swap(stft.input);
+							stft.analyseStep(step, stft.defaultInterval());
+							stashedInput.swap(stft.input);
+							continue;
 						}
-						// Copy the rest from the input
-						auto &&inputChannel = inputs[c];
-						for (int i = std::max<int>(0, -inputOffset); i < stft.windowSize(); ++i) {
-							timeBuffer[i] = inputChannel[i + inputOffset];
+						step -= stft.analyseSteps();
+						if (step < 1) {
+							// Copy previous analysis to our band objects
+							for (int c = 0; c < channels; ++c) {
+								auto channelBands = bandsForChannel(c);
+								auto *spectrumBands = stft.spectrum(c);
+								for (int b = 0; b < bands; ++b) {
+									channelBands[b].prevInput = spectrumBands[b];
+								}
+							}
+							continue;
 						}
-						stft.analyse(c, timeBuffer);
+						step -= 1;
 					}
-					flushed = false; // TODO: first block after a flush should be gain-compensated
 
-					for (int c = 0; c < channels; ++c) {
-						auto channelBands = bandsForChannel(c);
-						auto &&spectrumBands = stft.spectrum[c];
-						for (int b = 0; b < bands; ++b) {
-							channelBands[b].input = spectrumBands[b];
-						}
+					// Analyse latest (stashed) input
+					if (step < stft.analyseSteps()) {
+						stashedInput.swap(stft.input);
+						stft.analyseStep(step);
+						stashedInput.swap(stft.input);
+						continue;
 					}
-
-					if (didSeek || inputInterval != stft.interval()) { // make sure the previous input is the correct distance in the past
-						int prevIntervalOffset = inputOffset - stft.interval();
-						for (int c = 0; c < channels; ++c) {
-							// Copy from the history buffer, if needed
-							auto &&bufferChannel = inputBuffer[c];
-							for (int i = 0; i < std::min(-prevIntervalOffset, stft.windowSize()); ++i) {
-								timeBuffer[i] = bufferChannel[i + prevIntervalOffset];
-							}
-							// Copy the rest from the input
-							auto &&inputChannel = inputs[c];
-							for (int i = std::max<int>(0, -prevIntervalOffset); i < stft.windowSize(); ++i) {
-								timeBuffer[i] = inputChannel[i + prevIntervalOffset];
-							}
-							stft.analyse(c, timeBuffer);
-						}
+					step -= stft.analyseSteps();
+					if (step < 1) {
+						// Copy analysed spectrum into our band objects
 						for (int c = 0; c < channels; ++c) {
 							auto channelBands = bandsForChannel(c);
-							auto &&spectrumBands = stft.spectrum[c];
+							auto *spectrumBands = stft.spectrum(c);
 							for (int b = 0; b < bands; ++b) {
-								channelBands[b].prevInput = spectrumBands[b];
+								channelBands[b].input = spectrumBands[b];
 							}
 						}
+						continue;
 					}
+					step -= 1;
 				}
-				
-				Sample timeFactor = didSeek ? seekTimeFactor : stft.interval()/std::max<Sample>(1, inputInterval);
-				processSpectrum(newSpectrum, timeFactor);
-				didSeek = false;
 
-				for (int c = 0; c < channels; ++c) {
-					auto channelBands = bandsForChannel(c);
-					auto &&spectrumBands = stft.spectrum[c];
-					for (int b = 0; b < bands; ++b) {
-						spectrumBands[b] = channelBands[b].output;
+				if (step < processSpectrumSteps) {
+					processSpectrum(step);
+					continue;
+				}
+				step -= processSpectrumSteps;
+
+				if (step < 1) {
+					// Copy band objects into spectrum
+					for (int c = 0; c < channels; ++c) {
+						auto channelBands = bandsForChannel(c);
+						auto *spectrumBands = stft.spectrum(c);
+						for (int b = 0; b < bands; ++b) {
+							spectrumBands[b] = channelBands[b].output;
+						}
 					}
+					continue;
 				}
-			});
+				step -= 1;
+				
+				if (step < stft.synthesiseSteps()) {
+					stft.synthesiseStep(step);
+					continue;
+				}
+			}
+#ifdef SIGNALSMITH_STRETCH_PROFILE_PROCESS_ENDSTEP
+			SIGNALSMITH_STRETCH_PROFILE_PROCESS_ENDSTEP();
+#endif
 
+			++blockProcess.samplesSinceLast;
+			if (_splitComputation) stashedOutput.swap(stft.output);
 			for (int c = 0; c < channels; ++c) {
 				auto &&outputChannel = outputs[c];
-				auto &&stftChannel = stft[c];
-				outputChannel[outputIndex] = stftChannel[outputIndex];
+				Sample v = 0;
+				stft.readOutput(c, 1, &v);
+				outputChannel[outputIndex] = v;
 			}
+			stft.moveOutput(1);
+			if (_splitComputation) stashedOutput.swap(stft.output);
 		}
-
-		// Store input in history buffer
-		for (int c = 0; c < channels; ++c) {
-			auto &&inputChannel = inputs[c];
-			auto &&bufferChannel = inputBuffer[c];
-			int startIndex = std::max<int>(0, inputSamples - stft.windowSize());
-			for (int i = startIndex; i < inputSamples; ++i) {
-				bufferChannel[i] = inputChannel[i];
-			}
-		}
-		inputBuffer += inputSamples;
-		stft += outputSamples;
+		
+		copyInput(inputSamples);
 		prevInputOffset -= inputSamples;
+#ifdef SIGNALSMITH_STRETCH_PROFILE_PROCESS_END
+		SIGNALSMITH_STRETCH_PROFILE_PROCESS_END();
+#endif
 	}
 
 	// Read the remaining output, providing no further input.  `outputSamples` should ideally be at least `.outputLatency()`
 	template<class Outputs>
 	void flush(Outputs &&outputs, int outputSamples) {
-		int plainOutput = std::min<int>(outputSamples, stft.windowSize());
-		int foldedBackOutput = std::min<int>(outputSamples, stft.windowSize() - plainOutput);
+		int plainOutput = std::min<int>(outputSamples, int(stft.blockSamples()));
+		int foldedBackOutput = std::min<int>(outputSamples, int(stft.blockSamples()) - plainOutput);
+		stft.finishOutput(1);
 		for (int c = 0; c < channels; ++c) {
+			tmpBuffer.resize(plainOutput);
+			stft.readOutput(c, plainOutput, tmpBuffer.data());
 			auto &&outputChannel = outputs[c];
-			auto &&stftChannel = stft[c];
 			for (int i = 0; i < plainOutput; ++i) {
 				// TODO: plain output should be gain-
-				outputChannel[i] = stftChannel[i];
+				outputChannel[i] = tmpBuffer[i];
 			}
+			tmpBuffer.resize(foldedBackOutput);
+			stft.readOutput(c, plainOutput, foldedBackOutput, tmpBuffer.data());
 			for (int i = 0; i < foldedBackOutput; ++i) {
-				outputChannel[outputSamples - 1 - i] -= stftChannel[plainOutput + i];
-			}
-			for (int i = 0; i < plainOutput + foldedBackOutput; ++i) {
-				stftChannel[i] = 0;
+				outputChannel[outputSamples - 1 - i] -= tmpBuffer[i];
 			}
 		}
-		// Skip the output we just used/cleared
-		stft += plainOutput + foldedBackOutput;
+		stft.reset(0.1);
+
 		// Reset the phase-vocoder stuff, so the next block gets a fresh start
 		for (int c = 0; c < channels; ++c) {
 			auto channelBands = bandsForChannel(c);
@@ -285,31 +401,50 @@
 				channelBands[b].prevInput = channelBands[b].output = 0;
 			}
 		}
-		flushed = true;
 	}
 private:
+	bool _splitComputation = false;
+	struct {
+		size_t samplesSinceLast = std::numeric_limits<size_t>::max();
+		size_t steps = 0;
+		size_t step = 0;
+		
+		bool newSpectrum = false;
+		bool reanalysePrev = false;
+		bool mappedFrequencies = false;
+		bool processFormants = false;
+		Sample timeFactor;
+	} blockProcess;
+
 	using Complex = std::complex<Sample>;
 	static constexpr Sample noiseFloor{1e-15};
 	static constexpr Sample maxCleanStretch{2}; // time-stretch ratio before we start randomising phases
-	int silenceCounter = 0;
+	size_t silenceCounter = 0;
 	bool silenceFirst = true;
 
 	Sample freqMultiplier = 1, freqTonalityLimit = 0.5;
 	std::function<Sample(Sample)> customFreqMap = nullptr;
+	
+	bool formantCompensation = false; // compensate for pitch/freq change
+	Sample formantMultiplier = 1, invFormantMultiplier = 1;
+
+	using STFT = signalsmith::linear::DynamicSTFT<Sample, false, true>;
+	STFT stft;
+	typename STFT::Input stashedInput;
+	typename STFT::Output stashedOutput;
+	
+	std::vector<Sample> tmpBuffer;
 
-	signalsmith::spectral::STFT<Sample> stft{0, 1, 1};
-	signalsmith::delay::MultiBuffer<Sample> inputBuffer;
 	int channels = 0, bands = 0;
 	int prevInputOffset = -1;
-	std::vector<Sample> timeBuffer;
-	bool didSeek = false, flushed = true;
+	bool didSeek = false;
 	Sample seekTimeFactor = 1;
 
 	Sample bandToFreq(Sample b) const {
-		return (b + Sample(0.5))/stft.fftSize();
+		return stft.binToFreq(b);
 	}
 	Sample freqToBand(Sample f) const {
-		return f*stft.fftSize() - Sample(0.5);
+		return stft.freqToBin(f);
 	}
 	
 	struct Band {
@@ -371,10 +506,10 @@
 		Complex input;
 
 		Complex makeOutput(Complex phase) {
-			Sample phaseNorm = std::norm(phase);
+			Sample phaseNorm = _impl::norm(phase);
 			if (phaseNorm <= noiseFloor) {
 				phase = input; // prediction is too weak, fall back to the input
-				phaseNorm = std::norm(input) + noiseFloor;
+				phaseNorm = _impl::norm(input) + noiseFloor;
 			}
 			return phase*std::sqrt(energy/phaseNorm);
 		}
@@ -384,49 +519,95 @@
 		return channelPredictions.data() + c*bands;
 	}
 
-	RandomEngine randomEngine;
-
-	void processSpectrum(bool newSpectrum, Sample timeFactor) {
+	// If RandomEngine=void, use std::default_random_engine;
+	using RandomEngineImpl = typename std::conditional<
+		std::is_void<RandomEngine>::value,
+		std::default_random_engine,
+		RandomEngine
+	>::type;
+	RandomEngineImpl randomEngine;
+
+	size_t processSpectrumSteps = 0;
+	static constexpr size_t splitMainPrediction = 8; // it's just heavy, since we're blending up to 4 different phase predictions
+	void updateProcessSpectrumSteps() {
+		processSpectrumSteps = 0;
+		if (blockProcess.newSpectrum) processSpectrumSteps += channels;
+		if (blockProcess.mappedFrequencies) {
+			processSpectrumSteps += smoothEnergySteps;
+			processSpectrumSteps += 1; // findPeaks
+		}
+		processSpectrumSteps += 1; // updating the output map
+		processSpectrumSteps += channels; // preliminary phase-vocoder prediction
+		processSpectrumSteps += splitMainPrediction;
+		if (blockProcess.newSpectrum) processSpectrumSteps += 1; // .input -> .prevInput
+		if (blockProcess.processFormants) processSpectrumSteps += 3;
+	}
+	void processSpectrum(size_t step) {
+		Sample timeFactor = blockProcess.timeFactor;
+		
+		Sample smoothingBins = Sample(stft.fftSamples())/stft.defaultInterval();
+		int longVerticalStep = std::round(smoothingBins);
 		timeFactor = std::max<Sample>(timeFactor, 1/maxCleanStretch);
 		bool randomTimeFactor = (timeFactor > maxCleanStretch);
 		std::uniform_real_distribution<Sample> timeFactorDist(maxCleanStretch*2*randomTimeFactor - timeFactor, timeFactor);
-		
-		if (newSpectrum) {
-			for (int c = 0; c < channels; ++c) {
-				auto bins = bandsForChannel(c);
 
-				Complex rot = std::polar(Sample(1), bandToFreq(0)*stft.interval()*Sample(2*M_PI));
+		if (blockProcess.newSpectrum) {
+			if (step < size_t(channels)) {
+				int channel = int(step);
+				auto bins = bandsForChannel(channel);
+
+				Complex rot = std::polar(Sample(1), bandToFreq(0)*stft.defaultInterval()*Sample(2*M_PI));
 				Sample freqStep = bandToFreq(1) - bandToFreq(0);
-				Complex rotStep = std::polar(Sample(1), freqStep*stft.interval()*Sample(2*M_PI));
-				
+				Complex rotStep = std::polar(Sample(1), freqStep*stft.defaultInterval()*Sample(2*M_PI));
+				 
 				for (int b = 0; b < bands; ++b) {
 					auto &bin = bins[b];
-					bin.output = signalsmith::perf::mul(bin.output, rot);
-					bin.prevInput = signalsmith::perf::mul(bin.prevInput, rot);
-					rot = signalsmith::perf::mul(rot, rotStep);
+					bin.output = _impl::mul(bin.output, rot);
+					bin.prevInput = _impl::mul(bin.prevInput, rot);
+					rot = _impl::mul(rot, rotStep);
 				}
+				return;
 			}
+			step -= channels;
 		}
+		if (blockProcess.mappedFrequencies) {
+			if (step < smoothEnergySteps) {
+				smoothEnergy(step, smoothingBins);
+				return;
+			}
+			step -= smoothEnergySteps;
+			if (step-- == 0) {
+				findPeaks();
+				return;
+			}
+		}
+		if (step-- == 0) {
+			if (blockProcess.mappedFrequencies) {
+				updateOutputMap();
+			} else { // we're not pitch-shifting, so no need to find peaks etc.
+				for (int c = 0; c < channels; ++c) {
+					Band *bins = bandsForChannel(c);
+					for (int b = 0; b < bands; ++b) {
+						bins[b].inputEnergy = _impl::norm(bins[b].input);
+					}
+				}
 
-		Sample smoothingBins = Sample(stft.fftSize())/stft.interval();
-		int longVerticalStep = std::round(smoothingBins);
-		if (customFreqMap || freqMultiplier != 1) {
-			findPeaks(smoothingBins);
-			updateOutputMap();
-		} else { // we're not pitch-shifting, so no need to find peaks etc.
-			for (int c = 0; c < channels; ++c) {
-				Band *bins = bandsForChannel(c);
 				for (int b = 0; b < bands; ++b) {
-					bins[b].inputEnergy = std::norm(bins[b].input);
+					outputMap[b] = {Sample(b), 1};
 				}
 			}
-			for (int b = 0; b < bands; ++b) {
-				outputMap[b] = {Sample(b), 1};
+			return;
+		}
+		if (blockProcess.processFormants) {
+			if (step < 3) {
+				updateFormants(step);
+				return;
 			}
+			step -= 3;
 		}
-
 		// Preliminary output prediction from phase-vocoder
-		for (int c = 0; c < channels; ++c) {
+		if (step < size_t(channels)) {
+			int c = int(step);
 			Band *bins = bandsForChannel(c);
 			auto *predictions = predictionsForChannel(c);
 			for (int b = 0; b < bands; ++b) {
@@ -442,137 +623,152 @@
 
 				auto &outputBin = bins[b];
 				Complex prevInput = getFractional<&Band::prevInput>(c, lowIndex, fracIndex);
-				Complex freqTwist = signalsmith::perf::mul<true>(prediction.input, prevInput);
-				Complex phase = signalsmith::perf::mul(outputBin.output, freqTwist);
+				Complex freqTwist = _impl::mul<true>(prediction.input, prevInput);
+				Complex phase = _impl::mul(outputBin.output, freqTwist);
 				outputBin.output = phase/(std::max(prevEnergy, prediction.energy) + noiseFloor);
 			}
+			return;
 		}
+		step -= channels;
 
-		// Re-predict using phase differences between frequencies
-		for (int b = 0; b < bands; ++b) {
-			// Find maximum-energy channel and calculate that
-			int maxChannel = 0;
-			Sample maxEnergy = predictionsForChannel(0)[b].energy;
-			for (int c = 1; c < channels; ++c) {
-				Sample e = predictionsForChannel(c)[b].energy;
-				if (e > maxEnergy) {
-					maxChannel = c;
-					maxEnergy = e;
+		if (step < splitMainPrediction) {
+			// Re-predict using phase differences between frequencies
+			size_t chunk = step;
+			int startB = int(bands*chunk/splitMainPrediction);
+			int endB = int(bands*(chunk + 1)/splitMainPrediction);
+			for (int b = startB; b < endB; ++b) {
+				// Find maximum-energy channel and calculate that
+				int maxChannel = 0;
+				Sample maxEnergy = predictionsForChannel(0)[b].energy;
+				for (int c = 1; c < channels; ++c) {
+					Sample e = predictionsForChannel(c)[b].energy;
+					if (e > maxEnergy) {
+						maxChannel = c;
+						maxEnergy = e;
+					}
 				}
-			}
 
-			auto *predictions = predictionsForChannel(maxChannel);
-			auto &prediction = predictions[b];
-			auto *bins = bandsForChannel(maxChannel);
-			auto &outputBin = bins[b];
-
-			Complex phase = 0;
-			auto mapPoint = outputMap[b];
-
-			// Upwards vertical steps
-			if (b > 0) {
-				Sample binTimeFactor = randomTimeFactor ? timeFactorDist(randomEngine) : timeFactor;
-				Complex downInput = getFractional<&Band::input>(maxChannel, mapPoint.inputBin - binTimeFactor);
-				Complex shortVerticalTwist = signalsmith::perf::mul<true>(prediction.input, downInput);
+				auto *predictions = predictionsForChannel(maxChannel);
+				auto &prediction = predictions[b];
+				auto *bins = bandsForChannel(maxChannel);
+				auto &outputBin = bins[b];
 
-				auto &downBin = bins[b - 1];
-				phase += signalsmith::perf::mul(downBin.output, shortVerticalTwist);
-				
-				if (b >= longVerticalStep) {
-					Complex longDownInput = getFractional<&Band::input>(maxChannel, mapPoint.inputBin - longVerticalStep*binTimeFactor);
-					Complex longVerticalTwist = signalsmith::perf::mul<true>(prediction.input, longDownInput);
+				Complex phase = 0;
+				auto mapPoint = outputMap[b];
 
-					auto &longDownBin = bins[b - longVerticalStep];
-					phase += signalsmith::perf::mul(longDownBin.output, longVerticalTwist);
-				}
-			}
-			// Downwards vertical steps
-			if (b < bands - 1) {
-				auto &upPrediction = predictions[b + 1];
-				auto &upMapPoint = outputMap[b + 1];
+				// Upwards vertical steps
+				if (b > 0) {
+					Sample binTimeFactor = randomTimeFactor ? timeFactorDist(randomEngine) : timeFactor;
+					Complex downInput = getFractional<&Band::input>(maxChannel, mapPoint.inputBin - binTimeFactor);
+					Complex shortVerticalTwist = _impl::mul<true>(prediction.input, downInput);
 
-				Sample binTimeFactor = randomTimeFactor ? timeFactorDist(randomEngine) : timeFactor;
-				Complex downInput = getFractional<&Band::input>(maxChannel, upMapPoint.inputBin - binTimeFactor);
-				Complex shortVerticalTwist = signalsmith::perf::mul<true>(upPrediction.input, downInput);
+					auto &downBin = bins[b - 1];
+					phase += _impl::mul(downBin.output, shortVerticalTwist);
+					
+					if (b >= longVerticalStep) {
+						Complex longDownInput = getFractional<&Band::input>(maxChannel, mapPoint.inputBin - longVerticalStep*binTimeFactor);
+						Complex longVerticalTwist = _impl::mul<true>(prediction.input, longDownInput);
 
-				auto &upBin = bins[b + 1];
-				phase += signalsmith::perf::mul<true>(upBin.output, shortVerticalTwist);
-				
-				if (b < bands - longVerticalStep) {
-					auto &longUpPrediction = predictions[b + longVerticalStep];
-					auto &longUpMapPoint = outputMap[b + longVerticalStep];
+						auto &longDownBin = bins[b - longVerticalStep];
+						phase += _impl::mul(longDownBin.output, longVerticalTwist);
+					}
+				}
+				// Downwards vertical steps
+				if (b < bands - 1) {
+					auto &upPrediction = predictions[b + 1];
+					auto &upMapPoint = outputMap[b + 1];
+
+					Sample binTimeFactor = randomTimeFactor ? timeFactorDist(randomEngine) : timeFactor;
+					Complex downInput = getFractional<&Band::input>(maxChannel, upMapPoint.inputBin - binTimeFactor);
+					Complex shortVerticalTwist = _impl::mul<true>(upPrediction.input, downInput);
+
+					auto &upBin = bins[b + 1];
+					phase += _impl::mul<true>(upBin.output, shortVerticalTwist);
+					
+					if (b < bands - longVerticalStep) {
+						auto &longUpPrediction = predictions[b + longVerticalStep];
+						auto &longUpMapPoint = outputMap[b + longVerticalStep];
 
-					Complex longDownInput = getFractional<&Band::input>(maxChannel, longUpMapPoint.inputBin - longVerticalStep*binTimeFactor);
-					Complex longVerticalTwist = signalsmith::perf::mul<true>(longUpPrediction.input, longDownInput);
+						Complex longDownInput = getFractional<&Band::input>(maxChannel, longUpMapPoint.inputBin - longVerticalStep*binTimeFactor);
+						Complex longVerticalTwist = _impl::mul<true>(longUpPrediction.input, longDownInput);
 
-					auto &longUpBin = bins[b + longVerticalStep];
-					phase += signalsmith::perf::mul<true>(longUpBin.output, longVerticalTwist);
+						auto &longUpBin = bins[b + longVerticalStep];
+						phase += _impl::mul<true>(longUpBin.output, longVerticalTwist);
+					}
 				}
-			}
 
-			outputBin.output = prediction.makeOutput(phase);
-			
-			// All other bins are locked in phase
-			for (int c = 0; c < channels; ++c) {
-				if (c != maxChannel) {
-					auto &channelBin = bandsForChannel(c)[b];
-					auto &channelPrediction = predictionsForChannel(c)[b];
-					
-					Complex channelTwist = signalsmith::perf::mul<true>(channelPrediction.input, prediction.input);
-					Complex channelPhase = signalsmith::perf::mul(outputBin.output, channelTwist);
-					channelBin.output = channelPrediction.makeOutput(channelPhase);
+				outputBin.output = prediction.makeOutput(phase);
+				
+				// All other bins are locked in phase
+				for (int c = 0; c < channels; ++c) {
+					if (c != maxChannel) {
+						auto &channelBin = bandsForChannel(c)[b];
+						auto &channelPrediction = predictionsForChannel(c)[b];
+						
+						Complex channelTwist = _impl::mul<true>(channelPrediction.input, prediction.input);
+						Complex channelPhase = _impl::mul(outputBin.output, channelTwist);
+						channelBin.output = channelPrediction.makeOutput(channelPhase);
+					}
 				}
 			}
+			return;
 		}
+		step -= splitMainPrediction;
 
-		if (newSpectrum) {
-			for (auto &bin : channelBands) {
-				bin.prevInput = bin.input;
+		if (blockProcess.newSpectrum) {
+			if (step-- == 0) {
+				for (auto &bin : channelBands) {
+					bin.prevInput = bin.input;
+				}
 			}
 		}
 	}
 	
 	// Produces smoothed energy across all channels
-	void smoothEnergy(Sample smoothingBins) {
+	static constexpr size_t smoothEnergySteps = 3;
+	Sample smoothEnergyState = 0;
+	void smoothEnergy(size_t step, Sample smoothingBins) {
 		Sample smoothingSlew = 1/(1 + smoothingBins*Sample(0.5));
-		for (auto &e : energy) e = 0;
-		for (int c = 0; c < channels; ++c) {
-			Band *bins = bandsForChannel(c);
+		if (step-- == 0) {
+			for (auto &e : energy) e = 0;
+			for (int c = 0; c < channels; ++c) {
+				Band *bins = bandsForChannel(c);
+				for (int b = 0; b < bands; ++b) {
+					Sample e = _impl::norm(bins[b].input);
+					bins[b].inputEnergy = e; // Used for interpolating prediction energy
+					energy[b] += e;
+				}
+			}
 			for (int b = 0; b < bands; ++b) {
-				Sample e = std::norm(bins[b].input);
-				bins[b].inputEnergy = e; // Used for interpolating prediction energy
-				energy[b] += e;
+				smoothedEnergy[b] = energy[b];
 			}
+			smoothEnergyState = 0;
+			return;
 		}
-		for (int b = 0; b < bands; ++b) {
-			smoothedEnergy[b] = energy[b];
+
+		// The two other steps are repeated smoothing passes, down and up
+		Sample e = smoothEnergyState;
+		for (int b = bands - 1; b >= 0; --b) {
+			e += (smoothedEnergy[b] - e)*smoothingSlew;
+			smoothedEnergy[b] = e;
 		}
-		Sample e = 0;
-		for (int repeat = 0; repeat < 2; ++repeat) {
-			for (int b = bands - 1; b >= 0; --b) {
-				e += (smoothedEnergy[b] - e)*smoothingSlew;
-				smoothedEnergy[b] = e;
-			}
-			for (int b = 0; b < bands; ++b) {
-				e += (smoothedEnergy[b] - e)*smoothingSlew;
-				smoothedEnergy[b] = e;
-			}
+		for (int b = 0; b < bands; ++b) {
+			e += (smoothedEnergy[b] - e)*smoothingSlew;
+			smoothedEnergy[b] = e;
 		}
+		smoothEnergyState = e;
 	}
 	
 	Sample mapFreq(Sample freq) const {
 		if (customFreqMap) return customFreqMap(freq);
 		if (freq > freqTonalityLimit) {
-			Sample diff = freq - freqTonalityLimit;
-			return freqTonalityLimit*freqMultiplier + diff;
+			return freq + (freqMultiplier - 1)*freqTonalityLimit;
 		}
 		return freq*freqMultiplier;
 	}
 	
 	// Identifies spectral peaks using energy across all channels
-	void findPeaks(Sample smoothingBins) {
-		smoothEnergy(smoothingBins);
-
+	void findPeaks() {
 		peaks.resize(0);
 		
 		int start = 0;
@@ -631,6 +827,118 @@
 			outputMap[b] = {b + topOffset, 1};
 		}
 	}
+
+	// If we mapped formants the same way as mapFreq(), this would be the inverse
+	Sample invMapFormant(Sample freq) const {
+		if (freq*invFormantMultiplier > freqTonalityLimit) {
+			return freq + (1 - formantMultiplier)*freqTonalityLimit;
+		}
+		return freq*invFormantMultiplier;
+	}
+
+	Sample freqEstimateWeighted = 0;
+	Sample freqEstimateWeight = 0;
+	Sample estimateFrequency() {
+		// 3 highest peaks in the input
+		std::array<int, 3> peakIndices{0, 0, 0};
+		for (int b = 1; b < bands - 1; ++b) {
+			Sample e = formantMetric[b];
+			// local maxima only
+			if (e < formantMetric[b - 1] || e <= formantMetric[b + 1]) continue;
+			
+			if (e > formantMetric[peakIndices[0]]) {
+				if (e > formantMetric[peakIndices[1]]) {
+					if (e > formantMetric[peakIndices[2]]) {
+						peakIndices = {peakIndices[1], peakIndices[2], b};
+					} else {
+						peakIndices = {peakIndices[1], b, peakIndices[2]};
+					}
+				} else {
+					peakIndices[0] = b;
+				}
+			}
+		}
+		
+		// VERY rough pitch estimation
+		int peakEstimate = peakIndices[2];
+		if (formantMetric[peakIndices[1]] > formantMetric[peakIndices[2]]*0.1) {
+			int diff = std::abs(peakEstimate - peakIndices[1]);
+			if (diff > peakEstimate/8 && diff < peakEstimate*7/8) peakEstimate = peakEstimate%diff;
+			if (formantMetric[peakIndices[0]] > formantMetric[peakIndices[2]]*0.01) {
+				int diff = std::abs(peakEstimate - peakIndices[0]);
+				if (diff > peakEstimate/8 && diff < peakEstimate*7/8) peakEstimate = peakEstimate%diff;
+			}
+		}
+		Sample weight = formantMetric[peakIndices[2]];
+		// Smooth it out a bit
+		freqEstimateWeighted += (peakEstimate*weight - freqEstimateWeighted)*0.25;
+		freqEstimateWeight += (weight - freqEstimateWeight)*0.25;
+		
+		return freqEstimateWeighted/(freqEstimateWeight + Sample(1e-30));
+	}
+	
+	Sample freqEstimate;
+	
+	std::vector<Sample> formantMetric;
+	Sample formantBaseFreq = 0;
+	void updateFormants(size_t step) {
+		if (step-- == 0) {
+			for (auto &e : formantMetric) e = 0;
+			for (int c = 0; c < channels; ++c) {
+				Band *bins = bandsForChannel(c);
+				for (int b = 0; b < bands; ++b) {
+					formantMetric[b] += bins[b].inputEnergy;
+				}
+			}
+
+			freqEstimate = freqToBand(formantBaseFreq);
+			if (formantBaseFreq <= 0) freqEstimate = estimateFrequency();
+
+			for (int b = 0; b < bands; ++b) {
+				formantMetric[b] = std::sqrt(formantMetric[b]);
+			}
+		} else if (step-- == 0) {
+			Sample slew = 1/(freqEstimate*0.5 + 1);
+			Sample e = 0;
+			for (size_t repeat = 0; repeat < 2; ++repeat) {
+				for (int b = bands - 1; b >= 0; --b) {
+					e += (formantMetric[b] - e)*slew;
+					formantMetric[b] = e;
+				}
+				for (int b = 0; b < bands; ++b) {
+					e += (formantMetric[b] - e)*slew;
+					formantMetric[b] = e;
+				}
+			}
+		} else {
+			auto getFormant = [&](Sample band) -> Sample {
+				if (band < 0) return 0;
+				band = std::min<Sample>(band, bands);
+				int floorBand = std::floor(band);
+				Sample fracBand = band - floorBand;
+				Sample low = formantMetric[floorBand], high = formantMetric[floorBand + 1];
+				return low + (high - low)*fracBand;
+			};
+
+			for (int b = 0; b < bands; ++b) {
+				Sample inputF = bandToFreq(b);
+				Sample outputF = formantCompensation ? mapFreq(inputF) : inputF;
+				outputF = invMapFormant(outputF);
+
+				Sample inputE = formantMetric[b];
+				Sample targetE = getFormant(freqToBand(outputF));
+
+				Sample formantRatio = targetE/(inputE + Sample(1e-30));
+				Sample energyRatio = formantRatio*formantRatio;
+
+				for (int c = 0; c < channels; ++c) {
+					Band *bins = bandsForChannel(c);
+					// This is what's used to decide the output energy, so this affects the output
+					bins[b].inputEnergy *= energyRatio;
+				}
+			}
+		}
+	}
 };
 
 }} // namespace

Modified: branches/OpenMPT-1.32/tracklib/TimeStretchPitchShift.cpp
==============================================================================
--- branches/OpenMPT-1.32/tracklib/TimeStretchPitchShift.cpp	Sun May 25 14:04:10 2025	(r23191)
+++ branches/OpenMPT-1.32/tracklib/TimeStretchPitchShift.cpp	Sun May 25 14:04:43 2025	(r23192)
@@ -20,9 +20,9 @@
 #if MPT_COMPILER_MSVC
 #pragma warning(push)
 #pragma warning(disable:4305)
-#pragma warning(disable:4267)
 #pragma warning(disable:4244)
 #pragma warning(disable:4458)
+#pragma warning(disable:4456)
 #endif  // MPT_COMPILER_MSVC
 #include <SignalsmithStretch/signalsmith-stretch.h>
 #if MPT_COMPILER_MSVC




[OMPT-svn-commit] OpenMPT r23192 - in branches/OpenMPT-1.32: . include/SignalsmithStretch include/S

A music tracker software for Windows.

[OMPT-svn-commit] OpenMPT r23192 - in branches/OpenMPT-1.32: . include/SignalsmithStretch include/SignalsmithStretch/SignalsmithStretch include/SignalsmithStretch/SignalsmithStretch/dsp include/SignalsmithStretch/signalsmith-linear tracklib