Re: [Algorithms] Transposing (CD) audio...

On Dec 14, 2000 at 03:34 -0500, Jeffrey Rainy wrote:
> > > It is the case...  it's due to the "Scaling Property" of the Discrete
> > > Fourier Transform.  I remember looking it up the last time this
> > > discussion went around, but I'll admit to having to look it up again
> > > to remember the official terms :)
> > >
> > > In _Principles of Digital Image Synthesis_, on p 233, there's a table
> > > of some DFT properties.  If you have a time domain function f(n), and
> > > its Discrete Fourier Transform F(Omega), then the scaling property says
> > > that:
> > >
> > >          DFT
> > > f(a*n) <-----> 1/a * F(Omega/a)
> > >
> > > So if F'(Omega) = F(Omega/a), and a > 1 (so F' is F with the
> > > frequencies scaled up), then f'(n) = a * f(a*n), or a sped-up and
> > > scaled version of f(n).
> > >
> 
> This is all true. However, it does not apply here.

Within a single block, it definitely applies.

> To perform a complete FT, you need to use the complete time domain, from
> minus infinity to infinity.

Not sure what you mean.  You can take the Discrete FT of a
time-limited signal with no problem.  In fact it's a lot easier than
transforming an infinite signal :) The catch is that the inverse
transform gives you a periodic version of the original time-limited
signal (i.e. it just repeats outside the original block boundaries).
That's just a property of the DFT.
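
The periodicity is easy to check numerically.  Here is a minimal
sketch in plain Python, assuming a naive O(N^2) DFT; the helper names
`dft` and `idft_at` and the sample values are made up for
illustration.  It evaluates the inverse transform at indices outside
the original block and compares against the block repeated.

```python
import cmath

def dft(x):
    """Naive O(N^2) forward DFT of a list of numbers."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
            for k in range(n)]

def idft_at(spectrum, t):
    """Evaluate the inverse DFT at any integer index t, even outside the block."""
    n = len(spectrum)
    return sum(spectrum[k] * cmath.exp(2j * cmath.pi * k * t / n)
               for k in range(n)) / n

# An arbitrary time-limited block (illustrative values only).
block = [0.0, 1.0, 0.5, -0.25, -1.0, 0.75, 0.3, -0.6]
n = len(block)
spectrum = dft(block)

# Reconstruct one block before, the block itself, and one block after.
indices = list(range(-n, 2 * n))
extended = [idft_at(spectrum, t).real for t in indices]

# Largest difference between the reconstruction and the block repeated.
max_error = max(abs(extended[i] - block[t % n]) for i, t in enumerate(indices))
print("max deviation from the periodic extension:", max_error)
```

The deviation should be at rounding-error level: outside the block the
reconstruction is just the block again.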

> If you perform it on a sample (or audio block), many low frequencies are not
> present, thus not shifted, thus not accelerated.

Agreed.

> This is why your audio will
> sound "weird". The high frequencies are correctly shifted, but the low
> frequencies are left unmodified.

Well, it seems to me this is the murky part... the frequencies with
periods longer than the block length are completely absent in the FT, except
for a DC offset.  So where do the low frequencies in the resulting
signal come from?  I think they come back in, due to the windowing of
the reconstituted blocks (especially considering their DC
components)... when you window anything you're adding all kinds of
frequency information... but I don't think I can explain it in any
coherent way.

Bottom line, this is where my theoretical knowledge abruptly ends :)
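
This doesn't settle where the low frequencies come from, but the
narrower claim that windowing adds frequency content can at least be
checked.  A minimal sketch, assuming a Hann window (the thread doesn't
say which window is used) and a block that contains nothing but a DC
offset; `dft` is a naive illustrative helper.

```python
import cmath
import math

def dft(x):
    """Naive O(N^2) forward DFT of a list of numbers."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
            for k in range(n)]

N = 16
dc_block = [1.0] * N  # a block with only a DC component

# Periodic Hann window (an assumption; any tapering window behaves similarly).
hann = [0.5 - 0.5 * math.cos(2 * math.pi * t / N) for t in range(N)]
windowed = [s * w for s, w in zip(dc_block, hann)]

before = [abs(c) for c in dft(dc_block)]   # magnitudes without the window
after = [abs(c) for c in dft(windowed)]    # magnitudes with the window

print("bins 0..2 before windowing:", [round(m, 6) for m in before[:3]])
print("bins 0..2 after windowing: ", [round(m, 6) for m in after[:3]])
```

Before windowing only bin 0 is non-zero.  After windowing, part of
that energy sits in bin 1, i.e. a component whose period equals the
block length, which was not in the block to begin with.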

> If you cut your audio in blocks, shift up the frequency, then create blocks
> of the same size (time) with the shifted frequency, you'll have audio that is
> not time-accelerated. However, your high-frequency components will be
> accelerated, creating "aliasing" in your audio.

I don't think aliasing comes into play, unless you're not careful when
doing the frequency scaling (i.e. you just have to drop any components
that go above half the sample rate).  If you implement the frequency
scaling by just resampling the block, then yeah you have to do a
decent job of filtering to avoid aliasing.  Much like scaling an
image.
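
To make the resampling case concrete, here is a small sketch in plain
Python of what goes wrong without the filter.  The block length, the
factor of 2, and the test tone at bin 6 are arbitrary choices for
illustration, and `dft` is a naive helper.  Doubling the frequency
should move the tone to bin 12, which is above the Nyquist bin (8) for
a 16-sample block.

```python
import cmath
import math

def dft(x):
    """Naive O(N^2) forward DFT of a list of numbers."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
            for k in range(n)]

N = 16   # block length (illustrative)
A = 2    # frequency scale factor (illustrative)

# A single tone at bin 6, below the Nyquist bin N/2 = 8.
block = [math.cos(2 * math.pi * 6 * t / N) for t in range(N)]

# Naive resampling: keep every A-th sample, looping around the block,
# with no low-pass filter first.
naive = [block[(A * t) % N] for t in range(N)]

mags = [abs(c) for c in dft(naive)]
# Strongest bin among the non-negative frequencies 0..N/2.
peak_bin = max(range(N // 2 + 1), key=lambda k: mags[k])
print("tone was at bin 6, target bin", A * 6, "-> shows up at bin", peak_bin)
```

The tone folds back to a lower bin instead of disappearing.  Dropping
it before scaling (or low-pass filtering before resampling) is what
avoids that.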

The source of echoing is easy to understand -- it's due to the fact
that the inverse FT of the frequency-scaled signal is a periodic
signal, but the period has been shortened due to the scaling property,
so that the reconstructed signal is exactly equivalent to a
sped-up-and-looped-then-windowed version of the original block.  So
the echoes are just echoes, sort of :)
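
Here is a minimal sketch of that equivalence in plain Python, leaving
out the windowing step.  It assumes an integer scale factor and a
block whose content stays below Nyquist after scaling; `dft`, `idft`
and `scale_bins` are illustrative helpers, not from any library.  The
bins are simply moved, without the 1/a amplitude factor from the
scaling property, so the amplitude is unchanged.

```python
import cmath
import math

def dft(x):
    """Naive O(N^2) forward DFT of a list of numbers."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
            for k in range(n)]

def idft(spectrum):
    """Naive inverse DFT, returning the real part of each sample."""
    n = len(spectrum)
    return [(sum(spectrum[k] * cmath.exp(2j * cmath.pi * k * t / n)
                 for k in range(n)) / n).real for t in range(n)]

def scale_bins(spectrum, a):
    """Move the component at signed bin k to bin a*k; drop anything that
    would land at or above Nyquist."""
    n = len(spectrum)
    out = [0j] * n
    for i, value in enumerate(spectrum):
        k = i if i <= n // 2 else i - n   # signed frequency index
        if abs(a * k) < n / 2:
            out[(a * k) % n] += value
    return out

N = 16   # block length (illustrative)
A = 2    # integer frequency scale factor (illustrative)

# Two low-frequency components, at bins 1 and 2.
block = [math.sin(2 * math.pi * t / N) + 0.5 * math.cos(2 * math.pi * 2 * t / N)
         for t in range(N)]

scaled = idft(scale_bins(dft(block), A))

# The original block played at A times the speed and looped to fill the block.
sped_up_looped = [block[(A * t) % N] for t in range(N)]
max_error = max(abs(s - r) for s, r in zip(scaled, sped_up_looped))
print("max deviation from sped-up-and-looped block:", max_error)
```

With A = 2 the second half of `scaled` repeats the first half, which
is the "echo" within the block.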

> Bottom line, choose your block size cautiously, and you might get the right
> effect.

Can't disagree with that :) An ounce of experimenting is worth a pound
of speculation in this area...

-- 
Thatcher Ulrich <tu...@tu...>
==  Soul Ride -- pure snowboarding for the PC -- http://soulride.com
==  real-world terrain -- physics-based gameplay -- no view limits