Thread: [myhdl-list] Low resource FFT core in Myhdl, first python Modell
Brought to you by:
jandecaluwe
From: Norbo <Nor...@gm...> - 2012-11-26 15:53:31
|
The aim was to create a FFT core in Myhdl. The Number of FFT points should be adjustable with 2**x. The upper limit of x should be not limited by the algorithm but just by the memory needed. The following python script should be a good starting point. Note the code uses only multiplication, additions and the lookuptable (i am thinking about replacing the lookuptable by a recursive chebyshev cosinus generator so that the implementation itself can work quit inplace). It was written with the hardware implementation in mind, wherby the input Data can be supplied through a Embedded memory block. The memory bandwidth would then create the speedlimit. Since only one value would be read and written at a clock cycle. I am not sure if i will finish this. But every comment, critic, improvements are welcome. grettings Norbo ### Python FFT implementation (for a myhdl implementation reference) #### Author: Norbert Date:26.11.2012 ######################################################################### from pylab import fft,log2,exp,pi,arange xin=[-7,2,3,4,5,6,7,8,-7,2,7,4,5,2,-7,2] FFTlength=len(xin) log2_FFTlength=int(log2(FFTlength)) assert 2**log2_FFTlength==FFTlength ###data reordering ###### Orderlookup=[] for i in range(FFTlength): sst=list(bin(i)[2:].zfill(log2_FFTlength)) sst.reverse() val=int("".join(sst),2) Orderlookup.append(val) FFTdata=[xin[i] for i in Orderlookup] ######end data reordering ######## lookuptable=exp(-1j*2*pi*arange(0,FFTlength/2,1)/(FFTlength)) count=0 valx=2 sections=FFTlength #2**(log2_FFTlength-1) while valx<=FFTlength: sections=sections>>1 # FFTlength/valx for cur_sec in range(sections): for i in range(valx>>1): aww=FFTdata[i+valx*cur_sec] bww=FFTdata[i+(valx>>1)+valx*cur_sec]*lookuptable[i*sections] # exp(-1j*2*pi*i/valx) FFTdata[i+valx*cur_sec]=aww+bww FFTdata[i+(valx>>1)+valx*cur_sec]=aww-bww #count=count+1 valx=valx<<1 ################# check result ################## fftxin_reference=fft(xin) Error_allowed=0.0001 for ind,fftval in enumerate(FFTdata): print fftval,"=",fftxin_reference[ind] assert (abs(fftxin_reference[ind])*(1-Error_allowed))<abs(fftval)<(abs(fftxin_reference[ind])*(1+Error_allowed)) print "-"*50 |
From: Christopher F. <chr...@gm...> - 2012-11-26 16:40:06
|
On 11/26/2012 9:53 AM, Norbo wrote: > The aim was to create a FFT core in Myhdl. The Number of FFT points should > be adjustable with 2**x. > The upper limit of x should be not limited by the algorithm but just by > the memory needed. > The following python script should be a good starting point. Note the code > uses only > multiplication, additions and the lookuptable (i am thinking about > replacing the lookuptable by a recursive chebyshev cosinus generator so > that the implementation itself can work quit inplace). It was written with > the hardware implementation in mind, wherby the input Data can be supplied > through a Embedded memory block. The memory bandwidth would then create > the speedlimit. Since only one value would be read and written at a clock > cycle. > I am not sure if i will finish this. But every comment, critic, > improvements are welcome. > > grettings > Norbo > I have not reviewed your version yet, thought this might be helpful (compare/contrast). Here is another basic FFT which uses the recursive ability in MyHDL. http://www.myhdl.org/doku.php/users:cfelton:projects:recursivefft This version the required resources are similar to yours. It has been synthesize for an FPGA. Regards, Chris |
From: Norbo <Nor...@gm...> - 2012-11-28 19:41:01
|
> I have not reviewed your version yet, thought this might be helpful > (compare/contrast). Here is another basic FFT which uses the recursive > ability in MyHDL. > > http://www.myhdl.org/doku.php/users:cfelton:projects:recursivefft > > This version the required resources are similar to yours. It has been > synthesize for an FPGA. > > Regards, > Chris Yeah, i looked at this implementation before i started. If i got it correctly the basic radix-N elements are described with the aim to calculate everything combinatorically. Where my aim is to use at least (N*log(N)) clock cycles. And have only one basic radix-2 element as a center element were everything gets clocked through. And the data gets read and written from and to a RAM. greetings Norbo |
From: Christopher F. <chr...@gm...> - 2012-11-28 21:58:58
|
On 11/28/2012 1:40 PM, Norbo wrote: >> I have not reviewed your version yet, thought this might be helpful >> (compare/contrast). Here is another basic FFT which uses the recursive >> ability in MyHDL. >> >> http://www.myhdl.org/doku.php/users:cfelton:projects:recursivefft >> >> This version the required resources are similar to yours. It has been >> synthesize for an FPGA. >> >> Regards, >> Chris > > > Yeah, i looked at this implementation before i started. If i got it > correctly the basic radix-N elements are described with the aim to > calculate everything combinatorically. Where my aim is to use at least > (N*log(N)) clock cycles. And have only one basic radix-2 element as a > center element were everything gets clocked through. And the data gets > read and written from and to a RAM. Yes, my version is straight-forward and doesn't share resources or multiplex. It is pipelined though it is not one large combnitorial computation. But I see this page is incomplete, I don't have everything posted. Regards, Chris |
From: Norbo <Nor...@gm...> - 2012-12-27 11:44:10
Attachments:
myhdl_FFT.py
|
Hello, its done. -> Low resource FFT core in myhdl The file (a single file implementation, oh i like them) of the core is attached. Here are basically the specs of the core: # Description: This Unit does FFT in Hardware, with Fixpoint, it uses 2 multipliers only # + Input data bitwidth is adjustable, Twiddle bitwidth must be the same for now # + Calculation time: N*log2(N)+8 clock cycles , N...FFT length # + FFT length is setable to: 16,32,64,128,256, ... # + Does a complex FFT (real and imaginary values) # + With 18 Bit input data and 18 Bit twiddle factor the 128 point complex FFT has the following # resource usage on a Cyclon II with Quartus: 427 Logic Elements, 4501 Memory Bits, # 4 9Bit Multipliers== 2 18Bit Multipliers , runs up to 80Mhz on Cyclon II EP2C35 speedgrad 8 # + The FFT core read and writes from/to an external supplied RAM. The RAM needs a seperate read-address # and write-address.During operation the FFT core needs full read and write acces to the RAM. Every # clock cycle (when the piplene is filled) one Data is read and one is written to/from the RAM. # + No Scaling # + The Testbench shows the error of Fixpoint FFT vs. Floatingpoint FFT pylab reference implementation # for a given inputvector-Testdata, or if an overflow occours (myhdl built in). # +Simple performance evaluation, just add some simple python code to the testbench to check whether the # fixpoint implementation matches your accuracy needs. # + FFT length could be made adjustable during operation in Hardware with little effort. e.g for interpolation # Note: To do the iFFT, conjungate the data before the start of the FFT and after the FFT is finished # and dont forget the scaling All the best, New year and X-mas greetings Norbert Am 26.11.2012, 16:53 Uhr, schrieb Norbo <Nor...@gm...>: > The aim was to create a FFT core in Myhdl. The Number of FFT points > should > be adjustable with 2**x. > The upper limit of x should be not limited by the algorithm but just by > the memory needed. > The following python script should be a good starting point. Note the > code > uses only > multiplication, additions and the lookuptable (i am thinking about > replacing the lookuptable by a recursive chebyshev cosinus generator so > that the implementation itself can work quit inplace). It was written > with > the hardware implementation in mind, wherby the input Data can be > supplied > through a Embedded memory block. The memory bandwidth would then create > the speedlimit. Since only one value would be read and written at a clock > cycle. > I am not sure if i will finish this. But every comment, critic, > improvements are welcome. > > grettings > Norbo > > > ### Python FFT implementation (for a myhdl implementation reference) #### > Author: Norbert > Date:26.11.2012 > ######################################################################### > > from pylab import fft,log2,exp,pi,arange > xin=[-7,2,3,4,5,6,7,8,-7,2,7,4,5,2,-7,2] > > FFTlength=len(xin) > log2_FFTlength=int(log2(FFTlength)) > > assert 2**log2_FFTlength==FFTlength > > ###data reordering ###### > Orderlookup=[] > for i in range(FFTlength): > sst=list(bin(i)[2:].zfill(log2_FFTlength)) > sst.reverse() > val=int("".join(sst),2) > Orderlookup.append(val) > > FFTdata=[xin[i] for i in Orderlookup] > ######end data reordering ######## > > lookuptable=exp(-1j*2*pi*arange(0,FFTlength/2,1)/(FFTlength)) > count=0 > valx=2 > sections=FFTlength #2**(log2_FFTlength-1) > while valx<=FFTlength: > sections=sections>>1 # FFTlength/valx > > for cur_sec in range(sections): > for i in range(valx>>1): > aww=FFTdata[i+valx*cur_sec] > bww=FFTdata[i+(valx>>1)+valx*cur_sec]*lookuptable[i*sections] > # exp(-1j*2*pi*i/valx) > > FFTdata[i+valx*cur_sec]=aww+bww > FFTdata[i+(valx>>1)+valx*cur_sec]=aww-bww > #count=count+1 > valx=valx<<1 > > ################# check result ################## > fftxin_reference=fft(xin) > Error_allowed=0.0001 > for ind,fftval in enumerate(FFTdata): > print fftval,"=",fftxin_reference[ind] > assert > (abs(fftxin_reference[ind])*(1-Error_allowed))<abs(fftval)<(abs(fftxin_reference[ind])*(1+Error_allowed)) > print "-"*50 > > > ------------------------------------------------------------------------------ > Monitor your physical, virtual and cloud infrastructure from a single > web console. Get in-depth insight into apps, servers, databases, vmware, > SAP, cloud infrastructure, etc. Download 30-day Free Trial. > Pricing starts from $795 for 25 servers or applications! > http://p.sf.net/sfu/zoho_dev2dev_nov -- Erstellt mit Operas revolutionärem E-Mail-Modul: http://www.opera.com/mail/ |