![]() |
|
If you can't view the Datasheet, Please click here to try to view without PDF Reader . |
|
Datasheet File OCR Text: |
Microcontrollers ApNote additional file AP163402.EXE available AP1634 Fast - Fourier - Transformation The Fast Fourier Transformation (FFT) is an algorithm frequently used in various applications, like telecommunication, signal and image processing. It transforms the time domain into the frequency domain where the spectrum of amplitude and frequency can be analyzed. K. Westerholz / Siemens HL MCB AT Semiconductor Group 03.97, Rel. 01 Fast - Fourier - Transformation 1 2 3 4 5 6 Abstract ................................................................................................................... 3 FFT - Derivation of the algorithm .......................................................................... 3 Implementation ....................................................................................................... 7 Flow chart of 1024-point real valued FFT ........................................................... 11 Flowchart of the Midloop .................................................................................... 12 Register Use and content..................................................................................... 13 AP1634 ApNote - Revision History Actual Revision : Rel.01 Previous Revison: none Page of Page of Subjects changes since last release) actual Rel. prev. Rel. Semiconductor Group 2 of 13 AP1634 03.97 Fast - Fourier - Transformation 1 Abstract The Fast Fourier Transformation (FFT) is an algorithm frequently used in various applications, like telecommunication, signal and image processing. It transforms the time domain into the frequency domain where the spectrum of amplitude and frequency can be analyzed. This application note describes an implementation of a real-valued 1024 point decimation in time radix-2 FFT for the C166 microcontroller family. Assuming that the code is started out of the internal ROM via a 16-bit demultiplexed bus, an execution time of 10 ms has been achieved for a C165 running at 25 MHz internal clock. The code comprises 828 bytes. This application note is based on an application note performed by pls (Programmierbare Logik Systeme, Hoyerswerda, Germany). 2 FFT - Derivation of the algorithm Starting from the continuous time Fourier Transformation, the discrete Fourier Transformation can be derived as a function of sample points (N): X ( k ) = WNnk x ( n ) n= 0 N -1 k=0,1,...,N-1 2 2 nk WN = e - j 2 / N nk = cos nk - j sin nk N N nk This formula can be regarded as a matrix WN multiplied by the input data vector x(n). This nk simple DFT has the complexity N2. The coefficients of the matrix WN will be denoted in the following as twiddle factors. In order to derive the radix-2 FFT algorithm, we decompose the transformation into two partial transformations, one containing the input data with even indices, and the other with k k k odd. Exploiting symmetries WN = - WN +( N / 2 ) and WN = WNN + k of the twiddle factors a decomposition is achieved that only requires ( 1 N 2 + 1 N ) multiplications. 2 2 X (k ) = N / 2 -1 n=0 x ( 2 n )W k N nk N /2 +W N / 2 -1 n=0 k N x(2n + 1)W nk N /2 = P( k ) + W Q(k ) X ( k + N / 2) = N / 2 -1 n=0 x ( 2 n )W k N nk N/2 -W N / 2 -1 n= 0 k N x(2n + 1)W nk N/2 0 k (N/2)-1 = P(k ) - W Q( k ) Semiconductor Group 3 of 13 AP1634 03.97 Fast - Fourier - Transformation The number of multiplications can be further cut to (N log N) by repetitive employing this process to the partial transformations P(k) and Q(k) till they are completely decomposed into 2 point DFTs (Discrete Fourier Transformations). The 2-point DFT is commonly referred to as the butterfly operation. The butterfly requires two complex multiplications resulting in four real valued multiplications for computing the terms P(k) and WNk Q(k). Pn+1 = Pn + QnW Nk n=stage Qn+1 = Pn - QnW Nk since W Nk = e-j(2/N)k = cos(u) - j sin(u) and u = (2/N)k Re(Pn) Im(Pn) 1 Re(Pn+1) Im(Pn+1) Re(Qn) Im(Qn) WN k Re(Qn+1) Im(Qn+1) Having only real valued input data x(n), the computational effort of a N point FFT can be reduced to a N/2 point complex FFT. Firstly, even indexed data h(n) = x (2 n) and odd indexed data g(n) = x(2n + 1) are separated. The index k is running from 0 to N-1. h(n) and g(n) have the spectra H(k) and G(k) respectively. The spectrum X(k) can be decomposed into the spectra H(k) and G(k) as follows: x (n) Fourier Transformation X (k ) = H (k ) + j (cos 2Nk - j sin 2Nk )G( k ) In order to cut the above N point transformation into an N/2 point transformation a complex input vector y(n) = h(n) + jg(n) is formed with the index n running from 0 to N/2-1. The real input values are formed by the even indexed input data h(n). The imaginary part is formed by the odd indexed input data g(n). Then y(n) is transformed into the frequency domain resulting in a spectrum consisting of a superposition of the spectra H(k) and G(k). y(n) = h(n) + jg(n) Fourier Transformation Y (k ) = H (k ) + jG( k ) = R(k ) + jI (k ) Now the complex spectra H(k) and G(k) have to be extracted out of the complex spectrum Y (k ) = H (k ) + jG(k ) = R(k ) + jI (k ) . By employing symmetry relations, the spectra H(k) and G(k) can be derived from the spectrum Y(n) as follows: Re{H (k )} = R( k ) + R( N / 2 - k ) 2 I (k ) + I ( N / 2 - k ) Re{G(k )} = 2 I (k ) - I ( N / 2 - k ) 2 R(k ) - R( N / 2 - k ) Im{G( k )} = 2 Im{H (k )} = These spectra inserted into the equation for X(n) deliver the full spectrum. Semiconductor Group 4 of 13 AP1634 03.97 Fast - Fourier - Transformation In order to illuminate the FFT algorithm, the example given below shall demonstrate the computational process during an 8-point FFT. The input data consists of 8 complex numbers X0, ... X7 which can also be perceived as 16 real numbers x(0)...x(15). Input X0 X1 X2 X3 X4 X5 X6 X7 X(0) X(1) X(2) X(3) X(4) X(5) X(6) X(7) X(8) X(9) X(10 ) X(11 ) X(12 ) X(13 ) X(14 ) X(15 ) Stage 1 Stage 2 Stage 3 Output X(0) X(1) 1 1 1 1 W W W W 0 1 1 W W 0 1 W 0 X0 X4 X2 X6 X1 X5 X3 X7 X(8) X(9) X(4) X(5) 1 W 2 0 X(12 ) X(13 ) X(2) X(3) 1 1 W W 2 1 W 1 0 X(10 ) X(11 ) X(6) X(7) 0 1 W 3 0 2 X(14 ) X(15 ) Semiconductor Group 5 of 13 AP1634 03.97 Fast - Fourier - Transformation Corresponding to the above butterfly operation, the input data are connected with each other. Due to the nature of this operation the input data is sequentially ordered, while the output data is in bitreversed order. In each stage four (=N/2) butterflies are computed. The twiddle factor W in the 1st stage is always W 80 = 1. In the 2nd stage the upper half consists again of the butterflies from the previous stage. The second half of the butterflies 2 are computed with twiddle factor W 8 . In the third stage the twiddle factors of the previous stage appear again in the upper half and the lower half contains the factors W81 and W83. For an 8-point-FFT the twiddle factors W Nk = e-j(2/N)k = cos(X) - j sin(X) have the following values. Twiddle factor W 80 W 81 = e-j(/4) W 82 = e-j(/2) W 83 = e-j(3/4) W 84 = e-j = - W80 W 85 = e-j(5/4) = W 81 W 86 = e-j(3/2) = W 82 W 87 = e-j(7/4) = W83 cos(X) 1 1/22 0 -1/22 -1 1/22 0 -1/22 sin(X) 0 1/22 1 1/22 0 1/22 1 W8 5 Im(W) 3 2 1 W8 W8 W8 W8 4 W8 0 Re(W) W8 6 W8 7 1/22 Because of the symmetry of the twiddle factor only the first four (W80 ,..., W 83) must be computed. These four twiddle factors lead to degenerated butterflies. They can be used to precompute the butterflies in the first three stages of the FFT to reduce the number of 0 1 2 3 multiplications. For the twiddle factors W N , W N , W N and W N the butterfly computation can be simplified. For WNk = WN0 = 1 we have cos(X) = 1 ; sin(X) = 0. This results in a butterfly without any multiplications: Pn+1 Qn+1 = [Re(Pn) + Re(Qn) + j [Im(Pn) + Im(Qn)] = [Re(Pn) - Re(Qn)] + j [Im(Pn) - Im(Qn)] Semiconductor Group 6 of 13 AP1634 03.97 Fast - Fourier - Transformation For WNk = WN1 = 1/22(1 - j) we get cos(X) = sin(X) = 1/22. This results in a butterfly for the third stage where only two multiplications must be executed: Pn+1 = [Re(Pn) + Re(Qn) cos(X) + Im(Qn) cos(X)] + j [Im(Pn) + Im(Qn) cos(X) Re(Qn) cos(X)] Qn+1 = [Re(Pn) - Re(Qn) cos(X) - Im(Qn) cos(X)] + j [Im(Pn) - Im(Qn) cos(X) + Re(Qn) cos(X)] For WNk = WN = -j we have cos(X) = 0 ; sin(X) = 1. This results in a butterfly without any multiplications used in the second and third stage: 2 Pn+1 Qn+1 = [Re(Pn) + Im(Qn)] + j [Im(Pn) - Re(Qn)] = [Re(Pn) - Im(Qn)] + j [Im(Pn) + Re(Qn)] 3 For WNk = WN = 1/22(-1 - j) we get -cos(X) = sin(X) = 1/22. This results in a butterfly for the third stage where only two multiplications must be executed: Pn+1 = [Re(Pn) + Re(Qn) cos(X) - Im(Qn) cos(X)] + j [Im(Pn) + Im(Qn) cos(X) + Re(Qn) cos(X)] Qn+1 = [Re(Pn) - Re(Qn) cos(X) + Im(Qn) cos(X)] + j [Im(Pn) - Im(Qn) cos(X) Re(Qn) cos(X)] The output of the decimation in time FFT shows a bitreversed order that has to be ordered to calculate the final frequency spectrum. Supposing the input data has been in a sequential order, the indices of the output data can be easily computed by bit reversing the binary presentation of the input indices. The table below gives an example of the bit reversal for an 8 point FFT. Order of Input data Order of output data bitreversed index binary binary index 0 000 000 0 1 001 100 4 2 010 010 2 3 011 110 6 4 100 001 1 5 101 101 5 6 110 011 3 7 111 111 7 Semiconductor Group 7 of 13 AP1634 03.97 Fast - Fourier - Transformation 3 Implementation For the implementation we assume real valued input data. Based on this assumption, a real valued 1024 point FFT can be reduced to a 512 point complex FFT followed by an unweave phase in order to compute the 1024 point spectrum. The 512 point complex FFT consists of log2 (512) = 9 stages and each stage calculates 512/2 = 256 butterflies. The twiddle factors in the first three stages are the same as for the 8 point complex FFT. The input data is stored in the table FFT_IN which consists of 1024 16 bit words. Since we perform an in-place FFT, the input data field will be overwritten by the output data. However, the final output will be stored in the separate FFTOUT output field. For providing the trigonomic functions, a table (FFT_DAT) is stored encompassing the precomputed sinus and cosines values. Since cos(x) = sin(x+/4) the table consists of 1/4 sinus period and 1/2 cosines period. Together they form a 3/4 sinus period. To rearrange the bitreversed output of the complex FFT and to calculate the twiddle factor W k, a bitreversal table (FFT_BR) has also been precomputed. The input data and the trigonometric function table are represented by a 15 bit fixed-point fraction in two's complement. This means that the MSB of such a number represents the sign, followed by the 15 bits representing a fraction of one. examples: binary 0111 1111 1111 1111 0110 0000 0000 0000 1010 0000 0000 0000 1000 0000 0000 0000 hex 7FFF 6000 A000 8000 dez 32767 24576 -24576 -32768 value +1 + 0.75 - 0.75 -1 s , sig n b1 b2 b3 b4 ... 2-1 2-2 ... b15 2-15 Since the addition of two numbers having the same sign might cause an overflow, numbers have to be divided by 2 before adding them. This is done by an arithmetic shift right. When multiplying two 15-bit-numbers, note that the signs of both are multiplied too, and the result is stored in the 32-bit wide multiplication register. s 15 bit x s 15 bit = s s 30 bit Therefore the result is equal to a scaled multiplication that means that it consists of the multiplication and a subsequent arithmetic shift right as a side effect. Because 32 bit precision is not required, only the first 16 bits contained in the MDH register are used for further calculations. Attached you will find a program flow chart. The first part of the program consists of a 512point complex radix-2 FFT which is executed in nine stages. To avoid multiplications by Semiconductor Group 8 of 13 AP1634 03.97 Fast - Fourier - Transformation means of the degenerated butterflies, 5 different Mid- and Inloops are implemented. The idea is to cut the amount of calculation needed by using the degenerated butterflies in all stages. This is very effective in the first three stages, but time savings decreases in the rear stages. Regarding the 512 complex point FFT, the number of twiddle factors amounts 512. However, due to the symmetry only the first 256 (0..255) are used. In the program source the twiddle factors denoted as W0, W4, W8, W12 refer to the angles 0, 90, 45 and 135. Thus, W80 corresponds to W0, W82 to W4, W81 to W8, and W83 to W12. Stage 1:Since in stage one all twiddle factors are W0, all 256 butterflies are performed in Inloop_0. Stage 2: Stage 3: 128 butterflies with W0 in Inloop_0 128 butterflies with W4 in Inloop_1 64 butterflies with W0 in Inloop_0 64 butterflies with W4 in Inloop_1 64 butterflies with W8 in Inloop_2 64 butterflies with W12 in Inloop_3 32 butterflies with W0 in Inloop_0 32 butterflies with W4 in Inloop_1 32 butterflies with W8 in Inloop_2 32 butterflies with W12 in Inloop_3 128 butterflies with Wk in Inloop Stage 4: The second part of the program unweaves the bitreversed output of the 512-point FFT to extract the 1024 point real valued FFT. Optional, the final stage of the algorithm calculates an amplitude spectrum. On the last page you will find the register use during program execution. Assuming that the code is started out of the internal ROM, the input data is stored in the external RAM and accessed via a 16-bit demultiplexed bus without wait states (Syscon: ROMEN=1, Buscon: MTTC=1, MCTC = 1111, BTYP = 10) an execution time of 10 ms for a real valued 1024-point FFT is achieved for the C165 running at 25 Mhz internally. The code size amounts 828 bytes without data tables. The optional computation of the amplitude spectrum consumes additional 2 ms. The execution time is independent from the input data. Changing the number of sample points N_, the number of stages exp, and reducing the tables, the algorithm can be tailored for various resolutions. In the table below you will find execution times for different numbers of sample points. 64-Points SAB-C165 (25 MHz) 0,56 ms SAB-C167CR (20 0,7 ms MHz) Semiconductor Group 256- Points 2,6 ms 3,3 ms 1024- Points 10,4 ms 13 ms 9 of 13 AP1634 03.97 Fast - Fourier - Transformation This figures demonstrate that the C166 architecture is even superior to some signal processors. This result is founded on a multiplication execution time of 400ns and the RISC like register file. Using a more complex radix-4 FFT algorithm combining two butterflies a further run time saving can be expected. Keeping the input and output tables in the internal RAM an additional speed up can be achieved for the 64 and 256 point FFT. If only parts of the spectrum have to be analyzed, a partial FFT can be performed [7] by omitting all calculations not contributing to the frequency window. Literature [1] Application Note AP-275 "An Algorithm for MCS-96 Products including Supporting Routines and Examples", intel September 1986 [2] MS-C-Program [3] TMS32010 High Performance 16/32-Bit Digital Signal Processor Product Description Texas Instruments 1985 [4] Digital Signal Processing Applications with the TMS320 Family Texas Instruments 1988 [5] TMS320C25 Digital Signal Processor Product Description Texas Instruments 1986 [6] S.K. Mitra and J.F. Kaiser, Handbook for Digital Signal Processing, John Wiley & Sons, 1993 [7] Maurice Bellanger, Digital Processing of Signals, John Wiley & Sons, 1989 Semiconductor Group 10 of 13 AP1634 03.97 Fast - Fourier - Transformation Flow chart of 1024-point real valued FFT Start Outloop for the stages 1 - 9 Midloop_0 Inloop_0: compute butterflies with twiddlefactor W0 yes all elements processed no Midloop_1 Inloop_1: compute butterflies with twiddlefactor W4 yes all elements processed no Midloop_2 Inloop_2: compute butterflies with twiddlefactor W8 Midloop_3 Inloop_3: compute butterflies with twiddlefactor W12 yes all elements processed no Midloop Inloop: computation of remaining butterflies no all elements processed yes END_Midloop 9th stage executed yes no unweave complex FFT to get the result of the real valued FFT bitreversal of FFT_IN and generate butterflies yes calculate absolute values no last element processed yes no Absolute value "Absolute" (calculate absolute values and write results to FFT_OUT) End Semiconductor Group 11 of 13 AP1634 03.97 Fast - Fourier - Transformation Flowchart of the Midloop Midloop Init counter of Inloop set input pointers to index Determination of Twiddle-factor Inloop Read input values Re(Pm), Im(P m), Re(Q m), Im(Q m) from FFT_IN Calculate Butterfly Pm+1 = PR + QRcos(X) + QIsin(X) + j [PI + QIcos(X) - QRsin(X)] Qm+1 = PR - QRcos(X) + QIsin(X) + j [PI - QIcos(X) - QRsin(X)] Write output values Re(Pm+1 ), Im(P m+1 ), Re(Q m+1 ), Im(Q m+1 ) to FFT_IN increment index pointers decrement counter in_loop counter in_loop > 0 yes increment counter mid_loop no yes End Midloop counter mid_loop < 2040 no modify help counters decrement counter out_loop no 9th stage executed ? yes Semiconductor Group 12 of 13 AP1634 03.97 Fast - Fourier - Transformation Register Use and content Stage 1 to 9: R0 R1 R2 R3 R4 R5 R6 R7 R8 R9 R1 0 R1 1 R1 2 R1 3 R1 4 R1 5 pP pQ BF BF BF BF BF BF 256...0 256 pP pQ BF BF BF BF BF BF 128...0 128 Stage1 9 1024 2048 0 , N*24 Stage2 8 512 1024 0...N* 2-4 Stage3 7 256 512 0...N* 2-4 Twid/B F pP pQ cos[k] BF BF BF BF BF 64...0 64 p_FFT _In Stage4 6 128 256 0...N* 2-4 Twid/B F pP pQ cos[k] sin[k] BF BF BF BF 32...0 32 p_FFT _In Stage5 5 64 128 0...N* 2-4 Twid/B F pP pQ cos[k] sin[k] BF BF BF BF 16...0 16 p_FFT _In Stage6 4 32 64 0...N* 2-4 Twid/B F pP pQ cos[k] sin[k] BF BF BF BF 8...0 8 p_FFT _In Stage7 3 16 32 0...N* 2-4 Twid/B F pP pQ cos[k] sin[k] BF BF BF BF 4...0 4 p_FFT _In Stage8 2 8 16 0...N* 2-4 Twid/B F pP pQ cos[k] sin[k] BF BF BF BF 2...0 2 p_FFT _In Stage9 1 4 8 0...N* 2-4 Twid/B F pP pQ cos[k] sin[k] BF BF BF BF 1, 0 1 p_FFT _In p_FFT_In Inloop Note Outloop counter Midloop pP_FFT_IN pQ_FFT_IN p_FFT_I p_FFT n _In N = number of samples (1024); BF = butterfly; Twid = twiddle factor; pP= pointer onto element Pn; pQ= pointer onto element Qn Stage 10 and procedure Absolute": Stage 10 R0 R1 R2 R3 R4 R5 R6 R7 Input a temp sin[k] Input b temp Absolute Input (Low) Input (High) Output temp temp temp temp temp R8 R9 R10 R11 R12 R13 R14 R15 Stage10 cos[k] Input c Input d 0,2,4,6,8,...,10 22 FFT_OUT Absolute Semiconductor Group 13 of 13 AP1634 03.97 |
Price & Availability of AP163402
![]() |
|
|
All Rights Reserved © IC-ON-LINE 2003 - 2022 |
[Add Bookmark] [Contact Us] [Link exchange] [Privacy policy] |
Mirror Sites : [www.datasheet.hk]
[www.maxim4u.com] [www.ic-on-line.cn]
[www.ic-on-line.com] [www.ic-on-line.net]
[www.alldatasheet.com.cn]
[www.gdcy.com]
[www.gdcy.net] |