# A Novel Design of High Speed Memory-Based FFT Processor Using Adaptive Technique

#### Tadikonda Prasad Babu and Dr. Rahul Mishra

Department of Electronics & Communication Engineering, Dr. A.P.J. Abdul Kalam University, Indore (M.P.) - 452010, India Corresponding Author Email : tadikondaprasadbabu@gmail.com

Article Info Page Number: 812 - 819 Publication Issue: Vol 70 No. 2 (2021)

#### Abstract

In this paper a novel design of high speed memory based FFT processor using adaptive technique is implemented. FFT is mainly used in the applications of high speed. Initially, input index vector generator will take inputs and transfer the inputs to scaling unit controls the scaling operations and transfer those bits to the BAGU unit. Memory bank address and address generation unit will generate the address to obtain data and saved in the memory-1 block. Similarly same operation is performed and saved that data in memory-2 block. All these data will be computed using  $p \times p$  compotator. The computed data is multiplied using modular multiplication unit. The obtained multiplied data is transferred to the BFP unit which is nothing but block floating point. From results it can observe that proposed system gives effective results in terms of delay and area.

Article History Article Received: 05 September 2021 Revised: 09 October 2021 Accepted: 22 November 2021 Publication: 26 December 2021

**Keywords:** BAGU (Memory Bank Address and Address Generation Unit), SU (Scaling Unit), BFP (Block Floating Point), FFT (Fast Fourier Transform).

### I. INTRODUCTION

Now days, High Speed computer architectures are very needed for multimedia applications. In Arithmetic logic unit, multipliers are very important functional blocks. For high performance, the considerable factor is Fast Multiplication. The fast multipliers are essential in advance electronic systems, fast multipliers required in Advanced Electronic Systems, which requires high speed calculations there are digital signal processing, microprocessors and so on.. It has more demand of high speed performance used for applications of signal processing ,andit increasing more DSP systems Started utilizes the high performance multiplication unit for implementing methods are convolution, frequency analysis and filtering time.

In the present generation the advanced computing applications, communication applications, power utilization applications has been more popular. They mainly depend on the size, cost and chip quality of the circuit. Generally, in integrated chip, power consumption plays very important role. Basically, in communication systems like error correction codes and cryptography, finite field is most widely used. Arithmetic operations are performed using the field elements. Two basis are

normally used to implement a system that is normal basis and polynomial basis. Normal basis is used to implement the hardware and perform the low cost squaring operations.

In the same way, polynomial basis is used to implement the software and in the same way this also performs the low cost squaring operations. The energy utilization in a computerized CMOS (Complex Metal Oxide Semiconductor) circuit comprises of dynamic force consumption, static force utilization and short circuits power utilization. The dynamic force utilization is typically from the dynamic force, which is utilized in charging hub capacitances.

Scaling of transistor geometries have led to integration of millions of devices in a very small space, thus driving realization of complex applications on hardware and supporting high speed applications. This energy has revolutionized not only electronics, but also industry. In order to reduce power, many researchers, designers and engineers have come up with many innovative techniques and have used their ideas.

However, designers will need to budget and plan for power dissipation as a factor nearly as important as performance and perhaps more important than area. Low power techniques have been successfully adopted and implemented in designing complex VLSI circuits. As the demand for faster, low cost and reliable products that operate on remote power source performing high end applications keep increasing there is always a need for new low power design techniques for VLSI.

In digital signal processing multiplier plays very important role in this present day generation. Not only in DSP but also in various applications, multiplier is most widely used.Multiplication is a fundamental math activity for normal Digital Signal Processing (DSP) applications, for example, Fast Fourier transforms (FFT). To accomplish high execution speed, equal cluster multipliers are broadly utilized. However, these multipliers introduces more force. Force utilization has become a basic worry in the present VLSI framework plan. Subsequently the architects are expected to think power proficient multipliers for the plan of low-power DSP frameworks.

# **II. LITERATURE SURVEY**

Fahad Qureshi et al. [1], introduces the reconfiguration of mixed radix core FFT processor. Based on radix-3 wingorad Fourier transform introduced architecture is designed. The general multiplier is utilized for the constant multiplication process. Some type of additional multiplexers is utilized by the FFT to reduce the complexity. All FFT sizes are factorized into 2,4,4 point systems. The introduced architecture will improve the accuracy in effective way.

Fahad Qureshi et al. [2], introduce the FFT which is based on the building blocks by computing the address of elements. Radix-3, radix-5 butterflies are utilized to process the elements for computation. Radix-2/3/4/5 FFT algorithms are utilized to process the reconfigurable elements. Wingorad Fourier transform algorithm is used to process the elements. Instead of using general complex valued multiplier, constant multiplier is utilized for the process of multiplication. In both memory and pipelined FFT architectures, the processing elements are utilized. Hence this will reduce the hardware cost for processing the elements.

Shashidhara. K. S., et al. [3],introduces the both FFT and IFFT concepts. The both FFT and IFFT are depending on the OFDM systems. HereOFDM is nothing but orthogonal frequency division multiplexing systems. Wireless systems are utilized based on the architecture of physical layer. Because of this there will be consumption of low power. Hence in this paper implementation of FFT is done which gives low power and low area. The operations of multiplication and addition are performed in FFT. Because of this there will be reduction of complexity. Therefore the introduced FFT architecture will minimize the intermediate stages decompose the reusable techniques.

Re-usability is identified by using the common sub expression. Because of this there will be reduction in area and power. Depend on the multipliers the realization of design is performed. This multiplier is based on the arithmetic and logical operations. By using the Xilinx technology the FFT architecture is synthesized.

Lekshmi Viswanath et al [4], has introduced the FFT architecture which is reversible in nature. Basically, these are utilized in the applications of cryptography, digital signal processing, computer graphics and communication. Computations are performed effectively because of reversibility and this plays major role in entire operation.

To reduce the area and power factors the concept of reversibility FFT is introduced. Power dissipation is reduced by the reversible logic which is based on classical circuits. Loss of information is prevented by using the reversible logic. 16 bit BCD technique is utilized by the reversible logic. This BCD operation consists of five logical operations and three arithmetic operations. Arithmetic operations consist of addition, subtraction, multiplication and division. By using reversible gates the FFT architecture is implemented. Hence this will reduce the area and power in very effective way.

Akanksha Dixit et al [5], Reversible logic have received great attention in the recent years due to its ability to reduce the power dissipation which is the main requirement in low power digital design. It has wide applications in advanced computing, low power design, Optical information processing. Conventional digital circuits dissipate a significant amount of energy because bits of information are erased during the logic operations. Thus, if logic gates are designed such that the information bits are not destroyed, the power consumption can be reduced dramatically.

The information bits are not lost in case of a reversible computation. This has led to the development of reversible gates. BCD is a fundamental building block of a Central Processing Unit (CPU) in any computing system; reversible arithmetic unit has a high power optimization on the offer. By using suitable control logic to one of the input variables of parallel adder, various arithmetic operations can be realized.

### **III. MEMORY-BASED FFT PROCESSOR USING ADAPTIVE TECHNIQUE**

The below figure (1) shows the architecture of memory based FFT processor using adaptive technique. The entire system is divided into following parts they are, input index vector generator, Scaling Unit, BAGU (Memory bank address and address generation unit), output index vector generator, and three memory banks. The commutators there are some pre-stored twiddle factors, and FFT size parameters.

#### Mathematical Statistician and Engineering Applications ISSN: 2094-0343



### Fig. 1: Architecture of Memory Based Fft Processor Using Adaptive Technique

Initially, input index vector generator will take inputs and transfer the inputs to scaling unit controls the scaling operations and transfer those bits to the BAGU unit. Memory bank address and address generation unit will generate the address to obtain data and saved in the memory-1 block. Similarly same operation is performed and saved that data in memory-2 block. All these data will be computed using  $p \times p$  computator. The computed data is multiplied using modular multiplication unit. The obtained multiplied data is transferred to the BFP unit which is nothing but block floating point.

Based on the computation address generator the concurrent data is obtained for each cycle. Computation address generator block will store the intermediate results. All symbols are operated in ping pong mode which is continuous in nature. Input sampling is performed at the output side memory. By using 3 memory groups the computed data is utilized and it is placed at output side. Based on this place strategy the memory groups are utilized.

Memory bank address and address generation unit will save the address and generate the address for saved data. Efficient routing mechanism is provided between the memory block and commutators block. Address generator will control the computed data. In the register files the parameters like FFT size and twiddle factors are saved. Working modes are utilized in FFT for the process of configuration. Scaling operations are performed by reducing the memory usage and signal to quantization noise.

By using continuous flow mode the FFT is operated. This is based on the in place strategy and consists of three memory banks. In stage 0 only on modular multiplier unit will be active and at other stages two radix based modular multiplier stages are activated. Based on number of clock cycles the computation is completed. The both input and output index vector generated is merged into one at output side. For input side the binary representation of index is utilized in forward manner and for output side the binary representation of index is utilized in reverse manner.

Input data is distributed by using the index vector generator. This is obtained because of memory positions of input data are placed properly. Again the output data is reordered by using the natural sequence. By using natural order the index counter will be counted. Based on mapping the prime

Mathematical Statistician and Engineering Applications ISSN: 2094-0343

factors of input index are obtained. Prime factors obtained in different way after mapping procedure. Trivial factors are utilized by the prime factors by decomposing the large portions. At last trivial factors are obtained in the memory bank and address.

### **IV. RESULTS**

The below figure (2) shows the RTL schematic of memory based FFT processor using adaptive technique. The RTL schematic is the combination of both inputs and outputs.



### Fig. 2: RTL Schematic

The below figure (3) shows the input waveform of memory based FFT processor using adaptive technique.

|                                                    |                                         |                                         |                                         |                                         |                                         |                                         |                                         | 2,000,000 ps |       |
|----------------------------------------------------|-----------------------------------------|-----------------------------------------|-----------------------------------------|-----------------------------------------|-----------------------------------------|-----------------------------------------|-----------------------------------------|--------------|-------|
| Name                                               | Value                                   | 1,999,994 ps                            | 1,999,995 ps                            | 1,999,996 ps                            | 1,999,997 ps                            | 1,999,998 ps                            | 1,999,999 ps                            | 2,000,000 ps | 2,000 |
| ▶ <table-of-contents> aa[15:0]</table-of-contents> | 000000000000000                         |                                         |                                         | 0000000000                              |                                         |                                         |                                         |              | Т     |
| ▶ <table-of-contents> a[7:0]</table-of-contents>   | 00000010                                |                                         |                                         | 000000                                  | 10                                      |                                         |                                         |              |       |
| ▶ <table-of-contents> b(7:0)</table-of-contents>   | 00000001                                |                                         |                                         | 000000                                  | 01                                      |                                         |                                         |              |       |
| ▶ 👹 p(1120)                                        | 000000000000000                         | 100000000000000000000000000000000000000 | 010101000000000000000000000000000000000 | 000000000000000000000000000000000000000 | 000000000000000000                      | 0000000 1000000                         | 000000000000000000000000000000000000000 |              |       |
| 🕨 👹 g(1120)                                        | 00000000000000                          |                                         | 010000000000000000000000000000000000000 | 000000000000000000000000000000000000000 | 00000000000000000                       | 000000000000000000000000000000000000000 | 000000000000000000000000000000000000000 |              |       |
| ▶ 👹 ср[224:0]                                      | 00000000000000                          |                                         | 010101000000000000000000000000000000000 | 000000000000000000000000000000000000000 | 00000000000000000                       | 000000000000000000000000000000000000000 | 000000000000000000000000000000000000000 |              |       |
| ▶ 👹 cg[224:0]                                      | 00000000000000                          |                                         | 01000000000000000                       | 000000000000000000000000000000000000000 | 00000000000000000                       | 000000000000000000000000000000000000000 | 000000000000000000000000000000000000000 |              |       |
| 🕨 👹 d(15:0)                                        | 000000000000000                         |                                         |                                         | 000000000                               | 000010                                  |                                         |                                         |              |       |
| 🕨 👹 e(150)                                         | 000000000000000                         |                                         |                                         | 0000000000                              | 100001                                  |                                         |                                         |              |       |
| 🕨 👹 f(1.5.0)                                       | 0000000000000000                        |                                         |                                         | 0000000000                              | 0000                                    |                                         |                                         |              |       |
| 🕨 👹 h(15.0)                                        | 0000000000000000                        |                                         |                                         | 000000000                               | 0000                                    |                                         |                                         |              |       |
| 🕨 👹 k(15:0)                                        | 000000000000000                         |                                         |                                         | 000000000                               | 0000                                    |                                         |                                         |              |       |
| I[15:0]                                            | 0000000000000000                        |                                         |                                         | 0000000000                              | 0000                                    |                                         |                                         |              |       |
| 🕨 🔣 m(15x0)                                        | 0000000000000000                        |                                         |                                         | 0000000000                              | 0000                                    |                                         |                                         |              |       |
| 🕨 👹 n(15.0)                                        | 00000000000000000                       |                                         |                                         | 000000000                               | 0000                                    |                                         |                                         |              |       |
| 🕨 👹 i(64:0)                                        | 000000000000000000000000000000000000000 | U                                       | 00000000000000000                       | 000000000000000000000000000000000000000 | 000000000000000000000000000000000000000 | 00000000000                             |                                         |              |       |
| ▶ 1640]                                            | 00000000000000                          | 00                                      | 010100000000000000000000000000000000000 | 101010000000000000000000000000000000000 | 0000000000000000                        | 000000000010                            |                                         |              |       |
|                                                    |                                         | X1: 2,000,000 ps                        |                                         |                                         |                                         |                                         |                                         |              |       |

Fig. 3: Input Waveform

The below figure (4) shows the output waveform of memory based FFT processor using adaptive technique.



# Fig. 4: Output Waveform

The below table (1) shows the comparison of existed system and proposed system. Total delay, logic delay, route delay and memory used parameters are compared with existed system and proposed system.

| S.NO | PARAMETERS  | EXISTED           | PROPOSED         |
|------|-------------|-------------------|------------------|
|      |             | SYSTEM            | SYSTEM           |
| 1    | Total Delay | 28.855 ns         | 11.658 ns        |
| 2    | Logic Delay | 15.903 ns         | 0.982 ns         |
| 3    | Route Delay | 12.952 ns         | 10.676 ns        |
| 4    | Memory Used | 316384 K<br>bytes | 305196K<br>bytes |
|      |             |                   |                  |

The below figure (5) shows the comparison of delay for both existed and proposed system. Basically, total delay is classified into two types logic delay and route delay. Compared with existed system, proposed system reduces the delay in very effective way.



Mathematical Statistician and Engineering Applications ISSN: 2094-0343

Fig. 5: Comparison of Delay

The below figure (6) shows the comparison of memory usage. Compared with existed system, proposed system reduces the usage of memory in very effective way.



Fig. 6: Comparison of Memory Usage

### **V. CONCLUSION**

Hence in this paper a novel design of high speed memory based FFT processor using adaptive technique was implemented. The entire memory based FFT processor is simulated in Xilinx technology. Simulation results shows that compared to existed system, proposed system gives effective results in terms of total delay, logic delay, route delay and memory used.

### REFERENCES

- [1] Fahad Qureshi, Jarmo Takala, Anastasia Volkova, Thibault Hilaire, "Multiplierless Unified Architecture for Mixed Radix-2/3/4 FFTs", 2017 25th European Signal Processing Conference (EUSIPCO), IEEE 2017.
- [2] FahadQureshi, Muazam Ali, and Jarmo Takala, "Multiplierless Reconfigurable Processing Elementfor Mixed Radix-2/3/4/5 FFTs", 978-1-5386-0446- 5/17/\$31.00 ©2017 IEEE.
- [3] Ms. A. Anjana and Mrs. A.V Ananthalakshmi, "Design of Reversible 32-Bit BCD Add-Subtract Unit using Parallel Pipelined Method", International Conference on Advances in Electrical, Electronics, Information, Communication and Bio-Informatics (AEEICB16) 978-1- 4673-9745-2 ©2016 IEEE.
- [4] Matthew Morrison and Nagarajan Ranganathan, "Design of a Reversible ALU based on Novel Programmable Reversible Logic Gate Structures", 2015 IEEE Computer Society Annual Symposium on VLSI.

- [5] Lekshmi Viswanath and Ponni. M, "Design and Analysis of 16 Bit Reversible ALU", ISSN: 2278-0661 Volume 1, Issue 1 (May-June 2014), PP 46-53.
- [6] Akanksha Dixit and Vinod Kapse, "Arithmetic & Logic Unit (ALU) Design using Reversible Control Unit", International Journal of Engineering and Innovative Technology (IJEIT) Volume 1, Issue 6, June 2014.
- [7] Mr. Abhishek Gupta, Mr. Utsav Malviya and Prof. Vinod Kapse, "Design of Speed, Energy and Power Efficient Reversible Logic Based Vedic ALU for Digital Processors", 2012 IEEE Computer Society Annual Symposium on VLSI.
- [8] H. Thapliyal and N. Ranganathan, "Design of Efficient Reversible Binary Subtractors Based on A New Reversible Gate," Proc. of the IComputerSociety Annual Symposium on VLSI, 2009.
- [9] M. Morrison and N. Ranganathan, "Design of a Reversible ALU Based on Novel Programmable Reversible Logic Gate Structures," IEEE International Symposium on VLSI, 2009, pp. 126-131.
- [10] M. Haghparast, S. J. Jassbi, K. Navi and O. Hashemipour, "Design of a Novel Reversible Multiplier Circuit using HNG Gate in Nanotechnology", World Applied Sci. J., Vol. 3, 2008, pp. 974-978.
- [11] H. Thapliyal, N. Ranganathan, and R. Ferreira, "Design of a Comparator Tree Based on Reversible Logic," 10th Proceedings of the IEEE International Conference on Nanotechnology, 2008, pp 1113-6.
- [12] P. Gupta, A. Agarwal, and N. K. Jha. An algorithm for synthesis of reversible logic ciruits. IEEE Trans. Computer-Aided Design, 25(11):2317–2330, Nov 2006.
- [13] Bucci, O.M. and Migliore, M.D. A Novel Non Uniform Fast Fourier Transform Algorithm and its Application to Aperiodic Arrays. IEEE Antennas and Wireless Propagation Letters. 2006.
- [14] S. Shenbaga Ezhil. Real Time Application of Fourier Transforms. Indonesian Journal of Electrical Engineering and Computer Science Vol. 8, No. 2, November 2005, pp. 574
- [15] Mark E. Jakubauskas, David R. Legates, and Jude H. Kastens Harmonic Analysis of Time-Series AVHRR NDVI Data, Photogrammetric Engineering and Remote Sensing 2001.
- [16] S. Srivastava and R. Kumar, "Indirect method to measure software quality using CK-OO suite," 2013 International Conference on Intelligent Systems and Signal Processing (ISSP), 2013, pp. 47-51, doi: 10.1109/ISSP.2013.6526872.
- [17] Kumar, R., Singh, J.P., Srivastava, G. (2014). Altered Fingerprint Identification and Classification Using SP Detection and Fuzzy Classification. In: , et al. Proceedings of the Second International Conference on Soft Computing for Problem Solving (SocProS 2012), December 28-30, 2012. Advances in Intelligent Systems and Computing, vol 236. Springer, New Delhi. <u>https://doi.org/10.1007/978-81-322-1602-5\_139</u>.