Adaptive Hybrid FFT: A Novel Pipeline and Memory-Based Architecture for Radix-$2^k$ FFT in Large Size Processing
Zhao, Xiao, Wang et al.
In the field of digital signal processing, the fast Fourier transform (FFT) is a fundamental algorithm, with its processors being implemented using either the pipelined architecture, well-known for high-throughput applications but weak in hardware utilization, or the memory-based architecture, designed for area-constrained scenarios but failing to meet stringent throughput requirements. Therefore, we propose an adaptive hybrid FFT, which leverages the strengths of both pipelined and memory-based architectures. In this paper, we propose an adaptive hybrid FFT processor that combines the advantages of both architectures, and it has the following features. First, a set of radix-$2^k$ multi-path delay commutators (MDC) units are developed to support high-performance large-size processing. Second, a conflict-free memory access scheme is formulated to ensure a continuous data flow without data contention. Third, We demonstrate the existence of a series of bit-dimension permutations for reordering input data, satisfying the generalized constraints of variable-length, high-radix, and any level of parallelism for wide adaptivity. Furthermore, the proposed FFT processor has been implemented on a field-programmable gate array (FPGA). As a result, the proposed work outperforms conventional memory-based FFT processors by requiring fewer computation cycles. It achieves higher hardware utilization than pipelined FFT architectures, making it suitable for highly demanding applications.
academic
Adaptive Hybrid FFT: A Novel Pipeline and Memory-Based Architecture for Radix-2k FFT in Large Size Processing
The Fast Fourier Transform (FFT) is a fundamental algorithm in digital signal processing. Processor implementations of FFT typically employ two architectural paradigms: pipeline architecture (suitable for high-throughput applications but with low hardware utilization) and memory-based architecture (suitable for area-constrained scenarios but unable to meet stringent throughput requirements). This paper proposes an adaptive hybrid FFT architecture that combines the advantages of both approaches. The architecture is characterized by: (1) development of a set of radix-2k multi-path delay commutator (MDC) units to support high-performance large-scale processing; (2) formulation of a conflict-free memory access scheme ensuring continuous data flow; (3) proof of the existence of a series of bit-dimension permutations satisfying broad adaptability requirements for variable lengths, high radices, and arbitrary parallelism degrees.
Input Data → Data Reordering Module → FFT Core Processor → Output Data
↑ ↑
Memory Bank Group MDC Unit Group
Address Generation Unit (P parallel units)
Parallel Branch Permutation Circuit
Reordering Circuit
Computational Efficiency Improvement: Through radix-25 algorithm, iteration count reduced to ⌈n/5⌉, achieving over 40% reduction compared to traditional methods
Hardware Utilization Optimization: Through adaptive parallelism, hardware utilization for small-scale FFTs improved by 2-4 times
Enhanced Scalability: Supports wide-range FFT processing from 32 points to 512K points
The paper cites 17 relevant references covering FFT algorithms, FPGA implementations, memory access optimization, and other aspects, providing solid theoretical foundation for the research.
Overall Assessment: This is a high-quality computer architecture paper with significant theoretical and practical value in the field of FFT processor design. Through ingenious architectural design and rigorous theoretical analysis, the authors successfully address inherent problems in traditional FFT architectures, providing new insights and directions for the field's development.