We have analyzed memory footprint and combinational complexity to arrive at a systematic design strategy for deriving area-delay-power-efficient architectures for two-dimensional (2-D) finite impulse response (FIR) filters. We have presented novel block-based structures for separable and non-separable filters that reduce the memory footprint through memory sharing and memory reuse, together with appropriate scheduling of computations and design of the storage architecture. The proposed structures involve L times less storage per output (SPO) and nearly L times less energy consumption per output (EPO) than the existing structures, where L is the input block size. They involve L times more arithmetic resources than the best of the corresponding existing structures, and deliver L times higher throughput with less memory bandwidth (MBW) than the others. We have also proposed separate generic structures for separable and non-separable filter banks, and a unified filter-bank structure comprising symmetric and general filters.
The proposed unified structure for 6 parallel filters involves nearly 3.6L times more multipliers, 3L times more adders, and (N² − N + 2) fewer registers than a similar existing unified structure, and computes 6L times more filter outputs per cycle with 6L times less MBW than the existing design, where N is the FIR filter size in each dimension. ASIC synthesis results show that for filter size (4 × 4), input block size L = 4, and image size (512 × 512), the proposed block-based non-separable and generic non-separable structures involve, respectively, 5.95 times and 11.25 times less area-delay product (ADP), and 5.81 times and 15.63 times less EPO than the corresponding existing structures. The proposed unified structure involves 4.64 times less ADP and 9.78 times less EPO than the corresponding existing structure.
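The general idea behind block-based processing with sample reuse can be illustrated in software. The following is a minimal sketch only, not the proposed hardware structure: the function names `fir2d_direct` and `fir2d_block`, the `samples_fetched` counter, and the restriction to the valid output region are assumptions made for illustration. It computes L horizontally adjacent outputs of a non-separable N × N filter from one shared window of N × (N + L − 1) input samples, instead of fetching a separate N × N window for each output.

```python
def fir2d_direct(image, kernel):
    """Reference non-separable 2-D FIR, one output per step (valid region only).

    Written in correlation form (no kernel flip) to keep the indexing simple.
    """
    n = len(kernel)
    rows, cols = len(image), len(image[0])
    out = []
    for r in range(rows - n + 1):
        out_row = []
        for c in range(cols - n + 1):
            acc = 0
            for i in range(n):
                for j in range(n):
                    acc += kernel[i][j] * image[r + i][c + j]
            out_row.append(acc)
        out.append(out_row)
    return out


def fir2d_block(image, kernel, block_size):
    """Block-based variant: up to `block_size` adjacent outputs share one window.

    Returns the filtered output and the number of input samples fetched,
    a rough (hypothetical) stand-in for memory traffic.
    """
    n = len(kernel)
    rows, cols = len(image), len(image[0])
    out_cols = cols - n + 1
    out = []
    samples_fetched = 0
    for r in range(rows - n + 1):
        out_row = []
        for c0 in range(0, out_cols, block_size):
            width = min(block_size, out_cols - c0)  # last block may be short
            # Fetch the shared window once: n rows by (n + width - 1) columns.
            window = [image[r + i][c0:c0 + n + width - 1] for i in range(n)]
            samples_fetched += n * (n + width - 1)
            # All outputs of the block are computed from the same window.
            for k in range(width):
                acc = 0
                for i in range(n):
                    for j in range(n):
                        acc += kernel[i][j] * window[i][k + j]
                out_row.append(acc)
        out.append(out_row)
    return out, samples_fetched


# Small usage example: 8 x 8 test image, 4 x 4 kernel, block size L = 4.
image = [[(3 * r + 5 * c) % 7 for c in range(8)] for r in range(8)]
kernel = [[i + j + 1 for j in range(4)] for i in range(4)]
block_out, fetched_l4 = fir2d_block(image, kernel, 4)
_, fetched_l1 = fir2d_block(image, kernel, 1)
print(block_out == fir2d_direct(image, kernel), fetched_l1, fetched_l4)
```

The number of multiply-accumulate operations is unchanged in this sketch; only the number of fetched samples per output drops as the block size grows, which is the software analogue of trading more parallel arithmetic for less memory traffic.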