Fast FIR Algorithm Based Area-Efficient
Parallel FIR Digital Filter Structures

Ms. P.THENMOZHI; Ms. C.THAMILARASI; Mr. V.VENGATESHWARAN

Assistant Professor, Dept. of ECE, J.K.K.College of Technology, TN Palayam,Gobi, Erode, Tamilnadu, India
Assistant Professor, Dept. of ECE, Shree Vengateshwara hi-tech Engineering College, Gobi, Erode, Tamilnadu, India
Assistant Professor, Dept. of ECE, Sri Shanmugha College of Engineering and Technology, Gobi, Erode, Tamilnadu, India

Visit for more related articles at International Journal of Advanced Research in Electrical, Electronics and Instrumentation Engineering

Abstract

Filters play a major role in digital systems. This work describes the design of parallel FIR filter structures based on the poly-phase decomposition technique, which require a minimum number of multipliers and use low-power adders. Multipliers normally consume more power and occupy a larger area than adders. To reduce the area, the proposed filter structures trade multipliers for adders, since an adder requires less power and less area than a multiplier. Moreover, the number of additional adders does not increase with the length of the parallel FIR filter. The proposed parallel FIR filter structures are therefore beneficial in terms of hardware cost and power when compared with the existing parallel FIR filter structures.

Keywords

Digital signal processing (DSP), fast finite-impulse response (FIR) algorithms (FFAs), symmetric convolution

INTRODUCTION

High-performance, low-power digital signal processing (DSP) is increasingly important because of the explosive growth of multimedia applications. In any DSP system, the FIR filter is one of the fundamental processing elements. FIR filters are used in DSP applications ranging from video and image processing to wireless communications. In some applications, such as video processing, the FIR filter circuit must operate at high frequencies, whereas in others, such as cellular telephony and multiple-input multiple-output (MIMO) systems, it must be a low-power circuit that operates at moderate frequencies with high throughput.

Two DSP techniques, parallel processing and pipelining, can be used to reduce power consumption. Parallel (or block) processing applied to a digital FIR filter can either increase the effective throughput or reduce the power consumption of the original filter. In parallel processing, multiple outputs are computed in parallel in one clock period, so the effective sampling speed increases with the level of parallelism. Parallel processing replicates the hardware units so that several inputs can be processed, and several outputs produced, at the same time. If the original circuit has area A, the L-parallel circuit needs an area of L × A; that is, the circuit area increases linearly with the block size. Because of design area limitations, such parallel processing hardware is often impractical. This problem can be addressed by parallel FIR filtering structures that consume less area than traditional parallel FIR filtering structures.

Pipelining shortens the critical path by inserting pipelining latches along the data path, which can be used either to increase the clock speed (sample speed) or to reduce power consumption at the same speed. Like parallel processing, pipelining can therefore be used to reduce power consumption. In [5]-[11], poly-phase decomposition is used to reduce the complexity of parallel FIR filters: small-sized parallel FIR filter structures are derived first, and larger block-sized ones are then built by cascading or iterating the small-sized parallel FIR filtering blocks. The Fast FIR Algorithms (FFAs) are a class of algorithms that reduce the complexity of a parallel filter by reducing the number of multiplications at the cost of an increased number of additions in the hardware implementation. With this approach, an L-parallel filter can be implemented using approximately (2L − 1) sub-filter blocks, each of length N/L. The resulting parallel filtering structure requires (2N − N/L) multiplications instead of L × N.
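
The multiplier estimate quoted above can be checked with a few lines of Python. This is only a sketch of the arithmetic: the function name and its arguments are hypothetical, N is assumed to be a multiple of L, and (2L − 1) is the approximate sub-filter count stated in the text rather than an exact figure for every block size.

```python
def multiplier_counts(n_taps, block_size):
    """Return (direct, ffa_estimate) multiplier counts for an N-tap, L-parallel filter."""
    if n_taps % block_size != 0:
        raise ValueError("n_taps is assumed to be a multiple of block_size")
    sub_filter_length = n_taps // block_size                  # N / L
    direct = block_size * n_taps                              # L x N
    ffa_estimate = (2 * block_size - 1) * sub_filter_length   # (2L - 1) N/L = 2N - N/L
    return direct, ffa_estimate


print(multiplier_counts(24, 2))  # (48, 36)
```

For the 24-tap filter used later in the experiments, a two-parallel structure would thus need about 36 multipliers instead of 48 before any coefficient symmetry is exploited.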

FAST FIR ALGORITHM

Let {xi} be the input sequence and {hi} the impulse response of a length-N FIR filter. The output sequence yn and the filter transfer function H(z) can then be written as
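
For an N-tap filter with coefficients h_0, ..., h_{N−1} (the index range and the equation number are assumed here), these two expressions take the standard form:

```latex
\begin{align}
y_n  &= \sum_{i=0}^{N-1} h_i \, x_{n-i}, \qquad n = 0, 1, 2, \ldots \tag{1} \\
H(z) &= \sum_{i=0}^{N-1} h_i \, z^{-i} \notag
\end{align}
```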

The traditional L-parallel FIR filter can be implemented using poly-phase decomposition as
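
With the output, input and filter components indexed by i, k and j, each running from 0 to L − 1 (an assumption), the block filtering relation referred to as Equation (2) has the standard form:

```latex
\begin{equation}
\sum_{i=0}^{L-1} Y_i(z^L)\, z^{-i}
  = \left( \sum_{k=0}^{L-1} X_k(z^L)\, z^{-k} \right)
    \left( \sum_{j=0}^{L-1} H_j(z^L)\, z^{-j} \right) \tag{2}
\end{equation}
```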

where Yi(z), Xk(z), and Hj(z) are the poly-phase components of the output, the input, and the filter transfer function, respectively. The poly-phase components are defined as follows,
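
A standard way to write these definitions, assuming that N is a multiple of L and using m as a hypothetical summation index, is:

```latex
\begin{align}
X_k(z) &= \sum_{m=0}^{\infty} z^{-m}\, x_{Lm+k}, \notag \\
H_j(z) &= \sum_{m=0}^{N/L-1} z^{-m}\, h_{Lm+j}, \tag{3} \\
Y_i(z) &= \sum_{m=0}^{\infty} z^{-m}\, y_{Lm+i}, \qquad i, j, k = 0, 1, \ldots, L-1. \notag
\end{align}
```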

The parallel FIR filter can be realized from the block FIR filtering equation above, and various FFA structures can be used to reduce its complexity.

A. 2 × 2 (L = 2) FFAS

From Equation (2) with L = 2,
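
Writing Y_i, X_k and H_j as shorthand for Y_i(z^2), X_k(z^2) and H_j(z^2) (a notational assumption), the L = 2 case of the block filtering relation expands as:

```latex
\[
Y_0 + z^{-1} Y_1
  = (H_0 + z^{-1} H_1)(X_0 + z^{-1} X_1)
  = H_0 X_0 + z^{-1}(H_0 X_1 + H_1 X_0) + z^{-2} H_1 X_1
\]
```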

which implies that
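
Collecting the even and odd powers of z^{-1} in the product (H_0 + z^{-1} H_1)(X_0 + z^{-1} X_1) gives the following pair, a reconstruction that agrees with the four sub-filters and two additions described for Fig. 1:

```latex
\begin{align}
Y_0 &= H_0 X_0 + z^{-2} H_1 X_1, \notag \\
Y_1 &= H_0 X_1 + H_1 X_0. \tag{4}
\end{align}
```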

Fig. 1 shows the direct implementation of Equation (4), which computes a block of 2 outputs using 4 FIR filters of length N/2 and 2 post-processing additions, and which requires 2N multipliers and 2N − 2 adders.

However, Equation (4) can be written as
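
The standard two-parallel FFA form, given here as a reconstruction that matches the three sub-filter blocks, one preprocessing adder and three post-processing adders described for Fig. 2, is:

```latex
\begin{align}
Y_0 &= H_0 X_0 + z^{-2} H_1 X_1, \notag \\
Y_1 &= (H_0 + H_1)(X_0 + X_1) - H_0 X_0 - H_1 X_1. \tag{5}
\end{align}
```

Expanding (H_0 + H_1)(X_0 + X_1) and cancelling H_0 X_0 and H_1 X_1 recovers Y_1 = H_0 X_1 + H_1 X_0, so one sub-filter block is saved at the cost of extra adders.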

Equation (5) is implemented as shown in Fig. 2. This structure has three FIR sub-filter blocks of length N/2 and requires 3N/2 multipliers and 3(N/2 − 1) + 4 adders. As the figure shows, the structure has one preprocessing adder and three post-processing adders.
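
The three-sub-filter structure can be checked numerically against ordinary convolution. The sketch below assumes the standard two-parallel FFA (Y0 = H0X0 + z^-2 H1X1, Y1 = (H0 + H1)(X0 + X1) − H0X0 − H1X1); the function name, the random test data and the use of NumPy are illustrative choices, not part of the original design, which is a hardware implementation.

```python
import numpy as np


def ffa2_parallel(h, x):
    """Filter x with h using the 2x2 FFA; both lengths are assumed to be even."""
    h = np.asarray(h, dtype=float)
    x = np.asarray(x, dtype=float)
    if len(h) % 2 or len(x) % 2:
        raise ValueError("even lengths are assumed")
    h0, h1 = h[0::2], h[1::2]          # poly-phase components of the filter
    x0, x1 = x[0::2], x[1::2]          # poly-phase components of the input

    # Three sub-filter blocks of length N/2.
    a = np.convolve(h0, x0)            # H0 X0
    b = np.convolve(h1, x1)            # H1 X1
    c = np.convolve(h0 + h1, x0 + x1)  # (H0 + H1)(X0 + X1), one pre-adder

    # z^-2 at the input rate is a delay of one block at the sub-filter rate.
    y0 = np.append(a, 0.0) + np.insert(b, 0, 0.0)
    y1 = np.append(c - a - b, 0.0)

    y = np.empty(2 * len(y0))
    y[0::2], y[1::2] = y0, y1          # interleave the two output phases
    return y[: len(h) + len(x) - 1]


rng = np.random.default_rng(0)
h = rng.standard_normal(24)
x = rng.standard_normal(64)
print(np.allclose(ffa2_parallel(h, x), np.convolve(h, x)))  # True
```

The three post-processing operations are the one addition that forms y0 and the two subtractions that form y1.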

B. 3 × 3 (L = 3) FFAS

The (3×3) FFA produces a block size 3 parallel filtering structure. From (2) with L = 3,
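
Expanding the L = 3 product (H_0 + z^{-1} H_1 + z^{-2} H_2)(X_0 + z^{-1} X_1 + z^{-2} X_2) and grouping terms by output phase gives the direct form below. It is a reconstruction, with Y_i, X_k and H_j standing for functions of z^3, and it agrees with the nine sub-filters and six additions quoted in the text:

```latex
\begin{align}
Y_0 &= H_0 X_0 + z^{-3}(H_1 X_2 + H_2 X_1), \notag \\
Y_1 &= H_0 X_1 + H_1 X_0 + z^{-3} H_2 X_2, \tag{6} \\
Y_2 &= H_0 X_2 + H_1 X_1 + H_2 X_0. \notag
\end{align}
```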

Direct implementation of Equation (6) computes a block of 3 outputs using 9 FIR filters of length N/3 and 6 post-processing additions, which requires 3N multipliers and 3N − 3 adders. By an approach similar to that of the (2×2) FFA, the following (3×3) FFA is obtained,
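
The standard three-parallel FFA, given here as a reconstruction that uses the six sub-filters H_0, H_1, H_2, H_0 + H_1, H_1 + H_2 and H_0 + H_1 + H_2, can be written as:

```latex
\begin{align}
Y_0 &= H_0 X_0 - z^{-3} H_2 X_2
       + z^{-3}\left[(H_1 + H_2)(X_1 + X_2) - H_1 X_1\right], \notag \\
Y_1 &= \left[(H_0 + H_1)(X_0 + X_1) - H_1 X_1\right]
       - \left(H_0 X_0 - z^{-3} H_2 X_2\right), \tag{7} \\
Y_2 &= (H_0 + H_1 + H_2)(X_0 + X_1 + X_2)
       - \left[(H_0 + H_1)(X_0 + X_1) - H_1 X_1\right]
       - \left[(H_1 + H_2)(X_1 + X_2) - H_1 X_1\right]. \notag
\end{align}
```

Sharing the bracketed terms between outputs is what keeps the post-processing cost at seven adders.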

The hardware implementation of Equation (7) requires six FIR sub-filter blocks of length N/3, three preprocessing adders and seven post-processing adders, which reduces the hardware cost. The implementation obtained from Equation (7) is shown in Fig. 3.
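
A numerical check of the six-sub-filter structure is sketched below. It assumes the standard three-parallel FFA with sub-filters H0, H1, H2, H0 + H1, H1 + H2 and H0 + H1 + H2; the function and helper names are hypothetical, and NumPy convolution stands in for the hardware sub-filter blocks.

```python
import numpy as np


def ffa3_parallel(h, x):
    """Filter x with h using the 3x3 FFA; both lengths are assumed multiples of 3."""
    h = np.asarray(h, dtype=float)
    x = np.asarray(x, dtype=float)
    if len(h) % 3 or len(x) % 3:
        raise ValueError("lengths are assumed to be multiples of 3")
    h0, h1, h2 = h[0::3], h[1::3], h[2::3]
    x0, x1, x2 = x[0::3], x[1::3], x[2::3]

    # Six sub-filter blocks of length N/3 (three pre-adders on the input side).
    p0 = np.convolve(h0, x0)
    p1 = np.convolve(h1, x1)
    p2 = np.convolve(h2, x2)
    p01 = np.convolve(h0 + h1, x0 + x1)
    p12 = np.convolve(h1 + h2, x1 + x2)
    p012 = np.convolve(h0 + h1 + h2, x0 + x1 + x2)

    def now(v):      # undelayed term, padded to the common output length
        return np.append(v, 0.0)

    def delayed(v):  # z^-3 at the input rate = one block delay
        return np.insert(v, 0, 0.0)

    t = now(p0) - delayed(p2)                    # H0X0 - z^-3 H2X2, shared
    y0 = t + delayed(p12 - p1)
    y1 = now(p01 - p1) - t
    y2 = now(p012 - (p01 - p1) - (p12 - p1))

    y = np.empty(3 * len(y0))
    y[0::3], y[1::3], y[2::3] = y0, y1, y2       # interleave the output phases
    return y[: len(h) + len(x) - 1]


rng = np.random.default_rng(0)
h = rng.standard_normal(24)
x = rng.standard_normal(63)
print(np.allclose(ffa3_parallel(h, x), np.convolve(h, x)))  # True
```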

PROPOSED FFA STRUCTURES FOR SYMMETRIC CONVOLUTIONS

A new structure is proposed to exploit the symmetry of the coefficients. The poly-phase decomposition is manipulated to obtain as many sub-filter blocks with symmetric coefficients as possible. A sub-filter block with symmetric coefficients needs only half the number of multiplications, so in an N-tap L-parallel FIR filter each such block saves N/(2L) multipliers, which is half the number of multiplications in a single sub-filter block.

A. 2×2 PROPOSED FFA (L = 2)

From (4), a two-parallel FIR filter can also be written as
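
A form consistent with the sub-filter blocks (H_0 + H_1) and (H_0 − H_1) named in the text, given here as a reconstruction, is:

```latex
\begin{align}
Y_0 &= \tfrac{1}{2}\left[(H_0 + H_1)(X_0 + X_1) + (H_0 - H_1)(X_0 - X_1)\right]
       - H_1 X_1 + z^{-2} H_1 X_1, \notag \\
Y_1 &= \tfrac{1}{2}\left[(H_0 + H_1)(X_0 + X_1) - (H_0 - H_1)(X_0 - X_1)\right]. \tag{8}
\end{align}
```

The half-sum in the first line equals H_0 X_0 + H_1 X_1 and the half-difference in the second equals H_0 X_1 + H_1 X_0, so the pair reduces to Y_0 = H_0 X_0 + z^{-2} H_1 X_1 and Y_1 = H_0 X_1 + H_1 X_0.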

For a set of even symmetric coefficients, Equation (8) yields one more sub-filter block with symmetric coefficients. The proposed two-parallel FIR filter implementation is shown in Fig. 4. The proposed two-parallel FIR filter structure has three sub-filter blocks. Two of them, (H0 − H1) and (H0 + H1), have symmetric coefficients and can be realized as shown in Fig. 5, where the output of each multiplier serves two taps. Compared with the existing two-parallel FFA structure, the proposed structure has one more sub-filter block in which only half of the multipliers are needed.
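
The multiplier sharing in a symmetric sub-filter block can be sketched in software as follows. This is an illustration rather than the circuit of Fig. 5: the function name, the example coefficients and the zero initial state are assumptions. Samples that meet the same coefficient are added first, so each pair of taps uses a single multiplication.

```python
import numpy as np


def symmetric_fir(g, x):
    """Filter x with a symmetric g (g[i] == g[M-1-i]) using ceil(M/2) multiplies per output."""
    g = np.asarray(g, dtype=float)
    x = np.asarray(x, dtype=float)
    m = len(g)
    if not np.allclose(g, g[::-1]):
        raise ValueError("coefficients are assumed to be symmetric")
    half = m // 2
    padded = np.concatenate([np.zeros(m - 1), x])    # zero initial state
    y = np.zeros(len(x))
    for n in range(len(x)):
        taps = padded[n:n + m][::-1]                 # taps[i] = x[n - i]
        pre_added = taps[:half] + taps[::-1][:half]  # x[n-i] + x[n-(M-1-i)]
        y[n] = np.dot(g[:half], pre_added)           # one multiplier per tap pair
        if m % 2:
            y[n] += g[half] * taps[half]             # unpaired centre tap
    return y


# A symmetric 24-tap filter (hypothetical coefficients) and its H0 + H1 sub-filter.
first_half = np.arange(1.0, 13.0)
h = np.concatenate([first_half, first_half[::-1]])
g = h[0::2] + h[1::2]
x = np.random.default_rng(1).standard_normal(50)
print(np.allclose(symmetric_fir(g, x), np.convolve(g, x)[:len(x)]))  # True
```

Here the 12-tap sub-filter H0 + H1 is evaluated with 6 multiplications per output sample. An antisymmetric block such as H0 − H1 would use a subtraction in place of the pre-addition.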

B. 3×3 PROPOSED FFA (L=3)

In the same way, starting from Equation (6), a three-parallel FIR filter can be written as Equation (9). In the proposed three-parallel FIR filter structure, four of the six sub-filter blocks have symmetric coefficients.
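
One form that meets this description, following the Tsao and Choi reference as far as it can be recalled and therefore offered as an assumption, uses the sub-filters H_0 + H_1, H_0 − H_1, H_1, H_0 + H_1 + H_2, H_0 + H_2 and H_0 − H_2:

```latex
\begin{align}
Y_0 &= \tfrac{1}{2}\left[(H_0+H_1)(X_0+X_1) + (H_0-H_1)(X_0-X_1)\right] - H_1 X_1 \notag \\
    &\quad + z^{-3}\Big\{ (H_0+H_1+H_2)(X_0+X_1+X_2) - (H_0+H_2)(X_0+X_2) \notag \\
    &\qquad\quad - \tfrac{1}{2}\left[(H_0+H_1)(X_0+X_1) - (H_0-H_1)(X_0-X_1)\right] - H_1 X_1 \Big\}, \notag \\
Y_1 &= \tfrac{1}{2}\left[(H_0+H_1)(X_0+X_1) - (H_0-H_1)(X_0-X_1)\right] \tag{9} \\
    &\quad + z^{-3}\Big\{ \tfrac{1}{2}\left[(H_0+H_2)(X_0+X_2) + (H_0-H_2)(X_0-X_2)\right] \notag \\
    &\qquad\quad - \tfrac{1}{2}\left[(H_0+H_1)(X_0+X_1) + (H_0-H_1)(X_0-X_1)\right] + H_1 X_1 \Big\}, \notag \\
Y_2 &= \tfrac{1}{2}\left[(H_0+H_2)(X_0+X_2) - (H_0-H_2)(X_0-X_2)\right] + H_1 X_1. \notag
\end{align}
```

Expanding the products reduces these to Y_0 = H_0 X_0 + z^{-3}(H_1 X_2 + H_2 X_1), Y_1 = H_0 X_1 + H_1 X_0 + z^{-3} H_2 X_2 and Y_2 = H_0 X_2 + H_1 X_1 + H_2 X_0, the direct three-parallel form.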

By contrast, in the existing three-parallel FIR filter structure only two of the six sub-filter blocks have symmetric coefficients. The implementation of the proposed three-parallel FIR filter structure is shown in Fig. 6. Fig. 7 compares the proposed and existing three-parallel FIR filter structures, with the sub-filter blocks that have symmetric coefficients shown as shaded blocks. The proposed structure needs two more adders in the preprocessing block and five more adders in the post-processing block, and in exchange it saves N/3 multipliers for an N-tap three-parallel FIR filter.
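
The counts of two and four symmetric sub-filter blocks can be reproduced with a short script. The sketch assumes that the proposed structure uses the blocks H0 + H1, H0 − H1, H1, H0 + H1 + H2, H0 + H2 and H0 − H2 (this set is not spelled out in the text), treats antisymmetric blocks as symmetric for the purpose of multiplier sharing, and uses made-up coefficients.

```python
import numpy as np


def shares_multipliers(g):
    """True if g is symmetric or antisymmetric, so tap pairs can share a multiplier."""
    g = np.asarray(g, dtype=float)
    return bool(np.allclose(g, g[::-1]) or np.allclose(g, -g[::-1]))


# Hypothetical symmetric 24-tap filter: h[i] == h[23 - i].
first_half = np.arange(1.0, 13.0) ** 2
h = np.concatenate([first_half, first_half[::-1]])
h0, h1, h2 = h[0::3], h[1::3], h[2::3]

existing = [h0, h1, h2, h0 + h1, h1 + h2, h0 + h1 + h2]
proposed = [h0 + h1, h0 - h1, h1, h0 + h1 + h2, h0 + h2, h0 - h2]  # assumed block set

print(sum(shares_multipliers(g) for g in existing))  # 2
print(sum(shares_multipliers(g) for g in proposed))  # 4
```

With sub-filters of length N/3, the two extra symmetric blocks each save N/6 multipliers, which gives the N/3 saving stated above.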

C. PROPOSED CASCADING FFA

The proposed parallel FIR filter structures reuse multipliers in some of the sub-filter blocks, at the cost of more adders in the preprocessing and post-processing blocks. For a larger parallel block factor L, cascading the proposed FFA parallel FIR structures increases the number of adders and hence the hardware complexity. To limit this, the existing FFA structures, which have more compact preprocessing and post-processing operations, are used for the sub-filter blocks that contain no symmetric coefficients, and the proposed FFA structures are applied to the remaining sub-filter blocks, which have symmetric coefficients. Fig. 8 compares the sub-filter blocks of the existing and proposed four-parallel FFA structures. The proposed four-parallel FIR structure has three more sub-filter blocks with symmetric coefficients than the existing FFA structure.

EXPERIMENTAL RESULT AND IMPLEMENTATION

The existing FFA structures and the proposed FFA architectures have been implemented in VHDL with a word length of 16 bits and a filter length of 24. Carry-save, carry-select and binary-to-excess-1 adders are used to implement the sub-filter blocks. The simulation result for the parallel FIR filter structure is shown in Fig. 9. Detailed comparisons of area, LUTs, power, delay and maximum frequency are given in Table I, Table II, Table III, Table IV and Table V.

CONCLUSION

The proposed parallel FIR filter structures were designed to reduce power consumption and hardware complexity. They exploit the coefficient symmetry of symmetric convolutions when the number of taps is a multiple of 2 or 3. Multipliers account for most of the hardware cost in a parallel FIR filter implementation. By exploiting the symmetric coefficients, the proposed method saves a significant number of multipliers at the expense of additional adders. The proposed structures therefore rely on poly-phase decompositions that are advantageous for symmetric convolutions, and they have a lower hardware cost than the existing FFA structures.

References

Acha, J.I. (1989), ‘Computational structures for fast implementation of L-path and L-block digital filters,’ IEEE Transactions on Circuit Systems I, vol. 36, no. 6, pp. 805–812.

Cheng, C., and Parhi, K.K. (2004), ‘Hardware efficient fast parallel FIR filter structures based on iterated short convolution,’ IEEE Transactions on Circuits Systems I, Reg. Papers, vol. 51, no. 8, pp. 1492–1500.

Cheng, C., and Parhi, K.K. (2005), ‘Further complexity reduction of parallel FIR filters,’ in Proc. IEEE International Symposium on Circuits Systems I, Kobe, Japan.

Cheng, C., and Parhi, K.K. (2007), ‘Low-cost parallel FIR structures with 2-stage parallelism,’ IEEE Transactions on Circuits Systems I, Reg. Papers, vol. 54, no. 2, pp. 280–290.

Chung, J.G., and Parhi, K.K. (2008), ‘Frequency-spectrum-based low-area low-power parallel FIR filter design,’ EURASIP J. Appl. Signal Process.

Lin, I.S., and Mitra, S.K. (1996), ‘Overlapped block digital filtering,’ IEEE Transactions on Circuits Systems II, Analog Digital Signal Processing, vol. 43, no. 8, pp. 586–596.

Mou, Z.J., and Duhamel, P. (1991), ‘Short-length FIR filters and their use in fast non-recursive filtering,’ IEEE Transactions on Signal Processing, vol. 39, no. 6, pp. 1322–1332.

Parker, D.A., and Parhi, K.K. (1997), ‘Low-area/power parallel FIR digital filter implementations,’ J. VLSI Signal Processing and Systems, vol. 17, no. 1, pp. 75–92.

Parhi, K.K. (1999), VLSI Digital Signal Processing Systems: Design and Implementation. New York.

Yu-Chi Tsao and Ken Choi, ‘Area-Efficient Parallel FIR Digital Filter Structures for Symmetric Convolutions based on Fast FIR Algorithm,’ IEEE Transactions.