@MASTERSTHESIS{IMM2003-02383, author = "G. Plakaris", title = "Power efficient arithmetic circuits for application specific processors", year = "2003", keywords = "Low-power, power efficient arithmetic, operand isolation, dynamic power management", school = "Informatics and Mathematical Modelling, Technical University of Denmark, {DTU}", address = "Richard Petersens Plads, Building 321, {DK-}2800 Kgs. Lyngby", type = "", url = "http://www2.compute.dtu.dk/pubdb/pubs/2383-full.html", abstract = "This thesis presents a study of {RT}-level power optimization techniques in terms of their applicability to data-flow intensive data path designs and their efficiency. The dynamic power management techniques of clock gating and operand isolation are investigated and their efficiency is evaluated on sample designs. Although clock gating by itself offers significant power savings at low overhead in sequential blocks, it is not always the case that hold conditions can be extracted when input registers are shared among several resources. Latch-based operand isolation was also found quite effective, though savings are offset by the high overhead; evened out in the case of the gate-based implementation for 32-bit adder/subtractor units. Fine clock gating is proposed as an approach that merges the merits of both methods and yields the highest power savings and the least performance degradation for the same overhead. The static {RTL} power optimization methods proposed are power-sensitive implementation selection and retiming. Carry-save arithmetic is deployed to eliminate carry propagation in datapaths, improving timing slack and providing larger margins for the performance-power trade-off in other parts of the design. The proposed methods are accompanied by sample design examples to illustrate their efficiency. Further, by closely controlling unnecessary switching activity, the overhead of sharing resources among operations of varying complexity is reduced.
The methods proposed are suitable for a synthesis-based design flow and achieve performance comparable to custom application specific processors." }