These clock networks have a significant impact on power since they connect to each flip-flop on the FPGA and toggle every clock cycle. Combining these clocks into an efficient clock network in turn reduces area and power dissipation [2]. Clock gating is another technique that can be used to reduce dynamic power consumption. In this technique, the clock signal is disabled for inactive regions to prevent unwanted signal transitions. Using clock gating in Xilinx 7 series devices, dynamic power consumption can be reduced by 10% to 80% [4].

B. Drowsy mode

Fig. 1. Drowsy mode

This method provides the ability to connect to two supply voltages, VDDH and VDDL, a high and a low supply voltage, as seen in Figure 1. The flexibility to connect to either supply voltage is provided by two header PMOS devices that are controlled by two control signals. When the memory bit operates at the low supply voltage, it consumes less leakage power, since leakage power is proportional to the supply voltage, while the cell still retains the stored data [3]. This is called drowsy mode.
This can also be referred to as partially sleeping or standby mode. Drowsy mode mainly targets cache memories, which have variable latencies and dynamic data placement. FPGA embedded memories have different characteristics when compared to cache memories.
FPGA embedded memory accesses are statically scheduled and the data is stored statically. In addition, drowsy mode does not fully turn off the transistors, so it reduces leakage power less than switching the memory off completely would, but it preserves data. As a result, the drowsy mode scheme is observed to offer only 10% static leakage power savings.

III. GLITCHES AND GLITCH REDUCTION TECHNIQUES

Glitching occurs when values at the inputs of a LUT toggle at different times due to uneven propagation delays of those signals. If the arrival times are far enough apart, spurious transitions can be produced at the LUT output.
Glitches can occur multiple times during a clock cycle. The amount of glitching is greater in circuits with many levels of logic, uneven routing delays, and exclusive-or (XOR) logic. Glitches do not adversely affect the functionality of a synchronous circuit, as they settle before the next clock edge, but they have a significant effect on power consumption. Glitch power comprises an average of 26.0% of total dynamic power.

A. Glitch reduction using delay insertion

Glitches can be reduced by adding programmable delay elements to the configurable logic blocks (CLBs) in the FPGA.
These delay elements programmably align the arrival times of early-arriving signals at the inputs of the lookup tables (LUTs) to prevent the generation of glitches. The delay elements also behave as filters that eliminate other glitches generated by upstream logic or off-chip circuitry. Since it is applied after routing, this implementation requires little or no modification to the FPGA routing architecture or CAD flow. Furthermore, it can be combined with other low-power techniques. In theory, this offers the potential to eliminate all glitching in FPGAs, thereby saving significant amounts of power.
In practice, however, we must trade off the power saved against the area and speed overhead incurred by the additional circuitry required to implement it. Fortunately, the impact on circuit speed is not significant (other than increased parasitic capacitance) because only the early-arriving signals need to be delayed. However, the programmable delay elements do consume chip area, so we should expect a modest increase in the area of the device [1].

Fig. 2. Delaying early-arriving signal to remove glitch.
The technique is shown in Figure 2; by delaying input c, the output glitch can be eliminated. Note that the overall critical path of the circuit is not increased since only the early-arriving inputs are delayed.

An implementation of this technique in the configurable logic block is shown in Figure 3.

Fig. 3. Delay elements at LUT inputs of CLB.

Here, the LUTs and flip-flops (FFs) are paired together into Basic Logic Elements (BLEs). Three parameters are used to describe a CLB: I specifies the number of input pins, N specifies the number of BLEs and output pins, and K specifies the size of the LUTs. The local interconnect allows each BLE input to choose from any of the I CLB inputs and N BLE outputs. Each BLE output drives a CLB output [1].

In the scheme considered here, the delay elements are inserted at the inputs of the BLEs, as seen in Figure 3.
This architecture allows each LUT input to be delayed independently. The more delay elements there are, the greater the reduction in glitches, at the cost of a slight increase in area and overhead. Using this technique, 91.8% of the glitches can be eliminated, with overall power savings of 18.2%.
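
The alignment idea can be illustrated with a small simulation. The following is only a minimal sketch under simplifying assumptions, not the circuit described in [1]: the LUT is modeled as an ideal zero-delay function, the delay elements are assumed to have arbitrary resolution, and the function names (`count_output_transitions`, `align_delays`), the 2-input XOR, and the arrival times are all hypothetical choices made for illustration.

```python
from typing import Callable, Dict, Tuple

# An arrival is (time at which the input changes, new value of the input).
Arrivals = Dict[str, Tuple[int, int]]


def count_output_transitions(
    lut: Callable[[Dict[str, int]], int],
    initial: Dict[str, int],
    arrivals: Arrivals,
) -> int:
    """Count how often the LUT output toggles as the inputs arrive over time."""
    values = dict(initial)
    out = lut(values)
    transitions = 0
    # Apply all input changes that share an arrival time together,
    # then re-evaluate the (ideal, zero-delay) LUT once.
    for t in sorted({time for time, _ in arrivals.values()}):
        for name, (time, new_value) in arrivals.items():
            if time == t:
                values[name] = new_value
        new_out = lut(values)
        if new_out != out:
            transitions += 1
            out = new_out
    return transitions


def align_delays(arrivals: Arrivals) -> Dict[str, int]:
    """Delay needed on each input so that it arrives with the latest one."""
    latest = max(time for time, _ in arrivals.values())
    return {name: latest - time for name, (time, _) in arrivals.items()}


def xor2(v: Dict[str, int]) -> int:
    """Hypothetical 2-input XOR LUT."""
    return v["a"] ^ v["b"]


initial = {"a": 0, "b": 0}
# Both inputs rise in the same cycle, but input a arrives 3 time units early.
arrivals: Arrivals = {"a": (1, 1), "b": (4, 1)}

glitchy = count_output_transitions(xor2, initial, arrivals)

delays = align_delays(arrivals)
aligned: Arrivals = {n: (t + delays[n], v) for n, (t, v) in arrivals.items()}
clean = count_output_transitions(xor2, initial, aligned)

print(f"transitions without delay insertion: {glitchy}")
print(f"inserted delays: {delays}")
print(f"transitions with delay insertion: {clean}")
```

In this toy case the output starts and ends at 0, so both transitions seen without alignment are spurious. Only the early input is delayed, and it is delayed up to the latest arrival and no further, which is why the critical path is not lengthened in this model.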

B. Glitch reduction using don't cares

Don't cares are entries in the truth table where a LUT's output can be set to either logic-0 or logic-1 without affecting the correctness of the circuit. This optimization has zero cost in terms of area and delay, and can be executed after timing closure is completed. Glitch power is reduced by up to 49.0%, while total dynamic power is reduced by up to 12.5%.

IV. CONCLUSION

Significant progress has been made in improving the power and energy efficiency of FPGAs.
Power management in FPGAs will be mandatory to ensure correct functionality, provide high reliability, and reduce packaging costs. Furthermore, lower power is needed if FPGAs are to be a viable alternative to ASICs in low-power applications, such as battery-powered electronics. For example, FPGAs can be used as coprocessors to perform compute-intensive tasks more efficiently than in software. Because it is flexible, the hardware implementation of the coprocessor can be optimized for the given task and even for specific input parameters such as media format.
This report summarizes the different works that have been carried out and the various techniques used at different FPGA levels to reduce the power consumption of FPGAs, namely clock-aware placement, drowsy mode, delay insertion, and don't cares.

The delay insertion method, applied at the circuit level, adds programmable delay elements to the CLB architecture to align the edges of each LUT input, thereby preventing the formation of glitches on the LUT outputs. The delay elements can also filter some glitches produced by the upstream logic. Using this technique, an 18.2% overall dynamic power reduction can be obtained, compared to 12.5% from the don't care method.