VHDL coding tips and tricks: August 2010

Friday, August 13, 2010

Careful RAM designing for reducing power consumption in FPGA

     In this post I will discuss some points that help reduce power consumption in an FPGA. I am mainly concentrating on the power dissipated by the RAM. These points are taken from the Xilinx white paper on Virtex-5 system power design considerations, but I will note down the points that apply to any Xilinx FPGA.
     There are two primary types of power consumption in FPGAs: static and dynamic power. Static power is consumed due to transistor leakage. Dynamic power is consumed by toggling nodes and is a function of voltage, frequency, and capacitance. The leakage current increases with the speed of the device, its operating voltage and the junction (or die) temperature. So static power increases from the Virtex-4 FPGA to the Virtex-5 FPGA. On the other hand, dynamic power reduces from Virtex-4 (a 90 nm device) to Virtex-5 (a 65 nm device). This is because dynamic power depends on the square of the operating voltage and is directly proportional to the capacitance (this includes the transistor parasitic capacitance and the metal interconnect capacitance). From Virtex-4 to Virtex-5 both of these parameters decrease, and so we get around a 40% reduction in dynamic power.
You can get more details from the Xilinx white paper mentioned above.
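
As a rough illustration of the dynamic power argument above, the usual first-order relation can be written as follows. The core voltages used here (1.2 V for Virtex-4 and 1.0 V for Virtex-5) are my assumption and are not taken from the text of this post, so treat the number as a sketch only:

```latex
% First-order model: C = switched capacitance, V = core voltage, f = toggle frequency
P_{\mathrm{dyn}} \propto C \, V^{2} f

% Effect of the voltage change alone (assumed 1.2 V -> 1.0 V), with C and f held constant
\frac{P_{\mathrm{dyn}}(1.0\,\mathrm{V})}{P_{\mathrm{dyn}}(1.2\,\mathrm{V})}
  = \left(\frac{1.0}{1.2}\right)^{2} \approx 0.69
```

Under that assumption the voltage drop alone would account for roughly a 31% reduction, and the lower capacitance of the smaller process would make up the rest of the quoted figure.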

      Xilinx has given some tips for reducing power consumption by designing RAMs intelligently. I am writing them down one by one:
1) Choose the right RAM primitive for your design. When choosing a RAM organization within the target architecture, the width, depth, and functionality must be considered. Choosing the right memory lets you pick the most power-efficient resource for the end design.
2) Ensure that the block RAM is enabled only when data is needed from it. This is because the power requirement of a block RAM is directly proportional to the amount of time it is enabled. Normally, for ease of coding, the enable signal is always "ON". But for power-sensitive applications, take some extra effort to make use of the enable signal of the RAM.
Another tip regarding the enable signal is explained in the following example. Say you want a 2k x 8 bit RAM in your design. Then use four 512 x 8 bit RAMs for this, with a separate enable signal for each RAM. This needs some extra logic, but at any time only one RAM will be ON, so we can save around 75% of the RAM power.
3) Ensure the WRITE_MODE of the RAM is set properly. If the block RAM contents are never read during a write, the RAM power can be reduced by a significant amount by selecting the NO_CHANGE mode rather than the default WRITE_FIRST mode. This mode can be set easily if you are using the core generator GUI to create the RAM module.
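
The banked-enable idea from tip 2 can be sketched in VHDL roughly as below. This is only a minimal behavioral sketch and not code from the white paper: the entity name banked_ram, the port names and the use of the top two address bits as the bank select are all my assumptions, and whether each bank maps to a block RAM depends on your synthesis tool.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

--Hypothetical 2k x 8 RAM built from four 512 x 8 banks.
--Only the bank selected by addr(10 downto 9) is enabled in a given clock cycle.
entity banked_ram is
    port (
        clk  : in  std_logic;
        en   : in  std_logic;                      --global enable
        we   : in  std_logic;                      --write enable
        addr : in  std_logic_vector(10 downto 0);  --2k locations
        din  : in  std_logic_vector(7 downto 0);
        dout : out std_logic_vector(7 downto 0)
    );
end banked_ram;

architecture rtl of banked_ram is
    type bank_t is array (0 to 511) of std_logic_vector(7 downto 0);
    type dout_array_t is array (0 to 3) of std_logic_vector(7 downto 0);
    signal bank_dout  : dout_array_t := (others => (others => '0'));
    signal bank_sel_r : unsigned(1 downto 0) := "00";
begin

    gen_banks : for b in 0 to 3 generate
        signal mem     : bank_t := (others => (others => '0'));
        signal bank_en : std_logic;
    begin
        --Decode the upper two address bits into one enable per bank.
        bank_en <= '1' when en = '1' and to_integer(unsigned(addr(10 downto 9))) = b
                   else '0';

        process (clk)
        begin
            if rising_edge(clk) then
                if bank_en = '1' then
                    if we = '1' then
                        mem(to_integer(unsigned(addr(8 downto 0)))) <= din;
                    end if;
                    --Synchronous read (returns the old data during a write).
                    bank_dout(b) <= mem(to_integer(unsigned(addr(8 downto 0))));
                end if;
            end if;
        end process;
    end generate;

    --Register the bank select so that it lines up with the registered read data.
    process (clk)
    begin
        if rising_edge(clk) then
            if en = '1' then
                bank_sel_r <= unsigned(addr(10 downto 9));
            end if;
        end if;
    end process;

    dout <= bank_dout(to_integer(bank_sel_r));

end rtl;
```

The price for the lower RAM activity is the address decoder, the output multiplexer and the use of four memories instead of one, so it is worth checking the power estimate for your own device before committing to this structure.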

Note:- It would be wise to read the Xilinx white paper on power reduction for your particular FPGA before you start coding your design. It will help you find useful tips which are device specific.
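
If you describe the RAM in VHDL instead of using the core generator, the NO_CHANGE behaviour from tip 3 can be written roughly as below. This is a sketch under assumptions: the entity name ram_no_change and the 512 x 8 size are made up for illustration, and whether the synthesis tool actually maps this to a block RAM in NO_CHANGE mode should be confirmed in the synthesis report.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

--Hypothetical 512 x 8 single-port RAM whose output does not change during a write.
entity ram_no_change is
    port (
        clk  : in  std_logic;
        en   : in  std_logic;
        we   : in  std_logic;
        addr : in  std_logic_vector(8 downto 0);
        din  : in  std_logic_vector(7 downto 0);
        dout : out std_logic_vector(7 downto 0)
    );
end ram_no_change;

architecture rtl of ram_no_change is
    type ram_t is array (0 to 511) of std_logic_vector(7 downto 0);
    signal mem : ram_t := (others => (others => '0'));
begin

    process (clk)
    begin
        if rising_edge(clk) then
            if en = '1' then
                if we = '1' then
                    --Write cycle: the output register keeps its previous value.
                    mem(to_integer(unsigned(addr))) <= din;
                else
                    --Read cycle: the output register is updated only here.
                    dout <= mem(to_integer(unsigned(addr)));
                end if;
            end if;
        end if;
    end process;

end rtl;
```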

Thursday, August 12, 2010

Using Xilinx primitives in your design-An example

    Xilinx provides a library named "UNISIM" which contains the component declarations for all Xilinx primitives and points to the models that will be used for simulation. This library is used during functional simulation and contains descriptions of all the device primitives, or lowest-level building blocks.
    In this post I will show you how to use a primitive in your design. In the example I am using the primitive named "RAMB16_S2", which is an 8k x 2 single-port RAM for the Spartan-3E FPGA. This primitive is instantiated twice to make an 8k x 4 single-port RAM.
    The code is given below. Note that I have written it in the form of a testbench, so the code below is not synthesisable. It is only meant to show how to use Xilinx primitives in your design. The code is well commented, so I hope I need not explain how it works. To create an 8k x 4 RAM I used two 8k x 2 RAMs side by side: one RAM stores the 2 LSBs of the data while the second RAM stores the 2 MSBs. Note that the address is the same for both RAMs.
    See the way I have instantiated the primitive in the design. Also, don't forget to add the UNISIM library to the design.


library ieee;
use ieee.std_logic_1164.all;
use ieee.std_logic_unsigned.all;
use ieee.std_logic_arith.all;
--Remember to add this library when you use xilinx primitives from Language templates.
library unisim;
use unisim.vcomponents.all;

entity ram_test is
end ram_test;

architecture Behavioral of ram_test is

--signal declarations.
signal clk,en,ssr,we : std_logic:='0';
signal Dout,Din : std_logic_vector(3 downto 0):="0000";
signal addr : std_logic_vector(12 downto 0):=(others => '0');

begin

--RAMB16_S2 is an 8k x 2 Single-Port RAM for Spartan-3E. We use it to create an 8k x 4 Single-Port RAM.
--Instantiate the RAM which carries the 2 LSBs of the data.
RAM1  : RAMB16_S2 port map (
      DO => Dout(1 downto 0),      -- 2-bit Data Output
      ADDR => ADDR,  -- 13-bit Address Input
      CLK => CLK,    -- Clock
      DI => Din(1 downto 0),      -- 2-bit Data Input
      EN => EN,      -- RAM Enable Input
      SSR => SSR,    -- Synchronous Set/Reset Input
      WE => WE       -- Write Enable Input
   );

--Instantiate the RAM which carries the 2 MSBs of the data.
RAM2  : RAMB16_S2 port map (
      DO => Dout(3 downto 2),      -- 2-bit Data Output
      ADDR => ADDR,  -- 13-bit Address Input
      CLK => CLK,    -- Clock
      DI => Din(3 downto 2),      -- 2-bit Data Input
      EN => EN,      -- RAM Enable Input
      SSR => SSR,    -- Synchronous Set/Reset Input
      WE => WE       -- Write Enable Input
   );

--100 MHz clock generation for testing process.
clk_process : process
begin
wait for 5 ns;
clk <= not clk;
end process;

--Writing and reading the RAM. The RAM has a 13-bit address (8k locations) and a data width of 4 bits.
simulate : process
begin
        en <='1';
        we <= '1';
        --Write the value "i" at the address "i" for addresses 0 to 10 (11 clock cycles).
        for i in 0 to 10 loop
                addr <= conv_std_logic_vector(i,13);
                din <= conv_std_logic_vector(i,4);
                wait for 10 ns;
        end loop;
        we<= '0';
        --Read the RAM for addresses from 0 to 20.
        for i in 0 to 20 loop
                addr <= conv_std_logic_vector(i,13);
                wait for 10 ns;
        end loop;

end process;

end Behavioral;
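
Since the testbench above is not synthesisable, here is a rough sketch of how the same two primitives could be wrapped in a synthesisable entity. The entity name ram_8k_x4 and the generate loop are my own choices for illustration. The WRITE_MODE generic is shown with its default value only to make the setting visible.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
library unisim;
use unisim.vcomponents.all;

--Hypothetical synthesisable wrapper: 8k x 4 single-port RAM from two RAMB16_S2 primitives.
entity ram_8k_x4 is
    port (
        clk  : in  std_logic;
        en   : in  std_logic;
        ssr  : in  std_logic;
        we   : in  std_logic;
        addr : in  std_logic_vector(12 downto 0);
        din  : in  std_logic_vector(3 downto 0);
        dout : out std_logic_vector(3 downto 0)
    );
end ram_8k_x4;

architecture structural of ram_8k_x4 is
begin

    --i = 0 handles data bits 1 downto 0, i = 1 handles data bits 3 downto 2.
    gen_slices : for i in 0 to 1 generate
        ram_i : RAMB16_S2
            generic map (
                WRITE_MODE => "WRITE_FIRST"  --default mode, shown here for visibility
            )
            port map (
                DO   => dout(2*i+1 downto 2*i),  -- 2-bit Data Output
                ADDR => addr,                    -- 13-bit Address Input
                CLK  => clk,                     -- Clock
                DI   => din(2*i+1 downto 2*i),   -- 2-bit Data Input
                EN   => en,                      -- RAM Enable Input
                SSR  => ssr,                     -- Synchronous Set/Reset Input
                WE   => we                       -- Write Enable Input
            );
    end generate;

end structural;
```

The testbench can then instantiate this one entity instead of the two primitives, and the same entity can be used in the real design.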


Thursday, August 5, 2010

Random number generator in VHDL(cont from a prev post)

    This post is a continuation of my previous post (which is now removed from the blog) about a random number generator in VHDL. Last time I used an LFSR (linear feedback shift register) to create a random sequence. But as Chris, one of my blog readers, pointed out, the code given in that post was partially wrong. The problem was that the tap values I had taken for the feedback of the shift register were wrong. This resulted in a non-maximum-length sequence. For example, as Chris pointed out, in the older code with a register size of 32 bits the sequence period is 2^21-1 and not 2^32-1 as claimed.

    So I have written new code which uses the correct tap values to ensure that the generated sequence is of maximum length. The project is uploaded at opencores.org and can be downloaded for free. The code takes care of register sizes from 3 bits to 168 bits. The tap values were taken from a Xilinx document about LFSRs.
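
To show what the tap values mean in practice, here is a small 8-bit sketch. It is not the code from the opencores project: the entity name lfsr8 and the ports are made up for illustration, and the taps 8, 6, 5, 4 with XNOR feedback are the values I recall from the Xilinx LFSR table, so please verify them against the document before relying on them.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

--Hypothetical 8-bit LFSR with XNOR feedback from taps 8, 6, 5 and 4.
entity lfsr8 is
    port (
        clk : in  std_logic;
        rst : in  std_logic;                     --synchronous reset to the seed
        q   : out std_logic_vector(7 downto 0)
    );
end lfsr8;

architecture rtl of lfsr8 is
    --Bits are numbered 1 to 8 so that they match the tap numbering.
    signal sr : std_logic_vector(8 downto 1) := (others => '0');
begin

    process (clk)
        variable fb : std_logic;
    begin
        if rising_edge(clk) then
            if rst = '1' then
                --All zeros is a valid seed with XNOR feedback (all ones is the lock-up state).
                sr <= (others => '0');
            else
                fb := not (sr(8) xor sr(6) xor sr(5) xor sr(4));
                --Shift towards bit 8 and feed the new bit into bit 1.
                sr <= sr(7 downto 1) & fb;
            end if;
        end if;
    end process;

    q <= sr;

end rtl;
```

With a correct tap set the register should step through 2^8-1 = 255 different states before repeating, which is easy to confirm in simulation. A wrong tap set gives a shorter cycle, which is exactly the problem the older code had.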

    The project can be downloaded from here. After downloading, extract the contents of the file. The code and documentation are available in the folder named "trunk".
Currently the project is in the alpha stage. Please let me know if you find any bugs in the code or have any comments.
I hope the project is useful.