VHDL coding tips and tricks: 2012

Wednesday, July 11, 2012

Synthesised code is too big for the fpga device - What to do?

After a lot of hard work, you have completed your HDL project. The simulation results verify that the code is functionally correct. To check how it performs in hardware, you synthesise the design. To your bad luck, you realize that the code you just wrote is too big for the FPGA. What can you do? Don't panic. There are a few ways you can tackle this problem.

1) If possible, choose a larger FPGA device:

It's the simplest thing you can do. Check whether the lab or a friend has a larger FPGA device that can fit your design. If you don't need to test the design in hardware, and just want to see the synthesis results, then simply select the largest device available in the list.

2) Is the FPGA out of pins?

Sometimes the synthesis tool will give an "Out of resources" warning if the design has more signals in its port list than the device has pins. This happens when you try to input or output large arrays or vectors.
In such cases, use a multiplexed input or output system. Rather than inputting everything in one go, do it in stages. The post "Not enough I/O pins in the FPGA board for your design?" further down this page works through an example.

3) Changing synthesis tool settings:

By default, the synthesis tool typically tries to optimize your design for speed, or for a balance of speed and resource usage. But you can change this setting so that the tool optimizes for lower resource usage (area). This may reduce the speed a little, but may significantly reduce the resource usage.

4) Re-use of resources:

Analyze the design carefully and see whether any parts of it can time-share resources. To do this, you have to synchronize the whole design with a clock.

Time sharing means using the same resource for several operations of the same kind, such as additions or multiplications. Suppose you want to do an operation like,

y  = a+b+c+d;   which uses 3 adder circuits.

then split the above operation over 3 clock cycles like this,

y= a + b;  --in first clock cycle.
y= y + c;  --in second clock cycle.
y= y + d;  --in third clock cycle.

This way only one adder is used for the whole operation. It increases the time taken to generate the output, but reduces logic usage.
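
The three-cycle schedule above can be sketched in VHDL roughly as follows. This is only an illustration under stated assumptions: the entity name shared_adder, the 8-bit unsigned inputs, and the start/done handshake are all hypothetical and not part of the original design. The inputs are assumed to stay stable for the three cycles after start is pulsed.

```vhdl
library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.NUMERIC_STD.ALL;

-- Hypothetical example: y = a + b + c + d computed with ONE adder over 3 clocks.
entity shared_adder is
    port ( clk        : in  std_logic;
           start      : in  std_logic;             -- pulse high for one clock to begin
           a, b, c, d : in  unsigned(7 downto 0);  -- must stay stable while busy
           y          : out unsigned(9 downto 0);  -- 4 x 255 fits in 10 bits
           done       : out std_logic);            -- high for one clock when y is valid
end shared_adder;

architecture Behavioral of shared_adder is
    signal acc  : unsigned(9 downto 0) := (others => '0');
    signal step : integer range 0 to 2 := 0;
    signal op_a, op_b, sum : unsigned(9 downto 0);
begin
    -- operand multiplexers feeding the single adder
    op_a <= resize(a, 10) when step = 0 else acc;
    with step select
        op_b <= resize(b, 10) when 0,
                resize(c, 10) when 1,
                resize(d, 10) when others;

    sum <= op_a + op_b;   -- the only adder in the design

    process(clk)
    begin
        if rising_edge(clk) then
            done <= '0';
            if step = 0 then
                if start = '1' then
                    acc  <= sum;   -- a + b
                    step <= 1;
                end if;
            elsif step = 1 then
                acc  <= sum;       -- (a + b) + c
                step <= 2;
            else
                acc  <= sum;       -- (a + b + c) + d
                step <= 0;
                done <= '1';
            end if;
        end if;
    end process;

    y <= acc;
end Behavioral;
```

The adder is written once, outside the clocked process, so the sharing does not depend on the synthesis tool merging separate "+" operators. The price is the two operand multiplexers and the small step counter.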

5) Look for any mathematical simplifications:

Analyze the mathematical formula you are implementing and look for any simplification. For instance take this operation,

y=x / 5;

In digital hardware, a division circuit is bigger than a multiplication circuit. So make a small change in the formula like this,

y = x * (1/5) = x * 0.2;
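
Since 0.2 cannot be used directly as a real value in synthesisable code, one common way to apply this idea is to scale the reciprocal to an integer constant. The sketch below assumes an 8-bit unsigned input and approximates 1/5 by 205/1024; the entity name div_by_5 and the chosen constant are illustrative assumptions, not taken from the original post.

```vhdl
library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.NUMERIC_STD.ALL;

-- Hypothetical example: y = x / 5 (integer part) without a divider.
entity div_by_5 is
    port ( x : in  unsigned(7 downto 0);
           y : out unsigned(7 downto 0));
end div_by_5;

architecture Behavioral of div_by_5 is
    -- 205 / 1024 = 0.2002, a scaled approximation of 1/5
    constant RECIP : unsigned(7 downto 0) := to_unsigned(205, 8);
    signal product : unsigned(15 downto 0);
begin
    product <= x * RECIP;                   -- 8 bit x 8 bit gives a 16 bit product
    -- dropping the lower 10 bits divides the product by 1024
    y <= resize(product(15 downto 10), 8);
end Behavioral;
```

For inputs up to 255 the approximation error stays below one fifth, so the truncated result matches the integer part of x / 5. Wider inputs need a wider constant, which is worth checking in simulation before relying on it.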

6) Simplify design based on nature of inputs:

The code may be written for generic use. But in real cases, the range of inputs may be small and predictable. In such cases you can simplify the formula further.

One good example is multiplication or division of a variable by a number which is a power of 2. If the multiplier or divisor is a power of 2, you can implement the operation using a left shift or a right shift respectively. This is an excellent optimization method in some cases.
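
As a small illustration of the shift trick, the sketch below multiplies an unsigned value by 8 and divides it by 4 using the shift functions from numeric_std. The entity and port names are hypothetical, and only unsigned values are considered here.

```vhdl
library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.NUMERIC_STD.ALL;

-- Hypothetical example: power-of-2 scaling with shifts instead of * and /.
entity pow2_scale is
    port ( x       : in  unsigned(7 downto 0);
           times_8 : out unsigned(10 downto 0);  -- x * 8 needs 3 extra bits
           div_4   : out unsigned(7 downto 0));  -- x / 4, remainder discarded
end pow2_scale;

architecture Behavioral of pow2_scale is
begin
    -- widen first so the left shift cannot push bits out of the vector
    times_8 <= shift_left(resize(x, 11), 3);
    -- a right shift by 2 on an unsigned value is integer division by 4
    div_4   <= shift_right(x, 2);
end Behavioral;
```

A shift by a constant amount is just rewiring, so it costs essentially no logic.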
 

Thursday, June 28, 2012

How to Mix VHDL and Verilog files in your design

Sooner or later you will come across this question in your FPGA design career. There are times when you find the right modules on the web, but can't use them just because you are using a different HDL. But you don't need to be disappointed. Instantiating VHDL components in Verilog modules, or vice versa, is a simple process. Let me show it with an example.

Case 1 -   Instantiating VHDL components in Verilog modules:

  For the sake of example, take the synchronous D flip-flop VHDL code I wrote some time ago. Suppose I want to write a Verilog module that instantiates two of these D flip-flops. You can simply instantiate the VHDL entity the same way you would instantiate a Verilog module. See the code:

module test(
    output [1:0] Q,
    input Clk,
    input CE,
    input RESET,
    input SET,
    input [1:0] D
    );

// module instances need an instance name (ff0, ff1)
example_FDRSE ff0 (Q[0],Clk,CE,RESET,D[0],SET);
example_FDRSE ff1 (Q[1],Clk,CE,RESET,D[1],SET);

endmodule



Case 2 -   Instantiating Verilog modules in VHDL components:


  This case is also straightforward. Declare the Verilog module as a component and instantiate it as you normally would with a VHDL entity.

Take this Verilog module for instance,

module a1(
    output Q,
    input [1:0] D
    );

and(Q,D[0],D[1]);

endmodule




A simple VHDL code for instantiating this Verilog module can look like this:

library IEEE;
use IEEE.STD_LOGIC_1164.ALL;

entity test is 
   port(
      Q : out std_logic_vector(1 downto 0);
      D :in  std_logic_vector(3 downto 0)
   );
end test;

architecture Behavioral of test is 

component a1 is 
   port(
      Q : out std_logic;
      D :in  std_logic_vector(1 downto 0)
   );
end component;

begin  

a11 : a1 port map(Q(0),D(1 downto 0));
a22 : a1 port map(Q(1),D(3 downto 2));

end Behavioral;



Never thought mixing VHDL and Verilog files was so easy? But it is!

Saturday, April 28, 2012

Not enough I/O pins in the FPGA board for your design?

Thanks to all the hard work you have done, you have successfully completed writing your first VHDL code. It did great in the functional simulation. Now it's time to test it on an FPGA board.

But looking at the board in hand, you realize that there are not enough input or output pins on the board to test the design. As you may have seen, many basic FPGA boards have 8 switch inputs, 4 push buttons and 8 LEDs. There are other ways to increase I/O, such as using a seven segment display, VGA monitor, RS232, DAC or ADC. But these features may make the design pretty complex and time consuming.

In such cases we can simply use the basic LEDs or switches in a multiplexed manner. In this article I show how to tackle the problem when you are out of input pins. The same concept applies to output pins as well.

Our design is named my_design and has a 32 bit input and an 8 bit output. On a typical FPGA board we don't have 32 switches or push buttons. Suppose we have 8 switches. The idea is to apply the input in four stages of 8 bits each. This is how you do it.

1st stage : Set switches for 1st byte(LSB), press the push button.
2nd stage : change switches for 2nd byte and press the push button again.
3rd stage : change switches for 3rd byte, press the push button again.
4th stage : change switches for 4th byte(MSB) and press the push button again.

The code for my_design:

library IEEE;
use IEEE.STD_LOGIC_1164.ALL;

entity my_design is
    Port ( input : in  STD_LOGIC_VECTOR (31 downto 0);
           output : out  STD_LOGIC_VECTOR (7 downto 0));
end my_design;

architecture Behavioral of my_design is

begin

output <= input(31 downto 24) and input(23 downto 16)
        and input(15 downto 8) and input(7 downto 0);

end Behavioral;

  The code just does the AND operation between the 4 bytes of the 32 bit number entered.

Now the code for reducing the number of inputs. I call this module a wrapper module. The job of this module is to get the input in stages, concatenate the bytes together and apply the result to the instantiated my_design.

Code for wrapper module :

library IEEE;
use IEEE.STD_LOGIC_1164.ALL;

entity wrapper is
    Port ( Clk : in  STD_LOGIC;
              push : in STD_LOGIC;
           input : in  STD_LOGIC_VECTOR (7 downto 0);
           output : out  STD_LOGIC_VECTOR (7 downto 0));
end wrapper;

architecture Behavioral of wrapper is

component my_design
port(   input : in  STD_LOGIC_VECTOR (31 downto 0);
      output : out  STD_LOGIC_VECTOR (7 downto 0));
end component;     

--state machine type
type stype is (idle,get_byte,delay);
signal s : stype := idle;
signal c1 : integer range 0 to 4 := 0;         --byte counter
signal c2 : integer range 0 to 25000000 := 0;  --delay counter
signal temp_reg : std_logic_vector(31 downto 0) := (others => '0');
       
begin

uut : my_design port map
        (input => temp_reg, --concatenated signal
        output => output    );

process(Clk)
begin
    if(rising_edge(clk)) then
        case s is
            when idle =>
                if(push = '1') then
                    s <= get_byte;
                    c1 <= c1+1;
                end if;
            when get_byte =>
                temp_reg( (8*c1-1) downto (8*(c1-1)) ) <= input;
                s <= delay;
            when delay => --delay for a time gap.
            --this delay is required to avoid the same byte getting
            --registered into temp_reg for a single push button click.
                c2 <= c2+ 1;
                if(c2=25000000) then --for a 50 mhz clock, this generates a 0.5 sec delay.
                    c2 <= 0;
                    s <= idle;
                    if(c1=4) then
                        c1 <= 0;
                    end if;
                end if;
        end case;
    end if;
end process;

end Behavioral;


I am using a state machine in the code to get this done. The state machine has 3 states.
1) idle - the system waits for a push button click.
2) get_byte - the system reads the switch inputs and stores them in temp_reg.
3) delay - the system waits for a fixed time (here 0.5 sec) doing nothing. This is to avoid registering the same input twice.

A testbench was created for testing the design.

[Figure: simulation waveforms of the wrapper module]




Note :- In a similar way you can also use multiplexed outputs.
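
A minimal sketch of the output side, assuming a 32 bit result that has to be inspected on 8 LEDs, with two switches selecting which byte is shown. The entity name out_mux and all port names are hypothetical and not part of the original design.

```vhdl
library IEEE;
use IEEE.STD_LOGIC_1164.ALL;

-- Hypothetical example: view a 32 bit value one byte at a time on 8 LEDs.
entity out_mux is
    port ( result : in  std_logic_vector(31 downto 0);  -- wide value to inspect
           sel    : in  std_logic_vector(1 downto 0);   -- two switches pick the byte
           leds   : out std_logic_vector(7 downto 0));
end out_mux;

architecture Behavioral of out_mux is
begin
    with sel select
        leds <= result(7 downto 0)   when "00",    -- 1st byte (LSB)
                result(15 downto 8)  when "01",
                result(23 downto 16) when "10",
                result(31 downto 24) when others;  -- 4th byte (MSB)
end Behavioral;
```

Unlike the input wrapper, this needs no clock or state machine, because nothing has to be remembered between selections.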

Sunday, April 22, 2012

Tips for an Error-free Functional Simulation

Getting VHDL code to work in functional simulation is not always an easy task. This article covers some tips for quickly pointing out the errors in the code and making your life easier.

  1. Create a proper sensitivity list. Sometimes you may have to add other control signals (other than the clock) to your sensitivity list to get it working.
  2. Initialize signals and variables correctly. If they are not initialized (normally they are set to '0'), they will appear as "U" in the simulation waveform.
  3. If you see "X" in the waveform, it usually indicates that the same signal is being driven from more than one place with conflicting values. Driving the signal from a single process will normally take out this bug.
  4. If you have arrays in the design, check for out-of-bounds errors. These happen when you read or write an index outside the declared range of the array.
  5. If-elsif chains are error prone. Always try to cover all the conditions of an if-elsif. If a particular condition is not covered, the value will remain unchanged. If you don't want this to happen, assign the signal in an else branch.
  6. Within a process, a signal assignment only schedules the new value. The signal is actually updated when the process suspends, so later lines still read the old value, and if the same signal is assigned twice the last assignment wins. Variables are updated immediately, so for them the order of the lines matters directly: line 1 is executed first, line 2 second and so on.
  7. One way to debug the code is to force one or more signals to constants and test the design. This will help you localize the error.
  8. A write to a RAM location is not visible immediately. Account for this when reading and writing the same location in the same clock cycle: the data read will typically be the value written in an earlier clock cycle.
  9. Try synthesising the design. The synthesis tool may give warnings or errors which point you in the right direction for solving the error in the functional simulation.
  10. When using components in the design, use named association in the port map, so that you don't accidentally connect the wrong signals to the component ports.
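
The difference between signals and variables in tip 6 is easiest to see in a tiny simulation. The sketch below is a hypothetical testbench-style example (all names are made up for illustration) in which the same two-line pattern is written once with signals and once with a variable.

```vhdl
library IEEE;
use IEEE.STD_LOGIC_1164.ALL;

-- Hypothetical simulation-only example for comparing signals and variables.
entity sig_vs_var is
end sig_vs_var;

architecture sim of sig_vs_var is
    signal clk       : std_logic := '0';
    signal a_sig     : integer := 1;
    signal b_sig     : integer := 0;
    signal b_var_out : integer := 0;
begin
    clk <= not clk after 5 ns;   -- free-running clock, first rising edge at 5 ns

    process(clk)
        variable a_var : integer := 1;
    begin
        if rising_edge(clk) then
            -- signals: a_sig is only scheduled to change here,
            -- so b_sig is assigned the OLD value of a_sig
            a_sig <= a_sig + 1;
            b_sig <= a_sig;

            -- variable: a_var changes immediately,
            -- so b_var_out is assigned the NEW value of a_var
            a_var := a_var + 1;
            b_var_out <= a_var;
        end if;
    end process;
end sim;
```

After the first rising edge, a_sig is 2 while b_sig is 1 (one step behind), whereas b_var_out is already 2. Run it for a few clock cycles only, since the clock never stops on its own.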

Thursday, January 19, 2012

Real data types and Synthesisability - Part 1

First of all, sorry that I haven't updated this blog for so long. To make up for my negligence towards readers, I have decided to write a post on one of the most common problems a VHDL coder may face: how to deal with real type signals in VHDL when you have to create a synthesisable design.

The truth is that synthesis tools generally do not support signals of real type, so a design that relies on them cannot be synthesised as written. The way to work around this problem is to first convert the real values into an equivalent binary format and then write custom functions to deal with it.

Generally, a real number can be represented in binary in two formats: fixed point and floating point. In this article I will talk only about fixed point formats.

When it comes to fixed point, I prefer the Q format. In Q format, we state the number of integer bits and fractional bits. The number of these bits depends on the actual range and resolution of the real numbers you want to deal with. A Q format is written as Qm.n, where m is the number of integer bits and n is the number of fractional bits. In addition there is a sign bit as the MSB (most significant bit), which makes the total size of the binary number m+n+1.

Range and resolution of a binary number in Qm.n format: 

  • its range is $[-2^{m},\ 2^{m} - 2^{-n}]$
  • its resolution is $2^{-n}$

For example, for a Q2.6 number,
range is $[-2^{2},\ 2^{2} - 2^{-6}] = [-4,\ 3.984375]$.
resolution is $2^{-n} = 2^{-6} = 0.015625$.

Plan before you decide the values of m and n:

For accuracy and ease of coding, it's better to have high values for m and n. But higher values of m and n mean that your binary number will be wider and hence use more resources. For devices like the Spartan 3, it's better to stick to an 8 bit binary number if possible. Find your own optimized balance between accuracy and size of design.

Examples for Q format.

3.5 in Q2.2 format = "0.11.10"  = "01110".
3.5 in Q3.5 format = "0.011.10000" = "001110000".
3.4 in Q2.4 format = "0.11.0110" = "0110110".
3.4 in Q2.6 format = "0.11.011001" = "011011001".
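
The bit patterns above can be reproduced by hand: multiply the value by $2^{n}$, drop the fraction, and write the resulting integer in binary using m+n+1 bits. As a worked example for 3.4 in Q2.6 (note that the fraction is truncated here, not rounded):

```latex
\begin{aligned}
3.4 \times 2^{6} &= 217.6 \\
\lfloor 217.6 \rfloor &= 217 = 011011001_{2} \\
217 \times 2^{-6} &= 3.390625
\end{aligned}
```

The stored value 3.390625 differs from 3.4 by 0.009375, which is below the Q2.6 resolution of 0.015625.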

How to easily convert a number to Q format?

As far as I know, there is no software tool available for general conversion between a real number and a Q format of any size. Some calculators can do this job, but the fastest way is to write a VHDL snippet.

For this purpose we can use the fixed point package available in the ieee_proposed library. This package doesn't work with most synthesisers, but it can be used in testbench code to convert real numbers into Q format.

Follow these steps:
  1. Store the real numbers, which you want to convert to Q format, in a text file.
  2. Write a VHDL snippet to read the real numbers one by one and store them in a variable array named, say, r. The post "Reading and writing real numbers using Files - Part 3" further down this page shows one way to read and write a text file.
  3. Now declare another array which has elements of sfixed type. One by one, convert the values in r to this sfixed type using the to_sfixed function. See the example below:

    variable r : real;
    variable s : sfixed(m downto -n);
    s := to_sfixed(r,s); --convert r to type of s and store it in variable s.
  4. Write a vhdl snippet to write the values in s to another text file.
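
Put together, the four steps might look like the simulation-only sketch below. It is an illustration under assumptions: the file names reals.txt and q_format.txt, the entity name real_to_q and the Q2.6 size are all hypothetical, and it assumes the fixed point package has been compiled into a library named ieee_proposed as described above. Values are converted one at a time instead of being stored in arrays, to keep the example short.

```vhdl
library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
library ieee_proposed;
use ieee_proposed.fixed_pkg.all;
library std;
use std.textio.all;

-- Hypothetical testbench: convert each real in a text file to a Q2.6 bit string.
entity real_to_q is
end real_to_q;

architecture sim of real_to_q is
begin
    process
        file fin  : text;
        file fout : text;
        variable lin, lout : line;
        variable r : real;
        variable s : sfixed(2 downto -6);  -- Q2.6: sign + 2 integer + 6 fractional bits
    begin
        file_open(fin,  "reals.txt",    read_mode);   -- assumed input file name
        file_open(fout, "q_format.txt", write_mode);  -- assumed output file name

        while not endfile(fin) loop
            readline(fin, lin);
            read(lin, r);            -- one real number per line
            s := to_sfixed(r, s);    -- convert r to the size of s

            -- write the bits MSB first as plain '0'/'1' characters
            for i in s'range loop
                if s(i) = '1' then
                    write(lout, character'('1'));
                else
                    write(lout, character'('0'));
                end if;
            end loop;
            writeline(fout, lout);
        end loop;

        file_close(fin);
        file_close(fout);
        wait;
    end process;
end sim;
```

One thing to watch for: to the best of my knowledge to_sfixed rounds by default, so the last bit can differ from a hand calculation that truncates.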

Matlab for file handling:

I use Matlab to manipulate text files easily. Using Matlab you can read or write text files in csv format. Sometimes I have the real values stored in an MS Excel file. To convert it into a text file, I just have to save it as a csv file, use the Matlab Import feature and then use the dlmwrite command in Matlab to write the text file. This will save you a lot of time and frustration.

Note:- In the next part, I will talk about how to write custom arithmetic functions for fixed point format numbers.

Friday, January 13, 2012

Reading and writing real numbers using Files - Part 3

After the previous posts on file handling, I have come up with another way to read and write files.

In this article I will show how to read a text file containing real numbers and store the square roots of these numbers in another text file. The input file is named "1.txt" and the output file is named "2.txt".

Contents of 1.txt:

12.23
34.4343
23.11
5.0
25.0
49.0
81.88
1000000.0
121.0
78.9

Contents of 2.txt:

3.497142e+00
5.868075e+00
4.807286e+00
2.236068e+00
5.000000e+00
7.000000e+00
9.048757e+00
1.000000e+03
1.100000e+01
8.882567e+00

VHDL code:

library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.MATH_REAL.ALL;
library std;
use std.textio.all;

entity file_handle is
end file_handle;

architecture Behavioral of file_handle is

type real_array is array(1 to 10) of real;

begin

process

variable line_var : line;
file text_var : text;
variable r : real_array;


begin        

       
   --Open the file in read mode.
   file_open(text_var,"1.txt",read_mode);
   --run the loop 10 times to read 10 real values from the file.
   for i in 1 to 10 loop
      --make sure it's not the end of file.
      if(NOT ENDFILE(text_var)) then
         readline(text_var,line_var);   --read the current line.
         --extract the real value from the read line and store it in the variable.
         read(line_var,r(i));
      end if;
   end loop;
   file_close(text_var); --close the file after reading.

   --Write the square root values of variable 'r' to another file.
   file_open(text_var,"2.txt",write_mode);
   --run the loop 10 times to write 10 real values to the file.
   for i in 1 to 10 loop
      write(line_var,sqrt(r(i))); --sqrt is a function for finding the square root.
      writeline(text_var,line_var);
   end loop;
   file_close(text_var);

  wait;

end process;

end Behavioral;

This way of reading and writing files is very helpful in many situations. In my next post I will show an example based on this code snippet.