VHDL coding tips and tricks: VHDL: Synthesisable Code for Finding Square Root (With Testbench!)

Thursday, March 18, 2010

VHDL: Synthesisable Code for Finding Square Root (With Testbench!)

UPDATE: A much more Synthesizable and Clocked Square Root Calculator is available here.

As you are probably aware, there is no built-in library function available in VHDL for finding the square root of an integer. There are several algorithms available for this, and in this article I have chosen to implement one of them using VHDL.

What I have used in this code is Non-Restoring Square Root algorithm. If you want to understand the specifics of it, you can read it in this paper.

I have used VHDL procedure to implement this algorithm. The whole function is combinatorial. So I have to warn you that its a bit of a resource eater, but you will get the result in just one clock cycle. The procedure is declared as follows:

procedure sqrt(variable d : in unsigned(31 downto 0);

            variable root : out unsigned(15 downto 0);

            variable remainder : out unsigned(15 downto 0)) is

The procedure takes one 32 bit unsigned integer and outputs a 16 bit square root and a 16 bit remainder. What I meant by remainder here is simply the difference between the input integer and square of the output square root.

Remainder = d - root*root;

The block diagram of the algorithm is given below(taken from this paper):

Non-Restoring Square Root algorithm in vhdl with testbench

Here, D is the unsigned input number. R is the remainder of the operation for non-perfect squares. Q contains the square root of D.

I have written the VHDL procedure and directly used it within a self checking testbench. The testbench can be used to easily verify the working of any number of inputs without you needing to do any manual verification. Thats why its called a self checking testbench.

Square Root procedure with Testbench:

--library declarations
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;
--we need this library to use random number generator function.
use ieee.math_real.all;
 
--entity for testbenches are always empty
entity tb_square_root is
end tb_square_root;
 
architecture behavior of tb_square_root is 
 
--procedure for finding the square root of a 32 bit unsigned number.
--'d' is the input number, 'root' is the square root, 
--'remainder' is 'd minus square_of_squareRoot'.
 procedure sqrt(variable d : in unsigned(31 downto 0); 
        variable root : out unsigned(15 downto 0); 
        variable remainder : out unsigned(15 downto 0)) is
    variable a : unsigned(31 downto 0):=d;  --original input.
    variable q : unsigned(15 downto 0):=(others => '0');  --result.
    --'left' and 'right' are inputs to adder/sub. 'r' is remainder.
    variable left,right,r : unsigned(17 downto 0):=(others => '0');  
    variable i : integer:=0;  --loop index
begin
    for i in 0 to 15 loop  --iterate 16 times.
        right := q & r(17) & '1';  --'&' is used for concatenation in VHDL
        left := r(15 downto 0) & a(31 downto 30);
        a := a(29 downto 0) & "00";  --shifting left by 2 bit.
        if ( r(17) = '1') then
            r := left + right; --add
        else
            r := left - right; --subtract
        end if;
        q := q(14 downto 0) & not r(17);  --left shift and pad msb of 'r'
    end loop;
    root := q;  --assign output
    a := d-q*q;  --calculate remainder manually
    remainder := a(15 downto 0);  --take only lsb 16 bits for output.
end procedure;  --END of procedure
    
--internal signals used in the testbench. 
--They are used for checking the inputs and outputs in simulation waveform.
signal square : unsigned(31 downto 0);  
signal root : unsigned(15 downto 0);
signal remainder : unsigned(15 downto 0);

begin
 
-- Stimulus process
stim_proc: process
    variable seed1, seed2 : integer := 1;
    variable rand : real;
    variable int_rand : integer;
    variable a : unsigned(31 downto 0);  --original input.
    variable rooot : unsigned(15 downto 0):=(others => '0');  --result.
    variable r : unsigned(15 downto 0):=(others => '0');  --result.
begin
    --you can iterate as many times as you want. In each iteration different random numbers 
    --will be used for testing. The results are automatically checked for their correctness.
    --So we call this a 'self checking testbench'.
    for i in 0 to 1000 loop  
        --generate a random number for a.
        --Value of 'rand' is between 0 and 1. 
        --Multiply 'rand' by 2^N and then round down to get an integer.
        uniform(seed1, seed2, rand);  
        int_rand := integer(trunc(rand*real(2**30))); 
        a := to_unsigned(int_rand,32);
        --call the 'sqrt' procedure. All inputs and outputs are variables here.
        sqrt(a,rooot,r);
        --check if the result is correct.
        --two conditions are checked to make sure that the result is correct. 
        --1st we check that the input is greater than or equal to square of 'output square root'.
        assert (a >= rooot*rooot and a < (rooot+1)*(rooot+1))report "ERROR:incorrect square root";
        --2nd we check that square of 'output square root' when added to 'remainder' is equal to input number.
        assert a = rooot*rooot+r report "ERROR:incorrect remainder";
        --assign the operands and results to signals so that we can see it in the simulation waveform.
        square <= a;
        root <= rooot;
        remainder <= r;
        --wait a bit before trying a new set of inputs. 
        --Otherwise we wont be able to see the different iterations in the simulation waveform.
        wait for 10 ns;
    end loop; 
    wait;  --wait endlessly. We are done with testing!
end process;

end;

The code is well commented, so you probabaly wont need me to explain any further. The code was simulated using modelsim. Let me share some screenshots of the simulation waveform below:

Simulation waveform from Modelsim:

Note :- The sqrt procedure was successfully synthesised using Xilinx Vivado 2023.2.

13 comments:

UnknownApril 2, 2010 at 10:25 PM
Hye GURU!!! I'm a beginner in VHDL. i have a coding for square root operation. i key in data for D and the answer appears in square root. so half of my job i done.. but the prob is, i need to interface with a keypad (to key in input for D) and display at LCD (output). somethng like a calculator.. can u help me with the VHDL code.. i'm using a 3x3 keypad and spartan 3 starter kit LCD. Help me plizz!!! thnks =)
ReplyDelete
Replies
UnknownMay 11, 2010 at 7:52 PM
I'm really confused about your construct ...

for i in 0 to 15 loop
[...]
q(15 downto 1) := q(14 downto 0);
q(0) := not r(17);
end loop;

As I know you would get 16 times
q(15 downto 1) := q(14 downto 0);
q(0) := not r(17);
as it is synthesized as parallel copies ...

Don't get, how you can iteratively shift your Q without clock and just with a for loop ...

I would be very interested about helping me out with this ...

Best regards
Thomas
ReplyDelete
Replies
vipinMay 13, 2010 at 8:08 PM
@thomas : Good Question.
As I am using a function for the square root operation,there is no clock involved.It is a purely a combinational circuit.When synthesised Xilinx ISE uses LUT's(Look up tables) and some MUX'x for implementing it in hardware.
If you try synthesising it yourself you can see that a group of LUT's and MUX'x are connected in a cascaded fashion.This means that the logic written inside 'for loop' is implemented 15 times to realize the logic without clock.As you can see that this uses so much resources,but using functions is an easy way to write codes.
If you are concerned about the over use of logic gates, use a clock to implement the logic.This may reduce the logic gate usage by approximately 15.
(Note :- Shifting is not done in parallel here.)
ReplyDelete
Replies
nagarajaFebruary 16, 2012 at 10:03 AM
while execution it is giving error . that is expecting entity or architecture near the function

with regards ,
nagaraja.v
ReplyDelete
Replies
UnknownNovember 14, 2012 at 1:42 PM
the error came from function.... so, plz check it and reply to me.........
ReplyDelete
Replies
vipinNovember 14, 2012 at 1:43 PM
This code is working. But if you still need help regarding anything I can help for a fee.
ReplyDelete
Replies
bedriJune 24, 2013 at 3:55 PM
I'm trying to use your code in my application with using Altera DE2. In my system fpga gets adc out from GPIO pins. Then fpga have to calculate sqrt. I have been using fixed_pkg.vhdl. When i send any fixed point number to your function after conversion to fixed point to unsigned, it works. But when i send the adc value, it doesn't. And says "Error: In lpm_divide megafunction, LPM_WIDTHN must be less than or equals to 64". Could you please help or advise something to me?
ReplyDelete
Replies
UnknownMarch 14, 2014 at 1:29 PM
Hi, i have the same problem with you. If you find a solution, please write for me. I need help.
ReplyDelete
Replies
UnknownFebruary 7, 2016 at 9:15 PM
Could this code be easily adapted to deal with 32 bit fixed point numbers?
ReplyDelete
Replies
UnknownOctober 14, 2016 at 12:25 AM
Does your code gives the output for non perfect square????? ..i.e; sqrt (50)= 7.07
ReplyDelete
Replies
UnknownDecember 27, 2019 at 3:24 PM
It works, just need to use it in in package.
ReplyDelete
Replies