UPDATE: A much more Synthesizable and Clocked Square Root Calculator is available here.

As you are probably aware, there is no built-in library function available in VHDL for finding the square root of an integer. There are several algorithms available for this, and in this article I have chosen to implement one of them using VHDL.

What I have used in this code is

**Non-Restoring Square Root algorithm**. If you want to understand the specifics of it, you can read it in this paper. I have used VHDL

*procedure*to implement this algorithm. The whole function is combinatorial. So I have to warn you that its a bit of a resource eater, but you will get the result in just one clock cycle. The procedure is declared as follows:procedure sqrt(variable d : in unsigned(31 downto 0);

variable root : out unsigned(15 downto 0);

variable remainder : out unsigned(15 downto 0))is

The

*procedure*takes one 32 bit unsigned integer and outputs a 16 bit square root and a 16 bit remainder. What I meant by remainder here is simply the difference between the input integer and square of the output square root.So

Remainder = d - root*root;

The block diagram of the algorithm is given below(taken from this paper):

Here,

Here,

**D**is the unsigned input number.**R**is the remainder of the operation for non-perfect squares.**Q**contains the square root of**D**.I have written the VHDL procedure and directly used it within a self checking testbench. The testbench can be used to easily verify the working of any number of inputs without you needing to do any manual verification. Thats why its called a self checking testbench.

__
Square Root procedure with Testbench:__

__Square Root procedure with Testbench:__

--library declarations library ieee; use ieee.std_logic_1164.all; use ieee.numeric_std.all; --we need this library to use random number generator function. use ieee.math_real.all; --entity for testbenches are always empty entity tb_square_root is end tb_square_root; architecture behavior of tb_square_root is --procedure for finding the square root of a 32 bit unsigned number. --'d' is the input number, 'root' is the square root, --'remainder' is 'd minus square_of_squareRoot'. procedure sqrt(variable d : in unsigned(31 downto 0); variable root : out unsigned(15 downto 0); variable remainder : out unsigned(15 downto 0)) is variable a : unsigned(31 downto 0):=d; --original input. variable q : unsigned(15 downto 0):=(others => '0'); --result. --'left' and 'right' are inputs to adder/sub. 'r' is remainder. variable left,right,r : unsigned(17 downto 0):=(others => '0'); variable i : integer:=0; --loop index begin for i in 0 to 15 loop --iterate 16 times. right := q & r(17) & '1'; --'&' is used for concatenation in VHDL left := r(15 downto 0) & a(31 downto 30); a := a(29 downto 0) & "00"; --shifting left by 2 bit. if ( r(17) = '1') then r := left + right; --add else r := left - right; --subtract end if; q := q(14 downto 0) & not r(17); --left shift and pad msb of 'r' end loop; root := q; --assign output a := d-q*q; --calculate remainder manually remainder := a(15 downto 0); --take only lsb 16 bits for output. end procedure; --END of procedure --internal signals used in the testbench. --They are used for checking the inputs and outputs in simulation waveform. signal square : unsigned(31 downto 0); signal root : unsigned(15 downto 0); signal remainder : unsigned(15 downto 0); begin -- Stimulus process stim_proc: process variable seed1, seed2 : integer := 1; variable rand : real; variable int_rand : integer; variable a : unsigned(31 downto 0); --original input. variable rooot : unsigned(15 downto 0):=(others => '0'); --result. variable r : unsigned(15 downto 0):=(others => '0'); --result. begin --you can iterate as many times as you want. In each iteration different random numbers --will be used for testing. The results are automatically checked for their correctness. --So we call this a 'self checking testbench'. for i in 0 to 1000 loop --generate a random number for a. --Value of 'rand' is between 0 and 1. --Multiply 'rand' by 2^N and then round down to get an integer. uniform(seed1, seed2, rand); int_rand := integer(trunc(rand*real(2**30))); a := to_unsigned(int_rand,32); --call the 'sqrt' procedure. All inputs and outputs are variables here. sqrt(a,rooot,r); --check if the result is correct. --two conditions are checked to make sure that the result is correct. --1st we check that the input is greater than or equal to square of 'output square root'. assert (a >= rooot*rooot and a < (rooot+1)*(rooot+1))report "ERROR:incorrect square root"; --2nd we check that square of 'output square root' when added to 'remainder' is equal to input number. assert a = rooot*rooot+r report "ERROR:incorrect remainder"; --assign the operands and results to signals so that we can see it in the simulation waveform. square <= a; root <= rooot; remainder <= r; --wait a bit before trying a new set of inputs. --Otherwise we wont be able to see the different iterations in the simulation waveform. wait for 10 ns; end loop; wait; --wait endlessly. We are done with testing! end process; end;

The code is well commented, so you probabaly wont need me to explain any further. The code was simulated using modelsim. Let me share some screenshots of the simulation waveform below:

__Simulation waveform from Modelsim:__

__Simulation waveform from Modelsim:__

Hye GURU!!! I'm a beginner in VHDL. i have a coding for square root operation. i key in data for D and the answer appears in square root. so half of my job i done.. but the prob is, i need to interface with a keypad (to key in input for D) and display at LCD (output). somethng like a calculator.. can u help me with the VHDL code.. i'm using a 3x3 keypad and spartan 3 starter kit LCD. Help me plizz!!! thnks =)

ReplyDeleteI'm really confused about your construct ...

ReplyDeletefor i in 0 to 15 loop

[...]

q(15 downto 1) := q(14 downto 0);

q(0) := not r(17);

end loop;

As I know you would get 16 times

q(15 downto 1) := q(14 downto 0);

q(0) := not r(17);

as it is synthesized as parallel copies ...

Don't get, how you can iteratively shift your Q without clock and just with a for loop ...

I would be very interested about helping me out with this ...

Best regards

Thomas

@thomas : Good Question.

ReplyDeleteAs I am using a function for the square root operation,there is no clock involved.It is a purely a combinational circuit.When synthesised Xilinx ISE uses LUT's(Look up tables) and some MUX'x for implementing it in hardware.

If you try synthesising it yourself you can see that a group of LUT's and MUX'x are connected in a cascaded fashion.This means that the logic written inside 'for loop' is implemented 15 times to realize the logic without clock.As you can see that this uses so much resources,but using functions is an easy way to write codes.

If you are concerned about the over use of logic gates, use a clock to implement the logic.This may reduce the logic gate usage by approximately 15.

(Note :- Shifting is not done in parallel here.)

while execution it is giving error . that is expecting entity or architecture near the function

ReplyDeletewith regards ,

nagaraja.v

the error came from function.... so, plz check it and reply to me.........

ReplyDeleteThis code is working. But if you still need help regarding anything I can help for a fee.

ReplyDeleteI'm trying to use your code in my application with using Altera DE2. In my system fpga gets adc out from GPIO pins. Then fpga have to calculate sqrt. I have been using fixed_pkg.vhdl. When i send any fixed point number to your function after conversion to fixed point to unsigned, it works. But when i send the adc value, it doesn't. And says "Error: In lpm_divide megafunction, LPM_WIDTHN must be less than or equals to 64". Could you please help or advise something to me?

ReplyDeleteHi, i have the same problem with you. If you find a solution, please write for me. I need help.

ReplyDeleteto bedri please

DeleteCould this code be easily adapted to deal with 32 bit fixed point numbers?

ReplyDeleteDoes your code gives the output for non perfect square????? ..i.e; sqrt (50)= 7.07

ReplyDeleteyou can make a shift left for the number 50 by 8 bits and after the sqrt shift the result by 4 bits right

Deleteso you will get the value you need

It works, just need to use it in in package.

ReplyDelete