Implement, compare and contrast THREE adders belonging to ONE of the following two sets:
Set 1:
A. Carry-Lookahead Adder
B. Conditional-Sum Adder
C. Hybrid Brent-Kung/Kogge-Stone Parallel Prefix Network Adder
Set 2:
A. Hybrid Carry-Lookahead/Carry-Select Adder
B. Multilevel Carry Select Adder based on 32-bit Ripple Carry Adders
C. Brent-Kung Parallel Prefix Network Adder
For each adder type, develop two designs:
1. combinational carry-propagate adder optimized for the minimum latency, and
2. pipelined carry-propagate adder optimized for the maximum throughput.
Make an effort to write the code and perform the analysis in a generic way, independent of the operand size, k.
Nevertheless, make sure that your code supports, and your conclusions hold for, at least the following operand size:
k = 256 bits.
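One way to check a generic-k design before writing any VHDL is to keep a bit-level software model of the chosen architecture and compare it with ordinary integer addition. The sketch below models a Brent-Kung prefix network (adder C of Set 2) for arbitrary k. The function name `brent_kung_add`, the optional carry-in, and the list-based signal representation are illustrative assumptions, not part of the assignment.

```python
def brent_kung_add(a, b, k, cin=0):
    """Bit-level model of a k-bit Brent-Kung prefix adder.

    Returns (sum modulo 2**k, carry_out). The name and the carry-in
    argument are assumptions made for this sketch.
    """
    # Per-bit propagate and generate signals.
    p = [((a >> i) ^ (b >> i)) & 1 for i in range(k)]
    g = [((a >> i) & (b >> i)) & 1 for i in range(k)]
    G, P = g[:], p[:]
    G[0] |= p[0] & cin          # fold the carry-in into bit 0

    # Up-sweep: build group signals over spans of 2, 4, 8, ... bits.
    d = 1
    while d < k:
        for i in range(2 * d - 1, k, 2 * d):
            G[i] |= P[i] & G[i - d]   # uses the old P[i]
            P[i] &= P[i - d]
        d *= 2

    # Down-sweep: complete the prefixes at the remaining positions.
    d //= 2
    while d >= 1:
        for i in range(3 * d - 1, k, 2 * d):
            G[i] |= P[i] & G[i - d]
            P[i] &= P[i - d]
        d //= 2

    # G[i] is now the carry out of bit i; the carry into bit i is G[i-1].
    carries = [cin] + G[:-1]
    s = 0
    for i in range(k):
        s |= (p[i] ^ carries[i]) << i
    return s, G[k - 1]
```

Because the loops depend only on k, the same model works for k = 256 and for sizes that are not powers of two, which makes it usable as a golden reference when generating expected results.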
For comparison, also develop
D. Ripple-Carry Adder of the same size described using a "+" sign in VHDL.
Assume that each adder is part of a bigger system located on the same chip. As a result, all operands are generated on the chip, and all results are consumed on the chip, without crossing the boundary of the integrated circuit.
Perform your implementation and analysis using the following technologies:
Xilinx Spartan 3 FPGAs (required, 20 points)
semicustom ASICs based on the 90 nm TCBN90G TSMC library of standard cells (optional, 5 bonus points)
Implement your adders using synthesizable RTL VHDL code portable between FPGAs and ASICs.
Synthesize your code and perform static timing analysis using tools appropriate for the given technology. For the FPGA implementation, perform placement and routing, and then static timing analysis on the placed-and-routed design.
Design Requirements
Your VHDL code for EACH adder should consist of three levels of the design hierarchy:
I. synthesizable code of the adder itself, with a clearly defined adder boundary,
II. synthesizable test circuit with ALL inputs and outputs of the adder stored in registers, in order to facilitate static timing analysis of your circuit during implementation,
III. non-synthesizable testbench.
The THREE adder types should
- have the same entity declaration at level I,
- share the test circuit at level II,
- share the testbench at level III,
- use different test vector files at level III.
The total number of inputs and outputs of your circuit at level II must not exceed the total number of I/O pins available in the smallest Xilinx Spartan 3 device capable of holding the adder. (Hint: You can use, for example, a 32-bit input data bus to load data into the operand registers and a 32-bit output data bus to read out the contents of the output register.)
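The hint above implies that each k-bit operand is transferred as a sequence of narrower bus words. A minimal software sketch of that split and reassembly is shown here, which can also be reused in a test-vector generator. The helper names `split_words` and `join_words`, and the least-significant-word-first ordering, are assumptions made for illustration; the assignment does not prescribe a word order.

```python
BUS_WIDTH = 32  # bus width taken from the hint; any other width also works


def split_words(value, k, w=BUS_WIDTH):
    """Split a k-bit value into ceil(k / w) bus words, least significant first.

    The word order is an assumption of this sketch.
    """
    n_words = -(-k // w)          # ceiling division
    mask = (1 << w) - 1
    return [(value >> (w * i)) & mask for i in range(n_words)]


def join_words(words, w=BUS_WIDTH):
    """Reassemble a value from bus words produced by split_words."""
    value = 0
    for i, word in enumerate(words):
        value |= word << (w * i)
    return value
```

With k = 256 and a 32-bit bus, each operand takes 8 bus transfers, so the level II circuit needs only a few dozen data pins plus control signals instead of several hundred.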
Dataflow description is a preferred design style for synthesizable portions of your code. Use behavioral description only if necessary (e.g., for description of flip-flops and registers).
Behavioral description is a preferred design style for your testbenches. Your testbenches should stimulate circuit inputs using multiple representative test vectors read from a file common to all adder types.
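A test-vector file of this kind can be produced by a short script. The sketch below mixes a few directed patterns that create long carry chains with seeded random operands. The function name `make_vectors`, the one-line-per-vector hexadecimal format (operand a, operand b, expected (k+1)-bit sum), and the particular directed patterns are all assumptions; which vectors actually exercise the critical path of a given adder should be confirmed against its static timing report.

```python
import random


def make_vectors(k, n_random=16, seed=1):
    """Return text lines 'a b sum' in hex for a k-bit adder (format assumed)."""
    ones = (1 << k) - 1
    alt = int("55" * ((k + 7) // 8), 16) & ones   # 0101...01 pattern
    directed = [
        (0, 0),
        (ones, 1),            # carry generated at bit 0, propagated through all higher bits
        (1, ones),            # same chain with the operands swapped
        (ones, ones),         # every bit position generates a carry
        (alt, alt ^ ones),    # every bit position propagates, none generates
    ]
    rng = random.Random(seed)  # fixed seed keeps the file reproducible
    rand = [(rng.getrandbits(k), rng.getrandbits(k)) for _ in range(n_random)]

    w_op = (k + 3) // 4        # hex digits per operand
    w_sum = (k + 4) // 4       # hex digits for the (k+1)-bit sum
    return [f"{a:0{w_op}x} {b:0{w_op}x} {a + b:0{w_sum}x}" for a, b in directed + rand]
```

Writing the file is then a matter of joining the returned lines with newlines, for example `open("vectors.txt", "w").write("\n".join(make_vectors(256)) + "\n")`, where the file name is hypothetical.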
Deliverables
(submitted through WebCT):
For the implemented adders:
1. ALL source files you have developed as a part of the project
2. test vectors, and a short description of how these test vectors were generated. Hint: You may use software (your own or public domain) to generate your test vectors. Your test vectors should be chosen so as to trigger the most critical paths of all implemented adders.
3. waveforms demonstrating the correct operation of each circuit for test vectors triggering the most critical path in a given adder (timing simulation in case of FPGAs)
4. full reports from static timing analysis
5. one table per technology summarizing the relative performance of each of the implemented adders in terms of
   - minimum latency
   - speedup in terms of latency compared to the ripple-carry adder implemented using a "+" sign
   - area
   - area increase vs. the ripple-carry adder implemented using a "+" sign
   - number of pipeline stages (1 for combinational adders)
   - maximum throughput (in the number of additions per second)
   - number of lines of VHDL code at level I
   - product latency * area
   - ratio throughput / area
6. two two-dimensional graphs per technology showing the performance of all implemented adders in terms of
   - latency and area (for adders optimized for the minimum latency)
   - throughput and area (for adders optimized for the maximum throughput)
   Hint: Use area as your X coordinate, and latency/throughput as your Y coordinate.
8. conclusions summarizing your recommendations regarding the choice of the best adder for the given optimization goal, technology and operand size.
9. explanation of possible relative differences between FPGA and ASIC technologies (optional)
10. list of encountered problems & difficulties, and unexplained behavior of your designs or design tools.
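The derived columns of the summary table in deliverable 5 follow directly from the clock period, the number of pipeline stages, and the area reported by the tools. The sketch below shows one way to compute them. It assumes that latency equals the number of stages times the minimum clock period and that a pipelined adder accepts a new pair of operands every clock cycle. The class and function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class AdderResult:
    name: str
    clock_period_ns: float   # minimum clock period from static timing analysis
    stages: int              # 1 for a combinational adder
    area: float              # e.g. slices or LUTs for FPGA, cell area for ASIC

    @property
    def latency_ns(self):
        # Assumption: latency = number of stages * minimum clock period.
        return self.stages * self.clock_period_ns

    @property
    def throughput_per_s(self):
        # Assumption: one addition completes every clock cycle.
        return 1e9 / self.clock_period_ns


def summarize(r, ref):
    """Derived table columns for adder r relative to the "+"-based reference ref."""
    return {
        "latency_ns": r.latency_ns,
        "speedup": ref.latency_ns / r.latency_ns,
        "area_increase": r.area / ref.area,
        "throughput_per_s": r.throughput_per_s,
        "latency_x_area": r.latency_ns * r.area,
        "throughput_per_area": r.throughput_per_s / r.area,
    }
```

For example, with placeholder numbers that are not measurements, `summarize(AdderResult("X", 5.0, 2, 300.0), AdderResult("ref", 20.0, 1, 100.0))` reports a latency of 10 ns, a speedup of 2, and an area increase of 3.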