University of Illinois at Urbana-Champaign Dept. of Electrical and Computer Engineering

ECE 120: Introduction to Computing

Analyzing and Optimizing the Bit-Sliced Comparator


© 2016 Steven S. Lumetta. All rights reserved.


# Area Heuristic for One Comparator Bit Slice is 20




# How Many Gate Delays to Z<sub>1</sub>?




# Extending from One Bit Slice to N Bit Slices

What happens in an N-bit design?

Say that A and B are available at time 0.




# Constant Inputs are Available Arbitrarily Early

#### What about the 0s on the right?

Available "forever" ... (time  $-\infty$ ).




# Use Bit Slice Timing to Calculate Times Between Slices

#### Now we must

- use the delays that we found for one bit slice
- to calculate times for inter-slice C values.

#### Recall that

- all **A** and **B** bits are available at **time 0**.

#### We found

- C<sub>1</sub> to Z<sub>1</sub>: 2 gate delays
- C<sub>0</sub> to Z<sub>0</sub>: 2 gate delays


# Calculate the Time at Which C<sup>M</sup> Becomes Available




# A More Detailed Version of Our Calculations

Grey is "not relevant," and green is maximum (time at which  $\mathbf{Z}_i$  is available).

| (bit slice 0)                      | A  | B  | $\mathbf{C_1}$ | $\mathbf{C_0}$ |
|------------------------------------|----|----|----------------|----------------|
| input available at                 | 0  | 0  | $-\infty$      | $-\infty$      |
| delay from input to $\mathbf{Z}_1$ | +2 | +2 | +2             |                |
| $\mathbf{Z}_1$ not available until | 2  | 2  | $-\infty$      |                |
| delay from input to $\mathbf{Z}_0$ | +2 | +2 |                | +2             |
| $\mathbf{Z}_0$ not available until | 2  | 2  |                | $-\infty$      |


# A More Detailed Version of Our Calculations

Grey is "not relevant," and green is maximum (time at which  $\mathbf{Z}_i$  is available).

| (bit slice 1)                      | A  | B  | $\mathbf{C_1}$ | $\mathbf{C_0}$ |
|------------------------------------|----|----|----------------|----------------|
| input available at                 | 0  | 0  | 2              | 2              |
| delay from input to $\mathbf{Z}_1$ | +2 | +2 | +2             |                |
| $\mathbf{Z}_1$ not available until | 2  | 2  | 4              |                |
| delay from input to $\mathbf{Z}_0$ | +2 | +2 |                | +2             |
| $\mathbf{Z}_0$ not available until | 2  | 2  |                | 4              |

# Generalize the Result to an N-Bit Comparator

 $C_1^0$  and  $C_0^0$  are available at time 2 (2 gate delays).\*

 $C_1^1$  and  $C_0^1$  are available at time 4.

When are  $C_1^{N-1}$  and  $C_0^{N-1}$  available (these are the answer for an N-bit comparator)?

N-bit answer is available at time 2N.

\*In the notes, the inverters are counted, so paths from A and B are slightly longer, and all timings are increased by 1.
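The per-slice tables can be replayed in a few lines of code. The sketch below is only an illustration: the function name `slice_output_times` and its parameters are made up here, and it follows the slides' convention of not counting inverters. Each slice's outputs become available at the later of "A/B ready plus the A/B-to-Z delay" and "incoming C ready plus the C-to-Z delay."

```python
def slice_output_times(n_slices, ab_delay=2, c_delay=2, ab_ready=0):
    """Return the time at which each slice's C outputs become available.

    Hypothetical helper: ab_delay and c_delay are the gate delays from
    A/B and from the incoming C values to the slice outputs.
    """
    c_in = float("-inf")  # the constant 0s feeding slice 0 are available "forever"
    times = []
    for _ in range(n_slices):
        out = max(ab_ready + ab_delay, c_in + c_delay)
        times.append(out)
        c_in = out  # this slice's outputs feed the next slice
    return times


print(slice_output_times(4))  # [2, 4, 6, 8]
```

The last entry is the time at which the N-bit answer is ready, which matches 2N for this design.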


# We May be Able to Improve Our Comparator Design

#### Can we do better?

(You should ask: better in what sense?)

#### Can we reduce delay?

- Unlikely with a bit-sliced design.
- Not easy to implement most functions with one gate.

#### Can we reduce area?

- Maybe ...
- Let's do some algebra.

# Use Algebra to Find Common Subexpressions (A'B, AB')

Start with  $Z_1 = AB' + AC_1 + B'C_1$

then use distributivity to pull out  $C_1$ :

$$Z_1 = AB' + (A + B')C_1$$

and use De Morgan's law, $A + B' = (A'B)'$, to rewrite the (A + B') factor as a NAND:

$$Z_1 = AB' + (A'B)'C_1$$

Similarly,  $Z_0 = A'B + (AB')'C_0$ 

Notice that we now reuse AB' and A'B.
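One way to double-check the algebra is to compare the expressions over all eight input combinations. The sketch below does that in Python. The function names are made up for this example, and the unfactored Z<sub>0</sub> expression (A'B + A'C<sub>0</sub> + BC<sub>0</sub>) is assumed by symmetry with Z<sub>1</sub>, since the slides only give the factored form.

```python
from itertools import product


def z1_sum_of_products(a, b, c1):
    # Z1 = AB' + AC1 + B'C1
    return (a and not b) or (a and c1) or (not b and c1)


def z1_factored(a, b, c1):
    # Z1 = AB' + (A'B)'C1
    return (a and not b) or (not (not a and b) and c1)


def z0_sum_of_products(a, b, c0):
    # Assumed by symmetry (swap the roles of A and B): Z0 = A'B + A'C0 + BC0
    return (not a and b) or (not a and c0) or (b and c0)


def z0_factored(a, b, c0):
    # Z0 = A'B + (AB')'C0
    return (not a and b) or (not (a and not b) and c0)


inputs = list(product([False, True], repeat=3))
z1_match = all(z1_sum_of_products(*v) == z1_factored(*v) for v in inputs)
z0_match = all(z0_sum_of_products(*v) == z0_factored(*v) for v in inputs)
print(z1_match, z0_match)  # True True
```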


# The New Implementation Uses Fewer Gates

The diagram below shows the new equations using NAND gates.

$$Z_1 = [\ (AB')'\ ((A'B)'C_1)'\ ]' = AB' + (A'B)'C_1$$

[Figure: NAND-gate implementation of the bit slice, with the single-bit core marked and the internal signals (AB')', (A'B)', and ((A'B)'C<sub>1</sub>)' labeled.]
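To see the NAND form at work, here is a small gate-level sketch in Python. Several things are assumptions made for illustration rather than taken from the slides: the function names, the Z<sub>0</sub> wiring (mirrored from Z<sub>1</sub>), the reading that Z<sub>1</sub> = 1 means A > B and Z<sub>0</sub> = 1 means A < B, and that slices are chained from the least significant bit upward with constant 0s entering on the right. Inverters are modeled as `1 - x` and not counted as gates.

```python
def nand(x, y):
    return 1 - (x & y)


def comparator_slice(a, b, c1, c0):
    """One bit slice built from six 2-input NAND gates."""
    n_ab = nand(a, 1 - b)            # (AB')'
    n_ba = nand(1 - a, b)            # (A'B)'
    z1 = nand(n_ab, nand(n_ba, c1))  # AB' + (A'B)'C1
    z0 = nand(n_ba, nand(n_ab, c0))  # A'B + (AB')'C0
    return z1, z0


def compare(a, b, n_bits):
    """Chain n_bits slices, least significant bit first."""
    c1 = c0 = 0  # constant 0s feed the first slice
    for i in range(n_bits):
        c1, c0 = comparator_slice((a >> i) & 1, (b >> i) & 1, c1, c0)
    return c1, c0


print(compare(5, 3, 4))  # (1, 0)
```

Both intermediate NAND outputs are shared between the Z<sub>1</sub> and Z<sub>0</sub> paths, which is where the gate savings come from.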


# Area Heuristic for the New Design is 12

Let's analyze area for the new design.

How many literals? 6

How many operators? 6 (NAND)





# Delay Analysis for the New Design

A to  $Z_1$ : 3 gate delays (ignoring NOT)

 $C_1$  to  $Z_1$ : 2 gate delays

B to  $Z_1$ : 3 gate delays
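These path lengths can be checked by walking the gate network for Z<sub>1</sub> = [ (AB')' ((A'B)'C<sub>1</sub>)' ]'. The sketch below is illustrative only: the gate names in `GATES` and the helper `longest_delay` are invented here, and inverters are ignored as on the slide, so A' and B' are treated as the inputs A and B themselves.

```python
# Each gate maps to the signals that feed it (inverters ignored).
GATES = {
    "n_ab": ["A", "B"],       # (AB')'
    "n_ba": ["A", "B"],       # (A'B)'
    "n_c1": ["n_ba", "C1"],   # ((A'B)'C1)'
    "Z1": ["n_ab", "n_c1"],   # final NAND
}


def longest_delay(src, dst):
    """Longest path, in gate delays, from input src to signal dst.

    Returns None when dst does not depend on src.
    """
    if src == dst:
        return 0
    if dst not in GATES:
        return None  # dst is a different primary input
    depths = [longest_delay(src, s) for s in GATES[dst]]
    depths = [d for d in depths if d is not None]
    return 1 + max(depths) if depths else None


for name in ("A", "B", "C1"):
    print(name, "to Z1:", longest_delay(name, "Z1"))
```

The longer A and B paths run through (A'B)' and the C<sub>1</sub> NAND before reaching the final gate; the direct path through (AB')' is only two gates.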



# Calculate the Time at Which C<sup>M</sup> Becomes Available

50% slower?!




# A More Detailed Version of Our Calculations

Grey is "not relevant," and green is maximum (time at which  $\mathbf{Z}_i$  is available).

| (bit slice 0)                      | A  | B  | $\mathbf{C_1}$ | $\mathbf{C_0}$ |
|------------------------------------|----|----|----------------|----------------|
| input available at                 | 0  | 0  | $-\infty$      | $-\infty$      |
| delay from input to $\mathbf{Z}_1$ | +3 | +3 | +2             |                |
| $\mathbf{Z}_1$ not available until | 3  | 3  | $-\infty$      |                |
| delay from input to $\mathbf{Z}_0$ | +3 | +3 |                | +2             |
| $\mathbf{Z}_0$ not available until | 3  | 3  |                | $-\infty$      |


# A More Detailed Version of Our Calculations

Grey is "not relevant," and green is maximum (time at which  $\mathbf{Z}_i$  is available).

| (bit slice 1)                      | A  | B  | $\mathbf{C_1}$ | $\mathbf{C_0}$ |
|------------------------------------|----|----|----------------|----------------|
| input available at                 | 0  | 0  | 3              | 3              |
| delay from input to $\mathbf{Z}_1$ | +3 | +3 | +2             |                |
| $\mathbf{Z}_1$ not available until | 3  | 3  | 5              |                |
| delay from input to $\mathbf{Z}_0$ | +3 | +3 |                | +2             |
| $\mathbf{Z}_0$ not available until | 3  | 3  |                | 5              |


# The Slice-to-Slice Paths are the Important Ones

 $C_1^0$  and  $C_0^0$  are available at time 3 (3 gate delays).\*

 $C_1^1$  and  $C_0^1$  are available at time 5.

When are  $C_1^{N-1}$  and  $C_0^{N-1}$  available (these are the answer for an N-bit comparator)?

N-bit answer is available at time 2N+1.

\*In the notes, the inverters are counted, so paths from A and B are slightly longer, and all timings are increased by 1.
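The 2N+1 figure follows from a short recurrence. As a sketch, write $t_k$ (notation introduced here, not in the slides) for the time at which $C_1^k$ and $C_0^k$ become available, with A and B ready at time 0, the constant inputs to slice 0 ready at time $-\infty$, and inverters not counted:

$$t_0 = \max(0 + 3,\ -\infty + 2) = 3$$

$$t_k = \max(0 + 3,\ t_{k-1} + 2) = t_{k-1} + 2 \quad (k \ge 1)$$

$$t_{N-1} = 3 + 2(N - 1) = 2N + 1$$

Only the first slice pays for the 3-gate-delay path from A and B. Every later slice adds just the 2-gate-delay path from C to Z, which is why the slice-to-slice paths dominate.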

# Overall: Much Better Area for Slightly More Delay

So the new design

- reduces area by about 40% (area 12N compared to area 20N).
- increases delay by 1 (2N+1 gate delays compared to 2N gate delays).
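To put numbers on the trade-off, the following sketch tabulates both designs for a few widths. The function name `area_and_delay` and the chosen values of N are arbitrary; the formulas are the ones above (area 20N vs. 12N, delay 2N vs. 2N+1).

```python
def area_and_delay(n_bits):
    """Area heuristic and gate delays for the original and the new design."""
    original = {"area": 20 * n_bits, "delay": 2 * n_bits}
    optimized = {"area": 12 * n_bits, "delay": 2 * n_bits + 1}
    return original, optimized


for n in (4, 8, 32):
    orig, new = area_and_delay(n)
    saving = 1 - new["area"] / orig["area"]
    slowdown = new["delay"] / orig["delay"] - 1
    print(f"N={n}: area {orig['area']} -> {new['area']} ({saving:.0%} smaller), "
          f"delay {orig['delay']} -> {new['delay']} ({slowdown:.1%} longer)")
```

The area saving stays at 40% for every N, while the relative delay penalty shrinks as N grows because the extra gate delay is paid only once.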


# Can We Do Even Better?

Yes, but it's not as easy.

For example, we can design a slice

- that compares multiple bits of A and B.
- See Notes 2.4.6 for an example.

We can also solve the full **N-bit** problem.

In other words, trade more human work and complexity for better area and delay.


