Advanced Digital Design [VU]
Homework III - Sample Solution
Vienna University of Technology
January 21, 2014
Contents
1 Static data-flow structures
2 Dual-rail function blocks I
  2.1 DIMS circuit implementation
  2.2 Implementation costs
  2.3 Isochronic forks
3 Dual-rail function blocks II
  3.1 Threshold gates
  3.2 TH1n and THnn
  3.3 Dual-rail AND
4 STG-Synthesis
  4.1 State Graph
  4.2 Next State Logic
  4.3 Latch Controller Circuit
  4.4 Petrify
References

1 Static data-flow structures
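[Figure: GCD data-flow circuit with input A,B, registers R1 to R5, comparison blocks "A != B" and "A>B", subtraction blocks "A-B" and "B-A", and output gcd(A,B); the drawing marks one initial token (0) and one initial empty token (E).]
Figure 1: Dataflow circuit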
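Step | A,B    | R1  | R2  | R3     | R4     | R5     | Output
-----+--------+-----+-----+--------+--------+--------+-------
   1 | (8,20) | (0) | (E) | E      | E      | E      | E
   2 | 8,20   | 0   | (E) | (8,20) | E      | E      | E
   3 | (E)    | (E) | E   | (8,20) | E      | E      | E
   4 | (E)    | (E) | (1) | 8,20   | E      | (8,20) | E
   5 | E      | E   | (1) | (E)    | (8,12) | 8,20   | E
   6 | E      | (1) | 1   | (E)    | (8,12) | 8,20   | E
   7 | E      | (1) | (E) | E      | (8,12) | (E)    | E
   8 | E      | 1   | (E) | (8,12) | 8,12   | (E)    | E
   9 | E      | (E) | E   | (8,12) | (E)    | E      | E
  10 | E      | (E) | (1) | 8,12   | (E)    | (8,12) | E
  11 | E      | E   | (1) | (E)    | E      | (8,12) | E
  12 | E      | (1) | 1   | (E)    | (8,4)  | 8,12   | E
  13 | E      | (1) | (E) | E      | (8,4)  | (E)    | E
  14 | E      | 1   | (E) | (8,4)  | 8,4    | (E)    | E
  15 | E      | (E) | E   | (8,4)  | (E)    | E      | E
  16 | E      | (E) | (1) | 8,4    | (E)    | (8,4)  | E
  17 | E      | E   | (1) | (E)    | E      | (8,4)  | E
  18 | E      | (1) | 1   | (E)    | (4,4)  | 8,4    | E
  19 | E      | (1) | (E) | E      | (4,4)  | (E)    | E
  20 | E      | 1   | (E) | (4,4)  | 4,4    | (E)    | E
  21 | E      | (E) | E   | (4,4)  | (E)    | E      | E
  22 | E      | (E) | (0) | 4,4    | (E)    | E      | (4,4)
  23 | E      | E   | (0) | (E)    | E      | E      | (4,4)
  24 | E      | (0) | 0   | (E)    | E      | E      | (4,4)
  25 | E      | (0) | 0   | (E)    | E      | E      | 4,4
  26 | E      | (0) | 0   | E      | E      | E      | (E)
Table 1: Value transitions of the GCD circuit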
At step 22 the result is produced and passed to the output as a valid token. At step
24 the circuit has settled: R1 holds (0), and as soon as R3 becomes a bubble (when the empty
token is passed to the output) a new input can be latched. At step 25 the produced output
is consumed by the following circuit. At step 26 an empty token is produced at the output
and the 4-phase request is completed.
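
The data values in Table 1 follow the subtractive GCD iteration that the loop of the circuit computes. The following minimal Python sketch models only the data path, not the handshaking or the token/bubble flow; the function name gcd_trace is chosen here for illustration and positive integer inputs are assumed.

```python
def gcd_trace(a, b):
    """Return the successive (A, B) pairs of the subtractive GCD loop."""
    trace = [(a, b)]
    while a != b:          # "A != B" keeps the operands circulating
        if a > b:          # "A>B" selects which subtraction is applied
            a = a - b      # A-B branch
        else:
            b = b - a      # B-A branch
        trace.append((a, b))
    return trace

print(gcd_trace(8, 20))    # [(8, 20), (8, 12), (8, 4), (4, 4)]
```

The pairs (8,20), (8,12), (8,4) and (4,4) are the values that travel through R3, R4 and R5 in the table.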

2 Dual-rail function blocks I

2.1 DIMS circuit implementation
DIMS circuits require an array of C-gates that maps every possible (valid) input
data word exclusively to a dedicated signal (one-hot code). Note that the C-gates always wait until all
input signals carry valid data or empty tokens before they produce a one or a zero on their
outputs. In a second stage, OR-gates map the one-hot code to the desired output signals.
The actual logic function depends only on these OR-gates. It is easy to see that circuits
using the DIMS design style can grow very large: a circuit with n dual-rail (DR) inputs requires 2^n
n-input C-gates (exponential growth).
Table 2 shows the truth table for the Boolean equations e and f.
e = (¬a ∧ ¬b) ∨ (a ∧ b) ∨ d
f = (a ∧ ¬b ∧ d) ∨ (¬a ∧ ¬b ∧ ¬c) ∨ (¬a ∧ b ∧ c ∧ ¬d) ∨ (a ∧ b ∧ d)
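
As a cross-check, the two equations can be evaluated for all 16 input vectors. The sketch below is an illustration only; it assumes the negations are placed as implied by the e and f columns of Table 2, and the helper names are hypothetical. It also counts how many minterm C-gates feed each of the four output rails.

```python
from itertools import product

def e(a, b, c, d):
    return int((not a and not b) or (a and b) or d)

def f(a, b, c, d):
    return int((a and not b and d) or (not a and not b and not c)
               or (not a and b and c and not d) or (a and b and d))

rows = list(product((0, 1), repeat=4))   # (a, b, c, d) in the row order of Table 2
e_col = [e(*r) for r in rows]
f_col = [f(*r) for r in rows]

# Number of minterm C-gates feeding the OR-gate of each output rail
fan_in = {"e.t": sum(e_col), "e.f": 16 - sum(e_col),
          "f.t": sum(f_col), "f.f": 16 - sum(f_col)}
print(fan_in)   # {'e.t': 12, 'e.f': 4, 'f.t': 7, 'f.f': 9}
```

The fan-in values are the input counts of the four OR-gates in the DIMS circuit.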
We can now use the truth table to connect the OR-gates to the outputs of the C-gates. The
e.f signal, for example, must be one for all input vectors where e is zero. The resulting
circuit is shown in Figure 3. Note that the circuit needs 16 4-input C-gates. However,
since only 2-input C-gates are available, three of them are needed to construct one 4-input
C-gate, as shown in Figure 2.
[Figure: one 4-input C-gate replaced by a tree of three 2-input C-gates.]
Figure 2: 4-input C-gate construction
Now we can observe that many terms in the circuit (Figure 3) are calculated redundantly. The first four C-gates, for example, all contain a 2-input C-gate with the inputs a.f
and b.f. If we remove all this redundancy, we obtain the circuit shown in Figure 4, which
requires only half as many 2-input C-gates (24 instead of 48).
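a | b | c | d | e | f
--+---+---+---+---+--
0 | 0 | 0 | 0 | 1 | 1
0 | 0 | 0 | 1 | 1 | 1
0 | 0 | 1 | 0 | 1 | 0
0 | 0 | 1 | 1 | 1 | 0
0 | 1 | 0 | 0 | 0 | 0
0 | 1 | 0 | 1 | 1 | 0
0 | 1 | 1 | 0 | 0 | 1
0 | 1 | 1 | 1 | 1 | 0
1 | 0 | 0 | 0 | 0 | 0
1 | 0 | 0 | 1 | 1 | 1
1 | 0 | 1 | 0 | 0 | 0
1 | 0 | 1 | 1 | 1 | 1
1 | 1 | 0 | 0 | 1 | 0
1 | 1 | 0 | 1 | 1 | 1
1 | 1 | 1 | 0 | 1 | 0
1 | 1 | 1 | 1 | 1 | 1
Table 2: Truth table for e and f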
[Figure: input rails a.t, b.t, c.t, d.t and a.f, b.f, c.f, d.f feed 16 four-input C-gates, one per minterm abcd = FFFF ... TTTT; OR-gates combine the minterms into the output rails e.t, e.f, f.t and f.f.]
Figure 3: DIMS circuit
[Figure: input rails a.t, b.t, c.t, d.t and a.f, b.f, c.f, d.f; four 2-input C-gates form the partial terms ab = FF, FT, TF, TT and four form cd = FF, FT, TF, TT; 16 further 2-input C-gates combine them into the minterms abcd = FFFF ... TTTT; OR-gates drive the output rails e.t, e.f, f.t and f.f.]
Figure 4: Optimized DIMS circuit

2.2 Implementation costs
The unoptimized DIMS circuit needs 16 4-input C-gates, while the optimized version
needs 24 2-input C-gates. Both circuits require 4 OR-gates with 4, 7, 9 and 12 inputs.
Table 3 summarizes the resource consumption.
Gate             | Implementation                 | Transistor Count
-----------------+--------------------------------+-----------------
2-input OR-gate  | ordinary CMOS OR               | 6
2-input C-gate   | Martin's C-gate implementation | 8
4-input C-gate   | 3 × 2-input C-gate             | 24
4-input OR-gate  | 3 × 2-input OR-gate            | 18
7-input OR-gate  | 6 × 2-input OR-gate            | 36
9-input OR-gate  | 8 × 2-input OR-gate            | 48
12-input OR-gate | 11 × 2-input OR-gate           | 66
Table 3: Resource requirements of the DIMS circuits
If we sum everything up, we obtain the following transistor counts for our circuits.
T_DIMS = 16 ∗ 3 ∗ 8 + 18 + 36 + 48 + 66 = 552
T_DIMSopt = 24 ∗ 8 + 18 + 36 + 48 + 66 = 360
Note that for one output signal (which consists of two rails), we always need two OR-gates, one with n inputs and another one with (16 − n) inputs (n ≥ 1). To construct an
n-input OR-gate, we need (n − 1) 2-input OR-gates (cascading). So the total number of
transistors T_DRout for one dual-rail output signal can be calculated as follows.
T_DRout = (n − 1) ∗ 6 + (16 − n − 1) ∗ 6 = (n − 1 + 16 − 1 − n) ∗ 6 = 14 ∗ 6 = 84
We can see that the resource consumption is independent of the actual logic equation and
of the structure of the OR-gates. Therefore the overall resource consumption of the DIMS
circuits can also be calculated in the following way.
T_DIMS = 384 + 2 ∗ T_DRout = 552
T_DIMSopt = 192 + 2 ∗ T_DRout = 360
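
The arithmetic above can be replayed with a few lines of Python. This is only a sketch of the cost model used in this section (cascaded 2-input OR-gates, transistor counts from Table 3); the function and variable names are chosen for illustration.

```python
C2 = 8    # transistors of a 2-input C-gate (Table 3)
OR2 = 6   # transistors of a 2-input OR-gate (Table 3)

def or_tree(n):
    """Cost of an n-input OR-gate built from (n - 1) cascaded 2-input OR-gates."""
    return (n - 1) * OR2

def dual_rail_output(n, minterms=16):
    """Cost of the two OR-gates of one dual-rail output: n and (minterms - n) inputs."""
    return or_tree(n) + or_tree(minterms - n)

or_cost = dual_rail_output(12) + dual_rail_output(7)   # e: 12/4 inputs, f: 7/9 inputs
t_dims = 16 * 3 * C2 + or_cost
t_dims_opt = 24 * C2 + or_cost
print(t_dims, t_dims_opt)   # 552 360
```

Calling dual_rail_output with any n between 1 and 15 returns 84, which is the independence from the logic function noted above.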

2.3 Isochronic forks
Isochronic forks must be assumed for this circuit to work correctly. To illustrate this, we
refer to a simpler DIMS circuit, shown in Figure 5, with a non-isochronic fork on the
b.f signal (∆2 >> ∆1). Consider the circuit in its null phase, i.e. all input rails are
zero, the C-gates are reset and the output rails are zero as well. Now the input vector
(a.t, a.f, b.t, b.f) = (0, 1, 0, 1) is applied. Thus the topmost C-gate (FF) is activated and
the output rail q.t switches to one. The valid output data is acknowledged and all input
rails are reset to zero again (null phase). However, because of the very long delay ∆2,
one of the inputs of the C-gate TF is still one, so this C-gate is already half-enabled. After
acknowledgment of the null phase, new input data (1, 0, 1, 0) is applied, which sets the
C-gate TT. However, since one of the inputs of the C-gate TF is still one, the
output of this C-gate is set as well. This results in both output rails being one at the same
time, which is not a valid dual-rail code word.
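[Figure: input rails a.t, b.t and a.f, b.f feed four C-gates FF, FT, TF and TT that drive the output rails q.t and q.f; the fork on b.f reaches C-gate FF with delay ∆1 and C-gate TF with delay ∆2.]
Figure 5: DIMS circuit with non-isochronic fork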
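
The failure scenario can be replayed with a small behavioural model. The sketch below is not a timing simulation: the value seen on the slow branch of the b.f fork is supplied by hand for each phase to represent ∆2 >> ∆1. It also assumes that the OR stage computes q.t = FF ∨ TT and q.f = FT ∨ TF; the text only fixes that FF drives q.t and that TT and TF drive different rails. All names are chosen for illustration.

```python
def c_gate(prev, x, y):
    """2-input Muller C-gate: 1 if both inputs are 1, 0 if both are 0, else hold."""
    if x and y:
        return 1
    if not x and not y:
        return 0
    return prev

def run(phases):
    """phases: list of ((a_t, a_f, b_t, b_f), b_f_slow); returns (q_t, q_f) per phase."""
    ff = ft = tf = tt = 0
    outputs = []
    for (a_t, a_f, b_t, b_f), b_f_slow in phases:
        ff = c_gate(ff, a_f, b_f)        # fast branch of the b.f fork
        ft = c_gate(ft, a_f, b_t)
        tf = c_gate(tf, a_t, b_f_slow)   # slow branch of the b.f fork
        tt = c_gate(tt, a_t, b_t)
        outputs.append((ff | tt, ft | tf))   # assumed OR stage: (q.t, q.f)
    return outputs

NULL, FF_IN, TT_IN = (0, 0, 0, 0), (0, 1, 0, 1), (1, 0, 1, 0)
# Slow branch still carries the stale 1 during the null phase and the next data phase
slow_fork = run([(NULL, 0), (FF_IN, 1), (NULL, 1), (TT_IN, 1)])
# Isochronic fork: both branches always carry the current value of b.f
isochronic = run([(NULL, 0), (FF_IN, 1), (NULL, 0), (TT_IN, 0)])
print(slow_fork[-1], isochronic[-1])   # (1, 1) (1, 0)
```

With the stale value on the slow branch the last phase ends in the illegal code word (1, 1); with an isochronic fork the same input sequence yields (1, 0).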

3 Dual-rail function blocks II

3.1 Threshold gates
Table 4 shows the truth tables of the corresponding threshold gates. For the threshold gates
with 3 inputs, the inputs B, C and D are used. The weights of the inputs have to be taken into
account in order to determine the correct truth table. For instance, for the TH23W2 gate it is
enough that the first input is 1 in order to trigger the set function.
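A | B | C | D | TH33 | TH23W2 | TH24 | TH34W32
--+---+---+---+------+--------+------+--------
0 | 0 | 0 | 0 | 0    | 0      | 0    | 0
0 | 0 | 0 | 1 | 0    | 0      | 0    | 0
0 | 0 | 1 | 0 | 0    | 0      | 0    | 0
0 | 0 | 1 | 1 | 0    | 1      | 1    | 0
0 | 1 | 0 | 0 | 0    | 1      | 0    | 0
0 | 1 | 0 | 1 | 0    | 1      | 1    | 1
0 | 1 | 1 | 0 | 0    | 1      | 1    | 1
0 | 1 | 1 | 1 | 1    | 1      | 1    | 1
1 | 0 | 0 | 0 | -    | -      | 0    | 1
1 | 0 | 0 | 1 | -    | -      | 1    | 1
1 | 0 | 1 | 0 | -    | -      | 1    | 1
1 | 0 | 1 | 1 | -    | -      | 1    | 1
1 | 1 | 0 | 0 | -    | -      | 1    | 1
1 | 1 | 0 | 1 | -    | -      | 1    | 1
1 | 1 | 1 | 0 | -    | -      | 1    | 1
1 | 1 | 1 | 1 | -    | -      | 1    | 1
Table 4: Threshold gates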
Table 5 shows the set functions of the threshold gates.
Name     | Set function
---------+-------------------------
TH22     | A*B
TH33     | A*B*C
TH44     | A*B*C*D
TH23W2   | A+B*C
TH23     | AB + BC + AC
TH24W22  | A + B + CD
TH24W2   | A + BC + BD + CD
TH24     | AB + (A + B)(C + D) + CD
TH34W22  | AB + AC + AD + BC + BD
TH34W2   | AB + AC + AD + BCD
TH34W32  | A + BC + BD
TH44W322 | AB + AC + AD + BC
TH44W3   | AB + AC + AD
Table 5: Threshold gates
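
The set functions in Table 5 follow from a weighted-sum rule: the set condition holds when the sum of the weights of all inputs that are 1 reaches the threshold. The sketch below checks a few table entries against that rule. It assumes the usual naming scheme THmnWw1w2..., with threshold m, n inputs and the listed weights assigned to the first inputs (all others have weight 1); th_set is a hypothetical helper name and models only the set condition, not the hysteresis.

```python
from itertools import product

def th_set(threshold, weights, inputs):
    """Set condition: the weighted sum of the inputs reaches the threshold."""
    return sum(w * x for w, x in zip(weights, inputs)) >= threshold

# name: (threshold, weights, set function as listed in Table 5)
gates = {
    "TH23W2":   (2, (2, 1, 1),    lambda A, B, C: A or (B and C)),
    "TH34W22":  (3, (2, 2, 1, 1), lambda A, B, C, D: (A and B) or (A and C) or (A and D)
                                                     or (B and C) or (B and D)),
    "TH34W32":  (3, (3, 2, 1, 1), lambda A, B, C, D: A or (B and C) or (B and D)),
    "TH44W322": (4, (3, 2, 2, 1), lambda A, B, C, D: (A and B) or (A and C) or (A and D)
                                                     or (B and C)),
}

for name, (m, w, expr) in gates.items():
    for x in product((0, 1), repeat=len(w)):
        assert th_set(m, w, x) == bool(expr(*x)), name
print("all listed set functions match the weighted-sum rule")
```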

3.2 TH1n and THnn
The output of a TH1n gate becomes "1" if at least one input is "1", and "0" again if all
inputs are "0". Thus TH1n gates are simple OR gates with n inputs.
THnn gates have internal state, because the output becomes "1" if all inputs
are "1" and "0" again only if all inputs are "0"; otherwise the output keeps its previous value. This is exactly the function of a Muller C-gate.
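
Both special cases can be seen in a behavioural model of an unweighted THmn gate with hysteresis. The class below is a minimal sketch for illustration (the class name and the input sequence are assumptions, not part of the exercise).

```python
class ThresholdGate:
    """THmn gate: set when at least m inputs are 1, reset only when all inputs are 0."""

    def __init__(self, m):
        self.m = m
        self.out = 0

    def update(self, *inputs):
        if sum(inputs) >= self.m:
            self.out = 1
        elif not any(inputs):
            self.out = 0
        return self.out          # in all other cases the previous output is held

th12 = ThresholdGate(1)          # TH1n with n = 2
th22 = ThresholdGate(2)          # THnn with n = 2
seq = [(0, 0), (1, 0), (1, 1), (0, 1), (0, 0)]
th12_trace = [th12.update(*x) for x in seq]
th22_trace = [th22.update(*x) for x in seq]
print(th12_trace)   # [0, 1, 1, 1, 0]  same as a 2-input OR
print(th22_trace)   # [0, 0, 1, 1, 0]  same as a Muller C-gate
```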

3.3 Dual-rail AND
Table 6 shows the relevant entries of the truth table of the dual-rail AND gate.
a.t | a.f | b.t | b.f | y.t | y.f
----+-----+-----+-----+-----+----
0   | 0   | 0   | 0   | 0   | 0
0   | 1   | 0   | 1   | 0   | 1
0   | 1   | 1   | 0   | 0   | 1
1   | 0   | 0   | 1   | 0   | 1
1   | 0   | 1   | 0   | 1   | 0
Table 6: Truth table of the dual-rail AND gate
In the next step the set functions of the two output rails are expressed.
y.t = a.t ∗ b.t
y.f = (a.f ∗ b.f ) + (a.f ∗ b.t) + (a.t ∗ b.f )
The expression for y.f can be further reduced to y.f = a.f + b.f , but it has to be kept in
mind that this results in a weakly indicating circuit. This circuit is shown in Figure 6.
[Figure: inputs a.t, a.f, b.t, b.f; a threshold-2 gate drives y.t and a threshold-1 gate drives y.f.]
Figure 6: Weakly indicating Threshold implementation of an AND gate
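
The effect of reducing y.f to a.f + b.f can be made concrete by comparing both versions of the set functions. The sketch below evaluates only the set conditions (no hysteresis); the function names are chosen for illustration.

```python
def and_full(at, af, bt, bf):
    """Set functions read directly from Table 6."""
    yt = at & bt
    yf = (af & bf) | (af & bt) | (at & bf)
    return yt, yf

def and_reduced(at, af, bt, bf):
    """Reduced false rail y.f = a.f + b.f."""
    return at & bt, af | bf

# (a.t, a.f, b.t, b.f) -> (y.t, y.f), the rows of Table 6
table6 = {(0, 0, 0, 0): (0, 0), (0, 1, 0, 1): (0, 1), (0, 1, 1, 0): (0, 1),
          (1, 0, 0, 1): (0, 1), (1, 0, 1, 0): (1, 0)}
for inputs, expected in table6.items():
    assert and_full(*inputs) == expected
    assert and_reduced(*inputs) == expected

early = (0, 1, 0, 0)    # a is valid (false), b is still empty
print(and_full(*early), and_reduced(*early))   # (0, 0) (0, 1)
```

Both versions agree on all rows of Table 6, but the reduced version already produces a valid output while b is still empty, so the output no longer indicates that both inputs have arrived.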
Figure 7 depicts a strongly indicating circuit implementation. For the false rail a
TH34W22 gate can be used. Both true rails of the circuit's inputs are connected to the inputs of the threshold gate that have weight 1. Both false rails are
connected to the inputs of the TH34W22 gate that have weight 2. Thus, if the true rails
of both inputs are set, the TH34W22 gate still outputs 0. If one false rail of the circuit's
inputs is set, the output of the threshold gate becomes one as soon as one more rail is set (i.e. both
inputs are in the data phase).
[Figure: inputs a.t, a.f, b.t, b.f; a threshold-2 gate drives y.t and a threshold-3 gate drives y.f.]
Figure 7: Strongly indicating Threshold implementation of an AND gate
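
The weight argument can be checked exhaustively over the three legal states of each dual-rail input (empty, false, true). The sketch below evaluates only the set condition of the TH34W22 gate with the rail assignment described above; the helper name and the E/F/T labels are chosen for illustration.

```python
from itertools import product

def th34w22_set(af, bf, at, bt):
    """Set condition of TH34W22: false rails on the weight-2 inputs, true rails on weight 1."""
    return int(2 * af + 2 * bf + at + bt >= 3)

valid = {"E": (0, 0), "F": (0, 1), "T": (1, 0)}   # name -> (x.t, x.f)
results = {}
for (na, (at, af)), (nb, (bt, bf)) in product(valid.items(), repeat=2):
    results[(na, nb)] = th34w22_set(af, bf, at, bt)

for (na, nb), yf in results.items():
    both_valid = na != "E" and nb != "E"
    # y.f may only be set when both inputs are valid and at least one is false
    assert yf == int(both_valid and "F" in (na, nb)), (na, nb)
print(results[("F", "E")], results[("F", "T")], results[("T", "T")])   # 0 1 0
```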
The weakly indicating circuit consists of a TH22 gate, which is a C-gate, and a TH12 gate, which is an OR2 gate. Therefore this implementation needs 8 + 6 = 14 transistors, if
Martin's C-gate implementation is used.
A CMOS implementation of the TH34W22 gate, which is used in the strongly indicating
circuit, is shown in Figure 8.
[Figure: transistor-level schematic with inputs A, B, C, D and output Q.]
Figure 8: CMOS implementation of a TH34W22 gate
This implementation uses 15 transistors. Therefore the strongly indicating circuit uses
15 + 8 = 23 transistors.
Figure 9 shows the DIMS implementation of an AND gate, which consists of 4 C-gates
and 1 OR3 gate and has a transistor count of 4 · 8 + 1 · 8 = 40.
[Figure: input rails a.t, b.t and a.f, b.f feed four C-gates FF, FT, TF and TT that drive the output rails y.t and y.f.]
Figure 9: DIMS implementation of AND gate

4 STG-Synthesis
Since the output signals Ain and Lt directly depend on the internal signal A, these signals
can be removed from the STG. The simplified version of the STG is shown in Figure 10.
[Figure: STG with the transitions Rin+, A+, Rout+, Rin−, Aout+, A−, Rout− and Aout−.]
Figure 10: Simplified STG of the latch controller

4.1 State Graph
The state graph derived from the STG is shown in Figure 11. The variable ordering in the
state labels is (Rin, Aout, A, Rout). It can be seen that 14 states are required and that
there are no CSC conflicts.
[Figure: state graph with the 14 states 0000, 1000, 1100, 0100, 1010, 1110, 0010, 0110, 1011, 1111, 0011, 0111, 0101 and 1101, connected by arcs labelled Rin+, Rin−, A+, A−, Rout+, Rout−, Aout+ and Aout−; state labels are ordered (Rin, Aout, A, Rout).]
Figure 11: State graph
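
The state count and the absence of coding conflicts can be cross-checked by playing the token game on the STG. The sketch below is an illustration, not the algorithm used by petrify: it treats the STG as a marked graph with one implicit place per arc, takes the arc list and initial marking from the petrify specification in Section 4.4, and assumes the initial state code 0000. All identifiers are chosen for illustration.

```python
from collections import deque

SIGNALS = ("Rin", "Aout", "A", "Rout")           # bit order of the state labels
ARCS = [("Rin+", "A+"), ("A+", "Rin-"), ("A+", "Rout+"),
        ("A-", "Rin+"), ("A-", "Rout-"), ("Rin-", "A-"),
        ("Rout+", "Aout+"), ("Aout+", "A-"), ("Aout-", "Rout+"),
        ("Rout-", "Aout-"), ("Rout-", "A+")]
INITIAL = frozenset({("A-", "Rin+"), ("Rout-", "A+"), ("Aout-", "Rout+")})

def fire(marking, code, t):
    """Fire transition t if it is enabled; return the new (marking, code) or None."""
    ins = {a for a in ARCS if a[1] == t}
    if not ins <= marking:
        return None
    outs = {a for a in ARCS if a[0] == t}
    bits = list(code)
    bits[SIGNALS.index(t[:-1])] = "1" if t[-1] == "+" else "0"
    return frozenset((marking - ins) | outs), "".join(bits)

def reachable(start=(INITIAL, "0000")):
    """Breadth-first exploration of all reachable (marking, code) pairs."""
    seen, edges, todo = {start}, [], deque([start])
    transitions = sorted({t for arc in ARCS for t in arc})
    while todo:
        state = todo.popleft()
        for t in transitions:
            nxt = fire(*state, t)
            if nxt is None:
                continue
            edges.append((state[1], t, nxt[1]))
            if nxt not in seen:
                seen.add(nxt)
                todo.append(nxt)
    return seen, edges

states, edges = reachable()
codes = {code for _, code in states}
print(len(states), len(codes), len(edges))   # 14 14 20
```

Since the 14 reachable markings carry 14 distinct binary codes, no two states share a code, so no CSC conflict can occur.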

4.2 Next State Logic
In this step we use the state graph to derive the next state logic for the output signals A
and Rout. As can be seen in Figures 12 and 13, we first have to identify the excitation
and quiescent regions for the output signals (A and Rout). The set of states where all
necessary preconditions are fulfilled, such that the output signal will eventually change its
logical value are referred to as excitation regions (ER). All states where the output signal
does not change its logical value belong to the quiescent regions (QR).
KV maps are used to derive boolean equations for the set and reset functions of the
output signals A and Rout. For the set function we have to group all R (rising edge)
entries in the KV map, x (don’t care) and 1 entries can be used to form bigger cubes.
Likewise the reset function is obtained by grouping the F (falling edge) entries (with 0
and x entries). However, to avoid hazards, keep in mind the monotonic cover (unique
entry) constraint. A cube (product term) may only be entered through ER states.
[Figure: (a) state graph with the regions ER(A+), QR(A+), ER(A−) and QR(A−) marked; (b) KV map for A over Rin, Aout, A and Rout with entries R, F, 0, 1 and x.]
Figure 12: Excitation and Quiescent Regions of A
[Figure: (a) state graph with the regions ER(Rout+), QR(Rout+), ER(Rout−) and QR(Rout−) marked; (b) KV map for Rout over Rin, Aout, A and Rout with entries R, F, 0, 1 and x.]
Figure 13: Excitation and Quiescent Regions of Rout
A.set = Rin ∧ ¬Rout
A.reset = ¬Rin ∧ Aout ∧ Rout
Rout.set = A ∧ ¬Aout
Rout.reset = ¬A
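
These functions can be checked against the state graph: in every reachable state, a set (reset) function should be true while the signal is 0 (1) exactly when the corresponding rising (falling) transition is enabled. The sketch below assumes the negations as reconstructed from the excitation regions (A.set = Rin ∧ ¬Rout, A.reset = ¬Rin ∧ Aout ∧ Rout, Rout.set = A ∧ ¬Aout, Rout.reset = ¬A) and uses the transition list from the write_sg output in Section 4.4; the identifiers are chosen for illustration.

```python
# Pairs "source-state label" copied from the write_sg output in Section 4.4.
# Bit order of the state codes: (Rin, Aout, A, Rout).
SG = """1010 Rin- 1010 Rout+ 0011 Aout+ 0100 Rin+ 0100 Aout- 1101 Rout-
0010 Rout+ 0111 A- 0101 Rin+ 0101 Rout- 0110 Aout- 1110 Rin-
1110 Aout- 0000 Rin+ 1011 Rin- 1011 Aout+ 1111 Rin- 1100 A+
1100 Aout- 1000 A+""".split()

enabled = {}
for code, label in zip(SG[0::2], SG[1::2]):
    enabled.setdefault(code, set()).add(label)

def next_state_logic(rin, aout, a, rout):
    return {"A+": rin and not rout,
            "A-": (not rin) and aout and rout,
            "Rout+": a and not aout,
            "Rout-": not a}

mismatches = []
for code, labels in enabled.items():
    rin, aout, a, rout = (c == "1" for c in code)
    value = {"A": a, "Rout": rout}
    for t, fn in next_state_logic(rin, aout, a, rout).items():
        sig, rising = t[:-1], t[-1] == "+"
        # a set (reset) term only excites the signal while it is still 0 (1)
        excited = bool(fn) and value[sig] != rising
        if excited != (t in labels):
            mismatches.append((code, t))
print(mismatches)   # []
```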

4.3 Latch Controller Circuit
Figure 14 shows the resulting circuit of the latch controller. Note that it is also possible to
use C-gates instead of the SR-latches.
[Figure: two SR-latches with inputs Rin and Aout; the Q output of one latch is A (Lt, Ain), the Q output of the other is Rout.]
Figure 14: Latch controller circuit

4.4 Petrify
For the sake of completeness we also want to present a solution utilizing the tool petrify
[1]. The listing below shows how the STG specification of the latch controller is encoded
for petrify.
.model latchcontroller
.inputs Rin Aout    # specify latch controller inputs
.outputs A Rout     # specify latch controller outputs
.graph              # specify transitions
Rin+ A+
A+ Rin- Rout+
A- Rin+ Rout-
Rin- A-
Rout+ Aout+
Aout+ A-
Aout- Rout+
Rout- Aout- A+
# specify the initial state of the STG,
# i.e. the arcs which have tokens initially
.marking { <A-,Rin+> <Rout-,A+> <Aout-,Rout+> }
.end
To obtain the state graph, we can now execute the tool write_sg (comes with petrify). As can be seen, we get a list of transitions encoding our state graph. Every list entry
is a triple consisting of a source state, a transition label (i.e. the transition condition) and
the destination state.
> write_sg -bin latchcontroller.stg
Dumping a state graph with 14 states ...
.model latchcontroller
.inputs Rin Aout
.outputs A Rout
.state graph # 14 states
s5_1010 Rin- s12_0010
s5_1010 Rout+ s6_1011
s13_0011 Aout+ s11_0111
s8_0100 Rin+ s1_1100
s8_0100 Aout- s0_0000
s3_1101 Rout- s1_1100
s12_0010 Rout+ s13_0011
s11_0111 A- s10_0101
s10_0101 Rin+ s3_1101
s10_0101 Rout- s8_0100
s9_0110 Aout- s12_0010
s2_1110 Rin- s9_0110
s2_1110 Aout- s5_1010
s0_0000 Rin+ s7_1000
s6_1011 Rin- s13_0011
s6_1011 Aout+ s4_1111
s4_1111 Rin- s11_0111
s1_1100 A+ s2_1110
s1_1100 Aout- s7_1000
s7_1000 A+ s5_1010
.marking {s0_0000}
.end

References
[1] Universitat Politècnica de Catalunya, Barcelona, Spain. Petrify Website. http://www.lsi.upc.edu/~jordicf/petrify/, Jan. 2014.