
AUTOMATED LAYOUT OF REGULAR FABRIC BRICKS
by
Brian Taylor
A DISSERTATION SUBMITTED IN PARTIAL FULFILLMENT
OF THE REQUIREMENTS FOR THE DEGREE OF
MASTER OF SCIENCE
IN ELECTRICAL AND COMPUTER ENGINEERING
AT CARNEGIE MELLON UNIVERSITY
Committee:
Prof. Larry Pileggi (advisor)
Prof. Andrzej Strojwas
Pittsburgh, PA 15213.
December 2005
ABSTRACT
As minimum feature sizes shrink into the deep-subwavelength range, manufacturing becomes increasingly difficult, and the costs and complexity of standard cell ASIC design become prohibitive for all but the highest-volume products. Many manufacturing difficulties owe to the irregularity of standard cell based design; therefore, design using more regular logic fabrics becomes an attractive alternative to the standard cell paradigm. In this report, we outline an alternative CAD flow using a DFM-friendly regular logic fabric based on heterogeneous yet highly regular structures called "bricks." In particular, we describe methods for the automatic generation of brick layouts. We show that regularity requirements simplify the problem of brick generation and can be exploited to yield layout methods which allow us to make strong guarantees of completeness and optimality. Finally, we present experimental results and discuss possibilities for future work.
Table of Contents
1 Introduction
1.1 Manufacturing Challenges
1.2 The Regular Fabrics Approach
2 Regular Fabric Definition and Layout Methodology
2.1 Regular Fabric Definition
2.2 Brick Layout Methodology: An Overview
3 Transistor Placement
3.1 Problem Description
3.2 Chaining Algorithm
3.2.1 Transistor Pairing
3.2.2 Extended Branch-and-Bound Algorithm
4 Brick Routing: An NP Formulation
5 SAT Reduction
5.1 Overview
5.2 Variable Definitions, Notation, and Terminology
5.2.1 Variable Definitions
5.2.2 Notation
5.2.3 Terminology
5.3 Routing Rules and Enforcer Clauses
5.3.1 General Formulation Rules
5.3.2 Segment Rules
5.3.3 Fixed Terminal Rules
5.3.4 Pin Terminal Rules
5.3.5 Fabric Rules
6 Experimental Results and Conclusions
7 Appendix
7.1 Variable Definitions
7.2 Notation
List of Figures
1 Exponential increase of ASIC design costs
2 Systematic vs. Defect-based yield loss
3 The routing grid
4 Single-height one-row placement
5 Diffusion sharing
6 A candidate transistor placement and its associated channel routing problem
7 Left to right: ex1, ex3, and ex5
1 Introduction
Since its statement in 1965, Moore's Law [3] has predicted (and driven) an exponential increase in the number of transistors that can be packed onto an integrated circuit (IC). This increase in transistor density implies an exponential decrease in the size of the features which compose those transistors. As device sizes scale down into deep-subwavelength territory, the challenges of nanoscale manufacturing are becoming increasingly difficult to overcome. Accordingly, the complexity and costs of Application-Specific IC (ASIC) design as it is currently practiced are skyrocketing [14] (see Figure 1). In this section, we discuss the manufacturing challenges of deep-submicron design, and we outline a design methodology for coping with these challenges.
Figure 1. Exponential increase of ASIC design costs.
1.1 Manufacturing Challenges
Many of the challenges of nanoscale manufacturing are related to optical lithography. For complex physical reasons, the rate at which the illumination wavelength is decreasing is not keeping pace with minimum feature size scaling. For example, a wavelength of 193nm was first widely used at the 130nm node; however, according to the 2004 ITRS Roadmap, 193nm lithography will remain in use all the way down to the 32nm node [16]. As the minimum feature size scales deep below the illumination wavelength, faithful reproduction of the smallest features becomes extremely difficult. This difficulty manifests itself in systematic and parametric yield loss, which are now the dominant yield loss mechanisms (see Figure 2).
Resolution Enhancement Techniques (RETs), such as Optical Proximity Correction (OPC) and Phase Shift Masks (PSM), have been developed to enhance pattern fidelity, but these suffer from several limitations. Rule-based OPC is computationally expensive and time-consuming, as it requires processing and modifying each of the billions of shapes that define an IC. Moreover, the correcting capability of simple rule-based OPC is limited. Model-based OPC techniques utilize lithography simulations to yield better corrections, but this simulation is time-consuming and infeasible over large areas of irregular layouts [1]. Finally, Alternating Apertures (AA) PSM enhances resolution via destructive interference; however, it is difficult to apply to arbitrary layouts due to phase conflicts.
The usefulness of RETs is limited by the complexity and irregularity of standard cell-based layouts. Consequently, more regular layout styles are needed if classical scaling is to continue apace.
Figure 2. Systematic vs. Defect-based yield loss.
1.2 The Regular Fabrics Approach
Over the years, several families of regular logic fabrics have been proposed. Field Programmable Gate Arrays (FPGAs) are highly regular as well as reprogrammable. However, this flexibility comes at a very high cost in terms of area, power, and performance when compared to ASIC designs. Pileggi et al. [13] proposed a Via Patterned Gate Array (VPGA) fabric which retained the regularity of FPGAs, but with significantly improved performance, density, and power. More recently, the Regular Fabrics Group at Carnegie Mellon University has proposed [2] a heterogeneous fabric composed of highly regular structures referred to as "bricks."
The proposed brick design methodology is as follows. First, the RTL of the design being implemented is analyzed, and a limited set of logic functions with which the design can be efficiently implemented is constructed. These logic functions are typically of moderate complexity, usually having around 6 to 10 inputs. This set of logic functions is then converted into a small set of roughly 20-30 logic-level netlists via BDD-based decomposition. Each netlist is composed of logic gates from a small set of primitives (e.g., NAND2, NOR3, AOI21). The logic granularity of these netlists is roughly 2-3x a single NAND gate. Finally, a layout is automatically generated for each netlist. These layouts are referred to as "bricks," and the automatic generation of regular fabric brick layouts is the subject of this report.
2 Regular Fabric Definition and Layout Methodology
2.1 Regular Fabric Definition
There are many possible ways in which one could define a regular fabric. Our regular fabric is defined primarily by three characteristics. The first characteristic is that all layers are unidirectional. The unidirectionality of all layers greatly improves printability and significantly reduces Across Chip Linewidth Variation (ACLV). Reducing ACLV is particularly important for the poly layer, because linewidth variation can cause severe deviation of transistor characteristics (due to gate length variation) and can contribute to parametric yield loss. The second characteristic is that the pitch of each layer is a fixed, single value; all horizontal layers have a fixed pitch which is a multiple of some horizontal quantum, and vertical layers have a pitch which is a multiple of a vertical quantum. Forcing each layer to have a single, fixed pitch allows off-axis illumination schemes to be optimized for that pitch, thereby increasing the process window and avoiding the problems associated with phenomena such as the "forbidden pitch" [15]. The third characteristic is that metal lines are made thicker to avoid notches and landing pads. Notches and landing pads are irregular, hard-to-print features which are easily removed by the thickening of metal lines, and this thickening is easily accomplished given the fixed pitches at which all metal lines are required to occur.
The fixed pitch requirement has a pleasant side effect: the conducting layers form a coarse grid, as illustrated in Figure 3. This property is very important for our brick layout algorithms, as will become clear in subsequent sections.
Figure 3. The routing grid.
2.2 Brick Layout Methodology: An Overview
Our brick layout is performed using a single-height one-row placement style (see Figure 4). Essentially, this means that the pMOS transistors lie single file in a row at the top of the cell, and the nMOS transistors are similarly arranged at the bottom of the cell. Brick layout is done in two phases: transistor placement, and routing.
Optimal transistor placement for single-height one-row cells is a well-studied problem [5, 6, 7]. In this report, we extend the minimum-width branch-and-bound technique of [8] to exactly minimize channel density and estimated wirelength as well as area. Our new algorithm is described in detail in Section 3.
As mentioned above, the fixed pitch requirement of our fabric implies that all routing must take place on a coarse grid whose granularity is given by the vertical and horizontal grid quanta. The discrete nature of this routing grid gives rise to an NP formulation of the brick routing problem. The solution method which arises from this formulation allows us to make strong guarantees of completeness and optimality that virtually no general-purpose routing algorithms (e.g., maze routing) can make. Our routing method is discussed in more detail in Section 4.
Figure 4. Single-height one-row placement, showing the p-row and the n-row.
3 Transistor Placement
3.1 Problem Description
We begin our discussion of transistor placement by defining some notation. We represent a transistor as an ordered 4-tuple T = (g, s, d, t), where g, s, d are integer net indices (corresponding to the gate, source, and drain nets, respectively), and t is a boolean indicating the type of T (either n-type or p-type). The source s(T) and drain d(T) of transistor T are its diffusion nets. Transistors $T_1$ and $T_2$ are abuttable if they are of the same type (either p or n), and they share at least one diffusion net in common.
Since bricks are laid out in a single-height, one-row style, the problem of transistor placement reduces to the problem of finding a good ordering of the p transistors in the p-row, and of the n transistors in the n-row. The choice of ordering is very important for two reasons. First, if two adjacent transistors are abuttable, then the diffusion net that they have in common may be shared in diffusion between the two transistors, thereby reducing the brick width by one poly pitch (see Figure 5). We refer to a sequence of transistors $T_1, \ldots, T_n$ such that $T_i$ shares diffusion with $T_{i-1}$ and $T_{i+1}$ as a transistor chain. A good ordering will allow us to chain as many transistors together as possible using diffusion sharing, resulting in a brick of minimum width. The second reason the choice of ordering is important is that some orderings may correspond to placements which are much easier to route than other orderings. Thus, the transistor ordering determines the width of the brick and its routing characteristics (e.g., best possible wirelength), so it is crucial to find the best possible ordering.
Figure 5. Diffusion sharing.
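The transistor 4-tuple and the abuttability test of Section 3.1 can be sketched in a few lines of Python. This is only an illustration: the class name `Transistor`, the function names, and the convention that `t = True` means p-type are assumptions made here, not part of the original formulation.

```python
from typing import NamedTuple


class Transistor(NamedTuple):
    """A transistor T = (g, s, d, t) as in Section 3.1."""
    g: int   # gate net index
    s: int   # source net index
    d: int   # drain net index
    t: bool  # type flag; True = p-type is an assumed convention


def diffusion_nets(T: Transistor) -> set:
    """The diffusion nets of T are its source and drain nets."""
    return {T.s, T.d}


def abuttable(T1: Transistor, T2: Transistor) -> bool:
    """Same type, and at least one diffusion net in common."""
    return T1.t == T2.t and bool(diffusion_nets(T1) & diffusion_nets(T2))
```

Two p-type transistors sharing net 2 on a source or drain are abuttable under this test; a p-type and an n-type transistor never are, whatever nets they share.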
3.2 Chaining Algorithm
Our method of transistor placement consists of two steps: pairing and chaining. First, we pair the transistors into (p, n) transistor pairs. Then, we apply an extension of the branch-and-bound algorithm of [8] to construct a placement of minimum width, channel density, and estimated wirelength.
3.2.1 Transistor Pairing
Rather than consider the n and p transistors separately, we group them into (p, n) pairs which are vertically aligned in the final layout, and then find an ordering of the pairs. There are algorithmic techniques for performing such a pairing [9]. However, because each brick is assembled from a logic-level netlist, and since the netlist is composed of gates from a small set of logic primitives, we can take a simpler approach. Namely, we maintain a lookup table from each primitive to an optimal pairing for that primitive. For example, wherever a NAND2 appears in the netlist, we replace it with two (p, n) pairs; the transistors that are paired together are those which share the same gate signal. It is experimentally observed that this pairing heuristic works quite well in conjunction with the chaining algorithm to be described.
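For the NAND2 case mentioned above, the "pair by shared gate signal" rule can be sketched as follows. This is a hypothetical illustration rather than the lookup table used in this work: the names `Transistor` and `pair_by_gate`, the net numbering, and the `t = True` = p-type convention are all assumptions, and the sketch only applies to primitives in which each gate signal drives exactly one p and one n transistor.

```python
from typing import NamedTuple


class Transistor(NamedTuple):
    g: int   # gate net
    s: int   # source net
    d: int   # drain net
    t: bool  # True = p-type (assumed convention)


def pair_by_gate(transistors):
    """Pair each n transistor with the p transistor driven by the same gate net."""
    p_by_gate = {T.g: T for T in transistors if T.t}
    return [(p_by_gate[T.g], T) for T in transistors if not T.t]


# Hypothetical NAND2 netlist: nets a=0, b=1, out=2, vdd=3, gnd=4, internal=5.
nand2 = [
    Transistor(0, 3, 2, True),   # p transistor gated by a
    Transistor(1, 3, 2, True),   # p transistor gated by b
    Transistor(0, 2, 5, False),  # n transistor gated by a
    Transistor(1, 5, 4, False),  # n transistor gated by b
]
pairs = pair_by_gate(nand2)
```

Each resulting (p, n) pair shares a gate net, so the two devices can be vertically aligned under a single poly line.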
3.2.2 Extended Branch-and-Bound Algorithm
The chaining algorithm of [8] uses branch-and-bound search to find a transistor ordering with the minimum number of chains (and thus the minimum number of diffusion breaks, and hence minimum width). However, the algorithm as originally stated makes no consideration of routability. As previously mentioned, different orderings can lead to placements with vastly different routability. Therefore, it is important to consider routing metrics during transistor placement.
We have extended the branch-and-bound algorithm of [8] to minimize two routing metrics: channel density and estimated wirelength¹. The routing problem for bricks, given the single-height one-row layout style, somewhat resembles the classic channel routing problem [10]. The correspondence between brick layouts and channel routing instances is illustrated in Figure 6 (in the brick layout, each terminal is labeled with its net number). In fact, in the (rather unlikely) event that all p transistors have the same width and all n transistors have the same width, the single-height one-row brick routing problem is a channel routing problem. Given the similarity between brick routing and channel routing, one might expect that for any given brick, transistor placements corresponding to orderings of low channel density are easier to route than placements corresponding to orderings of higher channel density. This was in fact confirmed experimentally, as is discussed shortly.
Figure 6. A candidate transistor placement and its associated channel routing problem.
Our chaining algorithm essentially works as follows. First, the branch-and-bound algorithm of [8] is run to calculate the minimum possible number N of chains. Then, another branch-and-bound search is performed, using the minimum number of chains N as the bound; however, by contrast with the first branch-and-bound pass, all search paths which lead to solutions having N chains are explored (rather than only the first such path). Each of these solutions corresponds to a minimum width placement; for each such solution, we calculate its channel density and its wirelength. Among all candidate solutions of minimum width, we consider only those of minimum channel density; of all such solutions, we choose the one of minimum wirelength. Thus, our placement algorithm exactly minimizes width, then channel density, then estimated wirelength, in that order.
¹The estimated wirelength of each net is equal to half the perimeter of the bounding box of that net.
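The two routing metrics and the final selection step can be sketched as below. This is a minimal illustration under stated assumptions: terminals are given as (column, row) grid coordinates, each net's horizontal extent is a (leftmost, rightmost) column pair, channel density is taken in the classic sense as the largest number of net spans crossing any column, and the function names are hypothetical. The branch-and-bound enumeration itself is not shown.

```python
def half_perimeter(points):
    """Estimated wirelength of one net: half the bounding-box perimeter."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    return (max(xs) - min(xs)) + (max(ys) - min(ys))


def channel_density(spans):
    """Largest number of net spans (lo, hi) covering any single column."""
    columns = sorted({c for lo, hi in spans.values() for c in (lo, hi)})
    return max(
        (sum(lo <= c <= hi for lo, hi in spans.values()) for c in columns),
        default=0,
    )


def select_best(candidates):
    """Pick the candidate minimizing (width, density, wirelength), in that order.

    Python compares tuples lexicographically, which matches the stated priority.
    """
    return min(candidates)


# Candidates as (width, channel density, estimated wirelength) triples.
best = select_best([(5, 3, 20), (5, 2, 25), (5, 2, 22), (6, 1, 10)])
```

Here `best` is `(5, 2, 22)`: the width-6 candidate is rejected first even though its density and wirelength are lower, mirroring the strict ordering of the three objectives.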
As mentioned above, the utility of channel density as a brick routing metric has been borne out by routing comparisons between placements generated by our extended branch-and-bound algorithm (which optimizes width, channel density, and estimated wirelength) and placements generated by a simulated annealing program (which optimized width and estimated wirelength alone). Placements produced by our algorithm had significantly lower wirelength than those produced by simulated annealing, and were easier to route: the maze router which was used for routing comparison always converged to a solution more quickly for our placements than for the simulated annealing-based placements. Moreover, for some circuits, our placement led to a routing solution whereas the maze router failed to find a solution for the simulated annealing-based placement.
Additionally, our transistor placement algorithm has been observed to be efficient. While the branch-and-bound search may theoretically take exponential time to terminate, it has never taken more than a few seconds to place the transistors of a brick.
4 Brick Routing: An NP Formulation
Before we can describe our brick routing methods in detail, we first outline some elementary definitions and results from complexity theory [4]. A decision problem is a problem, generally stated as a question, whose answer is either YES or NO. An instance of a problem is a specific input for that problem. The class NP is the set of decision problems $\mathcal{P}$ with the following property: if $I$ is a YES instance of $\mathcal{P}$ (that is, $I$ is an instance of the problem $\mathcal{P}$ for which the solution is YES), then there is a proof of this fact which is a polynomial in the size² of $I$, and which can be checked in polynomial time. There is a polynomial reduction from problem $\mathcal{P}_1$ to $\mathcal{P}_2$ if every instance $I_1$ of $\mathcal{P}_1$ can be mapped to an instance $I_2$ of $\mathcal{P}_2$ in polynomial time such that the size of $I_2$ is a polynomial in the size of $I_1$, and $I_2$ is a YES instance of $\mathcal{P}_2$ if and only if $I_1$ is a YES instance of $\mathcal{P}_1$. An NP-complete problem is a problem $\mathcal{P}^*$ in NP such that every other problem $\mathcal{P}$ in NP has a polynomial reduction to $\mathcal{P}^*$.
²The size of an instance is the number of bits needed to encode $I$ in some "reasonable" encoding scheme. For example, if $I$ were an integer, then a reasonable encoding scheme would be a binary representation of $I$; an unreasonable encoding scheme would be a unary representation, because it would be exponentially longer than a binary encoding.
Given a transistor placement, there are several questions one may ask: Does this placement have a routing? Does it have a routing with wirelength at most W, using at most V vias? If, for a given placement, the answer to either of these questions is "yes," then a proof may be given in the form of a routing which satisfies the desired properties, and this routing can be verified to have the required properties in polynomial time. Thus, many variants of the brick routing problem can be formulated as NP decision problems.
Another example of an NP decision problem is the boolean satisfiability problem, often referred to simply as "SAT." An instance of SAT is a boolean formula F in product-of-sums form³, together with the question, "is there an assignment to the variables in the support of F that makes F evaluate to 1?" Such an assignment is called a "satisfying assignment." Closely related questions are, "Does the truth table of F contain any 1's?" or, with the answer inverted, "Is F identically zero?" In 1971, Cook showed [12] that SAT is NP-complete. SAT was the first problem shown to have this property; thousands of problems since have been shown to be NP-complete. No worst-case polynomial-time algorithm is known for any NP-complete problem, but many NP-complete problems are solved efficiently in practice for large classes of inputs.
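To make the SAT question concrete, the following sketch checks a small product-of-sums formula by exhaustive enumeration. It is purely illustrative: the clause encoding (signed integers, where `k` means variable k and `-k` its complement) and the function name are assumptions, and practical SAT solvers use far more sophisticated search than the exponential loop shown here.

```python
from itertools import product


def satisfying_assignment(clauses, num_vars):
    """Return the first assignment (tuple of bools) satisfying every clause, or None.

    A literal k > 0 is satisfied when variable k is True; a literal -k is
    satisfied when variable k is False. Variables are numbered from 1.
    """
    for bits in product([False, True], repeat=num_vars):
        if all(any(bits[abs(lit) - 1] == (lit > 0) for lit in clause)
               for clause in clauses):
            return bits
    return None


# F = (x1 + x2)(x1' + x2)(x2' + x3)
example = [[1, 2], [-1, 2], [-2, 3]]
```

Checking a proposed assignment takes time linear in the size of the formula, which is exactly the polynomial-time verifiability that places SAT in NP.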
In this work, we formulate the brick routing problem as an NP decision problem whose instances are transistor placements obtained using our extended branch-and-bound algorithm, and whose question is "For this placement, is there a routing on the grid shown in Figure 3?" We then (polynomially) reduce the brick routing problem to the satisfiability problem. Although it is possible to reduce the brick routing problem to any NP-complete problem, SAT is a natural choice for two reasons. First, boolean formulae are familiar, and easy to manipulate and reason about. Second, SAT solvers have become very sophisticated, and are frequently able to solve huge instances involving hundreds of thousands of variables and millions of clauses. The reduction from brick routing to boolean satisfiability is given in the next section.
³The term "conjunctive normal form" (CNF) is used more frequently in the literature. The term derives from the fact that CNF formulae are conjunctions (ANDs) of sum terms (called 'clauses', or 'disjunctions').
5 SAT Reduction
5.1 Overview
Our reduction from brick routing to SAT can be summed up as follows. We define a set of boolean variables; each variable encodes a piece of information about the routing solution. For example, each grid segment has an associated boolean variable which is 1 if and only if that segment is filled in. We also state a set of rules, which we call routing rules, that force a satisfying assignment to the boolean variables to correspond to a valid routing solution. Each rule is "enforced" by a set of clauses, which we refer to as the rule's enforcer clauses. An example rule might be "every terminal of every net must have some adjacent segment filled in."
A subset of the variables and rules of our reduction are similar to (and inspired by) the routability checking formulation of [11]. In the remainder of this section, we describe in detail the variables and rules of our reduction. In Section 5.2, we list all boolean variables needed by our reduction, and we define the notation and terminology we use to describe the rules enforced on those variables. Then, in Section 5.3, we list all our rules, and describe the clause structures that enforce those rules.
5.2 Variable Definitions, Notation, and Terminology
Next, we describe each boolean variable used by our SAT formulation, and we give the notation used to refer to that variable. Then, we define the notation we use to refer to various quantities associated with the routing problem. Finally, we define the terminology we use to describe the routing rules.
5.2.1 Variable Definitions
A boolean variable $x_{ij}$ is associated with each grid segment; this variable indicates whether the segment is filled or empty. Clearly, a given routing can be specified as an assignment to these variables. Additionally, each segment has associated with it a bit vector $\vec{x}_{ij}$ corresponding to the ID of the net which passes through it.
As it turns out, the rules for routing a two-terminal net can be specified much more simply than those for routing a multi-terminal net. Thus, in this formulation, a t-terminal net is treated as t - 1 two-terminal subnets which are allowed to share grid segments (whereas subnets of different nets must of course be disjoint). If the terminals of net n are $(t_1, t_2, \ldots, t_m)$, then the two-terminal subnets of n are $(t_1, t_2), (t_1, t_3), \ldots, (t_1, t_m)$. If T is the maximum number of subnets of any net, then every segment $x_{ij}$ has associated with it T valid bits $\vec{v}_{ij} = (v_{ij(T-1)}, \ldots, v_{ij1}, v_{ij0})$. The kth valid bit $v_{ijk}$ indicates whether $x_{ij}$ carries the kth subnet of the net whose ID is given by $\vec{x}_{ij}$.
If a net n is an I/O net, then a pin must be placed at some grid point (i, j) in the M2 (metal 2) layer on the routing of n. Our routing formulation accommodates this by adding a floating "pin" terminal to each I/O net, and defining a set of variables which encode the location of the pin terminal of each I/O net. For each grid point (i, j) in the M2 layer, we define a variable $P_{ij\ell}$, which is 1 if (i, j) is the location of a pin terminal of the $\ell$th I/O net. Variable $P_{ij}$ is 1 if (i, j) is the location of any pin terminal.
For convenience, all variables and their definitions are listed in Section 7, the Appendix.
5.2.2 Notation
In this section, we define the notation we use to refer to various parameters of the routing problem. Let r and c denote the number of rows and columns of the routing grid. Let N be the total number of nets to be routed. Let b be the number of bits needed to encode nonnegative integers up to N - 1; that is, $b = \lceil \lg N \rceil$. Here, b is the number of bits needed to encode the ID of a given net. Let $n_i$ denote the net whose ID is i, for $i \in \{0, 1, \ldots, N - 1\}$. Let $T(n_i)$ denote the number of terminals of net $n_i$. Recall that if net n has terminals $(t_1, t_2, \ldots, t_m)$, then the subnets of n are $(t_1, t_2), \ldots, (t_1, t_m)$; let $s(n_i, j)$ denote the jth subnet of net $n_i$. Define T to be one less than the maximum number of terminals on any net: $T = \max\{T(n_0), T(n_1), \ldots, T(n_{N-1})\} - 1$. Finally, define $\eta$ to be the total number of I/O nets.
For reference, this notation has been reproduced in the Appendix.
5.2.3 Terminology
A point where two or more grid segments meet is referred to as a grid point. The two grid points at either end of a grid segment (usually referred to merely as a segment) are its endpoints. If two segments share an endpoint, they are adjacent; the relation "segment $s_1$ is adjacent to $s_2$" is written $s_1 \leftrightarrow s_2$. If a segment $x_{ij}$ has a grid point $g_{ij}$ as an endpoint, then $x_{ij}$ is incident to $g_{ij}$ (written $x_{ij} \leftrightarrow g_{ij}$). The set of segments adjacent to a given segment are its neighbors; those of its neighbors which are incident to the same endpoint e are its neighbors at e. A segment which is electrically connected to a terminal of net n is said to carry net n; the net carried by a given segment $x_{ij}$ is the net whose net ID is given by $\vec{x}_{ij}$. Similarly, a segment $x_{ij}$ is said to carry a subnet s(n, k) if $\vec{x}_{ij} = n$ and $v_{ijk} = 1$. There are two types of terminals: fixed terminals, which are pre-specified by the transistor placement, and pin terminals, which correspond to I/O pins, and whose location is determined by the router. We refer to a terminal at the ith row and jth column using the notation $g_{ij}$, or $g_{\mathit{LAYER}\,ij}$ when we wish to specify the layer in which the terminal lies. When we wish to refer to the net ID of a given terminal $g_{ij}$, we use the notation $\vec{g}_{ij}$.
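The quantities defined in Sections 5.2.1 and 5.2.2 are straightforward to compute from a netlist. The sketch below assumes each net is given as a list of terminal labels and uses hypothetical function names; it shows the decomposition of a net into two-terminal subnets that all share the first terminal, the net ID width $b = \lceil \lg N \rceil$, and the valid-bit count T.

```python
def two_terminal_subnets(terminals):
    """Split (t1, ..., tm) into the subnets (t1, t2), (t1, t3), ..., (t1, tm)."""
    first = terminals[0]
    return [(first, t) for t in terminals[1:]]


def net_id_bits(num_nets):
    """b = ceil(lg N): bits needed to encode net IDs 0 .. N-1.

    For N >= 1 this equals the bit length of N - 1, which avoids
    floating-point logarithms.
    """
    return (num_nets - 1).bit_length()


def max_subnets(nets):
    """T = (maximum number of terminals on any net) - 1."""
    return max(len(terminals) for terminals in nets) - 1
```

For example, a net with terminals `["t1", "t2", "t3", "t4"]` yields three subnets, so each segment would need at least three valid bits if this were the largest net.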
5.3 Routing Rules and Enforcer Clauses
In this section, we list the routing rules and describe the enforcer clause structures. For clarity, we have broken the routing rules down into groups, which we describe here. The segment rules are the routing rules applied to each grid segment. The fixed terminal rules and pin terminal rules are the rules applied to every fixed terminal and pin terminal, respectively. The fabric rules are rules which arise from the parameters of the regular fabric (e.g., the vertical and horizontal quanta), and from DFM considerations. The rules which ensure that the routing will be DRC-correct are a subset of the fabric rules. Finally, various other rules which ensure that a satisfying assignment corresponds to a valid routing are grouped together as general formulation rules.
5.3.1 General Formulation Rules
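Rule 1. This rule ensures that the net ID given by $\vec{x}_{ij}$ of every filled-in segment $x_{ij}$ is valid. In other words, if the net ID $\vec{x}_{ij}$ of a segment $x_{ij}$ is out of range (i.e., $\vec{x}_{ij} \ge N$ when there are N nets), then segment $x_{ij}$ is empty. The logic for this rule is $\vec{x}_{ij} \ge N \Rightarrow \overline{x_{ij}}$, or equivalently, $(\vec{x}_{ij} \le N - 1) \lor \overline{x_{ij}}$.
The precise CNF logic to implement $(\vec{x}_{ij} \le N - 1)$ depends on the value of the constant N, but it is straightforward to generate such logic when the value of N is known. For example, suppose N = 107 (and thus b = 7)⁴. The binary representation of N - 1 is then 1101010. By looking at the binary number 1101010, we determine the constraints on $\vec{x}_{ij}$ to satisfy $\vec{x}_{ij} \le N - 1$ in the following way. For every 0 in N - 1's binary representation, if $\vec{x}_{ij}$ has a 1 in that position, then it cannot also have a 1 in every more significant position in which N - 1 also has a 1. For example, N - 1 has a 0 in its third most significant bit (bit 4), so if $\vec{x}_{ij}$ has a 1 in its third most significant bit (i.e., $x_{ij4} = 1$), then it cannot have bits 5 and 6 both 1, because that would imply $\vec{x}_{ij} > N - 1$. Completing this example, the constraints on the bits to enforce $\vec{x}_{ij} \le N - 1$ are (in CNF format):
$$(\overline{x_{ij6}} + \overline{x_{ij5}} + \overline{x_{ij4}}) \wedge (\overline{x_{ij6}} + \overline{x_{ij5}} + \overline{x_{ij3}} + \overline{x_{ij2}}) \wedge (\overline{x_{ij6}} + \overline{x_{ij5}} + \overline{x_{ij3}} + \overline{x_{ij1}} + \overline{x_{ij0}})$$
Since the rule itself is $(\vec{x}_{ij} \le N - 1) \lor \overline{x_{ij}}$, the literal $\overline{x_{ij}}$ is added to each clause, giving the enforcer clauses:
$$(\overline{x_{ij}} + \overline{x_{ij6}} + \overline{x_{ij5}} + \overline{x_{ij4}})$$
$$(\overline{x_{ij}} + \overline{x_{ij6}} + \overline{x_{ij5}} + \overline{x_{ij3}} + \overline{x_{ij2}})$$
$$(\overline{x_{ij}} + \overline{x_{ij6}} + \overline{x_{ij5}} + \overline{x_{ij3}} + \overline{x_{ij1}} + \overline{x_{ij0}})$$
⁴Hopefully, N will never be anywhere near this big, but it makes the example more interesting.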
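The construction described in Rule 1 (one clause per 0 bit of N - 1) can be sketched as a small generator. The encoding is an assumption made for illustration: bit k of the net ID is represented by the integer k + 1, a negative integer denotes a complemented literal, and the function name is hypothetical. The segment literal $\overline{x_{ij}}$ is left out so that the comparator logic can be checked on its own.

```python
def leq_const_clauses(bound, b):
    """CNF clauses forcing a b-bit vector to encode a value <= bound.

    For every 0 bit of `bound` at position k, emit a clause forbidding
    "bit k is 1 and every more significant bit where `bound` has a 1 is 1".
    Literal -(m + 1) means "bit m is 0".
    """
    clauses = []
    for k in range(b):
        if not (bound >> k) & 1:
            higher_ones = [m for m in range(k + 1, b) if (bound >> m) & 1]
            clauses.append([-(m + 1) for m in reversed(higher_ones)] + [-(k + 1)])
    return clauses


# N = 107, so N - 1 = 106 = 1101010 in binary and b = 7.
example_clauses = leq_const_clauses(106, 7)
```

For the example in the text this produces three clauses of 3, 4, and 5 literals, one for each 0 in 1101010; prepending the complemented segment variable to each clause would give the full enforcer clauses for the rule.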
Rule 2. This rule ensures that segment $x_{ij}$ is filled in if and only if it carries at least one subnet (i.e., at least one of its valid bits is 1). The logic to enforce this rule is:
$$x_{ij} \Leftrightarrow (v_{ij0} \lor v_{ij1} \lor \cdots \lor v_{ij(T-1)})$$
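One standard way to turn the "filled in if and only if some valid bit is 1" condition of Rule 2 into clauses is shown below. This is a sketch under assumptions: variables are positive integers, negative integers are complemented literals, and the function name is hypothetical; the original enforcer clauses for this rule may be arranged differently.

```python
def filled_iff_carries(x, valid_bits):
    """Clauses for x <-> (v_0 OR v_1 OR ... OR v_{T-1}).

    One long clause says "x implies some valid bit"; one binary clause per
    valid bit says "that valid bit implies x".
    """
    forward = [[-x] + list(valid_bits)]
    backward = [[-v, x] for v in valid_bits]
    return forward + backward


# Segment variable 1 with three valid bits 2, 3, 4 (T = 3).
rule2_clauses = filled_iff_carries(1, [2, 3, 4])
```

The encoding uses T + 1 clauses per segment, so its size grows linearly with the number of valid bits.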
Rule 3. If segment $x_{ij}$ has a valid net ID (i.e., less than N), then all valid bits corresponding to subnet indices outside the subnet range for this segment are set to 0. For example, suppose segment $x_{ij}$ has a net ID of 1, N = 10, net 1 has 4 terminals (hence it has 3 two-terminal subnets), and T = 7. Since $x_{ij}$ carries net 1, we should zero out all its valid bits that correspond to subnet indices which are outside the subnet range for net 1 (namely, $v_{ij3}$, $v_{ij4}$, $v_{ij5}$, and $v_{ij6}$).
The logic to enforce this rule assumes that the nets are named in order of decreasing number of terminals; i.e., net 0 has at least as many terminals as net 1, which has at least as many as net 2, etc. This can be easily accomplished by simply renaming nets. The exact logic generated per segment depends on the number of terminals each net has, but an example will serve to illustrate the method. Suppose the number of terminals for each net in our example is given by the following table:
net ID   # terminals
From the table we see that if the net ID $\vec{x}_{ij}$ of segment $x_{ij}$ is greater than 0, its fourth valid bit $v_{ij3}$ should be set to 0. Additionally, if $\vec{x}_{ij} > 1$, then its third valid bit $v_{ij2}$ should be set to 0 as well (its fourth valid bit $v_{ij3}$ should also be zeroed, but that was already taken care of by the logic that enforces $\vec{x}_{ij} > 0 \Rightarrow \overline{v_{ij3}}$). Finally, if $\vec{x}_{ij} > 2$, then its second valid bit $v_{ij1}$ should be zeroed. The logic to implement a rule such as $\vec{x}_{ij} > n \Rightarrow \overline{v_{ijt}}$ was described in detail in General Formulation Rule 1, and won't be repeated here.
5.3.2 Segment Rules
Rule 1. For each pair of adjacent segments $x_{i_1 j_1}$, $x_{i_2 j_2}$: if both $x_{i_1 j_1}$ and $x_{i_2 j_2}$ are filled in, they must carry the same net ID (otherwise the net carried by $x_{i_1 j_1}$ would be shorted to the net carried by $x_{i_2 j_2}$). The derivation of the CNF clauses to enforce this rule is now shown.
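One plausible clause structure for Segment Rule 1 is sketched here: for each bit position of the net ID, two clauses state that if both segments are filled in, the bits agree. This is an assumption offered for illustration rather than the derivation from the original text; variables are positive integers, negative integers are complemented literals, and the function name is hypothetical.

```python
def same_net_clauses(xa, xb, id_a, id_b):
    """Clauses for (xa AND xb) -> (id_a == id_b), bit by bit.

    xa, xb: fill variables of two adjacent segments.
    id_a, id_b: equal-length lists of net-ID bit variables.
    """
    clauses = []
    for a_k, b_k in zip(id_a, id_b):
        clauses.append([-xa, -xb, -a_k, b_k])  # both filled and a_k -> b_k
        clauses.append([-xa, -xb, a_k, -b_k])  # both filled and b_k -> a_k
    return clauses


# Two segments (variables 1 and 2) with 2-bit net IDs (3, 4) and (5, 6).
rule_clauses = same_net_clauses(1, 2, [3, 4], [5, 6])
```

This yields 2b clauses per adjacent pair, each with four literals.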
Rule 2. For each endpoint e of each filled-in segment $x_{ij}$ and for each subnet s carried by $x_{ij}$: if e is not a terminal of s (neither a fixed terminal nor a pin terminal), then exactly one of $x_{ij}$'s neighbors at e carries s. Note that only segments in the VIA1 and M2 layers may be incident to pin terminals. The full CNF clause set corresponding to this rule is too complex and layer-dependent to give here, so here is a simplified description, which can easily be converted to CNF format for a given layer:
    foreach segment $x_{ij}$
        foreach endpoint e of $x_{ij}$ which is not a terminal
            foreach subnet index t, $t \in [0, T - 1]$
                $v_{ijt} \Rightarrow$ exactly one of the $v_{i'j't}$ is set to 1, over segments $x_{i'j'}$ adjacent to $x_{ij}$ at e
5.3.3 Fixed Terminal Rules
Rule 1. If grid point $g_{ij}$ is a fixed terminal of subnet s(n, k), then exactly one segment incident to $g_{ij}$ carries s(n, k). For fixed terminals, the clauses are generated as follows:
    foreach fixed terminal $g_{ij}$
        EXACTLYONE($v_{ijk} \mid x_{ij} \leftrightarrow g_{ij}$)
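The EXACTLYONE and ATMOSTONE constraints used throughout the terminal rules can be expanded into clauses in several ways. The sketch below uses the simple pairwise encoding, which is one common choice and is not claimed to be the encoding used in this work; literals are signed integers and the function names are hypothetical.

```python
from itertools import combinations


def at_most_one(literals):
    """Pairwise encoding: no two of the given literals may both be true."""
    return [[-a, -b] for a, b in combinations(literals, 2)]


def exactly_one(literals):
    """At least one literal is true, and no two are true together."""
    return [list(literals)] + at_most_one(literals)


# Valid bits (variables 1, 2, 3) of the three segments incident to a terminal.
terminal_clauses = exactly_one([1, 2, 3])
```

The pairwise form needs a quadratic number of binary clauses, which stays small here because only a handful of segments are incident to any one grid point.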
Rule 2. If grid point $g_{ij}$ is a fixed terminal of subnet s(n, k), then any filled-in segment incident to $g_{ij}$ must carry net n. The following logic generates the necessary clauses:
    foreach fixed terminal $g_{ij}$
        foreach segment $x_{i'j'} \leftrightarrow g_{ij}$
            $x_{i'j'} \Rightarrow (\vec{x}_{i'j'} = \vec{g}_{ij})$
Note that the net ID $\vec{g}_{ij}$ of fixed terminals $g_{ij}$ is actually a constant, rather than a b-bit vector of boolean variables; thus the logic required to enforce the above rules is simpler than is needed for pin terminals.
5.3.4 Pin Terminal Rules
Rule 1. For each possible pin location (that is, for each M2 grid point (i, j)), if that location implements the pin terminal of the $\ell$th I/O net (i.e., if $P_{ij\ell}$ is 1), then exactly one of M2 grid point (i, j)'s neighbors carries the subnet corresponding to that terminal. Note that the subnet index corresponding to a pin terminal is always the maximum subnet index of that net (if an I/O net n has 4 terminals, and thus 3 subnet indices 0, 1, and 2, the subnet index of the pin terminal will be 2).
    for i ← 1; i ≤ r + 1; i ← i + 1
        for j ← 0; j ≤ c; j ← j + 1
            for $\ell$ ← 0; $\ell$ < $\eta$; $\ell$ ← $\ell$ + 1
                { Let k be the maximum subnet index of the $\ell$th I/O net }
                $P_{ij\ell} \Rightarrow$ EXACTLYONE($v_{M2\,ijk} \mid M2_{ij} \leftrightarrow g_{M2\,ij}$)
Rule 2. For every possible pin location (i, j), if that location implements the pin terminal of the ℓth I/O net (Pijℓ = 1), then by Rule 1, one of grid point gM2ij's M2 neighbors carries the pin terminal subnet s corresponding to the ℓth I/O net. Since we only want one segment adjacent to gM2ij to carry s (as in Fixed Terminal Rule 1), we make sure that the VIA1 segment VIA1ij does not carry s.
for i ← 1; i ≤ r + 1; i ← i + 1
    for j ← 0; j ≤ c; j ← j + 1
        for ℓ ← 0; ℓ < η; ℓ ← ℓ + 1
            { Let k be the maximum subnet index of the ℓth I/O net }
            Pijℓ ⇒ ¬vVIA1ijk
Rule 3. If grid point gM2ij is the pin terminal of net n, the ℓth I/O net, then any segment incident to gM2ij must carry net n.
for i ← 1; i ≤ r + 1; i ← i + 1
    for j ← 0; j ≤ c; j ← j + 1
        for ℓ ← 0; ℓ < η; ℓ ← ℓ + 1
            { Let n be the ℓth I/O net }
            Pijℓ ∧ M2ij ⇒ (net ID of M2ij) = n
            Pijℓ ∧ M2i(j-1) ⇒ (net ID of M2i(j-1)) = n
            Pijℓ ∧ VIA1ij ⇒ (net ID of VIA1ij) = n
Rule 4. Every M2 grid point gM2ij can be the site of at most one pin terminal.
for i ← 1; i ≤ r + 1; i ← i + 1
    for j ← 0; j ≤ c; j ← j + 1
        ATMOSTONE(Pijℓ, ℓ = 0, ..., η - 1)
Rule 5. For every I/O net n, at least one possible pin terminal location actually implements the pin terminal of net n.
for ℓ = 0, ..., η - 1:
    (∨ over i = 1, ..., r + 1 and j = 0, ..., c) Pijℓ
Rule 6. For every I/O net n, at most one possible pin terminal location actually implements the pin terminal of net n. The loops below visit each unordered pair of distinct pin locations once.
for ℓ ← 0; ℓ < η; ℓ ← ℓ + 1
    for i ← 1; i ≤ r + 1; i ← i + 1
        for j ← 0; j ≤ c; j ← j + 1
            for m ← i; m ≤ r + 1; m ← m + 1
                for n ← (j + 1 if m = i, else 0); n ≤ c; n ← n + 1
                    ATMOSTONE(Pijℓ, Pmnℓ)
Rule 7. This rule enforces the definition of the pin variable class. Pin variable Pij is 1 iff M2 grid point (i, j) is the location of a pin terminal for some net.
for i ← 1; i ≤ r + 1; i ← i + 1
    for j ← 0; j ≤ c; j ← j + 1
        Pij ⇔ (Pij0 ∨ Pij1 ∨ ... ∨ Pij(η-1))
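Pin Terminal Rules 4 through 7 involve only the pin variables, so they can be generated together. The sketch below is one possible way to do so, assuming DIMACS-style integer literals and a pairwise at-most-one encoding. The function names and the variable numbering are hypothetical and not taken from the actual tool. Pairs are enumerated with itertools.combinations, which visits each unordered pair exactly once.

```python
from itertools import combinations, product


def make_pin_vars(r, c, num_io, first_var=1):
    """Assign a variable number to every Pijl and Pij.

    Pin locations are the M2 grid points with i in 1..r+1 and j in 0..c.
    The numbering scheme is an assumption made for this sketch.
    """
    locs = list(product(range(1, r + 2), range(0, c + 1)))
    next_var = first_var
    p, p_any = {}, {}
    for loc in locs:
        for l in range(num_io):
            p[loc + (l,)] = next_var
            next_var += 1
        p_any[loc] = next_var
        next_var += 1
    return locs, p, p_any


def pin_rule_clauses(locs, p, p_any, num_io):
    """CNF clauses for Pin Terminal Rules 4, 5, 6 and 7."""
    clauses = []
    for loc in locs:
        per_net = [p[loc + (l,)] for l in range(num_io)]
        # Rule 4: a location hosts at most one pin terminal.
        clauses += [[-a, -b] for a, b in combinations(per_net, 2)]
        # Rule 7: Pij <=> (Pij0 OR ... OR Pij(num_io-1)).
        clauses.append([-p_any[loc]] + per_net)
        clauses += [[-v, p_any[loc]] for v in per_net]
    for l in range(num_io):
        per_loc = [p[loc + (l,)] for loc in locs]
        # Rule 5: at least one location implements the pin of net l.
        clauses.append(per_loc)
        # Rule 6: no two locations both implement the pin of net l.
        clauses += [[-a, -b] for a, b in combinations(per_loc, 2)]
    return clauses
```

For a grid with L pin locations, Rule 6 alone contributes on the order of η·L² binary clauses, which is one reason the clause counts grow quickly with brick size.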
5.3.5 Fabric Rules
This section describes those rules which arise from the particular layout style (or "scheme") being used, from DRC correctness, or from DFM considerations. The two styles we currently support are a BEOL-limited design style and an FEOL-limited design style (for our purposes, the only difference between the two is in the choice of horizontal and vertical quanta). Some rules, which are based on DRC correctness, are shared between both styles; these are listed first. The logic to implement these rules is so simple that an English description of the rules suffices.
DRC Rule 1. Vertically adjacent contacts are not allowed.
DRC Rule 2. Horizontally adjacent vias are disallowed; vertically adjacent vias are allowed only if they share the same net.
DRC Rule 3. A filled-in via segment must be adjacent to at least one filled-in M1 segment (to satisfy minimum metal rules).
FEOL Rule 1. Horizontally adjacent M1 segments are not allowed. This rule ensures DRC-correctness in the FEOL style, but is not needed in the BEOL style.
FEOL Rule 2. If two filled-in M2 segments s and s' lie in the same row and are separated by exactly two empty segments, then the horizontal tracks above and below s and s' must be empty in the columns between and including s and s'. This rule, as well as FEOL Rule 3 and BEOL Rule 1, exists to accommodate hammerhead corrections for OPC.
FEOL Rule 3. If a filled-in M2 segment s lies on the left or right boundary of the brick (column 0 or c - 1), then the M2 segments directly above and below s are empty.
BEOL Rule 1. If three filled-in M1 segments lie in row r and occupy columns j, j + 1, and j + 2, then it is not allowed for both M1(r+1)j and M1(r+1)(j+1) to be empty; nor is it allowed for both M1(r+1)(j+1) and M1(r+1)(j+2) to be empty. If three filled-in M1 segments lie in row 1 and occupy columns j, j + 1, and j + 2, then it is not allowed for both M10j and M10(j+1) to be empty; nor is it allowed for both M10(j+1) and M10(j+2) to be empty.
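To illustrate how directly such rules map to clauses, here is a minimal sketch for one instance of BEOL Rule 1, assuming DIMACS-style integer literals. The function name and the variable numbers are hypothetical; `run` holds the fill variables of the three consecutive M1 segments and `adjacent` holds the fill variables of the three M1 segments in the neighboring row named by the rule, in the same columns.

```python
from typing import List, Sequence

Clause = List[int]  # positive int = variable, negative int = its negation


def beol_rule1_clauses(run: Sequence[int],
                       adjacent: Sequence[int]) -> List[Clause]:
    """If all three segments in `run` are filled, then neither the left pair
    nor the right pair of `adjacent` segments may both be empty."""
    a, b, c = run
    u0, u1, u2 = adjacent
    guard = [-a, -b, -c]  # false only when all three run segments are filled
    return [
        guard + [u0, u1],  # columns j, j+1: at least one filled
        guard + [u1, u2],  # columns j+1, j+2: at least one filled
    ]
```

Each instance of the rule therefore costs just two five-literal clauses, which is consistent with the remark that an English description of these rules suffices.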
6 Experimental Results and Conclusions
Although our SAT-based routing tool is still under development, the results we have seen so far are promising. In the table below, runtimes are given for netlists whose complexity and composition are typical for bricks. The runtime reported is the total time required to convert the logic-level netlist to a layout. All experiments were run on a 2.8 GHz Intel Pentium 4 with 1 GB of RAM running GNU/Linux. HaifaSat was the SAT solver used in all experiments.
Name   Primitives in netlist   No. vars   No. clauses   Runtime (s)
ex1    2 ND2                   3259       19613         0.644
ex2    2 ND2, 1 INV            4791       26873         1.162
ex3    3 ND2                   ?905       44903         2.593
ex4    2 ND2, 2 NR2            ?809       57770         2.873
ex5    3 ND2, 1 ND3            1667       83075         16.09
In Figure 7, the layouts of bricks ex1, ex2, and ex3 are shown. It is clear (particularly for layouts ex3 and ex5) that the routing which is found by the SAT-based router does not have the minimum possible wirelength. This is because our NP formulation asks, "is there a routing?" rather than "is there a routing that uses fewer than W segments?" We are promised only a routing solution, if one exists; we can make no guarantees about its quality. On the other hand, our routing method has the completeness property that it is guaranteed to find a routing, if one exists. This is a claim that, to the best of our knowledge, no other routing algorithm can make.
It is quite possible to extend our framework to find routing solutions of minimum wirelength. In fact, there are many objectives which one could exactly minimize using reductions to boolean satisfiability. For example, if one wishes to exactly minimize wirelength, one can add to the above routing rules a new rule that forces the routing solution to have wirelength at most W (that is, at most W segments are allowed to be filled in). This rule could be implemented via CNF logic which adds all the segment variables together, and checks that their sum is at most W. Now, using multiple SAT solves (rather than just one), one could binary search on the value of W to find the minimum value such that the corresponding formula is satisfiable; the resulting routing would have the minimum possible wirelength!
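The binary search over W can be sketched as follows. This is only an illustration: `routable_within` is a hypothetical callback that would build the routing formula with the added "at most W filled segments" rule, run a SAT solver, and report whether the formula is satisfiable. The search relies on monotonicity, since any routing with at most W segments is also a routing with at most W + 1 segments.

```python
from typing import Callable, Optional


def min_wirelength(routable_within: Callable[[int], bool],
                   w_min: int, w_max: int) -> Optional[int]:
    """Return the smallest W in [w_min, w_max] for which routable_within(W)
    is True, or None if even W = w_max is unsatisfiable.

    w_min and w_max are any lower and upper bounds on total wirelength.
    """
    if not routable_within(w_max):
        return None  # no routing exists within the upper bound
    lo, hi = w_min, w_max  # invariant: routable_within(hi) is True
    while lo < hi:
        mid = (lo + hi) // 2
        if routable_within(mid):
            hi = mid        # a routing with <= mid segments exists
        else:
            lo = mid + 1    # every routing needs more than mid segments
    return lo
```

Each loop iteration halves the interval, so the number of SAT solves is logarithmic in w_max - w_min, plus the one initial feasibility check.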
Similar techniques could be applied to exactly minimize other objectives; one could first exactly minimize wirelength, then via count, and so on. Clearly, the NP framework is a powerful one.
However, it is possible that such minimization strategies may take a very long time. For example, if wirelength is to be minimized, then O(lg(Wmax - Wmin)) SAT solves are needed, where Wmax is any upper bound on total wirelength and Wmin is any lower bound. Furthermore, the logic that forces wirelength to be less than W will likely be complex, inflating each instance size and causing the runtime of each individual SAT solve to increase. On the other hand, the alternative to optimal brick generation is hand-layout, which for complex bricks can take a week or more. Thus, a program which can produce high-quality layouts in a matter of days may still be desirable. More research is needed to see which approaches are feasible, and what results can be achieved.
References
[1] Larry Pileggi, HermanSchmit, Andrzej J. Strojwas, Padmini Gopalakrishnan, Veerbhan Kheterpal, Aneesh
Koorapaty, Chetan Patel, Vycheslav V. Rovner, and K. YawTong. Exploring Regular Fabrics to Optimize the
Performance-Cost Tradeoff. In Proceedings of the 40th Conference on Design Automation, pages 782-787,
2003.
[2] Veerbhan Kheterpal, Vycheslav V. Rovner, Thiago G. Hersan, Dipti Motiani, Yoichi Takegawa, Andrzej J.
Strojwas, and Larry Pileggi. Design Methodologyfor IC Manufacturability Based on Regular Logic Bricks. In
Proceedings of the 42nd Conference on Design Automation, pages 353-358, 2005.
[3] Gordon E. Moore. CrammingMore ComponentsOnto Integrated Circuits. Electronics,
[4] Michael R. Garey and David S. Johnson. Computers and Intractability:
Completeness. Freeman, 1979.
[5] T. Uehara and W. VanCleemput. Optimal Layout of CMOSFunctional
Computers, Vol. C30, pages 305-314, 1981.
Vol. 38, No. 8, 1965.
A Guide to the Theory of NPArrays.
IEEE Transactions
on
[6] S. Wimer, R. Pinter, and J. Feldman. Optimal Chaining of CMOSTransistors in a Functional Cell. IEEE
Transactions on CAD,Vol 30(5), pages 795-801, 1987.
Leaf Cells of
[7] Bradley S. Carson, C. Y. Roger Chen, and Dikran S. Meliksetian. Transistor Chaining in CMOS
Planar Topology. In Proceeding of the Sixth Great, Lakes Symposiumon VLSIDesign, pages 194-199, 1996.
[8] Chi-Yi Hwang, Yung-Ching Hsieh, Youn-Long Lin, and Yu-Chin Hsu. An Optimal Transistor
Algorithm for CMOSCell Layout. ICCAD-89,pages 344-347, November1989.
Chaining
[9] H. Zhang and K. Asada. An Improved Algorithm of Transistors Pairing for CompactLayout of non-SeriesParallel CMOSNetworks. Proceedings of the IEEE 1993 Custom Integrated Circuits Conference, pages
17.2.1-17.2.4, May1993.
[10] T. Yoshimura and E. S. Kuh. El~icient Algorithms for Channel Routing. IEEE Transactions on Computer
Aided Design of Integrated Circuits and Systems, Vol. 1, Issue 1, pages 25-35, January 1982.
[11] William N. N. Hung, Xiayou Song, Timothy Kam, Lerong Cheng and GuowuYang. Routability Checking for
Three-Dimensional Architectures 1EEETransactions on Very Large Scale Integration (VLSI) Systems, Vol.
12, pages 1371-1374, December2004.
[12] Stephen A. Cook. The Complexity of Theorem Proving Procedures. Annual ACMSymposiumon Theory of
Computing, pages 151-158, 1971.
[13] Larry Pileggi, HermanShmit, Jigar Shah, K. YawTong, Chetan Patel, and Vikas Chandra. A Via Patterned
Gate Array. Technical Reports Series of the CMU
Center for Silicon System Implementation, No. CSSI 02-15,
March 2002.
The "Missing Link" of SoC Design -- Platform
[14] Jim Lipman.
http://www.techonline.com/community/ed_resource/36652,
June 2004.
and Structured
ASICs.
[15] Tejas Jhaveri. Private communication. November2005.
[16] International Technology P~oadmapfor Semiconductors. International Technology Roadmapfor Semiconductors 2004 Update -- Lithography. http://www.itrs.net/Common/2004Update/2004-07-Lithography.pdf,
December 2004
7 Appendix
All the variable definitions and notation associated with the reduction from brick routing to SAT (see Section 5) are listed here for reference.
7.1 Variable Definitions
xij: Here x is a layer, i.e., x ∈ {PO, CO, M1, VIA1, M2, M1_RAIL}. This variable indicates whether the grid segment of layer x is filled or empty (1 or 0, respectively). We also refer to the segment in layer x at row i and column j as xij.
Xij: Each segment xij has associated with it a bit vector Xij which denotes the net ID of the net it is carrying. This bit vector is of length b = ⌈lg N⌉. When we need to refer to the individual bits of Xij, we write Xij(b-1), ..., Xij1, Xij0.
vijk: Each segment xij has associated with it T valid bits. The kth valid bit vijk of segment xij indicates whether xij carries the kth subnet of the net whose ID is given by Xij. When referring to the kth valid bit of a segment in a given layer, row, and column, the notation vLAYERijk is used (e.g., vM1ijk).
Pijℓ: This variable is 1 if M2 grid point (i, j) corresponds to a pin terminal of the ℓth I/O net (whose net ID is not in general ℓ).
Pij: This variable is 1 if M2 grid point (i, j) corresponds to a pin terminal of some net. Pij = Pij0 ∨ Pij1 ∨ ... ∨ Pij(η-1).
7.2 Notation
r, c: The number of grid rows and columns.
N: Total number of nets.
b: The number of bits needed to represent N different things, ⌈lg N⌉.
ni: The net whose net ID is i, i ∈ {0, 1, ..., N - 1}.
T(ni): The number of terminals of net ni.
s(ni, 1), ..., s(ni, T(ni) - 1): The two-terminal subnets of net ni.
T: One less than the maximum number of terminals on any net; T = max {T(n0), T(n1), ..., T(nN-1)} - 1.
η: The total number of I/O nets.
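As a small worked illustration of how N, b, and T follow from a netlist, the sketch below computes them from a list of per-net terminal counts. The function name and the input format are assumptions made for this example only.

```python
from typing import List, Tuple


def formulation_sizes(terminals_per_net: List[int]) -> Tuple[int, int, int]:
    """Return (N, b, T) for a netlist given the terminal count of each net.

    N: total number of nets.
    b: number of bits needed to represent N different net IDs, ceil(lg N).
    T: one less than the maximum number of terminals on any net, which is
       also the number of valid bits carried by each segment.
    """
    n = len(terminals_per_net)
    # (n - 1).bit_length() equals ceil(lg n) for every n >= 1,
    # and avoids floating-point rounding in math.log2.
    b = (n - 1).bit_length()
    t = max(terminals_per_net) - 1
    return n, b, t


# Three nets with 2, 4 and 3 terminals: 2 ID bits and 3 valid bits per segment.
example = formulation_sizes([2, 4, 3])
```

Each segment therefore carries 1 + b + T boolean variables (fill bit, net-ID bits, and valid bits), which gives a rough sense of how the variable count scales with the grid size.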