AUTOMATED LAYOUT OF REGULAR FABRIC BRICKS

by Brian Taylor

A DISSERTATION SUBMITTED IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF MASTER OF SCIENCE IN ELECTRICAL AND COMPUTER ENGINEERING AT CARNEGIE MELLON UNIVERSITY

Committee: Prof. Larry Pileggi (advisor), Prof. Andrzej Strojwas

Pittsburgh, PA 15213. December 2005

ABSTRACT

As minimum feature sizes shrink into the deep-subwavelength range, manufacturing becomes increasingly difficult, and the costs and complexity of standard cell ASIC design become prohibitive for all but the highest-volume products. Many manufacturing difficulties owe to the irregularity of standard cell based design; therefore, design using more regular logic fabrics becomes an attractive alternative to the standard cell paradigm. In this report, we outline an alternative CAD flow using a DFM-friendly regular logic fabric based on heterogeneous yet highly regular structures called "bricks." In particular, we describe methods for the automatic generation of brick layouts. We show that regularity requirements simplify the problem of brick generation and can be exploited to yield layout methods which allow us to make strong guarantees of completeness and optimality. Finally, we present experimental results and discuss possibilities for future work.

Table of Contents

1 Introduction
1.1 Manufacturing Challenges
1.2 The Regular Fabrics Approach
2 Regular Fabric Definition and Layout Methodology
2.1 Regular Fabric Definition
2.2 Brick Layout Methodology: An Overview
3 Transistor Placement
3.1 Problem Description
3.2 Chaining Algorithm
3.2.1 Transistor Pairing
3.2.2 Extended Branch-and-Bound Algorithm
4 Brick Routing: An NP Formulation
5 SAT Reduction
5.1 Overview
5.2 Variable Definitions, Notation, and Terminology
5.2.1 Variable Definitions
5.2.2 Notation
5.2.3 Terminology
5.3 Routing Rules and Enforcer Clauses
5.3.1 General Formulation Rules
5.3.2 Segment Rules
5.3.3 Fixed Terminal Rules
5.3.4 Pin Terminal Rules
5.3.5 Fabric Rules
6 Experimental Results and Conclusions
7 Appendix
7.1 Variable Definitions
7.2 Notation

List of Figures

1 Exponential increase of ASIC design costs
2 Systematic vs. defect-based yield loss
3 The routing grid
4 Single-height one-row placement
5 Diffusion sharing
6 A candidate transistor placement and its associated channel routing problem
7 Left to right: ex1, ex3, and ex5

1 Introduction

Since its statement in 1965, Moore's Law [3] has predicted (and driven) an exponential increase in the number of transistors that can be packed onto an integrated circuit (IC). This increase in transistor density implies an exponential decrease in the size of the features which compose those transistors. As device sizes scale down into deep-subwavelength territory, the challenges of nanoscale manufacturing are becoming increasingly difficult to overcome.
Accordingly, the complexity and costs of Application-Specific IC (ASIC) design as it is currently practiced are skyrocketing [14] (see Figure 1). In this section, we discuss the manufacturing challenges of deep-submicron design, and we outline a design methodology for coping with these challenges.

Figure 1. Exponential increase of ASIC design costs.

1.1 Manufacturing Challenges

Many of the challenges of nanoscale manufacturing are related to optical lithography. For complex physical reasons, the rate at which the illumination wavelength is decreasing is not keeping pace with minimum feature size scaling. For example, a wavelength of 193nm was first widely used at the 130nm node; however, according to the 2004 ITRS Roadmap, 193nm lithography will remain in use all the way down to the 32nm node [16]. As the minimum feature size scales deep below the illumination wavelength, faithful reproduction of the smallest features becomes extremely difficult. This difficulty manifests itself in systematic and parametric yield loss, which are now the dominant yield loss mechanisms (see Figure 2).

Resolution Enhancement Techniques (RETs), such as Optical Proximity Correction (OPC) and Phase Shift Masks (PSM), have been developed to enhance pattern fidelity, but these suffer from several limitations. Rule-based OPC is computationally expensive and time-consuming, as it requires processing and modifying each of the billions of shapes that define an IC. Moreover, the correcting capability of simple rule-based OPC is limited. Model-based OPC techniques utilize lithography simulations to yield better corrections, but this simulation is time-consuming and infeasible over large areas of irregular layouts [1]. Finally, Alternating Apertures (AA) PSM enhances resolution via destructive interference; however, it is difficult to apply to arbitrary layouts due to phase conflicts.
The usefulness of RETs is limited by the complexity and irregularity of standard cell-based layouts. Consequently, more regular layout styles are needed if classical scaling is to continue apace.

Figure 2. Systematic vs. defect-based yield loss.

1.2 The Regular Fabrics Approach

Over the years, several families of regular logic fabrics have been proposed. Field Programmable Gate Arrays (FPGAs) are highly regular as well as reprogrammable. However, this flexibility comes at a very high cost in terms of area, power, and performance when compared to ASIC designs. Pileggi et al. [13] proposed a Via Patterned Gate Array (VPGA) fabric which retained the regularity of FPGAs, but with significantly improved performance, density, and power. More recently, the Regular Fabrics Group at Carnegie Mellon University has proposed [2] a heterogeneous fabric composed of highly regular structures referred to as "bricks."

The proposed brick design methodology is as follows. First, the RTL of the design being implemented is analyzed, and a limited set of logic functions with which the design can be efficiently implemented is constructed. These logic functions are typically of moderate complexity, usually having around 6 to 10 inputs. This set of logic functions is then converted into a small set of roughly 20-30 logic-level netlists via BDD-based decomposition. Each netlist is composed of logic gates from a small set of primitives (e.g., NAND2, NOR3, AOI21). The logic granularity of these netlists is roughly 2-3x a single NAND gate. Finally, a layout is automatically generated for each netlist. These layouts are referred to as "bricks," and the automatic generation of regular fabric brick layouts is the subject of this report.

2 Regular Fabric Definition and Layout Methodology

2.1 Regular Fabric Definition

There are many possible ways in which one could define a regular fabric. Our regular fabric is defined primarily by three characteristics.

The first characteristic is that all layers are unidirectional. The unidirectionality of all layers greatly improves printability and significantly reduces Across Chip Linewidth Variation (ACLV). Reducing ACLV is particularly important for the poly layer, because linewidth variation can cause severe deviation of transistor characteristics (due to gate length variation) and can contribute to parametric yield loss.

The second characteristic is that the pitch of each layer is a fixed, single value; all horizontal layers have a fixed pitch which is a multiple of some horizontal quantum, and vertical layers have a pitch which is a multiple of a vertical quantum. Forcing each layer to have a single, fixed pitch allows off-axis illumination schemes to be optimized for that pitch, thereby increasing the process window and avoiding the problems associated with phenomena such as the "forbidden pitch" [15].

The third characteristic is that metal lines are made thicker to avoid notches and landing pads. Notches and landing pads are irregular, hard-to-print features which are easily removed by the thickening of metal lines, and this thickening is easily accomplished given the fixed pitches at which all metal lines are required to occur.

The fixed pitch requirement has a pleasant side effect: the conducting layers form a coarse grid, as illustrated in Figure 3. This property is very important for our brick layout algorithms, as will become clear in subsequent sections.

Figure 3. The routing grid.

2.2 Brick Layout Methodology: An Overview

Our brick layout is performed using a single-height one-row placement style (see Figure 4). Essentially, this means that the pMOS transistors lie single file in a row at the top of the cell, and the nMOS transistors are similarly arranged at the bottom of the cell. Brick layout is done in two phases: transistor placement, and routing.

Optimal transistor placement for single-height one-row cells is a well-studied problem [5, 6, 7]. In this report, we extend the minimum-width branch-and-bound technique of [8] to exactly minimize channel density and estimated wirelength as well as area. Our new algorithm is described in detail in Section 3.

As mentioned above, the fixed pitch requirement of our fabric implies that all routing must take place on a coarse grid whose granularity is given by the vertical and horizontal grid quanta. The discrete nature of this routing grid gives rise to an NP formulation of the brick routing problem. The solution method which arises from this formulation allows us to make strong guarantees of completeness and optimality that virtually no general-purpose routing algorithms (e.g., maze routing) can make. Our routing method is discussed in more detail in Section 4.

Figure 4. Single-height one-row placement (p-row at the top, n-row at the bottom).

3 Transistor Placement

3.1 Problem Description

We begin our discussion of transistor placement by defining some notation. We represent a transistor as an ordered 4-tuple $T = (g, s, d, t)$, where $g$, $s$, $d$ are integer net indices (corresponding to the gate, source, and drain nets, respectively), and $t$ is a boolean indicating the type of $T$ (either n-type or p-type). The source $s(T)$ and drain $d(T)$ of transistor $T$ are its diffusion nets. Transistors $T_1$ and $T_2$ are abuttable if they are of the same type (either p or n), and they share at least one diffusion net in common.

Since bricks are laid out in a single-height, one-row style, the problem of transistor placement reduces to the problem of finding a good ordering of the p transistors in the p-row, and of the n transistors in the n-row. The choice of ordering is very important for two reasons. First, if two adjacent transistors are abuttable, then the diffusion net that they have in common may be shared in diffusion between the two transistors, thereby reducing the brick width by one poly pitch (see Figure 5). We refer to a sequence of transistors $T_1, \ldots, T_n$ such that $T_i$ shares diffusion with $T_{i-1}$ and $T_{i+1}$ as a transistor chain. A good ordering will allow us to chain as many transistors together as possible using diffusion sharing, resulting in a brick of minimum width. The second reason the choice of ordering is important is that some orderings may correspond to placements which are much easier to route than other orderings. Thus, the transistor ordering determines the width of the brick and its routing characteristics (e.g., best possible wirelength), so it is crucial to find the best possible ordering.

Figure 5. Diffusion sharing.

3.2 Chaining Algorithm

Our method of transistor placement consists of two steps: pairing and chaining. First, we pair the transistors into (p, n) transistor pairs. Then, we apply an extension of the branch-and-bound algorithm of [8] to construct a placement of minimum width, channel density, and estimated wirelength.

3.2.1 Transistor Pairing

Rather than consider the n and p transistors separately, we group them into (p, n) pairs which are vertically aligned in the final layout, and then find an ordering of the pairs. There are algorithmic techniques for performing such a pairing [9]. However, because each brick is assembled from a logic-level netlist, and since the netlist is composed of gates from a small set of logic primitives, we can take a simpler approach. Namely, we maintain a lookup table from each primitive to an optimal pairing for that primitive. For example, wherever a NAND2 appears in the netlist, we replace it with two (p, n) pairs; the transistors that are paired together are those which share the same gate signal.
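The definitions of Section 3.1 and the NAND2 pairing example can be made concrete with a small sketch. Everything named here (`Transistor`, `abuttable`, `pair_by_gate`, `min_chains`, and the net numbers in the usage note) is hypothetical and chosen for illustration; it is not the report's implementation. In particular, `min_chains` only evaluates one fixed left-to-right ordering of a single row (choosing each transistor's source/drain orientation freely); it is not the branch-and-bound search of [8].

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Transistor:
    g: int   # gate net index
    s: int   # source net index
    d: int   # drain net index
    t: bool  # type flag; True = p-type is an assumed convention


def abuttable(a, b):
    """Same type, and at least one diffusion net in common."""
    return a.t == b.t and bool({a.s, a.d} & {b.s, b.d})


def pair_by_gate(p_row, n_row):
    """Pair each p transistor with the n transistor driven by the same gate net.
    Assumes each gate net drives exactly one transistor of each type,
    as in a NAND2."""
    n_by_gate = {tr.g: tr for tr in n_row}
    return [(p, n_by_gate[p.g]) for p in p_row]


def min_chains(row):
    """Fewest chains for a FIXED ordering of one row of same-type transistors.
    Dynamic program over which diffusion net faces right after each device."""
    if not row:
        return 0
    best = {row[0].s: 1, row[0].d: 1}  # exposed right-hand net -> chain count
    for tr in row[1:]:
        nxt = {}
        for left, right in ((tr.s, tr.d), (tr.d, tr.s)):
            # Sharing is free when the previous right edge matches our left
            # edge; otherwise a diffusion break starts a new chain.
            cost = min(c + (0 if r == left else 1) for r, c in best.items())
            nxt[right] = min(cost, nxt.get(right, cost))
        best = nxt
    return min(best.values())
```

As a usage example under these assumptions, a NAND2 with nets numbered VDD=0, OUT=1, A=2, B=3, GND=4, MID=5 has pull-up devices `Transistor(2, 0, 1, True)` and `Transistor(3, 0, 1, True)` and pull-down devices `Transistor(2, 5, 1, False)` and `Transistor(3, 4, 5, False)`; each row forms a single chain, so no diffusion break is needed.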
It is experimentally observed that this pairing heuristic works quite well in conjunction with the chaining algorithm to be described.

3.2.2 Extended Branch-and-Bound Algorithm

The chaining algorithm of [8] uses branch-and-bound search to find a transistor ordering with the minimum number of chains (and thus the minimum number of diffusion breaks, and hence minimum width). However, the algorithm as originally stated makes no consideration of routability. As previously mentioned, different orderings can lead to placements with vastly different routability. Therefore, it is important to consider routing metrics during transistor placement.

We have extended the branch-and-bound algorithm of [8] to minimize two routing metrics: channel density and estimated wirelength (see Footnote 1). The routing problem for bricks, given the single-height one-row layout style, somewhat resembles the classic channel routing problem [10]. The correspondence between brick layouts and channel routing instances is illustrated in Figure 6 (in the brick layout, each terminal is labeled with its net number). In fact, in the (rather unlikely) event that all p transistors have the same width and all n transistors have the same width, the single-height one-row brick routing problem is a channel routing problem. Given the similarity between brick routing and channel routing, one might expect that for any given brick, transistor placements corresponding to orderings of low channel density are easier to route than placements corresponding to orderings of higher channel density. This was in fact confirmed experimentally, as is discussed shortly.

Figure 6. A candidate transistor placement and its associated channel routing problem.

Our chaining algorithm essentially works as follows. First, the branch-and-bound algorithm of [8] is run to calculate the minimum possible number $N$ of chains. Then, another branch-and-bound search is performed, using the minimum number of chains $N$ as the bound; however, by contrast with the first branch-and-bound pass, all search paths which lead to solutions having $N$ chains are explored (rather than only the first such path). Each of these solutions corresponds to a minimum width placement; for each such solution, we calculate its channel density and its wirelength. Among all candidate solutions of minimum width, we consider only those of minimum channel density; of all such solutions, we choose the one of minimum wirelength. Thus, our placement algorithm exactly minimizes width, then channel density, then estimated wirelength, in that order.

Footnote 1: The estimated wirelength of each net is equal to half the perimeter of the bounding box of that net.

As mentioned above, the utility of channel density as a brick routing metric has been borne out by routing comparisons between placements generated by our extended branch-and-bound algorithm (which optimizes width, channel density, and estimated wirelength) and placements generated by a simulated annealing program (which optimized width and estimated wirelength alone). Placements produced by our algorithm had significantly lower wirelength than those produced by simulated annealing, and were easier to route: the maze router which was used for routing comparison always converged to a solution more quickly for our placements than for the simulated annealing-based placements. Moreover, for some circuits, our placement led to a routing solution whereas the maze router failed to find a solution for the simulated-annealing based placement.

Additionally, our transistor placement algorithm has been observed to be efficient. While the branch-and-bound search may theoretically take exponential time to terminate, it has never taken more than a few seconds to place the transistors of a brick.
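The three-level selection described in Section 3.2.2 (width first, then channel density, then estimated wirelength) amounts to a lexicographic minimum, which the following sketch illustrates. The data layout is an assumption made for the example: each candidate is a `(width, nets)` pair, and `nets` maps a net ID to a list of `(column, row)` terminal coordinates. The function names `hpwl`, `channel_density`, and `pick_best` are hypothetical. Channel density is modeled here in the usual channel-routing way, as the largest number of net spans crossing any one column, ignoring nets confined to a single column; the report may compute it differently.

```python
def hpwl(pins):
    """Half the perimeter of the bounding box of one net's terminals."""
    xs = [x for x, _ in pins]
    ys = [y for _, y in pins]
    return (max(xs) - min(xs)) + (max(ys) - min(ys))


def channel_density(nets):
    """Largest number of horizontal net spans covering any single column."""
    spans = []
    for pins in nets.values():
        lo = min(x for x, _ in pins)
        hi = max(x for x, _ in pins)
        if lo < hi:  # a net confined to one column needs no horizontal track
            spans.append((lo, hi))
    if not spans:
        return 0
    first = min(lo for lo, _ in spans)
    last = max(hi for _, hi in spans)
    return max(
        sum(1 for lo, hi in spans if lo <= col <= hi)
        for col in range(first, last + 1)
    )


def total_wirelength(nets):
    """Sum of the per-net half-perimeter estimates."""
    return sum(hpwl(pins) for pins in nets.values())


def pick_best(candidates):
    """Minimize width, then channel density, then estimated wirelength."""
    return min(
        candidates,
        key=lambda c: (c[0], channel_density(c[1]), total_wirelength(c[1])),
    )
```

Python compares tuples element by element, so the key `(width, density, wirelength)` reproduces the stated priority order without any explicit filtering passes.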
4 Brick Routing: An NP Formulation

Before we can describe our brick routing methods in detail, we first outline some elementary definitions and results from complexity theory [4]. A decision problem is a problem, generally stated as a question, whose answer is either YES or NO. An instance of a problem is a specific input for that problem. The class NP is the set of decision problems $\mathcal{P}$ with the following property: if $I$ is a YES instance of $\mathcal{P}$ (that is, $I$ is an instance of the problem $\mathcal{P}$ for which the solution is YES), then there is a proof of this fact whose length is polynomial in the size of $I$ (see Footnote 2), and which can be checked in polynomial time. There is a polynomial reduction from problem $\mathcal{P}_1$ to $\mathcal{P}_2$ if every instance $I_1$ of $\mathcal{P}_1$ can be mapped to an instance $I_2$ of $\mathcal{P}_2$ in polynomial time such that the size of $I_2$ is polynomial in the size of $I_1$, and $I_2$ is a YES instance of $\mathcal{P}_2$ if and only if $I_1$ is a YES instance of $\mathcal{P}_1$. An NP-complete problem is a problem $\mathcal{P}^*$ in NP such that every other problem $\mathcal{P}$ in NP has a polynomial reduction to $\mathcal{P}^*$.

Footnote 2: The size of an instance is the number of bits needed to encode $I$ in some "reasonable" encoding scheme. For example, if $I$ were an integer, then a reasonable encoding scheme would be a binary representation of $I$; an unreasonable encoding scheme would be a unary representation, because it would be exponentially longer than a binary encoding.

Given a transistor placement, there are several questions one may ask: Does this placement have a routing? Does it have a routing with wirelength at most $W$, using at most $V$ vias? If, for a given placement, the answer to either of these questions is "yes," then a proof may be given in the form of a routing which satisfies the desired properties, and this routing can be verified to have the required properties in polynomial time. Thus, many variants of the brick routing problem can be formulated as NP decision problems.

Another example of an NP decision problem is the boolean satisfiability problem, often referred to simply as "SAT." An instance of SAT is a boolean formula $F$ in product-of-sums form (see Footnote 3), together with the question, "is there an assignment to the variables in the support of $F$ that makes $F$ evaluate to 1?" Such an assignment is called a "satisfying assignment." An equivalent question is, "Does the truth table of $F$ contain any 1's?"; the complementary question is, "Is $F$ identically zero?" In 1971, Cook showed [12] that SAT is NP-complete. SAT was the first problem shown to have this property; thousands of problems since have been shown to be NP-complete. No worst-case polynomial-time algorithm is known for any NP-complete problem, but many NP-complete problems are solved efficiently in practice for large classes of inputs.

Footnote 3: The term "conjunctive normal form" (CNF) is used more frequently in the literature. The term derives from the fact that CNF formulae are conjunctions (ANDs) of sum terms (called 'clauses', or 'disjunctions').

In this work, we formulate the brick routing problem as an NP decision problem whose instances are transistor placements obtained using our extended branch-and-bound algorithm, and whose question is "For this placement, is there a routing on the grid shown in Figure 3?" We then (polynomially) reduce the brick routing problem to the satisfiability problem. Although it is possible to reduce the brick routing problem to any NP-complete problem, SAT is a natural choice for two reasons. First, boolean formulae are familiar, and easy to manipulate and reason about. Second, SAT solvers have become very sophisticated, and are frequently able to solve huge instances involving hundreds of thousands of variables and millions of clauses. The reduction from brick routing to boolean satisfiability is given in the next section.

5 SAT Reduction

5.1 Overview

Our reduction from brick routing to SAT can be summed up as follows. We define a set of boolean variables; each variable encodes a piece of information about the routing solution.
For example, each grid segment has an associated boolean variable which is 1 if and only if that segment is filled in. We also state a set of rules, which we call routing rules, that force a satisfying assignment to the boolean variables to correspond to a valid routing solution. Each rule is "enforced" by a set of clauses, which we refer to as the rule's enforcer clauses. An example rule might be "every terminal of every net must have some adjacent segment filled in." A subset of the variables and rules of our reduction are similar to (and inspired by) the routability checking formulation of [11].

In the remainder of this section, we describe in detail the variables and rules of our reduction. In Section 5.2, we list all boolean variables needed by our reduction, and we define the notation and terminology we use to describe the rules enforced on those variables. Then, in Section 5.3, we list all our rules, and describe the clause structures that enforce those rules.

5.2 Variable Definitions, Notation, and Terminology

Next, we describe each boolean variable used by our SAT formulation, and we give the notation used to refer to that variable. Then, we define the notation we use to refer to various quantities associated with the routing problem. Finally, we define the terminology we use to describe the routing rules.

5.2.1 Variable Definitions

A boolean variable $x_{ij}$ is associated with each grid segment; this variable indicates whether the segment is filled or empty. Clearly, a given routing can be specified as an assignment to these variables. Additionally, each segment has associated with it a bit vector $\vec{x}_{ij}$ corresponding to the ID of the net which passes through it.

As it turns out, the rules for routing a two-terminal net can be specified much more simply than those for routing a multi-terminal net. Thus, in this formulation, a $t$-terminal net is treated as $t - 1$ two-terminal subnets which are allowed to share grid segments (whereas subnets of different nets must of course be disjoint). If the terminals of net $n$ are $(t_1, t_2, \ldots, t_m)$, then the two-terminal subnets of $n$ are $(t_1, t_2), (t_1, t_3), \ldots, (t_1, t_m)$. If $T$ is the maximum number of subnets of any net, then every segment $x_{ij}$ has associated with it $T$ valid bits $\vec{v}_{ij} = (v_{ij(T-1)}, \ldots, v_{ij1}, v_{ij0})$. The $k$th valid bit $v_{ijk}$ indicates whether $x_{ij}$ carries the $k$th subnet of the net whose ID is given by $\vec{x}_{ij}$.

If a net $n$ is an I/O net, then a pin must be placed at some grid point $(i, j)$ in the M2 (metal 2) layer on the routing of $n$. Our routing formulation accommodates this by adding a floating "pin" terminal to each I/O net, and defining a set of variables which encode the location of the pin terminal of each I/O net. For each grid point $(i, j)$ in the M2 layer, we define a variable $P_{ij\ell}$, which is 1 if $(i, j)$ is the location of a pin terminal of the $\ell$th I/O net. Variable $P_{ij}$ is 1 if $(i, j)$ is the location of any pin terminal.

For convenience, all variables and their definitions are listed in Section 7, the Appendix.

5.2.2 Notation

In this section, we define the notation we use to refer to various parameters of the routing problem. Let $r$ and $c$ denote the number of rows and columns of the routing grid. Let $N$ be the total number of nets to be routed. Let $b$ be the number of bits needed to encode nonnegative integers up to $N - 1$; that is, $b = \lceil \lg N \rceil$. Here, $b$ is the number of bits needed to encode the ID of a given net. Let $n_i$ denote the net whose ID is $i$, for $i \in \{0, 1, \ldots, N - 1\}$. Let $T(n_i)$ denote the number of terminals of net $n_i$. Recall that if net $n$ has terminals $(t_1, t_2, \ldots, t_m)$, then the subnets of $n$ are $(t_1, t_2), \ldots, (t_1, t_m)$; let $s(n_i, j)$ denote the $j$th subnet of net $n_i$. Define $T$ to be one less than the maximum number of terminals on any net: $T = \max\{T(n_0), T(n_1), \ldots, T(n_{N-1})\} - 1$. Finally, define $\eta$ to be the total number of I/O nets.

For reference, this notation has been reproduced in the Appendix.

5.2.3 Terminology

A point where two or more grid segments meet is referred to as a grid point.
The two grid points at either end of a grid segment (usually referred to merely as a segment) are its endpoints. If two segments share an endpoint, they are adjacent; the relation "segment $s_1$ is adjacent to $s_2$" is written $s_1 \leftrightarrow s_2$. If a segment $x_{ij}$ has a grid point $g_{ij}$ as an endpoint, then $x_{ij}$ is incident to $g_{ij}$ (written $x_{ij} \leftrightarrow g_{ij}$). The set of segments adjacent to a given segment are its neighbors; those of its neighbors which are incident to the same endpoint $e$ are its neighbors at $e$. A segment which is electrically connected to a terminal of net $n$ is said to carry net $n$; the net carried by a given segment $x_{ij}$ is the net whose net ID is given by $\vec{x}_{ij}$. Similarly, a segment $x_{ij}$ is said to carry a subnet $s(n, k)$ if $\vec{x}_{ij} = n$ and $v_{ijk} = 1$.

There are two types of terminals: fixed terminals, which are pre-specified by the transistor placement, and pin terminals, which correspond to I/O pins, and whose location is determined by the router. We refer to a terminal at the $i$th row and $j$th column using the notation $g_{ij}$, or $g_{LAYER\,ij}$ when we wish to specify the layer in which the terminal lies. When we wish to refer to the net ID of a given terminal $g_{ij}$, we use the notation $\vec{g}_{ij}$.

5.3 Routing Rules and Enforcer Clauses

In this section, we list the routing rules and describe the enforcer clause structures. For clarity, we have broken the routing rules down into groups, which we describe here. The segment rules are the routing rules applied to each grid segment. The fixed terminal rules and pin terminal rules are the rules applied to every fixed terminal and pin terminal, respectively. The fabric rules are rules which arise from the parameters of the regular fabric (e.g., the vertical and horizontal quanta), and from DFM considerations. The rules which ensure that the routing will be DRC-correct are a subset of the fabric rules.
Finally, various other rules which ensure that a satisfying assignment corresponds to a valid routing are grouped together as general formulation rules.

5.3.1 General Formulation Rules

Rule 1. This rule ensures that the net ID given by $\vec{x}_{ij}$ of every filled-in segment $x_{ij}$ is valid. In other words, if the net ID $\vec{x}_{ij}$ of a segment $x_{ij}$ is out of range (i.e., $\vec{x}_{ij} \ge N$ when there are $N$ nets), then segment $x_{ij}$ is empty. The logic for this rule is $\vec{x}_{ij} \ge N \Rightarrow \overline{x_{ij}}$, or equivalently, $(\vec{x}_{ij} \le N - 1) \lor \overline{x_{ij}}$. The precise CNF logic to implement $(\vec{x}_{ij} \le N - 1)$ depends on the value of the constant $N$, but it is straightforward to generate such logic when the value of $N$ is known. For example, suppose $N = 107$ (and thus $b = 7$; see Footnote 4). The binary representation of $N - 1$ is then 1101010. By looking at the binary number 1101010, we determine the constraints on $\vec{x}_{ij}$ to satisfy $\vec{x}_{ij} \le N - 1$ in the following way. For every 0 in $N - 1$'s binary representation, if $\vec{x}_{ij}$ has a 1 in that position, then it cannot also have a 1 in every more significant position in which $N - 1$ also has a 1. For example, $N - 1$ has a 0 in its third most significant bit (bit 4), so if $\vec{x}_{ij}$ has a 1 in its third most significant bit (i.e., $x_{ij4} = 1$), then it cannot have bits 5 and 6 both 1, because that would imply $\vec{x}_{ij} > N - 1$. Completing this example, the constraints on the bits to enforce $\vec{x}_{ij} \le N - 1$ are (in CNF format):

$(\overline{x_{ij6}} + \overline{x_{ij5}} + \overline{x_{ij4}}) \land (\overline{x_{ij6}} + \overline{x_{ij5}} + \overline{x_{ij3}} + \overline{x_{ij2}}) \land (\overline{x_{ij6}} + \overline{x_{ij5}} + \overline{x_{ij3}} + \overline{x_{ij1}} + \overline{x_{ij0}})$

So, for this example, the enforcer clauses for $(\vec{x}_{ij} \le N - 1) \lor \overline{x_{ij}}$ are:

$(\overline{x_{ij}} + \overline{x_{ij6}} + \overline{x_{ij5}} + \overline{x_{ij4}})$
$(\overline{x_{ij}} + \overline{x_{ij6}} + \overline{x_{ij5}} + \overline{x_{ij3}} + \overline{x_{ij2}})$
$(\overline{x_{ij}} + \overline{x_{ij6}} + \overline{x_{ij5}} + \overline{x_{ij3}} + \overline{x_{ij1}} + \overline{x_{ij0}})$

Footnote 4: Hopefully, $N$ will never be anywhere near this big, but it makes the example more interesting.

Rule 2. This rule ensures that segment $x_{ij}$ is filled in if and only if it carries at least one subnet (i.e., at least one of its valid bits is 1). The logic to enforce this rule is:

$x_{ij} \Leftrightarrow (v_{ij0} \lor v_{ij1} \lor \cdots \lor v_{ij(T-1)})$

Rule 3. If segment $x_{ij}$ has a valid net ID (i.e., less than $N$), then all valid bits corresponding to subnet indices outside the subnet range for this segment are set to 0. For example, suppose segment $x_{ij}$ has a net ID of 1, $N = 10$, net 1 has 4 terminals (hence it has 3 two-terminal subnets), and $T = 7$. Since $x_{ij}$ carries net 1, we should zero out all its valid bits that correspond to subnet indices which are outside the subnet range for net 1 (namely, $v_{ij3}$, $v_{ij4}$, $v_{ij5}$, and $v_{ij6}$).

The logic to enforce this rule assumes that the nets are named in order of decreasing number of terminals; i.e., net 0 has at least as many terminals as net 1, which has at least as many as net 2, etc. This can be easily accomplished by simply renaming nets. The exact logic generated per segment depends on the number of terminals each net has, but an example will serve to illustrate the method. Suppose the number of terminals for each net in our example is given by a table of net ID against number of terminals. From the table we see that if the net ID $\vec{x}_{ij}$ of segment $x_{ij}$ is greater than 0, its fourth valid bit $v_{ij3}$ should be set to 0. Additionally, if $\vec{x}_{ij} > 1$, then its third valid bit $v_{ij2}$ should be set to 0 as well (its fourth valid bit $v_{ij3}$ should also be zeroed, but that was already taken care of by the logic that enforces $\vec{x}_{ij} > 0 \Rightarrow \overline{v_{ij3}}$). Finally, if $\vec{x}_{ij} > 2$, then its second valid bit $v_{ij1}$ should be zeroed. The logic to implement a rule such as $\vec{x}_{ij} > n \Rightarrow \overline{v_{ijt}}$ was described in detail in General Formulation Rule 1, and won't be repeated here.

5.3.2 Segment Rules

Rule 1. For each pair of adjacent segments $x_{i_1 j_1}$, $x_{i_2 j_2}$: if both $x_{i_1 j_1}$ and $x_{i_2 j_2}$ are filled in, they must carry the same net ID (otherwise the net carried by $x_{i_1 j_1}$ would be shorted to the net carried by $x_{i_2 j_2}$). The derivation of the CNF clauses to enforce this rule is now shown.

Rule 2. For each endpoint $e$ of each filled-in segment $x_{ij}$ and for each subnet $s$ carried by $x_{ij}$: if $e$ is not a terminal of $s$ (neither a fixed terminal nor a pin terminal), then exactly one of $x_{ij}$'s neighbors at $e$ carries $s$. Note that only segments in the VIA1 and M2 layers may be incident to pin terminals. The full CNF clause set corresponding to this rule is too complex and layer-dependent to give here, so here is a simplified description, which can easily be converted to CNF format for a given layer:

    foreach segment x_ij
        foreach endpoint e of x_ij which is not a terminal
            foreach subnet index t, t in [0, T - 1]
                v_ijt => exactly one of the v_i'j't is set to 1, over segments x_i'j' adjacent to x_ij at e

5.3.3 Fixed Terminal Rules

Rule 1. If grid point $g_{ij}$ is a fixed terminal of subnet $s(n, k)$, then exactly one segment incident to $g_{ij}$ carries $s(n, k)$. For fixed terminals, the clauses are generated as follows:

    foreach fixed terminal g_ij
        EXACTLYONE(v_ijk | x_ij <-> g_ij)

Rule 2. If grid point $g_{ij}$ is a fixed terminal of subnet $s(n, k)$, then any filled-in segment incident to $g_{ij}$ must carry net $n$. The following logic generates the necessary clauses:

    foreach fixed terminal g_ij
        foreach segment x_ij <-> g_ij
            x_ij => (net ID of x_ij = net ID of g_ij)

Note that the net ID $\vec{g}_{ij}$ of a fixed terminal $g_{ij}$ is actually a constant, rather than a $b$-bit vector of boolean variables; thus the logic required to enforce the above rules is simpler than is needed for pin terminals.

5.3.4 Pin Terminal Rules

Rule 1. For each possible pin location (that is, for each M2 grid point $(i, j)$), if that location implements the pin terminal of the $\ell$th I/O net (i.e., if $P_{ij\ell}$ is 1), then exactly one of M2 grid point $(i, j)$'s neighbors carries the subnet corresponding to that terminal. Note that the subnet index corresponding to a pin terminal is always the maximum subnet index of that net (if an I/O net $n$ has 4 terminals, thus 3 subnet indices 0, 1, and 2, the subnet index of the pin terminal will be 2).
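Two clause patterns recur throughout the rules of Section 5.3: the comparison of a bit vector against a constant (General Formulation Rule 1) and the EXACTLYONE / ATMOSTONE constraints (terminal rules). The sketch below shows one way to generate them. It is an illustration under stated assumptions, not the report's generator: clauses are lists of signed integers (positive for a variable, negative for its complement, as in the DIMACS convention), the function names are hypothetical, and the pairwise at-most-one encoding is simply the most direct choice, since the report does not say which encoding it uses.

```python
from itertools import combinations


def at_most_one(lits):
    """Pairwise encoding: no two of the given literals may both be true."""
    return [[-a, -b] for a, b in combinations(lits, 2)]


def exactly_one(lits):
    """At least one literal is true, and no two are."""
    return [list(lits)] + at_most_one(lits)


def leq_const(bits, bound):
    """Clauses forcing the unsigned value of `bits` to be <= bound.
    `bits` lists variable numbers, most significant bit first, and
    0 <= bound < 2**len(bits) is assumed.  For every 0 bit of `bound`,
    forbid a 1 there together with 1s in every more significant position
    where `bound` has a 1."""
    width = len(bits)
    clauses = []
    ones_above = []  # variables at more significant positions where bound is 1
    for pos, var in enumerate(bits):
        if (bound >> (width - 1 - pos)) & 1:
            ones_above.append(var)
        else:
            clauses.append([-v for v in ones_above] + [-var])
    return clauses


def satisfied(clauses, assignment):
    """Check a CNF formula against an assignment {variable: bool}."""
    return all(
        any(assignment[abs(lit)] == (lit > 0) for lit in clause)
        for clause in clauses
    )
```

With variables 7 down to 1 standing for bits 6 down to 0, `leq_const([7, 6, 5, 4, 3, 2, 1], 106)` yields three clauses of 3, 4, and 5 literals, matching the $N = 107$ example of General Formulation Rule 1. Appending the complemented segment variable to each clause gives the full rule $(\vec{x}_{ij} \le N - 1) \lor \overline{x_{ij}}$.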
    for $i \leftarrow 1$; $i \le r + 1$; $i \leftarrow i + 1$
      for $j \leftarrow 0$; $j \le c$; $j \leftarrow j + 1$
        for $\ell \leftarrow 0$; $\ell < \eta$; $\ell \leftarrow \ell + 1$
          { Let $k$ be the maximum subnet index of the $\ell$th I/O net }
          $P_{ij\ell} \Rightarrow$ EXACTLYONE($v_{M2\,i'j'k}$ | $M2_{i'j'}$ incident to $g_{M2\,ij}$)

Rule 2. For every possible pin location $(i, j)$, if that location implements the pin terminal of the $\ell$th I/O net ($P_{ij\ell} = 1$), then by Rule 1, one of grid point $g_{M2\,ij}$'s M2 neighbors carries the pin terminal subnet $s$ corresponding to the $\ell$th I/O net. Since we only want one segment adjacent to $g_{M2\,ij}$ to carry $s$ (as in Fixed Terminal Rule 1), we make sure that the VIA1 segment $VIA1_{ij}$ does not carry $s$.

    for $i \leftarrow 1$; $i \le r + 1$; $i \leftarrow i + 1$
      for $j \leftarrow 0$; $j \le c$; $j \leftarrow j + 1$
        for $\ell \leftarrow 0$; $\ell < \eta$; $\ell \leftarrow \ell + 1$
          { Let $k$ be the maximum subnet index of the $\ell$th I/O net }
          $P_{ij\ell} \Rightarrow \overline{v_{VIA1\,ijk}}$

Rule 3. If grid point $g_{M2\,ij}$ is the pin terminal of net $n$, the $\ell$th I/O net, then any segment incident to $g_{M2\,ij}$ must carry net $n$.

    for $i \leftarrow 1$; $i \le r + 1$; $i \leftarrow i + 1$
      for $j \leftarrow 0$; $j \le c$; $j \leftarrow j + 1$
        for $\ell \leftarrow 0$; $\ell < \eta$; $\ell \leftarrow \ell + 1$
          { Let $n$ be the $\ell$th I/O net }
          $P_{ij\ell} \wedge \widehat{M2}_{ij} \Rightarrow \overrightarrow{M2}_{ij} = n$
          $P_{ij\ell} \wedge \widehat{M2}_{i(j-1)} \Rightarrow \overrightarrow{M2}_{i(j-1)} = n$
          $P_{ij\ell} \wedge \widehat{VIA1}_{ij} \Rightarrow \overrightarrow{VIA1}_{ij} = n$

Rule 4. Every M2 grid point $g_{M2\,ij}$ can be the site of at most one pin terminal.

    for $i \leftarrow 1$; $i \le r + 1$; $i \leftarrow i + 1$
      for $j \leftarrow 0$; $j \le c$; $j \leftarrow j + 1$
        ATMOSTONE($P_{ij\ell}$, $\ell = 0, \ldots, \eta - 1$)

Rule 5. For every I/O net $n$, at least one possible pin terminal location actually implements the pin terminal of net $n$.

    for $\ell = 0, \ldots, \eta - 1$:
      $\bigvee_{i=1}^{r+1} \bigvee_{j=0}^{c} P_{ij\ell}$

Rule 6. For every I/O net $n$, at most one possible pin terminal location actually implements the pin terminal of net $n$.

    for $\ell \leftarrow 0$; $\ell < \eta$; $\ell \leftarrow \ell + 1$
      for $i \leftarrow 1$; $i \le r + 1$; $i \leftarrow i + 1$
        for $j \leftarrow 0$; $j \le c$; $j \leftarrow j + 1$
          for $m \leftarrow i$; $m \le r + 1$; $m \leftarrow m + 1$
            for $n \leftarrow j + 1$; $n \le c$; $n \leftarrow n + 1$
              ATMOSTONE($P_{ij\ell}$, $P_{mn\ell}$)

Rule 7. This rule enforces the definition of the pin variable class. Pin variable $\hat{P}_{ij}$ is 1 iff M2 grid point $(i, j)$ is the location of a pin terminal for some net.

    for $i \leftarrow 1$; $i \le r + 1$; $i \leftarrow i + 1$
      for $j \leftarrow 0$; $j \le c$; $j \leftarrow j + 1$
        $\hat{P}_{ij} = \bigvee_{\ell=0}^{\eta-1} P_{ij\ell}$

5.3.5 Fabric Rules

This section describes those rules which arise from the particular layout style (or "scheme") being used, from DRC correctness, or from DFM considerations. The two styles we currently support are a BEOL-limited design style and an FEOL-limited design style (for our purposes, the only difference between the two is in the choice of horizontal and vertical quanta). Some rules, which are based on DRC correctness, are shared between both styles; these are listed first. The logic to implement these rules is so simple that an English description of the rules suffices.

DRC Rule 1. Vertically adjacent contacts are not allowed.

DRC Rule 2. Horizontally adjacent vias are disallowed; vertically adjacent vias are allowed only if they share the same net.

DRC Rule 3. A filled-in via segment must be adjacent to at least one filled-in M1 segment (to satisfy minimum metal rules).

FEOL Rule 1. Horizontally adjacent M1 segments are not allowed. This rule ensures DRC-correctness in the FEOL style, but is not needed in the BEOL style.

FEOL Rule 2. If two filled-in M2 segments $s$ and $s'$ lie in the same row and are separated by exactly two empty segments, then the horizontal tracks above and below $s$ and $s'$ must be empty in the columns between and including $s$ and $s'$. This rule, as well as FEOL Rule 3 and BEOL Rule 1, exists to accommodate hammerhead corrections for OPC.

FEOL Rule 3. If a filled-in M2 segment $s$ lies on the left or right boundary of the brick (column 0 or $c - 1$), then the M2 segments directly above and below $s$ are empty.

BEOL Rule 1. If three filled-in M1 segments lie in row $r$ and occupy columns $j$, $j + 1$, and $j + 2$, then it is not allowed for both $M1_{(r+1)j}$ and $M1_{(r+1)(j+1)}$ to be empty; nor is it allowed for both $M1_{(r+1)(j+1)}$ and $M1_{(r+1)(j+2)}$ to be empty. If three filled-in M1 segments lie in row 1 and occupy columns $j$, $j + 1$, and $j + 2$, then it is not allowed for both $M1_{0j}$ and $M1_{0(j+1)}$ to be empty; nor is it allowed for both $M1_{0(j+1)}$ and $M1_{0(j+2)}$ to be empty.

6 Experimental Results and Conclusions

Although our SAT-based routing tool is still under development, the results are promising.
In the table below, runtimes are given for netlists whose complexity and composition are typical of the bricks we have seen so far. The runtime reported is the total time required to convert the logic-level netlist to a layout. All experiments were run on a 2.8 GHz Intel Pentium 4 with 1 GB of RAM running GNU/Linux. HaifaSat was the SAT solver used in all experiments.

    Name   Primitives in netlist   No. vars   No. clauses   Runtime (s)
    ex1    2 ND2                   3259       19613         0.644
    ex2    2 ND2, 1 INV            4791       26873         1.162
    ex3    3 ND2                   ~905       44903         2.593
    ex4    2 ND2, 2 NR2            ~809       57770         2.873
    ex5    3 ND2, 1 ND3            1667       83075         16.09

In Figure 7, the layouts of bricks ex1, ex3, and ex5 are shown. It is clear (particularly for layouts ex3 and ex5) that the routing found by the SAT-based router does not have the minimum possible wirelength. This is because our NP formulation asks, "is there a routing?" rather than "is there a routing that uses less than W segments?" We are promised only a routing solution, if one exists; we can make no guarantees about its quality. On the other hand, our routing method has the completeness property that it is guaranteed to find a routing, if one exists. This is a claim that, to the best of our knowledge, no other routing algorithm can make.

It is quite possible to extend our framework to find routing solutions of minimum wirelength. In fact, there are many objectives which one could exactly minimize using reductions to boolean satisfiability. For example, if one wishes to exactly minimize wirelength, one can add to the above routing rules a new rule that forces the routing solution to have wirelength at most W (that is, at most W segments are allowed to be filled in). This rule could be implemented via CNF logic which adds all the segment variables together and checks that their sum is at most W. Now, using multiple SAT solves (rather than just one), one could binary search on the value of W to find the minimum value such that the resulting formula is satisfiable; the corresponding routing would have the minimum possible wirelength! Similar techniques could be applied to exactly minimize other objectives; one could first exactly minimize wirelength, then via count, and so on. Clearly, the NP framework is a powerful one.

However, it is possible that such minimization strategies may take a very long time. For example, if wirelength is to be minimized, then $O(\lg(W_{max} - W_{min}))$ SAT solves are needed, where $W_{max}$ is any upper bound on total wirelength and $W_{min}$ is any lower bound. Furthermore, the logic that forces wirelength to be at most W will likely be complex, inflating each instance size and causing the runtime of each individual SAT solve to increase. On the other hand, the alternative to optimal brick generation is hand-layout, which for complex bricks can take a week or more. Thus, a program which can produce high-quality layouts in a matter of days may still be desirable. More research is needed to see which approaches are feasible, and what results can be achieved.

References

[1] Larry Pileggi, Herman Schmit, Andrzej J. Strojwas, Padmini Gopalakrishnan, Veerbhan Kheterpal, Aneesh Koorapaty, Chetan Patel, Vycheslav V. Rovner, and K. Yaw Tong. Exploring Regular Fabrics to Optimize the Performance-Cost Tradeoff. In Proceedings of the 40th Conference on Design Automation, pages 782-787, 2003.

[2] Veerbhan Kheterpal, Vycheslav V. Rovner, Thiago G. Hersan, Dipti Motiani, Yoichi Takegawa, Andrzej J. Strojwas, and Larry Pileggi. Design Methodology for IC Manufacturability Based on Regular Logic Bricks. In Proceedings of the 42nd Conference on Design Automation, pages 353-358, 2005.

[3] Gordon E. Moore. Cramming More Components Onto Integrated Circuits. Electronics, Vol. 38, No. 8, 1965.

[4] Michael R. Garey and David S. Johnson. Computers and Intractability: A Guide to the Theory of NP-Completeness. Freeman, 1979.

[5] T. Uehara and W. VanCleemput. Optimal Layout of CMOS Functional Arrays. IEEE Transactions on Computers, Vol. C30, pages 305-314, 1981.

[6] S. Wimer, R. Pinter, and J. Feldman. Optimal Chaining of CMOS Transistors in a Functional Cell. IEEE Transactions on CAD, Vol. 30(5), pages 795-801, 1987.

[7] Bradley S. Carson, C. Y. Roger Chen, and Dikran S. Meliksetian. Transistor Chaining in CMOS Leaf Cells of Planar Topology. In Proceedings of the Sixth Great Lakes Symposium on VLSI Design, pages 194-199, 1996.

[8] Chi-Yi Hwang, Yung-Ching Hsieh, Youn-Long Lin, and Yu-Chin Hsu. An Optimal Transistor Chaining Algorithm for CMOS Cell Layout. ICCAD-89, pages 344-347, November 1989.

[9] H. Zhang and K. Asada. An Improved Algorithm of Transistors Pairing for Compact Layout of non-Series-Parallel CMOS Networks. Proceedings of the IEEE 1993 Custom Integrated Circuits Conference, pages 17.2.1-17.2.4, May 1993.

[10] T. Yoshimura and E. S. Kuh. Efficient Algorithms for Channel Routing. IEEE Transactions on Computer Aided Design of Integrated Circuits and Systems, Vol. 1, Issue 1, pages 25-35, January 1982.

[11] William N. N. Hung, Xiayou Song, Timothy Kam, Lerong Cheng, and Guowu Yang. Routability Checking for Three-Dimensional Architectures. IEEE Transactions on Very Large Scale Integration (VLSI) Systems, Vol. 12, pages 1371-1374, December 2004.

[12] Stephen A. Cook. The Complexity of Theorem Proving Procedures. Annual ACM Symposium on Theory of Computing, pages 151-158, 1971.

[13] Larry Pileggi, Herman Schmit, Jigar Shah, K. Yaw Tong, Chetan Patel, and Vikas Chandra. A Via Patterned Gate Array. Technical Reports Series of the CMU Center for Silicon System Implementation, No. CSSI 02-15, March 2002.

[14] Jim Lipman. The "Missing Link" of SoC Design -- Platform and Structured ASICs. http://www.techonline.com/community/ed_resource/36652, June 2004.

[15] Tejas Jhaveri. Private communication. November 2005.

[16] International Technology Roadmap for Semiconductors. International Technology Roadmap for Semiconductors 2004 Update -- Lithography.
http://www.itrs.net/Common/2004Update/2004-07-Lithography.pdf, December 2004.

7 Appendix

All the variable definitions and notation associated with the reduction from brick routing to SAT (see Section 5) are listed here for reference.

7.1 Variable Definitions

$\hat{x}_{ij}$: Here $x$ is a layer, i.e., $x \in \{PO, CO, M1, VIA1, M2, M1\_RAIL\}$. This variable indicates whether the grid segment of layer $x$ is filled or empty (1 or 0, respectively). We also refer to the segment in layer $x$ at row $i$ and column $j$ as $x_{ij}$.

$\vec{x}_{ij}$: Each segment $x_{ij}$ has associated with it a bit vector $\vec{x}_{ij}$ which denotes the net ID of the net it is carrying. This bit vector is of length $b = \lceil \lg N \rceil$. When we need to refer to the individual bits of $\vec{x}_{ij}$, we write $x_{ij(b-1)}, \ldots, x_{ij1}, x_{ij0}$.

$v_{ijk}$: Each segment $x_{ij}$ has associated with it $T$ valid bits. The $k$th valid bit $v_{ijk}$ of segment $x_{ij}$ indicates whether $x_{ij}$ carries the $k$th subnet of the net whose ID is given by $\vec{x}_{ij}$. When referring to the $k$th valid bit of a segment in a given layer, row, and column, the notation $v_{LAYER\,ijk}$ is used (e.g., $v_{M1\,ijk}$).

$P_{ij\ell}$: This variable is 1 if M2 grid point $(i, j)$ corresponds to a pin terminal of the $\ell$th I/O net (whose net ID is not in general $\ell$).

$\hat{P}_{ij}$: This variable is 1 if M2 grid point $(i, j)$ corresponds to a pin terminal of some net. $\hat{P}_{ij} = \bigvee_{\ell=0}^{\eta-1} P_{ij\ell}$.

7.2 Notation

$r$, $c$: The number of grid rows and columns.

$N$: Total number of nets.

$b$: The number of bits needed to represent $N$ different things, $\lceil \lg N \rceil$.

$n_i$: The net whose net ID is $i$, $i \in \{0, 1, \ldots, N - 1\}$.

$T(n_i)$: The number of terminals of net $n_i$.

$s(n_i, 1), \ldots, s(n_i, T(n_i) - 1)$: The two-terminal subnets of net $n_i$.

$T$: One less than the maximum number of terminals on any net; $T = \max\{T(n_0), T(n_1), \ldots, T(n_{N-1})\} - 1$.

$\eta$: The total number of I/O nets.
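As a small illustration of the notation above, the following Python sketch computes $b$ and $T$ from a list of per-net terminal counts, together with the valid-bit indices that must be forced to 0 for a segment carrying a given net (the rule described in Section 5). The function names and the list-of-counts input are assumptions made for this example and are not part of the tool described in this report; subnet indices are taken to start at 0, as in the worked examples of Section 5, and the sample terminal counts are hypothetical values chosen to match the $N = 10$, $T = 7$ example.

```python
def net_id_bits(num_nets: int) -> int:
    """b = ceil(lg N): bits needed to encode N distinct net IDs (N >= 1)."""
    # (N - 1).bit_length() equals ceil(log2(N)) without floating-point error.
    return (num_nets - 1).bit_length()


def valid_bit_count(terminals: list[int]) -> int:
    """T = (maximum number of terminals on any net) - 1."""
    return max(terminals) - 1


def invalid_valid_bits(terminals: list[int], net_id: int) -> list[int]:
    """Valid-bit indices that must be 0 for a segment carrying net `net_id`.

    A net with k terminals has k - 1 two-terminal subnets, indexed
    0 .. k - 2 here (an assumption of this sketch); every valid bit
    from index k - 1 up to T - 1 lies outside the subnet range.
    """
    total_bits = valid_bit_count(terminals)
    subnets = terminals[net_id] - 1
    return list(range(subnets, total_bits))


# Hypothetical terminal counts, named in order of decreasing terminal count
# as the rule requires: N = 10 nets, net 1 has 4 terminals, T = 7.
EXAMPLE_TERMINALS = [8, 4, 3, 3, 2, 2, 2, 2, 2, 2]

if __name__ == "__main__":
    print("b =", net_id_bits(len(EXAMPLE_TERMINALS)))
    print("T =", valid_bit_count(EXAMPLE_TERMINALS))
    print("bits to zero for net 1:", invalid_valid_bits(EXAMPLE_TERMINALS, 1))
```

With these sample counts, the bits to zero for net 1 are indices 3 through 6, which matches the example given for the valid-bit rule in Section 5.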