Electronic Design - Lecture #1: introduction

Outline
EDC: the course
Electronic Design
Design of Integrated Circuits (IC)
Integrated Circuits (IC): types, economics
IC design flow
Conclusion
R. Pacalet
March 4, 2015
Design of integrated systems (SoC)
EDC: goals
Digital integrated systems overview
• What they can do
• Limitations
• Cost
• Future evolutions
Introduction to design methods
Discover CAD tools
Allow you to
• Join designers' team
• Communicate with "hardware guys"
• Join CAD tool vendor

EDC: resources
2 × 3 + 5 × 1.5 = 13.5 hours lectures
4 × 1.5 = 6 hours labs
1.5 hours teamwork
18 hours project
39 hours personal work
PCs (GNU/Linux) equipped with:
• Mentor Graphics CAD tools: simulation, FPGA synthesis
• Synopsys CAD tools: ASIC synthesis
• CU-Boulder VIS tools: formal verification (or Mentor Graphics 0-in)
• Altera CAD tools: system level design, synthesis (or Xilinx tools)
• Altera Nios prototyping boards (or Xilinx Zynq based ZedBoard)
A WEB site: http://soc.eurecom.fr/EDC/
EDC: organization
Some theory (very practical): 2 × 3 + 5 × 1.5 = 13.5 hours
• Introduction, ICs and SoC design flows (1 × 3 hours)
• Validation (2 × 1.5 hours)
• VHDL hardware modeling language (1 × 3 + 3 × 1.5 hours)
• Team work (1 × 1.5 hours)
A lot of practice: 4 × 1.5 + 6 × 3 = 24 hours
• Simulation-based and formal verification (2 × 1.5 hours)
• Design, simulation, synthesis: "simple operator" (2 × 1.5 hours)
• Hardware accelerator design, simulation, synthesis, optimization (2 × 3 hours)
• Design of CPU-based embedded system. Software implementation, performance evaluation, optimization (2 × 3 hours)
• Integration of hardware accelerator, rework of software implementation, performance evaluation (2 × 3 hours)

EDC: exam
A 2 hours exam
• 75% of the final grade
• Questions about lectures
• Exercises, problems
• All documents allowed
A written labs report
• 25% of the final grade
• Understanding (mainly) and results you got
• Modulated by "motivation" coefficient
Integrated Circuits
Packages usually made of plastic or ceramic
Silicon area: mm² to cm²; also measured in transistors or NAND2 gates equivalent
Manufacturing process characterized by transistor's minimum channel length (e.g. 22 nm) or "lambda" (half channel length, e.g. 11 nm)
1970: bipolar transistors, a few dozens of gates (1971: Intel's 4004, 2250 transistors, 10000 nm)
2015: CMOS, 14 nm, 3D, multi-billion transistors
Example: Nvidia's Kepler GK110 GPU, 7.1 × 10^9 transistors, 28 nm, 561 mm² (2012)
[Figure: Wafer. Source: Moe Adham (Intel 300mm wafer)]
[Figure: Dies in package. Source: Emulation Technology Inc. (multi-die BGA package)]
[Figure: Printed circuit board. Source: Evan-Amos (Sony PSone motherboard)]
Types of ICs
Full custom
Standard cells
Gate array
Programmable logic devices
• EPLD
• FPGA

The Std Cells IC design flow: global view
[Figure: design flow diagram. Steps: specification / modeling, refinement for synthesis, synthesis / optimization, test insertion, floor plan, place & route, extraction. Artifacts: English / VHDL, VHDL, netlist, blocks, circuit. The steps are accompanied by functional validation and physical validation and grouped into logic design and physical design.]
Economics of ICs

Type       NRE       Unit
FPGA       $0 M      $80
Std cell   $0.8 M    $12

Example of a 5 M-gates product, 3 Mbits of internal memory, one high-speed SERDES interface, 5 years life cycle, 8.5% discount rate. (source: Arun Kottolli, Open-Silicon - EDN, 3/16/2006)

Example of a 250 k-gates product, 200 MHz, 250 pins BGA package, 5 years life cycle, 8.5% discount rate. (source: Arun Kottolli, Open-Silicon - EDN, 3/16/2006)

Type       NRE       Unit
FPGA       $0 M      $240
Std cell   $1.0 M    $32

[Plots, one per example: Net Present Value ($M) versus annual volume (units × 100, from 0 to 65); curves: Std Cell, FPGA]
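The trade-off behind these curves can be sketched with a small cost model. This is only an illustration under assumptions that are not in the slides: NRE is paid up front, unit costs are paid at the end of each of the 5 years, and the annual volume is constant. It compares discounted costs only, whereas a full net present value also needs the revenue side. The figures are those of the table with NRE $0.8 M and unit costs of $80 and $12.

```python
def discounted_cost(nre, unit_cost, annual_volume, years=5, rate=0.085):
    """Present value of all costs: NRE now, unit costs at the end of each year."""
    cost = nre
    for year in range(1, years + 1):
        cost += unit_cost * annual_volume / (1 + rate) ** year
    return cost


def break_even_volume(nre_a, unit_a, nre_b, unit_b, years=5, rate=0.085):
    """Annual volume at which options A and B have the same discounted cost."""
    annuity = sum(1 / (1 + rate) ** y for y in range(1, years + 1))
    return (nre_b - nre_a) / ((unit_a - unit_b) * annuity)


# FPGA: no NRE, $80 per unit. Standard cell: $0.8 M NRE, $12 per unit.
volume = break_even_volume(0.0, 80.0, 0.8e6, 12.0)
print(f"break-even annual volume: {volume:.0f} units")
```

Below the break-even volume the FPGA is the cheaper option in this simplified model; above it the standard cell NRE is amortized.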
Economics of ICs

Type          NRE       Unit
FPGA          $0 M      $4000
Std cell #1   $2 M      $150
Std cell #2   $2.5 M    $320

Example of a 12 to 20 M-gates product, 10 to 20 Mbits of internal memory, multiple SERDES interfaces, a high-speed DDR interface, 5 years life cycle, 8.5% discount rate. Does not fit in a single FPGA. (source: Arun Kottolli, Open-Silicon - EDN, 3/16/2006)

[Plot: Net Present Value ($M) versus annual volume (units × 10, from 0 to 65); curves: Std Cell #2, Std Cell #1, FPGA]

Standard cell libraries
A key factor
Cell model = several views:
• Layout
• Schematic icon
• Functional model
• Propagation delay
• Simulation (VHDL/Verilog)
• Test information
• Transistor level netlist
Some views are private
Foundry choice (technology, standard cell libraries) essential

Inverter models
[Figure: inverter schematic icon, input I, output Z]
Functional model: Z = not I
Propagation delay: tplh = tp0lh + ∆tplh × Cl
Simulation (Verilog):

    module IN01D1(I, Z);
      input I; output Z;
      assign Z = !I;
    endmodule
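The propagation delay view of the inverter, tplh = tp0lh + ∆tplh × Cl, is a linear function of the load capacitance. A minimal sketch, in which the numeric characterization values are hypothetical and not taken from any real library:

```python
def t_plh(tp0_lh_ps, delta_tplh_ps_per_ff, load_ff):
    """Low-to-high propagation delay (ps) for a given load capacitance (fF)."""
    return tp0_lh_ps + delta_tplh_ps_per_ff * load_ff


# Hypothetical inverter characterization (illustration only).
TP0_LH_PS = 12.0             # intrinsic delay with no load
DELTA_TPLH_PS_PER_FF = 3.5   # extra delay per fF of load

for load_ff in (1.0, 2.0, 4.0):
    print(load_ff, t_plh(TP0_LH_PS, DELTA_TPLH_PS_PER_FF, load_ff))
```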
Specification, Architecture
Specification: function, interface, physical
• Functional: DO = F(DI)
• Interface: inputs CLK (1 bit), START (1 bit), DI[7:0] (8 bits); outputs DSO (1 bit), DO[10:0] (11 bits)
• Physical: F > 2 GHz, P < 500 µW, S < 150 µm²
Validated by cost study (clients, CAD, foundry, other providers)
[Figure: requirements, then negotiations and cost analyses, lead to the functional, interface and physical specifications and to the architecture; the example block takes DI0, DI1, DI2, ... and produces DO0]
Example of specification
Function
• Accumulator on series of 8 unsigned integers
• $F(DI_1, \ldots, DI_8) = \sum_{i=1}^{8} DI_i$
• Integers in the [0..255] range ⇒ result in [0..8 × 255]
Interface
• First DI of series valid on CLK rising edges for which START is set
• Next DI valid on 7 next CLK rising edges
• DO = F(DI_1, ..., DI_8) valid on CLK rising edges for which DSO is set
• Maximum latency between last DI and DO: 2 clock cycles
• Minimum latency between DO and next START: 1 clock cycle
Physical
• Manufacturing process: 28 nm
• Operating conditions
– Process: typical
– Power supply: 1 V
– Temperature: 25 °C
• Clock frequency > 2 GHz
• Silicon area < 150 µm²
• Power < 500 µW
[Figure: timing diagram of CLK, START, DI, DSO and DO, annotated with the Initialization Interval (II), the minimum II and the maximum latency. Sample values: the series 1, 2, ..., 8 gives DO = 36 and a series of eight 4 gives DO = 32.]
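The functional part of this specification is small enough to be written as an executable reference. The sketch below is an illustration in Python, not the course's reference model; it also checks that the largest result fits in the 11 bits of DO[10:0].

```python
def f(di):
    """Sum of a series of exactly 8 unsigned integers in [0..255]."""
    if len(di) != 8:
        raise ValueError("a series holds exactly 8 integers")
    if any(not 0 <= v <= 255 for v in di):
        raise ValueError("each DI must be in the [0..255] range")
    return sum(di)


MAX_DO = 8 * 255  # largest possible result
print(f(list(range(1, 9))), MAX_DO, MAX_DO.bit_length())
```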
High level modelling
Bus-cycle, bit accurate, reference model
Reference golden model
• (Timed) transaction level model
• Bus-cycle accurate, bit accurate
Golden reference all along the project
Updated and re-validated when needed
Maintained all along the design flow
Fast in simulation
Close to actual hardware:
Validated by simulations
• Hierarchy compatibility
• Data flow compatibility
High level
VHDL or
SystemC model
Far from actual hardware:
• Better coverage
Specifications,
architecture
Validated by functional simulations
• Against a reference algorithmic model (C)
• And performance / interface specifications
Functional
simulations
Example of high level model

    package acc_pkg is
      subtype ui8 is natural range 0 to 255;
      type ui8_t is array (1 to 8) of ui8;
      subtype ui11 is natural range 0 to 8 * 255;
      function f(t: ui8_t) return ui11;
    end package acc_pkg;

    package body acc_pkg is
      function f(t: ui8_t) return ui11 is
        variable r: ui11 := 0;
      begin
        for i in 1 to 8 loop
          r := r + t(i);
        end loop;
        return r;
      end function f;
    end package body acc_pkg;

    use work.acc_pkg.all;

    entity acc is
      port (clk, start: in  bit;
            di:         in  ui8;
            dso:        out bit;
            do:         out ui11);
    end entity acc;

    --pragma translate_off
    architecture hlm of acc is
    begin

      process
        variable t: ui8_t := (others => 0);
        variable n: ui8   := 0;
      begin
        -- First DI of the series: rising edge of clk with start set
        wait until clk = '1' and clk'event and start = '1';
        t(1) := di;
        -- Next 7 DI on the 7 next rising edges
        for i in 2 to 8 loop
          wait until clk = '1' and clk'event;
          t(i) := di;
        end loop;
        wait until clk = '1' and clk'event;
        dso <= '1'; do <= f(t);
        wait until clk = '1' and clk'event;
        dso <= '0';
      end process;

    end architecture hlm;

Refinement for synthesis
Synthesizable RTL (Register Transfer Level) model
Validated by simulations, formal verification and synthesis
[Figure: specifications and architecture; reference model; synthesizable RTL VHDL model; functional simulations; formal verification; synthesis, optimization]
The VHDL synthesizable model
Logic synthesis is 100% functional
The logic synthesizer supports only VHDL
The logic synthesizer does not support all VHDL
The coding style may impact the results
Synthesis options are sometimes VHDL (attributes) . . . and sometimes not:
• Pragma comments
• Command-line options
Packages for synthesis exist
Simulation speed is critical
Joint verification function / synthesis
Functional verification can be formal (equivalence or model checking)

Example of VHDL synthesizable model
[Figure: block diagram with inputs DI, START and CLK, outputs DO and DSO, two registers (D/Q, one with an enable EN) and a CTRL block]
Example of VHDL synthesizable model

    --pragma translate_on

    use work.acc_pkg.all;

    architecture rtl of acc is
      signal di_reg:   ui8;
      signal do_local: ui11;
      signal sum:      natural range 0 to 9 * 255;
      signal cnt:      natural range 0 to 8;
      signal en:       bit;
    begin

      -- Control: counts the 8 cycles of a series and raises dso at the end
      ctrl_p: process(clk)
      begin
        if clk = '1' and clk'event then
          dso <= '0';
          if cnt /= 0 then
            if cnt = 8 then
              dso <= '1';
              cnt <= 0;
            else
              cnt <= cnt + 1;
            end if;
          elsif start = '1' then
            cnt <= 1;
          end if;
        end if;
      end process ctrl_p;

      do  <= do_local;
      sum <= do_local + di_reg;
      en  <= '1' when cnt /= 0 else
             '0';

      -- Input register
      di_p: process(clk)
      begin
        if clk = '1' and clk'event then
          di_reg <= di;
        end if;
      end process di_p;

      -- Accumulator register
      do_p: process(clk)
      begin
        if clk = '1' and clk'event then
          if start = '1' then
            do_local <= 0;
          elsif en = '1' then
            do_local <= sum;
          end if;
        end if;
      end process do_p;

    end architecture rtl;

Logic synthesis
[Figure: HDL processes P1, P2 and P3 mapped to registers (D/Q) and combinatorial logic]
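To see how the three clocked processes of the RTL architecture cooperate, they can be mimicked cycle by cycle in software. The following Python sketch is an illustration only: the class name `AccRtl` and the helper `run` are invented here, and each call to `clock()` stands for one rising edge of clk, with all registers updated together as VHDL signal assignments are.

```python
class AccRtl:
    """Cycle-level mimic of the rtl architecture (illustrative sketch)."""

    def __init__(self):
        self.di_reg = 0
        self.do_local = 0
        self.cnt = 0
        self.dso = 0

    def clock(self, start, di):
        # Combinatorial signals, computed from the current register values.
        total = self.do_local + self.di_reg   # 'sum' in the VHDL
        en = 1 if self.cnt != 0 else 0
        # ctrl_p: next values of dso and cnt.
        nxt_dso, nxt_cnt = 0, self.cnt
        if self.cnt != 0:
            if self.cnt == 8:
                nxt_dso, nxt_cnt = 1, 0
            else:
                nxt_cnt = self.cnt + 1
        elif start:
            nxt_cnt = 1
        # do_p: next value of the accumulator register.
        if start:
            nxt_do = 0
        elif en:
            nxt_do = total
        else:
            nxt_do = self.do_local
        # All registers (including di_p's di_reg) change at the same edge.
        self.di_reg, self.do_local = di, nxt_do
        self.cnt, self.dso = nxt_cnt, nxt_dso
        return self.dso, self.do_local


def run(series):
    """Feed one series of 8 values and return DO when DSO is raised."""
    dut = AccRtl()
    stimuli = [(1, series[0])] + [(0, v) for v in series[1:]] + [(0, 0)] * 3
    for start, di in stimuli:
        dso, do = dut.clock(start, di)
        if dso:
            return do
    return None


print(run(list(range(1, 9))))
```

Comparing `run` against a plain sum over many series is the software analogue of validating the RTL model against the reference model.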
Synthesizable VHDL
100% hardware
• No pointers
• No files
• Realistic types only (no reals)
• No physical time (after clause)
Resources identified, allocated, scheduled
Reasonable complexity (granularity)
[Figure: processes P1 and P2 and registers]

Logic synthesis: options
They may be dedicated comments, interpreted by the synthesizer
VHDL attributes
Commands of the synthesizer
Various types:
• synthesis on/off
• translate on/off
• set and reset management
• Architecture of arithmetic operators
• Enumerated types encoding
• Multiplexed or hard wired logic (case)
• Coding and optimization of state machines
• Semantics of resolution functions, . . .
Logic synthesis: validation
Carefully read the messages
• Syntax errors
• Warnings (latches, sensitivity lists, . . . )
Count the registers
Estimate and check complexity (adders)
Beware the CPU run time

Netlist optimization
[Figure: synthesizable RTL VHDL and the interface, functional and physical specifications feed synthesis, optimization; associated steps: parameter extraction, static timing analysis, equivalence checking, simulation]
Synthesis, optimization
[Figure: HDL enters the synthesizer (synthesis parameters), whose output enters the optimizer (optimization parameters)]

Synthesis, optimization, step by step
Source code analysis
• Memory elements inference
• Combinatorial parts extraction
Intermediate representation
Logic minimization (Karnaugh, Quine-McCluskey, Espresso, . . . )
Logic optimization
• Factorizations
• Substitutions
• Simplifications, . . .
Technology-independent gate mapping
Technology mapping
Synthesis, optimization

    S  <= (A and B and (not C)) or (F1 and F2);
    F1 <= A or (not C);
    F2 <= A and C;

Flattening

    S <= (A and B and (not C)) or ((A or (not C)) and (A and C));

Simplification

    S <= (A and B) or (A and C);

Factorization

    S  <= A and F1;
    F1 <= B or C;

Optimization
Physical specifications taken into account
• Silicon area
• Speed
• Environment
Can be slow and CPU intensive
• No useless optimizations
• Fine grain designs
Reproducibility
Impact on and from other steps of the design flow
• Testability
• Floor plan
• Place and route
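The flattening, simplification and factorization steps must all preserve the function of S. With three inputs this can be checked exhaustively; the sketch below is an illustration in Python (the function names are chosen here, they are not part of any tool) that compares the truth tables of the successive forms.

```python
from itertools import product


def original(a, b, c):
    f1 = a or (not c)
    f2 = a and c
    return (a and b and (not c)) or (f1 and f2)


def simplified(a, b, c):
    return (a and b) or (a and c)


def factorized(a, b, c):
    f1 = b or c
    return a and f1


def equivalent(g, h, n_inputs=3):
    """True if g and h agree on every combination of boolean inputs."""
    return all(bool(g(*v)) == bool(h(*v))
               for v in product((False, True), repeat=n_inputs))


print(equivalent(original, simplified), equivalent(simplified, factorized))
```

Exhaustive comparison only scales to a handful of inputs, which is why real equivalence checkers rely on other techniques.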
Propagation delay model
Used by optimizer to predict delays
Depends on operating conditions
• Temperature
• Voltage
• Manufacturing process quality
Depends on parameters
• Wire-load model (area)
• Wire-load mode
• RC tree structure
[Figure: RC tree structures between a driver and pins C1 to Cn: worst case, balanced (R/n and C/n per branch) and best case. R: wire-load model resistance; C: wire-load model capacitance; C1 to Cn: pin capacitance.]
[Figure: wire-load modes on a hierarchical design with 100k, 80k and 40k wire-load models: mode = top, mode = enclosed, mode = segmented]

Timing constraints
The designer specifies:
• The clocks and their characteristics
• The multi-cycle paths
• The input delays
• The output delays
[Figure: chain of registers (D/Q) numbered 1 to 4, with the input delay before the first register and the output delay after the last one]
Other parameters
Driving gates of inputs
Maximum allowed ramp delay
Maximum allowed load capacitance
Maximum number of gates driven by a signal (fanout)
Hierarchy modifications (uniqueness, merge, flatten, etc.)
Tuning of CPU run time and memory usage for each phase

Optimization: validation
Carefully read the messages
• Syntax errors in input netlist
• Timing constraints
Number of registers
Estimate and check complexity (adders)
Beware the CPU run time
Syntax checking
Static timing analysis
Equivalence checking
Timed simulation
• Initialization defects
• Setup and hold violations
• Hardware emulators can be useful
Static timing analysis
The delay calculator is a major component of the optimizer
It uses different models of the propagation delays
• Prop ramp delays
• Input slope
• Wave tabular
Its performance (CPU run time and accuracy) is critical
It can be used in stand alone mode:
• To evaluate an unoptimized design (hierarchical assembly)
• To evaluate the impact of some modifications:
– Operating conditions
– Test insertion
– Place and route
STA and optimization parameters are almost the same

Test
[Figure: the optimized netlist goes through scan test insertion and fault coverage; associated steps: metrics extraction, static timing analysis, equivalence checking, simulation]
Test
It's a manufacturing test, not a functional verification
The fault model is usually simplified:
• Stuck at 0 or 1 of a net
• One single fault per chip
Fault simulation:
• The chip is simulated with the assumption of a fault
• The fault is "covered" iff the outputs differ from the non-faulty chip
100% coverage means that the test vectors detect every possible fault
The coverage depends on the controllability and observability of the chip

Test
The designer provides the test vectors
The manufacturer usually tests
• On the wafer
• After packaging
PCBs (boards) are also tested
The manufacturing test has a great economic impact
? Exercise #1: what is the defect probability of a PCB:
• 10 ICs, 1.5 cm² each
• Manufactured with a defect probability of 10% per cm²
• All tested with a 97% coverage
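One possible way to reason about Exercise #1 is sketched below. It is not presented as the official solution: it assumes that the defect probability of an IC scales linearly with its area, that a 97% coverage means a defective IC is detected with probability 0.97, that only ICs passing the test are mounted, and that the PCB itself adds no defect.

```python
def pcb_defect_probability(n_ics=10, area_cm2=1.5,
                           defect_per_cm2=0.10, coverage=0.97):
    # Assumption: defect probability scales linearly with silicon area.
    p_bad = area_cm2 * defect_per_cm2
    # A defective IC escapes the test with probability (1 - coverage).
    p_escape = p_bad * (1 - coverage)
    # An IC passes the test if it is good or if its defect is missed.
    p_pass = (1 - p_bad) + p_escape
    # Probability that an IC that passed the test is nevertheless defective.
    p_bad_given_pass = p_escape / p_pass
    # The PCB is defective as soon as one of its ICs is.
    return 1 - (1 - p_bad_given_pass) ** n_ics


print(f"{pcb_defect_probability():.4f}")
```

Changing `coverage` shows how strongly the board-level defect probability depends on the quality of the chip-level test.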
Test
Different strategies
• "Scan path"
• Built-in self test ("BIST")
• Autotest, . . .
The test usually impacts the timings
The test usually increases the silicon area
There are important dependencies between test and place and route
• Connection of scan chains after placement
PCB testing uses different approaches (boundary scan, JTAG standard) that must be taken into account at chip level

Physical design
[Figure: the optimized netlist with inserted test goes through clock tree synthesis, floor plan, place and route; associated steps: metrics extraction, parasitics extraction, static timing analysis, equivalence checking, simulation]

Floor plan
Block-based organization of the chip
Position of input / output pads
Position and number of power pads
Aspect ratio of standard cell areas
Power supply routing strategy
Clocks routing strategy

Floor planning tools
Detect congestions
Identify the critical interconnect delays . . . and solve them
• In deep submicron designs (since 500 nm) the propagation delays are wire-dominated
Power routing strategy
Clocks routing strategy
[Figure: floor planning tool view. Source: Magma Design Automation]
Place & route
Slow, CPU intensive
Timing characteristics changed
Power supply dimensioning
Impacts on the architecture
• Partitioning
• Bus segmentation
Interaction with test (scan chains)
Impacts on the synthesis / optimization
• Timings
• Silicon area
Very complex tools
Frequently done by specialized companies

Place & route
Placement
• Standard cells are placed in rows
• Attempts to minimize the congestion
• Uses a lot of classical optimization algorithms
Routing
• Multi-level (cells, blocks)
• Block routing is a bit more difficult (irregular block sizes and routing channels)
• Critical nets can be manually routed or constrained
• Provides a good estimation of parasitics
Last steps
Parasitics extraction
Back-annotated static timing analysis
Back-annotated timed simulation
• Power consumption estimation
• Setup and hold violations
Design Rules Checking (DRC)
Electrical Rules Checking (ERC)
Design for manufacturing
Bonding diagram
Tape-out
Champagne

Conclusion
Logic design is critical
• High added value
• Huge impact on product quality
Verification is very expensive
Hardware choices are frequently driven by non-hardware constraints:
• Partitioning
• Verification techniques
• Performances of the tools
• Manageable complexity
• Reusability, . . .
What is a SoC?
"System" ⇒ a collection of interacting elements
"On Chip" ⇒ on a single silicon die
But not every integrated circuit is a SoC:
• A SoC implements a complete functionality:
– MPEG2 video decoder + audio + system layers + transport + graphics + user interface
– GSM handset: everything but RF
• The elements of a SoC are "complex", reusable and heterogeneous:
– Analog functions (AD/DA converters, filters, etc.)
– Hardware digital functions (Viterbi decoder, FFT)
– Software functions (on a microcontroller, a DSP, a RISC processor)
– Others (memories, busses, etc.)

SoC: design space exploration
[Figure: use cases specification; functional modeling of applications; architecture modeling; application mapping on architecture; performance evaluation for each use case / architecture]
Example of application
3G mobile communications
Baseband and application processor
Telecommunications (multi-standards)
Applications
• Voice
• Video
• Graphics 2D/3D (clone animation, gaming)
• Data (WEB, etc.)
• Agenda, address book, PDA-like functions, . . .

Example of architecture
[Figure: CPU, DSP, DMA, ROM, RAM and CTRL blocks on a local high speed bus; a BRIDGE to a peripheral bus with SIM, USB, IRDA and UART; an ASIC with its RAM]
Example of a use case
Situation #1
• LTE
• MPEG4 video streaming
Situation #2
• LTE
• Video telephony
Situation #3
• LTE not available
• Switch back to GSM
• Clone animation

Design space exploration
Hardware / software partitioning
Selection of CPUs, processing power, level of parallelism
Specification of software and hardware modules
Selection of interconnect architectures (buses, crossbars, NoCs)
Dimensioning of shared resources (memories, buses), caches, ...
Validation
• Deadlock - livelock detection
• Real time constraints, . . .
Design space exploration
Validation by simulation
On cycle accurate models
• 100% functional
• Huge simulation time, reduced coverage
• Long and expensive design tasks before first simulation
On approximate models
• Non functional
• Reduced simulation time, increased coverage
• Reduced design effort

SoC: refinement
[Figure: architecture / partitioning leads to the software specification, the hardware specification and the physical specification; the following steps are shown as question marks]
SoC: design reuse
Use of virtual components (IP blocks)
Highly generic
Various providers
Impact on verification strategy
IP design
IP maintenance

SoC: constraints
Complexity increases
Non recurring engineering costs increase
Changing environments
Design time decreases
Performances of CAD tools
How to design as fast as possible very complex systems, without errors and with inadequate tools
Example: the PXA320 (Smartphone Application Processor) by Marvell
[Figure: PXA320 Processor Block Diagram]
The PXA320 processor integrates technologies into compact, powerful devices for network operators, enterprises, and consumers. A new low-latency, fully connected internal memory switch eliminates memory bottlenecks, while separate interfaces for 32-bit DDR memory and Flash memory (NANDX8, NANDX16, and NOR) provide performance for premium phone features and platform flexibility. More and faster peripheral interfaces, including high-speed serial interfaces, enable Wi-Fi, WiBro, and WiMAX connectivity. A built-in touch-screen and LCD controller supports high-end PDA phones and other touch-screen driven devices. Plus support for removable memory cards makes the PXA320 processor ideal for a range of designs needing performance and flexibility to address the needs of multiple market segments.

Example: the Zynq family by Xilinx
[Figure: Zynq-7000 Block Diagram: a processing system (two Cortex-A9 MPCore cores with NEON/FPU engines and 32/32 KB I/D caches, 512 KB L2 cache, 256 KB on-chip memory, DMA, peripheral interfaces, memory controllers) connected through AMBA switches and AXI ports to the programmable logic (system gates, DSP, RAM). Source: Xilinx]
Application-Specific ICs (ASICs) to SoC: what changes?
System level skills mandatory
Embedded CPUs (hardware / software co-design)
Embedded large memories (increased bandwidth)
Design reuse, virtual components (IPs)
Interconnect standards (VCI, AMBA, . . . )
Mixed signal
Low power
Testability
Complexity

Summary
EDC: the course. Questions?
IC design flow. Questions?
SoC. Questions?
Further Reading I
M. Balch
Complete digital design: a comprehensive guide to digital
electronics and computer system architecture
McGraw-Hill - 2003 - 460 p.
F. Nekoogar
From ASICS to SoCs: a practical approach
Prentice-Hall - 2003 - 188 p.
M. J. S. Smith
ASICs... the book
Addison-Wesley - 06/1997 - 1040 p.
D. Chinnery, K. W. Keutzer
Closing the gap between ASIC & custom: tools & techniques for
high-performance ASIC design
Kluwer academic - 2002 - 407 p.
Ashenden, Peter J
Designer’s guide to VHDL
Morgan Kaufmann - 06/2008 - 936 p.
Zwolinski, Mark
Digital system design with VHDL
Prentice-Hall - 2004 - 384 p.
D. Jansen
Electronic design automation handbook
Kluwer academic - 2003 - 675 p.
Further Reading II
R. Zurawski
Embedded systems handbook
CRC Press - 16/08/2005 - 1160 p.
Further Reading III
Wolf, Wayne Hendrix
Modern VLSI design: systems on chip design
Prentice-Hall - 2002 - 618 p.
And a lot more in the library. . .