the ECE 449 Computer Design Lab

ECE 448
Lecture 21
FPGA Platforms
High Level Language (HLL)
Design Flows
ECE 448 – FPGA and ASIC Design with VHDL
George Mason University
Resources
USB
http://en.wikipedia.org/wiki/USB
PCI
http://en.wikipedia.org/wiki/PCI_Local_Bus
PCI-X
http://en.wikipedia.org/wiki/PCI-X
PCIe
http://en.wikipedia.org/wiki/PCI_Express
Resources
• Clive "Max" Maxfield, The Design Warrior’s Guide to FPGAs,
Chapter 11: C/C++ etc.-Based Design Flows
• T. El-Ghazawi, K. Gaj, D. Buell, D. Pointer, Reconfigurable Supercomputing,
tutorial at the Supercomputing 2005 conference
http://hpcl.seas.gwu.edu/openfpga/tutorial_html/index.html
FPGA Device Capacity Trends
Xilinx Device Complexity
[Figure: Xilinx device families plotted against year of introduction. Year axis ticks: 1985, 1987, 1991, 1995, 1998, 1999, 2000, 2002, 2003, 2004, 2006.]
Device          Clock      Gates
XC2000          50 MHz     1K
XC3000          85 MHz     7.5K
XC4000          100 MHz    250K
XC5200          50 MHz     23K
Spartan         80 MHz     40K
Spartan-II      200 MHz    200K
Spartan-3       326 MHz    5M
Virtex          200 MHz    1M
Virtex-E        240 MHz    4M
Virtex-II       450 MHz    8M
Virtex-II Pro   450 MHz    8M*
Virtex-4        500 MHz    16M*
Virtex-5        550 MHz    24M*
Source: http://class.ece.iastate.edu/cpre583/lectures/Lect-01.ppt
Prices of the most recent families of Xilinx FPGAs
Low-cost
  Spartan 3                  < $130*
  Spartan 3E                 < $35*
High-performance
  Virtex II, Virtex II-Pro   < $3,000*
  Virtex 4, Virtex 5         < $3,000*
* approximate cost per unit of the largest device, for a batch of 10,000 units
FPGA families
Xilinx
  Low-cost: Spartan 3, Spartan 3E, Spartan 3A, Spartan 3AN, Spartan 3A DSP, Spartan 6
  High-performance: Virtex 4 LX/SX/FX, Virtex 5 LX/LXT/SXT/FXT, Virtex 6
Altera
  Low-cost: Cyclone II, Cyclone III
  Mid-range: Arria, Arria II
  High-performance: Stratix II, Stratix II GX, Stratix III L/E, Stratix IV E/GX/GT
Virtex 4
Source: [Xilinx, Inc.]
Virtex-5 Family Platforms
FPGA Boards
General Architecture of an FPGA-Based Board
[Figure: block diagram with the following elements: CLK; I/O CARD; processing elements PE#0, PE#1, ..., PE#N-1, each paired with a LOCAL MEMORY; COMMON MEMORY / INTERCONNECT NETWORK; BUS INTERFACE CONTROLLER; BUS]
Reconfigurable Computing Boards
• Boards may have one or several
interconnected FPGA chips
• Support different bus standards, e.g. PCI,
PCI-X, PCIe, USB, etc.
• May have direct real-time data I/O through a
daughter board
• Boards may have local onboard memory
(OBM) to handle large data while avoiding the
system bus (e.g. PCI) bottleneck
Reconfigurable Computing Boards
• A node can support many boards
• A host program (e.g. in C) interfaces the user (and the
μP) with a board via the board’s API
• Driver API functions may include Reset, Open, Close,
Set Clocks, DMA, Read, Write, Download
Configuration, Interrupt, and Readback
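The host-side calling sequence can be sketched as follows. This is only an illustration: `MockBoard`, its method names, and the file name `design.bit` are hypothetical placeholders modeled on the function list above, not any vendor's actual API, and the board is simulated in memory so the sketch runs without hardware.

```python
class MockBoard:
    """Hypothetical stand-in for a vendor board API; not a real driver."""

    def __init__(self, memory_words=16):
        self.is_open = False
        self.configured = None          # name of the loaded configuration
        self.memory = [0] * memory_words

    def _require_open(self):
        if not self.is_open:
            raise RuntimeError("board is not open")

    def open(self):
        self.is_open = True

    def reset(self):
        self._require_open()
        self.memory = [0] * len(self.memory)

    def download_configuration(self, bitstream_name):
        self._require_open()
        self.configured = bitstream_name

    def write(self, address, words):
        self._require_open()
        self.memory[address:address + len(words)] = words

    def read(self, address, count):
        self._require_open()
        return self.memory[address:address + count]

    def close(self):
        self.is_open = False


def run_host_program(board, data):
    """Open, reset, configure, write data, read it back, and always close."""
    board.open()
    try:
        board.reset()
        board.download_configuration("design.bit")  # hypothetical file name
        board.write(0, data)
        return board.read(0, len(data))
    finally:
        board.close()
```

Wrapping the sequence in `try`/`finally` is a design choice of this sketch: the board is released even if configuration or a transfer fails.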
Universal Serial Bus (USB)
USB supports three data rates:
• Low speed: 1.5 Mb/s, defined by USB 1.0. Operation is very similar
to full speed, except that each bit takes 8 times as long to
transmit. Typical low-speed devices are keyboards, mice, and
joysticks.
• Full speed: 12 Mb/s (1.5 MB/s), also defined by USB 1.0.
• High speed: 480 Mb/s (60 MB/s), defined by USB 2.0.
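The relation between these rates is simple arithmetic, sketched below. The signaling rates in megabits per second (1.5, 12, and 480) are the standard USB figures; the function and constant names are chosen here for illustration only.

```python
BITS_PER_BYTE = 8

# Signaling rates in megabits per second.
USB_RATES_MBIT = {"low": 1.5, "full": 12.0, "high": 480.0}


def megabytes_per_second(rate_mbit):
    """Convert a rate in Mb/s to MB/s (raw signaling rate, no protocol overhead)."""
    return rate_mbit / BITS_PER_BYTE


def bit_time_ratio(slow, fast):
    """How many times longer one bit lasts at the slower rate."""
    return USB_RATES_MBIT[fast] / USB_RATES_MBIT[slow]
```

For example, `megabytes_per_second(USB_RATES_MBIT["full"])` gives the 1.5 MB/s quoted for full speed, and `bit_time_ratio("low", "full")` gives the factor of 8 between low-speed and full-speed bit times.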
Digilent: BASYS
• FPGA: Spartan-3E (XC3S100E/XC3S250E) in TQ144
• Price: $59 - $69
• Interfaces: USB port
• Memory: XCF02 Platform Flash ROM
• Ethernet: None
• Configuration: Device configuration through JTAG via JTAG3 parallel
cable or through USB using Digilent Adept Suite software.
• Applications: Academic purposes as a teaching aid in digital logic
design courses.
• URL:
http://www.digilentinc.com/Products/Detail.cfm?Prod=BASYS&Nav1=Products&Nav2=Programmable
Digilent: Spartan3E starter board
• FPGA: Spartan-3E (XC3S500E)
• Price: $149
• Interfaces: USB2 port
• Memory: XCF04 Platform Flash for storing FPGA configurations,
16 Mb Serial Flash, 128 Mb Strata Flash, 256 Mb DDR SDRAM
• Ethernet: 10/100 Ethernet PHY
• Configuration: JTAG programming via on-board USB2 port; JTAG and
SPI Flash programming with parallel or JTAG USB cable
• Applications: General prototyping.
• URL:
http://www.digilentinc.com/Products/Detail.cfm?Prod=S3EBOARD&Nav1=Products&Nav2=Programmable
Xilinx: Spartan3A starter kit
• FPGA: Spartan-3A (XC3S700A-FG484)
• Price: $189
• Interfaces: JTAG USB download board
• Memory: 256MB DDR2 SDRAM, 32 Mb parallel Flash, 4 Mb
Platform Flash PROM, 2-16 Mb SPI Flash devices
• Ethernet: 10/100 Ethernet PHY
• Configuration: Configuration via JTAG using USB port, Platform Flash
PROM or SPI Flash Memory
• Applications: General prototyping.
• URL: http://www.xilinx.com/products/devkits/HW-SPAR3A-SK-UNI-G.htm
Common Interface - PCI
PCI = Peripheral Component Interconnect
32-bit bus
64-bit bus
Evolution of the PCI Interface
Disadvantages of PCI & PCI-X:
• Fixed bus width, shared by all the PCI devices in the system.
• No data prioritization. Important data can get caught in
the bottleneck.
• Interference and signal degradation are common in parallel
connections.
• Poor materials and crosstalk from nearby wires
translate into noise, which slows the connection down.
PCI Express (PCIe):
• Not a bus like PCI or PCI-X. Communication is based on
the concept of lanes.
• A lane is a serial, full-duplex, point-to-point connection.
• The transfer rate of a single lane is one bit per cycle in each
direction.
• PCIe lane configurations: x1, x2, x4, x8, x16, x32.
• Data prioritization allows the system to move the most
important data first and helps prevent bottlenecks.
• Improvements in the physical materials used to make the
connections.
• Better handshaking and error detection.
• Better methods for breaking data into packets and putting the
packets together again.
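Because each lane carries one bit per cycle in each direction, link bandwidth scales linearly with the lane count. The sketch below works this out under assumptions that are not stated on the slides: first-generation PCIe signaling at 2.5 gigatransfers per second per lane with 8b/10b encoding (8 payload bits per 10 transmitted bits). The names are illustrative.

```python
# Assumed first-generation PCIe figures (not taken from the slides).
GEN1_TRANSFERS_PER_SECOND = 2_500_000_000  # bits on the wire per lane, per direction
PAYLOAD_BITS, LINE_BITS = 8, 10            # 8b/10b encoding
LANE_WIDTHS = (1, 2, 4, 8, 16, 32)


def lane_bandwidth_mbytes():
    """Payload bandwidth of one lane in one direction, in MB/s."""
    payload_bits = GEN1_TRANSFERS_PER_SECOND * PAYLOAD_BITS // LINE_BITS
    return payload_bits // 8 // 1_000_000


def link_bandwidth_mbytes(lanes):
    """Payload bandwidth of an xN link in one direction, in MB/s."""
    if lanes not in LANE_WIDTHS:
        raise ValueError(f"unsupported lane count: {lanes}")
    return lanes * lane_bandwidth_mbytes()
```

Under these assumptions an x1 link moves 250 MB/s in each direction and an x16 link moves 4000 MB/s; since the lanes are full duplex, the same amount can flow in the opposite direction at the same time.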
Xilinx: Virtex-5 LXT/SXT/FXT ML50x Evaluation Platform
• FPGA: Virtex-5 LXT/SXT/FXT (LX50T/SX50T/FX70T-1FFG1136)
• Price: $1,195
• Interfaces: x1 PCI Express; SFP, SMA, SATA connectors
• Memory: DDR2 SODIMM (256 MB), 1 MB SRAM, 32 MB Linear Flash
• Ethernet: x1 Tri-mode Ethernet port
• Configuration: Through on-board System ACE controller or PROM or
Linear Flash or SPI Flash Memory. Can also be downloaded via JTAG
through Xilinx download cable.
• Applications: High speed design, DSP, embedded design, image
processing, etc.
• URL: http://www.xilinx.com/products/devkits/HW-V5-ML505-UNI-G.htm
Xilinx: Virtex-5 FXT ML510 Embedded Development Platform
• FPGA: Virtex-5 FXT (XC5VFX130T-2FFG1738)
• Price: $3,100
• Interfaces: x2 PCIe downstream connectors, x4 32-bit @ 33 MHz PCI
connectors; x2 SATA connectors
• Memory: 512 MB Compact Flash card, x2 72-bit DDR2 DIMMs (512 MB)
• Ethernet: x2 Tri-mode Ethernet ports
• Configuration: Through on-board System ACE controller with the
configuration files stored in the CF card.
• Applications: Embedded design, high speed design, digital video,
telecom/datacom, etc.
• URL: http://www.xilinx.com/products/devkits/HW-V5-ML510-G.htm
DINI Group: DN9000K10 'Bride of Monster'
• FPGA: Virtex-5 LX330 (2 to 16 FPGAs per board)
• Price: $125,000 (for 16 LX330s)
• Interface: available MEG cards provide a PCI Express interface
• Memory: 6 DDR2 SODIMM sockets (up to 4 GB in each)
• Ethernet: None
• Configuration: Configured via Compact Flash controlled by an on-board
Cypress microprocessor or via USB.
• Applications: ASIC prototyping of logic and memory designs for a fraction of
the cost of existing solutions.
• URL: http://www.dinigroup.com/DN9000k10.php
FPGA Boards Conclusions
• Boards with PCI Express are of great interest to the
design community because the high speeds they offer
make it possible to prototype high-speed serial systems.
• PCI as a communication interface will become
outdated within a few years as the demand for communication
speed and bandwidth keeps increasing.
• Boards with the PCI Express interface are relatively
costly compared to those without it.
• Prices of high-performance Virtex family FPGA
boards range from $799 to $125,000; boards with
PCI, PCI-X, or PCI Express interfaces start at $1,195.
• Prices of low-cost Spartan-3 family FPGA boards
range from $59 to $2,100.
Behavioral Synthesis
Behavioral Synthesis
[Figure: design flow. Inputs Algorithm, I/O Behavior, and Target Library feed Behavioral Synthesis, which produces the RTL Design; Logic Synthesis then produces the Gate-level Netlist. The part of the flow starting at the RTL Design is labeled Classic RTL Design Flow.]
Need for High-Level Design
• Higher level of abstraction
• Modeling complex designs
• Reduce design efforts
• Fast turnaround time
• Technology independence
• Ease of HW/SW partitioning
Advantages of Behavioral Synthesis
• Easy to model higher levels of complexity
• Source code is smaller than the equivalent RTL code
• Generates RTL much faster than the manual method
• Multi-cycle functionality
• Loops
• Memory access
Different Levels of C/C++ Synthesis Abstraction
[Figure: abstraction levels ordered from more abstract, less implementation-specific to less abstract, more implementation-specific:
 Untimed C domain: pure C/C++ (non-implementation-specific)
 Timed C domain: SystemC, augmented C/C++ (implementation-specific)
 RTL domain: Verilog and VHDL (implementation-specific)]
The Design Warrior’s Guide to FPGAs
Devices, Tools, and Flows. ISBN 0750676043
Copyright © 2004 Mentor Graphics Corp. (www.mentor.com)
Pure Untimed C/C++ Design Flow
[Figure: design flow. Pure C/C++ (non-implementation-specific, easy to create, fast to simulate, easy to modify) goes through pure C/C++ synthesis, with user interaction and guidance, to auto-generated, implementation-specific Verilog / VHDL RTL. RTL synthesis then produces a gate-level netlist for an ASIC target or a LUT/CLB-level netlist for an FPGA target.]
Source: The Design Warrior’s Guide to FPGAs, Devices, Tools, and Flows.
Mentor Graphics – Catapult C
• Catapult C automatically converts un-timed
C/C++ descriptions into synthesizable RTL.
Hardware-Oriented High-Level Languages
• C-based system-level languages
  • Commercial
    • SystemC -- The Open SystemC Initiative
    • Handel-C -- Celoxica Ltd.
    • Impulse C -- Impulse Accelerated Technologies
    • Carte C -- SRC Computers
  • Research
    • Streams-C -- Los Alamos National Laboratory
    • SA-C -- Colorado State University, University of
      California, Riverside, Khoral Research, Inc.
    • SpecC -- University of California, Irvine and
      SpecC Technology Open Consortium
Other High-Level Design Flows
• Matlab-based
• AccelChip DSP Synthesis -- AccelChip
• System Generator for DSP -- Xilinx
• GUI Data-Flow based
• Corefire -- Annapolis Microsystems
• Java-based
• Commercial
• Forge -- Xilinx
• Research
• JHDL – Brigham Young University
SystemC -based design-flow alternatives
[Figure: alternative SystemC flows. In one flow, SystemC goes through auto-RTL translation to Verilog / VHDL RTL (implementation-specific, relatively slow to simulate, relatively difficult to modify), and RTL synthesis produces the gate-level netlist. In the other, SystemC synthesis produces the gate-level netlist directly.]
SystemC Evolution
[Figure: abstraction levels System, Algorithmic, Behavioral/Transaction-level, and RTL, arranged from untimed to timed, with the ranges covered by SystemC 1.0 and SystemC 2.0 marked]
Source: The Design Warrior’s Guide to FPGAs, Devices, Tools, and Flows.
Handel-C Overview
• High-level language based on ISO/ANSI-C for the
implementation of algorithms in hardware
• Allows software engineers to design hardware without
retraining
• Clean extensions for hardware design including flexible
data widths, parallelism and communications
• Well defined timing model
• Each statement takes a single clock cycle
• Includes extended operators for bit manipulation, and
high-level mathematical macros (including floating point)
Handel-C/ANSI-C Comparisons
[Figure: two overlapping sets, ANSI-C and Handel-C]
ANSI-C only:
• ANSI-C Standard Library
• Recursion
• Floating point
Common to both (ANSI-C constructs):
• Preprocessor directives, e.g. #define
• Pointers
• Structures
• Arrays
• for, while, if, switch
• Bitwise logical operators
• Logical operators
• Arithmetic operators
• Functions
Handel-C only:
• Handel-C Standard Library
• Parallelism
• Arbitrary width variables
• Enhanced bit manipulation
• Signals
• RAM, ROM
• Interfaces
Handel-C Design Flow
[Figure: design flow from Executable Specification to Handel-C, then either to VHDL followed by Synthesis to EDIF, or directly to EDIF, and finally to Place & Route]
Type Summary
Type               Width
char               8 bits
unsigned char      8 bits
short              16 bits
unsigned short     16 bits
long               32 bits
unsigned long      32 bits
int                Compiler
unsigned int       Compiler
int n              n bits
unsigned int n     n bits
unsigned n         n bits
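The width of a type fixes the range of values it can hold. The sketch below computes that range for an n-bit type, assuming (this is not stated on the slide) that signed types use two's complement representation; the function name is illustrative.

```python
def value_range(width, signed=True):
    """Return (min, max) for an integer of the given bit width.

    Assumes two's complement for signed types.
    """
    if width < 1:
        raise ValueError("width must be at least 1 bit")
    if signed:
        return -(1 << (width - 1)), (1 << (width - 1)) - 1
    return 0, (1 << width) - 1
```

For instance, a 6-bit signed variable (`int 6`) would cover -32 to 31 under this assumption, while `unsigned 6` covers 0 to 63.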
Arrays
• Declared the same way as in ANSI-C
    int 6 x[7];                    7 registers, each 6 bits wide
    unsigned int 6 x[4][5][6];     120 registers, each 6 bits wide
• Index must be a compile-time constant. If
random access is required, consider using
RAM or ROM
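The register counts above follow from multiplying the array dimensions, since each element becomes its own register. A short sketch of that arithmetic (the function names are illustrative):

```python
from math import prod


def register_count(dimensions):
    """Number of registers created for an array with the given dimensions."""
    return prod(dimensions)


def total_bits(width, dimensions):
    """Total storage in bits: one register of `width` bits per element."""
    return width * register_count(dimensions)
```

`register_count([4, 5, 6])` gives the 120 registers of the second declaration, and `total_bits(6, [4, 5, 6])` gives 720 flip-flops' worth of storage, which is why large arrays are better placed in RAM or ROM.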
Internal RAMs and ROMs
• Declared using the ram and rom keywords
    ram int 6 a[43];     a RAM of 43 entries, each 6 bits wide
    rom int 16 b[4];     a ROM of 4 entries, each 16 bits wide
• RAMs and ROMs are accessed the same
way that arrays are accessed in ANSI-C
• Index need not be a compile-time constant
Restrictions on RAMs and ROMs
• RAMs and ROMs are restricted to
performing operations sequentially. Only
one element may be addressed in any
given clock cycle
    ram unsigned int 8 x[4];
    x[1] = x[3] + 1;     // illegal: reads x[3] and writes x[1] in one cycle
    if (x[0] == 0)
        x[1] = 1;        // illegal: tests x[0] and writes x[1] in one cycle
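The one-access-per-cycle rule can be modeled in a few lines. The sketch below is a behavioral model written for illustration, not Handel-C and not output of any tool: `SinglePortRam`, `tick`, and `legal_increment` are hypothetical names. It shows why `x[1] = x[3] + 1` is rejected and how splitting it over two cycles through a temporary register makes it legal.

```python
class SinglePortRam:
    """Behavioral model of a RAM that allows one access per clock cycle."""

    def __init__(self, entries, width):
        self.mask = (1 << width) - 1
        self.cells = [0] * entries
        self.accessed_this_cycle = False

    def tick(self):
        """Advance to the next clock cycle, freeing the port."""
        self.accessed_this_cycle = False

    def _claim_port(self):
        if self.accessed_this_cycle:
            raise RuntimeError("RAM addressed twice in one clock cycle")
        self.accessed_this_cycle = True

    def read(self, index):
        self._claim_port()
        return self.cells[index]

    def write(self, index, value):
        self._claim_port()
        self.cells[index] = value & self.mask


def legal_increment(ram):
    """Compute x[1] = x[3] + 1 in two cycles via a temporary register."""
    temp = ram.read(3)      # cycle 1: the only RAM access is the read
    ram.tick()
    ram.write(1, temp + 1)  # cycle 2: the only RAM access is the write
    ram.tick()
```

Calling `ram.write(1, ram.read(3) + 1)` without a `tick()` in between raises `RuntimeError` in this model, mirroring the illegal statement above.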
Multi-port RAMs
    static mpram Fred
    {
        ram <unsigned 8> ReadWrite[256];   // read/write port
        rom <unsigned 8> Read[256];        // read-only port
    };
Now we can read and write in a given
clock cycle
Handel-C Language
• Each assignment and delay statement takes one
clock cycle
• The state machine is generated automatically from
an algorithmic description of the circuit in terms of
parallel and sequential blocks
• Parallel and sequential blocks are scheduled
automatically: the code following a block is
scheduled only after that whole block has
completed
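The timing rules above lead to a simple cycle count: a sequential block costs the sum of its parts, and a parallel block costs as much as its slowest branch, because the following code waits for the whole block. The sketch below is a small model of that rule; the nested-tuple representation of a program is invented for this example and is not Handel-C syntax.

```python
def cycles(block):
    """Clock cycles taken by a block.

    A block is either the string "assign" or "delay" (one cycle each),
    or a tuple ("seq", [blocks]) or ("par", [blocks]).
    """
    if block in ("assign", "delay"):
        return 1
    kind, children = block
    child_cycles = [cycles(child) for child in children]
    if kind == "seq":
        return sum(child_cycles)             # statements run one after another
    if kind == "par":
        return max(child_cycles, default=0)  # block ends when the slowest branch ends
    raise ValueError(f"unknown block kind: {kind}")
```

For example, a `par` block holding a single assignment alongside a three-statement sequence takes 3 cycles, and an assignment placed after that block starts in cycle 4.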
Handel C vs. C - functions
• Functions may not be called recursively, since all logic must be
expanded at compile time to generate hardware.
• Functions can only be called in expression statements.
These statements must not contain any other calls or assignments.
• Variable-length parameter lists are not supported.
• Old-style ANSI-C function declarations
(where the type of the parameters is not specified) are not supported.
• main() functions take no arguments and return no values.
• Each main() function is associated with a clock.
If there is more than one main() function in the same source file,
they must all use the same clock.
Celoxica Handel-C
+ very easy to learn and use
+ superset of ANSI C
+ hides implementation details
+ very flexible: no limitations on parallelism or data
types, extended operators for bit manipulation
+ well-defined timing model
+ portable to a wide range of FPGA devices
- legacy C code requires rewriting
- each statement takes 1 clock cycle to execute
Handel-C Example
[Figure: polyphase filter structure with input x[n], z-1 delay elements, sub-filters G0(z), G1(z), ..., G31(z), and elements labeled 32]
Code excerpt (abridged on the slide):
void polyphase() {
    ram int IN_WIDTH pin0_0[2], pin0_1[2], pin0_2[2], pin0_3[2];
    ram int IN_WIDTH pin1_0[2], pin1_1[2], pin1_2[2], pin1_3[2];
    ram int IN_WIDTH pin2_0[2], pin2_1[2], pin2_2[2], pin2_3[2];
    …..
    while (1) {
        par {
            padd0_0[half] = (pmult0_0[half][15] @ (pmult0_0[half] \\ 7)) + (pmult0_1[half][15] @ (pmult0_1[half] \\ 7));
            padd0_1[half] = (pmult0_2[half][15] @ (pmult0_2[half] \\ 7)) + (pmult0_3[half][15] @ (pmult0_3[half] \\ 7));
            pmult0_0[half] = 0;
            pmult0_1[half] = -7 * (pin0_1[half][7] @ pin0_1[half][7] @ pin0_1[half][7] @ pin0_1[half][7] @
                pin0_1[half][7] @ pin0_1[half][7] @ pin0_1[half][7] @ pin0_1[half][7] @ pin0_1[half]);
            pmult0_2[half] = 109 * (pin0_2[half][7] @ pin0_2[half][7] @ pin0_2[half][7] @ pin0_2[half][7] @
            …..
            if (half) {
                par {
                    output[0] ! (((padd0_0[1][9] @ padd0_0[1]) + (padd0_1[1][9] @ padd0_1[1])) \\ 3);
                    …..
Reconfigurable Supercomputers
What is a Reconfigurable Computer?
[Figure: two systems joined by an interface. The microprocessor system consists of several μPs with memory and I/O. The reconfigurable system consists of several FPGAs with memory and I/O.]
Most advanced reconfigurable
computing machines currently on the market
Machine                           Released
SRC 6 from SRC Computers          2002
Cray XD1 from Cray                2005
SGI Altix from SGI                2005
SRC 7 from SRC Computers, Inc.    2006
Pros and cons
of reconfigurable computers
+ can be programmed in high-level programming
languages, such as C, by mathematicians
and scientists themselves
+ facilitates hardware/software co-design
+ shortens development time, encourages experimentation
and complex optimizations
+ allows sharing costs among users of various
applications
- high entry cost (~$100,000)
- hardware-aware programming
- limited portability
- limited availability of libraries
- limited maturity of tools
Two major high-level language (HLL)
programming models
Machine                                     Programming model
SRC 6 & SRC 7 from SRC Computers            SRC MAP C
Cray XD1 from Cray, SGI Altix from SGI      Mitrion-C
SRC Programming Model
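[Figure: main.c, written in ANSI C, runs on the microprocessor and calls function_1() and function_2(). Each function is written in MAP C (a subset of ANSI C) and is mapped to an FPGA, where it is composed of macro calls such as macro_1(a, b, c), macro_2(b, d), macro_2(c, e), macro_3(s, t), macro_1(n, b), and macro_4(t, k). The macros (macro_1 through macro_4, ...) come from libraries of macros written in VHDL. A data-flow view shows Macro_1 and two Macro_2 instances connected through signals a, b, c, d, and e, with I/O at both ends.]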
SRC Compilation Process
[Figure: compilation flow. Stages and file types shown: application sources (.c or .f files, .mc or .mf files); macro sources (HDL sources in .vhd or .v files); μP compiler; MAP compiler; .v files; logic synthesis; netlists (.ngo files); place & route; configuration bitstreams (.bin files); object files (.o files); linker; application executable.]
Library Development - SRC
[Figure: languages used by the library developer and by the application programmer on the μP system and on the FPGA system: LLL (ASM), HLL (C, Fortran), and HDL (VHDL, Verilog)]
SRC Programming Environment
+ very easy to learn and use
+ standard ANSI C
+ hides implementation details
+ very well integrated environment
+ mature: in production use for over 4 years with constant
improvements
- subset of C
- legacy C code requires rewriting
- C limitations in describing HW (parallelism, data types)
- closed environment, limited portability of code to
HW platforms other than SRC
Application Development for Reconfigurable Computers
Application Development
for Reconfigurable Computers
[Figure: development flow with the stages Program Entry, Platform Mapping, Compilation, and Execution, accompanied by Debugging & Verification]
Program Entry
Program
Platform Mapping
SW/HW Partitioning
[Figure: the program is split into software (executed in the microprocessor system) and hardware (executed in the reconfigurable processor system)]
SW/HW Partitioning & Coding
Traditional Approach
1. Specification
2. SW/HW Partitioning
3. SW Coding and HW Coding
4. SW Compilation and HW Compilation
5. SW Profiling and HW Profiling
SW/HW Partitioning & Coding
New Approach
1. Specification
2. SW/HW Coding
3. SW/HW Partitioning
4. SW Compilation and HW Compilation
5. SW Profiling and HW Profiling
Platform Mapping
FPGA mapping
[Figure: the program is divided into software and hardware; the hardware part is mapped onto FPGA 1, FPGA 2, FPGA 3, and FPGA 4]
Platform Mapping
FPGA-FPGA data transfer & synchronization
[Figure: the program is divided into software and hardware; the hardware part is spread over FPGA 1 to FPGA 4, which exchange data and synchronize with each other]
Platform Mapping
Use of Internal and External Memories
[Figure: the program is divided into software and hardware; the hardware part on FPGA 1 to FPGA 4 uses OCM, LM, and SM]
OCM – On-Chip Memory
LM – Local Memory
SM – Shared Memory
Platform Mapping
I/O
[Figure: the program is divided into software and hardware; the hardware part on FPGA 1 to FPGA 4 uses OCM, LM, and SM. The labels SRC and StarBridge appear in the figure.]
Ideal Program Entry
Function
Program
Entry
Actual Program Entry
[Figure: in practice, Program Entry has to capture all of the following:]
• Function
• Preferred architectures
• Use of FPGA resources (multipliers, μP cores)
• SW/HW partitioning
• SW/HW interface
• FPGA mapping
• Data transfers & synchronization
• Use of internal and external memories
• Sequence of run-time reconfigurations
Evolution and the current status of tools
[Figure: chart placing each task below on a scale from Not Supported, through Manual Entry, to Compiler Automated; the individual placements are not preserved]
• μP-FPGA partitioning
• FPGA-FPGA partitioning
• μP-FPGA data transfer
• FPGA-FPGA data transfer
• Computation-data transfer overlapping
• Choosing component version
• .........
Summary
• Mapping algorithms onto reconfigurable
computing systems is a parallel processing
problem
• Languages for reconfigurable computers range
from high level C/Java to schematic to hardware
description languages
• Compilers face a daunting task - extract ILP,
pipeline loops, unroll, trade-off area/speed
• Current tool chains have many components
unfamiliar to software developers