Enshu 3 (演習3)
(Practice for 3rd-Year Students)
How to Use Xeon Phi on knf3-5
Background: Architecture
• Current configuration: multi-core host with a many-core co-processor
  - Connected through PCI Express
  - Co-processor (e.g., Xeon Phi):
    - Large number of power-efficient but lower-performance CPU cores
    - Limited on-board memory and smaller caches
• Future architecture direction
  - Standalone many-core unit, possibly with heterogeneous CPU cores
Interface for Heterogeneous Kernels (IHK)
• A low-level software layer for rapid OS prototyping
• Main components:
  - IHK-Host
  - IHK-Many-core
  - IHK-IKC (Inter-Kernel Communication)
  - mcKernel (Lightweight Kernel for Many-Cores)
IHK-Host
• Implemented as a Linux kernel module
  - Through ioctl() from userspace
• Provides an interface to
  - Initialize/manage co-processor(s)
  - Create/destroy OS instances
  - Bind resources to OS instances
    - CPU cores
    - Physical memory
  - Bootstrap/shut down OS instances
  - Map device memory to host / access device memory
  - Interrupt device CPUs
  - Drive DMA engines
IHK-Manycore
• Acts as a hardware abstraction layer
• Provides a standard interface to
  - Map host memory to device / access host memory
  - Interrupt host CPUs
  - Drive DMA engines
• Implemented as a library that is linked to the manycore kernel
Inter-Kernel Communication (IKC)
• Provides an asynchronous messaging facility for kernels running over IHK
  - Listen/accept/connect semantics
  - Callback notification or poll-based message reception
• Implemented as a pair of send/receive queues
• IHK provides a master IKC channel that is used to
  - Establish and manage other IKC channels
• IKC is used for syscall offloading in mcKernel
mcKernel: Lightweight Kernel for Many Cores
• A lightweight kernel developed from scratch over IHK
• Goals:
  - Small memory and cache footprint
  - Scalable kernel data structures (e.g., partially separated page tables)
• Maintains Linux ABI
• Only necessary services:
  - Memory management
  - Processes / threading
  - Simple system calls executed on the co-processor
    - clone(), mmap(), etc.
  - Complex system calls are offloaded to the host kernel
    - Such as file I/O
• Support for futex/pthreads/OpenMP
• Hierarchical memory management
• InfiniBand MPI integration in progress
Set up ssh_config for gower.il.is.s.u-tokyo.ac.jp
• Add this to your /etc/ssh_config
  (http://www.il.is.s.u-tokyo.ac.jp/~bgerofi/ssh_config):

    Host gower
        HostName %h.il.is.s.u-tokyo.ac.jp
        Port 10298
    Host kncclogin
        ProxyCommand ssh 133.11.233.10 -p 10298 -W 133.11.249.234:%p
    # Add any host you'd like to use to the Host line below
    Host knf3 knf4 knf5
        # Use gower as a proxy (OpenSSH 5.4 and later)
        ProxyCommand ssh 133.11.233.10 -p 10298 -W %h:%p
        # Use the line below instead for OpenSSH clients older than 5.4
        # ('ssh -V' to check)
        # ProxyCommand ssh gower nc %h %p

• We have 3 Xeon Phi machines for Enshu3:
  - knf3, knf4, knf5
• Now you can log in just by typing:
  - ssh knf3 (or knf4, knf5)
McKernel Git Setup
• Create a directory for your sources
• Clone the repositories from my (bgerofi) directory:
  - git clone /home/bgerofi/Code/enshu3/ihk/
  - git clone /home/bgerofi/Code/enshu3/mckernel/
  - git clone /home/bgerofi/Code/enshu3/glibc/
• Copy the rebuild and reload scripts:
  - cp /home/bgerofi/Code/enshu3/rebuild.sh ./
  - cp /home/bgerofi/Code/enshu3/reload_ihk_modules.sh ./
Compile IHK and McKernel
• Set up the Intel ICC compiler environment:
  - . /opt/intel/bin/compilervars.sh intel64
  - There is a space after the dot!!
• Compile kernel sources:
  - ./rebuild.sh
• Compile userland programs:
  - icc -Wall -L/home/bgerofi/Code/libs-stat/ -mmic -static -pthread source.c -o objname
Reload and boot kernel
• Reload the IHK and boot the kernel
  - ./reload_ihk_modules.sh -w (press ENTER at the end)
• Display kernel log on KNC (McKernel's log)
  - (This is currently the best way to debug)
  - ./ihk/linux/user/ihkostest 0 kmsg
• Execute user app:
  - ./mckernel/executer/user/mcexec /path-to-app/app
• Hello world example:
  - ./mckernel/executer/user/mcexec /home/bgerofi/Code/hello/hellointel
Source tree (shown in editor)