GenoToul Bioinfo

Training Day :
High Performance Computing Cluster

Pre-requisite : Linux
● Connect to the « genotoul » server
● Basic command line usage
● Filesystem Hierarchy Standard
● Useful tools (find, sort, cut, grep...)
● Transferring & compressing files

Today
● How to use the High Performance Computing Cluster (compute nodes)
Objectives
➔ To optimise computational power
➔ How to submit jobs on compute nodes
➔ How to manage your jobs (stat, kill...)
➔ Autonomy, self-mastery
Planning of the day
Part I : 09h00 - 12h00
Compute nodes environment
Open Grid Engine
Practical 1
Part II : 14h00 – 17h00
Submit array of jobs
Practical 2
Parallel environments
Practical 3
Connection to « genotoul » cluster
[Diagram: SSH from the Internet to the « genotoul » login nodes, which give access to the compute nodes and the storage facilities]
Compute nodes :
● node001 to node068 : 2720 INTEL cores, 17 TB of memory
● ceri001 to ceri034 : 1632 AMD cores, 12 TB of memory
● bigmem01 to bigmem02 : 128 INTEL cores, 2 TB of memory
● smp : 240 INTEL cores, 3 TB of memory
Connection to genotoul
● Pre-requisite : ask for a Linux account
  http://bioinfo.genotoul.fr/index.php?id=81
● SSH connection to the login nodes (use PuTTY from a Windows desktop) : genotoul.toulouse.inra.fr
● Linux command line (terminal session)
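A minimal connection sketch (replace <login> with your own account name):
$ ssh <login>@genotoul.toulouse.inra.fr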
Vocabulary : Cluster / Node
● Cluster : a set of nodes
● Node : a large computer (with several CPUs)
[Diagram: one node containing several CPUs]
Vocabulary : CPU / Core
● CPU : Central Processing Unit
● Core : an independent processing unit inside a CPU
[Diagram: 1 dual-core CPU]
Login nodes : alias « genotoul »
● Each server = 32 INTEL cores, 128 GB of memory
● 64-bit Linux based on the CentOS-6 distribution
● Hundreds of simultaneous users
● Secured (SSH only), backed up daily
● FUNCTIONS :
➔ To serve development environments
➔ To test your scripts before data analysis
➔ To launch batches on the cluster nodes
➔ To follow the execution of jobs
➔ To get data results on the /save directory
Login nodes : alias « genotoul »
● Environment dedicated to bioinformatics
➔ Software in : /usr/local/bioinfo/src
  (ex: blastall, clustalw, iprscan, megablast, wublast, ...)
➔ Genomics databanks in : /bank
● Development languages
➔ Shell, perl, C++, java, python...
● Editing tools
➔ nedit, geany, nano, emacs, vi, ...
Access to cluster nodes
● Interactive mode : for beginners / for remote display
● Batch access : for intensive usage (most jobs)
● Communication between the login servers and the compute nodes is managed by the grid scheduler. No direct SSH access to the nodes.
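A quick preview of the two access modes, detailed in the following sections:
$ qlogin             # interactive session on a compute node
$ qsub myscript.sh   # batch submission of a script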
Data storage
[Photo: the storage drive bay]
Disk spaces
● /usr/local/bioinfo/ : bioinformatics software
● /bank/ : international genomics databanks
● /home/ : user configuration files ONLY (100 MB user quota)
● /save/ : user disk space, with BACKUP (250 GB user quota)
● /work/ : HPC TEMPORARY disk space (1 TB user quota)
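A hedged sketch of the intended workflow (the directory names under /work and /save are placeholders; compute nodes write only to /work, as explained on the HPC environment slide):
# In a submitted script: work in the temporary space
cd /work/.../my_analysis
# ... run the analysis here ...
# Back on a login node: copy the results worth keeping to the backed-up space
cp -r /work/.../my_analysis/results /save/.../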
HPC environment
High Performance Computing
● The workspace is exactly the same as on the genotoul login nodes (software, databanks, disk spaces).
● Exception : permission rights on the disk spaces (read-only on the /save directory).
● Tips :
➔ Submission and control from genotoul
➔ Portable binaries (no need to recompile)
➔ Facilities to get results
Cluster nodes
High Performance Computing cluster :
● node001 to node068 (INTEL)
● ceri001 to ceri034 (AMD)
● bigmem01 to bigmem02
● smp
Cluster nodes
● INTEL cluster : 68 nodes purchased in 2014
  => each 20 cores (40 threads), 256 GB memory
● AMD cluster : 34 nodes purchased in 2012
  => each 48 cores (48 threads), 384 GB memory
● BIGMEM : 2 nodes purchased in 2010/2012
  => each 32 cores (64 threads), 1 TB memory
● SMP : 1 node purchased in 2014
  => 120 cores (240 threads), 3 TB memory
● High-performance clustered file system (GPFS) on /work
Planning of the day
Part I : 09h00 - 12h00
Compute nodes environment
Open Grid Engine
Practical 1
Part II : 14h00 – 17h00
Submit array of jobs
Practical 2
Parallel environments
Practical 3
Grid Engine is responsible for accepting, scheduling,
dispatching, and managing the remote and distributed
execution of large numbers of standalone, parallel or
interactive user jobs.
It also manages and schedules the allocation of distributed resources such as processors and memory.
OGE (Open Grid Engine)
Queues available for users
● workq : addresses all nodes but limited to 48 h
● unlimitq : just a few nodes per user (unlimited run time)
● hypermemq : bigmem nodes (on demand)
● smpq : smp node (on demand)
● ...
● Others : special, restricted
OGE (Open Grid Engine)
Default parameters
● workq
● 1 core
● 8 GB memory maximum
● Write access only to the /work directory (temporary disk space)
● 1 TB disk quota per user (on the /work directory)
● Files not accessed for 120 days are automatically purged
● 100 000 h of computing time annually (more on demand)
OGE (Open Grid Engine)
qrsh (interactive mode)
qlogin (interactive with graphical redirection)
Example session (connected to node001, then disconnected):
[laborie@genotoul2 ~]$ qlogin
Your job 2470388 ("QLOGIN") has been submitted
waiting for interactive job to be scheduled ...
Your interactive job 2470388 has been successfully scheduled.
Establishing /SGE/ogs/inra/tools/qlogin_wrapper.sh session to host node001 ...
[laborie@node001 ~]$
[laborie@node001 ~]$ exit
logout
/SGE/ogs/inra/tools/qlogin_wrapper.sh exited with exit code 0
[laborie@genotoul2 ~]$
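qrsh behaves the same way but without the graphical redirection; as a minimal sketch:
$ qrsh    # open an interactive shell on a compute node (type exit to leave)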
OGE (Open Grid Engine)
qsub : batch submission
1 - First write a script (e.g. myscript.sh) containing the command lines, as follows:

#$ -o /work/.../output.txt
#$ -e /work/.../error.txt
#$ -q workq
#$ -m bea
# My command lines I want to run on the cluster
blastall -d swissprot -p blastx -i /save/.../z72882.fa

2 - Then submit the job with the qsub command line, as follows:

$ qsub myscript.sh
Your job 15660 ("myscript.sh") has been submitted    <- 15660 is the job ID
OGE (Open Grid Engine)
Job Submission : basic options
● -N job_name : to give a name to the job
● -q queue_name : to specify the batch queue
● -o output_file_name : to redirect the standard output
● -e error_file_name : to redirect the error output
● -m bae : mail sending options (b : begin, a : abort, e : end)
● -l mem=8G : to ask for 8 GB of memory (minimum reservation)
● -l h_vmem=10G : to set the maximum memory consumption
● -l myarch=intel / amd : to choose the node architecture
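These options can also be combined at the top of a submission script (a sketch; the job name and the elided paths are placeholders):
#!/bin/bash
#$ -N my_blast                 # job name
#$ -q workq                    # batch queue
#$ -o /work/.../output.txt     # standard output
#$ -e /work/.../error.txt      # error output
#$ -m bae                      # mail at begin, abort and end
#$ -l mem=8G -l h_vmem=10G     # memory reservation and hard limit
# My command lines I want to run on the cluster
blastall -d swissprot -p blastx -i /save/.../z72882.fa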
OGE (Open Grid Engine)
Job Submission : some examples
● Default (workq, 1 core, 8 GB memory max)
$ qsub myscript.sh
Your job 15660 ("myscript.sh") has been submitted

● More memory (workq, 1 core, 32 / 36 GB memory)
$ qsub -l mem=32G -l h_vmem=36G myscript.sh
Your job 15661 ("myscript.sh") has been submitted

● More cores (workq, 8 cores, 8*8 GB memory)
$ qsub -pe parallel_smp 8 myscript.sh
Your job 15662 ("myscript.sh") has been submitted
OGE (Open Grid Engine)
Job Submission : some examples
Script edition
$ nedit myscript.sh

### head of myscript.sh ###
#!/bin/bash
#$ -m a
#$ -l mem=32G
#$ -l h_vmem=36G
# My program starts here
ls
### end of myscript.sh ###

Submission
$ qsub myscript.sh
Your job 15660 ("myscript.sh") has been submitted
OGE (Open Grid Engine)
Monitoring jobs : qstat
$ qstat
job-ID  prior  name  user  state  submit/start at  queue  slots  ja-task-ID

job-ID :          job identifier
prior :           priority of the job
name :            job name
user :            user name
state :           current state of the job (see below)
submit/start at : submit/start date
queue :           batch queue name
slots :           number of slots asked for the job
ja-task-ID :      job array task identifier (see below)
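To list only your own jobs (the same option is used on the qdel slide below), filter by user name:
$ qstat -u your_login    # your_login is a placeholder for your account name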
OGE (Open Grid Engine)
Monitoring jobs : qstat
state : current state of the job
➢ d(eletion) : job is being deleted
➢ E(rror) : job is in an error state
➢ h(old), w(aiting) : job is pending
➢ t(ransferring) : job is about to be executed
➢ r(unning) : job is running

● man qstat : to see all options of the qstat command
OGE (Open Grid Engine)
qstat -f : full format display
$ qstat -f
queuename             qtype resv/used/tot. load_avg arch       states
---------------------------------------------------------------------------------
hypermemq@bigmem01    BIP   0/25/64        25.21    linux-x64
  2654562 502.47578 scriptIMR. pbert         r  02/01/2015 10:43:21    24
  3417296 510.00000 spades.sh  klopp         r  02/23/2015 09:50:08     1
---------------------------------------------------------------------------------
hypermemq@bigmem02    BIP   0/3/32         2.00     linux-x64
  2717127 500.10764 bayesian_m lbrousseau    r  02/03/2015 20:28:58     2
  2822735 505.00000 LasMap     faraut        r  02/11/2015 14:29:35     1
---------------------------------------------------------------------------------
interq@node001        IP    0/13/40        2.12     linux-x64
  3455759 501.10143 QLOGIN     mmolettadena  r  02/23/2015 15:21:13     1
  3456700 501.10143 QLOGIN     mmolettadena  r  02/23/2015 15:33:25     1
  3456911 506.13893 QLOGIN     smehdi        r  02/23/2015 15:36:48     1
OGE (Open Grid Engine)
Deleting a job : qdel
$ qstat -u laborie
job-ID  prior     name   user     state  submit/start at      queue          slots  ja-task-ID
-----------------------------------------------------------------------------------------------
3629151 512.54885 sleep  laborie  r      02/25/2015 16:23:03  workq@node002  1

$ qdel 3629151
laborie has registered the job 3629151 for deletion
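A related sketch (a standard Grid Engine option; check man qdel on the cluster) to delete all of your own jobs at once:
$ qdel -u your_login    # your_login is a placeholder for your account name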
Connection to « genotoul » cluster
[Diagram: summary of the platform]
● Access to the platform : SSH to the « genotoul » login nodes, for development (scripts), job submission (cluster) and file transfers to /save
● Cluster commands from the login nodes : qrsh, qlogin, qsub, qstat, qdel
● Compute nodes (queues workq, hypermemq, smpq) :
  - node001 to node068 : 2720 INTEL cores, 17 TB of memory
  - ceri001 to ceri034 : 1632 AMD cores, 12 TB of memory
  - bigmem01 to bigmem02 : 128 INTEL cores, 2 TB of memory
  - smp : 240 INTEL cores, 3 TB of memory
● Storage facilities : /save read-only, /work read + write
Monitoring genotoul cluster
Practical
Part 1
Planning of the day
Part I : 09h00 - 12h00
Compute nodes environment
Open Grid Engine
Practical 1
Part II : 14h00 – 17h00
Submit array of jobs
Practical 2
Parallel environments
Practical 3
Array of jobs concept
➔ Concept : segment a job into smaller atomic jobs
➔ Improves the processing time very significantly (the calculation is performed on multiple processing cores)
Ex.1: blast in basic mode
NT (GenBank nucleotide reference sequences) and seqs.fa (multi-fasta file)

script.sh :
blastn+ -db nt -query seqs.fa

qsub script.sh
Execution on a single core
Ex.2 : blast in split mode
seqs.fa is split into seq1.fa, seq2.fa, seq3.fa

script1.sh : blastn+ -db nt -query seq1.fa
script2.sh : blastn+ -db nt -query seq2.fa
script3.sh : blastn+ -db nt -query seq3.fa

qsub script1.sh
qsub script2.sh
qsub script3.sh
Execution on 3 cores
Ex.3 : blast in job array mode
seqs.fa is split (split ...) into seq1.fa, seq2.fa, seq3.fa

script.sh (one line per chunk, built with "for i in ...") :
blastx+ -d nt -i seq1.fa
blastx+ -d nt -i seq2.fa
blastx+ -d nt -i seq3.fa

qarray script.sh
Execution on 3 cores
Ex.3 : blast in job array mode
script.sh (3 blast lines) is equivalent to three separate scripts script1.sh, script2.sh, script3.sh :
qarray script.sh  <=>  qsub script1.sh ; qsub script2.sh ; qsub script3.sh
Tools
Split a fasta file
fastasplit <path> <dirpath>

Sequence Input Options:
-f --fasta  [mandatory] <*** not set ***>
-o --output [mandatory] <*** not set ***>
-c --chunk  [2]

Example :
#mkdir out_split
#fastasplit -f seqs.fa -o out_split -c 6
Tools
Create multi commands file
1 rm script.sh
2 for f in `ls out_split/*`
3 > do
4 > echo blastn+ -query $f -db ensembl_danio_rerio -o $f.blast >> script.sh
5 > done

➢ ` : the character on the '7' key (line 2)
➢ for : $f loops over the result of the command between ` ... ` (line 2), i.e. the output of the split
➢ do : syntactically required (line 3)
➢ echo : prints to the screen (line 4)
➢ >> : redirects the screen output into the file script.sh (line 4)
➢ done : syntactically required (line 5)

(1) If you execute the 'for' loop a second time, you MUST DELETE script.sh first, as '>>' appends lines to the file if it already exists.
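Putting the pieces together, a minimal sketch of the whole job-array workflow (it assumes the seqs.fa input and the ensembl_danio_rerio bank used above):
# 1. Split the multi-fasta file into 6 chunks
mkdir out_split
fastasplit -f seqs.fa -o out_split -c 6
# 2. Build one blast command line per chunk
rm -f script.sh
for f in out_split/*
do
  echo "blastn+ -query $f -db ensembl_danio_rerio -o $f.blast" >> script.sh
done
# 3. Submit the command file as an array of jobs
qarray script.sh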
Practical
Part 2
Planning of the day
Part I : 09h00 - 12h00
Compute nodes environment
Open Grid Engine
Practical 1
Part II : 14h00 – 17h00
Submit array of jobs
Practical 2
Parallel environments
Practical 3
OGE (Open Grid Engine)
Previous use of the cluster : 1 job = 1 thread (one core)

script.sh :
blastx+ -d nt -i seq1.fa
blastx+ -d nt -i seq2.fa

qarray script.sh
Each blast uses 1 core (blast1 and blast2 run on separate cores)
OGE (Open Grid Engine)
Parallel environments
If the program was developed for it : 1 job can use multiple threads

script.sh :
blastx+ -num_threads 2 -d nt -i seqs.fa

qsub -pe parallel_smp 2 script.sh
Each blast uses 2 cores
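The number of threads asked for in the command should match the number of slots reserved with -pe; for instance (a sketch with 4 threads):
$ qsub -pe parallel_smp 4 script.sh    # script.sh then runs blastx+ -num_threads 4 ...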
OGE (Open Grid Engine)
Parallel environments
Visualisation :
qconf -spl
qconf -sp <parallel_env>
Utilisation : qsub -pe <parallel_env> <n slots> myscript.sh

● smp : X cores on the same node (multi-thread, OpenMP)
● parallel_fill : fill up the node, then use other nodes (MPI)
● parallel_rr : X cores on strictly different nodes (MPI)
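For example (the exact list depends on the site configuration):
$ qconf -spl              # list the available parallel environments
$ qconf -sp parallel_smp  # show the definition of one of them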
OGE (Open Grid Engine)
Parallel environments : smp
[Diagram: one multi-threaded blast job, all threads on the same node]
Shared memory within the same node
Needs an optimized program (e.g. for blast, do not use more than 8 threads)
OGE (Open Grid Engine)
Parallel environments : rr / fill
[Diagram: thread1, thread2, thread3 spread across different nodes]
Only for MPI programs (Message Passing Interface)
Read the manual of the software before using it
Not optimized for blast!
OGE (Open Grid Engine)
Parallel environments
Examples :
qsub -hard -l myarch=intel ... myscript.sh   (run on intel nodes only)
qsub -soft -l myarch=intel ... myscript.sh   (intel nodes only if they are free)
qsub -pe parallel_fill 32 -soft -l myarch=intel ... myscript.sh
qsub -pe parallel_smp N -hard -l myarch=intel ... myscript.sh

Why does this job stay waiting in the queue?
qsub -q workq -pe parallel_smp 20 -l mem=12G ... myscript.sh
OGE (Open Grid Engine)
qstat -r : resources requirements
$ qstat -r
3193243 516.61063 tneg_V1_UC aghozlane  qw  02/19/2015 12:16:10
    Full jobname:     tneg_V1_UC35_0_GL0032312
    Requested PE:     parallel_rr 8
    Hard Resources:   h_stack=256M (0.000000)
                      h_vmem=50G (0.000000)
                      memoire=50G (0.000000)
                      pri_work=true (2400.000000)
OGE (Open Grid Engine)
qstat -t : sub-tasks (parallel jobs)
$ qstat -t
3191467 516.61063 tneg_MH034 aghozlane  r  02/25/2015 09:02:18  workq@node012  MASTER
3191467 516.61063 tneg_MH034 aghozlane  r  02/25/2015 09:02:18  workq@node012  SLAVE
3191467 516.61063 tneg_MH034 aghozlane  r  02/25/2015 09:02:18  workq@node014  SLAVE
3191467 516.61063 tneg_MH034 aghozlane  r  02/25/2015 09:02:18  workq@node015  SLAVE
3191467 516.61063 tneg_MH034 aghozlane  r  02/25/2015 09:02:18  workq@node016  SLAVE
3191467 516.61063 tneg_MH034 aghozlane  r  02/25/2015 09:02:18  workq@node017  SLAVE
3191467 516.61063 tneg_MH034 aghozlane  r  02/25/2015 09:02:18  workq@node018  SLAVE
3191467 516.61063 tneg_MH034 aghozlane  r  02/25/2015 09:02:18  workq@node019  SLAVE
3191467 516.61063 tneg_MH034 aghozlane  r  02/25/2015 09:02:18  workq@node020  SLAVE
Practical
Part 3