Biology 559R: Introduction to Phylogenetic Comparative Methods

Topics for this week (April 7 & April 9):
• Cluster fun:
Intro to BYU supercomputer
Intro to UNIX
Running programs and jobs
• First round of presentations
1 Presentations and Report
• This next Thursday (April 9th and April 14th)
• You have 15 minutes (12 min for presentation and 3 min for questions)
• You will need to send me or bring a PDF of your presentation (so I can put it on my
computer)
• You will need a final report (no more than 4500 words, 10 pages) in the format of a ‘Brief
Communication’ of your project (last day to turn in this report: April 20th):
Title: 150 characters (including spaces)
Abstract: 300 words maximum for your abstract.
Introduction: A brief background introduction
Your main question or hypothesis
Materials and Methods: Summary methods and data used
Results: Main results including figures and tables
Discussion: The relevance of the main results in the light of other evidence
References: Use the Evolution journal guidelines
http://onlinelibrary.wiley.com/journal/10.1111/%28ISSN%291558-5646/homepage/ForAuthors.html
2 Computer Clusters
• A computer cluster is an array of connected computers with similar attributes that can be used individually or work together so that, in many respects, they can be viewed as a single system.
• Most clusters run UNIX (or a Unix-like system) as their operating system, and users are
required to have a minimum knowledge of its command line.
• Phylogenetic and comparative methods might require the use of computer clusters to shorten
run times and to provide the memory needed to analyze large datasets.
3 Computer Clusters at BYU
• The Fulton Supercomputing Lab: https://marylou.byu.edu/
• For reference: MacPro 12-cores, 64 GB memory, 1 TB storage
• An introductory video to the Fulton Supercomputing Lab (Time 6:37):
https://www.youtube.com/watch?v=i1r9BxHBG0I
9 Getting an account at BYU Clusters
• The Fulton Supercomputing Lab: https://marylou.byu.edu/account/create/
10 UNIX
• This is a family of computer operating systems used by programmers and very
common in cluster computers.
• For users, the UNIX operating system is characterized by:
1) Command-line based interaction with the computer
2) Plain text for storing and input of data
3) A large collection of software tools that need to be called by the user
• Several Unix-like systems (e.g., Linux) exist as free or open-source software, which
has facilitated their distribution and popularization.
11 Intro to UNIX
• For Mac users, the Terminal application is probably the main software to access and connect to the BYU
cluster
• We can open any directory on our computer by typing cd followed by a space and then dragging
the folder that we are interested in exploring into the terminal window
12 Intro to UNIX
• On the course website, we can explore the ‘Unix-cheat-sheet’ and some of the most
common tools
• Using the terminal, you can log in to your BYU cluster account using ssh (secure shell)
• Then, you can explore any folder by using list (ls) and move up and down directories (cd).
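A minimal sketch of these commands, assuming the login host ssh.fsl.byu.edu and the placeholder user_name. The directory and file names (demo_project, data, tree.nex) are hypothetical, and the ssh line is left as a comment so that the rest can be tried in any local terminal.

```shell
# Log in to the cluster (run interactively; user_name is a placeholder).
# ssh user_name@ssh.fsl.byu.edu

# Build a small practice tree in a temporary directory (hypothetical names).
practice_dir=$(mktemp -d)
mkdir -p "$practice_dir/demo_project/data"
touch "$practice_dir/demo_project/data/tree.nex"

cd "$practice_dir/demo_project"   # move down into a directory
pwd                               # print the directory you are in
ls                                # list its contents
ls -l data                        # long listing of a subdirectory
cd data                           # move down one more level
cd ..                             # move back up one level
current_dir=$(pwd)
```

Typing cd with no argument returns to your home directory.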
13 Intro to UNIX
• One of the most common ways to abbreviate names is to use ‘wildcard’ characters. This
significantly reduces the amount of typing.
• By pressing the up (↑) arrow, you can recall the previous command line that you typed.
You can continue pressing the up arrow to get earlier commands.
• By pressing the down (↓) arrow, you can move forward again to the commands that you typed
after the one shown.
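As an illustration of wildcard characters, here is a short sketch run in a temporary directory. The file names are made up for the example; * stands for any run of characters, ? for exactly one character, and [12] for one character from the listed set.

```shell
work_dir=$(mktemp -d)
cd "$work_dir"
touch run1.log run2.log run10.log tree.nex notes.txt

ls *.log        # every file ending in .log
ls run?.log     # ? matches exactly one character: run1.log and run2.log
ls run[12]*     # names starting with run1 or run2 (this includes run10.log)

# Count the matches (tr removes the padding some systems add to wc output).
log_count=$(ls *.log | wc -l | tr -d ' ')
single_digit_count=$(ls run?.log | wc -l | tr -d ' ')
```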
14 Intro to UNIX
• It is very likely that you will also spend a significant amount of time copying files and making new
directories
• Notice the scp command: it allows you to copy files and folders from your computer to
the BYU cluster and vice versa.
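A sketch of the copying commands, under the assumption that the login host is ssh.fsl.byu.edu. All file, folder, and remote path names are hypothetical, and the scp lines are commented out because they need a real account.

```shell
work_dir=$(mktemp -d)
cd "$work_dir"

mkdir analysis                      # make a new directory
mkdir -p analysis/input/trees       # make nested directories in one step
echo "example" > data.txt
cp data.txt analysis/input/         # copy a file into a directory
cp -r analysis analysis_backup      # copy a whole directory (recursive)

# Transfers to and from the cluster (run these from your own computer).
# user_name and the remote paths are placeholders.
# scp data.txt user_name@ssh.fsl.byu.edu:~/analysis/
# scp -r analysis user_name@ssh.fsl.byu.edu:~/
# scp user_name@ssh.fsl.byu.edu:~/analysis/results.txt .
```

In scp, the part before the colon names the remote machine and the part after it is the path on that machine.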
15 Intro to UNIX
• Removing and renaming files and folders can be tedious, and you have to be very
careful with these commands. Once deleted, files and folders are lost: there is no
‘trash bin’ from which you can recover them.
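A small sketch of renaming and deleting, run inside a temporary directory so nothing important can be lost. The file and folder names are hypothetical.

```shell
work_dir=$(mktemp -d)
cd "$work_dir"
mkdir old_runs
touch draft.txt old_runs/run1.log scratch.tmp

mv draft.txt report.txt     # rename a file
mv old_runs archive         # rename (or move) a directory
ls archive                  # check what is inside before deleting anything
rm scratch.tmp              # delete a single file; there is no undo
rm -r archive               # delete a directory and everything in it
# rm -i some_file           # asks for confirmation first (safer when unsure)
```

Listing a directory with ls before running rm -r on it is a cheap habit that prevents most accidents.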
16 Intro to UNIX
• Most text editing should be done on your computer using ‘TextWrangler’, ‘Notepad++’ or any
other editor. However, you might need to inspect files in the cluster (e.g., checking that you
are getting the correct output, or reading the messages in error logs)
Notice the use of the VIM editor; this is a very powerful text editor available in the cluster.
However, its use requires significant practice. Here is a link if you are interested in more details
about this software:
http://www.fprintf.net/vimCheatSheet.html
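For quick inspection without opening an editor, a few standard tools are often enough. This is a sketch on a stand-in log file whose name and content are invented for the example; the interactive commands are left as comments.

```shell
work_dir=$(mktemp -d)
cd "$work_dir"
# Create a small stand-in for a log file (hypothetical content).
printf 'step 1 ok\nstep 2 ok\nERROR: missing tree file\nstep 4 ok\n' > error.log

cat error.log               # print the whole file
head -n 2 error.log         # first two lines
tail -n 1 error.log         # last line
wc -l error.log             # number of lines
grep "ERROR" error.log      # only the lines that contain ERROR
# less error.log            # page through a long file interactively (q to quit)
# vim error.log             # open in VIM (:q! quits without saving)

error_count=$(grep -c "ERROR" error.log)
```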
17 Intro to UNIX
• Most processes that require a computer cluster involve large files that you have to copy
from your computer to the cluster and vice versa. For this reason, most files are
compressed before and expanded after the transfer.
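One common way to do this is with tar and gzip compression. The sketch below assumes a hypothetical folder called my_analysis and runs entirely in a temporary directory.

```shell
work_dir=$(mktemp -d)
cd "$work_dir"
mkdir my_analysis
echo "example alignment" > my_analysis/alignment.nex

tar -czf my_analysis.tar.gz my_analysis   # compress the folder into one archive
tar -tzf my_analysis.tar.gz               # list the archive contents without expanding

mkdir unpacked
tar -xzf my_analysis.tar.gz -C unpacked   # expand the archive after the transfer
```

The flags read as c (create), x (extract), t (list), z (gzip), and f (archive file name); -C chooses the directory to expand into.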
18 Submitting Jobs to the BYU clusters
• With the exception of some scripts (e.g., testing scripts), most jobs will be run after a job
script has been submitted to the cluster.
• Currently, the BYU clusters accept job scripts submitted with sbatch, a command
of the Slurm (Simple Linux Utility for Resource Management) Workload Manager.
• Slurm is an open-source job scheduler that allocates resources (compute nodes) to users for some
duration of time so they can perform work.
• Slurm also provides a framework for starting, executing, and monitoring work on a set of
allocated nodes.
• Finally, Slurm arbitrates contention for resources by managing a queue of pending jobs.
• Here is an intro video by FSL:
https://www.youtube.com/watch?v=U42qlYkzP9k&index=4&list=PL326A5EB4E3B16FED
19 Creating Job Scripts
• Make a text file (e.g., mytest.job)
Note: You can also have a working folder in your home directory:
cd /bluehome3/user_name/mydirectory
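A minimal sketch of what such a job file might contain. The resource values and the job name are assumptions chosen for illustration, not the settings shown in class; the #SBATCH lines are standard Slurm options. The example only writes the file and checks its syntax, so it can be tried without access to the cluster.

```shell
work_dir=$(mktemp -d)
cd "$work_dir"

# Write the job script to a text file; the quoted 'EOF' keeps $(...) literal.
cat > mytest.job <<'EOF'
#!/bin/bash
#SBATCH --job-name=mytest        # name shown in the queue
#SBATCH --time=01:00:00          # wall-clock limit (hh:mm:ss)
#SBATCH --nodes=1                # number of nodes
#SBATCH --ntasks=1               # number of processor cores (tasks)
#SBATCH --mem-per-cpu=1024M      # memory per core
#SBATCH --output=mytest_%j.out   # %j is replaced by the job id

# Commands below run on the compute node once the job starts.
echo "Job started on $(hostname)"
echo "Job finished"
EOF

bash -n mytest.job    # syntax check only; nothing is submitted here
```

Requesting only the time and memory a job really needs usually helps it start sooner, since the scheduler can fit small requests into gaps.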
20 Creating Job Scripts
• Batch job with modules:
https://marylou.byu.edu/documentation/apps/softwareModuleList
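A hedged sketch of a batch job that loads software through modules. The module name r is hypothetical (the real names come from the module list linked above), and the resource values are assumptions. The job file is only written and syntax-checked here, not run, because the module command exists only on systems that provide environment modules.

```shell
work_dir=$(mktemp -d)
cd "$work_dir"

cat > modules_test.job <<'EOF'
#!/bin/bash
#SBATCH --time=00:30:00
#SBATCH --ntasks=1
#SBATCH --mem-per-cpu=2048M

module purge            # start from a clean environment
module load r           # hypothetical module name; check the module list
module list             # record the loaded modules in the job output

echo "Modules loaded; program commands go here"
EOF

bash -n modules_test.job   # syntax check only
# On a cluster login node, `module avail` lists the modules that exist.
```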
22 Creating Job Scripts
• Running R in the cluster: It requires that you have an R script (i.e., a set of commands ready to
run, as if you were copying and pasting them into R)
More info:
https://www.osc.edu/documentation/howto/install-local-R-packages
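A sketch of how an R script and its job file could fit together, assuming R is provided through a module (the module name r is hypothetical) and that Rscript is used to run the script non-interactively. The script content, file names, and resource values are made up for illustration; the example only writes the two files.

```shell
work_dir=$(mktemp -d)
cd "$work_dir"

# The R script: the same commands you would otherwise paste into the R console.
cat > myscript.R <<'EOF'
x <- c(1.2, 3.4, 5.6)
write.csv(data.frame(mean_x = mean(x)), "results.csv", row.names = FALSE)
EOF

# The job file that runs the R script on a compute node.
cat > run_r.job <<'EOF'
#!/bin/bash
#SBATCH --time=00:30:00
#SBATCH --ntasks=1
#SBATCH --mem-per-cpu=2048M

module load r           # hypothetical module name; check the module list
Rscript myscript.R      # run the script without opening an R session
EOF

bash -n run_r.job       # syntax check only; R itself is not run here
```

Because the job runs unattended, the script should write its results to files (as results.csv does here) rather than rely on printing to the screen.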
24 Copy folder and files to the Cluster
• Copy your files and job directories to the BYU cluster. You can use ‘FileZilla’ which is a free,
cross-platform FTP (File Transfer Protocol) application software.
https://filezilla-project.org/
Binaries are available for Windows, Linux, and Mac OS X.
• If this is your first time, you need to set up the connection (port 22 means the transfer
runs over SSH, i.e., SFTP rather than plain FTP):
Host: ssh.fsl.byu.edu
Username: user_name
Password: your password
Port: 22 (SSH Remote Login Protocol)
• Then, you can ‘drag and drop’ your folders and files (including your job files) to
the BYU cluster
25 Submitting and monitoring jobs in the Cluster
• To submit jobs, you will need to locate your job file (e.g., mytest.job)
sbatch mytest.job
• Your job will be scheduled and run by the Job Scheduler; see the video:
https://www.youtube.com/watch?v=h8TZokyI6yo&list=PL326A5EB4E3B16FED&index=2
• You can check the status of your job:
squeue -u user_name
• If, for some reason, you want to cancel a job, find its job id with squeue and then cancel it with scancel:
squeue -u user_name
scancel jobnumber