Illumina Pipeline Overview

Casava Pipeline v1.8.2
Consensus Assessment of Sequence And VAriation
New ISAAC software
Much of this information comes from the Illumina documentation

Introduction

Firecrest – Image Analysis
- tiff image files
- RTA allows on-the-fly image processing and basecalling

Bustard – Base calling, quality calibration, filtering and statistics
- intensity files
- sequence files
- RTA file format called .bcl files – binary files that contain all basecalled sequence reads and quality scores

CASAVA – Demultiplexing, alignment and more statistics
- Align reads with phageAlign or ELAND
- ELAND output - export files

ELAND - Efficient Large-Scale Alignment of Nucleotide Databases

eland_pair
- align each read separately, then use uniquely aligning reads to estimate read orientation and distance
- perl script picks the best read pair
- anomaly file is created for reads that do not map in the expected orientation or size range

eland_rna
- abundant sequences files: mitochondrial DNA, ribosomal region sequences, 5S RNA (optional), and other contaminants
- splice junction set files

New ELAND v2e
• Multi-seed and gapped alignment (since CASAVA 1.6)
• Improved repeat resolution (multiple overlapping seeds to anchor into unique)
• Orphan alignment – Try to align orphaned mate within a defined window (default ~450 bp)
• Run time improvements

Basic Sample Sheet

Generating the Sample Sheet
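As a rough illustration of generating a sample sheet, the sketch below writes a CASAVA 1.8-style CSV with Python's csv module. The column names are assumed from the CASAVA 1.8 sample sheet layout (FCID, Lane, SampleID, SampleRef, Index, Description, Control, Recipe, Operator, SampleProject). The helper name build_sample_sheet and every value in the example rows are hypothetical placeholders, so check the Illumina documentation before relying on it.

```python
import csv
import io

# Column order assumed from the CASAVA 1.8 sample sheet layout.
COLUMNS = ["FCID", "Lane", "SampleID", "SampleRef", "Index",
           "Description", "Control", "Recipe", "Operator", "SampleProject"]


def build_sample_sheet(rows):
    """Return CSV text for a list of dicts keyed by COLUMNS (hypothetical helper)."""
    for row in rows:
        lane = int(row["Lane"])
        if not 1 <= lane <= 8:  # a flow cell has lanes 1-8
            raise ValueError(f"lane must be 1-8, got {lane}")
        if row["Control"] not in ("Y", "N"):
            raise ValueError("Control must be Y or N")
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=COLUMNS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()


# Placeholder rows: flow cell ID, reference, recipe, operator and project are made up.
rows = [
    {"FCID": "FC0001", "Lane": 2, "SampleID": "NA10831", "SampleRef": "hg19",
     "Index": "ATCACG", "Description": "demo", "Control": "N",
     "Recipe": "R1", "Operator": "operator", "SampleProject": "Demo"},
    {"FCID": "FC0001", "Lane": 3, "SampleID": "SampleB", "SampleRef": "hg19",
     "Index": "CGATGT", "Description": "demo", "Control": "N",
     "Recipe": "R1", "Operator": "operator", "SampleProject": "Demo"},
]
sheet_text = build_sample_sheet(rows)
print(sheet_text)
```

Validating the lane and control fields before writing keeps obvious typos out of the demultiplexing step, where they are harder to trace.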
Files

Run Folder Structure

Directory Structure Example
110608_Nirvana_0063_AB0ABTABXX/Data/Intensities/BaseCalls

RUN (or user specified location)
    Unaligned
        Project_DNA_A
            Sample_NA10831
                NA10831_ATCACG_L002_R1_001.fastq.gz
                NA10831_ATCACG_L002_R1_002.fastq.gz
                NA10831_ATCACG_L002_R1_003.fastq.gz
                NA10831_ATCACG_L002_R2_001.fastq.gz
                NA10831_ATCACG_L002_R2_002.fastq.gz
                NA10831_ATCACG_L002_R2_003.fastq.gz
                NA10831_ATCACG_L003_R1_001.fastq.gz
                NA10831_ATCACG_L003_R1_002.fastq.gz
                NA10831_ATCACG_L003_R2_001.fastq.gz
                NA10831_ATCACG_L003_R2_002.fastq.gz
            Sample_NA108??
                NA108??_CGATGT_L002_R1_001.fastq.gz
                NA108??_CGATGT_L002_R1_002.fastq.gz
                NA108??_CGATGT_L002_R2_001.fastq.gz
                NA108??_CGATGT_L002_R2_002.fastq.gz
        Project_RNA_?
            Sample_NA108?1
                NA108?1_CGATGT_L003_R1_001.fastq.gz
                NA108?1_CGATGT_L003_R2_001.fastq.gz
            Sample_NA108?0
                NA108?0_TTAGGC_L003_R1_001.fastq.gz
                NA108?0_TTAGGC_L003_R2_001.fastq.gz
        Project_Control

… produced by the DCC at EBI and NCBI, they represent the most popular format used in projects where the NIH is involved. We will use the same approach. The FASTQ files are gzip compressed to minimize storage. Many popular aligners are able to directly read compressed FASTQ files.

A sample entry is provided and explained below:
@EAS139:136:FC706VJ:2:5:1000:12850 1:Y:18:ATCACG
...
+
BBBBCCCC?<A?BC?7@@???????DBBA@@@@A@@

The first line is prefixed by the "@" symbol and contains the read name. These names are parsed until the first encountered whitespace. Due to this behavior, adding additional tags to the header line is not problematic for extant FASTQ parsers.

The second line contains the sequence bases.

The third line is prefixed by a "+" symbol and sometimes repeats the read name. The read name is omitted in the minimal FASTQ case.

The fourth line contains the base qualities, where BQ + 33 = ASCII value shown in the base quality string.

The header line is interpreted as follows:

@<instrument-name>:<run ID>:<flowcell ID>:<lane-number>:<tile-number>:<x-pos>:<y-pos> <read number>:<is filtered>:<control number>:<barcode sequence>

Note the space between <y-pos> and <read number>. In a paired end run, read 1 and read 2 will be in different FASTQ files, but we want them to have matching template names. The name up to the space will also be used as the read name in the final BAM file.

<read number> will typically be 1 or 2, but the field can support other values. (For example, certain indexing formats lead to 3 reads.)
<is filtered> is Y if the read is filtered, N otherwise.
<control number> is 0 when none of the control bits are on, otherwise it is an even number.
<barcode sequence> represents the USE_BASES masked barcode sequence, empty otherwise.
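The header layout and the BQ + 33 rule can be sketched in a few lines of Python. This is a minimal illustration, not Illumina's own parser: the names ReadHeader, parse_header and decode_qualities are hypothetical, and the code assumes a well-formed CASAVA 1.8-style header with exactly one space between the two colon-separated groups.

```python
from typing import List, NamedTuple


class ReadHeader(NamedTuple):
    instrument: str
    run_id: int
    flowcell_id: str
    lane: int
    tile: int
    x_pos: int
    y_pos: int
    read_number: int
    is_filtered: bool
    control_number: int
    barcode: str


def parse_header(line: str) -> ReadHeader:
    """Split a CASAVA 1.8-style FASTQ header into its fields."""
    if not line.startswith("@"):
        raise ValueError("header must start with '@'")
    # The read name runs up to the first space; the rest describes the read.
    name, _, description = line[1:].strip().partition(" ")
    instrument, run_id, flowcell, lane, tile, x_pos, y_pos = name.split(":")
    read_number, filtered, control, barcode = description.split(":")
    return ReadHeader(instrument, int(run_id), flowcell, int(lane), int(tile),
                      int(x_pos), int(y_pos), int(read_number),
                      filtered == "Y", int(control), barcode)


def decode_qualities(quality: str) -> List[int]:
    """Convert a quality string to base qualities (ASCII value minus 33)."""
    return [ord(char) - 33 for char in quality]


header = parse_header("@EAS139:136:FC706VJ:2:5:1000:12850 1:Y:18:ATCACG")
quals = decode_qualities("BBBBCCCC?<A?BC?7@@???????DBBA@@@@A@@")
print(header)
print(quals)
```

Because the part of the header before the space is shared by read 1 and read 2, comparing that prefix is one way to check that two paired FASTQ files are in the same order.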
Fastq files

Demultiplex Stats file (Indexing)

QC metrics (Summary file)
For aligned paired reads, the summary file shows read orientation and average insert size for each lane.

ISAAC Aligner and Variant Caller
(processing in less than half the time compared to Casava)

Aligner
- Sort reference index by 32mers
- Find candidate mappings by 32bp seed search
- Select best mapping (3’ LQ and adapter trimming)
- Assign alignment scores (use base quality and position of mismatches)
- Output is sorted de-duped BAM

Variant Caller
- Call SNVs and small indels (<50bp)
- Bayesian SNP caller computes probability of each genotype
- Filters are applied (quality, depth, etc.)
- Reads are realigned around indels (Bayesian indel caller)
- gVCF output

Excellent Quality Metrics

Sequencing Analysis Viewer (SAV)
Save thumbnail images
SAV quality metrics charts
SAV check images
SAV Summary

Illumina VariantStudio
• Import vcf files
• Annotate, filter, and classify variants
• Filter based on family structure and disease model
• Somatic variants / COSMIC
• Generate reports with histograms and charts

Interface Commands

MiSeq Reporter

MiSeq Reporter Workflows
Instrument Control Software (MCS)
RTA
Images
Base calls & Quality Scores
MiSeq Reporter
- Resequencing
- Amplicon
- Library QC
- Small RNA
- De novo Assembly
- 16S Metagenomics
Limited visualization via HTTP interface

MiSeq Workflows
• Library QC
• Resequencing
• Amplicon (up to 384 loci in 96 samples)
• De Novo Assembly (<20Mb)
• Small RNA
• Metagenomics (16S rRNA)

MiSeq PhiX Validation Run
Paired end 100 cycle run

Illumina Control Library (PE 151 MiSeq run)
Coverage report
- Coverage depth
- Mismatches
- Quality score (avg)
- Variant call score
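The seed-search idea from the ISAAC aligner slide (index the reference by fixed-length k-mers, look up seeds from the read, then pick the best candidate) can be illustrated with a toy sketch. This is not ISAAC's implementation: it swaps the sorted 32-mer index for a plain hash table, scores candidates by mismatch count only (ignoring base quality and mismatch position), and uses k=4 with a made-up reference so the example stays readable. All function names are hypothetical.

```python
from collections import defaultdict


def build_seed_index(reference, k=32):
    """Map every k-mer in the reference to the positions where it occurs."""
    index = defaultdict(list)
    for pos in range(len(reference) - k + 1):
        index[reference[pos:pos + k]].append(pos)
    return index


def candidate_positions(read, index, k=32):
    """Return candidate read start positions from exact, non-overlapping seed hits."""
    candidates = set()
    for offset in range(0, len(read) - k + 1, k):
        for hit in index.get(read[offset:offset + k], ()):
            start = hit - offset  # where the read would start on the reference
            if start >= 0:
                candidates.add(start)
    return sorted(candidates)


def count_mismatches(read, reference, start):
    """Count mismatching bases, or return None if the read runs off the reference."""
    window = reference[start:start + len(read)]
    if len(window) < len(read):
        return None
    return sum(a != b for a, b in zip(read, window))


def best_mapping(read, reference, index, k=32):
    """Return (mismatches, start) for the best candidate, or None if there is none."""
    scored = []
    for start in candidate_positions(read, index, k):
        mismatches = count_mismatches(read, reference, start)
        if mismatches is not None:
            scored.append((mismatches, start))
    return min(scored) if scored else None


# Toy data: a made-up reference and a read copied from position 4 with one base changed.
reference = "ACGTTGCAAGGCTTACCGATGCATTAGC"
read = "TGCAAGGCTAAC"
index = build_seed_index(reference, k=4)
print(candidate_positions(read, index, k=4))
print(best_mapping(read, reference, index, k=4))
```

Using several seeds per read means a mismatch inside one seed does not prevent the read from being placed, since another seed can still anchor it.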
© Copyright 2024