Zeynep Kalender Atak & Hana Imrichova
http://aertslab.org
http://iregulon.aertslab.org
From big-data to bedside
@aertslab

OUTLINE
• Gene regulation
• iRegulon overview and case studies
• i-cisTarget overview and case studies

TRANSCRIPTIONAL REGULATION
Genome control: undifferentiated tissue, differentiated cells
TF motif (logo) from position-weight matrices (PWMs)
Image credit: http://www.cancerwatch.org & Wassermann and Sandelin, Nature Reviews 2004

• What are the key transcription factors (TFs) and the direct target genes (TGs)?
• What are the co-factors and their direct target genes?

iRegulon overview

INPUT: set of genes → MOTIF DETECTION → OUTPUT: regulons (TF1, TF2 and their target genes TG1-TG4)

• 6383 candidate motifs
• motif clustering (CRMs)
• cross-species conservation
• whole-genome rankings
• gene set enrichment

OFFLINE: score multiple genomes with 9713 PWMs
Genomes: hg19, mm9, rheMac2, panTro2, ponAbe2, rn4, susScr2, canFam2, bosTau4, monDom5

1. Define search space for each gene in the reference genome hg19: [TSS-500bp, TSS], [TSS-5kb, TSS+5kb], [TSS-10kb, TSS+10kb]
2. HMM scoring for homotypic clusters for PWM 1 (Cluster-Buster)
3. Rescoring of orthologous regulatory regions determined via LiftOver
4. Integration via Order Statistics (OS) into a whole-genome ranking

The same procedure is applied to every PWM (PWM 1 ... PWM n).
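Step 1 of the offline scoring (defining a search space around each TSS) can be sketched as below. This is only an illustration, not iRegulon's own code: the function name `search_space`, the window labels, and the strand convention (upstream means lower coordinates on "+" and higher coordinates on "-") are assumptions. Only the three window sizes come from the slides.

```python
# Hypothetical helper; names and strand handling are illustrative assumptions.
# (upstream_bp, downstream_bp) for the three windows listed on the slide.
WINDOWS = {
    "500bp_upstream": (500, 0),        # [TSS-500bp, TSS]
    "5kb_around": (5_000, 5_000),      # [TSS-5kb, TSS+5kb]
    "10kb_around": (10_000, 10_000),   # [TSS-10kb, TSS+10kb]
}


def search_space(tss: int, strand: str, window: str) -> tuple[int, int]:
    """Return (start, end) genomic coordinates of the search space for one gene."""
    upstream, downstream = WINDOWS[window]
    if strand == "+":
        start, end = tss - upstream, tss + downstream
    elif strand == "-":
        # On the minus strand, "upstream" lies at higher coordinates.
        start, end = tss - downstream, tss + upstream
    else:
        raise ValueError(f"unknown strand: {strand!r}")
    # Clip at the chromosome start.
    return max(start, 0), end


if __name__ == "__main__":
    for name in WINDOWS:
        print(name, search_space(100_000, "+", name), search_space(100_000, "-", name))
```

Each returned interval would then be scored per PWM (step 2) and lifted over to the other genomes (step 3); those steps rely on external tools and are not sketched here.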
ONLINE: fast analysis of a gene signature

INPUT: set of genes → MOTIF DETECTION → TARGET SELECTION

• The input gene set is compared against the whole-genome ranking of each PWM (PWM 1 ... PWM n)
• Recovery curve: recovered genes plotted along the whole-genome ranking
• AUC: area under the recovery curve, computed per motif
• Normalised Enrichment Score, derived from the AUC distribution: NES = (AUC - μ)/σ
• Threshold: NES > 2.5
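The recovery-curve AUC and the formula NES = (AUC - μ)/σ can be illustrated with a small sketch. It rests on several assumptions: the AUC is scaled by the bounding rectangle, μ and σ are the mean and population standard deviation of the AUCs over all motifs, and the rankings and gene names are toy data. The function names are hypothetical, and the real tool may normalise differently.

```python
from statistics import mean, pstdev

NES_THRESHOLD = 2.5  # threshold shown on the slide


def recovery_auc(ranking, gene_set, max_rank=None):
    """Area under the recovery curve, as a fraction of its bounding rectangle."""
    max_rank = max_rank or len(ranking)
    hits = set(gene_set)
    recovered = 0
    area = 0
    for gene in ranking[:max_rank]:
        if gene in hits:
            recovered += 1
        area += recovered  # curve height (genes recovered so far) at this rank
    if not hits:
        return 0.0
    return area / (max_rank * len(hits))


def normalised_enrichment(aucs):
    """NES = (AUC - mu) / sigma, with mu and sigma taken over all motifs (assumption)."""
    mu = mean(aucs.values())
    sigma = pstdev(aucs.values())
    if sigma == 0:
        return {motif: 0.0 for motif in aucs}
    return {motif: (auc - mu) / sigma for motif, auc in aucs.items()}


if __name__ == "__main__":
    # Toy whole-genome rankings (best-scoring gene first); purely illustrative.
    rankings = {
        "motif_A": ["g1", "g2", "g3", "g4", "g5", "g6"],
        "motif_B": ["g4", "g1", "g5", "g2", "g6", "g3"],
        "motif_C": ["g6", "g5", "g4", "g3", "g2", "g1"],
    }
    signature = {"g1", "g2"}
    aucs = {m: recovery_auc(r, signature) for m, r in rankings.items()}
    scores = normalised_enrichment(aucs)
    for motif in sorted(scores, key=scores.get, reverse=True):
        print(motif, round(aucs[motif], 3), round(scores[motif], 3))
    print("enriched:", [m for m, s in scores.items() if s > NES_THRESHOLD])
```

With only three toy motifs no score reaches the 2.5 threshold; the cut-off only becomes meaningful when thousands of motif rankings contribute to the AUC distribution.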
MOTIF2TF MAPPING
Enriched motifs (M1-M4) are linked to candidate TFs (TF1-TF7) using:
1) motif-TF annotations
2) in input
3) orthology
4) motif similarity

Motif collections

SOURCE            # MOTIFS  ORGANISM(S)       ANNOTATION TYPE
JASPAR            1315      Multiple species  Curated
TRANSFAC          1551      Multiple species  Curated, ChIP-chip
YeTFaSCo          1709      Yeast             Uniprobe, curated, ChIP-chip
FlyFactorSurvey   614       Drosophila        B-1H, others (e.g., FlyReg)
TIFFIN            120       Drosophila        Predicted (gene sets)
SelexConsensus    38        Drosophila        Curated (FlyReg)
Stark et al.      228       Drosophila        Predicted (conserved, BLS)
hPDI              437       Human             Experimental
Elemento et al.   371       Drosophila        Predicted (conserved, NLS)
Total now         6383

ENCODE            683       Human             Motif discovery in DHS regions
Factorbook        79        Human             ENCODE ChIP-Seq motifs
Taipale           820       Human             HT-Selex
Wolfe             36        Drosophila        ZFP motifs
iDMMPMM           39        Human
SwissRegulon      1865
Jankowski et al.  603                         Dimer motifs
HOMER             190                         Motifs from ChIP-seq and other public collection
Next release      9713      (after removal of redundant motifs)

"10K PWM Collection": 9713 PWMs (known TF / unknown TF)

How many TFs in human?
Direct annotations:
• 944 TFs
• 1943 motifs
• 2392 motif-TF interactions

Annotation sources for the extended set of 1191 TFs and 6031 motifs: direct annotations, gene orthology, motif similarity, gene orthology + motif similarity
How many TFs in human? With gene orthology and motif similarity included:
• 1191 TFs (+26%)
• 6031 motifs (x3)
• 54497 motif-TF interactions

iRegulon case studies

Input → output:
• Set of differentially expressed genes → master regulons & cofactors
• ChIP-derived targets → TF regulon
• One TF node → TF metatargetome
• Set of miRNA target genes → miRNA and TF co-regulons
• Functional network cluster → regulatory network cluster

Using iRegulon to decipher the p53 targetome
Input: set of differentially expressed genes → master regulons & cofactors
• Treatment of MCF-7 cells with Nutlin-3a: control (2x) / Nutlin-3a (2x)
• Library preparation & sequencing: 20-30 million reads/sample
• Differential expression analysis & filtering: FDR<0.05 & |log2FC|>1, giving 801 up (790 down)
• Identify p53 as the master regulator and NFY, AP-1, FOX as co-factors
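The filtering step of the case study (FDR<0.05 & |log2FC|>1, split into up- and down-regulated genes) can be written as a short sketch. The record type `DEResult`, the function `filter_de`, and the toy values are hypothetical. In practice the input would come from whatever differential expression tool was used, which the slides do not name for this case study.

```python
from dataclasses import dataclass


@dataclass
class DEResult:
    """One row of a differential expression table (hypothetical layout)."""
    gene: str
    log2fc: float  # log2 fold change, treated vs control
    fdr: float     # multiple-testing corrected p-value


def filter_de(results, fdr_max=0.05, min_abs_log2fc=1.0):
    """Split significant genes into up- and down-regulated lists."""
    significant = [r for r in results if r.fdr < fdr_max]
    up = [r.gene for r in significant if r.log2fc > min_abs_log2fc]
    down = [r.gene for r in significant if r.log2fc < -min_abs_log2fc]
    return up, down


if __name__ == "__main__":
    # Toy values for illustration only; not taken from the presentation.
    table = [
        DEResult("geneA", 3.2, 0.001),   # up
        DEResult("geneB", -1.8, 0.010),  # down
        DEResult("geneC", 0.4, 0.001),   # significant but small effect
        DEResult("geneD", 2.5, 0.200),   # large effect but not significant
    ]
    up, down = filter_de(table)
    print("up:", up)
    print("down:", down)
```

The `up` list corresponds to the kind of gene set (801 genes in the case study) that is submitted to iRegulon as input.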
Validation with p53 ChIP-seq upon Nutlin-3a
• Genome browser tracks at CDKN1A: ChIP-seq p53 stimulated, ChIP-seq p53 input, RNA-seq p53 stimulated, RNA-seq p53 not stimulated, RefSeq genes
• 110 p53 direct targets are enriched in ChIP-seq peaks
• 80 are novel targets

More case studies in http://iregulon.aertslab.org/tutorial.html

i-cisTarget overview
• RANKING (offline)
• RECOVERY (online)
i-cisTarget: input → output
• TF ChIP-seq peaks → the motifs and TF ChIP-seq tracks of the ChIP'ped factor; motifs and ChIP-seq tracks of co-factors; correlated DHS/FAIRE/histone marks
• Set of differentially active regions → the most correlated motifs and TF ChIP-seq tracks; correlated DHS/FAIRE/histone marks
• A set of genes → the most correlated motifs and TF ChIP-seq tracks; correlated DHS/FAIRE/histone marks

i-cisTarget case studies

Analyzing GATA1 ChIP-seq data on K562 cell line
Input: TF ChIP-seq peaks
• Top 1000 peaks analyzed by i-cisTarget
• Genome assemblies: hg19, mm9, dm3
• Input types: bed file, gene symbol (HGNC), i-cisTarget region IDs
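Preparing a "top 1000 peaks" BED input could look like the sketch below. The slides do not say how the peaks were ranked, so ordering by the BED score column (column 5) is an assumption, as are the function name `top_peaks` and the toy coordinates. Other peak callers may store the relevant score in a different column.

```python
def top_peaks(bed_lines, n=1000, score_column=4):
    """Return the n highest-scoring peaks as 3-column BED lines (chrom, start, end).

    score_column is a 0-based index; 4 is the standard BED "score" field.
    """
    peaks = []
    for line in bed_lines:
        line = line.rstrip("\n")
        # Skip blank lines and the optional BED header lines.
        if not line.strip() or line.startswith(("#", "track", "browser")):
            continue
        fields = line.split("\t")
        peaks.append((float(fields[score_column]), fields[:3]))
    # Highest score first; ties keep their input order (stable sort).
    peaks.sort(key=lambda peak: peak[0], reverse=True)
    return ["\t".join(coords) for _, coords in peaks[:n]]


if __name__ == "__main__":
    # Toy peaks for illustration only.
    example = [
        "track name=toy_peaks",
        "chr1\t100\t300\tpeak1\t50",
        "chr2\t500\t900\tpeak2\t900",
        "chrX\t10\t210\tpeak3\t400",
    ]
    for row in top_peaks(example, n=2):
        print(row)
```

The resulting lines can be written to a file and uploaded as the bed-file input type listed above.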
GATA1 ChIP-seq on K562 cell line
Top 1000 peaks analyzed by i-cisTarget; highlighted in the results:
• GATA motifs
• DHS regions on K562 cell line

Melanoma phenotype switching
Input: set of differentially active regions → the most correlated motifs and TF ChIP-seq tracks; correlated DHS/FAIRE/histone marks
- Malignant melanoma development is characterized by a bistable switch between different states
- Invasive and proliferative states were described in melanoma cell cultures (Hoek et al., 2008)
- We observed them also in tumour biopsies (TCGA: RNA-seq for 375 samples)
Panels: A normal, B proliferative, C invasive
Verfaillie & Imrichová & Kalender Atak et al., Nature Communications (in press)

Data
11 malignant melanoma cell cultures profiled with:
• RNA-seq
• FAIRE-seq
• ChIP-seq H3K27ac
• ChIP-seq H3K27me3
Samples are separated into proliferative and invasive based on MDS (multidimensional scaling)
Which TFs cause the chromatin changes leading to the proliferative/invasive state?
Where do the regulators bind?
DESeq2 & MACS2 applied on H3K27ac

p53 targetome revisited
Input: a set of genes → the most correlated motifs and TF ChIP-seq tracks; correlated DHS/FAIRE/histone marks
• Treatment of MCF-7 cells with Nutlin-3a: control (2x) / Nutlin-3a (2x)
• Library preparation & sequencing: 20-30 million reads/sample
• Differential expression analysis & filtering: FDR<0.05 & |log2FC|>1, giving 801 up (790 down)

Rank  Feature                              Description                  Possible TFs      NES
1     transfac_pro-M01656                  V$P63_01                     TP63, TP53, TP73  6.19819
2     taipale-RACATGYCNNGRCATGTY-Tp53-DBD  RACATGYCNNGRCATGTY-Tp53-DBD  TP63, TP53        5.97630
3     transfac_pro-M01651                  V$P53_03                     TP63, TP53, TP73  5.79805
…
7     lcbTfbs_mcf7_p53_nutlin              lcbTfbs_mcf7_p53_nutlin                        5.34445
…
111   ENCFF001WPZ                          DNase-seq on human MCF-7                       3.11541
167 candidate regions

Using i-cisTarget to decipher the p53 targetome
Validation with p53 ChIP-seq with Nutlin-3a
Genome browser tracks at CDKN1A: ChIP-seq p53 stimulated, ChIP-seq p53 input, RNA-seq p53 stimulated, RNA-seq p53 not stimulated, RefSeq genes

More case studies in https://gbiomed.kuleuven.be/apps/lcb/i-cisTarget-NAR/index.php#examples

           iRegulon       i-cisTarget
Input      Genes          Genes or Regions
Output     Upstream TFs   Upstream TFs
           Target Genes   Target Regions & The Closest Genes
                          Enriched Regulatory Tracks (not TFs)
Databases  PWM            PWM
           TF ChIP-seq    TF ChIP-seq
                          Histone mark ChIP-seq
                          DHS/FAIRE
Interface  Cytoscape App  Web server