Reconstruction of Gene Regulatory Networks from RNA-Seq Data
Jianlin Jack Cheng
Computer Science Department
University of Missouri, Columbia
ACM-BCB, 2014
Big Data Challenge in Genomic Era
[Figure: workflow diagram with the labels Biological System; Biological Experiments (DNA/RNA Sequencing, Mass Spectrometry); Omics Data (Genomics, Transcriptomics, Proteomics, Metabolomics, …); Analysis; Knowledge]
Expression Profiles of Genes under Multiple Conditions / Time Points
[Table: genes (Gene 1, Gene 2, Gene 3, Gene 4, …) in rows and conditions (Con 1 to Con 8, …) in columns; sample entries: 10, 40, 35, 20, 100, 5, 60, …, 30]
Gene Regulatory Networks (GRN)
Transcription factor (TF) regulates a gene
GRN of yeast in rich medium
Gene regulatory module
TF1
TF2
TF3
Bar-Joseph et al., 2003
Bayesian Probabilistic Modeling
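• Assign genes into co-regulated modules
• Construct regulatory relations of each module
$$\mathrm{GRN}^{*} = \arg\max_{\mathrm{GRN}} P(\mathrm{GRN} \mid D) = \arg\max_{\mathrm{GRN}} P(D \mid \mathrm{GRN})\, P(\mathrm{GRN})$$
Posterior: P(GRN | D); Likelihood: P(D | GRN); Prior: P(GRN)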
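The step from posterior to likelihood times prior is Bayes' rule with the evidence term dropped. A short derivation follows, where the symbol GRN* for the selected network and the log form are notational assumptions, not taken from the slides:

```latex
\begin{align}
P(\mathrm{GRN} \mid D) &= \frac{P(D \mid \mathrm{GRN})\, P(\mathrm{GRN})}{P(D)} \\
\log P(\mathrm{GRN} \mid D) &= \log P(D \mid \mathrm{GRN}) + \log P(\mathrm{GRN}) - \log P(D) \\
\mathrm{GRN}^{*} &= \arg\max_{\mathrm{GRN}} \bigl[ \log P(D \mid \mathrm{GRN}) + \log P(\mathrm{GRN}) \bigr]
\end{align}
```

Because P(D) does not depend on the network, it can be ignored when comparing candidate networks.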
Gene Regulatory Network Modeling
Zhu et al., 2013
Gene Regulatory Logic of a Gene Module as a Decision Tree
[Figure: a binary regulatory tree of transcription factors drawn above the expression matrix of one gene module (rows gene 1, gene 2, gene 3, …, gene n; biological conditions (treatments) in columns); cell shading ranges from high expression to low expression]
Regulatory Tree Construction
• Pick a TF
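• Divide conditions into two subsets based on the TF's expression states
• Calculate probability with a Gaussian mixture:
$$p(g_i) = \prod_{k=1}^{s} \prod_{j \in S_k} \frac{1}{\sqrt{2\pi}\,\sigma_k}\, e^{-\frac{(x_{ij} - \mu_k)^2}{2\sigma_k^2}}$$
[Figure: genes g1, g2, …, gi, …, gn of the module; the two condition subsets carry parameters (μ1, σ1) and (μ2, σ2)]
Zhu et al., 2013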
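The probability on this slide can be evaluated directly once the conditions are split. The sketch below is a minimal illustration, not the authors' implementation. It reads x_ij as the expression of gene g_i under condition j, S_k as the k-th subset of conditions, and (μ_k, σ_k) as the Gaussian parameters of that subset, which is an interpretation of the slide's notation. It works in log space to avoid underflow; the function names and the toy numbers are hypothetical.

```python
import math

def gaussian_logpdf(x, mu, sigma):
    """Log density of the normal distribution N(mu, sigma^2) at x."""
    return -math.log(math.sqrt(2 * math.pi) * sigma) - (x - mu) ** 2 / (2 * sigma ** 2)

def gene_log_likelihood(expr, subsets, params):
    """log p(g_i): sum over subsets S_k and over conditions j in S_k.

    expr    -- expression values x_ij of one gene, indexed by condition j
    subsets -- list of lists of condition indices (the S_k)
    params  -- list of (mu_k, sigma_k) pairs, one per subset
    """
    total = 0.0
    for cond_idx, (mu, sigma) in zip(subsets, params):
        for j in cond_idx:
            total += gaussian_logpdf(expr[j], mu, sigma)
    return total

# Hypothetical toy data: one gene measured under four conditions.
expr = [0.1, -0.1, 2.0, 2.2]
subsets = [[0, 1], [2, 3]]          # e.g. conditions where the TF is low vs high
params = [(0.0, 1.0), (2.0, 1.0)]   # assumed (mu_k, sigma_k) per subset
log_p = gene_log_likelihood(expr, subsets, params)
```

Summing log densities is equivalent to taking the log of the double product in the formula.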
Regulatory Tree Construction
• Repeat at next level
Gaussian mixture:
$$p(g_i) = \prod_{k=1}^{s} \prod_{j \in S_k} \frac{1}{\sqrt{2\pi}\,\sigma_k}\, e^{-\frac{(x_{ij} - \mu_k)^2}{2\sigma_k^2}}$$
[Figure: genes g1, g2, …, gi, …, gn of the module]
Zhu et al., 2013
Regulatory Tree Construction Algorithm
• Pick a TF
• Divide conditions based on TF states
• Calculate likelihood
• Select TF maximizing likelihood
• Repeat
Gaussian mixture:
$$p(g_i) = \prod_{k=1}^{s} \prod_{j \in S_k} \frac{1}{\sqrt{2\pi}\,\sigma_k}\, e^{-\frac{(x_{ij} - \mu_k)^2}{2\sigma_k^2}}$$
[Figure: genes g1, g2, …, gi, …, gn of the module]
Zhu et al., 2013
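The steps listed above can be sketched as a greedy recursive procedure. This is an illustrative sketch under stated assumptions, not the published algorithm. It assumes TF states are already discretized to 0/1 per condition, that each subset is scored with one Gaussian fitted by maximum likelihood, and that recursion stops at a fixed depth or when a subset would be too small. The names `build_tree`, `tf_states`, `min_size` and the toy data are hypothetical.

```python
import math

def fit_gaussian(values, min_sigma=1e-3):
    """Maximum-likelihood mean and (floored) standard deviation."""
    mu = sum(values) / len(values)
    var = sum((v - mu) ** 2 for v in values) / len(values)
    return mu, max(math.sqrt(var), min_sigma)

def log_likelihood(values):
    """Log-likelihood of values under one Gaussian fitted to them."""
    mu, sigma = fit_gaussian(values)
    return sum(
        -math.log(math.sqrt(2 * math.pi) * sigma) - (v - mu) ** 2 / (2 * sigma ** 2)
        for v in values
    )

def module_values(module, conditions):
    """All x_ij for genes i in the module and conditions j in the subset."""
    return [gene[j] for gene in module for j in conditions]

def build_tree(module, tf_states, conditions, depth=2, min_size=2):
    """Greedy binary tree: at each node keep the TF whose split scores best.

    module     -- list of gene expression vectors (one value per condition)
    tf_states  -- dict: TF name -> list of 0/1 states, one per condition
    conditions -- condition indices reaching this node
    """
    leaf = {"leaf": conditions}
    if depth == 0:
        return leaf
    best = None
    for tf, states in tf_states.items():
        low = [j for j in conditions if states[j] == 0]
        high = [j for j in conditions if states[j] == 1]
        if len(low) < min_size or len(high) < min_size:
            continue  # skip splits that leave a subset too small to fit
        score = (log_likelihood(module_values(module, low))
                 + log_likelihood(module_values(module, high)))
        if best is None or score > best[0]:
            best = (score, tf, low, high)
    if best is None:
        return leaf
    _, tf, low, high = best
    return {
        "tf": tf,
        "low": build_tree(module, tf_states, low, depth - 1, min_size),
        "high": build_tree(module, tf_states, high, depth - 1, min_size),
    }

# Hypothetical toy module: two genes, four conditions, two candidate TFs.
module = [[0.1, 0.2, 2.0, 2.1], [0.0, 0.3, 1.9, 2.2]]
tf_states = {"TF1": [0, 0, 1, 1], "TF2": [0, 1, 0, 1]}
tree = build_tree(module, tf_states, [0, 1, 2, 3], depth=1)
```

In this toy case TF1 separates the low-expression conditions from the high-expression ones, so it is selected at the root. A TF already used higher in the tree is skipped automatically further down, because one side of its split is then empty.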
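Gene Re-Assignment
[Figure: the regulatory tree of a module, with leaf parameters (μ1, σ1), (μ2, σ2), …, shown beside the expression values of gene gi (e.g. 0.3, 0.2, 1.5, …)]
$$p(g_i) = \prod_{k=1}^{s} \prod_{j \in S_k} \frac{1}{\sqrt{2\pi}\,\sigma_k}\, e^{-\frac{(x_{ij} - \mu_k)^2}{2\sigma_k^2}}$$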
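One plausible reading of this slide is that each gene is scored against the regulatory tree of every module and moved to the module that gives it the highest probability. The sketch below illustrates that reading only; the assignment rule, the function names, and all numbers are assumptions, not details confirmed by the slides.

```python
import math

def gene_log_likelihood(expr, leaves):
    """log p(g_i) under one module's tree.

    expr   -- expression values x_ij of gene i, indexed by condition j
    leaves -- list of (condition_indices, mu_k, sigma_k), one per tree leaf
    """
    total = 0.0
    for conditions, mu, sigma in leaves:
        for j in conditions:
            total += (-math.log(math.sqrt(2 * math.pi) * sigma)
                      - (expr[j] - mu) ** 2 / (2 * sigma ** 2))
    return total

def reassign(expr, modules):
    """Return the name of the module whose tree scores the gene highest."""
    return max(modules, key=lambda name: gene_log_likelihood(expr, modules[name]))

# Hypothetical modules: each tree is summarized by its leaves, i.e. the
# condition subsets S_k and their Gaussian parameters (mu_k, sigma_k).
modules = {
    "module_A": [([0, 1], 0.3, 0.2), ([2, 3], 1.5, 0.2)],
    "module_B": [([0, 2], 0.3, 0.2), ([1, 3], 1.5, 0.2)],
}
gene = [0.2, 0.4, 1.4, 1.6]
best = reassign(gene, modules)
```

Here the gene is low in conditions 0 and 1 and high in conditions 2 and 3, which matches the leaves of `module_A`, so that module is selected.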
RNA-Seq Data of Soybean Nodulation
• Soybean is an important source of protein and oil
• Nitrogen fixation is enabled by soybean-rhizobia symbiotic interactions
[Figure: nodule]
Gene Regulatory Modules of Differentially Expressed Genes
[Figure: one out of 10 modules]
• A TF functioning in nodulation according to the literature
• NSP, whose homologous protein is a nodulation signaling protein in rice
Zhu et al., 2013
Application to Other Species
Helix-loop-helix transcription factor 2
• Arabidopsis
• Drosophila
• Mouse
• Human
• …
Soybean proteins affect TWIST2 – a novel protein related to kidney disease?
Acknowledgements
Students
• Deb Bhattacharya
• Renzhi Cao
• Jie Hou
• Jilong Li
• Matt Spencer
• Trieu Tuan
• Mingzhu Zhu
Collaborators
Jim Birchler, Bill Folk, Kevin Fritsche, Michael Greenlief,
Zezong Gu, Mark Hannink, Trupti Joshi, Dennis Lubahn,
Valeri Mossine, Alan Parrish, Frank Schmidt, Gary Stacey,
Grace Sun, John Walker, Dong Xu
Binding Site Analysis
• MEME + TomTom used to identify two binding sites: BetaBetaAlpha-zinc finger and Leucine Zipper
• The GRAS family of TFs contains proteins that bind to the motifs
Function Enrichment Validation
Function predicted by MULTICOM-PDCN
P-value calculated by hypergeometric distribution.
Some functions are related to formation of nodule organ.
Zhu et al., 2013
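A hypergeometric enrichment p-value of the kind mentioned above can be computed as an upper-tail probability. The following is a minimal sketch using only the Python standard library. The parameter names and the toy counts are hypothetical, and the exact test setup used in the study (background set, one-sided tail) is an assumption.

```python
from math import comb

def hypergeom_enrichment_pvalue(N, K, n, k):
    """P(X >= k) for X ~ Hypergeometric(N, K, n).

    N -- genes in the background set
    K -- background genes annotated with the function
    n -- genes in the module
    k -- module genes annotated with the function
    """
    upper = min(K, n)
    total = comb(N, n)
    # Sum the probabilities of drawing i annotated genes, for i = k..upper.
    return sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1)) / total

# Hypothetical counts: 20 background genes, 5 annotated, module of 5 with 3 annotated.
p = hypergeom_enrichment_pvalue(N=20, K=5, n=5, k=3)
```

A small p-value indicates that the module contains more genes with the function than expected by chance from the background.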
Protein Interaction and Literature Validation
I: TF-TF interactions by STRING, L: Literature Function Support
Zhu et al., 2013
Computational Model Evaluation
GRN of Human Prostate Cancer Under Botanical Treatments
Lu et al., submitted
Li et al., submitted.