Sequence-comparison methods
Gerard Kleywegt
Uppsala University

Outline
• Why compare sequences?
• Dotplots
• Pairwise sequence alignments
• Multiple sequence alignments
• Profile methods

Pepparkaka stryktipset! (gingerbread-cookie football pools)

Why?
• Tells you if you have grasped some of the important concepts of the first two lectures about sequence-comparison methods
• Tells me if my teaching has been effective

Quiz rules
• Make a “stryktipsformulär” (football-pools form) on your PostIt note with room for 5 questions
• Answer the five questions
• Hand in the PostIt notes
• Everyone gets a pepparkaka for their trouble!
• We will discuss the answers to the questions
Form (rows: questions 1 to 5; columns: answers 1, X, 2):

       1    X    2
  1
  2
  3
  4
  5

Question 1
• Of the following three scientists, who is famous for his/her contributions to the development of a sequence alignment method?
– 1: Smith
– X: Jones
– 2: Needlepoint

Question 2
• Two proteins (of 100 residues each) have 23% sequence identity. Which statement is true:
– 1: the two proteins are likely to be homologous
– X: the two proteins are possibly homologous
– 2: the two proteins are not homologous

Question 3
• Given two DNA sequences that are each other’s inverse (e.g., GAATTC and CTTAAG), what does their dotplot look like?
– 1, X, 2: three candidate dotplots (figures not reproduced)

Question 4
• We want to do a standard Needleman-Wunsch alignment of the “sequences” CA and TA. A match scores +2, a mismatch -1, and a gap -1. What is the value of the missing element in the matrix?

       Δ    C    A
  Δ    0   -1   -2
  T   -1   -1   -2
  A   -2   -2  ???

– 1: 0
– X: -3
– 2: 1

Question 5
• We want to do a standard Needleman-Wunsch alignment of the “sequences” CA and TA. A match scores +3, a mismatch -1, and a gap -2. What is an optimal alignment?

       Δ    C    A
  Δ    0   -2   -4
  T   -2   -1   -3
  A   -4   -3    2

– 1: CA with TA
– X: -CA with T-A
– 2: C-A with -TA

Please hand in your answers …
… and have a cookie

Question 1
• Of the following three scientists, who is famous for his/her contributions to the development of a sequence alignment method?
– 1: Smith
– X: Jones
– 2: Needlepoint
• And the correct answer is … 1

Question 2
• Two proteins (of 100 residues each) have 23% sequence identity. Which statement is true:
– 1: the two proteins are likely to be homologous
– X: the two proteins are possibly homologous
– 2: the two proteins are not homologous
• And the correct answer is … X
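
As a reminder of what a figure such as "23% sequence identity" measures, here is a minimal sketch. The function name `percent_identity` is made up for illustration, and the convention used (identical residues divided by the number of gap-free aligned columns) is an assumption; other denominators are also in use.

```python
def percent_identity(aligned_a, aligned_b):
    """Percentage of gap-free aligned columns that hold identical residues.

    Assumption: both strings come from the same alignment (equal length)
    and '-' marks a gap.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    # Keep only columns in which neither sequence has a gap.
    pairs = [(x, y) for x, y in zip(aligned_a, aligned_b)
             if x != "-" and y != "-"]
    if not pairs:
        return 0.0
    identical = sum(1 for x, y in pairs if x == y)
    return 100.0 * identical / len(pairs)


# Toy example: CA aligned with TA has one identical column out of two.
print(percent_identity("CA", "TA"))  # 50.0
```

Under this convention, 23% identity for two gap-free aligned proteins of 100 residues corresponds to 23 identical positions.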

Question 3
• Given two DNA sequences that are each other’s inverse (e.g., GAATTC and CTTAAG), what does their dotplot look like?
– 1, X, 2: three candidate dotplots (figures not reproduced)
• And the correct answer is … X
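
The dotplot for the example sequences can be checked with a few lines of code. This is only a sketch: the function name `dotplot` and the text rendering (`*` for identical letters, `.` otherwise) are illustrative choices, and no window or threshold filtering is applied.

```python
def dotplot(seq_x, seq_y):
    """Return one string per letter of seq_y; column j is '*' where
    that letter equals seq_x[j], and '.' otherwise."""
    return ["".join("*" if y == x else "." for x in seq_x) for y in seq_y]


seq_x = "GAATTC"   # along the horizontal axis
seq_y = "CTTAAG"   # along the vertical axis (seq_x read backwards)

rows = dotplot(seq_x, seq_y)
print("  " + seq_x)
for letter, row in zip(seq_y, rows):
    print(letter, row)
# Prints:
#   GAATTC
# C .....*
# T ...**.
# T ...**.
# A .**...
# A .**...
# G *.....
```

Every cell on the anti-diagonal (top right to bottom left) is filled, which is the signature of a sequence compared with its own reverse; the remaining dots come from repeated letters.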



Question 4
• We want to do a standard Needleman-Wunsch alignment of the “sequences” CA and TA. A match scores +2, a mismatch -1, and a gap -1. What is the value of the missing element in the matrix?

       Δ    C    A
  Δ    0   -1   -2
  T   -1   -1   -2
  A   -2   -2  ???

– 1: 0
– X: -3
– 2: 1
• And the correct answer is … 2

Question 5
• We want to do a standard Needleman-Wunsch alignment of the “sequences” CA and TA. A match scores +3, a mismatch -1, and a gap -2. What is an optimal alignment?

       Δ    C    A
  Δ    0   -2   -4
  T   -2   -1   -3
  A   -4   -3    2

– 1: CA with TA
– X: -CA with T-A
– 2: C-A with -TA
• And the correct answer is … 1
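
The matrices of Questions 4 and 5 can be reproduced with a short sketch of the Needleman-Wunsch recurrence: each cell is the best of the diagonal neighbour plus the match/mismatch score, or the upper or left neighbour plus the gap score. The function name `needleman_wunsch` and its parameters are illustrative and not part of the lecture. It assumes a linear gap penalty, and the traceback returns only one optimal alignment, preferring the diagonal move when there are ties.

```python
def needleman_wunsch(a, b, match, mismatch, gap):
    """Fill the global-alignment score matrix and trace back one optimal
    alignment. Columns follow sequence a, rows follow sequence b."""
    n_rows, n_cols = len(b) + 1, len(a) + 1
    score = [[0] * n_cols for _ in range(n_rows)]

    # First row and column: aligning a prefix against gaps only.
    for j in range(1, n_cols):
        score[0][j] = j * gap
    for i in range(1, n_rows):
        score[i][0] = i * gap

    # Each remaining cell takes the best of the three possible moves.
    for i in range(1, n_rows):
        for j in range(1, n_cols):
            sub = match if a[j - 1] == b[i - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + sub,   # diagonal
                              score[i - 1][j] + gap,       # from above
                              score[i][j - 1] + gap)       # from the left

    # Traceback from the bottom-right cell to the top-left cell.
    top, bottom = [], []
    i, j = len(b), len(a)
    while i > 0 or j > 0:
        if i > 0 and j > 0:
            sub = match if a[j - 1] == b[i - 1] else mismatch
            if score[i][j] == score[i - 1][j - 1] + sub:
                top.append(a[j - 1])
                bottom.append(b[i - 1])
                i -= 1
                j -= 1
                continue
        if i > 0 and score[i][j] == score[i - 1][j] + gap:
            top.append("-")
            bottom.append(b[i - 1])
            i -= 1
        else:
            top.append(a[j - 1])
            bottom.append("-")
            j -= 1
    return score, "".join(reversed(top)), "".join(reversed(bottom))


# Question 4: match +2, mismatch -1, gap -1.
score4, _, _ = needleman_wunsch("CA", "TA", match=2, mismatch=-1, gap=-1)
print(score4[2][2])   # 1

# Question 5: match +3, mismatch -1, gap -2.
score5, top5, bottom5 = needleman_wunsch("CA", "TA", match=3, mismatch=-1, gap=-2)
print(score5)          # [[0, -2, -4], [-2, -1, -3], [-4, -3, 2]]
print(top5, bottom5)   # CA TA
```

For Question 4 the three candidates for the missing cell are -1 + 2 = 1 (diagonal, A matches A), -2 - 1 = -3 (from above) and -2 - 1 = -3 (from the left), so the maximum is 1.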

Results?
• How well did you do?
• Results for the whole class will be on the website tomorrow

BG349 - Class of 2008
• Results …
– 93% correct answers!!
– Distribution in table
– 5 correct - 4 students (67%)
– 4 correct - 2 students (33%)

  Question    1    X    2    ?
     1        4    -    2    -
     2        -    6    -    -
     3        -    6    -    -
     4        -    -    6    -
     5        6    -    -    -

MB330 - Class of 2008
• Results:
– 73% correct answers
– Distribution in table
• Correct answers: 1XX21
• “5 correct” - 7 students (17%)
• “4 correct” - 22 students (52%)
• “3 correct” - 7 students (17%)
• “2 correct” - 3 students (7%)
• “1 correct” - 3 students (7%)

  Question    1    X    2
     1       20    2   20
     2        3   32    7
     3        3   31    8
     4        5    1   36
     5       34    7    1

BG351 - Class of 2007
• Results …
– 64% correct answers
– Distribution in table
– 5 correct - 5 students (14%)
– 4 correct - 12 (32%)
– 3 correct - 12 (32%)
– 2 correct - 4 (11%)
– 1 correct - 2 (5%)
– 0 correct - 2 (5%)

  Question    1    X    2    ?
     1       13    2   15    7
     2        8   18    7    4
     3        5   24    4    4
     4        3    2   30    2
     5       33    1    1    2

MB330 - Class of 2007
• Results very good!
– 84% correct answers
– Distribution in table
• Correct answers: 1XX21
• “5 correct” - 9 students (45%)
• “4 correct” - 7 students (35%)
• “3 correct” - 4 students (20%)

  Question    1    X    2
     1       15    0    5
     2        1   17    2
     3        3   13    4
     4        0    0   20
     5       19    1    0

MB330 - Class of 2006
• Results very good!
– 89% correct answers
– Distribution in table
• Correct answers: 1XX21
• “5 correct” - 17 students (61%)
• “4 correct” - 7 students (25%)
• “3 correct” - 4 students (14%)

  Question    1    X    2
     1       22    3    3
     2        1   24    3
     3        2   25    1
     4        0    0   28
     5       26    1    1

MB330 over the years …

  Year    % Correct answers    % Students with 4 or 5 correct answers
  2006           89                          86
  2007           84                          80
  2008           73                          69
  2009            ?                           ?