Lecture 17 Expanded Genetic Code

Lecture 17: Introduction of non-natural amino acids and bases
Introduction of noncannonical amino acids into proteins
1) synthetic peptides (<50 amino acids)
2) native chemical ligation (allows the generation of larger proteins
containing synthetic elements)
Intein ligation
O
HS
N
H
HS
O
H 2N
SR
-Thiol ester
-Made by solid-phase
synthesis
N
NH 2
acyl donor
O
-N-terminal Cys
fragment
NH 2
H
N
HS
O
From resins that look like:
N
H
intein
S
splice junction
N
R
C-extein
O
O
O
splice junction
HSR
O O
O
S
intein
S
intein
Inteins in protein splicing:
N-extein
O
O
S shift
O
O
N-extein
intein
H 2N
O
O
O
S
S
O
S
N-extein
H 2N
C-extein
O
H 2N
intein
thiol to aminoacyl migration
NH
NH 2
cyclization
HS
O
O
N-extein
O
S
H 2N
NH
HS
NH 2
H
N
S
requires cysteine
residue at site of
ligation
(can be removed
chemically)
HS
O
N-extein
N
H
(Cys)
C-extein
C-extein
O
C-extein
Large protein:
200 aas
O
Cys
O
O
P
O
Cys
biosynthetically made
3) Biosynthesis of proteins with unnatural amino acids
-64 triplets code for 20 amino acids
-3 triplets code for stop (nonsense codon)
Process:
1) "blank" codons (nonsense, four base, rare codons, condon/tRNA deleted E. coli
genome, etc.)
2) Engineer a tRNA that is orthogonal to all other tRNAs
3) Evolve an aminoacyl-tRNA synthetase to uniquely recognize this tRNA
4) Evolve a synthetase to uniquely charge this tRNA with the 21st amino acid
5) Biosynthesize or transport amino acid (most amino acids are transported into bacteria
as the monomer or LysX dipeptide)
Lys
Phe
triplet codons each
AAAU
YYYXXX
UU
HO
B
O
O
Y'Y'Y'
YYY
P
encode amino acid
ZZZ
A
migration from A to P site
O
O
HO
anticodoncodon WC
base pairing
interaction
B
H
N
R'
O
R HO
O
X'X'X'
XXX
P site
NH 2
O
O
HO
O
B
Peptidyl transferase
-broad substrate specificy
Y'Y'Y'
YYY
A site
-tRNA xxx
P
AA
tRNA
XXX
X'X'X'
B
anticodon
codon
O
O
Y'Y'Y'
YYY
A
-to add another amino acid you need a unique codon that codes for the
new amino acid, as well as a tRNA specific for it
-use the stop codon TAG (UAG) to encode unnatural amino acids
Usually:
"stop"
UAG
Nonsense sequence tRNA cannot be recognized by endogenous
synthetases
identity element in
every tRNA except
for proline in E. coli
.aaRS
G C
Instead use UAA as the stop codon, and UAG to code for unnatural amino acids
UAA
UAG
2) Build a tRNA that recognizes UAG
R
O
HO
B
20 natural aaRS
Archae bacteria
NH 2
O
tRNA. aaRS pair from archae
O
C G
identity elements distinct from E. coli
Must be
functional
AUC
UAG
If you import a tRNA. aaRS pair from archae into E. coli, they do not cross
react with E.coli tRNAs or synthetases but are functional in translation
anticodon
codon
3) Evolve an aminoacyl-tRNA synthetase (aaRS) to uniquely recognize
this tRNA and no endogennous E. coli tRNAs
Tyrosyl tRNA synthetase:
O
O
O
+PP
i
NH 2
ATP + Tyr
AMP
P
NH 2
O
O
O
HO
O
R
R
HO
R
OH
R
O
A
C TyrRS (archae)
C
Tyr
A tRNA
CG
tRNA
AUC
O
tRNA
But there is a problem:
When you change the anticodon to CUA, the identity uniqueness is lost,
and in addition to the Tyr . aaRS (archae) recognizing it, some E. coli aaRS
also recognize and load the tRNA. The solution is to create a large library
of tRNAs and use an in vitro selection scheme to identify an orthogonal one.
library of:
TAG
AA
TAG at permissive site
(accepts all 20 amino
acids)
barnase (RNAse)- a toixic
gene that kills cells when
non-orghogonal tRNAs
are recognized by host
aaRS (negative selection)
AUC
4) Evolve a synthetase to uniquely charge this tRNA with the 21st amino
acid and not endogenous host amino acids
Tyr
aaRS
AUC
UAG
aaRS:
Tyr
Want to substitute
Tyr with NAla
Tyr
NAla:
AMP
15-20 nucleotide
mutations to N (A,G,C,T)
UAG
ATP
OH
H 2N
-the only cells that live are ones in which the tRNA is not recognized by any E.
coli aaRS or are non functional in translation, i.e.:
-no longer recognized by archae Tyr . aaRS
-or no longer functional on the ribosome
20 mutations
O
-Generate a structure based randomized library where the 6 active residues
binding to tyrosine are randomized.
then:
Permissive site for NAla
AUC
TAG
Solution: Take all the tRNA winners and transform E. coli along with the
archae TyrRS
-lactamase
ampicillin
aaRS library
NAla
-NAla and 20 canonical amino
acids
tRNA winners
AUC
TAG
-lactamase
Tyr . aaRSarchae
-lactamase allows growth on
ampicillin - positive selection
Take all the winners- load NAla or endogenous
host amino acid
tolerates Tyr mutation at TAG site
Expanding the number of base pairs in DNA and RNA (Benner)
To remove aaRS that recognize endogenous host amino acids
take the winners :
Permissive site
aaRS library
TAG
barnase
T
G
X
C
Y
Stable, base pairs with itself, does not base pair
with A,G,C or T
-20 canonical amino acids,
but no NAla
IsoC
Anything remaining does not
take any canonical amino acids
so must be specific for NAla
AUC
A
O
H 2N
N
HN
N
NH 2 O
N
N
IsoG
T
Labelling:
N3
O
N
hv
H 2N
-3H bonds
-orthogonal H-bonding patterns
N
nitrene
(CH insertion affinity label)
COOH O
H 2N
COOH
NH
N
O
Using isoGTP, isoCTP, dTTP*(radio-labelled), dGTP, dCTP
360nm
CH insertion
H 2N
COOH
O 2N
N
H 2N
5'- GATTTTGACC(isoG)GCT*
3'- CTAAAACTGG(isoC)CGA
3'- CTAAAACTGG(isoC)CGA
Then using isoGTP, dCTP, dTTP*
Metal binding amino acids:
N
5'- GATTTTGA-OH
OMe
5'- GATTTTGA-OH
5'- GATTTTGACC(isoG) CT*
3'- CTAAAACTGGTGA
3'- CTAAAACTGG---T----GA
OMe
O
IsoC
M 2+
COOH
H 2N
N
HN
N
Photocaged amino acids:
innactive
Nucleophile
OH
O
NO 2
hv
-lactamase
O
O
OMe
OMe
N
N
IsoG
NH 2 O
T
hv
active
N
NH
N
O
H 2N
N
HO
N
N
N
IsoG
tautomer of IsoG,
has comparible
stability and pairs
with T
Expanding the number of base pairs in DNA and RNA (DiPic)
O
O
O
M 2+
N
DiPic
N
Py
O
O
O
To measure these look at Tm
X
Y
A
Tm
Temp
Tm(oC)
DiPic-Py
DiPic-A
DiPic (G,C,T)
Py (A,G,C,T)
40.4
36.1
<30
<30
Another way (that didn't work): use a tricycle and monocycle pair
longer
tricycle
shorter