Protein Structure Determination by NMR

Workflow:
Protein sample → NMR data collection → Processing of NMR data →
Resonance assignment → Conformational constraints →
3D protein structure → Structure analysis & refinement
Conformational constraints
NMR provides indirect information about 3D structure:
•  chemical shifts → dihedral angles
•  coupling constants → dihedral angles
•  NOEs → interproton distances
•  residual dipolar couplings → bond orientation
•  PRE → distances
NMR data describes local conformation of the
protein. The dense network of constraints yields
the protein 3D structure.
NOE constraints in structure determination
A dense network of distance restraints → NOEs
NOEs are the essential NMR data for defining the 2º and 3º
structure of a protein.
An NOE connects pairs of protons separated by less than 5 Å in
the protein through space (dipole-dipole interaction).
$V_{ij} = \alpha D_{ij}^{-6}$
Magnetization can also be transferred by “spin diffusion”.
The isolated spin-pair approximation
is valid for short mixing times, but
if the mixing time is too short, long-range
NOEs will not be present.
www.bmrb.wisc.edu
Quantification of NOEs → peak volume or intensity.
Care should be taken when dealing with overlapping peaks.
Distances are derived from cross-peak volumes by $D = (\alpha^{-1} V)^{-1/6}$.
NOEs are usually converted into upper and lower bounds on inter-atomic
distances, never into precise distance restraints, because of the
presence of internal motions. (Models for NOE analysis: isolated spin-pair
approximation and relaxation matrix.)
$L = \max(0, D - \Delta)$, $U = D + \Delta$ (where $\Delta = 0.125\,D^{2}$)
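The conversion from a cross-peak volume to distance bounds can be sketched in a few lines of Python. This is only an illustration of the two formulas on this slide; the function name and the example numbers are assumptions, not part of any NMR package.

```python
def noe_distance_bounds(volume, alpha):
    """Return (D, L, U) in Å for one NOE cross-peak.

    volume: integrated peak volume V
    alpha:  calibration factor in V = alpha * D**-6
    """
    if volume <= 0 or alpha <= 0:
        raise ValueError("volume and alpha must be positive")
    d = (volume / alpha) ** (-1.0 / 6.0)  # D = (alpha^-1 V)^(-1/6)
    delta = 0.125 * d ** 2                # error margin, Delta = 0.125 D^2
    lower = max(0.0, d - delta)           # L = max(0, D - Delta)
    upper = d + delta                     # U = D + Delta
    return d, lower, upper


# Illustrative input: a volume that corresponds to 3 Å when alpha = 1
d, lower, upper = noe_distance_bounds(volume=3.0 ** -6, alpha=1.0)
print(f"D = {d:.2f} Å, L = {lower:.3f} Å, U = {upper:.3f} Å")
```

Because Δ grows with D², weak NOEs (long distances) get much wider bounds than strong ones.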
Calibration of the NOEs is important.
→ Calibration using upper bounds: set 2.7 Å for strong NOEs, 3.3 Å for
medium NOEs, and 5.0 Å or 6.0 Å for weak NOEs.
→ From $V_{ij} = \alpha D_{ij}^{-6}$, $\alpha = V_{ij} / D_{ij}^{-6}$ can be determined on the
basis of known distances, e.g. Hα-HN, or from a previously
determined structure.
→ Averages over all distances < 3.5 Å from calculated structures as reference:
$\alpha = \dfrac{\sum_i V_i^{\mathrm{exp}}}{\sum_i V_i^{\mathrm{th}}}$
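A minimal sketch of the averaged calibration, assuming that the theoretical volume of a reference peak is simply D⁻⁶ (consistent with V = αD⁻⁶) and that only peaks with a reference distance below the 3.5 Å cutoff are used. The function name and the input numbers are made up for illustration.

```python
def calibration_factor(exp_volumes, ref_distances, cutoff=3.5):
    """Estimate alpha = sum(V_exp) / sum(V_th) over reference peaks.

    ASSUMPTION: V_th = D**-6 for each reference distance D (in Å);
    only distances below `cutoff` contribute.
    """
    if len(exp_volumes) != len(ref_distances):
        raise ValueError("need one reference distance per volume")
    pairs = [(v, d) for v, d in zip(exp_volumes, ref_distances) if d < cutoff]
    if not pairs:
        raise ValueError("no reference distance below the cutoff")
    sum_exp = sum(v for v, _ in pairs)
    sum_th = sum(d ** -6 for _, d in pairs)
    return sum_exp / sum_th


# Synthetic volumes generated with alpha = 5; the 4.0 Å peak is ignored
distances = [2.0, 3.0, 4.0]
volumes = [5.0 * d ** -6 for d in distances]
print(calibration_factor(volumes, distances))
```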
Assignment of the NOE peaks
The number of distance restraints used determines the precision of the structure.
For correct NOE assignments, three main tasks need to be performed:
→ Remove all artifacts (noise, water, T1-noise).
→ Ensure completeness (>90%) of the chemical shift table; the shifts
must of course also be correct.
→ Chemical shifts and NOE signals must be self-consistent within the
tolerance window Δωtol.
A wrong assignment in the chemical shift table will lead to a wrong structure.
Many NOEs can be “UNAMBIGUOUSLY” assigned, either “MANUALLY” or
“AUTOMATICALLY”, to two isolated spin systems, such as intra-residue,
sequential and some long-range NOEs. → Unambiguous restraints
Some NOEs can only be “AMBIGUOUSLY” assigned, either “MANUALLY” or
“AUTOMATICALLY”, to several spin systems, especially in overlapped regions
of the NOE spectrum. → Ambiguous restraints
Manual assignment: NMRView; automatic assignment: ARIA
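The difference between unambiguous and ambiguous peaks can be illustrated with a small chemical-shift matching sketch. The atom names, shifts and the 0.02 ppm tolerance below are invented for the example; real programs use per-dimension tolerances and heavy-atom dimensions as well.

```python
def candidate_assignments(peak, shifts, tol=0.02):
    """List all proton pairs whose shifts match a 2D NOESY peak.

    peak:   (w1, w2) peak position in ppm
    shifts: dict mapping atom label -> chemical shift in ppm
    tol:    tolerance window in ppm (hypothetical value)
    """
    w1, w2 = peak
    first = [a for a, s in shifts.items() if abs(s - w1) <= tol]
    second = [a for a, s in shifts.items() if abs(s - w2) <= tol]
    return [(a, b) for a in first for b in second if a != b]


# Hypothetical chemical shift table
shifts = {"A12.HN": 8.21, "L45.HA": 4.35, "V46.HA": 4.36, "G7.HA2": 3.90}

for peak in [(8.21, 3.90), (8.21, 4.355)]:
    options = candidate_assignments(peak, shifts)
    kind = "unambiguous" if len(options) == 1 else "ambiguous"
    print(peak, kind, options)
```

A peak with exactly one candidate pair yields an unambiguous restraint; a peak with several candidates yields an ambiguous one.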
Ambiguous Distance Restraints
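Isolated spin-pair approximation:
$\mathrm{NOE} \propto d^{-6}$
Peak 1: $\mathrm{NOE}_1 \propto d_1^{-6}$
Peak 2: $\mathrm{NOE}_2 \propto d_2^{-6}$
$\mathrm{NOE}_1 + \mathrm{NOE}_2 \propto d_1^{-6} + d_2^{-6}$
$\mathrm{NOE}_{12} \propto d_{\mathrm{eff}}^{-6}$
$d_{\mathrm{eff}} = (d_1^{-6} + d_2^{-6})^{-1/6}$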
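As a quick numerical illustration of the effective distance, the sketch below generalizes the two-contribution formula to any number of contributions (a straightforward extension, stated here as an assumption). The distances are arbitrary examples.

```python
def effective_distance(distances):
    """d_eff = (sum of d_i**-6) ** (-1/6) for all contributing proton pairs."""
    if not distances:
        raise ValueError("at least one distance is required")
    return sum(d ** -6 for d in distances) ** (-1.0 / 6.0)


# Two equal contributions give a d_eff shorter than either distance
print(effective_distance([3.0, 3.0]))
# A short and a long contribution: d_eff is dominated by the short one
print(effective_distance([2.5, 6.0]))
```

Because of the d⁻⁶ weighting, d_eff is never larger than the shortest contributing distance.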
For each contribution k to an ambiguous NOE, the minimum distance ($D_{\min}^{k}$)
in the ensemble of structures is determined.
The contribution $C_k$ of assignment k to the cross-peak is estimated as:
$C_k = \dfrac{(D_{\min}^{k})^{-6}}{\sum_{i}^{N(F_1,F_2)} (D_{\min}^{i})^{-6}}$
The $N_p$ contributions that are kept satisfy:
$\sum_{i}^{N_p} C_i > p$ (p is user-defined)
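The pruning of ambiguous assignments can be sketched as follows. The sketch assumes that contributions are taken in order of decreasing size until their cumulative sum exceeds p; the labels, distances and the value p = 0.95 are illustrative only.

```python
def prune_contributions(d_min, p=0.95):
    """Keep the largest contributions until their sum exceeds p.

    d_min: dict mapping assignment label -> minimum distance (Å)
           found for that assignment in the structure ensemble
    p:     user-defined cutoff between 0 and 1
    """
    weights = {k: d ** -6 for k, d in d_min.items()}
    total = sum(weights.values())
    contributions = {k: w / total for k, w in weights.items()}  # C_k

    kept, running = [], 0.0
    # ASSUMPTION: contributions are accumulated from largest to smallest
    for k, c in sorted(contributions.items(), key=lambda kv: kv[1], reverse=True):
        kept.append((k, c))
        running += c
        if running > p:
            break
    return kept


kept = prune_contributions({"A": 3.0, "B": 4.0, "C": 8.0}, p=0.95)
for label, c in kept:
    print(f"{label}: C = {c:.3f}")
```

In this example the 8 Å possibility contributes well under 1% and is discarded, so the restraint keeps two assignment options instead of three.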
Network-anchored assignment
Network-anchoring exploits the fact that any network
of correct NOE peak assignments forms a self-consistent set.
Each initial assignment is weighted by the extent to
which it can be embedded into the network formed by
all other NOE peak assignments.
Network-anchoring evaluates the self-consistency of NOE assignments
independently of knowledge of the 3D
structure, and thus compensates for the
absence of 3D structural knowledge
at the outset of a de novo structure
calculation (cycle 1).
Chemical shift indexing
[Figure: chemical shift index output with helix and strand regions labelled]
Wishart CSI: http://www.bionmr.ualberta.ca/bds/software/csi/latest/csi.html
PECAN: http://bija.nmrfam.wisc.edu/PECAN
Talos+: http://spin.niddk.nih.gov/bax/software/TALOS/
CSI gives information about secondary structure,
given the Hα, C’, Cα, Cβ chemical shift assignments.
This information is very useful when doing MANUAL assignments.
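The idea behind the chemical shift index can be sketched for Hα shifts alone. The sketch assumes the commonly quoted rules (±0.1 ppm threshold on the Hα secondary shift, at least four consecutive -1 values for a helix, at least three consecutive +1 values for a strand) in a simplified, strictly consecutive form; it is not the CSI program itself, and the input values are invented.

```python
from itertools import groupby


def csi_indices(secondary_shifts, threshold=0.1):
    """Map Halpha secondary shifts (observed - random coil, ppm) to -1/0/+1."""
    return [1 if d > threshold else -1 if d < -threshold else 0
            for d in secondary_shifts]


def csi_secondary_structure(indices, min_helix=4, min_strand=3):
    """Label runs of -1 as helix (H) and runs of +1 as strand (E).

    Simplification: runs must be strictly consecutive.
    """
    labels = []
    for value, run in groupby(indices):
        n = len(list(run))
        if value == -1 and n >= min_helix:
            labels.append("H" * n)
        elif value == 1 and n >= min_strand:
            labels.append("E" * n)
        else:
            labels.append("-" * n)
    return "".join(labels)


# Hypothetical Halpha secondary shifts for nine residues
deltas = [-0.30, -0.25, -0.40, -0.20, 0.00, 0.30, 0.40, 0.20, 0.05]
print(csi_secondary_structure(csi_indices(deltas)))
```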
NOEs characteristic of β-strands
1. NH(i) → NH’(i) (opposite strand)
2. NH(i) → Hα(i-1) (same strand)
3. NH(i) → H’α(i+1) (opposite strand)
4. Hα(i) → Hα’(i) (opposite strand)
NOEs characteristic of α-helix
NOE                  α-helix        3_10-helix
Hα(i) → NH(i+4)      medium         not present
Hα(i) → NH(i+3)      strong         medium
Hα(i) → Hβ(i+3)      strong         weak
Hα(i) → NH(i+2)      not possible   medium
NH(i) → NH(i+1)      strong         strong
NH(i) → NH(i+2)      weak           weak
Manual Assignment using NMRViewJ
[Figure: NMRViewJ screenshots showing the NH(i) → Hα(i-1), NH(i) → NH’(i) and NH(i) → H’α(i+1) cross-peaks; on-screen settings read: i.h* j.h* i.n, D3, 0.05, 0.05, 0.5]
Assign NOEs Using NmrViewJ
Auto-Assign NOEs Using NmrViewJ
Auto; AutoP; Manual
Dihedral angle constraints
Protein Secondary Structure and Carbon Chemical Shifts
Backbone φ and ψ from TALOS+
à Given the Hα, N, C’, Cα, Cβ chemical shift assignments and primary
sequence.
à Compares the secondary chemical shifts (as tri-peptide) against
database of chemical shifts and associated high-resolution structure.
Molecular dynamics simulation
MD numerically solves Newton's equations of motion in order to obtain
a trajectory for the molecular system.
'Standard' MD tries to simulate the behaviour of a real physical
system as closely as possible.
MD used for NMR structure calculation searches the conformational
space of the protein for the 3D structure that fulfills all the
restraints (force field),
using simulated annealing with a target energy function:
$E = \sum_i w_i E_i = w_{\mathrm{bond}} E_{\mathrm{bond}} + w_{\mathrm{angle}} E_{\mathrm{angle}} + w_{\mathrm{dihedral}} E_{\mathrm{dihedral}} + w_{\mathrm{improper}} E_{\mathrm{improper}} + w_{\mathrm{vdW}} E_{\mathrm{vdW}} + w_{\mathrm{NOE}} E_{\mathrm{NOE}} + w_{\mathrm{RDC}} E_{\mathrm{RDC}} + w_{\mathrm{torsion}} E_{\mathrm{torsion}}$
An important difference between MD and gradient minimization of a
target function is the presence of kinetic energy.
From : Torsten Herrmann, Eidgenössische Technische Hochschule Zürich, Switzerland
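To make the weighted target function concrete, here is a small sketch. The flat-bottomed harmonic NOE term is an assumption chosen for simplicity (structure calculation programs use their own, more elaborate restraint potentials), and all weights and energies are made-up numbers.

```python
def noe_energy(distance, lower, upper):
    """Flat-bottomed harmonic penalty (ASSUMED functional form).

    Zero inside [lower, upper], quadratic outside.
    """
    if distance < lower:
        return (lower - distance) ** 2
    if distance > upper:
        return (distance - upper) ** 2
    return 0.0


def target_energy(terms, weights):
    """E = sum_i w_i * E_i over all energy terms."""
    return sum(weights[name] * energy for name, energy in terms.items())


# Hypothetical term energies for one conformation
terms = {
    "bond": 2.0,
    "angle": 3.5,
    "vdW": 1.2,
    "NOE": noe_energy(6.0, lower=1.8, upper=5.0),  # violated by 1 Å
}
weights = {"bond": 1.0, "angle": 1.0, "vdW": 1.0, "NOE": 50.0}
print(target_energy(terms, weights))
```

Raising the NOE weight relative to the covalent terms makes restraint violations dominate the target function, which is how the experimental data steer the search.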
Simulated Annealing
[Figure: energy landscape of the protein, E versus conformational coordinate x, at high temperature and at low temperature]
From : Torsten Herrmann, Eidgenössische Technische Hochschule Zürich, Switzerland
A starting structure is heated to a high temperature in a
simulation. During many discrete cooling steps the starting
structure can evolve towards the energetically favorable final
structure under the influence of a force field derived from the
constraints.
Simulated Annealing Protocol:
Step 1: High temperature, 10,000 K (1100 steps), torsion angle dynamics
Step 2: Cooling phase, 10,000 K to 2000 K (550 steps), torsion angle dynamics
Step 3: Cooling phase, 2000 K to 1000 K (5000 steps), Cartesian dynamics
Step 4: Cooling phase, 1000 K to 0 K (2000 steps), Cartesian dynamics
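The four-stage protocol above can be written down as a temperature schedule. The sketch assumes linear cooling within each stage, which the slide does not specify; the stage names are informal labels.

```python
# (label, dynamics type, start T in K, end T in K, number of steps)
STAGES = [
    ("high temperature", "torsion", 10000.0, 10000.0, 1100),
    ("cooling 1", "torsion", 10000.0, 2000.0, 550),
    ("cooling 2", "cartesian", 2000.0, 1000.0, 5000),
    ("cooling 3", "cartesian", 1000.0, 0.0, 2000),
]


def temperature_schedule(stages=STAGES):
    """Return a list of (label, dynamics, temperature) for every MD step.

    ASSUMPTION: temperature changes linearly within each stage.
    """
    schedule = []
    for label, dynamics, t_start, t_end, n_steps in stages:
        for step in range(n_steps):
            frac = (step + 1) / n_steps
            schedule.append((label, dynamics, t_start + frac * (t_end - t_start)))
    return schedule


schedule = temperature_schedule()
print(len(schedule), "steps in total")
print("first:", schedule[0])
print("last: ", schedule[-1])
```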
Methods for structure calculation
Simulated Annealing
–Molecular Dynamics in Cartesian Space
Degrees of freedom are the Cartesian coordinates of the atoms (3N
coordinates).
A starting structure is needed; it can be an extended chain.
Computational complexity: proportional to N
Adopted in CNS and XPLOR-NIH.
–Molecular Dynamics in Torsion Angle Space
Degrees of freedom are the torsion angles (n torsion angles).
Fixed bond lengths and fixed bond angles are imposed as constraints.
A starting structure is needed; it can be an extended chain.
Computational complexity: ∝ $n^{3}$ if solving a system of linear equations,
∝ n if solving as a tree structure
Faster structure calculations.
Most useful for bigger proteins, as it allows longer integration time steps.
Adopted in CNS, XPLOR-NIH and CYANA
Software for structure calculations
XPLOR, XPLOR-NIH
CNS
CYANA, DYANA/ATNOS
AMBER
Automated NOE assignment & Structure Calculation
ARIA/CNS : Ambiguous Restraints for Iterative
Assignments (http://aria.pasteur.fr/) Latest ver. 2.3
CANDID/CYANA
PASD/XPLOR
Strategy for Structure Calculations
Manually assign the NOEs from the 15N-edited
NOESY spectrum that correspond to the secondary structure (obtained
from CSI).
Allow ARIA to assign the remaining peaks.
If the secondary structure looks OK, check the NOE
assignments made by ARIA and redo the structure calculation,
minimizing NOE violations.
Add the 13C NOESY peak list and let ARIA assign it, OR manually
assign the intra-residue and sequential NOEs.
Check all the assignments made by ARIA. Minimize NOE
violations and improve the precision of the structure.
Add the aromatic NOESY peak list. You can either assign it MANUALLY
or let ARIA assign it.
Disclaimer: This is how I do my structure calculations.
Criteria for NOE validation using chemical shift data
✓  Chemical shift agreement
✓  Network-anchoring
✓  Compatibility with intermediate structure
[Figure: NOE cross-peak at (ω1, ω2) matched to atom A (ωA) and atom B (ωB)]
From : Torsten Herrmann, Eidgenössische Technische Hochschule Zürich, Switzerland
ARIA Protocol
Automated NOE assignment and Structure Calculation.
[Figure: ARIA workflow cycle]
Input data: protein sequence, chemical shift list, NOESY peak lists
•  Conversion of data into XML
•  Graphical project setup
•  Setup of a new run
•  ARIA: automated NOE assignment, creates restraints list (Iter 0 to Iter 8)
•  Investigation of quality indices of the final solvent-refined structure ensemble
•  Inspection of report files and analysis of proposed assignments
•  Correction of input data: completion of NOE assignments, removal of noise
   peaks, adjustment of frequency windows
ARIA operates on an invariant peak list.
àAria Package uses Python as Scripting language, XML (eXtensible
Markup Language) as the data format.
àConversion script : to convert all input data (Chemical shift, Peak
lists, Sequence file) into XML format. Can read in data from XEASY,
SPARKY, NMRView etc.
àGUI interface to setup, very user-friendly. Create a “Project.xml”
file.
àData can be exchanged with other software packages using CCPN.
Input data using CCPN project file in Aria 2.2 and higher.
àOther data : Hydrogen Bond ( as CNS and XPLOR table format)
Example:
assign ( residue 97 and name HN ) ( residue 78 and name O ) 1.80 0.00 0.50
assign ( residue 97 and name N ) ( residue 78 and name O ) 2.80 0.00 0.50
assign ( residue 78 and name HN ) ( residue 97 and name O ) 1.80 0.00 0.50
assign ( residue 78 and name N ) ( residue 97 and name O ) 2.80 0.00 0.50
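Hydrogen-bond tables like the one above are easy to generate with a short script. The helper below is hypothetical (it is not part of ARIA or CNS) and simply reproduces the line layout and the 1.80/2.80 Å distances with 0.00/0.50 Å margins used in the example.

```python
def hbond_restraints(donor, acceptor, d_ho=1.80, d_no=2.80,
                     minus=0.00, plus=0.50):
    """Return the two restraint lines for one backbone H-bond.

    donor:    residue number of the NH group
    acceptor: residue number of the C=O group
    Line layout copied from the example table; values are its defaults.
    """
    lines = []
    for name, dist in (("HN", d_ho), ("N", d_no)):
        lines.append(
            f"assign ( residue {donor} and name {name} ) "
            f"( residue {acceptor} and name O ) "
            f"{dist:.2f} {minus:.2f} {plus:.2f}"
        )
    return lines


# A pair of residues that are H-bonded in both directions
for donor, acceptor in [(97, 78), (78, 97)]:
    print("\n".join(hbond_restraints(donor, acceptor)))
```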
àOther data : J-Coupling ( as CNS and XPLOR table format)
Example:
! Karplus restraints for phi from hnha
coefficients 6.40 -1.40 1.9 -60.00
assign (resid
(resid
assign (resid
(resid
1 and name C ) (resid 2 and name N )
2 and name CA) (resid 2 and name C )
2 and name C ) (resid 3 and name N )
3 and name CA) (resid 3 and name C )
7.300
0.200
6.400
0.200
àOther data : Residual Dipolar Coupling ( as CNS and XPLOR table format)
Example:
! residual dipolar coupling restraints
! allignment tensor components: R = 3.59 Da = 8.09
assign (resid 999 and name OO)
(resid 999 and name Z)
(resid 999 and name X)
(resid 999 and name Y)
(resid 12 and name N)
(resid 12 and name HN) 2.433 0.0000
Obtained from pales
àOther data : Dihedral Angle (as CNS and XPLOR table format or TALOS
table) Example: Use script to convert from TALOS+ to CNS format.
Input file for Talos+ can be made from NMRViewJ
VARS
RESID RESNAME PHI PSI DPHI DPSI DIST COUNT CLASS
FORMAT %4d %s %8.3f %8.3f %8.3f %8.3f %8.3f %2d %s
4
5
6
9
10
11
F -104.000
G
85.000
S -96.000
L -92.000
T -122.000
Y -112.000
140.000
13.000
135.000
136.000
147.000
140.000
40.000
11.000
35.000
35.000
24.000
33.000
31.000
12.000
31.000
26.000
25.000
17.000
13.510
17.530
15.120
17.450
11.340
11.910
! phi and psi dihedral restraint file generated by Talos2Aria.py
! TALOS filename:
! talos-phi-psi.tbl
! settings: min phiError=20, min psiError=20, errorFactor=2.0
! Talos derived phi restraint:
assign (resid 8 and name C)
(resid 9 and name N)
(resid 9 and name CA)
(resid 9 and name C)
1.0 -92 70 2
! Talos derived phi restraint:
assign (resid 10 and name C)
(resid 11 and name N)
(resid 11 and name CA)
(resid 11 and name C)
1.0 -112 66 2
5
4
6
8
6
8
Good
Good
Good
Good
Good
Good
α helix ϕ –60 ± 30
β sheet ϕ -120 ± 30
α helix ψ –45 ± 30
β sheet ψ 135 ± 30
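The TALOS+ to CNS conversion can be sketched as below. This is not the Talos2Aria.py script; it is a guess at the logic, assuming the allowed range is the larger of the minimum error and errorFactor × DPHI, which is consistent with the settings line and the values in the example (DPHI = 35 gives a range of 70).

```python
def phi_restraint(resid, phi, dphi, min_error=20.0, error_factor=2.0,
                  weight=1.0, exponent=2):
    """Format one CNS-style phi restraint for residue `resid`.

    phi is defined by the atoms C(i-1), N(i), CA(i), C(i).
    ASSUMPTION: range = max(min_error, error_factor * dphi).
    """
    allowed = max(min_error, error_factor * dphi)
    atoms = [(resid - 1, "C"), (resid, "N"), (resid, "CA"), (resid, "C")]
    lines = ["! Talos derived phi restraint:"]
    for i, (res, name) in enumerate(atoms):
        prefix = "assign " if i == 0 else "       "
        lines.append(f"{prefix}(resid {res} and name {name})")
    lines.append(f"       {weight:.1f} {phi:.0f} {allowed:.0f} {exponent}")
    return "\n".join(lines)


# Residue 9 (L) from the TALOS+ table: PHI = -92, DPHI = 35
print(phi_restraint(9, -92.0, 35.0))
```

A ψ restraint would follow the same pattern with the atoms N(i), CA(i), C(i), N(i+1).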
à Preparation Stage:
1) Filtering of Data: a) Checks for unique assignments (one atom & one
chemical shifts, b) degenerate chemical shifts assignments (one group of
equivalent atoms assigned to one chemical shifts and c) Stereo-specific
assignments : create a “floating chirality assignments”.
2) Create Molecular topology file (MTF) (CNS script “generate.inp”).
Create extended str. (PDB) (CNS script “generate_template.inp).
3) Initial NOE Assignments depending on the chemical shift tolerance.
Completeness of the chemical shift effects the initial assignment.
à Iterative Structure Calculations:
After each iteration ARIA analyzes the str. ensemble, calibrates
the spectra, detects inconsistent (violated) peaks.
Restraint is “violated” if the distance found in the ensemble str.
lies beyond a user defined ‘violation tolerance’ bounds.
Restraint is “violated” if the distance exceeds user-defined
‘violation threshold’ (0.5Å)
At each iteration ambiguous assignments are reduced using a
user-defined ‘ambiguity cutoff’ limits.
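A violation check over a structure ensemble might look like the sketch below. The parameter names and the rule that a restraint is flagged only when it is violated in more than a given fraction of the structures are assumptions made for illustration; consult the ARIA documentation for the exact definitions it uses.

```python
def is_violated(distances, lower, upper, tolerance=0.5, min_fraction=0.5):
    """Decide whether a distance restraint is violated in an ensemble.

    distances:    restrained distance (Å) measured in each structure
    lower, upper: restraint bounds (Å)
    tolerance:    allowed excursion beyond the bounds (Å), hypothetical name
    min_fraction: ASSUMED fraction of structures that must violate
    """
    if not distances:
        raise ValueError("empty ensemble")
    n_violated = sum(
        1 for d in distances
        if d < lower - tolerance or d > upper + tolerance
    )
    return n_violated / len(distances) > min_fraction


# Three-structure toy ensemble, restraint bounds 1.8 to 5.0 Å
print(is_violated([6.0, 6.1, 4.0], lower=1.8, upper=5.0))
print(is_violated([5.2, 5.3, 4.0], lower=1.8, upper=5.0))
```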
à Refinement in Explicit Solvents: in water (40 lowest energy str.)
In normal MD simulations, simplifies the non-bonded interactions.
Vander Waals and electrostatic potentials are not treated properly,
àAnalyses and Output file:
Analysis of NMR Restraints: done at each iteration
noe_restraints.ambig
list of ambiguous peaks
noe_restraints.assignments
list of all assignments
noe_restraints.merged
list of merged peaks
noe_restraints.unambig
list of unambiguous peaks
noe_restraints.violations
list of the distance restraint violations
noe_restraints.network
list of network-anchoring score
noe_restraints.xml
list of all restraints used for Str. Cal.
Peaklistname.assigned.xml
list of assigned peaks (one for each
peak list)
Creates unambig.tbl and ambig.tbl (final restraint list to be used in the
next iteration)
àAnalyses and Output file:
Analysis done by CNS (stored in ~it8/analysis/cns)
*.viol files have all the distance restraints violations
*.disp files have all the analysis results
(see for description
http://aria.pasteur.fr/documentation/use-aria/version-2.2/
copy_of_results-analysis/)
Analysis can also be done by in-house scripts freom it8 using the cns
output file (refine_xx.out) and from refine directory using the cns
output file (refine_water_xxx.out)
à Quality Check:
Can use WHAT IF, PROCHECK and PROSAII
Also independently use WHATCHECK
ARIA Directory Structure
Sequence.xml
shifts.xml and NoesyPeakList.xml
Hbond.tbl
Rdc.tbl
Dihedral.tbl
Unambig.tbl and ambig.tbl
Unambig.tbl and ambig.tbl
noe_restraints.xml
Refine_xx.out
Refine_water_xx.out
Thank You!!!