Modelling protein unfolding: hen egg-white lysozyme

Protein Engineering vol.10 no.8 pp.895–903, 1997
M.A.Williams1,2, J.M.Thornton1 and J.M.Goodfellow3
Laboratory of Molecular Biology, Department of Crystallography,
Birkbeck College, Malet Street, London, WC1E 7HX, UK
1Also at the Biomolecular Structure and Modelling Unit, Department of
Biochemistry and Molecular Biology, University College London,
Gower Street, London WC1E 6BT, UK
2Present address: Molecular Structure Division, NIMR, The Ridgeway,
Mill Hill, London NW7 1AA, UK
3To whom correspondence should be addressed
A novel modelling procedure, which rapidly unfolds a
protein by enhancing solvent penetration of its core, was
used to investigate the unfolding pathway of hen egg-white
lysozyme. Early on the unfolding pathway there is a
dramatic disruption of the tertiary contacts within the
protein, which decouples its domains. Subsequently, the
helical domain slowly loses its compactness and the helices
fluctuate rapidly. The protein then adopts a ‘molten globule-like’ structure in which the native β-sheet is essentially
intact. The modelled structures have properties similar to
those of lysozyme’s experimentally characterized partially
folded states and provide insight into its complex
(un)folding process. The sequence of unfolding events shows
how the unfolding pathway of a multidomain protein may
be most similar to its fastest, but not necessarily its
dominant, folding pathway.
Keywords: folding intermediate/hen lysozyme/molecular
modelling/molten globule
© Oxford University Press
Downloaded from http://peds.oxfordjournals.org/ by guest on September 9, 2014
Introduction
Considerable progress has been made recently in obtaining
structural information about intermediates along the folding
pathways of proteins. A variety of experimental techniques,
principally H–D exchange NMR, circular dichroism and
fluorescence spectroscopy and protein engineering, have provided information about the changing environments and configurations of individual residues during the folding process
(Baldwin, 1993; Fersht, 1993; Evans and Radford, 1994;
Serrano, 1994). Structural models of the intermediate states
have then been built which are consistent with these data.
Unfortunately, the experimental information is insufficient to
determine high-resolution structures of the folding intermediates and we have therefore turned to molecular simulation
techniques in an attempt to model them.
An atomic resolution simulation of a protein folding from
an initial random coil state is currently computationally impracticable. Consequently, several researchers have recently
carried out simulations of the unfolding of proteins in aqueous
solution in order to derive models of the folding intermediates
(Daggett and Levitt, 1994). Modelling the unfolding process
is a tractable problem, as the system starts from a well defined
state whose stability can be artificially disrupted, precipitating
rapid structural change. This disruption is usually produced
by increasing the temperature of the simulation to very high
levels (225 and 327°C have both been used in several studies),
which dramatically increases the rate of unfolding, reducing
the total unfolding time to about 1 ns, compared with the
several milliseconds expected under experimental conditions.
We have developed a novel, alternative, method for unfolding
of a protein at experimentally attainable temperatures by
enhancing solvent penetration of the protein core. The thermodynamics of solvent–solvent and solvent–protein interactions
are vitally important in determining the structure and stability
of proteins (Dill, 1990; Levitt and Park, 1993). Experimentally,
protein unfolding is initiated by a change in the relative
thermodynamics of solvent–protein, solvent–solvent and protein–protein interactions which results in an increased propensity for the solvent molecules to interact with protein rather
than other solvent molecules. We aim to mimic the protein
unfolding process using molecular dynamics-based modelling
of the protein–solvent system at 80°C and encouraging the
protein to become more solvated.
As the protein structure fluctuates during the dynamics,
relatively large cavities form transiently within its core. To
promote unfolding, we periodically interrupt the dynamics and
insert water molecules into some of these cavities. We have
previously shown that a cavity within a protein is increasingly
likely to be hydrated as its size and the number of polar groups
which line the cavity increase (Williams et al., 1994). Thus
we chose to insert water molecules into those cavities which
are sufficiently large and polar that they could be hydrated
under conditions in which proteins unfold, but which would
not necessarily be occupied in the native structure at 25°C. In
an effort to mimic the diffusional process, we also require that
a new water molecule may only be inserted at a position less
than 4.5 Å from an existing water molecule. During the
subsequent period of molecular dynamics, the protein reacts
to the disturbance created by each new set of water molecules.
Each individual water molecule can stabilize or destabilize the
local structure or simply diffuse out of the cavity into the
surrounding bulk solvent. However, the net effect of this
enhanced solvent penetration protocol (detailed in Figure 1
and under Methods) is to cause the protein to unfold rapidly
at 80°C. This protocol can model the early stages of protein
unfolding using an order of magnitude less computational
effort than previously reported procedures.
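The proximity requirement described above can be illustrated with a short sketch. This is not the authors' INSERT program: the function names, the representation of each water molecule by a single point (for example its oxygen position) and the choice to test candidates only against waters present before the current round of insertions are assumptions made for illustration. Only the 4.5 Å threshold is taken from the text.

```python
import math

MAX_WATER_DISTANCE = 4.5  # Å, threshold quoted in the text


def near_existing_water(site, waters, max_dist=MAX_WATER_DISTANCE):
    """Return True if `site` lies less than `max_dist` Å from any existing water."""
    return any(math.dist(site, w) < max_dist for w in waters)


def select_insertion_sites(candidate_sites, waters, max_dist=MAX_WATER_DISTANCE):
    """Keep only candidate cavity sites that satisfy the proximity rule.

    Assumption: candidates are tested against the waters present before
    this round of insertions, not against each other.
    """
    return [s for s in candidate_sites if near_existing_water(s, waters, max_dist)]


# Toy example: one candidate is 3 Å from a water, the other 7 Å away.
candidates = [(0.0, 0.0, 0.0), (10.0, 0.0, 0.0)]
waters = [(3.0, 0.0, 0.0)]
selected = select_insertion_sites(candidates, waters)
```

In this toy case only the first candidate is retained, which mimics the idea that water reaches a cavity by diffusing from a neighbouring hydrated position.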
We chose to study the unfolding of hen egg-white lysozyme
using this novel methodology. Hen egg-white lysozyme
(HEWL) is an α + β protein with a large α-domain containing
four α-helices and a 3₁₀-helix and a smaller β-domain consisting of a triple-stranded anti-parallel β-sheet, an irregular
loop containing two disulphide bridges and a 3₁₀-helix (Figure
2). The protein contains a total of four disulphide bonds, all
of which remain intact in both the reported experimental and
our theoretical studies. Extensive experiments show that HEWL
has multiple folding pathways, that on the dominant folding
pathway the α-domain becomes substantially folded prior to
the β-domain and that a significant minority of molecules
follow a much faster route on which the domains fold at
similar rates (Radford et al., 1992; Dobson et al., 1994;
Kiefhaber, 1995). This protein has also been investigated by
other workers using several modelling protocols (Hunenberger
et al., 1995), including one of the pioneering applications of the
very high temperature method (Mark and van Gunsteren, 1992).
Methods
The modelling procedure is summarized in the flowchart shown
in Figure 1. An X-ray crystal structure of HEWL (1hel, Wilson
et al., 1992) is placed in a periodic box of water molecules
and all those within 2.3 Å of a protein atom are removed. The
AMBER/OPLS potential (Weiner et al., 1984; Jorgensen and
Swenson, 1985) is used to model the interactions of the protein
and the TIP3P potential (Jorgensen et al., 1983) those of the
water molecules. The charge states of the protein residues are
set at those expected for pH 7 (States and Karplus, 1987).
Interactions are set to zero for residues or molecules separated
by more than 8 Å. The dielectric constant is set equal to 1.
All molecular dynamics and minimization are carried out using
AMBER (Pearlman et al., 1991). The protein–water system is
initially simulated in this periodic manner for 10 ps of
molecular dynamics at 300 K and 1 atm pressure, using the
constant NPT algorithm of Berendsen et al. (1984) with a
coupling constant of 0.2 ps. The time step for each iteration
of the dynamics is 2 fs and SHAKE is used to fix the
length of covalent bonds involving hydrogen. Following this
‘relaxation’ phase, those water molecules further than 5 Å
from a protein atom are removed from the system, using
the program SHELL (M.A.W.), and the periodic boundary
conditions are no longer applied. The subsequent modelling of the
protein surrounded by its shell of waters is carried out with a
distance-dependent dielectric at constant temperature.

Fig. 2. Crystal structure of hen egg-white lysozyme (Wilson et al., 1992).
(a) Ribbon representation showing two α-helices (A and B) at the
N-terminal end of the protein (pale blue) leading into a three-stranded β-sheet and a large loop (green) which is followed by a helical segment (dark
blue) consisting of a 3₁₀-helix, two more α-helices (C and D) and the C-terminal 3₁₀-helix. The protein has four disulphide bonds (C6–C127, C30–C115, C64–C80, C76–C94), which are shown in yellow. The active site of
the enzyme lies in the cleft at the top right of the structure. (b) The
molecular surface of the crystal structure of HEWL in the same orientation.

Although this particular ‘solvent shell’ approach does not
precisely mimic the behaviour of a rigorous and computationally demanding periodic box simulation with long cut-off
distances or Ewald sum and does somewhat reduce the mobility
of surface residues, it has been shown to give good structural
agreement with both those more exact simulations and crystal
structures, producing molecular systems which are conformationally stable, conserve energy and exhibit very good hydrogen
bonding (Guenot and Kollman, 1992, 1993; Arnold and
Ornstein, 1994). We believe that the solvent shell protocol is
appropriate and adequate for our purpose of rapid generation
of structural models of partially unfolded proteins.
Following the relaxation phase, the system undergoes a
number of cycles of modelling, each involving a period of
dynamics followed by repair of the solvent shell and solvent
insertion into selected cavities within the protein. Preliminary
studies showed that the protein responds very rapidly to the
insertion of the new water molecules. The expansion of the
protein framework, the convergence of the total energy and
the formation of new cavities occur substantially in the first
Fig. 1. Flow diagram of the modelling procedure used to unfold the protein.
Results and discussion
Change in global properties during unfolding
The extent of the structural change undergone by HEWL
during unfolding is most simply monitored by the Cα root
mean square (r.m.s.) deviation from the crystal structure (Figure
3a) and changes in the radius of gyration Rg (Figure 3b). The
deviation of a ‘control’ modelling study of HEWL, carried out
at 25°C with no solvent insertion, remains below 2.0 Å r.m.s.
and the protein Rg increases by only 4% over 100 cycles of
modelling. Such changes are in line with those expected for a
protein which is effectively transferred from a crystal to a
solution environment (Smith et al., 1993).
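As a reminder of how the radius of gyration used here is defined, the following minimal sketch computes Rg as the root mean square distance of a set of points from their centroid, together with the percentage change relative to a reference structure. The text does not state which atoms were included or whether mass weighting was applied, so the unweighted form and the function names are assumptions.

```python
import math


def radius_of_gyration(coords):
    """Unweighted radius of gyration (same units as the input coordinates)."""
    n = len(coords)
    centre = [sum(c[i] for c in coords) / n for i in range(3)]
    mean_sq = sum(math.dist(c, centre) ** 2 for c in coords) / n
    return math.sqrt(mean_sq)


def percent_change(reference, value):
    """Percentage change of `value` relative to `reference`."""
    return 100.0 * (value - reference) / reference


# Toy example: four points on a unit circle, then uniformly expanded by 1.5.
native = [(1.0, 0.0, 0.0), (-1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, -1.0, 0.0)]
expanded = [(1.5 * x, 1.5 * y, 1.5 * z) for x, y, z in native]

rg_native = radius_of_gyration(native)
rg_expanded = radius_of_gyration(expanded)
change = percent_change(rg_native, rg_expanded)
```

A uniform expansion scales Rg by the same factor, so the toy structure shows a 50% increase; the 4% figure quoted for the control run would be obtained in the same way from the modelled coordinates.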
In contrast, under unfolding conditions, carried out at 80°C
with new water molecules inserted into suitable cavities, the
protein structure undergoes dramatic punctuated change. The
protein remains near its starting structure for the first 12 cycles
(structures A to B) then expands extremely rapidly (structures
B to C) to a state which is stable over the following 30 cycles
(structures C to D). It then undergoes a second slower expansion
(structures D to E), finally reaching another structural plateau
(structures E to F). Changes in the principal moments of inertia
reveal that the protein expands roughly equally in all directions
during unfolding. The β-sheet is maintained throughout the
unfolding, although there is some fraying at the ends of strands
(Figure 3c). In contrast, the α-helical content fluctuates more
and falls to 40–50% of the value for the crystal structure, with
the most rapid loss of helical content coinciding with the first
major expansion phase. The loss of helical structure is not
uniformly distributed, with helices C and D and the 3₁₀-helices
being much more disrupted than helices A and B.
The robustness of this model of the unfolding pathway was
investigated by repeating the unfolding procedure, as detailed
in Methods, using an alternative starting crystal structure (4lyz;
Diamond, 1974). The stages in the resulting alternative pathway
are very similar to those described above and the structures at
each stage have similar global properties. The only distinct
differences between the alternative and original pathways are
that the helices, particularly helix B, are less structured in the
alternative pathway following the first rapid expansion and
that consequently the alternative final structure is slightly more
expanded. Our modelling procedure appears to produce a
consistent unfolding pathway from different initial structures
and consequently we shall only describe the original pathway
in detail in this paper. However, partially unfolded structures
along the simulated pathway differ in detail, reflecting the
expected diversity of such structures in a real population.
The values of parameters used in our protocol, as detailed
in Methods, are not definitive, but were chosen to give
relatively rapid unfolding and produce well defined intermediate structures. If the modelling procedure is applied to
our original crystal structure, but using a different temperature
and/or with water insertion at different depths, the unfolding
path also passes through a similar sequence of stages, i.e.
rapid expansion followed by a structure with fluctuating helices
and a relatively stable sheet, with higher temperatures and
deeper insertion increasing the rate of unfolding. Inserting
water molecules into smaller and/or less polar cavities modifies
the unfolding path, as this disrupts more stable regions of
the protein.
Solvent penetration of the protein core
At the end of the first cycle of the unfolding procedure, only
two new water molecules are inserted, occupying cavities in
the protein core between the domains. The relaxation which
follows this insertion causes the protein to expand, opening
up new cavities into which water molecules may be inserted
in subsequent cycles. Cavities that are suitable for water
insertion do not occur on every cycle and few or no new water
molecules are inserted during each of the first 10 cycles (Figure
4a). The water molecules that are inserted disrupt the protein
structure slightly so that some other molecules are able to
diffuse from the surrounding solvent shell into the space
between the domains during the molecular dynamics phase.
Despite this, the structure at cycle 10 is still well packed, with
only a small increase in cavity volume (Figure 3d).
During the next few cycles, the protein undergoes rapid
structural change (structures B to C). The protein expands
dramatically, losing many close packing interactions within
the core and creating a large amount of empty space inside
the protein (Figure 3d) into which many new water molecules
are inserted (Figure 4a). The protein responds to these insertions
by expanding still further such that at the end of cycle 15, 44
new water molecules are inserted into the cavities in the
picosecond of dynamics, with little further change in the period
between 1 and 5 ps. Consequently, in line with our aim of
producing a protocol capable of rapidly unfolding proteins, we
decided that the dynamic phase of each cycle should consist
of 200 steps of steepest descents minimization followed by
only 1 ps of molecular dynamics. After the dynamics the
movement of protein and water often leaves gaps in the shell.
The SHELL program removes any waters that have moved
more than 5 Å from a protein atom, then immerses the protein
and its shell of waters in a large bath of water. Any of these
bath water molecules within 5 Å of a protein atom and more
than 2.3 Å from every protein atom or shell water molecule
is added to the shell. This repair process is vital as it maintains
a uniform shell of water around the protein as it expands
during the unfolding process. No water molecules are placed
in cavities at this stage. The structure of the protein and its
shell of water molecules is then analysed using PRO_ACT
(Williams et al., 1994), which identifies and characterizes the
empty cavities within the structure. When a cavity is found
which has a sufficiently high probability of being hydrated, a
water molecule is placed in an arbitrary orientation in that
cavity by the program INSERT (M.A.W.). Analysis of protein
crystal structures has shown that the probability of a cavity
being hydrated increases as its size and polarity increase
(Williams et al., 1994). In the modelling reported in detail
here, we chose to place water molecules in any cavity with a
probability of hydration of >20%. The new structure, with
repaired shell and possibly additional water molecules in its
interior cavities, is then simulated as described above and a
new cycle begins.
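The shell repair step described above can be sketched as follows. This is an illustrative reading of the text, not the SHELL program itself: each water is represented by a single point, "more than 5 Å from a protein atom" is interpreted as farther than 5 Å from the nearest protein atom, and bath waters already added during the repair are treated as shell waters in later overlap checks. The function names are hypothetical; only the 5 Å and 2.3 Å distances come from the text.

```python
import math

SHELL_RADIUS = 5.0    # Å, maximum distance of a shell water from the protein
MIN_SEPARATION = 2.3  # Å, minimum distance of a new water from protein or shell


def min_distance(point, others):
    """Distance from `point` to the nearest member of `others` (inf if empty)."""
    return min((math.dist(point, o) for o in others), default=math.inf)


def repair_shell(protein, shell, bath):
    """Return the repaired list of shell water positions."""
    # 1. Drop waters that have drifted more than 5 Å from the protein.
    kept = [w for w in shell if min_distance(w, protein) <= SHELL_RADIUS]
    # 2. Fill gaps from the bath: within 5 Å of the protein but more than
    #    2.3 Å from every protein atom and every water already in the shell.
    for b in bath:
        close_enough = min_distance(b, protein) <= SHELL_RADIUS
        no_overlap = min_distance(b, list(protein) + kept) > MIN_SEPARATION
        if close_enough and no_overlap:
            kept.append(b)
    return kept


# Toy example with a single protein atom at the origin.
protein = [(0.0, 0.0, 0.0)]
shell = [(3.0, 0.0, 0.0), (9.0, 0.0, 0.0)]           # second water has drifted away
bath = [(0.0, 3.0, 0.0), (3.5, 0.0, 0.0),            # fills a gap / overlaps a shell water
        (0.0, 0.0, 8.0), (1.0, 0.0, 0.0)]            # too far / overlaps the protein
repaired = repair_shell(protein, shell, bath)
```

In the toy case the drifted water is removed and only the first bath water is added. No water is placed in interior cavities by this step; cavity insertion is handled separately.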
Secondary structures were assigned using SSTRUC
(D.Smith, UCL), which implements the algorithm of Kabsch
and Sander (1983). The stereochemistry of the structures was
analysed using PROCHECK (Laskowski et al., 1993). Solvent
accessibilities were calculated using ACCESS (S.Hubbard,
UCL) and contact maps and cavity volumes (i.e. the total
volume of probe spheres with a diameter >1.9 Å that can be
inserted into the protein) using PRO_ACT. Black and white
illustrations were created with ROMPLOT (R.Laskowski,
UCL) and colour illustrations with GRASP (Nicholls et al.,
1991).
protein, principally between the two folding domains (Figure
5). The separation of the secondary structure elements in the
protein core makes many residues solvent accessible and
SHELL (Figure 1 and Methods) increases the amount of
surface water accordingly (Figure 4b). The principal core of
the protein, formed by the interaction of the first two helices
with each other and with the structure between the start of the
β-sheet and the end of the C helix, is disrupted and becomes
substantially solvated, as indicated indirectly by the reduction
of interior cavity in the protein at this time (Figure 3d).
The protein structure then remains stable for some time
before a second set of water insertions are made (Figure 4a)
precipitating a gentler structural transition. The main effect of
these later insertions is to disrupt what are apparently two
smaller well packed cores within the protein at the interface
of the B and D helices and the interface of the C helix plus
the preceding 3₁₀-helix and the β-sheet plus loop structure.
The rate-limiting transition and a late-folding intermediate?
Despite the burial of 10 water molecules within the protein
(cf. six in the crystal structure), the domain interface is still
fairly well packed at cycle 10 of the modelling procedure.
Then, eight water molecules are inserted at the end of cycle
11, which together with the waters already present so disrupt the
structure’s integrity as to precipitate a catastrophic expansion of
the structure, creating a poorly packed protein interior which
is quickly penetrated by water molecules (Figure 6a and b).
The rapidity of the change in the protein structure suggests
that the structural integrity of the protein in its native state is
highly cooperative and that the disruption of a few contacts
has a domino effect, which results in the loss of interactions
between the two domains of the protein and between the
A + B helices. Changes in the backbone conformation of the
protein are very much localized to the loop between the A
and B helices with lesser changes occurring in the loops
between other secondary structural elements (Figure 7a). The
secondary structure elements are only slightly disrupted, with
some distortion of the region which formed the C-terminal
3₁₀-helix in the crystal structure and some fraying of the ends
of the α-helices.
The similarity of this catastrophic ‘unlocking’ of the protein
to the period of rapid expansion on the unfolding pathway of
chymotrypsin inhibitor II immediately following its transition
state (Li and Daggett, 1994) suggests to us that changes
occurring during the first transition (B–C) represent the rate-limiting step in the unfolding of HEWL. The transition state
itself, the state with the highest free energy during this
transition, is difficult to identify definitively since we have no
way of accurately determining the free energy of any of our
model structures. However, it is most likely similar to the
structures obtained in the early part of the transition (Figure
6a), in which previously favourable packing interactions are
disrupted but not yet fully compensated by increased side-chain entropy or hydration.
Fig. 3. Changes in structural parameters during unfolding. (a) Cα r.m.s. deviation of the model from the crystal structure as a function of the number of
cycles of modelling. The grey triangles represent the control procedure (25°C, no solvent insertion) and the black the unfolding procedure (80°C, shallow
insertion). (b) Radius of gyration of the models from the control and unfolding procedures distinguished as in (a). (c) Proportion of each model structure
forming α-helices (black) and β-sheet (grey). (d) Cavity volume of the model structures from the unfolding procedure.
Fig. 5. Ribbon representation of model structure C (cycle 16 of the
unfolding procedure), illustrating the 44 new water molecules (blue) that
were inserted at the end of the previous cycle.
Fig. 6. Domain interface viewed from above the active site, (a) at cycle 12
and (b) at cycle 14 during the first transition. Cavities are represented by
the red mesh and buried or cleft water molecules by the blue mesh.

In the metastable state (C–D) which follows this transition,
the two lobes of the molecule are essentially decoupled and
the active site is substantially disrupted (Figure 6b). The A
helix and C-terminus have moved away from the rest of the
protein (i.e. some Cα atoms of the A and C helices have
moved more than 10 Å further apart) and much of the sheet
through to the end of the C helix has moved away from the
rest of the protein, with the C-terminal end of the C helix
moving furthest. The loss of specific side-chain packing and
consequent increased solvent accessibility of many residues
(Figure 7b) will result in a more disordered side-chain structure
than in the native state. In particular, two (W28 and W108)
of the three tryptophan residues which were buried in the
crystal structure become partially solvent exposed and only
W111 retains a similar degree of burial to that in the native
state including complete burial of the indole NH group. (This
pattern of tryptophan burial is also found on the alternative
pathway derived using a different starting crystal structure,
despite it having somewhat different secondary structure at
this point). A consequence of the increased solvent exposure
of side chains is that the protein surface is more hydrophobic
than in the native state (e.g. the non-polar solvent-accessible
surface area of structure D is 60% greater than that of A).
A late intermediate on the folding pathway has been characterized by spectroscopic studies of its tryptophan residues and
interactions with ligand molecules (Itzhaki et al., 1994). The
experiments suggest that the late-folding intermediate is a
substantially collapsed state which buries the W111 indole,
though lacks many fixed tertiary interactions including those
in the domain interface and which attains most of the secondary
structure of the molecule. The set of structures from
the metastable state (C–D) is consistent with these observations
and might be a good model for this late-folding intermediate.
Fig. 4. Changes in solvation during unfolding. (a) Average number of water
molecules found buried (black), in clefts (grey) and newly inserted into the
protein core (cross hatched) per model for each five cycle segment of the
unfolding procedure. (b) Number of water molecules covering the surface of
the protein as a function of the number of cycles of modelling.
Fig. 7. Structural parameters of the late-folding intermediate and molten
globule-like states. (a) Local structural deviations from the crystal structure
of structures C (black triangles) and F (grey triangles) defined as the r.m.s.
fit of every five residue segment of each structure to the corresponding
segment of the crystal structure. The values for structure F are shifted up by
1 Å for clarity. (b) Solvent accessibility differences of each residue of
structures C (black triangles) and F (grey triangles) from those of the crystal
structure. The solvent accessibility of the residues in the crystal structure
(shifted up by 200%) is shown as the black line. The values for the solvent
accessibility change on going from the crystal structure to structure C are
shifted up by 100%.
Fluctuating α-helices and a stable β-sheet
After the first transition (structures B to C), the lengths of the
helices fluctuate. In the compact but solvated state which
follows this transition the main chain C=O and H–N groups
in helices can quickly alternate between bonding with each
other and with nearby water molecules. When further tertiary
contacts are lost in the second transition, the protein becomes
less compact and the average rate of reforming main-chain
helical hydrogen bonds is consequently reduced. Broken bonds
then tend to persist and some helices disappear altogether.
The stability of the helices seems to be sequence and perhaps
tertiary structure dependent. The helices A and B are essentially
intact, though some of their side chains change conformation,
whereas only the central part of C remains and the D and 3₁₀-helices have entirely disappeared. However, even in the C and
D helices, despite the loss of the main-chain hydrogen bonds,
only one residue at the beginning of each of the D and C
helices has moved out of the α-region of the Ramachandran
plot. Thus, ‘refolding’ of these helices might well occur on a
longer time-scale than that of our modelling procedure. Such
fluctuating helices have been proposed as the explanation of
the discrepancy between the rate of formation of helical
structure measured by circular dichroism and the slower rate
measured by the protection from exchange of amide hydrogens
in several proteins (Chaffotte et al., 1992; Evans and
Radford, 1994).
In contrast to the helices, the triple-stranded sheet remains
intact throughout, despite transiently losing a few hydrogen
bonds. Relatively persistent β-sheets have been observed
in other unfolding simulations (Daggett and Levitt, 1993;
Vijayakumar et al., 1993; Hunenberger et al., 1995) and it has
been suggested that this stability is due to burial of a large
amount of non-polar surface in the sheet and to the cooperative
nature of the structure. Also, in lysozyme the β-sheet is rather
polar and its cooperative resistance to dynamic fluctuations is
probably enhanced by its side chain–side chain hydrogen
bonds and salt bridge.
A transient molten globule?
The final 35 structures obtained by the unfolding procedure
(structures E to F) have very similar global properties (e.g.
Rg, shape and solvent-accessible surface area) and define a
second metastable state for the protein under our modelling
conditions. This state is substantially more expanded, more
solvated and has less secondary structure than the earlier
metastable state (C–D). This state (E–F) has many of the
properties of a ‘classical molten globule’—a compact, mobile,
partially structured state, which can be stabilized for many
proteins under conditions of low pH and which has been
proposed as a general intermediate on protein folding/unfolding
pathways (Ptitsyn et al., 1991; Ptitsyn, 1995). The radius of
gyration of our model of HEWL increases by 18% after 100
cycles (Figure 3b), which is within the range (15–35%)
reported for other simulations (Daggett and Levitt, 1992). The
structure has lost many of its native tertiary contacts (Figure
8) and the majority of those residues which were buried in the
native structure have become substantially solvent exposed,
including all of the tryptophan residues (Figures 7b, 9a and
9b). The increase in solvent accessibility of many of the
previously buried residues will inevitably lead to their having
greater mobility. This high side-chain mobility is compatible
with experimental observations of chemical shift broadening
in NMR spectra of molten globules (Alexandrescu et al.,
1993). The amount of non-polar surface exposed to solvent is
100% greater in structure F than in the crystal structure (the
total surface area has increased by only 60%), in accord with
the observations of increased binding of the hydrophobic
probes, lower solubility, increased likelihood of aggregation
and higher heat capacity of molten globules (Kim and
Baldwin, 1990).

Fig. 8. Cα–Cα distance maps for the crystal structure (upper left triangle)
and structure F (lower right triangle). The radius of the circle representing
each residue pair depends on their separation. The largest circles represent a
distance of <5 Å and the radius is reduced linearly with separation to 15 Å,
which is the largest separation shown.

Fig. 9. Final ‘molten globule-like’ model structure (F) from the unfolding
procedure. (a) Atomic representation with tryptophan residues shown in
purple and the backbone ribbon coloured according to the local r.m.s.
deviation from the crystal structure (red representing the least and blue the
greatest deviation). (b) Molecular surface of structure F of HEWL in the
same orientation as in (a).

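The scaling used for the Cα–Cα distance maps of Fig. 8 can be sketched as below. The caption gives only the 5 Å and 15 Å limits and states that the circle radius decreases linearly between them; that the radius reaches zero at 15 Å, the unit maximum radius and the function names are assumptions made for illustration.

```python
import math


def circle_radius(distance, r_max=1.0, d_full=5.0, d_cut=15.0):
    """Plot radius for a residue pair separated by `distance` (Å).

    Full size up to `d_full`, then a linear decrease that is assumed
    to reach zero at `d_cut`; larger separations are not drawn.
    """
    if distance <= d_full:
        return r_max
    if distance >= d_cut:
        return 0.0
    return r_max * (d_cut - distance) / (d_cut - d_full)


def distance_map(ca_coords):
    """Return {(i, j): separation in Å} for all residue pairs with i < j."""
    n = len(ca_coords)
    return {(i, j): math.dist(ca_coords[i], ca_coords[j])
            for i in range(n) for j in range(i + 1, n)}


# Toy example: three Cα positions along a line.
ca = [(0.0, 0.0, 0.0), (4.0, 0.0, 0.0), (14.0, 0.0, 0.0)]
radii = {pair: circle_radius(d) for pair, d in distance_map(ca).items()}
```

Contacts lost on unfolding appear in such a map as circles that shrink or vanish in the triangle belonging to the unfolded structure.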
Hen lysozyme has not been found to form an equilibrium
molten globule, although the homologous proteins α-lactalbumin and human and equine lysozyme do (Haezebrouck et al.,
1995; Morozova et al., 1995; Schulman et al., 1995) and a
transient state similar to these molten globules is thought to
exist on the hen lysozyme folding pathway (Ikeguchi et al.,
1986). In general terms, the properties of the structures E to
Conclusions
The properties of the structures generated during unfolding
are similar to those observed experimentally for the folding/
unfolding intermediates of many proteins. In particular, the
model pathway we have presented exhibits many features
found in the experimental studies of the folding of HEWL.
The first step in unfolding is an ‘unlocking’ of the two
independent folding/unfolding domains, the reverse of the final
step of folding (Dobson et al., 1994). In common with the
experimentally characterized late-folding intermediate, our
compact ‘unlocked’ structure loses many fixed tertiary interactions and the integrity of the sugar binding site, buries
tryptophan residue W111 completely and maintains much of
the native secondary structure. However, the ‘molten globule-like’ structures, which we observe subsequently on the
unfolding pathway, do not correlate with the dominant kinetic
intermediates observed early on the folding pathway.
What is the reason for the discrepancy between the observed
structures of the majority of early-folding molecules and our
late-unfolding structures? The experimentally characterized
folding pathway has several branches. The dominant route
involves prior folding of the α-domain, followed by formation
of the β-sheet domain. Although the majority of folding
molecules go down this slow pathway, a small minority take
a ‘fast track’ in which the two domains form at similar rates
(Dobson et al., 1994; Kiefhaber, 1995). This provides an
explanation for our observation of ‘molten globule-like’ structures in which a disrupted α-domain coexists with a somewhat
less disrupted β-domain. The majority of unfolding proceeds
by the quickest route, along the reverse of the ‘fast track’, on
which the α- and β-domains form, and presumably are disrupted, at similar rates. It seems plausible that it is generally
true that the different stability and kinetic accessibility of α
F resemble those of the well characterized α-lactalbumin
molten globule. Most of the protein’s residues remain in their
native region of the Ramachandran plot in accord with CD
observations (Chyan et al., 1993). Only some of the secondary
structure is completely stable (in our case 90% of the β-sheet
and 40% of the helix), while some helices fluctuate and allow
amide hydrogen exchange with solvent, in accord with H–D
exchange data (Chyan et al., 1993). However, it is difficult
to make a more detailed comparison as the helices which are
relatively resistant to H–D exchange in the experimental studies
of the equilibrium molten globules of α-lactalbumin (B and
C) and human (A, B and C) and equine (A, B and D) lysozymes
are different from each other and those which we find to be
stable for our HEWL molten globule-like structures (A and B
only, see Figures 7a and 9a). Qualitatively these differences
are easily rationalized since the stability of helices is strongly
dependent on sequence and local environment, particularly the
presence of disulphide bonds (Yang et al., 1995); however,
only speculative inferences as to the detailed unfolding pathway
of HEWL can be drawn from either the experimental data on
these molten globules or our models. The fact that helices A
and B are stable in both lysozyme molten globules and our
model structures could be taken as support for our models.
The greater flexibility of helices C and D seen in our models
of HEWL is in contrast to the observed stability of at least one
of the two in the known experimental molten globules and
could be the reason why HEWL lacks an equilibrium molten
globule state, as has been suggested previously (Haezebrouck
et al., 1995).
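The percentages of 'completely stable' secondary structure quoted above can be illustrated with a short calculation. The following is only a minimal sketch, not the analysis code used in this work: it assumes that each structure has already been reduced to a per-residue string of DSSP-style codes (Kabsch and Sander, 1983), and the function name, the stability threshold and the toy strings are hypothetical.

```python
from typing import Iterable, Sequence

# DSSP one-letter codes grouped into the two classes discussed in the text.
HELIX = frozenset("HGI")   # alpha-, 3-10 and pi-helix
SHEET = frozenset("EB")    # extended strand and isolated bridge


def retained_fraction(native: str,
                      frames: Sequence[str],
                      classes: Iterable[str],
                      threshold: float = 1.0) -> float:
    """Fraction of residues that belong to `classes` in the native
    structure and stay in `classes` in at least `threshold` of the frames.

    threshold=1.0 corresponds to 'completely stable' (kept in every frame).
    """
    wanted = set(classes)
    if not frames:
        raise ValueError("at least one frame is required")
    if any(len(frame) != len(native) for frame in frames):
        raise ValueError("all frames must have the same length as native")

    native_positions = [i for i, code in enumerate(native) if code in wanted]
    if not native_positions:
        return 0.0

    stable = 0
    for i in native_positions:
        kept = sum(1 for frame in frames if frame[i] in wanted)
        if kept / len(frames) >= threshold:
            stable += 1
    return stable / len(native_positions)


# Hypothetical toy data: 11 residues, two snapshots of a partly unfolded state.
native = "HHHHHCCEEEE"
frames = ["HHHCCCCEEEE",
          "HHCCCCCEEEE"]

helix_kept = retained_fraction(native, frames, HELIX)
sheet_kept = retained_fraction(native, frames, SHEET)
print(f"helix retained: {helix_kept:.0%}, sheet retained: {sheet_kept:.0%}")
```

Lowering `threshold` below 1.0 would count residues that fluctuate but are helical or extended for most of the time, which is one way to separate stable elements from the fluctuating helices described above.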
of its unfolding pathway persist despite changes to the details
of the modelling protocol.
The relative rapidity of the unfolding protocol that we have
described, involving a self-repairing solvent shell and enhanced
solvent penetration of the protein core, opens the way for
extensive modelling of the (un)folding intermediates of many
other proteins (including very large proteins) for which experimental data are available. We hope to find further correlation
with the experiments on these proteins and enable a productive
dialogue between theory and experiment. Perhaps we could
attempt to begin that dialogue by pointing out that unfolding
experiments on mutant hen egg-white lysozymes, particularly
mutations of tryptophan, β-sheet and domain interface residues,
could potentially confirm or deny key aspects of our model
pathway.
Acknowledgements
We thank Drs C.Dobson, P.Evans, P.Kim and L.Smith for encouragement and
valuable discussions and are particularly grateful for the comments of Chris
Dobson and Phil Evans on earlier versions of this paper. This work was
supported by grants from the UK Biotechnology and Biological Sciences
Research Council (GR/H37051 and GR/H63678).
References
Alexandrescu,A.T., Evans,P.A., Pitkeathly,M., Baum,J. and Dobson,C.M.
(1993) Biochemistry, 32, 1707–1718.
Arnold,G.E. and Ornstein,R.L. (1994) Proteins: Struct. Funct. Genet., 18,
19–33.
Baldwin,R.L. (1993) Curr. Opin. Struct. Biol., 3, 84–91.
Berendsen,H.J.C., Postma,J.P.M., van Gunsteren,W.F., DiNola,A. and
Haak,J.R. (1984) J. Chem. Phys., 81, 3684–3690.
Caflisch,A. and Karplus,M. (1994) Proc. Natl Acad. Sci. USA, 91, 1746–1750.
Chaffotte,A.F., Guillou,Y. and Goldberg,M.E. (1992) Biochemistry, 31,
9694–9703.
Chyan,C.-L., Wormald,C., Dobson,C.M., Evans,P.A. and Baum,J. (1993)
Biochemistry, 32, 5681–5691.
Daggett,V. and Levitt,M. (1992) Proc. Natl Acad. Sci. USA, 89, 5142–5146.
Daggett,V. and Levitt,M. (1993) J. Mol. Biol., 232, 600–619.
Daggett,V. and Levitt,M. (1994) Curr. Opin. Struct. Biol., 4, 291–295.
Diamond,R. (1974) J. Mol. Biol., 82, 371–390.
Dill,K.A. (1990) Biochemistry, 29, 7133–7155.
Dobson,C.M., Evans,P.A. and Radford,S.E. (1994) Trends Biochem. Sci., 19,
31–37.
Evans,P.A. and Radford,S.E. (1994) Curr. Opin. Struct. Biol., 4, 100–106.
Fersht,A.R. (1993) FEBS Lett., 325, 5–16.
Guenot,J. and Kollman,P.A. (1992) Protein Sci., 1, 1185–1205.
Guenot,J. and Kollman,P.A. (1993) J. Comput. Chem., 14, 295–311.
Haezebrouck,P., Joniau,M., Vandael,H., Hooke,S.D., Woodruff,N.D. and
Dobson,C.M. (1995) J. Mol. Biol., 246, 382–387.
Hunenberger,P.H., Mark,A.E. and van Gunsteren,W.F. (1995) Proteins, 21,
196–213.
Ikeguchi,M., Kuwajima,K., Mitani,M. and Sugai,S. (1986) Biochemistry, 25,
6965–6972.
Itzhaki,L.S., Evans,P.A., Dobson,C.M. and Radford,S.E. (1994) Biochemistry,
33, 5212–5220.
Jorgensen,W.L. and Swenson,C. (1985) J. Am. Chem. Soc., 107, 1489–1496.
Jorgensen,W.L., Chandrasekhar,J., Madura,J.D., Impey,R.W. and Klein,M.L.
(1983) J. Chem. Phys., 79, 926–935.
Kabsch,W. and Sander,C. (1983) Biopolymers, 22, 2577–2637.
Kiefhaber,T. (1995) Proc. Natl Acad. Sci. USA, 92, 9029–9033.
Kim,P.S. and Baldwin,R.L. (1990) Annu. Rev. Biochem., 59, 631–660.
Laskowski,R.A., MacArthur,M.W., Moss,D.S. and Thornton,J.M. (1993)
J. Appl. Crystallogr., 26, 283–291.
Levitt,M. and Park,B.H. (1993) Structure, 1, 223–226.
Li,A. and Daggett,V. (1994) Proc. Natl Acad. Sci. USA, 91, 10430–10434.
Mark,A.E. and van Gunsteren,W.F. (1992) Biochemistry, 31, 7745–7748.
Morozova,L.A., Haynie,D.T., Arico-Muendel,C., van Dael,H. and
Dobson,C.M. (1995) Nature Struct. Biol., 2, 871–875.
Nicholls,A., Sharp,K.A. and Honig,B. (1991) Proteins: Struct. Funct. Genet.,
11, 281–296.
Pearlman,D.A. et al. (1991) AMBER 4.0. University of California, San
Francisco.
and β structures means that the unfolding pathways of α + β
and more complex multidomain proteins will often not simply
be the reverse of the dominant folding pathway, but will follow
the fastest path avoiding the ‘kinetic traps’ encountered by
most folding molecules.
Our novel unfolding protocol is a valuable alternative to
those published protocols which require the use of very high
temperatures to increase the unfolding rate. There is always
the concern with high-temperature protocols that the large
kinetic energy present in the protein allows it to cross energy
barriers which it could not usually cross under experimental
conditions and consequently produce unrealistic structures.
In practice, this potential problem has not prevented high-temperature studies from providing many useful insights into
the unfolding/folding of several proteins (Daggett and Levitt,
1994). One such early study of HEWL at high temperature
(Mark and van Gunsteren, 1992) also showed that unfolding
proceeds via a metastable molten globule-like state, though
with relatively little secondary structure. Later theoretical
studies involving the forced expansion of the molecule
(Hunenberger et al., 1995) additionally showed separate
domains and relatively stable β-structure, though the structures
differed in detail from ours and reservations were expressed
in respect of the artificiality of the perturbations applied to the
protein in these studies in order to produce rapid unfolding
(Hunenberger et al., 1995). Our protocol for enhancing solvent
diffusion into the protein is intended to mimic a natural process
and we believe it provides a relatively gentle way to unfold
proteins in a reasonable time.
However, we are clearly making some significant assumptions about the unfolding process and the modelling protocol.
Primarily we are assuming that changes in protein solvation
are the most important events in the unfolding process and
that enhancement of the diffusion of water into the protein
without a corresponding enhancement of the rate of internal
structural rearrangement of the protein itself does not significantly alter the structures that are observed on the folding
pathway. It has been observed in some high-temperature studies
that, because of the relative rates of solvent penetration and
protein structural rearrangement, some cavities may persist for
a substantial length of time in unfolding proteins without
becoming hydrated and that these cavities allow a rearrangement of the protein’s internal structure, which in turn precipitates further unfolding. Our protocol is unlikely to allow polar
cavities (in particular) to persist and this may influence the
unfolding pathway. However, the observation of persistent
unhydrated cavities is not universal [e.g. rapid solvation of
cavities was observed by Caflisch and Karplus (1994)] and may
itself be dependent on particular simulation protocols. Even
if, as we assume, the different relaxation rates of water and
protein have no substantial effect on the main features of the
pathway, there will be minor effects. In particular, there is
limited scope for changes in side-chain internal geometry in
the short time-scale of each modelling cycle in our current
protocol. Consequently, we have restricted our analysis to a
level of detail that we believe is likely to be adequately
modelled by this current protocol and which is equivalent to
that currently offered by experiment. As future experiments
provide more detailed information, it will be necessary for
theoretical models of all kinds to improve, if they are to
continue to make a contribution to knowledge in this field.
Fortunately, at this stage, our investigations of HEWL show
good agreement with experimental data and that many features
Ptitsyn,O.B. (1995) Curr. Opin. Struct. Biol., 5, 74–78.
Ptitsyn,O.B., Pain,R.H., Semisotnov,G.V., Zerovnik,E. and Razgulyaev,O.I.
(1991) FEBS Lett., 262, 20–24.
Radford,S.E., Dobson,C.M. and Evans,P.A. (1992) Nature, 358, 302–307.
Schulman,B.A., Redfield,C., Peng,Z., Dobson,C.M. and Kim,P.S. (1995)
J. Mol. Biol., 253, 651–657.
Serrano,L. (1994) Curr. Opin. Struct. Biol., 4, 107–111.
Smith,L.J., Sutcliffe,M.J., Redfield,C. and Dobson,C.M. (1993) J. Mol. Biol.,
229, 930–944.
States,D.J. and Karplus,M. (1987) J. Mol. Biol., 197, 123–130.
Vijayakumar,S., Vishveshwara,G., Ravishanker,G. and Beveridge,D.L. (1993)
Biophys. J., 65, 2304–2312.
Weiner,S.J., Kollman,P.A., Case,D.A., Singh,U.C., Ghio,C., Alagona,G.,
Profeta,S. and Weiner,P. (1984) J. Am. Chem. Soc., 106, 765–784.
Williams,M.A., Goodfellow,J.M. and Thornton,J.M. (1994) Protein Sci., 3,
1224–1235.
Wilson,K.P., Malcolm,B.A. and Matthews,B.W. (1992) J. Biol. Chem., 267,
10842–10849.
Yang,J.J., Buck,M., Pitkeathly,M., Kotik,M., Haynie,D.T., Dobson,C.M. and
Radford,S.E. (1995) J. Mol. Biol., 252, 483–491.
Received June 5, 1996; revised April 23, 1997; accepted April 29, 1997