Reduced Molecular Dynamics Streamlined version beta1 Users’ Manual Filip Leonarski Joanna Trylska 1 The RedMDStream program is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License version 2, as published by the Free Software Foundation. This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details. You should have received a copy of the GNU General Public License along with this program; if not, write to the Free Software Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA. Copyright (C) 2007 - 2014: University of Warsaw CONTENTS 2 Contents 1 Introduction to the RedMDStream Package 3 2 Installation and execution 2.1 Requirements . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4 4 3 Units 6 4 Input files 4.1 XML format parsing . . . . . . . . . . . . 4.2 Equation parsing . . . . . . . . . . . . . . 4.3 Simulation protocol . . . . . . . . . . . . . 4.3.1 Simulation execution . . . . . . . . 4.3.2 Trajectory analysis . . . . . . . . . 4.4 Topology definition . . . . . . . . . . . . . 4.4.1 Atom mapping rules . . . . . . . . 4.4.2 Bond creation rules . . . . . . . . . 4.4.3 Potential energy function definition 4.4.4 Nonbonded potential . . . . . . . . 4.5 Optimization protocol . . . . . . . . . . . 4.5.1 Evolutionary algorithm example . . 4.5.2 Grid spacing example . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7 7 8 8 10 12 13 13 15 15 17 17 17 18 1 INTRODUCTION TO THE REDMDSTREAM PACKAGE 3 Disclaimer — beta version status Thank you for downloading beta version of the RedMDStream. The software is still a project in development and full version is expected by the end of June. This manual refers to a beta version of RedMDStream. It is made available to demonstrate the software features and options. We have made an effort to make sure that functionality and options described in the following text are error–free, however since testing process is not finished, the code might contain bugs. Also for this reason we mark some options in the manual as EXPERIMENTAL, if we feel that not adequate amounts of tests were done for the option in question. At the moment presented manual is designed to provide a technical support for the user. For a full version we will place more emphasis on providing a reference information through description of methods and algorithms included in the RedMDStream. All comments and suggestions are welcome to the authors: [email protected] or [email protected]. 1 Introduction to the RedMDStream Package RedMDStream is a free open source package for developing coarse–grained (CG) models for biomolecules (nucleic acids, proteins and their complexes) by automatizing and streamlining the most time consuming steps of the procedure. Since creating a CG model is a very time consuming task, we have created this software to help the researcher efficiently explore plethora of possibilities available in an art of designing such models (representation, bonding patterns, potential energy function). This software focus is on point–beads models, not bound to lattice, with a small number of beads (1–4) representing a residue. RedMDStream provides routines to perform the following tasks: • Mapping a full–atomistic structure to a coarse-grained representation based on user provided topology. • Setting a network of bonded interactions (pseudo–bond, pseudo–angle, pseudo–dihedral) between beads based on their position in sequence, spatial coordinates, additional structural information (2D/3D) and other properties. • Introducing a potential energy functional form for the bonded interactions defined above and the nonbonded interactions between all the others. • Running molecular dynamics using a built–in RedMD simulation engine. • Analyzing root mean square deviation, root mean square fluctuations and distance distributions for the resulting trajectory. • Optimizing parameters of the model using metaheuristic or local search algorithms. RedMDStream, as name suggests, is an extension to RedMD coarse–grained molecular dynamics software developed in Dr Joanna Trylska group in years 2008-2014. The RedMD simulation engine is fully integrated into the RedMDStream software. Please note however tht additional 2 INSTALLATION AND EXECUTION 4 tools bundled in the RedMD package (e.g., structure preparation, analysis) are not included in the RedMDStream package to avoid redundancy. If needed please install them from the RedMD package available at the laboratory webpage. RedMDStream is intended to be extensible. The source code is structured, therefore adding new methods and routines is straightforward. The RedMDStream source code is written entirely in C and C++ languages and parallelized with OpenMP and MPI technologies. The code was tested on various LINUX platforms. Please direct your comments and questions about the RedMDStream package to: [email protected] or [email protected] Centre of New Technologies, University of Warsaw ˙ Zwirki i Wigury 93 02-089 Warsaw, Poland phone: +48 22 5540 843 The source code of the RedMDStream package can be found at: http://bionano.cent.uw.edu.pl/software The RedMDStream package is distributed under the terms of the GNU Public License. A copy of the GPL is provided with the RedMD distribution and is also available at http://www.gpl.org. 2 Installation and execution 2.1 Requirements RedMDStream is a C++ Linux/UNIX application, installation on Windows system should be possible using Cygwin. RedMDStream is available for download as a source code, to allow compilation for particular user system. It therefore requires: • GNU C or C++ compiler1 , • certain basic library function, which are usually supplied by Linux/UNIX system, however lack of getline function on older Mac OS X systems might prevent program from compiling, • libxml2 library with headers for compilation (libxml2-devel or libxml2-dev package has to be installed on certain Linux distributions). After downloading the package from our web page one should unpack the archive: tar xzf RedMDStream-beta-src.tar.gz 1 It might be beneficial to try machine-specific compilers to obtain better performance on clusters like IBM BlueGene or Power systems. This is however not tested at the moment. 2 INSTALLATION AND EXECUTION 5 Rnter into the RedMDStream sources directory: cd RedMDStream Execute a configuration script and compile the program: ./configure make make install By default RedMDStream will be compiled with an OpenMP parallelization enabled (if available). To compile without OpenMP, one should use the following commands: ./configure -disable-openmp make make install Compiled executable will be stored in the /usr/local/bin folder. To use a different location add -prefix= option with a directory name to the configure script (e.g., ./configure -prefix=/opt/RedMD). To run RedMDStream simply write: RedMDStream <input file name> If necessary add a directory prefix, depending on usage of prefix option in configure script, e.g.: /opt/RedMD/bin/RedMDStream <input file name> 3 UNITS 3 6 Units RedMDStream uses the following system of units: internal ˚ Length A = 10−10 m Mass u = 1.6605402(10) · 10−27 kg Charge e = 1.602 · 10−19 C Time ps = 10−12 s Temperature K J Energy 10 mol input and output files input parameters ˚ A u e ps K kcal mol For angle measure, degrees are taken to measure pure angles and radians in force constants and other composite parameters. This behavior can be changed by using RedMDStream options. 4 INPUT FILES 4 7 Input files RedMDStream can be executed in two different modes: • Simulation mode In this mode a single simulation, with a defined coarse–grained model, is prepared, executed and analyzed. To use this mode provide a simulation protocol file as a RedMDStream input (see Section 4.3). • Optimization mode In this mode multiple simulations are executed in a batch mode to find an optimal model parameter set. To use this mode provide an optimization protocol file as a RedMDStream input (see Section 4.5). 4.1 XML format parsing RedMDStream protocol and topology files are defined in the XML standard. In a XML file data are organized into tags. A tag can incorporate other tags or text. In such case it is written in the following way: <TAG> text <INNER TAG> </INNER TAG> </TAG> with always an opening and closing tag. Missing a closing tag or trying to interlace tags (closing a tag before first closing all enclosed tags) will result in a syntax error and stop of a program execution. If a tag does not include any content it can be written in a shorter version: <TAG /> In this way closing tag is unnecessary. Tags may also include properties: <TAG p r o p e r t y 1=”v a l u e 1 ” p r o p e r t y 2=”v a l u e 2 ” /> One tag may include only one property with a certain name, but can enclose multiple tags with the same name. Including some properties is obligatory, for example if one writes a tag to open a file, this would be a nonsense operation without providing a filename. If a tag without an obligatory property is found a program execution will stop. Other properties can be optional. If they are omitted, there won’t be an error message, however a default value will be assigned to the property. RedMDStream will produce an error, if a tag is found that should not be enclosed in other tags, however properties that unneeded are ignored. This might lead to errors, when a typographic error is made while writing an optional property name, as default value will be assigned. Everything inside comment tags is being ignored: <!−− i g n o r e d t e x t −−> 4 INPUT FILES 4.2 8 Equation parsing RedMDStream has included an equation parser. One can use standard operations in the equation editor (+ − ∗ / ˆ ) and mathematical functions (log log10 exp sin cos tan sinh cosh tanh). One can also use comparison operators (since in XML standard lower/greater signs are used for tags, such operators have to be described using acronyms) : • eq or = Equals • neq or != Not equals • lt Less than • gt Greater than • lte Less than or equals • gte Greater than or equals Logical expressions and mathematical expressions can be mixed using ? and : operators: expr1 ? expr2 : expr3 expr1 can be a logical expression or mathematical expression. If it is true (logical) or not zero (math), expr2 is evaluated. Otherwise expr3 is evaluated. Number all always parsed as real numbers (floating point), there is no integer mathematics available. One can also use variables in the RedMDStream equations. 4.3 Simulation protocol Simulation protocol tag has to include a root tag PROTOCOL. Directives inside the PROTOCOL tag are executed in sequential order. If any of the directives stop program execution is stopped. The directives include: • Load another protocol file: <PROTOCOL f i l e =” f i l e n a m e ” /> • Define variable for equation parsing: <VAR name=”PI ” >3.1415926 </VAR> <VAR name=”E”>exp (1) </VAR> Expression for the variable is placed inside the tags. • Print variable value: <PRINT name=”var1 ” /> • Create a distance distribution: 4 INPUT FILES 9 <DISTRIBUTION name=” d i s t r 1 ” beg =”0.0” end =”100.0” b i n s =”100”> <EXPR> exp(−4∗ r ) </EXPR> </DISTRIBUTION> Generates a distribution for distances from 0.0 ˚ A to 100 ˚ A in 100 bins of an f (r) = exp(−4r) function and saves it in memory under name distr1). One can also load a distribution from a text file (in every line there is bin position and number of counts separated by space): <DISTRIBUTION> <TXTFILE f i l e =” f i l e n a m e ” /> </DISTRIBUTION> or load a distribution from a binary file generated by RedMDStream: <DISTRIBUTION> <BINFILE f i l e =” f i l e n a m e ” /> </DISTRIBUTION> • Define a coarse–grained mapping and topology in the same file: <TOPOLOGY> ... </TOPOLOGY> or loaded from a different file: <TOPOLOGY f i l e =”top . xml” /> See Section 4.4 for details on how to define topology. • Load a molecule structure file: <MOLECULE name=”mol1 ” pdb=” f i l e 1 . pdb” /> <MOLECULE name=”mol2 ” pdb=” f i l e 2 . pdb” rnaview=” f i l e 2 . pdb . out ” /> <MOLECULE name=”mol3 ” pdb=” f i l e 3 . pdb” d o t b r a c k e t=” f i l e 3 . 2 d” /> Loads a PDB file structure from file, optionally include 2D RNA structure in dot-bracket format or RNAView tertiary RNA structure and applies a coarse–grained topology afterwards. • Load an ensemble of structures in the PDB format to generate a distance distribution: <DIRECTORY name=” d i r ” pdb=”.pdb” s c a n S u b d i r s=”t r u e ”> <DISTRIBUTION name=” d i s t r 2 ” beg =”0.0” end =”100.0” b i n s =”100” bondRule=” r u l e 1 ”> <TXTFILE s a v e=” f i l e . t x t ” /> <BINFILE s a v e=” f i l e . d i s t r ” /> </DISTRIBUTION> </DIRECTORY> 4 INPUT FILES 10 Analyzes all PDB files (extension .pdb) in a directory named dir and all its sub directories. Creates a beads distance distribution named distr2, based on bond named rule1 (see Sec. 4.4). If bondRule is omitted, distribution will be calculated for distances between all beads. Distribution is calculated for distances from 0.0 ˚ A to 100.0 ˚ A, in 100 bins and saved in memory as distr2, in a text file called file.txt and in a binary file called file.distr. • Save a coarse–grained molecule structure: <OUTPUT mol=”mol1 ” pdb=”mol1 . cg . pdb” /> <OUTPUT mol=”mol1 ” sxml=”mol1 . sxml ” /> Saves mol1 coarse–grained structure to mol1.cg.pdb in the PDB format and coarse–grained structure and topology to a file mol1.sxml in the RedMD Structure XML file (contains information about beads and interactions). • Run an MD simulation: <SIMULATION mol=”mol1 ” . . . > ... <ANALYZE> ... </ANALYZE> </SIMULATION> Simulate molecule named mol1. See Section 4.3.1 for detailed information on SIMULATION tag. Simulation trajectory analysis commands are enclosed in ANALYSIS tag and described in Section 4.3.2. • Load a saved DCD trajectory: <TRAJECTORY mol=”mol1 ” dcd=”mol1 . t r a j . dcd ” cg=”t r u e ”> <ANALYZE> ... </ANALYZE> </TRAJECTORY> See Section 4.3.2 for details on analysis of a simulation trajectory enclosed in ANALYSIS tag. 4.3.1 Simulation execution For details on how simulation is performed (equations of motion, etc.) please refer to Ref. [1] or RedMD manual (available on the http://bionano.cent.uw.edu.pl/Software/RedMD webpage). Simulation execution tag depends on simulation mode. For integration (NVE simulation) one should start the SIMULATION section in the following way: <SIMULATION mol=”mol1 ” mode=” i n t e g r a t i o n ” dt =”10 f s ”> with all properties obligatory. A property named dt denotes simulation timestep. One can use time units (fs or ps) or if left unitless picoseconds will be used. One can also define how simulation 4 INPUT FILES 11 will be saved by using optional properties of SIMULATION tag. By default simulation is stored and analyzed in memory. To save trajectory to file one has to use dcd property, for example: <SIMULATION mol=”mol1 ” mode=” i n t e g r a t i o n ” dt =”10 f s ” dcd=”mol1 . dcd”> Other simulation modes are NVT simulation with Berendsen thermostat: <SIMULATION mol=”mol1 ” mode=”b e r e n d s e n ” dt =”10 f s ” tau=”1”> where tau denotes coupling constant in ps. There is also Langevin dynamics available: <SIMULATION mol=”mol1 ” mode=” l a n g e v i n ” dt =”10 f s ” gamma=”1”> with gamma being damping constant in ps−1 . Simulation mode is followed by simulation commands: • Run minimization: <MINIMIZE /> Structure is minimized using a limited memory Broyden-Fletcher- Goldfarb-Shanno (LBFDS) method. • Set temperature for initialization (if no dynamics were performed), Berendsen bath coupling or Langevin dynamics: <TEMPERATURE v a l =”310” /> Temperature pattern can be also defined in terms of actual step. For heating during 10000 simulation steps from 10 K to 300 K and later having constant temperature simulation one can use the following command: <TEMPERATURE> ( s t e p < 10000) &l t ; 10 + s t e p / 10000 ∗ (300 − 1 0) : 300 </TEMPERATURE> If expression is used to define temperature, it has to be enclosed between opening and closing TEMPERATURE tags. • Perform actual molecular dynamics simulation: <STEP s t e p s =”10000” /> This perform simulation for 10000 steps. One can also define a number of steps in time units: <STEP s t e p s =”10 ns ” /> which can be more convenient for a longer simulation setup (fs/ps/us2 /ms are available). There are also options to control size of saved trajectory. This can be defined either by setting interval in number of steps, when frame save is performed: 2 for microsecond 4 INPUT FILES 12 <STEP s t e p s =”10 ns ” saveFreq =”1000” /> frame will be saved every 1000 frames or by setting a target amount of frames to be saved in total: <STEP s t e p s =”10 ns ” saveFrames =”1000” /> in total 1000 frames will be saved from the simulation. The last option is not to save at all: <STEP s t e p s =”10 ns ” noSave=”t r u e ” /> • Analyze trajectory — see Section 4.3.2. 4.3.2 Trajectory analysis RedMDStream provides three ways to analyze trajectories (see Ref. [2] on how the measures are defined in details): • Structural fidelity by means of root mean square deviation (RMSD) RedMDStream calculates RMSD using Kabsch algorithm [3] to account for the best translation and rotation (that minimizes RMSD) of a molecule prior to calculating the value. Calculated average RMSD for the whole trajectory can be saved to a variable: <RMSD var=”rmsd” /> or RMSD for each frame printed to a file: <RMSD f i l e =”mol1 . rmsd” /> • Structure mobility by means of root mean square fluctuations (RMSF) To calculate average RMSF one can use the following tag: <RMSF var=”rmsf ” /> To save RMSF to file for each bead: <RMSF f i l e =”mol1 . rmsf ” /> RedMDStream performs alignment of trajectory before calculating RMSF to not include translations and rotations of the molecule during simulation in the RMSF count. This can be turned off by adding align="false" to XML tag: <RMSF var=”rmsf ” a l i g n =” f a l s e ” /> RedMDStream can also calculate RMSF in B-factor units: <RMSF var=”rmsf ” b F a c t o r U n i t s=”t r u e ” /> • Distance distribution functions Beads distance distribution functions can be calculated based on trajectory using DISTRIBUTION tag: 4 INPUT FILES 13 <DISTRIBUTION beg =”0.0” end =”100.0” b i n s =”100” bondRule=” r u l e 1 ” name=” d i s t r 3 ” /> Creates a beads distance distribution named distr3, based on bond named rule1 (see Sec. 4.4). If bondRule is omitted, distribution will be calculated for distances between all beads. Distribution is calculated for distances from 0.0 ˚ A to 100.0 ˚ A, in 100 bins and saved in memory as distr3. One can use also a previously saved distribution as a reference to allow for comparison. In such case distance range and bins number are taken from a reference distribution and should not be provided: <DISTRIBUTION bondRule=” r u l e 1 ” name=” d i s t r 4 ” r e f =” d i s t r 1 ” /> If reference is provided the following analysis and comparisons can be made: <DISTRIBUTION bondRule=” r u l e 1 ” name=” d i s t r 4 ” r e f =” d i s t r 1 ”> <DIVIDE> r ˆ2 </DIVIDE> <K−S var=”comp1” /> <CHISQUARE var=”comp2” /> <HODKIN var=”comp3” /> <RMSD var=”comp4” /> <MAX POSITION var=”max” /> <TXTFILE name=”s a v e . t x t ” /> </DISTRIBUTION> First distribution might be divided by a function (in this case r2 ). Then following comparisons with distr1 distribution are made: Kolomogorov-Smirnov measure is stored in comp1 variable, χ-square in comp2, Hodkin index in comp3 and RMSD–like measure in comp4. Position of a bin with highest count is stored in max. Distribution is saved in a text file save.txt. 4.4 Topology definition A coarse–graining process in RedMDStream is performed in the following way: • Atoms from high–resolution structure are mapped into a coarse–grained representation based on ”atom rules”. • Bond network is created based on ”bond rules”. • Potential energy function terms are associated with particular bonds. All the tasks mentioned above are referred in the manual as a ”topology”. Rules how to perform these tasks are encoded in a topology file or in a TOPOLOGY tag. If a separate file is used it has to contain a single TOPOLOGY root tag. 4.4.1 Atom mapping rules RedMDStream provides two ways to create a coarse–grain bead: 4 INPUT FILES 14 • Use a currently existing atom. Coarse–grain bead is associated with a particular atom in the full–atomistic representation. For example phosphorus in nucleic acids or α-carbon in amino acids. In such case definition goes as following: <ATOMTYPE name=”CGP”> <VAR name=”mass”>300</VARIABLE> <VAR name=”c h a r g e ”>−1.0</VARIABLE> <ATOMRULE> <FIND ATOM atomName=”P” /> </ATOMRULE> </ATOMTYPE> This will create an atom of type CGP with mass of 300 u and charge -1.0 e, placed in position of atom named ”P” in the PDB file. One can extend ATOMRULE section to incorporate more conditions, for example: <ATOMTYPE name=”CGP”> <VAR name=”mass”>300</VARIABLE> <VAR name=”c h a r g e ”>−1.0</VARIABLE> <ATOMRULE> <FIND ATOM atomName=”CA” resName=”LYS” /> <EXPR>atom . i s H e l i x </EXPR> </ATOMRULE> </ATOMTYPE> Find all Cα atoms in lysine residue, which are part of a helix (based on data in the PDB file). • Create a bead based on average position of multiple atoms. In this case a bead is created in a new position, for example in center of mass of a residue. This rules are calculated for each residue, so at the moment it is not possible to build a bead corresponding to multiple residues. While creating a bead, variables that correspond to that bead can be aggregated from member beads. For example: <ATOMTYPE name=”CGP”> <NEWATOM name=”LYS” centreOfMass=”t r u e ”> <RESIDUE name=”LYS” /> <VAR name=”mass ” ag gr=”sum” /> <VAR name=”tempFactor ” ag gr=”avg ” /> </NEW ATOM> </ATOMTYPE> This creates a bead for lysine amino acid, in the residue center of mass (centreOfMass="false" would create a bead in a geometrical center of residue). This bead will have two variables: mass being sum of atom masses in the residue and tempFactor being average of atoms temperature factors. 4 INPUT FILES 4.4.2 15 Bond creation rules Second point is to define rules, where bonds are placed in the structure. These rules will be used when defining potential energy terms. Each rule has to have a name. An example rule: <BONDRULE name=”Pneigh”> <NEIGHBOR /> <ATOM type=”P” /> </BONDRULE> This rule includes P (phosporus) beads that are neighboring (1–2 for pseudo–bond, 1–3 for pseudo– angle, 1–4 for pseudo–dihedral). <BONDRULE name=”P c l o s e ”> <DIST min =”0.0” max=”10.0” /> <ATOM type1=”CA” type2=”P” /> </BONDRULE> This rule includes P (phosporus) beads interacting with CA (α-carbon) beads that are in vicinity (distance 0-10 ˚ A). <BONDRULE name=”Pcompl1”> <INTERACTION name=”WCHBond” /> <ATOM type=”P” /> </BONDRULE> This rule includes P (phosporus) beads from bases connected using the Watson-Crick complementary pairs interaction read from 2D/3D RNA structure file. <BONDRULE name=”Pcompl2”> <SHIFT s h i f t 1 =”1” s h i f t 2 =”1”> <INTERACTION name=”WCHBond” /> </SHIFT> <ATOM type=”P” /> </BONDRULE> This rule is similar to the previous one, however connects beads from bases that are next in sequence to beads interacting by the Watson-Crick hydrogen bonding pattern (so called i+1:j+1 interaction). 4.4.3 Potential energy function definition When bond rules are defined, the next step is to assign these rules to a potential energy function term. It is done as in the following example: <BOND HARM name=”compl2”> <BONDRULE name=”compl2 ” /> <VAR name=”r 0”>bond . d i s t </VARIABLE> <VAR name=”k”>5.0</VARIABLE> </BOND HARM> 4 INPUT FILES 16 In this example a harmonic potential pseudo–bond is created between beads described by compl2 bond rule (potential and bond rule names can overlap). Potential parameters are defined with an equilibrium distance equal to distance in a reference structure and a force constant equal to 5 kcal˚2 . molA In the parameter definition one can use values connected to the bond (bond.dist, bond.angle, bond.dihedral), atoms forming the bond (atom1.charge, atom2.mass) or variables defined in the protocol/optimization file (should be prefixed by global, e.g. global.PI). Potential terms are assigned to bonds in order they appear in the file. If two potentials would describe interaction between the same beads only the first in order will be used and second ignored. RedMDStream supports the following potentials: • Harmonic pseudo–bond potential 1 V (r) = k(r − r0 )2 2 (1) tag: BOND HARM, parameters to declare: r0= r0 and k= k • Harmonic pseudo–angle potential 1 V (θ) = kθ (θ − θ0 )2 2 (2) tag: ANG HARM, parameters to declare: theta0= θ0 and k= kθ • Harmonic pseudo–dihedral potential 2 V (α) = kα (α − α0 )2 2 (3) tag: DIH HARM, parameters to declare: alpha0= α0 and k= kα • Morse pseudo–bond potential V (r) = E0 ∗ (1 − exp(−α ∗ (r − r0 )))2 (4) tag: BOND MORSE, parameters to declare: E0= E0 , r0= r0 and alpha= α • Lennard-Jones pseudo–bond potential V (r) = 4 ∗ ε ∗ ((σ/r)12 − (σ/r)6 ) (5) tag: BOND LJ, parameters to declare: eps= ε and sigma= σ • Custom pseudo–bond potential This is a user–defined potential. It is defined as: <BOND CUST name=”custom1 ” f o r m u l a =”2∗ g l o b a l . PI / l o g ( r )∗ exp(−bond . d i s t ∗ r ∗k)”> <BONDRULE name=”compl2 ” /> <VAR name=”k”>5.0</VARIABLE> </BOND HARM> 4 INPUT FILES 4.4.4 17 Nonbonded potential Nobonded potential are defined inside a NONBONDED section tags. They can be defined for a particular pair of beads (by their type) or for all beads (”default” potential). An example NONBONDED section looks like this: <NONBONDED> <BOND LJ name=”nonbLJ”> <VAR name=”eps ”>3.0</VAR> <VAR name=”sigma ”>15.0</VAR> </BOND LJ> <BOND MORSE name=”morseNB” type1=”P” type2=”CA”> <VAR name=”E0”>3.0</VAR> <VAR name=”r 0 ”>15.0</VAR> <VAR name=”alpha ”>0.1</VAR> </BOND MORSE> </NONBONDED> 4.5 Optimization protocol RedMDStream can be launched in optimization mode. In this mode a coarse–grained model can be optimized for a particular needs. See Ref. [2] for details. Optimization and simulation protocol communicate through variables. Used defines certain variables as parameters, theirs values is assigned by an optimization routine and other variables, produced by simulation, are considered a score (for example RMSD, RMSF, distance distribution comparison value). This corresponds to a definition of a function: M X f (x1 , ..., xn ) = w i si (6) i where xi are parameters assigned by the optimization routine, wi are weights for each score and si are values of certain scoring functions. In the RedMDStream optimization procedure is a general one, it does use any knowledge of underlying physics of the system. In this way it can avoid traps of analytic methods, like for example parameter correlation in case of Boltzmann Inversion. The following methods are available: • Evolutionary algorithm • Particle swarm optimization • Grid calculation For description of available methods at the moment please refer to Refs. [2] and [4]. Example files are available with RedMDStream. 4.5.1 Evolutionary algorithm example Example optimization file for evolutionary algorithm execution: 4 INPUT FILES 18 <OPTIMIZATION method=”evo”> <PROTOCOL r e r u n =”10” name=”run . xml” /> <DATA> <PARAMETER name=”p1” min =”0.0” max=”1.0” /> <PARAMETER name=”p2” min =”5.0” max=”7.0” /> <SCORE name=”s1 ” v a l u e=”rmsd” max=”50” weight =”0.7” /> <SCORE name=”s2 ” v a l u e=”rmsf ” max=”100” weight =”0.3” /> </DATA> <OPTIONS> <ITERATION s i z e =”16” number=”128”/> <ELITE v a l =”16” /> <MATING type=”tournament ” /> <MUTATION type=”g a u s s ” t o l e r a n c e =”0.1” /> <CROSSOVER b l e n d i n g =”1.0” /> <FILE s a v e=”o p t i m i z e . r e s u l t ” t x t=”o p t i m i z e . r e s u l t . c s v ” /> </OPTIONS> </OPTIMIZATION> In this file scoring function will take two arguments (p1 from 0–1 and p2 from 5–7). Value of the scoring function will be calculated be running run.xml protocol file (average of 10 executions), taking rmsd variable with 70% weight and rmsf variable with 30% weight. Evolutionary algorithm will run for 16 generations, each of 128 members, 16 best members will copied unchanged to a next generation (elitism). Mating will be done using tournament selection, mutation are applied with normal probability from the parameter value, crossing–over will be done using blending method. Results will be saved to optimize.result (binary format for restarts) and to optimize.result.csv (Excel readable comma separated values file). 4.5.2 Grid spacing example <OPTIMIZATION method=” g r i d ”> <PROTOCOL r e r u n =”10” name=”run . xml” /> <DATA> <PARAMETER name=”p1” min =”0.0” max=”1.0” /> <PARAMETER name=”p2” min =”5.0” max=”7.0” /> <SCORE name=”s1 ” v a l u e=”rmsd” max=”50” weight =”0.7” /> <SCORE name=”s2 ” v a l u e=”rmsf ” max=”100” weight =”0.3” /> </DATA> <OPTIONS> <STEPS v a l =”8” /> <FILE t x t=”o p t i m i z e . r e s u l t . c s v ” /> </OPTIONS> </OPTIMIZATION> In this file scoring function will take two arguments (p1 from 0–1 and p2 from 5–7). Value of the scoring function will be calculated be running run.xml protocol file (average of 10 executions), 4 INPUT FILES 19 taking rmsd variable with 70% weight and rmsf variable with 30% weight. Calculation will be performed for values of p1 for 0,1,2,3,...,10 and values of p2 also for 0,1,2,3,...,10, resulting in 100 calculations of the scoring function, providing some insight into scoring function behavior for the parameters of interest. REFERENCES 20 References [1] Adam G´orecki, Marcin Szypowski, Maciej Dlugosz, and Joanna Trylska. Redmd – reduced molecular dynamics package. J. Comput. Chem., 30:2364–2373, 2009. [2] F. Leonarski, F. Trovato, V. Tozzini, A Le´s, and J. Trylska. Evolutionary algorithm in the optimization of a coarse–grained force field. J. Chem. Theory Comput., 9:4874–4889, 2013. [3] Wolfgang Kabsch. A solution for the best rotation to relate two sets of vectors. Acta Crystallographica Section A, 32(5):922–923, Sep 1976. [4] Randy L. Haupt and Sue E. Haupt. Practical Genetic Algorithms. Wiley-Interscience, 2 edition, May 2004.
© Copyright 2025