Structure, Volume 23

Supplemental Information

Collision Cross Sections for Structural Proteomics

Erik G. Marklund, Matteo T. Degiacomi, Carol V. Robinson, Andrew J. Baldwin*, and Justin L.P. Benesch*

Department of Chemistry, Physical & Theoretical Chemistry Laboratory, University of Oxford, South Parks Road, Oxford, Oxfordshire, OX1 3QZ, U.K.

*Correspondence to [email protected], [email protected]

Contains: Supplemental Data (Tables S1-2, Supplemental Figures S1-3), Supplemental Experimental Procedures, Supplemental References

SUPPLEMENTAL DATA

a        b        $\sigma_e$ / %    $\sigma_{\mathrm{IMPACT}}^{(\mathrm{raw})}$ / %    $\sigma_{\mathrm{TJM}}$ / %    $\sigma_a$ / %
0.843    1.051    1.08              0.10                                               0.49                           0.95

Table S1, related to Figure 1: Calibration of IMPACT allows for accurate calculations. a and b are parameters for calibration of IMPACT against TJM obtained by fitting the power law $\Omega_{\mathrm{TJM}} = a\,\Omega_{\mathrm{IMPACT}}^{\,b}$ to the calculated data (Fig 1B, Supplementary Figure S1A). $\sigma_e$ is the standard deviation of the observed distribution of relative error between the calibrated IMPACT and TJM, which includes both systematic differences between the methods and statistical uncertainty from the MC integration. $\sigma_{\mathrm{IMPACT}}$ and $\sigma_{\mathrm{TJM}}$ are the statistical uncertainties for IMPACT and TJM expressed as relative standard deviations. From these quantities we can estimate the discrepancy between TJM and IMPACT to be $\sigma_a = 0.95\%$ in the limit of infinite precision (Supplementary Experimental Procedures), meaning that IMPACT typically can predict TJM results to within less than 1% for folded proteins.
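As an illustration of how the Table S1 entries are used, the following minimal Python sketch applies the power-law calibration and evaluates the accuracy estimate given as Eqn S8 in the Supplemental Experimental Procedures. The function names are hypothetical, and the sketch assumes that the tabulated IMPACT uncertainty is the raw (pre-calibration) precision; it is not taken from the IMPACT source code.

```python
import math

# Values transcribed from Table S1; the three sigmas are in per cent.
A_CAL, B_CAL = 0.843, 1.051
SIGMA_E, SIGMA_IMPACT_RAW, SIGMA_TJM = 1.08, 0.10, 0.49


def calibrate_pa(omega_pa, a=A_CAL, b=B_CAL):
    """Power-law calibration Omega_TJM = a * Omega_IMPACT**b (areas in square angstrom)."""
    return a * omega_pa ** b


def accuracy_sigma(sigma_e, sigma_impact_raw, sigma_tjm, b=B_CAL):
    """Remove the statistical (precision) part from the observed error spread.

    All arguments are relative standard deviations. The raw IMPACT precision is
    scaled by b because a power law y = a * x**b propagates relative errors as
    sigma_y / y = b * sigma_x / x.
    """
    return math.sqrt(sigma_e ** 2 - (b * sigma_impact_raw) ** 2 - sigma_tjm ** 2)


if __name__ == "__main__":
    print(f"calibrated CCS for a raw PA value of 10,000: {calibrate_pa(1.0e4):.0f}")
    sigma_a = accuracy_sigma(SIGMA_E, SIGMA_IMPACT_RAW, SIGMA_TJM)
    print(f"sigma_a from the rounded table entries: {sigma_a:.2f} %")
```

With the rounded table entries the accuracy estimate evaluates to roughly 0.96%, close to the tabulated 0.95%; the small difference presumably reflects rounding of the inputs.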
" 2" " B Collision cross-section (TJM) / Å 2 Normalised wall time Normalised standard error Collision cross-section (IMPACT) / Å 2 A Number of interleaved calculations ' Figure'S1,'related'to'Figure'1:'Accuracy,'robustness,'and'wall'time.'A:"Similar"to"Fig." 1B"but"including"a"linear"fit"for"comparison."Both"the"linear"and"power"law"fits"contain" the"same"number"of"degrees"of"freedom"(two),"The"fitted"power"law"yields"lower"errors" than"the"linear"fit,"and"is"therefore"a"better"choice"for"calibrating"PA"against"TJM."B:"For" 10,000" probes" fired" in" total" per" run," distributed" over" N" interleaved" calculations," the" relative" standard" error" of" mean" sr" normalised" by" the" target" precision" !" (blue)" and" the" wall"time"required"for"convergence"normalised"by"the"rightmost"data"point"(red)."For"N" <" 16," and" particularly" for" low" !," sr" is" underestimated" by" IMPACT" and" the" calculations" stop"before"reaching"the"target"precision,"which"is"also"reflected"by"a"shorter"wall"time" for" low" N." For" N" ≳" 32" however" IMPACT" correctly" estimates" the" standard" error" of" the" calculations." " " ' ' 3" " Assembly' PDB'code' m'/'MDa' Ω'/'Å2' #atoms' topt'(t0)'/'s' 70S"ribosome" 2WDK–N" 2.4" 40,975" 148,020" 1.3"(15.3)" STNV" 2BUK*" 1.8" 33,627" 182,258" 1.2"(6.3)" The"vault" 2ZUO,"2ZV4–5" 3.5" 285,372" 483,912" 2.9"(50.4)" Adenovirus" 1VSZ" 89" 1,076,957" 5,543,280" 10.7"(47.4)" " Table'S2,'related'to'Figure'2:'Large'protein'complexes'used'for'assessing'the'effect' of'octrees'on'IMPACT’s'performance."Four"large"macromolecular"assemblies"that,"due" to"their"size,"are"challenging"targets"for"CCS"calculations"were"used"to"assess"the"effect"of" octrees." topt" and" t0" are" the" wall" times" for" calculating" the" CCS" to" 1%" precision" using" octrees" with" optimum" depth" and" without" octrees," respectively." *The" model" used" contains"the"RNA"genome"modelled"into"the"crystal"structure"of"the"capsid"together"with" water"and"salt"(Larsson,"2012).' 
[Figure S2. Panels A-C: flowcharts. Panel D: tick labels 0, 1, 2 and axis label "Particles".]

Figure S2, related to Figure 2: Flowchart of the algorithm and heuristics for setting optimal octree depth. A: Overall process flow of IMPACT. B: The process of assessing convergence from several independent calculations. C: The MC integration with octrees. Incoming probe particles are first tested for collision with the bounding box that surrounds the structure, i.e. octree level d = 0, before they are tested for collision with the box's contents. At any level of the octree a box can be empty, contain smaller boxes representing the next level of the octree, or contain atoms. IMPACT recursively traverses the octree to find atom-containing boxes that intersect the probe trajectory, and only those atoms are ever tested for collision with the probe. D: The optimal octree depth level, based on wall time, increases with the number of atoms in the target. IMPACT uses this empirical relation to choose the optimal depth limit D at runtime. The boundaries between regions in which IMPACT chooses D = 0, 1, and 2 are indicated with dashed lines. (See also Fig 2.)

[Figure S3. Panel A axes: collision cross-section / Å² and mass / Da; legend: Best fit, Sphere. Panels B and C axes: counts, intensity, mass / Da, and density / Da Å⁻³; legend entries: PDBe or PiQSi, Fischer et al., 0.48 Da/Å³, 0.37 Da/Å³.]

Figure S3, related to Figure 3: Normalisation of cross sections links them to effective gas-phase density. A: CCS as a function of mass for all structures in the PiQSi database. The fitted line (black) is also the normalisation used for the shape factor in Fig. 3C. The red line represents the CCS for perfect spheres as a function of mass, assuming the same density as for proteins (Fischer et al., 2004) and a 1.4 Å radius for the buffer gas particles. B and C: Effective gas-phase densities $\rho_{\mathrm{eff}}$ for all biological assemblies in PDBe and PiQSi, respectively, determined by assuming a spherical shape for the proteins (Bush et al., 2010; Kaddis et al., 2007; Ruotolo et al., 2008). For both datasets $\rho_{\mathrm{eff}}$ closely matches the values derived from IM-MS (Bush et al., 2010; Kaddis et al., 2007), but is considerably smaller than the density inferred from crystal structures (Fischer et al., 2004) even though both sets of structures used here were largely determined through X-ray crystallography. This implies that the effective gas-phase density is artificially low because of the assumption of spherical proteins.

SUPPLEMENTAL EXPERIMENTAL PROCEDURES

Projection approximation and the formulation of a stopping criterion

The PA equates the CCS, $\Omega$, to the rotationally averaged projected area of the target, taking into account the radius of the buffer-gas particles. This area is typically calculated through MC integration where probes representing the buffer gas are fired towards the randomly rotated target. If we let n probes be fired at random within a region of area A that encloses the projection of the target, with each probe hitting the target with probability $p_h = \Omega / A$, then we can make an estimate $\Omega'$ of the CCS from the fraction of probes that hit, h/n, where h is the number of hits:

$$\Omega = A p_h = \lim_{n \to \infty} A \frac{h}{n} \qquad \text{(Eqn S1)}$$

The required size of n for $\Omega'$ to converge to $\Omega$ within a relative error $\tau$ is not known a priori. Firing additional probes beyond what is needed for results to converge is, however, a waste of computing time.
Here, we measure consensus among several independent replica CCS calculations run concurrently (Williams et al., 2009). We define the consensus as the relative standard error of the mean, $s_r$, which is evaluated until it is below a pre-defined threshold $\tau$. This has the advantage of connecting the stopping criterion to a measure of precision.

Relying on consensus runs the risk of stopping the calculations prematurely, as for small n there is a chance that the N independent estimates are close to their common mean $\langle \Omega' \rangle$, yet still far from the true $\Omega$. Empirically, with N ≳ 32 IMPACT was found to avoid stopping prematurely and to estimate the standard error of the mean robustly (Supplementary Figure S1B). To further mitigate this effect, IMPACT fires a tuneable minimum number of probes before assessing convergence.

Proof that convergence depends on number of probes fired

Any sequence of probes fired at the target from a certain direction is a Bernoulli process. As such the number of hits h is binomially distributed, and after n probes the variance of h is $\sigma_h^2 = n p_h (1 - p_h)$. By combining this with Eqn S1 we get the variance for $\Omega$, which is $\sigma^2 = (A \sigma_h / n)^2 = (A\Omega - \Omega^2)/n$. Hence the relative standard deviation is given by:

$$\sigma_r = \frac{\sigma}{\Omega} = \sqrt{\frac{A - \Omega}{n\,\Omega}} \qquad \text{(Eqn S2)}$$

For large n the binomial distribution can be approximated by a normal distribution, with mean $\Omega$ and standard deviation $\sigma$. We scale and shift the distribution to have mean 0 and standard deviation $\sigma_r$ in order to analyse the likelihood of having $\Omega'$ correct within a relative precision $\tau$. The total probability p that the estimate is within $\Omega\tau$ of $\Omega$ is obtained by integrating the scaled and shifted distribution from $-\tau$ to $\tau$, which yields the probability for a single estimate to be correct within the specified precision:

$$p = \operatorname{erf}\!\left(\frac{\tau}{\sigma_r \sqrt{2}}\right) \qquad \text{(Eqn S3)}$$

To consider the average of N separate estimates, we define $\aleph \equiv Nn$, i.e. the total number of probes fired in all N independent calculations combined, and note, with the help of Eqn S2, that the relative standard error of the mean for $\Omega'$ is $s_r = \sigma_r / \sqrt{N} = \sqrt{(A - \Omega)/(\aleph\,\Omega)}$. Analogous to Eqn S3, the probability $p_s$ of meeting the stopping criterion then becomes

$$p_s = \operatorname{erf}\!\left(\frac{\tau}{s_r \sqrt{2}}\right) = \operatorname{erf}\!\left(\tau \sqrt{\frac{\aleph\,\Omega}{2\,(A - \Omega)}}\right) \qquad \text{(Eqn S4)}$$

and depends on the total number of probes that have been fired and is, for N ≳ 32 (Supplemental Figure S1B), independent of the number of separate estimates for $\Omega$.

Benchmarking the robustness of IMPACT

To assess how IMPACT performance and precision are affected by various parameters, we calculated the CCS of the asymmetric unit from a crystal structure of the Norwalk virus capsid (NV, PDB code 1IHM) for all combinations of N, $\tau$, and maximum octree depth D. The parameter ranges were N ∈ {2, 4, 8, 16, 32, 64, 128}, $\tau$ ∈ {0.01, 0.005, 0.001}, and D ∈ [0, 4] (Figure 2B, Supplementary Figure S2A-C). We repeated the calculations 200 times for each parameter combination. To assess the fraction of the wall time spent in different parts of IMPACT, we calculated the CCS of NV another 1,000 times with timing functions turned on. For NV, the wall time to calculate the CCS without octrees was 0.18 s, with 6% of the time spent on I/O operations, 11% on rotations, and 83% on the MC bombardment. With octrees (D = 2) the wall time more than halved to 0.071 s, with 17%, 6%, and 77% spent on I/O, octree management, and MC bombardment (including rotations), respectively.
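The consensus-based stopping criterion formulated above can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not IMPACT's implementation: atoms are hard spheres, a fresh random orientation is drawn for every probe, the enclosing region is a square centred on the origin, and all function and parameter names (`pa_ccs`, `batch`, `min_probes`, `max_probes`) are hypothetical.

```python
import math
import random
import statistics


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def _random_plane(rng):
    """Two orthonormal vectors spanning a plane whose normal is uniformly random."""
    while True:
        d = [rng.gauss(0.0, 1.0) for _ in range(3)]
        norm = math.sqrt(sum(c * c for c in d))
        if norm > 1e-12:
            break
    d = [c / norm for c in d]
    helper = (1.0, 0.0, 0.0) if abs(d[0]) < 0.9 else (0.0, 1.0, 0.0)
    u = _cross(d, helper)
    u_norm = math.sqrt(sum(c * c for c in u))
    u = tuple(c / u_norm for c in u)
    return u, _cross(d, u)


def _probe_hits(atoms, probe_radius, half_side, rng):
    """Fire one probe at a freshly oriented target; True if it hits any atom."""
    u, v = _random_plane(rng)
    px = rng.uniform(-half_side, half_side)
    py = rng.uniform(-half_side, half_side)
    for (x, y, z), radius in atoms:
        dx = u[0] * x + u[1] * y + u[2] * z - px
        dy = v[0] * x + v[1] * y + v[2] * z - py
        if dx * dx + dy * dy <= (radius + probe_radius) ** 2:
            return True
    return False


def pa_ccs(atoms, probe_radius=1.4, tau=0.01, n_replicas=32,
           batch=100, min_probes=1000, max_probes=10_000_000, seed=0):
    """Return (mean CCS, relative SEM, total probes fired).

    atoms: list of ((x, y, z), radius) with coordinates centred near the origin.
    """
    rng = random.Random(seed)
    half_side = max(math.sqrt(x * x + y * y + z * z) + r
                    for (x, y, z), r in atoms) + probe_radius
    area = (2.0 * half_side) ** 2            # enclosing region A (assumed square)
    hits = [0] * n_replicas
    n = 0                                     # probes fired per replica
    while True:
        for i in range(n_replicas):           # interleave the N replicas
            hits[i] += sum(_probe_hits(atoms, probe_radius, half_side, rng)
                           for _ in range(batch))
        n += batch
        estimates = [area * h / n for h in hits]   # Eqn S1, one estimate per replica
        mean = statistics.fmean(estimates)
        total = n * n_replicas
        if mean > 0:
            s_r = statistics.stdev(estimates) / math.sqrt(n_replicas) / mean
        else:
            s_r = math.inf
        # Stop once the relative standard error of the mean is below tau,
        # but only after a minimum number of probes has been fired.
        if (total >= min_probes and s_r < tau) or total >= max_probes:
            return mean, s_r, total
```

For a single sphere of radius 10 Å and a probe radius of zero, the returned mean should approach π × 10² ≈ 314 Å², which gives a quick sanity check of the estimator.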
" Benchmarking'the'accuracy'of'IMPACT' 9" " We" ran" IMPACT" on" a" data" set" containing" 428" protein" structures" used" previously" for" benchmarking"CCS"calculations"(Bleiholder"et"al.,"2011)."This"was"extended"with"a"set"of" proteins" that" is" commonly" used" for" calibration" of" travellinghwave" IM" instrumentation" (Bush"et"al.,"2010)"and"a"complete"satellite"tobacco"necrosis"virus"(STNV)"virus"structure" (Supplementary"Table"S1)"in"order"to"encompass"a"larger"mass"range."Some"structures" in" the" data" set" consisted" of" nonhcontiguous" clusters" of" atoms," which" is" a" highly" unrealistic"scenario"in"an"ion"mobility"(IM)"experiment."Consequently,"to"filter"out"such" structure"models,"we"only"consider"structures"that"consisted"of"a"single"cluster"with"all" atoms"within"4"Å"from"another"atom."This"excluded"six"structures"from"the"initial"set"of" 428" (Bleiholder" et" al.," 2011):" PDB" codes" 1GO6," 1H1N," 1J0P," 1M0M," 1P7W," and" 1W3M." The" target" precision" in" the" IMPACT" calculations" was" set" to" 0.1%" to" allow" for" more" confident"separation"of"accuracy"from"precision"(see"below)." For" each" structure," in" order" to" compare" the" IMPACT" results" with" TJM," we" invoked" MOBCAL"at"least"50"times,"except"for"STNV,"which"needed"only"20"times"to"converge"to" sufficient"precision."For"structures"where"the"standard"error"of"the"mean"CCS"was"larger" than"0.5%,"we"rehran"them"with"MOBCAL"again"in"batches"of"50"runs"until"the"standard" error"for"each"structure"was"below"0.5%."We"initiated"each"MOBCAL"run"with"a"different" random" seed" and" the" results" pooled" to" give" average" TJM" and" PA" CCS" values" with" associated" error" estimates." To" validate" IMPACT" we" also" recalculated" the" CCSs" for" this" dataset"using"the"same"atomic"radii"as"in"the"PA"implementation"in"MOBCAL,"revealing"a" good"match"between"the"MOBCAL"and"IMPACT,"with""a"correlation"coefficient"R2>0.999" and"an"RMSD"of"0.2%." 
" Deconvoluting'accuracy'and'precision'from'an'observed'error'distribution' 10" " We" define" the" relative" error" that" describes" the" observed" discrepancy" between" a" CCS" calculated"with"IMPACT"(calibrated)"and"TJM"as:" " !≡ Ω!"#$%& − Ω !"# Ω!"#$%& = − 1" Ω !"# Ω !"# Eqn"S5" Our"calibration"of"IMPACT"ensures"that)Ω!"#$%& Ω !"# ≈ 1,)and"hence) ! ≈ 0.)The)error) is) distributed) around) 0) however) (Fig" 1B)," which" can" be" attributed" to" two" distinct" sources" of" error:" underlying" differences" between" TJM" and" IMPACT," and" random" error" from"the"MC"integration."The"former"determines"the"accuracy"of"IMPACT,"whereas"the" latter"is"the"combined"precision"of"the"TJM"and"IMPACT"calculations."Their"variances,"!! " and"!! ," subscripts" denoting" accuracy" and" precision," combine" into" the" variance" of" the" relative"error:" !! ! = !! ! + !! ! " " Eqn"S6" Firing" a" very" large" number" of" probes" would" make"!! "vanishingly" small," but" the" timeh consuming"TJM"calculations"make"this"approach"impractical."We"can"however"use"error" propagation"of"Eqn"S5"to"obtain"!! ,"and"subsequently"get"!! "from"Eqn"S6."The"first"term" is"a"quotient,"for"which"the"relative"variance"is"equal"to"the"sum"of"the"relative"variances" for" the" nominator" and" denominator," i.e.) !! ! ! ! ! = !! ! ! + !! ! ! ." The" last" term"of"Eqn"S5"is"unity"and"has"therefore"no"impact"on)!! ! ."Hence)we)have) " !! ! Ω!"#$%& Ω !"# ! !!"#$%& = Ω!"#$%& ! !!"# + Ω !"# ! ." Eqn"S7" The"powerhlaw"calibration"does"not"preserve"the"precision"that"was"imposed"on"the"raw" PA" calculations" with" IMPACT," but" the" error" propagation" yields)!!"#$%& Ω!"#$%& = 11" " (!"#) (!"#) !!!"#$%& Ω!"#$%& .)Combining"this"with"Eqs"S6"and"S7,"and"noting"that"Ω!"#$%& Ω !"# ≈ 1)to"within"a"few"per"cent,"we"get"an"expression"for"assessing"the"accuracy"of"IMPACT" with"regards"to"TJM"from"an"observed"distribution"of"relative"errors:" " (!"#) !! = !! ! − !"!"#$%& (!"#) Ω!"#$%& ! !!"# − Ω !"# ! ." 
Eqn"S8" If"CCSs"calculated"by"using"IMPACT"are"to"be"employed"in"structure"modelling,""this" calculation"error"should"therefore"be"combined"with"the"experimental"uncertainty"to" establish"an"acceptable"CCS"difference"between"a"model"and"experiment,"or"restraining" potential." " Decoupling'the'evaluation'of'the'rotation'matrix'from'the'rotation'of'atoms' Affine" transformations," such" as" rotation," of" an" object" consisting" of" many" vertices" are" dramatically"faster"if"the"transformation"matrix"R"is"evaluated"prior"to"its"application"to" object"coordinates."For"3D"rotation"around"the"y"and"z"axes"only," " " cos ! cos ! R = sin ! cos ! − sin ! − sin ! cos ! 0 cos ! sin ! sin ! sin ! ." cos ! Eqn"S9" " When"R"is"applied"to"the"coordinates"r"of"an"atom"(in"the"case"of"molecular"objects)"to" generate"the"rotated"coordinates"r’,"i.e."r’T=Rr,"the"matrixhvector"product"comprises"the" evaluation"of"5"trigonometric"functions,"5"multiplications,"and"2"additions"each"for"x’"and" 12" " y’."The"rotated"z’"coordinate"is"irrelevant"for"the"PA,"and"its"evaluations"can"be"omitted" from"the"calculations."Sine"and"cosine"functions"take">100"clock"cycles"to"complete:"by" contrast,"multiplications"and"additions"take"≈2."If"the"trigonometric"elements"in"R"have" already" been" evaluated," its" application" of" the" rotation" matrix" only" comprises" the" evaluation" of" 3" multiplications" and" 2" additions" for" each" x’" and" y’." When" thousands" of" coordinates" are" rotated," this" order" of" evaluation" makes" subsequent" rotation" of" the" coordinates"considerably"faster"than"if"the"elements"of"R"are"evaluated"again"for"every" coordinate."As"such,"prehevaluation"of"R"yields"rotations"that"take"about"20"clock"cycles" per" coordinate," compared" to" about" 2,000" clock" cycles" per" coordinate." 
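The ordering of operations described here can be illustrated with a short Python sketch: the trigonometric functions are evaluated once per rotation, and only the first two rows of the matrix are applied to each atom because z' is not needed for the projection. The angle names `theta` (about y) and `phi` (about z) and both function names are assumptions made for this illustration; the element layout follows a rotation about y followed by a rotation about z.

```python
import math


def rotation_rows_xy(theta, phi):
    """Evaluate the trig functions once and return the x' and y' rows of R."""
    ct, st = math.cos(theta), math.sin(theta)
    cp, sp = math.cos(phi), math.sin(phi)
    row_x = (cp * ct, -sp, cp * st)
    row_y = (sp * ct, cp, sp * st)
    return row_x, row_y


def project_rotated(coords, theta, phi):
    """Rotate every (x, y, z) and keep only the projected (x', y') pair.

    Each atom costs 3 multiplications and 2 additions per output coordinate,
    with no trigonometric calls inside the loop.
    """
    (a, b, c), (d, e, f) = rotation_rows_xy(theta, phi)
    return [(a * x + b * y + c * z, d * x + e * y + f * z)
            for x, y, z in coords]
```

Calling `project_rotated` once per orientation keeps the cost per atom independent of the cost of the sine and cosine evaluations.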
Factors such as compiler optimizations make the gain from pre-calculating the rotation matrix difficult to determine precisely, but the effect is demonstrably significant (Williams et al., 2009).

The effect of octrees on the speed of the calculation

To assess the effect of octrees (Fig 2A, Supplementary Figure S2A-C), we processed the PiQSi data set, comprising 1,755 manually curated structures from the PDB (Levy, 2007), with IMPACT. 100,000 probes were fired at every structure for a range of octree depths $D \in [0, 3]$. To see the effect of changing D with larger structures we selected a set of challenging high molecular weight complexes: the Thermus thermophilus 70S ribosome (Voorhees et al., 2009), Satellite Tobacco Necrosis Virus (STNV) with the genome modelled into the capsid (Jones and Liljas, 1984; Larsson, 2012), an intact vault particle from rat liver (Tanaka et al., 2009), and the Adenovirus capsid (Reddy et al., 2010) (Table S2, Fig 2B). We ran these through IMPACT at $D \in [0, 5]$, 100 times per value of D.

Overall, the analysis revealed a clear dependence of the optimum octree depth on the number of atoms (Supplemental Figure S2D). Above approximately 2×10³ atoms $D_{\mathrm{opt}} = 1$, and above approximately 1×10⁴ atoms $D_{\mathrm{opt}} = 2$. For higher D the data are too sparse for reliable estimation of $D_{\mathrm{opt}}$. Based on this empirical relation $D_{\mathrm{opt}}$ is automatically selected when IMPACT is run.

Benchmarking computational performance of IMPACT

In another batch of calculations, we used IMPACT, CCSCalc, and MOBCAL to compute the CCS of NV in order to benchmark IMPACT's performance. To estimate the average wall time we ran IMPACT 1000 times.
To circumvent CCSCalc's fixed random seed, which yields exactly the same CCS and approximately the same wall time on every run for a given structure, we generated a series of NV structures by rotating NV 5 degrees at a time around the [1,1,1] axis, the rotation angle ranging from 0 to 180 degrees. We ran CCSCalc from the command line with the convergence threshold set to 1% for each structure, allowing for the determination of an average wall time. Note that CCSCalc employs a different convergence criterion, which is not based on the standard error of the mean. However, emulating CCSCalc's convergence criterion with IMPACT yielded a similar difference in performance. Using MOBCAL, we generated 500 separate TJM calculations, with 1000 trajectories for each, and pooled them into block averages. We varied the block size until the standard error of the mean CCS within the blocks was below 1%. From the resulting block size and the time required for a single TJM calculation we estimated the wall time for TJM. Analogously, we also pooled 100 separate exact hard-sphere scattering (EHSS) and PA calculations with 25,000 trajectories per run into block averages. As such, despite a rugged CCS distribution for TJM (Bleiholder et al., 2011), we could still extract a converged standard error of the mean and therefore also a good estimate for the time to reach 1% error for these methods. The resulting average wall times are shown in Fig. 2C.

Calculating the CCS of all biological assemblies in the PDBe and PiQSi

We processed all biological units in the PDBe using IMPACT, with automatic octree depth determination. The dataset comprised 317,424 files containing 266,527 models, of which 266,516 were at least part biomolecular. When a file contained multiple models, i.e. for NMR ensembles, we calculated the CCS of individual molecules separately. We processed the PiQSi data set in an identical manner, with the structure 2ULL omitted from subsequent analysis since it was found to contain multiple superimposed structures not separated into individual models. We calculated the masses of every biological assembly in the PDBe and PiQSi as the sum of all residue masses based on the Chemical Compound Directory, corrected for the mass loss corresponding to one water molecule per peptide or phosphodiester bond formed during polymerisation.

Calculating an effective density from a collision cross-section

The solution density of proteins, $\rho_s$, can be inferred from crystal structures through Connolly volumes (Quillin and Matthews, 2000) or Voronoi diagrams (Tsai et al., 1999). The gas-phase density on the other hand is sometimes estimated under the assumption that proteins are approximately spherical (Bush et al., 2010; Kaddis et al., 2007; Ruotolo et al., 2008), relating the CCS to the radius as $\Omega = \pi r^2$. Since $V = 4\pi r^3 / 3$ and $\rho = m / V$, we can express the CCS as:

$$\Omega = \pi \left(\frac{3m}{4\pi\rho}\right)^{2/3} \qquad \text{(Eqn S10)}$$

Thus an effective density, $\rho_{\mathrm{eff}}$, can be inferred from $\Omega$ and m under this assumption. The CCS for a perfect sphere as formulated above does not, however, take into account the finite size of the buffer gas particles. If the latter have a radius $r_g$, we get $\Omega = \pi \left( \left( 3m / (4\pi\rho) \right)^{1/3} + r_g \right)^2$, yielding CCSs that are more easily compared with those calculated for atomistic structures, especially for structures with low mass.

The effective density inferred via Eqn S10 from experimental CCS has been reported to be approximately a factor of two lower than the density $\rho_s$ for crystal structures. To see whether this difference arises from the assumption of spherical proteins we inspected the quotient $\Omega / \Omega_{\mathrm{sphere}}$, where $\Omega$ is the CCS calculated from a native structure, and $\Omega_{\mathrm{sphere}}$ is the CCS we expect for a sphere of the same mass and a density $\rho_s$, taking into account the buffer gas radius. Using Eqn S10, we get $\Omega / \Omega_{\mathrm{sphere}} = (\rho_s / \rho_{\mathrm{eff}})^{2/3}$, which we rearrange into:

$$\rho_{\mathrm{eff}} = \rho_s \left(\frac{\Omega}{\Omega_{\mathrm{sphere}}}\right)^{-3/2} \qquad \text{(Eqn S11)}$$

Eqn S11 allows us to emulate how $\rho_{\mathrm{eff}}$ was inferred from the CCS under a spherical assumption (Bush et al., 2010; Kaddis et al., 2007; Ruotolo et al., 2008).

Calculating the collision cross-section of EM density maps

As a test case for computing the CCS of non-atomistic structural data with IMPACT, we chose the density map of GroEL solved by single particle reconstruction at a resolution of 5.4 Å (EMD code 1457) (Stagg et al., 2008) (Figure 4A). We constructed bead models from the map by replacing all voxels with an electron density above a pre-set threshold by spheres with volumes corresponding to the EM grid spacing. We performed this process at different intensity thresholds ranging from 0 to 10.5 to generate 500 models, for each of which the CCS was calculated with IMPACT.

Calculating the collision cross-section of SAXS bead models

We generated SAXS bead models using the ATSAS package (Svergun et al., 1995; Volkov and Svergun, 2003) as follows. First, we simulated a SAXS curve from a crystal structure (PDB code 1OEL) using CRYSOL. Then we generated a distance distribution from the SAXS curve with GNOM, followed by construction of 10 separate ab initio models with GASBORI. We superimposed these models and averaged them with DAMSUP and DAMAVER, and filtered the resulting average model using 100 different target volumes ranging from 748,200 Å³ to 1,870,500 Å³ using DAMFILT. We used IMPACT to calculate the CCS for each of these models, setting the bead radius to 4.5 Å, which corresponds to half the bead spacing of the filtered models.
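The voxel-to-bead conversion used for the EM density map above can be sketched as follows. This is a hypothetical illustration, not the authors' script: it assumes a cubic grid stored as nested lists, and it reads "spheres with volumes corresponding to the EM grid spacing" as spheres whose volume equals that of one voxel. For the SAXS models the bead radius was instead fixed at half the bead spacing, which would replace the radius line. The function name `voxels_to_beads` and the `origin` parameter are assumptions.

```python
import math


def voxels_to_beads(density, spacing, threshold, origin=(0.0, 0.0, 0.0)):
    """Turn every voxel above `threshold` into a bead ((x, y, z), radius).

    density: nested lists indexed as density[i][j][k] on a cubic grid.
    spacing: grid spacing in angstrom.
    """
    # Assumed reading: each bead has the same volume as one voxel (spacing**3).
    radius = spacing * (3.0 / (4.0 * math.pi)) ** (1.0 / 3.0)
    beads = []
    for i, plane in enumerate(density):
        for j, row in enumerate(plane):
            for k, value in enumerate(row):
                if value > threshold:
                    centre = (origin[0] + i * spacing,
                              origin[1] + j * spacing,
                              origin[2] + k * spacing)
                    beads.append((centre, radius))
    return beads
```

Sweeping `threshold` over a range of values and passing each resulting bead list to a CCS calculation would then mimic the series of models described in the text.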
" Calculating'the'collision'cross*sections'of'NMR'ensembles' 17" " We"analysed"the"NMR"ensembles"2K39"(Lange"et"al.,"2008)"and"2KOX"(Bryn"Fenwick"et" al.,"2011)"of"ubiquitin"using"default"parameters,"yielding"one"CCS"distribution"for"each"of" the" two" ensembles." We" used" the" fullhwidthhathhalfhmaximum" of" an" experimental" CCS" distribution" for" the" 7+" charge" state" of" ubiquitin" (Wyttenbach" and" Bowers," 2011)" to" produce"a"Gaussian"distribution"representing"the"experimentally"observed"peak"width." We" generated" another" Gaussian" using" a" resolving" power" of" 130," for" a" single" conformation" (Koeniger" et" al.," 2006)." To" facilitate" comparison" of" peak" shapes," we" normalised"each"distribution"by"their"respective"average"CCS"(Fig"4B)." " Calculating'the'collision'cross*sections'of'molecular*dynamics'trajectories' To"test"whether"IMPACT"would"be"capable"of"calculating"CCS"onhthehfly"throughout"an" MD" simulation" we" analysed" a" 15" ns" trajectory" from" a" simulation" of" lysozyme" in"vacuo" (Marklund" et" al.," 2009)" by" calculating" the" CCS" with" IMPACT" at" 10" ps" intervals" to" a" precision"of"𝜏"="0.5"%"(Figure"4ChD)."To"see"if"the"variations"in"CCS"were"reflected"in"the" radius" of" gyration," RG," for" the" protein," we" calculated" RG" for" each" conformation" in" the" trajectory"using"the"Gromacs"simulation"package"(Pronk"et"al.,"2013)." " " 18" " SUPPLEMENTAL'REFERENCES' Jones," T.A.," and" Liljas," L." (1984)." Structure" of" satellite" tobacco" necrosis" virus" after" crystallographic"refinement"at"2.5"A"resolution."J"Mol"Biol"177,"735h767." Larsson," D." (2012)." Exploring" the" molecular" dynamics" of" proteins" and" viruses" (Acta" Universitatis"Upsaliensis)." 
Pronk, S., Pall, S., Schulz, R., Larsson, P., Bjelkmar, P., Apostolov, R., Shirts, M.R., Smith, J.C., Kasson, P.M., van der Spoel, D., et al. (2013). GROMACS 4.5: a high-throughput and highly parallel open source molecular simulation toolkit. Bioinformatics 29, 845-854.

Quillin, M.L., and Matthews, B.W. (2000). Accurate calculation of the density of proteins. Acta Crystallogr Sect D: Biol Crystallogr 56, 791-794.

Reddy, V.S., Natchiar, S.K., Stewart, P.L., and Nemerow, G.R. (2010). Crystal structure of human adenovirus at 3.5 Å resolution. Science 329, 1071-1075.

Stagg, S.M., Lander, G.C., Quispe, J., Voss, N.R., Cheng, A., Bradlow, H., Bradlow, S., Carragher, B., and Potter, C.S. (2008). A test-bed for optimizing high-resolution single particle reconstructions. J Struct Biol 163, 29-39.

Tanaka, H., Kato, K., Yamashita, E., Sumizawa, T., Zhou, Y., Yao, M., Iwasaki, K., Yoshimura, M., and Tsukihara, T. (2009). The structure of rat liver vault at 3.5 Å resolution. Science 323, 384-388.

Tsai, J., Taylor, R., Chothia, C., and Gerstein, M. (1999). The packing density in proteins: standard radii and volumes. J Mol Biol 290, 253-266.

Voorhees, R.M., Weixlbaumer, A., Loakes, D., Kelley, A.C., and Ramakrishnan, V. (2009). Insights into substrate stabilization from snapshots of the peptidyl transferase center of the intact 70S ribosome. Nat Struct Mol Biol 16, 528-533.