Proteomics of body liquids as a source for potential methods for medical diagnostics

Prof. Dr. Evgeny Nikolaev
Institute for Biochemical Physics, Rus. Acad. Sci., Moscow, Russia
Institute for Energy Problems of Chemical Physics, Rus. Acad. Sci., Moscow, Russia

High-throughput proteome analyses by tandem mass spectrometry methods

[Figure: workflow from proteins through digestion to peptides, HPLC/MS and MS/MS, MS/MS spectra (parent and fragment ion intensities), protein database search (Mascot), and protein and peptide identifications. Example ion-trap MS/MS spectrum, scan S14_1 #3422, RT 52.14: relative abundance vs. m/z, 200-1200]

Problems of methods based on MS/MS identification
- Sensitivity is lost: only MS/MS spectra are informative, and their intensity is at least ~10-fold lower than the intensity of MS spectra
- It is not possible to detect all peptides in one run
- The extra time for fragment spectra measurements lengthens the chromatography (application of UPLC is questionable for some types of MS instruments)

[Figure: relative amounts of new peptide identifications during several consecutive LC-MS runs (1, 2, 3) with the same sample]

The other possibility in proteomics: use of mass spectrometry with high mass measurement accuracy

An ion cyclotron resonance mass spectrometer can measure masses with sub-ppm accuracy.

[Figure: FTMS data (from Alan Marshall, NHMFL). Instrument schematic: linear ion trap, 7 T magnet, electron gun, IR laser]

Other mass spectrometers with high accuracy of mass measurements are available now
- Orbitraps
- Q-TOFs
- BRUKER micrOTOF-QII
- ...

Mass accuracy: 1-2 ppm (internal calibration), 5 ppm (external calibration)
Resolution: 20 000-60 000 FWHM
Rate of mass spectra measurements: >20 Hz

At an accuracy level of 1 ppm, the elemental composition of a peptide with mass up to 600 Da and the amino acid composition of a peptide with mass up to 500 Da can be determined almost unambiguously. This is not enough for peptide identification!

Accurate mass tag retention time (Dick Smith group, PNNL)

Besides the mass, we have another tag: the LC retention time. An accurate mass tag together with the retention time can identify a peptide practically unambiguously!

LC reproducibility (Agilent 1100)

[Figure: FTMS base peak chromatograms (full MS, m/z 350-2000, RT 46.10-80.40 min) of four urine runs. The main peaks elute at nearly the same times in every run: 55.89, 55.66, 55.65 and 55.90 min for the base peak; 60.73, 60.45, 60.64 and 60.93 min for the next major peak]

Vasorin (Homo sapiens protein): ...TGLYCESQTPRSLTLGIEPVSPTSLRVGLQRYVQLRSLR...

Trypsinolysis gives the peptides ...TGLYCESQTPR, SLTLGIEPVSPTSLR, VGLQR, YVQLR and SLR.

Fragment (463-477) from Vasorin: SLTLGIEPVSPTSLR

The fragment is analysed by LC-FTICR and by LC-MS/MS (e.g.
with ion trap).

[Figure: MS/MS fragment spectrum with labelled b ions (b6-b14) and y ions (y4-y13), m/z 450-650; accurate-mass spectrum, m/z 200-1800, with an inset at m/z 522.5-525.0; the steps are labelled "identification" and "validation"]

Accurately measured mass: 1568.8768
Putative mass tag from Homo sapiens: SLTLGIEPVSPTSLR, calculated mass 1568.8773, and measured retention time
Validated accurate mass tag: SLTLGIEPVSPTSLR

Thus, the general idea is to use MS/MS to create a database of accurate mass tags and retention times that serves as a reference base for quantitative proteomics.

Analyses of urine proteome

Urine is available in large quantities, which makes it an ideal analyte for noninvasive diagnostics. The possibility of biomarker discovery is attracting a lot of attention.

1500 proteins!!! (from Mann's group: Adachi et al., Genome Biology 2006, Volume 7, Issue 9, Article R80)

Accurate mass tag retention time approach

[Figure: scheme linking the lab (FT MS) and the clinic (ESI Q-TOF, ESI TOF)]

Statistics of the collected AMT tags in the urine proteome

233 LC-MS (liquid chromatography coupled with mass spectrometry) runs in total (80% from men and 20% from women), and 25 samples from each of 6 volunteers in long-term isolation experiments (collected over 19 weeks), have been gathered so far.
Number of peptides in the database: 2758
Number of urine proteins in the database: 840

Two kinds of sample donors

People "from the street" and people in "special conditions".

- General blood analysis
- Examination by an internist
- Blood pressure measurement
- Current control for urogenital and other pathology, including kidney pathology, prostatitis, arterial hypertension, diabetes
- Decision to include a person in the study group
- Analysis of archival information from medical records
- Control for treatment with diuretics and excessive consumption of fluids

Data recorded for each sample
1. Number
2. Name
3. Date of birth
4. Sex
5. Date of urine collection
6. Time of urine collection
7. Current smoking status (+/-)
8. Sample volume
9.
Clinical parameters (other diseases)
10. Results of testing for bilirubin, urobilinogen, ketones, glucose, protein, blood, nitrite, pH, specific gravity, leukocytes

For the "healthy people database" subset we need urine samples from persons on a well-controlled diet and with a healthy lifestyle. In this case we can test urine temporal variability and polymorphism. Those are people participating in long-term isolation experiments in the frame of space research programs.

[Figure: ground-based experimental facility, April-July 2009]

- Urine collection
- Centrifugation
- Sample concentration (Amicon Ultra Ultracel-15, 3 k)
- Desalting and major protein removal
- Carboxymethylation and trypsinolysis
- LC-MS analyses

Search engine: Mascot
Database: IPI.Human v.3.52
Parent tolerance: ± 5.0 ppm (monoisotopic)
Fragment tolerance: ± 0.50 Da (monoisotopic)
Fixed modifications: Carbamidomethyl (C)
Variable modifications: Oxidation (M)
Digestion enzyme: Trypsin
Max missed cleavages: 2
Instrument type: Ion-trap

What is in the DB
• Run in which the peptide was identified
• Peptide sequence
• Protein the sequence belongs to
• Mascot score
• Modifications
• Measured mass
• Theoretical mass
• Measured charge
• RT at which the peptide began to elute from the column
• RT at which the peptide finished eluting

Retention time normalization

Normalization is the alignment of the time scale across a series of experiments. Several types of normalization are possible:
- By an added calibrant: external calibration (e.g. cytochrome C)
- By theoretically predicted RTs
- By peptides that are always present in the samples (for example, peptides of the digestion enzyme)
We have chosen the last one, as it is rather robust and does not require any additional sample treatment. RTs are renormalized every time a new run is added to the database.

Normalization for runs without MS/MS

HPLC is considered to be linear, so different masses should retain their elution order from run to run.
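If elution order is retained, a run without MS/MS can be aligned to the database by finding the longest series of masses that occurs in the same order in both RT-sorted lists, and then fitting a straight line through the matched retention times. The following is only a minimal sketch of that idea, not the implementation used for the database. The function names (`within_ppm`, `lcs_pairs`, `linear_fit`) are hypothetical, the retention times are invented for illustration, and the 5 ppm tolerance is taken from the search settings above.

```python
from typing import List, Tuple

Peak = Tuple[float, float]  # (mass, retention time in minutes)


def within_ppm(m1: float, m2: float, tol_ppm: float = 5.0) -> bool:
    """True if two masses agree within tol_ppm parts per million."""
    return abs(m1 - m2) <= tol_ppm * 1e-6 * max(m1, m2)


def lcs_pairs(ref: List[Peak], run: List[Peak],
              tol_ppm: float = 5.0) -> List[Tuple[Peak, Peak]]:
    """Longest common subsequence of masses in two RT-sorted peak lists."""
    n, m = len(ref), len(run)
    # table[i][j] = length of the LCS of ref[i:] and run[j:]
    table = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n - 1, -1, -1):
        for j in range(m - 1, -1, -1):
            if within_ppm(ref[i][0], run[j][0], tol_ppm):
                table[i][j] = table[i + 1][j + 1] + 1
            else:
                table[i][j] = max(table[i + 1][j], table[i][j + 1])
    # Walk the table from the start to recover the matched peaks.
    pairs: List[Tuple[Peak, Peak]] = []
    i = j = 0
    while i < n and j < m:
        if within_ppm(ref[i][0], run[j][0], tol_ppm):
            pairs.append((ref[i], run[j]))
            i += 1
            j += 1
        elif table[i + 1][j] >= table[i][j + 1]:
            i += 1
        else:
            j += 1
    return pairs


def linear_fit(xs: List[float], ys: List[float]) -> Tuple[float, float]:
    """Ordinary least-squares line y = slope * x + intercept."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Hypothetical reference list (database peptides) and run without MS/MS,
# both sorted by retention time. The run contains unrelated masses and an
# early peak at 1150.3 that breaks the elution order.
reference = [(758.1, 20.0), (1024.5, 30.0), (1575.1, 40.0), (1150.3, 50.0)]
new_run = [(1150.3, 21.0), (758.1, 24.0), (878.1, 29.0), (1024.502, 34.0),
           (2330.9, 39.0), (1575.1, 44.0), (1150.3, 54.0)]

pairs = lcs_pairs(reference, new_run)
# Map the run's time scale onto the reference time scale.
slope, intercept = linear_fit([r[1] for _, r in pairs],
                              [d[1] for d, _ in pairs])
```

In this toy data the early 1150.3 peak is ignored because using it would shorten the common subsequence, so only peaks that keep the reference elution order enter the fit.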
We can use pivots and look for the same sequence of masses in the run without MS/MS; our goal is to find the longest common subsequence.

[Figure: average NETs (normalized elution times) of known peptides, sorted by RT, next to the RT-sorted peak list of run 5 without MS/MS. Masses shown include 758.1, 878.1, 1024.5, 1150.3, 1575.1 and 2330.9. The elution sequence of the known masses is retained]

Normalization of runs without MS/MS

RT correlation between all masses within 5 ppm
• 2 runs of different urine samples performed 1 day apart
• Plotted are the RTs of all masses matching within a 5 ppm tolerance
• The linear least-squares fit is y = 0.8988x + 3.704 with R² of only 0.7774, which is poor

[Figure: t2 (min) vs. t1 (min) for all masses matching within 5 ppm]

RT correlation of the longest common subsequence
• The linear least-squares fit is y = 0.9838x + 0.8356 with R² = 0.9996, which means we have an almost perfectly linear correlation between the 2 datasets

[Figure: t2 (min) vs. t1 (min) for the longest common subsequence]

The total number of identified proteins in the database during its creation/filling stage. Vertical blue arrows show steps of equal protein count increase; the length of the horizontal arrows, parallel to the abscissa, is proportional to the time needed to identify an additional protein.

[Figure: number of identified proteins (0-700) vs. number of LC-MS runs (0-120)]

Smokers vs. non-smokers urine proteome

Current statistics of the urinary proteome database

233 LC-MS (liquid chromatography coupled with mass spectrometry) runs in total: 102 with samples from smokers, 131 with samples from non-smokers.
Using all peptides

              Peptides   Proteins
Non-smokers   2527       762
Smokers       1893       627
Total         2758       840

Overlap (only non-smokers / shared / only smokers):
Peptides: 865 / 1662 / 231 (40%)
Proteins: 213 / 549 / 78 (35%)

Using all peptides

            Odd    Even   Non-smokers
Peptides    2232   2306   2535
Proteins    445    467    506

Overlap (only odd / shared / only even):
Peptides: 229 / 2003 / 303 (20%)
Proteins: 49 / 406 / 61 (21%)

Using all peptides

            Selection1   Selection2   Smokers
Peptides    1723         1588         1894
Proteins    365          337          400

Overlap (only Selection1 / shared / only Selection2):
Peptides: 306 / 1417 / 171 (25%)
Proteins: 63 / 302 / 35 (25%)

Differences in the numbers of observed proteins in the urine of smokers and non-smokers participating in particular biological processes

Transport, homophilic cell adhesion, lipid metabolic process, inflammatory response, innate immune response, epidermis development, defense response
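The unique and shared counts in the comparison tables follow from the group totals by inclusion-exclusion. The sketch below shows that arithmetic for the smokers vs. non-smokers table. The names `Overlap`, `overlap_from_counts` and `percent_not_shared` are hypothetical, and reading the quoted percentage as the share of entries found in only one group is an assumption; it happens to reproduce the 40% and 35% given for that table.

```python
from typing import NamedTuple


class Overlap(NamedTuple):
    only_a: int   # found only in group A
    shared: int   # found in both groups
    only_b: int   # found only in group B


def overlap_from_counts(n_a: int, n_b: int, n_union: int) -> Overlap:
    """Split two group totals and their union into the three Venn regions."""
    shared = n_a + n_b - n_union  # inclusion-exclusion
    return Overlap(only_a=n_union - n_b, shared=shared, only_b=n_union - n_a)


def percent_not_shared(o: Overlap) -> float:
    """Share of all entries seen in only one group (assumed meaning of the %)."""
    total = o.only_a + o.shared + o.only_b
    return 100.0 * (o.only_a + o.only_b) / total


# Totals from the table: non-smokers, smokers, both groups combined.
peptides = overlap_from_counts(2527, 1893, 2758)
proteins = overlap_from_counts(762, 627, 840)

print(peptides, round(percent_not_shared(peptides)))
print(proteins, round(percent_not_shared(proteins)))
```

The same function can be applied to the odd/even and Selection1/Selection2 splits to check how much two halves of one group differ from each other.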