MS-DIAL Tutorial
Last edited on 2015/5/01

ABSTRACT
Novel mass spectrometers perform ultra-fast, accurate data acquisition at the MS and MS/MS levels, either without selecting specific precursor ions (as in SWATH approaches) or by integrating different collision energy levels in MS/MS spectral acquisition (as in MSE or all-ions approaches). Such data-independent MS/MS approaches provide richer information content than classic data-dependent MS/MS experiments. MS-DIAL aims to provide a complete solution for both data-dependent and data-independent MS/MS experiments in metabolomics, lipidomics, and proteomics research. It features (1) spectral de-convolution for data-independent MS/MS, (2) streamlined criteria for peak identification, (3) support for all data processing steps from raw data import to statistical analysis, and (4) a user-friendly graphical user interface. MS-DIAL has been developed as a collaboration between the team of Prof. Masanori Arita (RIKEN, Reifycs Inc.) and the team of Prof. Oliver Fiehn (UC Davis), supported by the JST/NSF SICORP “Metabolomics for the low carbon society” project.

Hiroshi Tsugawa
RIKEN Center for Sustainable Resource Science
[email protected]

(Figure: MS-DIAL screenshot)

Table of Contents
Required software programs and files
    Downloading the ABF converter from Reifycs Inc.
    File conversion
    The result of ABF converter: Centroid or Profile?
    MSP format MS/MS library
    Text format retention time and accurate mass library for post identification
Starting MS-DIAL
    Starting up your project
    Importing ABF files
    Setting parameters
        Data collection tab
        Peak detection tab
        Deconvolution tab
        Identification tab
        Adduct tab
        Alignment tab
MS-DIAL viewer
    Mouse function
    Peak viewer
    Display filter
    Alignment viewer
    MS/MS spectrum viewer
Compound search
Normalization and Statistical analysis
Menu
    File
        New project
        Open project
        Save project
        Save parameter setting
    Data processing
    Identification
    View
    Option
    Export

Software Environments
Microsoft Windows XP, Vista, 7 or 8
.NET Framework 4.0 or later

Required software programs and files
Reifycs Analysis Base File Converter (ABF file converter)*
    Download link: http://www.reifycs.com/english/AbfConverter/index.html
MS-DIAL
    Download link: http://prime.psc.riken.jp/Metabolomics_Software/MS-DIAL/index.html
Reference library for compound identification (msp format file)
    Example library link: http://prime.psc.riken.jp/Metabolomics_Software/MS-DIAL/index.html
Demonstration file
    Download link: http://prime.psc.riken.jp/?action=drop_index

*MS-DIAL imports the “analysis base file (abf)” format. The file converter is freely available from the above link. The ABF file converter and MS-DIAL have been tested on the MS platforms from Agilent Technologies, AB Sciex, Bruker Daltonics, Waters, and Thermo Fisher Scientific.
*MS-DIAL has been validated under the following conditions:
Data dependent MS/MS acquisition: Agilent Technologies, AB Sciex, Bruker Daltonics, Waters, and Thermo Fisher Scientific
Data independent MS/MS acquisition: Agilent Technologies (All-ions), AB Sciex (SWATH), Waters (MSE) and Thermo Fisher Scientific (All Ion Fragmentation)
(Positive/negative switching mode has not been tested yet.)
For LECO Citius: Please convert the raw files to mzML via the vendor’s software. Then convert the mzML files to ABF files with the ABF file converter.
*2015/5/1: We are currently fixing the converter program for Waters-MS. The problem will be fixed as soon as possible.

Downloading the ABF converter from Reifycs Inc.
1. Go to http://www.reifycs.com/english/AbfConverter/.
2. Check the requirements and license terms, and download the converter.

File conversion
1. Start “AnalysisBaseFileConverter.exe”.
2. Drag & drop MS vendor files into this program.
3. Click “Convert”.
4. The ABF files are generated in the same directory as the raw data files.

The result of ABF converter: Centroid or Profile?
As long as the default settings of each MS instrument are used, the ABF converter exports the vendor's file as follows:
AB Sciex Q-TOF: Profile
Thermo Q-Exactive: Profile
Agilent LC-QTOF: Centroid
Bruker LC-QTOF and FT-ICR: Centroid
Waters LC-Xevo QTOF or Synapt: Centroid
mzML: depends on the export method of the ProteoWizard program etc.
If centroid data are stored in the vendor's raw file (for example, in the .D folder), the ABF converter tries to export the centroid data instead of the profile data to the ABF file. However, it is much better to check the result in MS-DIAL in the following way:
1. Start an MS-DIAL project in 'Centroid' mode with only one file.
2. Look at the MS1 or MS2 spectrum in the MS-DIAL peak viewer and check the 'shape' of the spectrum.
3. If the shape looks like profile data, restart your project in 'Profile' mode.

MSP format MS/MS library
MS-DIAL supports the MSP format (http://www.nist.gov/srd/upload/NIST1a11Ver2-0Man.pdf) in ASCII text. In addition, the software can use the “RETENTIONTIME:”, “PRECURSORTYPE:”, and “FORMULA:” fields for metabolite identification (field names are case-insensitive). Please give the retention time in minutes [min] if available. The adduct ion information, here called ‘Precursor type’, is used by the adduct ion search algorithm (see also the adduct ion format* below).
* Adduct ion format: [M+Na]+, [M+2H]2+, [M-2H2O+H]+, [2M+FA-H]-, etc.
1. The square brackets ‘[’ and ‘]’ must be used to enclose the ion information.
2. A '+' or '-' sign is required after ']', and the charge number, if any, must be written before the sign.
3. When you define an organic formula such as C6H12O5, write it without repeated elements or parentheses. Do not write forms such as [M+C2H5COOH-H] or [M+H+(CH3)3SiOH].
4. A leading number in an organic formula, as in 2H2O, is read as a multiplier (H2O × 2). Again, never write 2(H2O) for that.
5. Sequential terms are acceptable: [2M+H-C6H12O5+Na]2+ (very apt.)
6. MS-DIAL accepts some abbreviations or common organic formulas for adduct types as follows.
For Acetonitrile: ACN, CH3CN
For Methanol: CH3OH
For Isopropanol: IsoProp, C3H7OH
For Dimethyl sulfoxide: DMSO
For Formic acid: FA, HCOOH
For Acetic acid: Hac, CH3COOH
For Trifluoroacetic acid: TFA, CF3COOH

Text format retention time and accurate mass library for post identification
MS-DIAL also supports a tab-delimited text library for peak identification by retention time and MS1 accurate mass. This identification is performed after the peak identification based on the MSP format library has finished, which is why we call it “post identification”. The first row should contain the header. The first, second, and third columns should be the metabolite name, accurate mass [Da], and retention time [min], respectively. This library can easily be made in Microsoft Excel: add the compound information and save the sheet as “Tab delimited text format”. This option is useful for identifying internal standards etc. (Even if you do not have MS/MS libraries, peak identification based on retention time and accurate mass is available through this option.)

Starting MS-DIAL
1. Starting up your project
2. Importing ABF files
3. Setting parameters
4. Running the software (1-2 min / sample)
*The tutorial uses 23 demonstration files and the lipid reference library, which are downloadable from the above link.
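As an aside on the adduct ion format rules listed in the MSP library section, the bracket, charge sign, and no-parentheses rules can be checked mechanically before a library is loaded. The following is a minimal sketch, not MS-DIAL's own parser: the function name `parse_adduct` and the regular expressions are illustrative assumptions, and the rule about repeated elements and the abbreviation table are deliberately not checked.

```python
import re

# Assumed pattern: "[", optional multimer count, "M", signed terms, "]",
# optional charge number, then a mandatory "+" or "-" sign.
ADDUCT_PATTERN = re.compile(
    r"^\[(\d*)M((?:[+-]\d*[A-Za-z][0-9A-Za-z]*)*)\](\d*)([+-])$"
)
# One signed term: sign, optional leading multiplier, formula or abbreviation.
TERM_PATTERN = re.compile(r"([+-])(\d*)([A-Za-z][0-9A-Za-z]*)")


def parse_adduct(text):
    """Return the parts of an adduct string, or None if it breaks the format rules."""
    match = ADDUCT_PATTERN.match(text.strip())
    if match is None:
        return None  # missing brackets, missing charge sign, or parentheses used
    multimer, terms, charge, sign = match.groups()
    return {
        "multimer": int(multimer) if multimer else 1,
        "terms": [
            (term_sign, int(count) if count else 1, formula)
            for term_sign, count, formula in TERM_PATTERN.findall(terms)
        ],
        "charge": int(charge) if charge else 1,
        "sign": sign,
    }


for candidate in ["[M+Na]+", "[M+2H]2+", "[2M+FA-H]-", "[M+H+(CH3)3SiOH]"]:
    print(candidate, "->", parse_adduct(candidate))
```

A leading number in a term, as in `+2H`, is stored as a multiplier, which mirrors rule 4. Strings without a trailing charge sign or with parentheses return `None`.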
The common measurement conditions of the demonstration files were as follows.
Liquid chromatography: total 15 min run per sample with a Waters Acquity UPLC CSH C18 column (100×2.1 mm; 1.7 μm).
Mass spectrometer: SWATH method in negative ion mode.
MS1 accumulation time, 100 ms
MS2 accumulation time, 10 ms
Collision energy, 45 V
Collision energy spread, 15 V
Cycle time, 731 ms
Q1 window, 21 Da
Mass range, m/z 100-1250

Starting up your project
1. File -> New project
2. Set your project file path to the directory of your ABF files.
3-1. Select your method type, either “data-dependent” or “independent.”
3-2. (data-independent mode only) Make an experiment file* and select it. To follow this tutorial, please select ABSciex_Experiment_Information_CSH21Da.txt.
4. Choose the data type, either “profile” or “centroid.”
5. Choose the ion mode, either “positive” or “negative.”
6. Choose the target omics, either “metabolomics” or “lipidomics.” If you select a ‘lipidomics’ project, you do not have to prepare a NIST MSP format library; you only have to select what you want to find in your data sets. If you select a ‘metabolomics’ project, your own MSP file is required for compound identification.
*In the case of SWATH data-independent analysis, the experiment file can be made in PeakView (Show -> sample information). Please never change its format. (“SCAN” and “SWATH” must be in capital letters.) Even if you use a data-independent method other than SWATH, keep the word “SWATH” and change only the m/z range information.

Importing ABF files
1. Select the ABF files.
2. If a file is a “quality control” sample for peak alignment, set its type as such.
Note: Please finalize your file names here, because you cannot change them later.

Setting parameters
Data collection tab
Data collection parameters: You can set the analysis ranges (RT and MS1 axes). For example, if your expected data range is 0.5-10 min and 100-1250 Da, set the parameters accordingly.
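The effect of the analysis ranges can be pictured as a simple window on the retention time and m/z axes. The sketch below only illustrates that idea with the 0.5-10 min and 100-1250 Da example. The `Peak` class, the function name, and the inclusive bounds are assumptions for illustration and are not taken from MS-DIAL.

```python
from dataclasses import dataclass


@dataclass
class Peak:
    rt_min: float  # retention time in minutes
    mz: float      # MS1 m/z
    height: float  # peak height (arbitrary units)


def within_analysis_range(peak, rt_range=(0.5, 10.0), mz_range=(100.0, 1250.0)):
    """True if the peak lies inside both ranges (bounds treated as inclusive here)."""
    return (rt_range[0] <= peak.rt_min <= rt_range[1]
            and mz_range[0] <= peak.mz <= mz_range[1])


# Hypothetical peaks, not real measurements.
peaks = [
    Peak(0.3, 450.2, 1.0e4),   # too early
    Peak(5.2, 760.5, 5.0e4),   # inside both ranges
    Peak(9.9, 1300.0, 2.0e4),  # m/z too high
    Peak(10.0, 100.0, 3.0e3),  # exactly on both bounds
]
kept = [p for p in peaks if within_analysis_range(p)]
print([(p.rt_min, p.mz) for p in kept])
```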
Centroid parameters: After the peak detection algorithm is applied along the MS axis with a very low threshold, MS-DIAL performs spectral centroiding. By default, the signal within ±0.01 Da (MS1) and ±0.1 Da (MS2) of each peak top is integrated. MS-DIAL provides another option to skip the peak detection before centroiding. To choose this option, tick the “peak detection-based” checkbox. This option integrates all spectral signals. If the accumulation time is too short for centroiding, this option is useful for capturing low-intensity spectra.

Peak detection tab
Peak detection parameters: By default, a linear-weighted moving average is used for peak detection to accurately determine the left and right edges of each peak. The recommended smoothing level is 1 or 2. MS-DIAL provides two simple thresholds: minimum values for peak width and height. Peaks below these thresholds are ignored. For FT-ICR or Orbitrap data, the minimum peak height should be more than 20,000.
Peak spotting parameters: The width of the mass slice is set here. From our experience, 0.1 or 0.05 is suitable for Agilent Q-TOF, AB Sciex TripleTOF, and Thermo Q-Exactive. If you already know unwanted m/z peaks from columns or solvents, you can specify them in the “Exclusion mass list.”

Deconvolution tab
Baseline correction and de-convolution parameters: Please do not change the default values unless you fully understand the deconvolution process. The details are described in Supplementary Note 1. If you want to remove the product ions after the focused precursor ion (recommended for metabolomics and lipidomics), check “Exclude after precursor.”

Identification tab
Database: Set your MSP file here. (Tutorial data, if you select a ‘metabolomics’ project: LipidBlast_Nega_Algae_vs5.msp.) If you selected a ‘lipidomics’ project, please select what you want to find in your data sets for lipid profiling.
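Before an MSP file is set as the database, it can be useful to confirm that its records carry the optional fields mentioned earlier, such as the retention time and precursor type. The following is a minimal reader sketch, not MS-DIAL's own parser. It assumes "Key: value" header lines, one "m/z intensity" pair per peak line, and a blank line between records. Field names are upper-cased because the tutorial states that case is ignored. The sample record is a placeholder, not library data.

```python
def parse_msp(text):
    """Parse MSP-like text into a list of {'fields': dict, 'peaks': list} records."""
    records = []
    current = None
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line:
            # A blank line closes the record that is currently open, if any.
            if current is not None:
                records.append(current)
                current = None
            continue
        if current is None:
            current = {"fields": {}, "peaks": []}
        if line[0].isdigit():
            # Peak line: m/z and intensity separated by whitespace (assumed layout).
            parts = line.split()
            current["peaks"].append((float(parts[0]), float(parts[1])))
        elif ":" in line:
            key, value = line.split(":", 1)
            current["fields"][key.strip().upper()] = value.strip()
    if current is not None:
        records.append(current)
    return records


# Placeholder record for illustration only.
example = """
NAME: Example compound
PRECURSORMZ: 760.5
PrecursorType: [M-H]-
RetentionTime: 6.2
Num Peaks: 2
255.2 100
281.2 80
"""

for record in parse_msp(example):
    fields = record["fields"]
    print(fields["NAME"], fields.get("PRECURSORTYPE"), fields.get("RETENTIONTIME"), record["peaks"])
```

Records that lack `RETENTIONTIME` can be listed this way, which helps when deciding whether a narrow RT tolerance is meaningful for the library.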
Parameters: If your MSP file contains retention time (RT) information, set the RT tolerance value (default is 0.5). For example, the tutorial data (LipidBlast_Nega_Algae_vs5.msp) include RT information optimized for our 15 min LC method. If suitable RT information is unavailable, set the tolerance to 100 or larger (that is, larger than your LC run time). The two mass tolerances for MS1 and MS2 are required for the compound search, and they depend on your instrument performance. The cutoff of the identification score should be at least 0.7 or 0.8.
Text file: If you want to perform “post identification” processing, set your text file here. (Tutorial data: Lipid_Nega_IS_PostIdentification_vs1.txt)
Parameters: The meanings of the parameters are the same as for MSP based identification.
Advanced library search option: The options for your library search are defined here. In the current program (2014/11/30), there are two options for the library search.
MS/MS tab:
1. Relative abundance cut off: mass spectrum peaks below the user-defined value are not used for the MS/MS similarity calculation.
Post ident. tab:
2. Only report the top hit: Since several chromatogram peaks may be annotated as the same compound by the identification algorithm, this option keeps only one candidate among such multiple results, chosen by the identification score.

Adduct tab
Adduct ion setting: You can tick the adduct ions and charge values to be considered.

Alignment tab
Parameters: If you already have suitable quality control (QC) data, typically data from a mixed sample, specify the QC file here. All sample data will be aligned to this QC file. The RT and MS1 tolerances for peak alignment depend on your chromatographic conditions. Do not change these parameters unless you know the details of the procedure. If you want to remove specific peaks that are not fully detected in the alignment, specify the peak count filter.
For example, the tutorial data include at least 4 biological replicates with the same peak information, and the total number of files is 23. You may then set the peak count filter to (4/23)*100 = 17.4 %. This means that an aligned peak is removed when it is detected in fewer than 17.4% of the samples, that is, in fewer than 4 of the 23 files. If you can prepare many QC sample data, tick the “QC at least filter” box. A peak is then removed if it is missing in any of the QC samples. The “Gap filling option” must always be checked.
Note: When you execute the compound identification, the representative spectra with identification results are automatically chosen from the samples as the spectra with the highest identification scores.

MS-DIAL viewer
Mouse function
A) Mouse right click (or hold) and move: zoom in and out
B) Mouse left click (or hold) and move: select and scroll
C) Mouse left double click: reset range and select files in the file navigator
D) Mouse wheel: zoom in and out
E) Right click: popup context menu

Peak viewer
In the main viewer of MS-DIAL, the detected peak information is shown in the center window after a left double click on the file name in the File navigator. In the center window, each spot denotes a detected peak: blue spots are peaks of lower abundance in the sample, red spots are peaks of higher abundance, and green spots are peaks of middle abundance. The left window displays the MS1 spectrum of the focused peak, and the upper window displays the extracted ion chromatogram of the focused peak. The right window displays the MS/MS spectrum (blue or green) and the reference MS/MS spectrum (red). Other peak information is displayed in the top-right of this window.

Display filter
Label: You can display peak information such as retention time, accurate mass, metabolite name, adduct ion name and isotope ion in the center window of MS-DIAL. Shown below are examples.
Height filter: This filter is used to check the peak abundance.
Each peak is assigned a rank with respect to its peak abundance in the focused sample.
Display filter
1. “Identified” shows only identified peaks with the MS/MS spectrum.
2. “Annotated” shows only identified peaks without the MS/MS spectrum.
3. “Molecular ion” shows de-isotoped molecular ions only.
4. “MS/MS” shows only peaks having the MS/MS spectrum.

Alignment viewer
Alignment viewer: Each spot shows an aligned spot, which includes the retention time, accurate mass, intensity, and MS/MS spectrum of all samples. As in the Peak viewer, red, blue, and green “alignment” spots denote higher, lower, and middle abundance (on average) in the alignment, respectively. By clicking a spot, you can check the retention times and accurate masses of all aligned samples. The green spot is associated with the “detected” flag, showing whether all samples contain the spot. The red spot is associated with the “interpolated” flag, showing whether the software program filled in originally missing values. More details are given in Supplementary Note 1.

MS/MS spectrum viewer
This viewer is intended for data independent MS/MS analysis, except for the Act. vs. Ref. window.
Act. vs. Ref.: The upper spectrum (blue) displays the centroided MS/MS spectrum. The lower spectrum (red) displays the reference MS/MS spectrum. In the case of data independent MS/MS analysis, the de-convoluted MS/MS spectrum can be displayed by clicking the de-convolution icon.
MS2 Chrom.: The MS/MS chromatograms inside the sky-blue rectangle in the center window are displayed. Three icons switch the display: one shows the raw MS/MS chromatograms, one shows the de-convoluted MS/MS chromatograms, and one shows both the raw and de-convoluted MS/MS chromatograms.
Raw vs. Decon.: The upper and bottom windows display the raw and de-convoluted MS/MS spectrum, respectively.
Rep. vs. Ref.: In combination with the alignment viewer, the window compares a representative MS/MS spectrum and a reference MS/MS spectrum.
The representative MS/MS spectrum is automatically selected as the spectrum with the highest identification score among all samples aligned to the focused alignment spot.

Compound search
Automatic identification cannot completely avoid misidentification. MS-DIAL provides a user interface so that users can manually correct the identification result. In this option, you can classify the identification into three levels: “confident”, “unsettled”, and “unknown.” For example, in phospholipid identification, we often determine only the cumulative composition such as PC 36:1 without the positions of the acyl chains, e.g. PC(18:0/18:1). You can add the “unsettled” tag to such peaks as a signpost meaning “we only checked the cumulative composition”. The identification information is available not only in the “peak viewer” but also in the “alignment viewer”. Although you only see representative spectra from all samples in the alignment viewer, it is very helpful for making a data matrix and for checking your peak identification result.
A) Mouse double click on a row of the library information to show the identification details.
B) Add a tolerance value for identification and click the “Search” button.
C) You can select either “A: Confident”, “B: Unsettled” or “C: Unknown.”

Normalization and Statistical analysis
A) Data normalization by internal standards or LOESS algorithm
B) Principal component analysis
A) If you want to use internal standards to normalize your peak list, you have to set the IS information in the Option menu. MS-DIAL also supports the LOESS and cubic spline algorithms to normalize batch or amplitude drifts. In order to use the LOESS algorithm, you have to set the “quality control” and “analytical order” information correctly in the Option menu.
B) If you want to use other statistics, please go to the PRIMe web site: http://prime.psc.riken.jp/Metabolomics_Software/StatisticalAnalysisOnMicrosoftExcel/index.html

Menu
File
New project
When you start a project, use this option and see the description of the MS-DIAL start-up above.
Open project
The project file is saved in the MTD file format automatically whenever you run a data processing method. The manual save is described below. You can restart your project from the MTD file. Manual curation of the peak identification result is highly recommended. In addition, the internal standard information can be set from the Identification menu.
Save project
Although your project is saved automatically whenever you run a data processing method, it is not saved after manual modifications such as curation of the identification result, internal standard setting, and file or class information setting. Therefore, you have to save your project with this option after such modifications.
Save parameter setting
Your data processing parameters can be saved as a MED format file. When you want to reuse your method file, select your MED format file in the data processing setting.

Data processing
All processing: reruns all the processing steps with the current parameters.
Adduct ion picking, De-convolution, Identification, Alignment: runs each process independently.

Identification
Identification setting: You can manually correct the identification result. This option may be useful for checking internal standards which are not included in the reference library.

View
Display total ion chromatogram: You can see the total ion chromatogram of the focused sample.
Display extracted ion chromatogram: You can see the extracted ion chromatograms of your choice for the focused sample.

Option
You can set the properties of aligned peaks and files. In the file properties, you can reset the file type, class ID, or analytical order (but not the file name).
If you clear the check box in the “Included” column, the corresponding data are no longer used in the statistical analysis. In the alignment properties, you can set the internal standard information for each aligned peak. Please make sure to assign the “alignment ID” in the “internal standard” column.

Export
A) Peak list export
B) Alignment result export
C) Context menu strip
A) Peak list export: You can get the peak list information of each sample, including retention time, m/z, MS/MS spectra information, and so on. Available formats are MSP, MGF or Text.
Step 1. Choose an export folder path.
Step 2. Choose the files which you want to export.
Step 3. Select the export format.
Step 4. Push the export button.
B) Alignment result export: You can get the data matrix or spectral information.
Step 1. Choose an export folder path.
Step 2. Choose the alignment file which you want to export.
Step 3. Select the export format if you want to export the representative spectra.
Step 4. Push the export button.
C) Context menu strip: You can pop up the context menu strip with a mouse right-click to export spectra, chromatograms, or PCA results as ASCII, bitmap image, or vector image.
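As a closing illustration of how the “internal standard” column in the alignment properties relates to the exported data matrix, the sketch below divides each aligned peak by the intensity of the aligned peak whose alignment ID it was assigned, sample by sample. This is an assumed, simplified form of internal standard normalization, not a description of MS-DIAL's internal computation. The data layout, the names, and the intensity values are hypothetical.

```python
import math

# Hypothetical data matrix: alignment ID -> assigned internal standard ID
# and one intensity per sample. Values are illustrative only.
matrix = {
    0: {"is_id": 2, "intensities": [1000.0, 1200.0, 900.0]},
    1: {"is_id": 2, "intensities": [400.0, 300.0, 450.0]},
    2: {"is_id": 2, "intensities": [2000.0, 2400.0, 1800.0]},  # the internal standard itself
}


def normalize_by_internal_standard(data):
    """Divide each peak by its assigned internal standard in the same sample."""
    normalized = {}
    for alignment_id, row in data.items():
        standard = data[row["is_id"]]["intensities"]
        normalized[alignment_id] = [
            value / is_value if is_value > 0 else math.nan  # avoid division by zero
            for value, is_value in zip(row["intensities"], standard)
        ]
    return normalized


normalized = normalize_by_internal_standard(matrix)
for alignment_id, values in normalized.items():
    print(alignment_id, values)
```

In this toy matrix the internal standard normalizes to 1.0 in every sample, which is a quick sanity check that the alignment ID was assigned to the intended peak.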