Applies to MineXpert3 11.8.0

5 Mass Spectral Deconvolutions

When analysing a mass spectrum, there are two available deconvolution modes: one based on the charge state envelope and one based on the isotopic cluster.

Deconvolutions are performed in order to get back to the Mr mass of the analyte while reading <emphasis>m/z</emphasis> values. The deconvolution processes below first determine the charge of the ion beneath the mass spectral data (either charge state envelope or isotopic cluster). Then, if the user has pointed precisely at an <emphasis>m/z</emphasis> value, the process goes on to compute the molecular mass of the analyte (the ionization agent is always considered to be the proton). In the following sections, all the available deconvolution processes are described in detail.

Before delving into the deconvolutions, it is necessary to present two menu options that are found in the plot widgets contained in the Mass spectra window: the menu items under the Centroidation and deconvolutions menu (Figure 5.1, “Mass spectrum plot widget-specific deconvolution menu” ).

Mass spectrum plot widget-specific deconvolution menu

The two menu items are needed to configure the mouse-based deconvolution of mass spectra.

Figure 5.1: Mass spectrum plot widget-specific deconvolution menu

These two menus allow one to set parameters for the manual deconvolutions (see text for details).

5.1 Deconvolution based on charge state

In this kind of deconvolution, at the present time, the software assumes that the ionization agent is the proton and that the ionization is positive.

The deconvolution is based on the determination of the distance between two peaks (consecutive or not) of a given charge state envelope. When the user left-mouse-button-click-drags the cursor from one peak to another, the program checks whether the distance between the two peaks matches one or more charge difference(s). If so, it computes the molecular mass of the analyte (Mr, the [12C]-relative molecular mass).

Note: The mouse drag movement is significant

The Mr value that is computed is for the analyte below the mass peak at which the mouse drag movement started.

The top panel of Figure 5.2, “Charge state-based mass deconvolution” shows the charge state envelope-based deconvolution process for a protein of Mr≈8566 Da. In the top panel, the deconvolution has involved two consecutive peaks (span is 1). The mouse drag movement occurred from left to right. Thus, the <emphasis>m/z</emphasis> value chosen for the computation is that under the left peak (start point of the movement). The status line at the bottom of the plot indicates the selection range, the delta movement on the x-axis, the computed charge, the <emphasis>m/z</emphasis> value chosen for the calculation (the <emphasis>m/z</emphasis> value at the start of the drag movement) and finally the calculated Mr value.

This method based on two consecutive mass peaks is the default method. However, it might happen that no clearly visible mass peak is available next to a well-defined peak that might be chosen as the start of the mouse drag operation. In this case it is possible to define a different span between the two peaks elected for the deconvolution (see Figure 5.1, “Mass spectrum plot widget-specific deconvolution menu” ). In the figure, the span has been set to 2, which means that the mouse drag movement encompasses three mass peaks: the movement start peak, one peak in the middle and finally the movement stop peak (the span is thus of two intervals between the extreme peaks).

The bottom panel of the figure now displays the same Mr value for the protein even if the span is now of two intervals.

Charge state-based mass deconvolution

Deconvolution approach using two peaks belonging to the same charge state envelope. The top deconvolution involves two consecutive mass peaks (peak span value is 1). The bottom deconvolution involves two non-consecutive peaks (peak span value is 2). The Mr value, expectedly, did not change whatever the configured span.

Figure 5.2: Charge state-based mass deconvolution
Note

The charge calculation, which is at the heart of the deconvolution, almost never produces an integer value with no fractional part (say, charge z=15.0) because it is almost impossible to drag the mouse cursor the exact number of pixels that would match a <emphasis>m/z</emphasis> range leading to such an integral charge value. Almost always, the charge that is calculated looks like 14.995 or 15.001, for example. This is due to the fact that the mouse moves at discrete positions on the screen and these positions might be more or less far apart, depending on the mouse capabilities and on the current zoom factor over the mass spectrum region of interest.

It is advised to zoom in as much as possible over the peaks at hand so as to minimize the difficulties above. It may happen, however, that even zoomed-in peaks are not sufficiently distant to allow a charge calculation. In this case, reduce the stringency over the fractional part that is allowed in the charge (see menu item Set charge minimal fractional part in Figure 5.1, “Mass spectrum plot widget-specific deconvolution menu” ). By default, the stringency is set at 0.99, that is, any calculated value that has a fractional part either greater than or equal to 0.99 or less than or equal to 0.01 leads to a successful rounding up or down to the nearest integer value. When the fractional part falls strictly between 0.01 and 0.99, no charge calculation is performed and thus no deconvolution is performed. When the stringency is too high, reducing it will allow the deconvolution to be carried out. General experience is that setting that value to 0.997 is fine for most situations and provides very reliable results.
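The arithmetic behind this deconvolution can be illustrated with a short Python sketch. It is not the program's actual code: the function name, the parameter names and the proton mass constant are assumptions made for illustration. It assumes positive ionization by protons, as stated at the top of this section, and derives the charge of the start peak from the two <emphasis>m/z</emphasis> values and the configured span.

```python
import math

PROTON_MASS = 1.007276  # Da; assumed value for the proton ionization agent


def charge_state_deconvolution(start_mz, stop_mz, span=1, min_fractional_part=0.99):
    """Return (z, Mr) for the peak at start_mz, or None if the charge is rejected.

    Hypothetical helper: 'span' is the number of charge differences between
    the two peaks and 'min_fractional_part' mimics the stringency setting.
    """
    delta = abs(stop_mz - start_mz)
    if delta == 0:
        return None
    # For (M + z*p)/z ions, two peaks separated by 'span' charges satisfy
    # z_start = span * (stop_mz - p) / |stop_mz - start_mz|.
    raw_charge = span * (stop_mz - PROTON_MASS) / delta
    fractional = raw_charge - math.floor(raw_charge)
    # Accept only values close enough to an integer (e.g. 14.995 or 15.001).
    if fractional >= min_fractional_part or fractional <= 1.0 - min_fractional_part:
        z = round(raw_charge)
    else:
        return None
    # The Mr is computed for the peak where the drag movement started.
    return z, z * (start_mz - PROTON_MASS)


def mz_of(mass, z):
    """m/z of a protonated ion of neutral mass 'mass' and charge 'z'."""
    return mass / z + PROTON_MASS
```

With a neutral mass of 8566 Da, dragging from the z=10 peak to the z=9 peak (span 1) or to the z=8 peak (span 2) should yield the same charge and Mr for the start peak, which is the behaviour described for the two panels of Figure 5.2.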

5.2 Deconvolutions based on isotopic cluster peaks

There are two different (albeit similar) deconvolution modes that are based on the isotopic cluster representing the isotopologues of a given analyte:

  • The two-point mouse drag way: in this kind of deconvolution, the user left-mouse-button-click-drags the cursor between the first two peaks (when possible) of the isotopic cluster. The charge state of the ion is the inverse of the distance between the two consecutive peaks (that is, the <emphasis>m/z</emphasis> delta value; see Section 5.2.1, “ Two-point charge determination with Mr calculation ” ).

  • The multi-point mouse drag way: in this kind of deconvolution, the user left-mouse-button-click-drags the cursor between any one of the isotopic cluster's peaks up to any location in this isotopic cluster. The charge state of the ion is determined on the basis of the statistical analysis of the extracted centroids (which happens behind the scenes; see Section 5.2.2, “ Multi-point statistical analysis-based charge determination ” ).

5.2.1 Two-point charge determination with Mr calculation

This method is the most immediate one and is reliable when the user can visually pinpoint the correct centroid position for two profiled isotopic cluster peaks. There is no centroid extraction in this calculation mode. The user has to drag the mouse from the first perceived centroid position to the next perceived centroid position. When the program detects that the distance between the mouse start and current drag points is a multiple of the 1/<emphasis>z</emphasis> ratio (1 being a rounded value of the mass delta expected for a switch from a [12C] to a [13C] isotope), it prints the <emphasis>z</emphasis> value (see Figure 5.3, “Isotopic cluster-based mass deconvolution” for a description of the process).

Note

If the user has started the mouse drag movement right at the proper centroid position of the first isotopologue (non-labelled analytes) or of the last isotopologue (fully labelled analytes), then the molecular mass (Mr) can be computed based on the determined <emphasis>z</emphasis> charge. These two values are then set in memory and might be used when recording discoveries (see Chapter 9, Recording data exploration discoveries ).
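As an illustration of the two-point calculation, the following Python sketch derives the charge from the spacing between two consecutive isotopic cluster peaks and then the Mr for the start peak. The function name, the proton mass constant and the simple rounding are assumptions made for this example, not the program's implementation.

```python
PROTON_MASS = 1.007276  # Da; assumed value for the proton ionization agent


def isotopic_two_point(start_mz, stop_mz, isotopologue_delta=1.0):
    """Return (z, Mr) from two consecutive isotopic cluster peak positions.

    'isotopologue_delta' is the rounded [12C] -> [13C] mass delta (1, as in
    the text). The Mr is computed for the peak at start_mz, so the drag
    direction matters.
    """
    delta = abs(stop_mz - start_mz)
    if delta == 0:
        raise ValueError("start and stop positions must differ")
    # The charge is the inverse of the m/z spacing between the two peaks.
    z = round(isotopologue_delta / delta)
    if z < 1:
        return None  # spacing too large to correspond to any charge
    return z, z * (start_mz - PROTON_MASS)
```

For example, two peaks at <emphasis>m/z</emphasis> 500.0 and 500.5 give z = 2. Dragging left to right reports the Mr of the 500.0 peak, while dragging right to left reports the Mr of the 500.5 peak, which is about 1 Da greater.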

Isotopic cluster-based mass deconvolution

Top panel: deconvolution involving a left-to-right mouse drag movement. Bottom panel: deconvolution involving a right-to-left mouse drag movement. The calculated Mr values differ.

Figure 5.3: Isotopic cluster-based mass deconvolution
Note: The mouse drag movement start position is significant

The left-mouse-button-click-dragging direction (left→right or right→left) determines the final Mr that is computed because that value is calculated for the peak under the mouse when the mouse drag movement is initiated. This is visible in the two panels of Figure 5.3, “Isotopic cluster-based mass deconvolution” , where the top panel shows the Mr computed for the left peak and the bottom panel shows the Mr computed for the right peak. Since the ion is monocharged, the difference is 1 Da.

This is a significant departure from the previous versions, where the postulate was that the single real peak of interest in an isotopic cluster was the left-most monoisotopic peak. Since this software has been used by scientists in research projects using almost 100 % labelled bacteria (with [13C] and [15N]), that concept has become moot. Indeed, analytes from these bacteria have their monoisotopic peak at the far right end of the isotopic cluster.

The new behaviour allows scientists to compute the Mr value of the peak of interest in an isotopic cluster, be that for a non-labelled or for a labelled analyte. See the following articles as examples of heavy isotope almost full labelling of bacteria.

Heavy isotope labeling and mass spectrometry reveal unexpected remodeling of bacterial cell wall expansion in response to drugs. Atze H, Liang Y, Hugonnet JE, Gutierrez A, Rusconi F, Arthur M. Elife, 2022, doi: 10.7554/eLife.72863, PMID: 35678393.

Peptidoglycan-tethered and free forms of the Braun lipoprotein are in dynamic equilibrium in Escherichia coli. Liang Y, Hugonnet JE, Rusconi F, Arthur M. Elife, 2024, doi: 10.7554/eLife.91598, PMID: 39360705.

(p)ppGpp modifies RNAP function to confer β-lactam resistance in a peptidoglycan-independent manner. Voedts H, Anoyatis-Pelé C, Langella O, Rusconi F, Hugonnet JE, Arthur M. Nat Microbiol, 2024, PMID: 38443580.

5.2.2 Multi-point statistical analysis-based charge determination

To determine the charge of the analyte beneath an isotopic cluster with this method, the first step is to extract the centroids from the region spanned by the left-mouse-button-click-drag movement. The first centroid extracted is then used as an anchor for the statistical analysis of the spacing between that centroid and the remaining ones.

To trigger the automatic extraction of the centroids and the statistical analysis of the <emphasis>m/z</emphasis> spacings between these centroids, the user left-mouse-button-click-drags the mouse over the region of interest while pressing the Z keyboard key. The process is shown in Figure 5.4, “Isotopic cluster-based charge determination” .

Isotopic cluster-based charge determination

When the statistics computed on the extracted centroids are not reliable, the verdict shows it. Here, the statistical validation could not be performed properly because only two (2) centroids are available: the user left-mouse-button-click-dragged the mouse over too small a region.

Figure 5.4: Isotopic cluster-based charge determination

Whenever the mouse drag-spanned region encompasses enough peaks to extract enough reliable centroids, statistics are reliable and the result is shown in Figure 5.5, “Isotopic cluster-based charge determination” .

Isotopic cluster-based charge determination

When the statistics computed on the extracted centroids are reliable, the verdict shows it. In this situation, the user left-mouse-button-click-dragged the mouse over a sufficiently large isotopic cluster region to extract nine (9) centroids, on the spacings of which the statistics were computed.

Figure 5.5: Isotopic cluster-based charge determination

The configuration of the level of stringency for the statistics to be considered reliable for the charge determination is performed in the application preferences window, on the specific section entitled Charge determination , as shown in Figure 5.6, “Isotopic cluster-based charge determination” .

Isotopic cluster-based charge determination

Depending on the instrument used to acquire mass spectrometric data, the statistical analysis outcome stringency may be relaxed or tightened.

Figure 5.6: Isotopic cluster-based charge determination

The statistical criteria involved in the assessment of the reliability of the determination of the charge are described below.

Note

The tool tips that display when the user hovers over the controls of the interface contain very detailed explanations of the statistical meaning of the variables set in this parameter section.

  • Max ppm RMSE: This is a physical validation test on the charge value determined by the analysis of the isotopic cluster. Using the centroid peaks' <emphasis>m/z</emphasis> values, determine the median of the <emphasis>m/z</emphasis> spacings between the centroids. The inverse of that median value is the putative charge. Next, compute the set of theoretically expected isotopologue centroids by anchoring the computation on the very first centroid extracted from the isotopic cluster (for <emphasis>z</emphasis> = 1, the delta <emphasis>m/z</emphasis> expected between two contiguous centroids is 1.003355 Th). Then compute the error between the observed centroids' <emphasis>m/z</emphasis> values and the corresponding calculated values, and turn each error into a ppm error (ppmError). Sum the ppmError*ppmError values (that is, the squared ppmError values). The Root Mean Square Error (RMSE) is the square root of that sum divided by the number of centroids in the cluster, that is, the square root of the mean squared ppm error. The RMSE value is thus an indication of the consistency of the inter-centroid spacings. This test is useful because it helps rule out a charge value determined by inter-centroid spacings that seem conducive to that charge but that in reality are not accurate enough for the analyzer being used.

    Depending on the analyzer used, the RMSE tolerance should be set to different values. In the configuration shown, for the Orbitrap high resolution analyzer, a typical value that is accepted for this RMSE value is 3 ppm. For a low resolving power instrument (an old TOF analyzer, for example), that value might be set to roughly 10 ppm.

  • Min. determination coeff (R2): This is a linear regression coefficient of determination tolerance. The entire isotopic cluster is fit to a linear regression model, with the isotopologue indices as the abscissa values and the corresponding centroid <emphasis>m/z</emphasis> values as the ordinate values. The model is the line equation, with the slope being the mean <emphasis>m/z</emphasis> spacing between the centroids. Then the coefficient of determination is computed to reflect whether the centroids are evenly spaced, as expected for an isotopic cluster. A value of R2 less than 0.98 implies that the centroid <emphasis>m/z</emphasis> values do not lie on the regression line but draw a jagged line.

  • Max. pairwise coeff. of variation: These pairwise statistics are computed when the number of centroids is less than 4. In this case it is not possible to perform a robust regression: the variance of the <emphasis>m/z</emphasis> spacings between the centroids is thus computed to extract the standard deviation. The coefficient of variation (CV) is the standard deviation divided by the mean of the spacings. A reliable isotopic cluster has a coefficient of variation of the <emphasis>m/z</emphasis> spacings < 0.02 (2%). If the CV is high, the <emphasis>m/z</emphasis> spacings are fluctuating, which might indicate poor centroidation.
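The three criteria above can be sketched in Python as follows. This is only an illustration of the statistics as described in this section: the function name and the returned keys are hypothetical, the population standard deviation is an assumed choice for the CV, and all three values are computed regardless of the number of centroids, whereas the program selects the test according to that number.

```python
import math
import statistics

ISOTOPE_SPACING = 1.003355  # Th between contiguous centroids for z = 1 (from the text)


def analyze_cluster(centroid_mzs):
    """Compute the putative charge, ppm RMSE, R2 and spacing CV of a cluster."""
    mzs = sorted(centroid_mzs)
    if len(mzs) < 2:
        raise ValueError("at least two centroids are needed")
    spacings = [b - a for a, b in zip(mzs, mzs[1:])]
    if min(spacings) <= 0:
        raise ValueError("centroid m/z values must be distinct")

    # Putative charge: inverse of the median m/z spacing, rounded.
    charge = round(1.0 / statistics.median(spacings))
    if charge < 1:
        raise ValueError("spacings too large to correspond to a charge")

    # ppm RMSE against theoretical centroids anchored on the first centroid.
    expected = [mzs[0] + i * ISOTOPE_SPACING / charge for i in range(len(mzs))]
    ppm_errors = [(obs - exp) / exp * 1e6 for obs, exp in zip(mzs, expected)]
    ppm_rmse = math.sqrt(sum(e * e for e in ppm_errors) / len(mzs))

    # R2 of the linear fit of m/z (ordinate) against isotopologue index (abscissa).
    n = len(mzs)
    mean_x = (n - 1) / 2
    mean_y = sum(mzs) / n
    sxx = sum((i - mean_x) ** 2 for i in range(n))
    sxy = sum((i - mean_x) * (y - mean_y) for i, y in enumerate(mzs))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_res = sum((y - (intercept + slope * i)) ** 2 for i, y in enumerate(mzs))
    ss_tot = sum((y - mean_y) ** 2 for y in mzs)
    r_squared = 1.0 - ss_res / ss_tot

    # Coefficient of variation of the spacings (population standard deviation assumed).
    cv = statistics.pstdev(spacings) / statistics.mean(spacings)

    return {"charge": charge, "ppm_rmse": ppm_rmse, "r_squared": r_squared, "cv": cv}
```

Each returned value would then be compared with the corresponding threshold set in the preferences (for example, ppm RMSE at most 3 ppm, R2 at least 0.98, CV below 0.02).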

Warning

The <emphasis>z</emphasis> value (the charge of the analyte as determined using this method), is never set to the analysis context and does not provide the value for the recording of the mining discoveries (see Chapter 9, Recording data exploration discoveries ). Likewise, the determined <emphasis>z</emphasis> value is not used in combination with the start point of the mouse drag movement to compute the Mr value. This is because the start point of the mouse drag movement is not necessarily at the very centroid position of the first peak of the isotopic cluster (see Section 5.2.1, “ Two-point charge determination with Mr calculation ” for the proper way of doing this deconvolution). Also, the determination of a compelling <emphasis>z</emphasis> value that is required for the calculation of a reliable Mr molecular mass cannot be achieved using the method described in this section because there are different outcomes of different reliabilities.

5.3 Low Mass Automated Deconvolution Based on Isotopic Cluster Peaks

Note

A general overview of the centroid extraction→deconvolution→analyte identification workflow is illustrated in Figure 6.2, “Workflow leading from a mass spectrum to analyte identity suggestions ” .

The low mass automated deconvolution uses the isotopic cluster peaks and is designed to be reliable for low-mass analytes (typically below 5 to 7 kDa). Deconvolution of a mass spectrum is triggered by selecting the menu shown in Figure 5.7, “Low mass deconvolution of mass spectral data” .

Low mass deconvolution of mass spectral data

The low mass deconvolution is reliable for analytes having masses typically below 5 to 7 kDa.

Figure 5.7: Low mass deconvolution of mass spectral data

The process involves two separate steps: first, the centroids are extracted from the mass spectrum (see Section 4.1.7, “ Centroid Extraction from Mass Spectra ” ); second, the extracted centroids are used for the deconvolution proper. The deconvolution is a highly complex process that would require extensive knowledge to understand fully. The main concepts are the following:

  • Iterate over each observed centroid of the input centroided mass spectrum (from smallest (<emphasis>m/z</emphasis>,i) values to greatest) and, for each centroid, run a nested iteration loop where each charge in the range specified by the user is used to compute a neutral analyte mass (see below).

  • The analyte's neutral mass is then used to construct a theoretically expected (that is, calculated) neutral averagine-based isotopic cluster. Each centroid of the isotopic cluster is then taken in sequence and subjected to a matching process.

  • The matching operation takes place by first converting back the theoretical neutral analyte cluster centroids to a charged analyte <emphasis>m/z</emphasis> value (same charge as that initially used to craft the neutral mass); second, the <emphasis>m/z</emphasis> value is searched for in the input mass spectrum centroids. When an input spectrum <emphasis>m/z</emphasis> value matches that of the theoretical isotopic cluster, a match is stored.

  • When all the centroids of the theoretical isotopic cluster have been iterated over, the set of found matches is checked for reliability. The checks are performed according to the parameters set by the user and described below. If the checks are all positive, then the match is stored for the next step.

  • At the end of the processing of all the input centroids, all the found matches are processed in order to consolidate them into a neutral mass. Each neutral mass is listed along with all the <emphasis>m/z</emphasis> values that were actually matched (these <emphasis>m/z</emphasis> values correspond to the different charge states of the same analyte and are called supporting ions ).
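The overall loop structure described in the list above can be sketched in Python. This is a deliberately simplified illustration and not the program's algorithm: all names are hypothetical, the averagine-based modelling is replaced by a placeholder that only produces evenly spaced isotopologue positions, and the reliability check is reduced to a minimum number of matched centroids.

```python
PROTON_MASS = 1.007276      # Da; assumed value for the proton ionization agent
ISOTOPE_SPACING = 1.003355  # Da between contiguous isotopologues


def theoretical_cluster(neutral_mass, size):
    """Placeholder for averagine modelling: neutral isotopologue masses only."""
    return [neutral_mass + k * ISOTOPE_SPACING for k in range(size)]


def find_match(mz, centroids, tol_ppm):
    """Return the first observed (m/z, intensity) inside the tolerance window."""
    half_window = mz * tol_ppm * 1e-6 / 2  # tolerance is split left and right
    for obs_mz, intensity in centroids:
        if abs(obs_mz - mz) <= half_window:
            return (obs_mz, intensity)
    return None


def deconvolve(centroids, min_z, max_z, tol_ppm=20.0, cluster_size=4, min_matches=4):
    """Return a list of neutral masses with their supporting (m/z, z) ions."""
    centroids = sorted(centroids)
    candidates = []
    for mz, _ in centroids:                  # outer loop: observed centroids
        for z in range(min_z, max_z + 1):    # nested loop: charge range
            neutral = z * (mz - PROTON_MASS)
            matches = []
            for iso_mass in theoretical_cluster(neutral, cluster_size):
                # Convert the neutral isotopologue back to a charged m/z value.
                hit = find_match(iso_mass / z + PROTON_MASS, centroids, tol_ppm)
                if hit is not None:
                    matches.append(hit)
            if len(matches) >= min_matches:  # simplified reliability check
                candidates.append((neutral, mz, z))

    # Consolidate candidates sharing the same neutral mass within tolerance.
    candidates.sort()
    results = []
    for neutral, mz, z in candidates:
        if results and abs(neutral - results[-1]["mass"]) <= neutral * tol_ppm * 1e-6:
            results[-1]["supporting_ions"].append((mz, z))
        else:
            results.append({"mass": neutral, "supporting_ions": [(mz, z)]})
    return results


def synthetic_centroids(mass, charges, size=4):
    """Build test centroids for one analyte observed at several charge states."""
    return [((mass + k * ISOTOPE_SPACING) / z + PROTON_MASS, 100.0)
            for z in charges for k in range(size)]
```

Calling `deconvolve(synthetic_centroids(1000.0, [1, 2]), 1, 3)` should report a single neutral mass near 1000 Da supported by two ions, one for each charge state.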

The low mass deconvolution algorithm can be tightly configured using a large set of parameter settings that are accessible via the application preferences window shown in Figure 5.8, “Parameter set governing the low mass deconvolver” and detailed below.

The low mass deconvolver settings interface enables the user to store groups of settings for easy recalling along different user sessions. Each relevant widget has an associated explanatory tool tip.

Parameter set governing the low mass deconvolver

Interface to the configuration of the low mass deconvolution process.

Figure 5.8: Parameter set governing the low mass deconvolver
  • Min. charge (Max. charge): the charge range that is explored.

  • Mass tolerance (ppm): the tolerance with which all the mass matches are performed. A typical value is 20. The tolerance window is centered on the mass value: half of it spans the left side and half of it the right side.

  • Min. mass (Max. mass): the neutral mass range that is explored.

  • Min. intensity: the intensity value below which input mass spectrum centroids are discarded.

  • Min. cluster centroid matches: The minimum number of matching centroids between the actual input mass spectral centroids and the calculated theoretically expected centroid cluster.

  • Max. cluster centroid matches: The maximum number of matching centroids between the actual input mass spectral centroids and the calculated theoretically expected centroid cluster.

  • Cluster shape centroid count: Over how many isotopic cluster centroids the cluster shape quality test (see below) should be performed.

  • Total cluster intensity tolerance: Once the matches between the centroids in the theoretical isotopic cluster and those in the input centroided mass spectrum have been performed, the reliability of these matches is checked like so: select the monoisotopic centroid and its contiguous centroid in both the theoretical isotopic cluster and in the matched centroided mass spectrum. Compute the ratio between the intensities of these centroids. That ratio is used to normalize the theoretical centroids intensities so they can be compared to the matched input mass spectrum centroid intensities. Compute the sum of the matched centroids intensities for all the matches (input mass spectrum vs theoretical). Compute the [input / theoretical] summed intensities ratio, which reflects the 'explained intensity'. The greater the ratio, the better the intensity correspondence between the theoretical and experimental isotopic clusters. That ratio value is required to be greater than the value set for this parameter for the matches to be considered reliable. Typical values: 0.5-0.7.

  • Relative cluster shape tolerance: Once the matches between the centroids in the theoretical isotopic cluster and those in the input centroided mass spectrum have been performed, the reliability of these matches is checked like so: get the monoisotopic (highest probability centroid, isotopologue 0) match. Get the isotopologue 1 match. For these two matches, compute the ratio of the intensities for the theoretical isotopic cluster data and for the input centroided mass spectrum (observed data). This yields the following intensity ratios: [ theor_ratio = theor. intensity iso 1 / theor. intensity iso 0 ] and [ obser_ratio = obser. intensity iso 1 / obser. intensity iso 0 ]. Compute the ratio [ abs(obser_ratio - theor_ratio) / theor_ratio ], which amounts to a 'normalized difference'. If that ratio is smaller than this parameter setting, then the matches are considered to be reliable. Typical values: 0.6-0.7.

  • Min Pearson corr. score: Once the matches between the centroids in the theoretical isotopic cluster and those in the input centroided mass spectrum have been performed, the reliability of these matches is checked like so: a Pearson correlation factor is computed to check that the observed and theoretical isotopic cluster centroids are correlated by their intensity. If the obtained correlation factor is greater than or equal to this setting, then the matches are considered to be reliable. Typical values: 0.7-0.9.

  • Neutral averagine formula: The averagine formula to be used to model the isotopic clusters starting from an analyte's neutral mass.

  • Isotopic data file: The path name of the file holding the isotopic data. This is a file having the following format (removed data for clarity):

              1,hydrogen,H,1,1.007825,1,0,0.999884,-0.000115,0
              1,hydrogen,H,1,2.014101,2,1,0.0001157,-9.0644,0
              2,helium,He,2,3.0160293,3,0,0.00000,-13.5206046 0
              [ ... ]
              6,carbon,C,6,12.00000,12,0,0.9892119,-0.0108466,0
              6,carbon,C,6,13.003354,13,1,0.0107,-4.529315,0
              7,nitrogen,N,7,14.0030,-0.003648,0
              7,nitrogen,N,7,15.00010,15,1,0.0036419,-5.615226,0
    Note

    It is possible to use an isotopic data file that lists heavy isotopes as the most probable isotopes for any chemical element (remember to check the Heavy isotope labelling option). In the research of this manual's author, bacteria are fully labelled with the heavy stable isotopes of carbon and nitrogen, which produces peptidoglycan structural elements that display as inverted isotopic clusters (with respect to natural isotopic abundance analytes). The deconvolver also works in this case by only detecting the input mass spectrum isotopic clusters having the proper inverted shape, and reports the analytes with the diagnostic ions at monoisotopic <emphasis>m/z</emphasis> values corresponding to the highest value in the isotopic cluster that was correctly recapitulated by the averagine-based theoretical isotopic cluster. This means that whatever the isotopic abundance configuration, the reported ions are correct: their monoisotopic <emphasis>m/z</emphasis> value is reported.

  • The last field may be used to indicate the path name to a file where all the processing details are stored. Perusing these data might be interesting when the deconvolution results do not correspond to expected results.
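Two of the reliability checks described in the parameter list above, the relative cluster shape test and the Pearson correlation test, can be illustrated with the following Python sketch. The function names are hypothetical and the intensities are passed as plain lists ordered by isotopologue index. The total cluster intensity test is left out because its normalization step is not specified in enough detail here to sketch it faithfully.

```python
import math


def relative_shape_difference(theor_intensities, obser_intensities):
    """Normalized difference between the iso 1 / iso 0 intensity ratios."""
    theor_ratio = theor_intensities[1] / theor_intensities[0]
    obser_ratio = obser_intensities[1] / obser_intensities[0]
    return abs(obser_ratio - theor_ratio) / theor_ratio


def pearson_correlation(xs, ys):
    """Pearson correlation between theoretical and observed intensities."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (sd_x * sd_y)


def matches_are_reliable(theor, obser, shape_tolerance=0.6, min_pearson=0.7):
    """Apply both checks with threshold values taken from the typical ranges."""
    return (relative_shape_difference(theor, obser) < shape_tolerance
            and pearson_correlation(theor, obser) >= min_pearson)
```

An observed cluster whose intensities are proportional to the theoretical ones passes both checks, whereas a cluster whose second isotopologue is twice as intense as expected fails the shape test with a normalized difference of 1.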

When the deconvolution process has ended, the results are shown in the console window.

If the user wants a synthetic mass spectrum to be created from the mass spectrometric data that yielded deconvolved features, clicking the Check to generate a synthetic mass spectrum and configure the mass peak shaping process control switches the application preferences page to that entitled Peak shaper. The configuration of the peak shaper is described in Section 7.2, “Shaping mass peak centroids into well-profiled peaks” . The title of the synthetic mass spectrum can be specified in the corresponding line edit widget.

When the synthetic mass spectrum is ready, the user is provided with two options:

  • Create a new MS run data set : This option creates a brand new mass spectrum.

  • Select an existing data set: This option creates a brand new mass spectrum that is parented to the existing dataset.

The generated mass spectrum is plotted in a new plot widget, in a new color if it was not parented to the original mass spectrum, in the same color otherwise.

The synthetic mass spectrum (blue) mimics the measured input mass spectrum

The synthetic mass spectrum is written to disk and loaded into the program. The user is then prompted to perform an integration to a mass spectrum.

Figure 5.9: The synthetic mass spectrum (blue) mimics the measured input mass spectrum
Note

When performing an integration of the synthetic mass spectrum from the pseudo-TIC chromatogram, it is advised to check the m/z integration parameters. Most often, setting these to no bins is the best option.

5.4 Reading the resolving power based on mass spectral data

When left-mouse-button-click-dragging the mouse cursor between two mass spectrum locations of interest, the program computes the apparent resolving power. This process is shown in Figure 5.10, “Calculation of the resolving power” , where the resolving power is calculated by dragging the mouse cursor from one edge of a peak to the other at half maximum height (this is called full width at half maximum [FWHM] resolution).
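As a minimal sketch of this calculation, the FWHM resolving power is conventionally the <emphasis>m/z</emphasis> of the peak divided by its width at half maximum. The Python function below is hypothetical and assumes that the peak <emphasis>m/z</emphasis> is taken as the center of the dragged range; the value the program actually uses may differ.

```python
def resolving_power(left_mz, right_mz):
    """FWHM resolving power from the two peak edges at half maximum height."""
    width = abs(right_mz - left_mz)
    if width == 0:
        raise ValueError("the drag range must have a non-zero width")
    center = (left_mz + right_mz) / 2  # assumed peak position
    return center / width
```

For instance, a peak spanning 499.995 to 500.005 at half height has a width of 0.01 and thus a resolving power of about 50000.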

Calculation of the resolving power

Click-dragging the mouse cursor will trigger the calculation of the resolving power of the instrument. That value is printed in the status bar.

Figure 5.10: Calculation of the resolving power