Software innovations in four-dimensional mass spectrometric data analysis

 
CONTINUE READING
Software innovations in four-dimensional mass spectrometric data analysis
VOL. 32 NO. 6 (2020)

  ARTICLE

Software innovations in
four-dimensional mass
spectrometric data analysis
Jürgen Coxa and Gary Kruppab
a
 Research Group Lead of Computational Systems Biochemistry, Max Planck Institute of Biochemistry, Am
Klopferspitz 18, 82152 Martinsried, Germany. cox@biochem.mpg.de; https://orcid.org/0000-0001-8597-205X
b
  Vice-President, Proteomics, Bruker Daltonics

Over the last two decades, significant            MaxQuant possesses a large ecosys-         widespread adoption of Bruker’s timsTOF
advances in technology and new meth-            tem of algorithms for comprehensive          Pro instrument, which was designed
odologies have made proteomics an               data analysis. It incorporates the peptide   to deliver greater sensitivity, selectiv-
extremely powerful tool for protein scien-      search engine Andromeda and, coupled         ity and MS/MS acquisition speeds for
tists, biologists, and clinical researchers.1   with Perseus, was developed to offer a       proteomics research. The novel design
As analytical instrumentation continues         complete solution for downstream bioin-      allows for ions to be accumulated in
to evolve, more data is produced with           formatics analysis.2 MaxQuant performs       the front section, while ions in the
each technical advancement in proteom-          quantification with labels and via the       rear section are sequentially released
ics research. That, of course, also creates     MaxLFQ algorithm on label-free data, and     depending on their ion mobility, and in
new challenges for bioinformatics soft-         achieves high peptide mass accuracies        subsequent scans selected precursors
ware development.                               thanks to its advanced non-linear recali-    can be targeted for MS/MS. This process
    The modern high-throughput mass             bration algorithms.                          is called Parallel Accumulation Serial
spectrometry (MS) -based proteom-                                                            Fragmentation, or PASEF®.3
ics methods that are required to gain           MaxQuant for four-                              The unique trapped ion mobil-
deeper insights into biological processes       dimensional proteomics                       ity spectrometry (TIMS) design allows
produce enormous amounts of data. This          MaxQuant is often used for liquid chro-      researchers to reproducibly measure the
raw data necessitates powerful, auto-           matography coupled with tandem mass          collisional cross-section (CCS) values for
mated computer-based methods that               spectrometry (LC-MS/MS) shotgun              all detected ions, and those can be used
provide reliable identification and quan-       proteomics—a method of identifying           to further increase the system’s selec-
tification of proteins.                         proteins in complex mixtures to provide      tivity, enabling more and more reliable
                                                a wider dynamic range and coverage of        relative quantitation information from
Quantitative proteomics                         proteins.                                    complex samples and short gradient
software                                           Shotgun (or bottom-up) proteomics         analyses.
Freely available for academic and non-          is the most commonly used MS-based              The addition of TIMS to LC-MS based
academic researchers, many labora-              approach to study proteins by digesting      shotgun proteomics, using PASEF,
tories across the globe benefit from            them into peptides prior to MS analysis.     has the potential to boost proteome
the MaxQuant quantitative proteom-                 Ion mobility can add a further dimen-     coverage, quantification accuracy and
ics software package’s precise protein          sion to LC-MS based shotgun proteomics       dynamic range, resulting in fast ultra-
and peptide quantification algorithms.          that has the potential to boost proteome     sensitive analysis. The increase in
The software was developed by the               coverage, quantification accuracy and        speed resulting from PASEF technology
Computational Systems Biochemistry              dynamic range. However, this additional      allows more samples to be analysed in
group at the Max Planck Institute of            information requires suitable software       a shorter time frame, but also generates
Biochemistry (MPIB) in Martinsried,             that extracts the information contained      vast amounts of spectral data, creating
Munich, Germany. The group also devel-          in the four-dimensional (4D) data space      challenges when dealing with large
oped Perseus, a software platform that          spanned by m/z, retention time, ion          sample cohorts.
supports researchers in the interpretation      mobility and signal intensity.                  MPIB software developers adapted
of protein quantification and interaction          The impact of the addition of the ion     the MaxQuant shotgun proteomics
data as well as data on post-translational      mobility dimension on the proteomics         workflow to extract this abundance
modifications (PTMs).                           field can be seen, for example, in the       of information from the timsTOF Pro

14 SPECTROSCOPYEUROPE                                                                            www.spectroscopyeurope.com
Software innovations in four-dimensional mass spectrometric data analysis
VOL. 32 NO. 6 (2020)

                                                                                                        ARTICLE

Figure 1. Screenshot of MaxQuant quantitative proteomics software package designed for analysing large mass spectrometric data sets. Source:
Maxquant.org

data, making it possible to manage 4D             Emerging applications for                          research proteomics will be one of the
features in the space spanned by reten-           4D-Proteomics                                      main applications of 4D-Proteomics
tion time, ion mobility, mass and signal          Powerful developments in MS tech-                  in the future, and they are working
intensity that benefit the identification         nology have led to the expansion of                with several clinical groups to bring
and quantification of peptides, proteins          MaxQuant and the software’s ability to             MS-based proteomics into clinical prac-
and PTMs.                                         meet future needs in the proteomics                tice. However, the analysis of proteom-
   TIMS is challenging for software devel-        field. These improvements are helping              ics data from samples derived from
opers because it is not just one piece of         researchers develop new capabilities and           patients requires special computational
new information—it adds an additional             applications by delivering more sensitivity        strategies. The problems that need to be
dimension. The updated MaxQuant                   and selectivity for the identification and         addressed include: how to extract mean-
4D-Proteomics workflows can process               quantification of peptides, proteins and           ingful protein expression signatures from
data produced via PASEF, data-indepen-            PTMs. Instrumentation advances have                data with high individual variability, how
dent acquisition (dia)-PASEF and Mobility         also bolstered the ongoing development             to integrate the genomic background of
Offset Mass Aligned (MOMA).                       of MaxQuant. The MPIB software devel-              the patients into the analysis of proteom-
   Adding another dimension can                   opment team’s goal is always to expand             ics data, and how to determine biomark-
lengthen algorithm processing times,              and improve MaxQuant to meet the                   ers and properly estimate their predictive
creating a significant challenge for soft-        complexity of biological processes and             power.
ware development with 4D-Proteomics.              novel MS instruments.                                 To answer these questions, the MPIB
As a result, the team optimised computa-                                                             software developers are working to make
tion time in MaxQuant to overcome this,           Clinical research proteomics                       use of machine learning algorithms to
so users can achieve good results in a            The MPIB Computational Systems                     classify patients and employ feature
reasonable time frame.                            Biochemistry team believes clinical                selection algorithms to extract predictive

www.spectroscopyeurope.com                                                                                       SPECTROSCOPYEUROPE 15
Software innovations in four-dimensional mass spectrometric data analysis
VOL. 32 NO. 6 (2020)

  ARTICLE

Figure 2. Design of the timsTOF Pro instrument. Source: www.Bruker.com

protein signatures. The extra clinical test
engine provides a challenge for the soft-
ware. It is not clear yet if proteomics will
be a guide to which molecules to look at,
or if it will be a major component of clini-
cal diagnostics.

Single-cell proteomics
While today’s laboratories generate large
data sets from single-cell genomic and
single-cell transcriptomic research, single
cell proteomics (sc-proteomics) is a
nascent field. It holds the potential to
enable researchers to detect and quanti-
tate the proteins in single cells, avoiding
the need to infer proteins from cellular
messenger-RNA levels.4 That also creates
new challenges for computational analy-
sis.
   MPIB researchers are looking ahead
to future needs as sc-proteomics devel-
ops, closely examining emerging tech-
nologies to establish quantification
standards. The software developers
believe sc-proteomics will be achiev-
able in the next few years. The two main
advances enabling this will be improved
sensitivity in the instrumentation and the
software to work with it.

                                                 Figure 3. Example showing how dia-PASEF can efficiently fragment nearly all peptide ions
Data-independent                                 eluting at a given retention time. Figures courtesy of: F. Meier, A.-D. Brumer, M. Frank, A. Ha, I.
acquisition                                      Bludau, E. Voytik, S. Kaspar-Schoenefeld, M. Lubeck, O. Raether, R. Aebersold, B. Collins, H.-L. Rost
Recent advances in data-independent              and M. Mann, “Parallel accumulation – serial fragmentation combined with data-independent
acquisition (DIA) sensitivity have encour-       acquisition (diaPASEF): Bottom-up proteomics with near optimal usage”, bioRxiv 656207 (2020).
aged the MPIB Computational Systems              https://doi.org/10.1101/656207

16 SPECTROSCOPYEUROPE                                                                                      www.spectroscopyeurope.com
Software innovations in four-dimensional mass spectrometric data analysis
VOL. 32 NO. 6 (2020)

                                                                                           ARTICLE
Biochemistry team to integrate DIA work-    of proteomics, which could extend the        References
flows into MaxQuant using machine           feasibility of applying 4D-Proteomics        1. J. Cox and M. Mann, “Quantitative,
learning algorithms. The success of DIA     for clinical research applications, as          high-resolution proteomics for data-
relies on key instrumental capabilities—    mentioned previously.                           driven systems biology”, Ann. Rev.
namely resolution, sensitivity, accuracy                                                    Biochem. 80, 273–299 (2011).
and dynamic range uncompromised by          Conclusion                                      https://doi.org/10.1146/annurev-
a fast-spectral acquisition rate.           Bioinformatics software development             biochem-061308-093216
   DIA has been implemented in the          is a constantly changing field that will     2. J. Cox, N. Neuhauser, A. Michalski,
timsTOF Pro in a way that takes advan-      see more technological advancement              R.A. Scheltema, J.V. Olsen and M.
tage of the speed and sensitivity of        in the future. As analytical instrumenta-       Mann, “Andromeda: a peptide search
TIMS and PASEF, in a method called          tion advances, the MPIB Computational           engine integrated into the MaxQuant
dia-PASEF®. The 4D nature of the dia-       Systems Biochemistry team must deal             environment”, J. Proteome Res.
PASEF data is an advantage for DIA soft-    with even larger amounts of informa-            10(4), 1794–1805 (2011). https://
ware developers, who can make use of        tion on the software side because these         doi.org/10.1021/pr101065j
the additional ion mobility dimension       instruments continue to expand their         3. F. Meier, A.D. Brunner, S. Koch, H.
for alignment and extraction of features.   dynamic range and capabilities.                 Koch, M. Lubeck, M. Krause, N.
Such recent technological advances,            These improvements to the MaxQuant           Goedecke, J. Decker, T. Kosinski,
as well as developments in DIA meth-        software platform are possible due              M.A. Park, N. Bache, O. Hoerning,
ods, have provided new opportunities        to important collaborations between             J. Cox, O. Räther and M. Mann,
for the MPIB Computational Systems          the MPIB Computational Systems                  “Online Parallel Accumulation-Serial
Biochemistry research group.                Biochemistry research team and other            Fragmentation (PASEF) with a novel
   The MaxQuant developers are enthu-       institutions and industry leaders. These        trapped ion mobility mass spectrom-
siastic about the platform’s upcom-         collaborations include projects designed        eter”, Mol. Cell. Proteomics 17(12),
ing DIA capability. DDA and DIA are         to improve proteomics technologies, with        2534–2545 (2018). https://doi.
becoming comparable because of              MaxQuant providing the computational            org/10.1074/mcp.TIR118.000900
improved sensitivity in instrumentation.    tools necessary to analyse data acquired     4. V. Marx, “A dream of single-cell prot-
The hardware is becoming simpler for        on new and emerging hardware plat-              eomics”, Nat. Methods 16, 809–812
users, but data has been much more          forms. The results of these collaborations      (2019). https://doi.org/10.1038/
challenging because the sof t ware          will be used to develop future versions of      s41592-019-0540-6
must find which fragments belong to         the software to optimise 4D-Proteomics
which molecule. Addressing these chal-      workflows.
lenges will provide a deeper coverage

IMPORTANT READER SURVEY
As Spectroscopy Europe evolves to digital-only publication, we really need to know how you want
to read the magazine in future:

„ PDF for screen or to print?

„ App for mobile devices?

„ Page-turning “flipbook”?

„ Or something else?
We would also like to learn what sort of content you
like to read in Spectroscopy Europe.
Please take a couple of minutes, or less, to help shape
the future.

spectroscopyeurope.com/2021-survey                                                                 SPECTROSCOPYEUROPE 17
You can also read