Processing Metabolomics and Proteomics Data with Open Software
A Practical Guide

Robert Winkler, Marek Domin

About this book

Metabolomics and proteomics allow deep insights into the chemistry and physiology of biological systems. This book presents open-source programs, platforms and programming tools for analysing metabolomics and proteomics mass spectrometry data. In contrast to commercial software, open-source software is created by the academic community, which facilitates direct interaction between users and developers and accelerates the implementation of new concepts and ideas. The first section of the book covers the basics of mass spectrometry, experimental strategies, data operations, the open-source philosophy, metabolomics, proteomics and statistics/data mining. In the second section, active programmers and users describe available software packages. The included tutorials, datasets and code examples can be used for training and for building custom workflows. Finally, every reader is invited to participate in the open science movement.

Information

Year
2020
ISBN
9781788019903
Part B
Open MS Programs, Toolkits and Workflow Platforms
CHAPTER 6
OpenMS and KNIME for Mass Spectrometry Data Processing
Oliver Alka (d), Timo Sachsenberg (d), Leon Bichmann (d), Julianus Pfeuffer (d, j), Hendrik Weisser (k), Samuel Wein (i), Eugen Netz (e), Marc Rurik (d), Oliver Kohlbacher (d, e, f, g, h) and Hannes Röst* (a, b, c)
a Donnelly Centre, University of Toronto, Toronto, Canada
b Department of Molecular Genetics, University of Toronto, Toronto, Canada
c Department of Computer Science, University of Toronto, Toronto, Canada
d Department for Computer Science, Applied Bioinformatics, University of Tübingen, Sand 14, 72076 Tübingen, Germany
e Biomolecular Interactions, Max Planck Institute for Developmental Biology, Max-Planck-Ring 5, 72076 Tübingen, Germany
f Institute for Translational Bioinformatics, University Hospital Tübingen, Hoppe-Seyler-Str. 9, 72076 Tübingen, Germany
g Institute for Biomedical Informatics, University of Tübingen, Sand 14, 72076 Tübingen, Germany
h Quantitative Biology Center, University of Tübingen, Auf der Morgenstelle 10, 72076 Tübingen, Germany
i Epigenetics Institute, Department of Cell and Developmental Biology, University of Pennsylvania, 9th Floor, Smilow Center for Translational Research 3400 Civic Center Blvd, Philadelphia, PA 19104, USA
j Department for Computer Science, Algorithmic Bioinformatics, Freie Universität Berlin, Takustr. 9, 14195 Berlin, Germany
k STORM Therapeutics Limited, Moneta Building, Babraham Research Campus, Cambridge CB22 3AT, UK
*E-mail: [email protected]

Computational mass spectrometry is plagued by a multitude of issues, including a heterogeneous software environment, complex workflows and proprietary tools. OpenMS addresses these challenges by providing robust open-source software for users and an open, well-designed software environment for developers. OpenMS is an open-source C++ library for LC-MS data management and analysis, written in modern C++11. It offers an infrastructure for the rapid development of mass spectrometry-related software. OpenMS is free software available under the three-clause BSD license. It comes with a variety of pre-built, ready-to-use tools for high-throughput proteomics and metabolomics data analysis (TOPP tools), covering most MS and LC-MS data processing and mining tasks, as well as visualization (TOPPView). OpenMS offers automated analyses for various quantitation protocols, including label-free quantitation, SILAC, iTRAQ, TMT, SRM and SWATH. It provides built-in algorithms for de novo identification and database search, as well as adapters to other tools such as X!Tandem, Mascot, OMSSA and SIRIUS. Tools built with OpenMS integrate easily into workflow engines such as KNIME, Galaxy, WS-PGRADE and TOPPAS. OpenMS supports open MS data formats, including the Proteomics Standards Initiative (PSI) formats mzML and mzIdentML as well as mzXML and pepXML. With pyOpenMS, OpenMS offers Python bindings to a large part of the OpenMS API to enable rapid algorithm development.

6.1 Introduction

Computational mass spectrometry has seen exponential growth in data size and complexity in recent years, straining the existing infrastructure of many labs as they moved towards high-performance computing (HPC) and embraced big data paradigms. Transparent and reproducible data analysis has traditionally been challenging in the field because of a highly heterogeneous software environment, while algorithms and analysis workflows have grown increasingly complex. A multitude of competing and often incompatible file formats prevented objective algorithmic comparisons and, in some cases, access to specific software or file formats depended on a vendor license. Owing to the rapid technological progress in the field, many novel algorithms are proposed in the literature every year, but few are implemented with reusability, robustness, cross-platform compatibility and user-friendliness in mind. The result is a highly challenging software and data storage environment that in some respects is opaque even to experts.
The OpenMS software framework addresses these issues through a set of around 175 highly robust and transparent cross-platform tools with a focus on maximal flexibility [1,3]. Modern software engineering techniques ensure reproducibility between versions and minimize code duplication and potential errors in the software. OpenMS is completely open-source, uses standardized data formats extensively and is available on all three major computing platforms (macOS, Windows, Linux). Multiple layers of access to the OpenMS algorithms exist for specialist, intermediate and novice users, and sophisticated data visualization and graphical workflow managers keep the entrance barriers low.
The flexibility of OpenMS allows it to support a multitude of customizable and easily transferable workflows in multi-omics data analysis, including metabolomics, lipidomics and proteomics setups, and different quantitative approaches spanning label-free, isotopic and isobaric labeling techniques, as well as targeted proteomics. Its highly flexible structure and layered design allow different scientific groups to take full advantage of the software. Developers can fully exploit the sophisticated C++ library for tool and data structure development, while advanced Python bindings (pyOpenMS, see Chapter 16) wrap most of the classes [4], providing an excellent solution for fast scripting and prototyping. Users can either work at the command-line tool level or take advantage of industry-grade workflow systems, such as the KoNstanz Information MinEr (KNIME) [5,6], Galaxy [7], Nextflow [8] or Snakemake [9]. The framework is highly adaptable, allowing even novice users to generate complex workflows using the easy-to-learn graphical user interface of KNIME. Built-in support for the most common workflow steps (such as popular proteomics search engines) ensures low entrance barriers, while advanced users have high flexibility within the same framework. A modular and comprehensive codebase allows rapid development of novel methods, as exemplified by the recent additions for metabolomics, SWATH-MS and cross-linking workflows.
In addition, a versatile visualization software (TOPPView) allows exploration of raw data as well as identification and quantification results [10]. The permissive BSD license encourages usage in commercial and academic projects, making the project especially suited for reference implementations of file formats and algorithms.

6.2 OpenMS for Developers

The OpenMS framework consists of different abstraction layers. The first layer consists of external libraries (Qt, Boost, Xerces, SeqAn, Eigen, WildMagic, COIN-OR, LIBSVM), which provide additional data structures and functionality, simplifying complex tasks such as GUI programming or XML parsing. The next layer consists of the OpenMS core library, which contains algorithms, data structures and input/output processing. The third layer comprises the TOPP tools and utilities (approximately 175), which cover various analysis tasks, such as signal processing, filtering, identification, quantification and visualization. The core library and the TOPP tools have Python bindings (pyOpenMS), which can be used for fast scripting and prototyping [4]. In the top layer, the tools are accessible from different workflow systems for the construction of flexible, tool-based workflows.
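To make the tool layer more concrete, the following minimal Python sketch shows how a script might chain two TOPP tools at the command-line level, with each step reading the output of the previous one. It is an illustration under stated assumptions, not part of OpenMS: the helper names (`topp_command`, `plan_pipeline`, `run_pipeline`) and the file name `sample.mzML` are hypothetical, and the sketch assumes that the chosen tools accept the `-in` and `-out` options and are installed on the PATH. Real runs will usually need further tool-specific parameters, which should be checked against the documentation of the installed OpenMS version.

```python
import shutil
import subprocess
from pathlib import Path


def topp_command(tool, in_path, out_path):
    """Build the argument list for one TOPP tool call (nothing is executed)."""
    # Assumption: the tool takes its input and output via -in and -out.
    return [tool, "-in", str(in_path), "-out", str(out_path)]


def plan_pipeline(raw_file, steps):
    """Chain tools so that each step reads the previous step's output.

    `steps` is a list of (tool_name, output_suffix) pairs.
    Output files are named <input stem>_<tool name><suffix>.
    """
    commands = []
    current = Path(raw_file)
    stem = current.stem
    for tool, suffix in steps:
        out = current.with_name(f"{stem}_{tool}{suffix}")
        commands.append(topp_command(tool, current, out))
        current = out
    return commands


def run_pipeline(commands):
    """Run the planned commands in order and stop at the first failure."""
    for cmd in commands:
        if shutil.which(cmd[0]) is None:
            raise FileNotFoundError(f"{cmd[0]} not found on PATH")
        subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # Hypothetical two-step plan: peak picking, then feature detection.
    steps = [
        ("PeakPickerHiRes", ".mzML"),
        ("FeatureFinderCentroided", ".featureXML"),
    ]
    for cmd in plan_pipeline("sample.mzML", steps):
        print(" ".join(cmd))
```

Separating the planning step from the execution step keeps the sketch inspectable without an OpenMS installation; workflow systems such as KNIME or Snakemake take over this chaining, along with parameter handling and logging, in practice.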

6.2.1 C++ Library

OpenMS has a multi-level architecture with an open source C++ library at its core which implement...
