
Machine Learning for Microbiome Statistics
About this book
Machine learning learns from past experience (seen data) to make predictions about the future (unseen data). Such predictions are inherently uncertain. Microbiome data have distinctive characteristics, including high dimensionality, over-dispersion, sparsity and zero-inflation, and heterogeneity. As a result, predicting phenotypic outcomes from microbiome data is even more uncertain than prediction with data from other fields. Machine learning for microbiome statistics therefore poses many challenges for evaluating prediction performance with appropriate metrics and independent validation data.
This unique book aims to address the challenges of machine learning statistics, emphasize the importance of evaluating performance with appropriate metrics and independent data, and describe several important concepts of machine learning statistics, such as feature engineering and overfitting. It comprehensively reviews commonly used and newly developed machine learning models for microbiome research. Specifically, the book provides step-by-step procedures for applying machine learning to microbiome data, including feature engineering, algorithm selection and optimization, performance evaluation, and model testing. It comments on the benefits and limitations of using machine learning for microbiome statistics and remarks on the advantages and disadvantages of each machine learning algorithm.
It will be an excellent reference book for students and academics in the field.
- Presents a thorough overview of machine learning algorithms for microbiome statistics.
- Walks through step-by-step procedures for applying machine learning to microbiome data, using important supervised learning algorithms, including classical, ensemble learning, and tree-based models.
- Describes important concepts of machine learning, including the bias-variance tradeoff, accuracy and precision, overfitting and underfitting, model complexity and interpretability, and feature engineering.
- Investigates and applies various cross-validation techniques step by step.
- Introduces the confusion matrix and its derived measures. Comprehensively describes the properties of the F1 score, Matthews correlation coefficient (MCC), area under the receiver operating characteristic curve (AUC-ROC), and area under the precision-recall curve (AUC-PR), and discusses their advantages and disadvantages when applied to microbiome data.
- Offers all related R code and the datasets from the authors' first-hand microbiome research and publicly available data.
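As a rough illustration of why the choice of metric matters, the sketch below computes F1 and MCC from confusion-matrix counts. It is not taken from the book (whose code is in R): the function names and the counts are hypothetical, and returning 0 when a denominator is zero is a common convention assumed here rather than a rule stated in the source.

```python
import math


def f1_score(tp, fp, fn):
    """F1 = 2*TP / (2*TP + FP + FN); returns 0.0 if undefined (assumed convention)."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient from the four confusion-matrix cells.

    Returns 0.0 when the denominator is zero (assumed convention).
    """
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


# Hypothetical imbalanced sample: 90 cases and 10 controls, with a
# classifier that labels every sample as a case.
tp, fn, fp, tn = 90, 0, 10, 0

print("F1 :", round(f1_score(tp, fp, fn), 3))
print("MCC:", round(mcc(tp, tn, fp, fn), 3))
```

In this made-up example F1 is high even though the classifier never identifies a control, while MCC, which uses all four cells of the confusion matrix, is 0.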
Table of contents
- Cover Page
- Half-Title Page
- Series Page
- Title Page
- Copyright Page
- Dedication Page
- Contents
- Preface
- Acknowledgments
- 1. Introduction to Machine Learning
- 2. Overview of Machine Learning in Microbiome Research
- 3. Assessing Model Accuracy and Goodness-of-Fit Tests for Normality
- 4. Overfitting and Underfitting
- 5. Assessing Model Accuracy Using Cross-Validation
- 6. Feature Engineering and Model Selection
- 7. Logistic Regression
- 8. Support Vector Machines
- 9. Classification Trees
- 10. Random Forest
- 11. The Evolution of Tree-Based Algorithms
- 12. Extreme Gradient Boosting (XGBoost)
- 13. Artificial Neural Networks and Deep Learning
- 14. Machine Learning Microbiome with SIAMCAT
- 15. Basic Performance Metrics for Machine Learning Models
- 16. Matthews Correlation Coefficient
- 17. Area under the Receiver Operating Characteristic Curve (AUC-ROC)
- 18. Area under the Precision-Recall Curve (AUC-PR)
- 19. Comparisons of Machine Learning Classification Models with Tidymodels
- References
- Index