Artificial Intelligence in Healthcare and Medicine

  1. 286 pages
  2. English

About this book

This book provides a comprehensive overview of recent developments in clinical decision support systems, precision health, and data science in medicine. It targets clinical researchers and computational scientists seeking to understand the recent advances of artificial intelligence (AI) in health and medicine. Since AI and its applications are believed to have the potential to revolutionize healthcare and medicine, there is a clear need to explore and investigate the state-of-the-art advancements in the field. The book provides a detailed description of the advancements, challenges, and opportunities of using AI in medical and health applications. Over 10 case studies are included, covering topics related to biomedical image processing, machine learning for healthcare, clinical decision support systems, visualization of high-dimensional data, data security and privacy, bioinformatics, and biometrics. Universities may use the book as a secondary training text, and companies in the healthcare sector can benefit from the case studies. Moreover, this book also:

  • Provides an overview of the recent developments in clinical decision support systems, precision health, and data science in medicine
  • Examines the advancements, challenges, and opportunities of using AI in medical and health applications
  • Includes 10 cases for practical application and reference

Kayvan Najarian is a Professor in the Department of Computational Medicine and Bioinformatics, Department of Electrical Engineering and Computer Science, and Department of Emergency Medicine at the University of Michigan, Ann Arbor.

Delaram Kahrobaei is the University Dean for Research at City University of New York (CUNY), a Professor of Computer Science and Mathematics, Queens College CUNY, and the former Chair of Cyber Security, University of York.

Enrique Domínguez is a professor in the Department of Computer Science at the University of Malaga and a member of the Biomedical Research Institute of Malaga.

Reza Soroushmehr is a Research Assistant Professor in the Department of Computational Medicine and Bioinformatics and a member of the Michigan Center for Integrative Research in Critical Care, University of Michigan, Ann Arbor.

Information

Publisher: CRC Press
Year: 2022
Print ISBN: 9780367638405
eBook ISBN: 9781000565843

1 Machine Learning for Disease Classification: A Perspective

Jonathan Parkinson, Jonalyn H. DeCastro, Brett Goldsmith, and Kiana Aran
DOI: 10.1201/9781003120902-1
CONTENTS
  1.1 The Groundwork of Machine Learning for Disease Modeling
  1.2 The “Big Brother” of Predictions: Supervised Learning
  1.3 A Different Language is a Different Vision of Life: Biomedical Data Feature Selection
  1.4 Oh Me, Oh My, Omics: Reduction Techniques for Tackling Large Omics Datasets
  1.5 Let's Get Ready to Rumble! Training and Testing of Disease Model Predictions
  1.6 The Model Rhythm for Your Algorithm
  1.7 Seeing Through the Brush: Decision Trees
  1.8 Buttered Up Approach: Kernel Method
  1.9 Deep Blue Sea of Predictions: Deep Learning and Neural Networks
  1.10 Disease Model to Disease Specialist: Model Interpretability for Healthcare Stakeholders
  1.11 The Future Is not Something to Predict
  References

1.1 The Groundwork of Machine Learning for Disease Modeling

Recent years have witnessed dramatic growth in the volume and complexity of data available to biomedical research scientists (Bui & Van Horn, 2017). Deriving actionable insights that can improve diagnosis and treatment from these large heterogeneous datasets presents a formidable challenge for the community. Machine learning (ML) techniques have become a popular tool for the analysis of biomedical datasets, albeit not one yet widely used in medicine (Deo, 2015). Indeed, while the application of machine learning to disease classification holds considerable promise, it also faces unique obstacles arising from the nature of the data and from stakeholder expectations. A comprehensive review of machine learning algorithms and their applications in disease classification would be an ambitious task and one we will not attempt here. Rather, this perspective will provide an accessible high-level introduction to machine learning for disease classification: the mechanics of some popular algorithms, the challenges and pitfalls this field confronts, and some examples and insights from recent literature.
It is important to realize that machine learning is not some kind of mathematical alchemy that can transform irreproducible data into golden insights. ML is subject to the same “garbage in = garbage out” limitation that applies elsewhere in modeling; hence, good data curation is key for success (Beam & Kohane, 2018; Rajkomar et al., 2019). Nor is machine learning a new field of study; modern deep learning algorithms, for example, are an extension of perceptron models first proposed in the 1950s and 1960s (Schmidhuber, 2015). Instead, it is probably best to view machine learning as an extension of statistical modeling techniques, with the main difference that while statistics seeks to make inferences about populations, machine learning tries to find patterns that provide predictive power and that can be generalized (Bzdok et al., 2018).
Machine learning problems can be broadly classified into three paradigms. Unsupervised learning techniques like clustering and dimension reduction seek structure in unlabeled data and hence are often useful in data exploration or hypothesis generation. Such techniques would, for example, likely be useful for seeking subsets of a patient population that share many common features. In a study of this type, subgroups identified via clustering could then be further studied to determine whether they respond differently to a treatment. Supervised learning techniques, by contrast, learn to predict an output variable from an input vector, matrix, or multidimensional array (tensor). In disease classification, a common task for a supervised learning algorithm might be to determine whether an MRI or histology image indicates the presence or absence of disease; in this example, the output variable to be predicted would be the category, while the input would be the pixel values of the image. Finally, in the reinforcement learning paradigm, the algorithm is provided with a set of choices and is offered “rewards” when its choice leads to a better outcome. Although unsupervised learning techniques such as clustering are often powerful tools for the analysis of multi-omics data, in this review we will focus on supervised learning as the most directly relevant task for disease classification.
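The contrast between the unsupervised and supervised paradigms can be made concrete with a small sketch. The example below is illustrative only: the two-dimensional "patient feature" vectors and their labels are invented, k-means stands in for clustering in general, and a nearest-centroid rule stands in for a supervised classifier. None of these choices comes from the chapter itself.

```python
import math

# Toy 2-D "patient feature" vectors (hypothetical values, not real data).
patients = [(1.0, 1.2), (0.8, 1.0), (1.1, 0.9), (4.0, 4.2), (4.2, 3.9), (3.9, 4.1)]
# Labels are used only by the supervised part: 0 = no disease, 1 = disease.
labels = [0, 0, 0, 1, 1, 1]


def mean_point(points):
    """Return the coordinate-wise mean of a non-empty list of points."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(len(points[0])))


def nearest(point, centers):
    """Return the index of the center closest to point (Euclidean distance)."""
    return min(range(len(centers)), key=lambda k: math.dist(point, centers[k]))


def kmeans(points, centers, n_iter=10):
    """Unsupervised: group points around centers without using any labels."""
    for _ in range(n_iter):
        assignment = [nearest(p, centers) for p in points]
        new_centers = []
        for k in range(len(centers)):
            members = [p for p, a in zip(points, assignment) if a == k]
            # Keep the old center if a cluster ends up empty.
            new_centers.append(mean_point(members) if members else centers[k])
        centers = new_centers
    return [nearest(p, centers) for p in points], centers


def fit_centroids(points, point_labels):
    """Supervised: learn one centroid per labelled class."""
    classes = sorted(set(point_labels))
    return {
        c: mean_point([p for p, y in zip(points, point_labels) if y == c])
        for c in classes
    }


def predict(point, centroids):
    """Predict the class whose centroid is nearest to the point."""
    return min(centroids, key=lambda c: math.dist(point, centroids[c]))


# Unsupervised: subgroups are discovered from the features alone.
clusters, _ = kmeans(patients, centers=[patients[0], patients[3]])

# Supervised: the labels define what is to be predicted for a new patient.
centroids = fit_centroids(patients, labels)
prediction = predict((3.8, 4.0), centroids)
```

The clustering step never sees `labels`; whether the subgroups it finds are clinically meaningful would have to be established separately, which is why the text describes it as a tool for exploration and hypothesis generation.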

1.2 The “Big Brother” of Predictions: Supervised Learning

Supervised learning problems are those where an output or label y, sometimes called the “ground truth”, must be correctly predicted based on an input. The output y for disease classification is typically a category but may in some instances be a real-valued or complex number (regression), or a ranked category (ordinal regression). In some cases, the output may also be a real-valued vector (e.g., prediction of dihedral angles in a protein, based on an amino acid sequence). Successfully applying supervised learning to disease classification requires selecting and defining the right prediction problem. This may involve careful consideration of the labels to be predicted, the structure of existing workflows and pipelines, and the availability of relevant data.
It is of course crucial that the data is labeled correctly, and this may represent a challenge for biomedical applications in general. International Classification of Disease (ICD) codes, for example, are often used to indicate diagnoses in electronic health records (EHR). Errors in ICD code assignment are, however, not infrequent – estimates of the accuracy of ICD coding vary widely – and may arise from multiple sources (O’Malley et al., 2005). For example, physicians sometimes use abbreviations in their notes whose meaning may be ambiguous to the medical coder responsible for selecting and entering an appropriate diagnosis code (Sheppard et al., 2008).
Another more subtle problem can arise when there is a mismatch between the categories used to label the data and the ultimate objective of the study or of stakeholders. This problem is perhaps best illustrated with an example. Cancers that share the same tissue of origin exhibit a striking level of genetic diversity both within a single patient and across patients (Mroz & Rocco, 2017). It is well-established that different cancer subtypes exhibit different prognoses and may respond differently to the same treatment – indeed, many drug discovery efforts have focused on the development of drugs that target cancers with specific genetic features (Haque et al., 2012; Yersal, 2014). Breast cancers, for example, have been divided into five “intrinsic subtypes” (Howlader et al., 2018). More complex classification schemes and a variety of other risk markers have been proposed, since substantial diversity in genetic and transcriptomic profiles and in outcomes is observed within subtypes (Bayani et al., 2017; Dawson et al., 2013; Curtis et al., 2012; Russnes et al., 2017). If a model that classifies breast cancers based on genetic, transcriptomic, and/or other information is desired, the categorization chosen should clearly be appropriate for the ultimate goal of the study: in other words, the labels that are generated should be clinically useful.
When defining metrics for predictive models, it is equally important to take into consideration the existing workflows currently in practice. Ideally, a model should be chosen to solve a problem that takes advantage of the strengths and minimizes the weaknesses of existing pipelines. Steiner et al. (2018), for example, found that their deep learning algorithm exhibited a reduced false-negative rate for identification of breast cancer metastases in lymph nodes when compared with human pathologists. The algorithm, however, also exhibited an increased false-positive rate, especially if acquired images were out of focus. To overcome this problem, they designed a machine learning-assisted pipeline whereby the deep learning algorithm color-highlighted regions of interest for review by the pathologist, where different colors indicated different levels of confidence. This pipeline significantly improved both accuracy and speed compared to identification performed by unaided pathologists, thereby improving rather than reinventing the existing pipeline.
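The false-negative and false-positive rates discussed above follow directly from the counts in a binary confusion matrix. The sketch below shows the arithmetic on invented predictions; the numbers are purely illustrative and are not the results reported by Steiner et al. (2018).

```python
def error_rates(y_true, y_pred):
    """Return (false_negative_rate, false_positive_rate) for binary labels.

    Labels are 1 (disease present) and 0 (disease absent).
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    # FNR: fraction of true disease cases that were missed.
    fnr = fn / (fn + tp) if (fn + tp) else 0.0
    # FPR: fraction of disease-free cases that were flagged.
    fpr = fp / (fp + tn) if (fp + tn) else 0.0
    return fnr, fpr


# Hypothetical ground truth and predictions for ten slides.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
reader_a = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]  # misses cases, no false alarms
reader_b = [1, 1, 1, 1, 1, 1, 0, 0, 0, 0]  # misses nothing, some false alarms

fnr_a, fpr_a = error_rates(y_true, reader_a)
fnr_b, fpr_b = error_rates(y_true, reader_b)
```

Two readers with complementary error profiles like these are the situation in which a combined pipeline can plausibly do better than either alone.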
In addition to these considerations, the model should require only data that will be readily available at the time when the prediction will be made (Chen et al., 2019). In some cases, accurate diagnosis may require time-consuming lab tests whose results will seldom be available at the time of admission. A model that relies on late-arriving information may be severely limited in scope, while a model that can provide an accurate diagnosis without it may in such instances offer a key advantage. In 2019, for example, Yelin et al. used patient data from over 700,000 urinary tract infections (UTIs) to build a gradient boosted trees model and a logistic regression model to predict antibiotic resistance category solely based on patient history (Yelin et al., 2019). Their models were able to significantly outperform physicians and dramatically reduce the rate of incorrect prescriptions (i.e., situations where a patient is prescribed an antibiotic to which their infection is resistant). Since only patient history data was required, their approach can choose an antibiotic at the time of admission without waiting for antibiotic susceptibility testing results, which may require several days or more (Van Camp et al., 2020).
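One simple way to enforce the "available at prediction time" constraint is to record, for each candidate feature, when its value typically becomes known, and to drop anything that arrives later than the moment of prediction. The sketch below assumes a hypothetical feature catalogue; the feature names and delays are invented for illustration and are not taken from Yelin et al. (2019).

```python
# Hypothetical catalogue: hours after admission at which each value is known.
FEATURE_AVAILABLE_AFTER_HOURS = {
    "age": 0,
    "prior_utis": 0,
    "previous_antibiotic_purchases": 0,
    "urine_culture_result": 48,
    "susceptibility_test_result": 72,
}


def usable_features(availability, prediction_time_hours):
    """Return the sorted names of features known by the prediction time."""
    return sorted(
        name
        for name, hours in availability.items()
        if hours <= prediction_time_hours
    )


def project(record, features):
    """Keep only the allowed features of a single patient record."""
    return {name: record[name] for name in features if name in record}


# Hypothetical patient record containing both early and late information.
record = {
    "age": 67,
    "prior_utis": 3,
    "previous_antibiotic_purchases": 5,
    "susceptibility_test_result": "resistant",
}

at_admission = usable_features(FEATURE_AVAILABLE_AFTER_HOURS, 0)
model_input = project(record, at_admission)
```

Applying the same filter when building the training set guards against a model that looks accurate in retrospect only because it was given information that would not exist yet when the prediction is needed.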
Availability of relevant data is a key challenge for developing machine learning models for disease classification. Healthcare datasets are in general both highly heterogeneous and highly fragmented. A wide variety of EHR systems are marketed; there is little standardization across systems and software packages, so that pooling data acquired on different systems is inherently challenging (DeMartino & Larsen, 2013; Miller, 2011). EHR systems are often designed to prioritize the needs of medical billers and the insurance payors with whom they will communicate, so that the data is seldom formatted in a manner conducive to the needs of researchers or even physicians, many of whom report dissatisfaction with their healthcare system’s EHR software (Agrawal & Prabakaran, 2020; Gawande, 2018). Physicians frequently record their observations in the form of notes that cannot easily be translated into encoded input suitable for modeling purposes (DeMartino & Larsen, 2013). Furthermore, the pooling and sharing of data between healthcare providers and different sources are substantially hindered by patient privacy and regulatory concerns (Agrawal et al., 2020).
Ultimately, these issues combine to ensure that assembling and pre-processing the healthcare datasets necessary for predictive modeling may incur substantial effort and expense. Even once such datasets have been assembled, they may appear to be large and yet contain data for a wide array of conditions, so that only a handful of datapoints relevant to a particular disorder or outcome of interest appear in the dataset. Adibuzzaman et al., for example, report their experience with the Medical Information Mart for Intensive Care (MIMIC III) from Beth Israel Deaconess Hospital. This superficially large dataset contains data for some 50,000 patient encounters; yet if a researcher interested in drug-drug interactions were to query it for patients on antidepressants also taking an antihistamine, for example, they would retrieve a mere 44 datapoints (Adibuzzaman et al., 2017). Finally, most healthcare datasets contain missing values such that key information available for some patients is unavailable for others (Allen et al., 2014).
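The way a superficially large dataset shrinks under a specific query, and shrinks again once missing values are taken into account, can be sketched as follows. The records, field names, and the `lactate` measurement are all hypothetical; this is not the MIMIC III schema or the query used by Adibuzzaman et al. (2017).

```python
# Hypothetical patient records; None marks a missing lab value.
records = [
    {"id": 1, "medications": {"sertraline", "cetirizine"}, "lactate": 1.8},
    {"id": 2, "medications": {"sertraline"}, "lactate": None},
    {"id": 3, "medications": {"metformin"}, "lactate": 2.4},
    {"id": 4, "medications": {"fluoxetine", "loratadine"}, "lactate": None},
    {"id": 5, "medications": set(), "lactate": 1.1},
]

# Small illustrative drug-class lists (far from exhaustive).
ANTIDEPRESSANTS = {"sertraline", "fluoxetine"}
ANTIHISTAMINES = {"cetirizine", "loratadine"}


def on_both(record, class_a, class_b):
    """True if the patient takes at least one drug from each class."""
    meds = record["medications"]
    return bool(meds & class_a) and bool(meds & class_b)


# Step 1: restrict to the cohort of interest.
cohort = [r for r in records if on_both(r, ANTIDEPRESSANTS, ANTIHISTAMINES)]

# Step 2: drop cohort members whose key measurement is missing.
complete = [r for r in cohort if r["lactate"] is not None]

# Share of all records lacking the measurement.
missing_fraction = sum(r["lactate"] is None for r in records) / len(records)
```

Dropping incomplete rows, as in step 2, is only the simplest way to handle missing values, and it compounds the sample-size problem that the cohort filter has already created.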
For all these reasons, organizing healthcare data to improve access for biomedical...

Table of contents

  1. Cover
  2. Half Title
  3. Title Page
  4. Copyright Page
  5. Contents
  6. Editor Biographies
  7. List of Contributors
  8. Introduction
  9. Chapter 1 Machine Learning for Disease Classification: A Perspective
  10. Chapter 2 A Review of Automatic Cardiac Segmentation using Deep Learning and Deformable Models
  11. Chapter 3 Advances in Artificial Intelligence Applied to Heart Failure
  12. Chapter 4 A Combination of Dilated Adversarial Convolutional Neural Network and Guided Active Contour Model for Left Ventricle Segmentation
  13. Chapter 5 Automated Methods for Vessel Segmentation in X-ray Coronary Angiography and Geometric Modeling of Coronary Angiographic Image Sequences: A Survey
  14. Chapter 6 Super-Resolution of 3D Magnetic Resonance Images of the Brain
  15. Chapter 7 Head CT Analysis for Intracranial Hemorrhage Segmentation
  16. Chapter 8 Wound Tissue Classification with Convolutional Neural Networks
  17. Chapter 9 Artificial Intelligence Methodologies in Dentistry
  18. Chapter 10 Literature Review of Computer Tools for the Visually Impaired: A Focus on Search Engines
  19. Chapter 11 Tensor Methods for Clinical Informatics
  20. Index