Abstract
Feature selection is an important step in everyday data mining. Its aim is to reduce the number of potentially irrelevant expert features describing a dataset to a smaller set of important ones. Unlike feature reduction and transformation techniques, feature selection keeps a subset of the original features, thus maintaining the interpretability of the final models, which is especially important for researchers and medical professionals in the field of biomedicine. The aim of this chapter is to provide an in-depth overview of the various feature selection approaches that are applicable to biomedical signal classification, including filters, wrappers, embedded methods, and various hybrid approaches. In addition, the recently developed methods based on sequential feature selection and data filtering from streams are considered. Feature selection implementations in current software solutions are described. A comparison of feature selection with the deep learning approach is provided. The feature selection approach used in our own web-based biomedical signal analysis platform called MULTISAB (multiple time series analysis in biomedicine) is presented.
1.1 Introduction
The abundance of data available today constitutes an increasing problem for currently developed data modeling methods. Large datasets, some of which do not even fit in the main memory of typical or advanced hardware configurations, present an obstacle to discovering relevant new information. To make the analysis of large datasets more efficient, feature selection and dimensionality reduction methods were developed in parallel with the improvement in data modeling techniques (e.g., classification, clustering, association rules). The aim of feature selection is to reduce the number of potentially irrelevant or redundant expert features describing a dataset to a smaller number of important ones, so that modeling algorithms become feasible to use and as effective as possible [1]. Thus, feature selection reduces the dataset size, which accelerates model construction and makes the dataset tractable for machine-learning modeling methods. Feature selection can also improve the accuracy of the modeling methods, because they need not concern themselves with many irrelevant features. The main difference between feature selection and dimensionality reduction (also called feature transformation or feature extraction [2]) is that feature selection keeps a subset of the original features, thus maintaining the interpretability of the final models. Although some researchers confuse the terms feature selection and dimensionality reduction [3], there is a clear distinction concerning interpretability. Maintaining interpretability is especially important in the field of biomedicine, where medical professionals usually require an explanation of the machine-reasoning process to understand how decision support software reached a proposed decision [4].
Therefore, in this work, we do not consider feature transformation (dimensionality reduction) methods, although they are commonly used to improve model accuracy.
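The distinction between selection and transformation drawn above can be illustrated with a minimal sketch. It is not a method from this chapter: it assumes a simple correlation-based filter as the selection step and principal component analysis (via singular value decomposition) as the transformation step, and the feature names and synthetic data are hypothetical. The point is only that selection returns original, named columns, whereas transformation returns mixtures of all columns.

```python
import numpy as np

def select_top_k(X, y, k):
    """Filter-style selection: rank features by |Pearson correlation| with y
    and return the indices of the k best features (plus all scores)."""
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    denom = np.sqrt((Xc ** 2).sum(axis=0) * (yc ** 2).sum())
    scores = np.abs(Xc.T @ yc) / np.where(denom == 0, 1.0, denom)
    keep = np.sort(np.argsort(scores)[::-1][:k])
    return keep, scores

def pca_transform(X, k):
    """Transformation: project centered data onto the first k principal axes."""
    Xc = X - X.mean(axis=0)
    _, _, vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ vt[:k].T, vt[:k]

# Synthetic two-class data (assumption): two informative and two noise features.
rng = np.random.default_rng(0)
n = 200
y = rng.integers(0, 2, size=n).astype(float)
names = ["mean_rr", "sdnn", "noise_a", "noise_b"]  # hypothetical feature names
X = np.column_stack([
    y + 0.3 * rng.normal(size=n),    # informative
    -y + 0.3 * rng.normal(size=n),   # informative
    rng.normal(size=n),              # irrelevant
    rng.normal(size=n),              # irrelevant
])

# Selection keeps original columns, so the result can still be named.
keep, scores = select_top_k(X, y, k=2)
selected_names = [names[i] for i in keep]
X_selected = X[:, keep]

# Transformation yields new axes; each one is a weighted mix of all features.
Z, components = pca_transform(X, k=2)

print("selected features:", selected_names)
print("loadings of first principal axis:", np.round(components[0], 2))
```

In this sketch, `X_selected` contains unmodified columns of `X`, so a model built on it can be explained in terms of the original expert features. Each column of `Z`, by contrast, depends on every input feature through the loadings in `components`, which is why such methods are harder to interpret.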
Feature selection may be applied in a variety of scenarios, depending on the research goal:
- in classification problems, where the task is to differentiate between two or more categories;
- in clustering problems, where the exact categories are unknown, but where features may nevertheless be irrelevant to describe the coherent sample cluster;
- in descriptive problems, where one only wants to discover which features (out of many) are relevant for modeling of their problem, but where classification and/or clustering are not immediately needed; and
- in streaming information problems, where one needs to decide quickly whether a given feature is important enough to warrant its further storage and processing.
The most common application in biomedicine is, by far, the use of feature selection in biomedical signal classification. Here, the goal is to remove all or most of the informationally weak (unimportant, irrelevant) features and thus enable more efficient and accurate decisions about the classes of biomedical signals under consideration. This usually involves discerning between several disorders or states of the organism measured by the signal (e.g., classification of arrhythmic heartbeats, detection or prediction of epilepsy). A subtype of classification in which learning starts from a small set of labeled examples and proceeds iteratively on the (usually large) unlabeled set is called semi-supervised learning [5]. Although it is occasionally used in signal reconstruction, its applications in biomedical signal classification have thus far been limited and are thus not discussed here. In addition, although clustering and descriptive problems are useful in general, there are not many prominent applications of feature selection in such contexts in biomedical signal processing. Thus, feature selection in semi-supervised learning, clustering, and descriptive models is considered out of scope for this chapter. Streaming information problems usually arise within the scope of classification and may be considered relevant for current biomedical engineering research. Hence, we include a description of online streaming feature selection in this work.
Feature selection is an integral part of the signal analysis process. This chapter provides a detailed survey of various feature selection methods and their applications in biomedical signal analysis, including their connection with data mining and machine learning, mostly in classification, but also in other topics in the field of artificial intelligence applications in biomedicine. The primary aim of this chapter is to provide an in-depth overview of the various feature selection approaches that are applicable to biomedical signal classification, including filters, w...