
- 395 pages
- English
Computational Methods for Data Analysis
About this book
This graduate text covers a variety of mathematical and statistical tools for the analysis of big data coming from biology, medicine and economics. Neural networks, Markov chains, tools from statistical physics and wavelet analysis are used to develop efficient computational algorithms, which are then used for the processing of real-life data using Matlab.
Computational Methods for Data Analysis by Yeliz Karaca and Carlo Cattani is available in PDF and/or ePUB format and is catalogued under Mathematics & Econometrics.
1 Introduction
In mathematics and computer science, an algorithm is defined as an unambiguous specification of how to solve a problem, based on a sequence of logical-numerical deductions. More generally, “algorithm” is the name given in these fields to any systematic method of numerical calculation, or to a route drawn for solving a problem or achieving a goal. Tasks such as calculation, data processing and automated reasoning can be realized through algorithms. The concept of the algorithm has been around for centuries. The modern notion began to take shape with attempts to solve what David Hilbert (1928) called the Entscheidungsproblem (decision problem). The formalizations that followed were efforts to define “effective calculability” or “effective method”: the Gödel–Herbrand–Kleene recursive functions (1930, 1934 and 1935), the lambda calculus of Alonzo Church (1936), “Formulation 1” by Emil Post (1936) and the Turing machines of Alan Turing (1936–37 and 1939). Since then, particularly in the twentieth century, there has been a growing interest in data analysis algorithms as well as their applications to various interdisciplinary datasets.
Data analysis can be defined as the process of collecting raw data and converting it into information that users find useful in their decision-making. Data are collected and analyzed in order to answer questions, test hypotheses or refute theories. According to the statistician John Tukey (1961), data analysis comprises (1) procedures for analyzing data, (2) techniques for interpreting the results of these procedures and (3) methods for planning the gathering of data so that its analysis becomes more accurate and also much easier. It also comprises the entire mechanism and outcomes of (mathematical) statistics that apply to the analysis of data. Numerous ways exist for the classification of algorithms and each of them has its own merits.
Accordingly, knowledge itself turns into power when it is processed, analyzed and interpreted in a proper and accurate way. With this key motive in mind, our general aim in this book is to integrate relevant findings in an interdisciplinary approach, discussing various relevant methods and thus putting forth a common approach to both problems and solutions. The main aim of this book is to provide the readers with core skills regarding data analysis in interdisciplinary studies. Data analysis is characterized by three typical features: (1) algorithms for classification, clustering, association analysis, modeling, data visualization and the singling out of singularities; (2) the source code of computer algorithms for conducting data analysis; and (3) the specific fields (economics, physics, medicine, psychology, etc.) where the data are collected.
This book will help the readers build a bridge from equations to the source code of algorithms, and from the interpretation of results to meaningful information about the data and the process they represent. As the algorithms are developed further, it will be possible to grasp the significance of having a variety of variables. Moreover, the book will show how to use the results of data analysis for forecasting future developments, and for diagnosis and prediction in medicine and related fields. In this way, we will present how knowledge merges with applications.
With this concern in mind, the book will serve as a guide for interdisciplinary studies carried out by those engaged in fields such as mathematics, statistics, economics, medicine, engineering, neuroengineering, computer science, neurology, cognitive sciences and psychiatry.
In this book, we will analyze in detail important algorithms of data analysis and classification. We will discuss the contributions of the linear and multilinear models, decision trees, the naive Bayesian classifier, support vector machines, k-nearest neighbor and artificial neural network (ANN) algorithms. Besides these, the book will also cover fractal and multifractal methods combined with the ANN algorithm.
The main goal of this book is to provide the readers with core skills regarding data analysis in interdisciplinary datasets. The second goal is to analyze each of the main components of data analysis:
–Application of algorithms to real and synthetic datasets
–Specific application of data analysis algorithms to interdisciplinary datasets
–Detailed description of general concepts for extracting knowledge from data, which undergird the wide-ranging array of datasets and application algorithms
Accordingly, each component has adequate resources so that data analysis can be developed through algorithms. This comprehensive collection is organized into three parts:
–Classification of real and synthetic datasets by algorithms
–Singling out singularity features with fractals and multifractals for real and synthetic datasets
–Achieving a high accuracy rate in the classification of the singled-out singularity features with an ANN algorithm (the learning vector quantization algorithm is one of the ANN algorithms).
Moreover, we aim to bring together three scientific endeavors and to give direction to future applications to
–real and synthetic datasets,
–fractals and multifractals for the singled-out singularity data obtained from real and synthetic datasets and
–data analysis algorithms for the classification of datasets.
The main objectives are as follows:
1.1 Objectives
Our book intends to enhance knowledge and facilitate learning through the linear and multilinear models, decision trees, the naive Bayesian classifier, support vector machines, k-nearest neighbor and ANN algorithms, as well as fractal and multifractal methods with ANN, with the following goals:
–Understand what data analysis means and how data analysis can be employed to solve real problems through the use of computational mathematics
–Recognize whether a data analysis solution with an algorithm is a feasible alternative for a specific problem
–Draw inferences on the results of a given algorithm through a discovery process
–Apply relevant mathematical rules and statistical techniques to evaluate the results of a given algorithm
–Recognize several different computational mathematical techniques for data analysis and optimize the results by selecting the most appropriate strategy
–Develop a comprehensive understanding of how different data analysis techniques build models to solve problems related to decision-making, classification, selection of the significant critical attributes from datasets and so on
–Understand the types of problems that can be solved by combining an expert-system problem-solving algorithm approach with a data analysis strategy
–Develop a general awareness of the structure of a dataset and how a dataset can be used to enhance opportunities in different fields, including but not limited to psychiatry, neurology (radiology) and economics
–Understand how data analysis through computational mathematics can be applied to algorithms via concrete examples whose procedures are explained in depth
–Handle independent variables that are directly correlated with the dependent variable
–Learn how to use a decision tree to design a rule-based system
–Calculate the probability that samples with certain attributes in a dataset belong to a given class
–Determine which training samples correspond to the k smallest entries of the computed distance vector
–Specify the significant singled-out singularities in data
–Know how to implement code and use it in accordance with computational mathematical principles
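To make the k-nearest neighbor objective above concrete, the following is a minimal sketch of the idea: compute a distance vector from a query sample to every training sample, keep the k smallest entries, and take a majority vote over the corresponding class labels. The book works in Matlab; Python is used here only for illustration. The function name `knn_predict`, the Euclidean distance and the toy data are assumptions made for this example and are not taken from the book.

```python
import math
from collections import Counter

def knn_predict(train_X, train_y, query, k=3):
    """Classify `query` by majority vote among its k nearest training samples.

    Hypothetical helper for illustration; Euclidean distance is an assumption.
    Returns the predicted label and the indices of the k nearest samples.
    """
    # Distance vector: distance from the query to every training sample.
    distances = [math.dist(x, query) for x in train_X]
    # Indices of the k smallest entries of the distance vector.
    nearest = sorted(range(len(distances)), key=distances.__getitem__)[:k]
    # Majority vote over the labels of those neighbors.
    votes = Counter(train_y[i] for i in nearest)
    return votes.most_common(1)[0][0], nearest

# Toy data, made up for illustration: two well-separated clusters.
train_X = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1), (5.0, 5.0), (5.2, 4.9), (4.8, 5.1)]
train_y = ["A", "A", "A", "B", "B", "B"]

label, neighbors = knn_predict(train_X, train_y, query=(1.1, 1.0), k=3)
print(label, sorted(neighbors))
```

With these toy clusters the query lies next to the first three samples, so the vote is unanimous. On real data the choice of k and of the distance measure would need to be evaluated rather than fixed in advance.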
1.2 Intended audience
Our intended audience comprises undergraduate, graduate and postgraduate students as well as academics and scholars; it also encompasses a wider range of readers who specialize or are interested in applying data analysis to real-world problems in fields such as engineering, medical studies, mathematics, physics, social sciences and economics. The purpose of the book is to provide the readers with the mathematical foundations for some of the main computational approaches to data analysis, decision-making, classification and the selection of significant critical attributes. These include techniques and methods for the numerical solution of systems of linear and nonlinear algorithms, which requires making connections between the techniques of numerical analysis and algorithms. The content of the book focuses on presenting the main algorithmic approaches and the underlying mathematical concepts, with particular attention given to implementation aspects. Hence, typical mathematical environments, namely Matlab and the available solvers and libraries, are used throughout the chapters.
In writing this text, we directed our attention toward three groups of individuals:
–Academics who wish to teach a unit and conduct a workshop or an entire course on essential computational mathematical approac...
Table of contents
- Cover
- Title Page
- Copyright
- Preface
- Acknowledgment
- Contents
- 1 Introduction
- 2 Dataset
- 3 Data preprocessing and model evaluation
- 4 Algorithms
- 5 Linear model and multilinear model
- 6 Decision Tree
- 7 Naive Bayesian classifier
- 8 Support vector machines algorithms
- 9 k-Nearest neighbor algorithm
- 10 Artificial neural networks algorithm
- 11 Fractal and multifractal methods with ANN
- Index