eBook - ePub

Data Mining and Machine Learning Applications

Name: Data Mining and Machine Learning Applications
Author: Rohit Raja, Kapil Kumar Nagwanshi, Sandeep Kumar, K. Ramya Laxmi

English

About This Book

DATA MINING AND MACHINE LEARNING APPLICATIONS

The book elaborates in detail on the current needs of data mining and machine learning and promotes mutual understanding among research in different disciplines, thus facilitating research development and collaboration.

Data, the latest currency of today's world, is the new gold. In this new form of gold, the most beautiful jewels are data analytics and machine learning. Data mining and machine learning are considered interdisciplinary fields. Data mining is a subset of data analytics, and machine learning involves algorithms that improve automatically through experience with data.

Massive datasets can be classified and clustered to obtain accurate results, and classification and clustering methods are the most commonly used techniques. Accuracy and error rates are calculated for regression, classification, and clustering models built with algorithms such as support vector machines and neural networks trained with forward and backward propagation. Applications include fraud detection, image processing, medical diagnosis, weather prediction, and e-commerce.

The book features:

A review of the state of the art in data mining and machine learning,
A review and description of the learning methods in human-computer interaction,
Implementation strategies and future research directions for meeting the long-term design and application requirements of modern, real-time applications,
The scope and implementation of a majority of data mining and machine learning strategies,
A discussion of real-time problems.

Audience

Industry and academic researchers, scientists, and engineers in information technology, data science and machine and deep learning, as well as artificial intelligence more broadly.

Information

Publisher

Wiley-Scrivener

Year

2022

ISBN

9781119792505

Edition

Topic

Computer Science

Subtopic

Data Storage

1 Introduction to Data Mining

Santosh R. Durugkar¹, Rohit Raja², Kapil Kumar Nagwanshi³* and Sandeep Kumar⁴

¹Amity University Rajasthan, Jaipur, India

²IT Department, GGV Bilaspur Central University, Bilaspur, India

³ASET, Amity University Rajasthan, Jaipur, India

⁴Computer Science and Engineering Department, Koneru Lakshmaiah Education Foundation, Vaddeswaram, Andhra Pradesh, India

Abstract

Data mining, as the name suggests, is the extraction of desired, meaningful information from datasets. Its methods and algorithms help researchers and students develop numerous applications for end-users. Its presence in the healthcare industry, marketing, scientific applications, and other fields enables end-users to extract the required information from a data collection. In this chapter, we give a brief introduction to data mining. In the initial section, we discuss knowledge discovery in databases (KDD) and its phases: data cleaning, data integration, data selection and transformation, and representation. A comparative discussion of classification and clustering helps the end-user distinguish between these techniques. We also discuss the applications and algorithms of data mining. An introduction to basic clustering algorithms (K-means clustering, hierarchical clustering, fuzzy clustering, and density-based clustering) will help the end-user select a specific algorithm for a given application. In the last section of this chapter, we introduce data mining tools such as Python, RapidMiner, and KNIME, which the user can apply to extract the required information.

Keywords: Data mining, KDD, clustering, classification, Python, KNIME

1.1 Introduction

1.1.1 Data Mining

‘Mining’ refers to extracting meaningful information from databases. It helps researchers, students, and other IT professionals extract exactly the significant details they need and develop the desired applications [1, 2]. Data mining is also known as knowledge discovery from databases (KDD). Applications of KDD include medicine and hospitals, marketing, educational systems, scientific applications, e-commerce, retail, biological analysis, counterterrorism, data warehousing, decision making in the energy sector, spatial data mining, and logistics [4–6].

1.2 Knowledge Discovery in Database (KDD)

KDD helps detect previously unknown patterns, i.e., it extracts hidden patterns and data from massive datasets [3, 6]. Figure 1.1 gives an overview of knowledge discovery in databases (KDD), which consists of the following phases:

Data cleaning: This step removes irrelevant data, i.e., unwanted data and records. The collected data may also contain missing values, which must either be removed or imputed [7].
Data integration: Data is collected from heterogeneous sources and integrated into a common store such as a data warehouse (DW). A very common technique, Extract-Transform-Load (ETL), is useful here. Integrating data from multiple sources requires proper synchronization between the systems [2].
Data selection & transformation: Once the required data is selected, the next task is data transformation, which converts the data into the form required by the mining procedure [8, 9].
Pattern evaluation: Evaluation is based on a set of measures; once these measures are applied, the retrieved results are compared against the stored patterns [9–11].
Knowledge representation: This phase presents the processed data in the required formats, such as tables and reports. Knowledge representation can be said to generate rules, and it makes visualization of the results possible [10].

Figure 1.1 Knowledge discovery in databases (KDD).
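
The phases listed above can be illustrated with a small Python sketch. It is only one possible reading of the phases, not the chapter's own implementation: the patient records, the field names, the choice of mean imputation, min-max scaling, and the support/confidence measures are all assumptions made for illustration, and only the Python standard library is used.

```python
from statistics import mean

# Hypothetical patient records from two sources; all names and values are made up.
clinic_a = [
    {"patient_id": 1, "age": 34, "glucose": 90.0},
    {"patient_id": 2, "age": None, "glucose": 150.0},   # missing age
    {"patient_id": 3, "age": 51, "glucose": None},      # missing glucose
]
clinic_b = [
    {"patient_id": 4, "age": 45, "glucose": 110.0},
    {"patient_id": 4, "age": 45, "glucose": 110.0},     # duplicate record
]


def clean(rows, fields):
    """Data cleaning: drop exact duplicates, then impute missing values with the field mean."""
    unique = []
    for row in rows:
        if row not in unique:
            unique.append(dict(row))  # copy so the source records stay untouched
    for field in fields:
        known = [r[field] for r in unique if r[field] is not None]
        fill = mean(known)
        for r in unique:
            if r[field] is None:
                r[field] = fill
    return unique


def integrate(*sources):
    """Data integration: merge records from several sources into one collection."""
    return [row for source in sources for row in source]


def select_and_transform(rows, fields):
    """Data selection and transformation: keep chosen fields and min-max scale them to [0, 1]."""
    selected = [{f: r[f] for f in fields} for r in rows]
    for f in fields:
        lo = min(r[f] for r in selected)
        hi = max(r[f] for r in selected)
        span = (hi - lo) or 1.0  # avoid division by zero when all values are equal
        for r in selected:
            r[f] = (r[f] - lo) / span
    return selected


def evaluate(rows, antecedent, consequent):
    """Pattern evaluation: support and confidence of the rule antecedent -> consequent."""
    matches_a = [r for r in rows if antecedent(r)]
    matches_both = [r for r in matches_a if consequent(r)]
    support = len(matches_both) / len(rows)
    confidence = len(matches_both) / len(matches_a) if matches_a else 0.0
    return support, confidence


def represent(rows, fields):
    """Knowledge representation: render the records as a plain-text table."""
    lines = ["  ".join(f"{f:>8}" for f in fields)]
    for r in rows:
        lines.append("  ".join(f"{r[f]:>8.2f}" for f in fields))
    return "\n".join(lines)


fields = ["age", "glucose"]
cleaned = integrate(clean(clinic_a, fields), clean(clinic_b, fields))
scaled = select_and_transform(cleaned, fields)
support, confidence = evaluate(
    cleaned,
    antecedent=lambda r: r["age"] > 40,       # hypothetical candidate pattern
    consequent=lambda r: r["glucose"] > 100,
)
print(represent(scaled, fields))
print(f"support={support:.2f} confidence={confidence:.2f}")
```

In this sketch each source is cleaned before integration, following the order in which the phases are listed; a real pipeline might instead impute after integration so that the fill values reflect all sources.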

1.2.1 Importance of Data Mining

◦ Useful in predictive analysis.
◦ Storing and managing data in multidimensional systems.
◦ Identifying hidden patterns.
◦ Knowledge representation in desired formats, etc. [11].

1.2.2 Applications of Data Mining

Fraud Detection
- ◦ Data mining identifies patterns, i.e...