eBook - ePub
Data Driven Decision Making using Analytics
Parul Gandhi, Surbhi Bhatia, Kapal Dev
- 138 pages
- English
- ePUB (mobile friendly)
About This Book
This book explains data analytics for decision making in terms of models and algorithms, theoretical concepts, applications, and experiments in relevant domains or focused on specific issues. It explores the concepts of database technology, machine learning, knowledge-based systems, high-performance computing, information retrieval, finding patterns hidden in large datasets, and data visualization. It also presents various paradigms including pattern mining, clustering, classification, and data analysis. The overall aim is to provide technical solutions in the field of data analytics and data mining.
Features:
- Covers descriptive statistics with respect to predictive analytics and business analytics.
- Discusses different data analytics platforms for real-time applications.
- Explains SMART business models.
- Includes algorithms in data science along with automated methods and models.
- Explores varied challenges encountered by researchers and businesses in the realm of real-time analytics.
This book is aimed at researchers and graduate students in data analytics, data science, data mining, and signal processing.
1 Securing Big Data Using Big Data Mining
Preety¹, Jagjit Singh Dhatterwal², and Kuldeep Singh Kaswan³
¹ Assistant Professor, PDM University, Bahadurgarh, Jhajjar, Haryana, India
² Assistant Professor, PDM University, Bahadurgarh, Jhajjar, Haryana, India
³ Associate Professor, Galgotias University, Greater Noida, Gautam Buddha Nagar, UP, India
DOI: 10.1201/9781003199403-1
Contents
- 1.1 Big Data
- 1.1.1 Big Data V’s
- 1.1.1.1 Volume
- 1.1.1.2 Variety
- 1.1.1.3 Velocity
- 1.1.1.4 Veracity
- 1.1.1.5 Validity
- 1.1.1.6 Visualization of Big Data
- 1.1.1.7 Value
- 1.1.1.8 Big Data Hiding
- 1.1.2 Challenges with Big Data
- 1.1.3 Analytics of Big Data
- 1.1.3.1 Use Cases Used in Big Data Analytics
- 1.1.3.1.1 Amazon’s “360-Degree View”
- 1.1.3.1.2 Amazon – Improving User Experience
- 1.1.4 Social Media Analysis and Response
- 1.1.4.1 IoT – Preventive Maintenance and Support
- 1.1.4.2 Healthcare
- 1.1.4.3 Insurance Fraud
- 1.1.5 Big Data Analytics Tools
- 1.1.5.1 Hadoop
- 1.1.5.2 MapReduce Optimize
- 1.1.5.3 HBase Hadoop Structure
- 1.1.5.4 Hive Warehousing Tool
- 1.1.5.5 Pig Programming
- 1.1.5.6 Mahout Sub-Project Apache
- 1.1.5.7 Non-Structured Query Language
- 1.1.5.8 Bigtable
- 1.1.6 Security Threats for Big Data
- 1.1.7 Big Data Mining Algorithms
- 1.1.8 Big Data Mining for Big Data Security
- 1.1.8.1 Securing Big Data
- 1.1.8.2 Real-Time Predictive and Active Intrusion Detection Systems
- 1.1.8.3 Securing Valuable Information Using Data Science
- 1.1.8.4 Pattern Discovery
- 1.1.8.5 Automated Detection and Response Using Data Science
- 1.1.9 Conclusions
1.1 Big Data
The advent and widespread adoption of IoT (Internet of Things) devices, business intelligence systems, and AI (artificial intelligence) has continuously increased the amount of data in existence. This growth is driven by the development of self-driving cars, smart cities, home and factory automation, intelligent avionics systems, weaponry automation, and medical process automation. Ericsson has estimated that nearly 29 billion connected devices are expected by 2022, of which 18 billion will be IoT devices. The number of IoT units, driven by new use cases, is projected to grow by 21% between 2016 and 2022. IDC reports that by 2025, real-time data will make up more than a quarter of all data. Over the years, control systems have kept evolving at different levels of big data information security. These control measures, although they serve as the underlying strategies for securing big data, have limited capability against recent attacks, as malicious hackers have found new ways of launching destructive operations on big data infrastructures [1].
Digital data volumes are expected to grow to the scale of zettabytes. This forecast gives insight into the higher rate of vulnerabilities and the large-scale data security loopholes that may arise. Big data companies face growing challenges in securing and managing this constantly growing data.
Some of the challenges include the following:
- Interception or corruption of data in transit.
- Data in storage, which can be held hostage by malicious parties or hackers.
- Output data, which can also be a point of malicious attack.
- Weak or absent encryption across the variety of data sources.
- Incompatibility arising from the different forms of data implementation used by different sources.
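The first challenge, corruption of data in transit, can be illustrated with a message authentication code. The following is a minimal sketch rather than a method prescribed by this chapter: it assumes sender and receiver already share a secret key (key distribution is not shown), and the payload and function names are hypothetical.

```python
import hashlib
import hmac
import secrets


def sign(message: bytes, key: bytes) -> bytes:
    """Return an HMAC-SHA256 tag for the message."""
    return hmac.new(key, message, hashlib.sha256).digest()


def verify(message: bytes, tag: bytes, key: bytes) -> bool:
    """Compare tags in constant time; False means the data was altered."""
    return hmac.compare_digest(sign(message, key), tag)


# Assumption: both parties already hold this shared secret.
key = secrets.token_bytes(32)

# Hypothetical sensor payload sent over the network together with its tag.
payload = b'{"sensor": "s1", "temp": 21.5}'
tag = sign(payload, key)

ok = verify(payload, tag, key)
tampered_ok = verify(payload.replace(b"21.5", b"99.9"), tag, key)
print(ok, tampered_ok)
```

A tag of this kind detects modification but does not hide the content; confidentiality would additionally require encryption.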
1.1.1 Big Data V’s
The challenges outlined above greatly affect the V’s, the building blocks of big data, which are illustrated in Figure 1.1 [2].
1.1.1.1 Volume
Volume refers to the total amount of data. Today, Facebook generates 500 terabytes of new data every day. A single flight across the United States can produce 240 terabytes of flight data. In the near future, mobile phones and the data that they generate and ingest will result in thousands of new, continuously changing data streams that include information on the world, location, and other matters.
1.1.1.2 Variety
Data comes in various types, such as text, sensor data, audio, graphics, and video, and in various forms:
Structured data: data that can be stored in the rows and columns of a database table. Such data is related and can be mapped quickly into pre-designed fields, for example in a relational database.
Semi-structured data: partially organized data such as XML and CSV files.
Unstructured data: data that has no pre-defined structure, for example text, audio, and video files. It accounts for approximately 80% of data. It is growing fast, and its use could assist a company’s decision making.
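The three forms can be contrasted with a short sketch that handles one example of each. The table, XML snippet, and review text below are hypothetical and chosen only for illustration.

```python
import sqlite3
import xml.etree.ElementTree as ET
from collections import Counter

# Structured: rows and columns in a relational table with fixed fields.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER, customer TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [(1, "Asha", 250.0), (2, "Ravi", 90.0)],
)
total = conn.execute("SELECT SUM(amount) FROM orders").fetchone()[0]

# Semi-structured: tagged data where fields may be missing per record.
doc = ET.fromstring(
    "<orders><order id='1'><note>gift wrap</note></order><order id='2'/></orders>"
)
notes = [order.findtext("note") for order in doc.findall("order")]

# Unstructured: free text with no schema; structure must be derived.
review = "Great phone, great battery, slow delivery"
word_counts = Counter(review.lower().replace(",", "").split())

print(total, notes, word_counts.most_common(1))
```

The structured query needs no parsing logic, the XML needs handling for a missing field, and the free text yields nothing until some structure (here, word counts) is imposed on it.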
1.1.1.3 Velocity
Velocity measures how quickly data arrives: data streams in continuously, and usable data must be obtained in real time, for example from a webcam.
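One simple way to quantify velocity is to count events inside a sliding time window. The sketch below is an illustration under assumptions: the `RateMeter` class, the one-second window, and the timestamps are hypothetical, not taken from the chapter.

```python
from collections import deque


class RateMeter:
    """Tracks how many events arrived within the last `window` seconds."""

    def __init__(self, window: float):
        self.window = window
        self.timestamps = deque()

    def record(self, t: float) -> float:
        """Record an event at time t (seconds); return events per second."""
        self.timestamps.append(t)
        # Discard events that have fallen out of the window.
        while self.timestamps and self.timestamps[0] <= t - self.window:
            self.timestamps.popleft()
        return len(self.timestamps) / self.window


meter = RateMeter(window=1.0)
# Hypothetical arrival times of frames from a camera feed.
rates = [meter.record(t) for t in (0.0, 0.2, 0.4, 0.6, 1.3)]
print(rates)
```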
1.1.1.4 Veracity
Veracity is the consistency or trustworthiness of data.
It asks whether data obtained, for example, from Twitter posts, with their hashtags, abbreviations, styles, etc., is trustworthy and correct.
- Do you have faith in the data you gathered?
- Is the data reliable enough to gather insight from?
1.1.1.5 Validity
It is important to verify the authenticity of the data prior to processing large data sets.
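One reading of this check is that each record should be confirmed to match the expected types and ranges before it enters the processing pipeline. The sketch below assumes that reading; the field names `id` and `temp_c` and the allowed range are hypothetical.

```python
def validate(record: dict) -> list[str]:
    """Return a list of problems found in a record; empty means valid."""
    errors = []
    if not isinstance(record.get("id"), int):
        errors.append("id must be an integer")
    temp = record.get("temp_c")
    if not isinstance(temp, (int, float)) or not -50 <= temp <= 150:
        errors.append("temp_c must be a number between -50 and 150")
    return errors


# Hypothetical incoming records: one good, one mistyped, one out of range.
records = [
    {"id": 1, "temp_c": 21.5},
    {"id": "2", "temp_c": 20.0},
    {"id": 3, "temp_c": 999},
]

valid = [r for r in records if not validate(r)]
rejected = [r for r in records if validate(r)]
print(len(valid), len(rejected))
```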
1.1.1.6 Visualization of Big Data
A key task in big data processing is visualizing the findings: because the data is so large, user-friendly visualizations are difficult to produce.
1.1.1.7 Value
Value refers to the worth of the data being extracted. Large amounts of data with no value are of no use to a company. Data needs to be converted into something valuable to achieve business gains. By estimating the full costs of producing and processing big data, businesses can determine whether big data analytics really adds value relative to the ROI that the business insights are expected to produce.
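The comparison of costs against expected return can be made concrete with the usual ROI formula, (gain - cost) / cost. All figures in this sketch are hypothetical and serve only to show the arithmetic.

```python
def roi(gain: float, cost: float) -> float:
    """Return on investment as a fraction: (gain - cost) / cost."""
    if cost <= 0:
        raise ValueError("cost must be positive")
    return (gain - cost) / cost


# Hypothetical annual costs of a big data analytics programme.
costs = {"storage": 40_000, "compute": 35_000, "staff": 125_000}
total_cost = sum(costs.values())

# Hypothetical gain attributed to the resulting business insights.
expected_gain = 260_000

result = roi(expected_gain, total_cost)
print(f"ROI: {result:.0%}")
```

A result at or below zero would indicate that the analytics effort does not pay for itself under the stated assumptions.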
1.1.1.8 Big Data Hiding
Huge volumes of usable data remain hidden and are effectively lost, because new information is mostly unstructured and file-based.
1.1.2 Challenges with Big Data
Handling big data presents many challenges:
- Storing huge, exponentially growing data sets.
- Integrating disparate data sources.
- Generating insights in a timely manner.
- Data governance.
- Security issues.
1.1.3 Analytics of Big Data
Big data analytics examines large and diverse forms of data in order to uncover hidden trends, associations, and other insights.
1.1.3.1 Use Cases Used in Big Data Analytics
1.1.3.1.1 Amazon’s “360-Degree View”
To build its recommendation engine, Amazon uses extensive data collected from its customers. It bases recommendations on what you buy, your reviews and feedback, any personal details, your shipping address (to guess your income level based on where you live), and your browsing behavior. The company also makes recommendations based on what other customers with similar profiles bought. This also helps it retain existing customers [3].
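The idea of recommending what customers with similar profiles bought can be sketched with a small neighbourhood-based approach. This is a generic illustration using Jaccard similarity over purchase sets, not a description of Amazon's actual system; the customers and items are hypothetical.

```python
def jaccard(a: set, b: set) -> float:
    """Overlap between two purchase sets, from 0.0 to 1.0."""
    return len(a & b) / len(a | b) if a | b else 0.0


def recommend(target: str, purchases: dict, k: int = 2) -> list:
    """Suggest items bought by the k customers most similar to target."""
    mine = purchases[target]
    neighbours = sorted(
        (c for c in purchases if c != target),
        key=lambda c: jaccard(mine, purchases[c]),
        reverse=True,
    )[:k]
    scores = {}
    for c in neighbours:
        sim = jaccard(mine, purchases[c])
        for item in purchases[c] - mine:  # only items the target lacks
            scores[item] = scores.get(item, 0.0) + sim
    # Highest score first; ties broken alphabetically.
    return sorted(scores, key=lambda i: (-scores[i], i))


purchases = {
    "alice": {"kettle", "teapot", "mug"},
    "bob": {"kettle", "teapot", "tea"},
    "carol": {"kettle", "mug", "coaster"},
    "dave": {"laptop", "mouse"},
}
suggestions = recommend("alice", purchases)
print(suggestions)
```

A production recommender would also weigh reviews, browsing behavior, and the other signals listed above; this sketch uses purchase history alone.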
1.1.3.1.2 Amazon – Improving User Experience
Amazon analyzes every click visitors make on its web pages, which allows the company to understand users’ navigation behavior, the paths they take to make a purchase, and the paths that lead them to leave the site. All this knowledge helps the company improve the customer experience as well as its marketing and advertising.
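Path analysis of this kind can be sketched by grouping sessions by how they end. The page names and sessions below are hypothetical, and the sketch assumes each session is already an ordered list of page views ending in either `purchase` or `exit`.

```python
from collections import Counter

# Hypothetical clickstream: one ordered list of page views per session.
sessions = [
    ["home", "search", "product", "cart", "purchase"],
    ["home", "product", "exit"],
    ["home", "search", "product", "cart", "purchase"],
    ["home", "search", "exit"],
]

purchase_paths = Counter()  # paths that ended in a purchase
exit_pages = Counter()      # last page viewed before leaving

for pages in sessions:
    if pages[-1] == "purchase":
        purchase_paths[" > ".join(pages[:-1])] += 1
    else:
        exit_pages[pages[-2]] += 1

conversion_rate = sum(purchase_paths.values()) / len(sessions)
print(purchase_paths.most_common(1), dict(exit_pages), conversion_rate)
```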
1.1.4 Social Media Analysis and Response
Companies monitor what people are saying about their products and services in social media, and collect and analyze the posts on Facebook, Twitter, Instagram, etc. This further helps improve their products and enhance customer satisfaction as well as retain existing customers.
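A very rough form of such analysis is lexicon-based sentiment scoring, sketched below. The word lists and posts are hypothetical, and real systems typically rely on far richer language models than a fixed vocabulary.

```python
import re

# Hypothetical, deliberately tiny sentiment lexicons.
POSITIVE = {"love", "great", "fast", "excellent"}
NEGATIVE = {"broken", "slow", "terrible", "refund"}


def sentiment(post: str) -> int:
    """Positive word count minus negative word count."""
    words = re.findall(r"[a-z']+", post.lower())
    return sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)


posts = [
    "Love the new phone, great camera!",
    "Delivery was slow and the screen arrived broken.",
    "It is a phone.",
]
scores = [sentiment(p) for p in posts]
print(scores)
```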
1.1.4.1 IoT – Preventive Maintenance and Support
In factories and other installations that use costly equipment, sensors track the systems and transmit the related data over the internet. Big data applications process this data, often in real time, to identify whether a failure is about to occur. Preventing incidents or expensive shutdowns helps keep operations running.
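A simple way to flag a reading that may signal an approaching failure is to compare it with a rolling baseline of recent readings. The sketch below uses a rolling mean and standard deviation with a threshold of three standard deviations; the window size, threshold, and vibration values are assumptions made for illustration.

```python
from collections import deque
from statistics import mean, stdev


def detect_anomalies(readings, window=5, threshold=3.0):
    """Yield (index, value) for readings far from the recent rolling mean."""
    recent = deque(maxlen=window)
    for i, value in enumerate(readings):
        if len(recent) == window:
            mu, sigma = mean(recent), stdev(recent)
            if sigma > 0 and abs(value - mu) > threshold * sigma:
                yield i, value
                continue  # keep the outlier out of the baseline
        recent.append(value)


# Hypothetical vibration readings from a machine sensor.
vibration = [1.0, 1.1, 0.9, 1.0, 1.05, 1.02, 0.98, 4.5, 1.01, 1.0]
alerts = list(detect_anomalies(vibration))
print(alerts)
```

Because the function is a generator that looks only at past readings, it can process a live stream one value at a time.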
1.1.4.2 Healthcare
Big data in healthcare refers to the vast volumes of data obtained from a number of sources, including electronic devices such as fitness trackers, smartwatches, and sensors. Biometri...