
Big Data Analytics

Big Data Analytics involves the process of examining large and complex data sets to uncover patterns, correlations, and insights. It utilizes advanced algorithms and tools to extract valuable information from massive volumes of data, enabling organizations to make data-driven decisions and predictions. This field encompasses various techniques such as data mining, machine learning, and predictive analytics to derive meaningful insights from big data.

Written by Perlego with AI-assistance

12 Key excerpts on "Big Data Analytics"

  • Book cover image for: Data Analytics and Big Data
    • Soraya Sedkaoui(Author)
    • 2018(Publication Date)
    • Wiley-ISTE
      (Publisher)
    Big data poses new challenges to statisticians both in terms of theory and application. Some of the challenges include size, scalability of statistical computation methods, non-random data, assessing uncertainty, sampling, modeling relationships, mixture data, real-time analysis of streaming data, statistical analysis with multiple kinds of data, data quality and complexity, protecting privacy and confidentiality, high dimensional data, etc.
    As the volume of data grows, so do the requirements for more advanced data warehouses and dispersed cloud-based databases [KIM 11]. In cases of data analytics, we analyzed requirements regarding (1) data: types, structure, format and sources and (2) data processing: operations, performance and conditions.
    The systematic application of data as a key driver for improving the robustness of decision making is widely considered a valuable, even necessary, practice for businesses. McAfee and Brynjolfsson [MCA 12] suggest that firms that consider themselves “data-driven” achieve consistently higher performance on several financial and operational measures, compared to those that do not.
    It is focused on the development of methodologies and techniques that “make sense” out of data. It would require tailored analytical methods and data quality control to superimpose on large data streams to make sense of the data and use them for statistical inference and decisions [SED 17]. More often than not, good theoretical insights and models of the subject discipline would be useful in identifying the “payoff relevance” of data for predictive purposes [HAR 14].
    The notion of making sense of big data has been expressed in many different ways, including data mining, knowledge extraction, information discovery, information harvesting, data archaeology and data pattern processing.
    In this book, Big Data Analytics, or advanced analytics, is considered as an umbrella concept for the analysis of data with the explicit aim of generating value, in the form of efficient information, that aids the decision-makers in their process. This idea can be formalized by Van Barneveld et al
  • Book cover image for: The Human Element of Big Data
    eBook - PDF

    The Human Element of Big Data

    Issues, Analytics, and Performance

    • Geetam S. Tomar, Narendra S. Chaudhari, Robin Singh Bhadoria, Ganesh Chandra Deka(Authors)
    • 2016(Publication Date)
    Referring to our ability to harness, store, and extract valuable meaning from vast amounts of data, the term Big Data holds the implicit promise of answering fundamental questions, which disciplines such as the sciences, technology, healthcare, and business have yet to answer. In fact, as the volume of data available to professionals and researchers steadily grows, opportunities for new discoveries as well as the potential to answer research challenges at stake are fast increasing (Manovich, 2011). It is expected that Big Data will transform various fields such as medicine, businesses, and scientific research overall (Chen and Zhang, 2014), and generate profound shifts in numerous disciplines (Kitchin, 2014). Yet, the adoption of advanced technologies in the field of Big Data remains a challenge for organizations, which still need to strategically engage in the change toward rapidly shifting environments (Bughin et al., 2010). What is more, organizations adopting Big Data at an early stage still face difficulties in understanding its guiding principles and the value it adds to the business (Wamba et al., 2015). Moreover, data sets are often of different types, which urges organizations to develop or apply “new forms of processing to enable enhanced decision making, insights discovery and process optimization” (Chen and Zhang, 2014, p. 315) as well as “a knowledge of analytics approaches” to different unstructured data types such as text, pictures, and video format, proving to be highly beneficial (Davenport et al., 2014) so that data scientists can quickly test and provide solutions to business challenges, emphasizing the application of Big Data Analytics in their business context over a specific analytical approach. Indeed, a data scientist student can be taught “how to write a Python program in half an hour” but can’t be taught “very easily what is the domain knowledge” (Dumbill et al., 2013).
  • Book cover image for: Handbook of Big Data
    • Peter Bühlmann, Petros Drineas, Michael Kane, Mark van der Laan, Peter Bühlmann, Petros Drineas, Michael Kane, Mark van der Laan(Authors)
    • 2016(Publication Date)
    All of these factors suggest a kind of ubiquity of data, but also contain a functionally vague understanding, which is situationally determined, and because of that it can be deployed in many contexts, has many advocates, and can be claimed by many as well. Partly because of this context-sensitive definition of the concept of big data, it is by no means a time phenomenon or novelty, but has a long genealogy that goes back to the earliest civilizations. Some aspects of this phenomenon will be discussed in the following sections. In addition, we will show in this chapter how big data embody a conception of data science at least at two levels. First of all, data science is the technical-scientific discipline, specialized in managing the multitude of data: collect, store, access, analyze, visualize, interpret, and protect. It is rooted in computer science and statistics; computer science is traditionally oriented toward data structures, algorithms, and scalability, and statistics is focused on analyzing and interpreting the data. In particular, we may identify here the triptych database technology/information retrieval, computational intelligence/machine learning, and finally inferential statistics. The first pillar concerns database/information retrieval technology. Both have been core disciplines of computer science for many decades. Emerging from this tradition in recent years, researchers at Google and Yahoo in particular have been working on techniques to cluster many computers in a data center, making data accessible and allowing for data-intensive calculations: think, for example, of BigTable, the Google File System, the MapReduce programming paradigm, and the open source variant Hadoop. The paper of Halevy precedes this development as well. The second pillar relates to intelligent algorithms from the field of computational intelligence (machine learning and data mining).
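The MapReduce paradigm mentioned in this excerpt can be illustrated with a minimal single-machine sketch. It is not how Hadoop or Google's systems are implemented; it only mimics the three conceptual phases (map, shuffle, reduce) on a word-count task. All function names below are illustrative assumptions, not part of any library.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple


def map_phase(document: str) -> Iterator[Tuple[str, int]]:
    """Map: emit a (word, 1) pair for every word in one document."""
    for word in document.lower().split():
        yield word, 1


def shuffle_phase(pairs: Iterable[Tuple[str, int]]) -> Dict[str, List[int]]:
    """Shuffle: group all emitted values by their key."""
    grouped: Dict[str, List[int]] = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return dict(grouped)


def reduce_phase(key: str, values: List[int]) -> Tuple[str, int]:
    """Reduce: collapse the values for one key into a single count."""
    return key, sum(values)


def word_count(documents: Iterable[str]) -> Dict[str, int]:
    """Run the three phases in sequence (hypothetical helper)."""
    mapped = (pair for doc in documents for pair in map_phase(doc))
    grouped = shuffle_phase(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    print(word_count(["big data", "big analytics"]))
```

In a real cluster the map and reduce calls would run in parallel on different machines and the shuffle would move data over the network; here everything runs in one process so the flow of keys and values is easy to follow.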
  • Book cover image for: Big Data and Social Science
    • Sudha Menon, University of Kerala, India(Authors)
    • 2019(Publication Date)
    Organizations offer attractive packages and incentives to qualified professionals. IT professionals such as data administrators and engineers can learn the tools of analytics for a better and brighter career. The nature of the job varies greatly across sectors of industry, and hence the needs of the industry vary too. As analytics is evolving in every field, the need for a workforce is very large. Job designations may include Big Data Engineer, Big Data Analyst, Solution Architect, Business Intelligence Consultant, etc. In addition, certain certifications can help people to display their skills and talent. Experience and knowledge of Big Data Analytics can give people an edge over others.
    1.6. FOUR DIFFERENT WAYS DATA ANALYSIS CAN IMPROVE SOCIETY
    Big data is a term that people hear everywhere, but what exactly does it mean? It means exactly what it sounds like. It is a term used to describe any enormous amount of information that can be turned into beneficial and useful information. Generally, it is characterized by three ‘V’s:
    • Volume: It refers to the amount of data.
    • Variety: It refers to the number of different types of data.
    • Velocity: It refers to the speed at which the data is processed.
    However, there is no precise quantity that actually defines the word ‘big’ in the term ‘big data.’ Usually, it is measured in terabytes, petabytes, or exabytes; one exabyte is one quintillion (that is 1,000,000,000,000,000,000!) bytes of information. It comes from a wide range of sources such as business records, social media, sensors in the IoT, etc. It is also described as a cluster of distinct database stores, mainly in online shopping and web browsing. As big data develops into the field of ML and artificial intelligence (AI), velocity is particularly relevant to the speed at which different algorithms can ingest, compare, and analyze the sources of data.
  • Book cover image for: Marketing Research Essentials
    • Carl McDaniel, Jr., Roger Gates(Authors)
    • 2016(Publication Date)
    • Wiley
      (Publisher)
    Big Data Analytics
    Big data is the accumulation and analysis of massive quantities of information. One research and consulting firm says that an organization with five total terabytes of active business data is one with big data. [7] A terabyte is a trillion bytes, so a firm would be considered as working with big data if it had active business data of 5 trillion bytes or more. Big data offers a firm:
    ■ Deeper insights: Rather than looking at market segments, classifications, groups, or other summary-level information, big data researchers have insights into all the individuals, all the products, all the parts, all the events, and all the transactions.
    ■ Broader insights: Big Data Analytics takes into account all the data, structured and unstructured, to understand the complex, evolving, and interrelated conditions to produce more accurate insights. [8]
    (Margin definition: behavioral targeting is the use of online and offline data to understand a consumer’s habits, demographics, and social networks in order to increase the effectiveness of online advertising.)
    An example of deeper and broader insights would be where a cable TV supplier showed that 95 percent of all appointments made were met on time. This sounds impressive until you know that there were 3,000 appointments in a day, so 150 customers waited at home in vain each day. If you can then tie the missed appointments to call data, survey data, and repurchase data along with tweets and Facebook comments, a manager could begin calculating how much revenue and word-of-mouth damage occurred, in addition to the extra cost of rescheduling and expediting visits. [9]
    Defining Relationships
    For scientists and marketing researchers, Big Data Analytics represents a paradigm shift. The traditional scientific method involves getting information about a problem, creating a hypothesis, and then testing data to accept or reject the null hypothesis.
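The cable TV example in this excerpt rests on a simple calculation: a 95 percent on-time rate across 3,000 daily appointments still leaves 150 missed visits per day. The sketch below reproduces that arithmetic. The function names are illustrative, and the cost-per-missed-visit figure is a purely hypothetical parameter that does not come from the source.

```python
def missed_appointments(total_per_day: int, on_time_rate: float) -> int:
    """Number of appointments per day that were not met on time."""
    return round(total_per_day * (1 - on_time_rate))


def daily_cost_of_misses(missed: int, cost_per_missed_visit: float) -> float:
    """Rough daily cost, given an ASSUMED cost per missed visit."""
    return missed * cost_per_missed_visit


if __name__ == "__main__":
    missed = missed_appointments(total_per_day=3000, on_time_rate=0.95)
    print(f"Missed appointments per day: {missed}")
    # 40.0 is a made-up placeholder, not a figure from the excerpt.
    print(f"Hypothetical daily cost: {daily_cost_of_misses(missed, 40.0):.2f}")
```

The point of the excerpt is that a summary statistic (95 percent) hides the individual-level picture (150 households a day), which is what record-level analysis makes visible.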
  • Book cover image for: Signal Processing and Networking for Big Data Applications
    Part I Overview of Big Data Applications
    1 Introduction
    1.1 Background
    Today, scientists, engineers, educators, citizens, and decision-makers have unprecedented amounts and types of data available to them. Data come from many disparate sources, including scientific instruments, medical devices, telescopes, microscopes, satellites; digital media including text, video, audio, e-mail, weblogs, twitter feeds, image collections, click streams, and financial transactions; dynamic sensor, social, and other types of networks; scientific simulations, models, and surveys; or computational analysis of observational data. Data can be temporal, spatial, or dynamic; structured or unstructured. Information and knowledge derived from data can differ in representation, complexity, granularity, context, provenance, reliability, trustworthiness, and scope. Data can also differ in the rate at which they are generated and accessed. The phrase “big data” refers to the kinds of data that challenge existing analytical methods due to size, complexity, or rate of availability.
    The challenges in managing and analyzing “big data” can require fundamentally new techniques and technologies in order to handle the size, complexity, or rate of availability of these data. At the same time, the advent of big data offers unprecedented opportunities for data-driven discovery and decision-making in virtually every area of human endeavor. A key example of this is the scientific discovery process, which is a cycle involving data analysis, hypothesis generation, the design and execution of new experiments, hypothesis testing, and theory refinement. Realizing the transformative potential of big data requires addressing many challenges in the management of data and knowledge, computational methods for data analysis, and automating many aspects of data-enabled discovery processes.
  • Book cover image for: Exploring Data and Business Management: How Information Helps in Supervision
    It is important to keep a business problem in mind, not data or technology, while building a business case for a Big Data Analytics project. Collecting data or purchasing technology without a clear business target is a losing strategy. A business case is meant to solve real business problems that an organization faces when it comes to analytics.
    3.3. HISTORY OF BIG DATA
    The concept of big data came into the picture before the advances in database technologies, out of the need for solutions to handle the huge deluge of datasets and the lack of sufficient storage capacity. Figure 3.4 shows the evolution of the concept of big data through the past decades, where each decade is described in terms of computer disk space, from the megabyte (MB) in the 1970s to the exabyte (EB), which was introduced in 2011.
    Figure 3.4. Brief history of big data.
    Although there have been a number of attempts to provide a consensus definition of big data, there is still considerable uncertainty about the term among researchers in different disciplines. The term has been adopted as a new and sophisticated apparatus for research. Although it has been used extensively in recent years, the term in its true meaning has existed for much longer, and some argue that it was coined back in the 1990s. However, the term has been discussed since the year 2000, and it was also associated with the statistics domain. It was therefore used to describe “the explosion in the quantity (and sometimes, quality) of available and potentially relevant data, largely the result of recent and unprecedented advancements in data recording and storage technology.” In subsequent years, usage of the term has notably been extended to a number of domains.
  • Book cover image for: Big Data, Big Analytics
    eBook - PDF

    Big Data, Big Analytics

    Emerging Business Intelligence and Analytic Trends for Today's Businesses

    • Michael Minelli, Michele Chambers, Ambiga Dhiraj(Authors)
    • 2012(Publication Date)
    • Wiley
      (Publisher)
    Rather than looking at segments, classifications, regions, groups, or other summary levels you’ll have insights into all the individuals, all the products, all the parts, all the events, all the transactions, etc.
    ■ Broader insights. The world is complex. Operating a business in a global, connected economy is very complex given constantly evolving and changing conditions. As humans, we simplify conditions so we can process events and understand what is happening. But our best-laid plans often go astray because of the estimating or approximating. Big Data Analytics takes into account all the data, including new data sources, to understand the complex, evolving, and interrelated conditions to produce more accurate insights.
    ■ Frictionless actions. Increased reliability and accuracy that will allow the deeper and broader insights to be automated into systematic actions.
    [Figure 1.2: Analytics Spectrum. The figure spans Business Intelligence to Advanced Analytics across the categories SQL Analytics, Descriptive Analytics, Data Mining, Predictive Analytics, Simulation, and Optimization. Techniques listed include count, mean, OLAP, univariate distribution, central tendency, dispersion, association rules, clustering, feature extraction, classification, regression, forecasting, spatial, machine learning, text analytics, Monte Carlo, agent-based modeling, discrete event modeling, linear optimization, and non-linear optimization.]
    GigaOm, a leading technology industry research firm, uses a simple framework (see Table 1.1) to describe potential Big Data Business Models for enterprises seeking to exploit Big Data Analytics. The competitive strategies outlined in the GigaOm framework are enabled today via packaged or custom analytic applications (see Table 1.2) depending on the maturity of the competitive strategy in the marketplace.
  • Book cover image for: Big Data Computing
    Interestingly, the real test is to see whether the recommendations produced by the new solution are better than those of the legacy system.
    Conclusion
    Integrating BI and Big Data Analytics is no easy task. The goal for any data or analytical system is to make the data useful and available to as many users as possible. To do so, we need powerful platforms that provide highly scalable, low-cost data storage tightly integrated with scalable processing, so that businesses will be able to tackle increasingly complex problems by unlocking the power of their data. The capability to understand and act upon their data will open the door to a richer and more robust Big Data ecosystem. This requires time, patience, and innovation.
  • Book cover image for: Communication, Management and Information Technology
    eBook - PDF

    Communication, Management and Information Technology

    International Conference on Communciation, Management and Information Technology (ICCMIT 2016, Cosenza, Italy, 26-29 April 2016)

    • Marcelo Sampaio de Alencar(Author)
    • 2016(Publication Date)
    • CRC Press
      (Publisher)
    Communication, Management and Information Technology – Sampaio de Alencar (Ed.) © 2017 Taylor & Francis Group, London, ISBN 978-1-138-02972-9
    Big data mining: A classification perspective
    Nojod M. Alotaibi & Manal A. Abdullah
    Faculty of Computing and Information Technology, King Abdulaziz University (KAU), Saudi Arabia
    ABSTRACT: An unprecedented amount of data is being generated and recorded every day. Big data is the term used to describe such data which is difficult to process, manage and analyze patterns using traditional databases or data mining algorithms. Mining big data is currently one of the most critical emerging research areas. Big data mining refers to the process of extracting useful knowledge from large datasets or streams of data. Due to the enormity, high dimensionality, heterogeneity, and distributed nature of the data, traditional techniques of data mining may be unsuitable to work with big data. As a result, there is a critical need to develop effective and efficient big data mining techniques. This paper explores the current use of supervised classification algorithms for big data. It also compares the protocols based on their advantages and limitations.
    Keywords: Big data, knowledge discovery, Data mining, Big Data mining, Supervised classification
    [5] defines big data technologies as “a new generation of technologies and architectures designed to extract value economically from very large volumes of a wide variety of data by enabling high velocity capture, discovery and analysis”. Mining and discovering meaningful knowledge from big data for decision-making, prediction, and for other purposes is extremely challenging due to its characteristics. Knowledge Discovery (KD) is the process of discovering useful knowledge from a collection of data. Major KD application areas include marketing, manufacturing, fraud detection, telecommunication, education, medical, Internet agent and many other areas [6, 7].
  • Book cover image for: Cost Accounting
    eBook - PDF

    Cost Accounting

    With Integrated Data Analytics

    • Karen Congo Farmer, Amy Fredin(Authors)
    • 2022(Publication Date)
    • Wiley
      (Publisher)
    Judging by the hours we spend on Netflix, their recommendations are working.
    • And good news from the Internal Revenue Service (IRS). The IRS used data analytics to prevent the issuance of $6.51 billion in invalid tax refunds from 2015 to 2017.
    Such interesting examples show promising insights for organizations based on the everyday data they collect. So how do we distinguish big data from data analytics? What are the implications for the cost accounting profession? What are the possible data sets and techniques to be used? This chapter provides an overview of the subject and answers those important questions.
    Overview of Data
    Did you know that consumers’ internet data usage plans are now offered in terabytes? One terabyte is 1,024 gigabytes, and one gigabyte is 1,024 megabytes. If consumers are using that much data, mostly for streaming videos, then imagine what commercial customers use! And think of all the rich insights available in all that commercial data.
    Advances in information technology have made it easier for organizations to collect data from operations and for operations. The sheer amount of data collected by companies is at an all-time high. For example, organizations may collect text-based customer feedback (suggestions or complaints) and social media posts with both audio and video. These types of data are generated at an amazing rate, including by the billions of emails sent and received daily.
    Big data is the structured and unstructured data generated from a variety of sources in volumes too large for traditional technologies to capture, manage, and process in a timely manner. In its raw state, this growing mountain of data just fills up storage systems. It’s important to differentiate between the two types of big data.
    • Structured data uses a known, predefined format, often organized in rows and columns with fixed fields, ready to be added to a database, and is easy to manipulate and use.
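The unit arithmetic in this excerpt (one terabyte is 1,024 gigabytes, one gigabyte is 1,024 megabytes) can be checked with a short sketch. It assumes the binary convention used in the text, with factors of 1,024; the decimal convention, with factors of 1,000, is included for comparison because both are in common use. The constant and function names are illustrative.

```python
BINARY_FACTOR = 1024   # convention used in the excerpt above
DECIMAL_FACTOR = 1000  # SI convention, often used by storage vendors


def terabytes_to_megabytes(terabytes: int, factor: int = BINARY_FACTOR) -> int:
    """TB -> GB -> MB: two steps of the chosen factor."""
    return terabytes * factor * factor


def terabytes_to_bytes(terabytes: int, factor: int = BINARY_FACTOR) -> int:
    """TB -> GB -> MB -> KB -> bytes: four steps of the chosen factor."""
    return terabytes * factor ** 4


if __name__ == "__main__":
    print(terabytes_to_megabytes(1))              # binary megabytes in 1 TB
    print(terabytes_to_bytes(1))                  # binary bytes in 1 TB
    print(terabytes_to_bytes(1, DECIMAL_FACTOR))  # decimal bytes in 1 TB
```

Under the binary convention one terabyte is about 1.1 trillion bytes, while under the decimal convention it is exactly one trillion, so the two differ by roughly 10 percent at this scale.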
  • Book cover image for: Big Data Computing
    eBook - PDF

    Big Data Computing

    A Guide for Business and Technology Managers

    Business analytics is a process of transforming data into actions through analysis and insights in the context of organizational decision-making and problem-solving. Business analytics has traditionally been supported by various tools such as Microsoft Excel and various Excel add-ins, commercial statistical software packages such as SAS or Minitab, and more-complex business intelligence suites that integrate data with analytical software. Tools and techniques of business analytics are used across many areas in a wide variety of organizations to improve the management of customer relationships, financial and marketing activities, human capital, supply chains, and many other areas. Leading banks use analytics to predict and prevent credit fraud. Manufacturers use analytics for production planning, purchasing, and inventory management. Retailers use analytics to recommend products to customers and optimize marketing promotions. Pharmaceutical firms use it to get life-saving drugs to market more quickly. The leisure and vacation industries use analytics to analyze historical sales data, understand customer behavior, improve website design, and optimize schedules and bookings. Airlines and hotels use analytics to dynamically set prices over time to maximize revenue. Even sports teams are using business analytics to determine both game strategy and optimal ticket prices. Top-performing organizations (those that outperform their competitors) are three times more likely to be sophisticated in their use of analytics than lower performers and are more likely to state that their use of analytics differentiates them from competitors. One of the emerging applications of analytics is helping businesses learn from social media and exploit social media data for strategic advantage (see Chapter 15). Using analytics, firms can integrate social media data with traditional data sources such as customer surveys,
Index pages curate the most relevant extracts from our library of academic textbooks. They’ve been created using an in-house natural language model (NLM), each adding context and meaning to key research topics.