Fundamentals of Big Data Network Analysis for Research and Industry
eBook - ePub

About this book

Presents the methodology of big data analysis using examples from research and industry

There are large amounts of data everywhere, and the ability to pick out crucial information is increasingly important. Contrary to popular belief, not all information is useful; big data network analysis assumes that data is not only large, but also meaningful, and this book focuses on the fundamental techniques required to extract essential information from vast datasets.

Featuring case studies drawn largely from the iron and steel industries, this book offers practical guidance which will enable readers to easily understand big data network analysis. Particular attention is paid to the methodology of network analysis, offering information on the method of data collection, on research design and analysis, and on the interpretation of results.  A variety of programs including UCINET, NetMiner, R, NodeXL, and Gephi for network analysis are covered in detail.

Fundamentals of Big Data Network Analysis for Research and Industry looks at big data from a fresh perspective, and provides a new approach to data analysis.

This book:

  • Explains the basic concepts in understanding big data and filtering meaningful data
  • Presents big data analysis within the networking perspective
  • Features methodology applicable to research and industry
  • Describes in detail the social relationship between big data and its implications
  • Provides insight into identifying patterns and relationships between seemingly unrelated big data

Fundamentals of Big Data Network Analysis for Research and Industry will prove a valuable resource for analysts, research engineers, industrial engineers, marketing professionals, and any individuals dealing with accumulated large data whose interest is to analyze and identify potential relationships among data sets.


Information

Authors: Hyunjoung Lee, Il Sohn
Subject: Computer Science & Data Warehousing
Publisher: Wiley
Year: 2015
Print ISBN: 9781119015581
eBook ISBN: 9781119015499

1 Why Big Data?

There is an enormous amount of data. The rapid accumulation of unfiltered data includes an increase in needless data, which must be removed to allow more efficient and unbiased analyses. This requires an ability to extract correct and useful information from the data. Thus, by correctly distinguishing the “gems” from the “pebbles,” Big Data analysis would assist an enterprise in obtaining a wider view when starting with a comparably narrow view. Because Big Data bases its significance in the expansion of thought, it is not about volume, velocity, or variety of data but rather about an alternative perspective and viewpoint with respect to the data. If you want to see a forest, you should not leave the forest; you should climb to the top of a mountain. Likewise, to obtain meaningful insight from Big Data, we should attempt to broaden our perspective to a bird’s eye view. The higher the altitude, the wider the vision that can be obtained. To see from the outside what was never observed from the inside, a different perspective is required, and that is where Big Data steps in.

1.1 Big Data

There has been a significant influx of interest in Big Data. Gartner, one of the top marketing analysis institutions in the world, selected Big Data as one of the top 10 strategic technologies [1] in both 2012 and 2013; in 2014, it selected Big Data and Actionable Analytics as the core strategy technology for smart governance [2]. Further, every January at Davos, global political and economic leaders gather at the World Economic Forum to discuss world issues. At the so-called Davos Forum 2012 [3], Big Data was again selected as one of the 10 technologies that have emerged as crucial for future developments. Although we are currently confronted by a financial crisis and partial recovery, along with issues related to climate change, energy, poverty, and security, the selection of Big Data seems to indicate that solutions to global issues require a broad range and amount of data. The technology to effectively manage and extract useful data is expected to provide much-needed insight into resolving some of these potentially catastrophic global issues.
Of course, when we first encounter Big Data, we focus most of our attention on the word “Big” and become engrossed with the image of a giant being. In reality, however, Big Data is more closely associated with enormity and numberlessness. The term Big Data was defined and widely disseminated by META Group (now Gartner) analyst Doug Laney in 2001 to address issues and opportunities in the three dimensions of rapid data expansion: data volume, velocity of input/output data, and variety of data types [4]. The widespread interest that the concept of Big Data attracted in the 2000s can be correlated with the global proliferation of the Internet and the need to analyze the enormous amount of data that it generates. The importance of analyzing massive data and converting them into useful information cannot be overstated. Next, a dimension dealing with “value” should be added to the existing three dimensions of data. If Big Data is large, expressed in real time similar to streaming, and includes unstructured data such as text, images, and videos, then combining these different types of data and creating value are important. Thus, the amount of reserves is important, whereas the size of the mine is unimportant. The researcher does not need data; he or she needs information. Big Data addresses the size of the data; fundamentally, however, it is more important to analyze and produce meaningful data.
To be considered Big Data, a data set must be large in volume. Although there is no specific size limit that defines Big Data, a data set typically ranges from a few terabytes for small data sets to as much as a few petabytes for large data sets. Table 1.1 indicates the current data sizes, with the prefixes of peta-, exa-, zetta-, yotta-, bronto-, and geop- used to express the amount of data [5]. If we were to express the amount of data in the books contained in the Library of Congress (in Washington, DC), the total would be about 15 TB. Through 2012, the human race had accumulated a wealth of data totaling 1.27 ZB. Thus, 1 GpB would suggest an amount of data that is difficult to fathom and would describe an enormous amount of data that are created and distributed.
Table 1.1 Data size.
Data | Size | Power of 2 | Equivalent to | Means
Bit (b) | 1 b | | 1 Binary digit (1 or 0) |
Byte (B) | 8 b | 2^3 | English letter (1 character) | Basic data units
Kilobyte (kB) | 1024 B | 2^10 | 1 page | A sheet of paper with 1200 characters
Megabyte (MB) | 1024 kB | 2^20 | 873 Pages; 4 Books | Single digital photo: 3 MB; single MP3 song: 4 MB
Gigabyte (GB) | 1024 MB | 2^30 | 894,784 Pages; 4,473 Books; 341 Digital pictures; 256 MP3 audio files | 1–2 hours movie: 1–2 GB
Terabyte (TB) | 1024 GB | 2^40 | 916,259,689 Pages; 4,581,298 Books; 349,525 Digital pictures; 262,144 MP3 audio files; 1,613 CDs; 233 DVDs; 40 Blu-ray discs | Entire volume of books in the Library of Congress: 15 TB
Petabyte (PB) | 1024 TB | 2^50 | 938,249,922,368 Pages; 4,691,249,611 Books; 357,913,941 Digital pictures; 268,435,456 MP3 audio files; 1,651,910 CDs; 239,400 DVDs; 41,943 Blu-ray discs | Amount of data Google processes every hour: 1 PB
Exabyte (EB) | 1024 PB | 2^60 | 960,767,920,505,705 Pages; 4,803,839,602,528 Books; 366,503,875,925 Digital pictures; 274,877,906,944 MP3 audio files; 1,691,556,350 CDs; 245,146,535 DVDs; 42,949,672 Blu-ray discs | Amount of data contained in 100 million copies of a weekly magazine in the US
Zettabyte (ZB) | 1024 EB | 2^70 | 983,826,350,597,842,752 Pages; 4,919,131,752,989,213 Books; 375,299,968,947,541 Digital pictures; 281,474,976,710,656 MP3 audio files; 1,732,153,702,834 CDs; 251,030,052,003 DVDs; 43,980,465,111 Blu-ray discs | The amount of data existing until 2012: 1.27 ZB
Yottabyte (YB) | 1024 ZB | 2^80 | 1,007,438,153,012,190,978,921 Pages; 5,037,190,915,060,954,894 Books; 384,307,168,202,282,325 Digital pictures; 288,230,376,151,711,744 MP3 audio files; 1,773,725,391,702,841 CDs; 257,054,773,251,740 DVDs; 45,035,996,273,704 Blu-ray discs | It would take 11 trillion years to download 1 YB from a high-power broadband
Brontobyte (BB) | 1024 YB | 2^90 | 1,031,616,699,404,483,562,415,936 Pages; 5,158,083,497,022,417,812,079 Books; 393,530,540,239,137,101,141 Digital pictures; 295,147,905,179,352,825,856 MP3 audio files; 1,816,294,801,103,709,697 CDs; 263,224,087,809,782,414 DVDs; 46,116,860,184,273,879 Blu-ray discs | Considering the size of the data that can be collected in real time sensor data of the IoT (internet of things)
Geopbyte (GpB) | 1024 BB | 2^100 | 1,056,375,500,190,191,167,913,919,337 Pages; 5,281,877,500,950,955,839,569,596 Books; 402,975,273,204,876,391,568,725 Digital pictures; 302,231,454,903,657,293,6... | Largest data amount that can be fathomed
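
The arithmetic behind Table 1.1 can be checked with a short sketch. It assumes, as the table states, that each unit is 1024 times the previous one, that a single digital photo is 3 MB, and that a single MP3 song is 4 MB. The function and constant names below are illustrative only and do not come from the book.

```python
# Unit names in the order used by Table 1.1; each is 1024 times the previous.
UNITS = ["B", "kB", "MB", "GB", "TB", "PB", "EB", "ZB", "YB", "BB", "GpB"]


def unit_in_bytes(unit: str) -> int:
    """Return the number of bytes in one `unit`, i.e. 1024 ** (position in UNITS)."""
    return 1024 ** UNITS.index(unit)


def items_per_unit(unit: str, item_size_mb: int) -> int:
    """Whole number of items of `item_size_mb` megabytes that fit in one `unit`."""
    return unit_in_bytes(unit) // (item_size_mb * unit_in_bytes("MB"))


PHOTO_MB = 3  # single digital photo, as given in Table 1.1
MP3_MB = 4    # single MP3 song, as given in Table 1.1

if __name__ == "__main__":
    for unit in ["GB", "TB", "PB"]:
        photos = items_per_unit(unit, PHOTO_MB)
        songs = items_per_unit(unit, MP3_MB)
        print(f"1 {unit} = {photos:,} digital pictures or {songs:,} MP3 audio files")
```

For the gigabyte, terabyte, and petabyte rows this reproduces the digital picture and MP3 counts listed in the table (for example, 341 pictures and 256 MP3 files per gigabyte), which suggests the remaining rows were derived with the same division.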

Table of contents

  1. Cover
  2. Title Page
  3. Table of Contents
  4. Preface
  5. About the Authors
  6. List of Figures
  7. List of Tables
  8. 1 Why Big Data?
  9. 2 Basic Programs for Analyzing Networks
  10. 3 Understanding Network Analysis
  11. 4 Research Methods Using SNA
  12. 5 Position and Structure
  13. 6 Connectivity and Role
  14. 7 Data Structure in NetMiner
  15. 8 Network Analysis Using NetMiner
  16. Appendix A: Visualization
  17. Appendix B: Case Study
  18. Index
  19. End User License Agreement