
Discovering Knowledge in Data
An Introduction to Data Mining
About this book
The field of data mining lies at the confluence of predictive analytics, statistical analysis, and business intelligence. Due to the ever-increasing complexity and size of data sets and the wide range of applications in computer science, business, and health care, the process of discovering knowledge in data is more relevant than ever before.
This book provides the tools needed to thrive in today's big data world. The author demonstrates how to leverage a company's existing databases to increase profits and market share, and carefully explains the most current data science methods and techniques. The reader will "learn data mining by doing data mining". By adding chapters on data modelling preparation, imputation of missing data, and multivariate statistical analysis, Discovering Knowledge in Data, Second Edition remains the eminent reference on data mining.
- The second edition of a highly praised, successful reference on data mining, with thorough coverage of big data applications, predictive analytics, and statistical analysis.
- Includes new chapters on Multivariate Statistics, Preparing to Model the Data, and Imputation of Missing Data, and an Appendix on Data Summarization and Visualization
- Offers extensive coverage of the R statistical programming language
- Contains 280 end-of-chapter exercises
- Includes a companion website for university instructors who adopt the book
Chapter 1
An Introduction to Data Mining
- 1.1 What is Data Mining?
- 1.2 Wanted: Data Miners
- 1.3 The Need for Human Direction of Data Mining
- 1.4 The Cross-Industry Standard Practice for Data Mining
- 1.5 Fallacies of Data Mining
- 1.6 What Tasks Can Data Mining Accomplish?
- References
- Exercises
1.1 What is Data Mining?
1.2 Wanted: Data Miners
- The explosive growth in data collection, as exemplified by the supermarket scanners above,
- The storing of the data in data warehouses, so that the entire enterprise has access to a reliable, current database,
- The availability of increased access to data from web navigation and intranets,
- The competitive pressure to increase market share in a globalized economy,
- The development of “off-the-shelf” commercial data mining software suites,
- The tremendous growth in computing power and storage capacity.
There will be a shortage of talent necessary for organizations to take advantage of big data. A significant constraint on realizing value from big data will be a shortage of talent, particularly of people with deep expertise in statistics and machine learning, and the managers and analysts who know how to operate companies by using insights from big data . . . . We project that demand for deep analytical positions in a big data world could exceed the supply being produced on current trends by 140,000 to 190,000 positions. . . . In addition, we project a need for 1...
Table of contents
- Cover
- Series
- Title Page
- Copyright
- Preface
- Chapter 1: An Introduction to Data Mining
- Chapter 2: Data Preprocessing
- Chapter 3: Exploratory Data Analysis
- Chapter 4: Univariate Statistical Analysis
- Chapter 5: Multivariate Statistics
- Chapter 6: Preparing to Model the Data
- Chapter 7: k-Nearest Neighbor Algorithm
- Chapter 8: Decision Trees
- Chapter 9: Neural Networks
- Chapter 10: Hierarchical and k-Means Clustering
- Chapter 11: Kohonen Networks
- Chapter 12: Association Rules
- Chapter 13: Imputation of Missing Data
- Chapter 14: Model Evaluation Techniques
- Appendix: Data Summarization and Visualization
- Index
- End User License Agreement