Hands-On Data Science with SQL Server 2017
Perform end-to-end data analysis to gain efficient data insight
Marek Chmel, Vladimír Mužný
- 506 pages
- English
About This Book
Find, explore, and extract big data to transform into actionable insights
Key Features
- Perform end-to-end data analysis, from exploration to visualization
- Work through real-world examples, tasks, and interview queries to become a proficient data scientist
- Understand how SQL is used for big data processing using HiveQL and SparkSQL
Book Description
SQL Server is a relational database management system that enables you to cover end-to-end data science processes using various inbuilt services and features.
Hands-On Data Science with SQL Server 2017 starts with an overview of data science with SQL to understand the core tasks in data science. You will learn intermediate-to-advanced level concepts to perform analytical tasks on data using SQL Server. The book has a unique approach, covering best practices, tasks, and challenges to test your abilities at the end of each chapter. You will explore the ins and outs of performing various key tasks such as data collection, cleaning, manipulation, aggregations, and filtering techniques. As you make your way through the chapters, you will turn raw data into actionable insights by wrangling and extracting data from databases using T-SQL. You will get to grips with preparing and presenting data in a meaningful way, using Power BI to reveal hidden patterns. In the concluding chapters, you will work with SQL Server integration services to transform data into a useful format and delve into advanced examples covering machine learning concepts such as predictive analytics using real-world examples.
By the end of this book, you will be in a position to handle the growing amounts of data and perform everyday activities that a data science professional performs.
What you will learn
- Understand what data science is and how SQL Server is used for big data processing
- Analyze incoming data with SQL queries and visualizations
- Create, train, and evaluate predictive models
- Make predictions using trained models and establish regular retraining courses
- Incorporate data source querying into SQL Server
- Enhance built-in T-SQL capabilities using SQLCLR
- Visualize data with Reporting Services, Power View, and Power BI
- Transform data with R, Python, and Azure
Who this book is for
Hands-On Data Science with SQL Server 2017 is intended for data scientists, data analysts, and big data professionals who want to sharpen their skills in SQL and its applications. This book will also be helpful for beginners who want to build a career as data science professionals using the power of SQL Server 2017. Basic familiarity with the SQL language will help in understanding the concepts covered in this book.
Data Exploration and Statistics with T-SQL
- T-SQL aggregate queries: This section explains what an aggregate query is and which statistical measures it can show.
- Ranking, framing, and windowing with T-SQL: Using framing and windowing helps to obtain results enriched by sorting or ranking. In this section, we will play with framing and windowing from the perspective of data exploration.
- Running aggregates with T-SQL: This section will join knowledge from the aggregation queries section and the framing and windowing section to help us to create running aggregates or comparisons of values between rows.
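As a preview of the last two topics, the following is a minimal sketch of ranking, a running aggregate over a frame, and a comparison between rows. The table variable @Sales and its values are invented for illustration and are not part of the book's sample database.

```sql
-- Hypothetical sample data; names and values are invented for this sketch.
DECLARE @Sales TABLE (SaleDate date, Amount money);
INSERT INTO @Sales (SaleDate, Amount)
VALUES ('2017-01-01', 100), ('2017-01-02', 250), ('2017-01-03', 50);

SELECT SaleDate,
       Amount,
       -- Ranking: position of each row when sorted by Amount, highest first
       RANK() OVER (ORDER BY Amount DESC) AS AmountRank,
       -- Running aggregate: sum over a frame from the first row to the current row
       SUM(Amount) OVER (ORDER BY SaleDate
                         ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS RunningTotal,
       -- Comparison between rows: difference from the previous row (NULL for the first row)
       Amount - LAG(Amount) OVER (ORDER BY SaleDate) AS ChangeFromPreviousRow
FROM @Sales
ORDER BY SaleDate;
```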
Technical requirements
T-SQL aggregate queries
Common properties of aggregate functions
- Every aggregate function takes one parameter, typically a numeric column, and computes a single value over that column. There are two exceptions:
  - COUNT(*): This function counts the number of records.
  - STRING_AGG: This function concatenates strings from a column into one string, so it accepts varchar or nvarchar columns as its first parameter, with the separator passed as a second parameter.
- Almost every aggregate function ignores records in which the aggregated column contains NULL. COUNT(*) is the exception, because it works with whole records rather than with a particular column.
- Every aggregate function can be used in SELECT and HAVING clauses. In a SELECT clause, it provides scalar results. The HAVING clause is a conditional clause similar to the WHERE clause, but it filters on the results of aggregations, which are not yet known when the WHERE clause is evaluated. We will cover more about the HAVING clause in the dedicated section later in this chapter.
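The properties listed above can be seen in a small, self-contained sketch. The table variable @Orders and its values are invented for illustration and are not part of the book's sample database. STRING_AGG requires SQL Server 2017 or later, and the order of the concatenated values is not guaranteed unless WITHIN GROUP is specified.

```sql
-- Hypothetical sample data; not taken from the book's database.
DECLARE @Orders TABLE (CustomerID int, Discount decimal(4,2) NULL, Note varchar(20) NULL);
INSERT INTO @Orders (CustomerID, Discount, Note)
VALUES (1, 0.10, 'rush'),
       (1, NULL, NULL),
       (2, 0.20, 'gift'),
       (2, 0.30, NULL),
       (3, NULL, 'fragile');

-- COUNT(*) counts whole records; the other aggregates skip NULLs in their column.
SELECT COUNT(*)               AS AllRecords,        -- 5 records
       COUNT(Discount)        AS NonNullDiscounts,  -- 3 non-NULL values
       AVG(Discount)          AS AvgDiscount,       -- averaged over the 3 non-NULL values only
       STRING_AGG(Note, ', ') AS Notes              -- NULL notes are skipped
FROM @Orders;

-- HAVING filters on an aggregate result, which a WHERE clause cannot reference.
SELECT CustomerID, COUNT(Discount) AS NonNullDiscounts
FROM @Orders
GROUP BY CustomerID
HAVING COUNT(Discount) >= 2;   -- only customer 2 has two non-NULL discounts
```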
Aggregate functions
COUNT, COUNT(*), and COUNT_BIG
-- Number of all records in the table
SELECT COUNT(*) FROM Sales.SalesOrderDetail;
-- Number of records for a single product
SELECT COUNT(*) FROM Sales.SalesOrderDetail WHERE ProductID = 710;
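The variants named in the heading differ in what they count and in the type they return. The sketch below is an illustration only: the table variable @Detail and its values are invented stand-ins for an order-detail table, and COUNT(DISTINCT ...) is included for comparison although the heading does not mention it.

```sql
-- Hypothetical sample data standing in for an order-detail table.
DECLARE @Detail TABLE (ProductID int NOT NULL, CarrierTrackingNumber varchar(25) NULL);
INSERT INTO @Detail (ProductID, CarrierTrackingNumber)
VALUES (710, 'A1'), (710, NULL), (711, 'A1'), (712, 'B2');

SELECT COUNT(*)                              AS AllRecords,        -- 4: every record, NULLs included
       COUNT(CarrierTrackingNumber)          AS NonNullValues,     -- 3: NULL is ignored
       COUNT(DISTINCT CarrierTrackingNumber) AS DistinctValues,    -- 2: 'A1' and 'B2'
       COUNT_BIG(*)                          AS AllRecordsBigint   -- 4: same count, returned as bigint
FROM @Detail;
```

COUNT returns an int, so COUNT_BIG is the safer choice when a count could exceed the int range.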