Data Science
eBook - ePub

Data Science

The Executive Summary - A Technical Book for Non-Technical Professionals


About this book

Tap into the power of data science with this comprehensive resource for non-technical professionals

Data Science: The Executive Summary – A Technical Book for Non-Technical Professionals is a comprehensive resource for people in non-engineer roles who want to fully understand data science and analytics concepts. Accomplished data scientist and author Field Cady describes both the "business side" of data science, including what problems it solves and how it fits into an organization, and the technical side, including analytical techniques and key technologies.

Data Science: The Executive Summary covers topics like:

  • Assessing whether your organization needs data scientists, and what to look for when hiring them
  • When Big Data is the best approach to use for a project, and when it actually ties analysts' hands
  • Cutting edge Artificial Intelligence, as well as classical approaches that work better for many problems
  • How many techniques rely on dubious mathematical idealizations, and when you can work around them

Perfect for executives who make critical decisions based on data science and analytics, as well as managers who hire and assess the work of data scientists, Data Science: The Executive Summary also belongs on the bookshelves of salespeople and marketers who need to explain what a data analytics product does. Finally, data scientists themselves will improve their technical work with insights into the goals and constraints of the business situation.

1 Introduction

1.1 Why Managers Need to Know About Data Science

There are many “data science for managers” books on the market today. They are filled with business success stories, pretty visualizations, and pointers about what some of the hot trends are. That material will get you rightfully excited about data science's potential, and maybe even get you started off on the right foot with some promising problems, but it isn't enough to see projects over the finish line or bring the full benefits of data to your organization. Depending on your role you may also need to decide how much to trust a piece of analytical work, make final calls about what tools your company will invest in, and hire/manage a team of data scientists. These tasks don't require writing your own code or performing mathematical derivations, but they do require a solid grounding in data science concepts and the ability to think critically about them.
In the past, mathematical disciplines like statistics and accounting solved precisely defined problems with a clear business meaning. You don't need a STEM degree to understand the idea of testing whether a drug works or balancing a checkbook! But as businesses tackle more open‐ended questions, and do so with datasets that are increasingly complex, the analytics problems become more ambiguous. A data science problem almost never lines up perfectly with something in a textbook; there is always a business consideration or data issue that requires some improvisation. Flexibility like this can become recklessness without fluency in the underlying technical concepts. Combine this with the fact that data science is fast becoming ubiquitous in the business world, and managers and executives face a higher technical bar than they ever did in the past.
Business education has not caught up to this new reality. Most careers follow a “business track” that teaches few technical concepts, or a “technical track” that focuses on hands‐on skills that are useless for businesspeople. This book charts a middle path, teaching non‐technical professionals the core concepts of modern data science. I won't teach you the brass tacks of how to do the work yourself (that truly is for specialists), but I will give you the conceptual background you need to recognize good analytics, frame business needs as solvable problems, manage data science projects, and understand the ways data science is likely to transform your industry.
In my work as a consultant I have seen PMs struggle to mediate technical disagreements, ultimately making decisions based on people's personalities rather than the merits of their ideas. I've seen large‐scale proof‐of‐concept projects that proved the wrong concept, because organizers set out inappropriate success metrics. And I've seen executives scratching their heads after contractors deliver a result, unable to see for themselves whether they got what they paid for.
Conversely, I have seen managers who can talk shop with their analysts, asking solid questions that move the needle on the business. I've seen executives who understand what is and isn't feasible, instinctively moving resources toward projects that are likely to succeed. And I've seen non‐technical employees who can identify key oversights on the part of analysts and communicate results throughout an organization.
Most books on data science come in one of two types. Some are written for aspiring data scientists, with a focus on example code and the gory details of how to tune different models. Others assume that their readers are unable or unwilling to think critically, and dumb the technical material down to the point of uselessness. This book rejects both those approaches. I am convinced that it is not just possible for people throughout the modern business workforce to learn the language of data: it is essential.

1.2 The New Age of Data Literacy

Analytics used to play a minor role in business. For the most part it was used to solve a few well‐known problems that were industry‐specific. When more general analytics was needed, it was for well‐defined problems, like conducting an experiment to see what product customers preferred.
Two trends have changed that situation. The first is the intrusion of computers into every aspect of life and business. Every phone app, every new feature in a computer program, every device that monitors a factory is a place where computers are making decisions based on algorithmic rules, rather than human judgment. Determining those rules, measuring their effectiveness, and monitoring them over time are inherently analytical. The second trend is the profusion of data and machines that can process it. In the past data was rare, gathered with a specific purpose in mind, and carefully structured so as to support the intended analysis. These days every device is generating a constant stream of data, which is passively gathered and stored in whatever format is most convenient. Eventually it gets used by high‐powered computer clusters to answer a staggering range of questions, many of which it wasn't really designed for.
I don't mean to make it sound like computers are able to take care of everything themselves – quite the opposite. They have no real‐world insights, no creativity, and no common sense. It is the job of humans to make sure that computers' brute computational muscle is channeled toward the right questions, and to know their limitations when interpreting the answers. Humans are not being replaced – they are taking on the job of shepherding machines.
I am constantly concerned when I see smart, ethical business people failing to keep up with these changes. Good managers are at risk of botching major decisions for dumb reasons, or even falling prey to unscrupulous snake oil vendors. Some of these people are my friends and colleagues. It's not a question of intelligence or earnestness – many simply don't have the required conceptual background, which is understandable. I wrote this book for my friends and people like them, so that they can be empowered by the age of data rather than left behind.

1.3 Data‐Driven Development

So where is all of this leading? Cutting out hyperbole and speculation, what does it look like for an organization to make full use of modern data technologies and what are the benefits? The goal that we are pushing toward is what I call “data‐driven development” (DDD). In an organization that uses DDD, all stages in a business process have their data gathered, modeled, and deployed to enable better decision making. Overall business goals and workflows are crafted by human experts, but after that every part of the system can be monitored and optimized, hypotheses can be tested rigorously and retroactively, and large‐scale trends can be identified and capitalized on. Data greases the wheels of all parts of the operation and provides a constant pulse on what's happening on the ground.
I break the benefits of DDD into three major categories:
  1. Human decisions are better‐informed: Business is filled with decisions about what to prioritize, how to allocate resources, and which direction to take a project. Often the people making these calls have no true confidence in one direction or the other, and the numbers that could help them out are either unavailable or dubious. In DDD the data they need will be available at a moment's notice. More than that though, there will be an understanding of how to access it, pre‐existing models that give interpretations and predictions, and a tribal understanding of how reliable these analyses are.
  2. Some decisions are made autonomously: If there is a single class of “killer apps” for data science, it is machine learning algorithms that can make decisions without human intervention. In a DDD system large portions of a workflow can be automated, with assurances about performance based on historical data.
  3. Everything can be measured and monitored: Understanding a large, complex, real‐time operation requires the ability to monitor all aspects of it over time. This ranges from concrete stats – like visitors to a website or yield at a stage of a manufacturing pipeline – to fuzzier concepts like user satisfaction. This makes it possible to constantly optimize a system, diagnose problems quickly, and react more quickly to a changing environment.
It might seem at first blush like these benefit categories apply to unrelated aspects of a business. But in fact they have much in common: they rely on the same datasets and data processing systems, they leverage the same models to make predictions, and they inform each other. If an autonomous decision algorithm suddenly starts performing poorly, it will prompt an investigation and possibly lead to high‐level business choices. Monitoring systems use autonomous decision algorithms to prioritize incidents for human investigation. And any major business decision will be accompanied by a plan to keep track of how well it turns out, so that adjustments can be made as needed.
Data science today is treated as a collection of stand‐alone projects, each with its own models, team, and datasets. But in DDD all of these projects are really just applications of a single unified system. DDD goes far beyond just giving people access to a common database; it keeps a pulse on all parts of a business operation, it automates large parts of it, and where automation isn't possible it puts all the best analyses at people's fingertips.
It's a waste of effort to sit around and guess things that can be measured, or to cross our fingers about hypotheses that we can go out and test. Ideally we should spend our time coming up with creative new ideas, understanding customer needs, deep troubleshooting, or anticipating “black swan” events that have no historical precedent. DDD pushes as much work as possible onto machines and pre‐existing models, so that humans can focus on the work that only a human can do.

1.4 How to Use this Book

This book was written to bring people who don't necessarily have a technical background up to speed on data science. The goals are twofold. First, I want to give a working knowledge of the current state of data science, the tools being used, and where it's going in the foreseeable future. Second, I want to give a solid grounding in the core concepts of analytics that will never go out of date. This book may also be of interest to data scientists who have nitty‐gritty technical chops but want to take their career to the next level by focusing on work that moves the business needle.
The first part of this book, The Business Side of Data Science, stands on its own. It explains in non‐technical terms what data science is, how to manage, hire, and work with data scientists, and how you can leverage DDD without getting into the technical weeds.
Real data literacy, though, requires a certain amount of technical background, at least at a conceptual level. That's where the rest of the book comes in. It gives you the foundation required to formulate clear analytics questions, know what is and isn't possible, understand the tradeoffs between different approaches, and think critically about the usefulness of analytics results. Key jargon is explained in basic terms, the real‐world impact of technical details is shown, unnecessary formalism is avoided, and there is no code. Theory is kept to a minimum, but when it is necessary I illustrate it by example and explain why it is important. I have tried to adhere to Einstein's maxim: “everything should be made as simple as possible… but not simpler.”
Some sections of the book are flagged as “advanced material” in the title. These sections are (by comparison) highly technical in their content. They are necessary for understanding the strengths and weaknesses of specific data science techniques, but are less important for framing analytics problems and managing data science teams.
I have tried to make the chapters as independent as possible, so that the book can be consumed in bite‐sized chunks. In some places the concepts necessarily build off of each other; I have tried to call this out explicitly when it occurs, and to summarize the key background ideas so that the book can be used as a reference.

Table of contents

  1. Cover
  2. Table of Contents
  3. Data Science: The Executive Summary
  4. Copyright
  5. 1 Introduction
  6. 2 The Business Side of Data Science
  7. 3 Working with Modern Data
  8. 4 Telling the Story, Summarizing Data
  9. 5 Machine Learning
  10. 6 Knowing the Tools
  11. 7 Deep Learning and Artificial Intelligence
  12. Postscript
  13. Index
  14. End User License Agreement
