Confident Data Skills
eBook - ePub

Confident Data Skills

How to Work with Data and Futureproof Your Career

Kirill Eremenko

About This Book

Data has dramatically changed how our world works. Understanding and using data is now one of the most transferable and desirable skills. Whether you're an entrepreneur wanting to boost your business, a jobseeker looking for that employable edge, or simply hoping to make the most of your current career, Confident Data Skills is here to help.
This updated second edition takes you through the basics of data: from data mining and preparing and analysing your data, to visualizing and communicating your insights. It now contains exciting new content on neural networks and deep learning. Featuring in-depth international case studies from companies including Amazon, LinkedIn and Mike's Hard Lemonade Co, as well as easy-to-understand language and inspiring advice and guidance, Confident Data Skills will help you use your new-found data skills to give your career that cutting-edge boost.
About the Confident series...
From coding and web design to data, digital content and cyber security, the Confident books are the perfect beginner's resource for enhancing your professional life, whatever your career path.

Information

Publisher: Kogan Page
Year: 2020
ISBN: 9781789664393
Edition: 2

Part One

‘What is it?’ Key principles

With all the attention given to the apparently limitless potential of technology and the extensive opportunities for keen entrepreneurs, some may ask why they should bother with data science at all – why not simply learn the principles of technology? After all, technology powers the world, and it shows no signs of slowing down. Any reader with an eye to their career might think that learning how to develop new technologies would surely be the better way forward.
It is easy to regard technology as the force that changes the world – it has given us the personal computer, the internet, artificial organs, driverless cars, the Global Positioning System (GPS) – but few people think of data science as the propeller behind many of these inventions. That is why you should be reading this book over a book about technology; you need to understand the mechanics behind a system in order to make a change.
We should not consider data only as the boring-but-helpful parent, and technology as the stylish teenager. The importance of data science does not begin and end with the explanation that technology needs data as just one of many other functional elements. That would be denying the beauty of data, and the many exciting applications that it offers for work and play. In short, it is not possible to have one without the other. What this means is that if you have a grounding in data science, the door will be open to a wide range of other fields that need a data scientist, making it an unusual and propitious area of research and practice.
Part One introduces you to the ubiquity of data, and the developments and key principles of data science that are useful for entering the subject. The concepts in the three chapters will outline a clear picture of how data applies to you, and will get you thinking not only about how data can directly benefit you and your company but also how you can leverage data for the long term in your career and beyond.

Striding out

Chapter 1 will mark the beginning of our journey into data science. It will make clear the vast proliferation of data and how in this Computer Age we all contribute to its production, before moving on to show how people have collected it, worked with it and, crucially, how data can be used to bolster a great number of projects and methods within and outside the discipline.
We have established that part of the problem with data science is not its relative difficulty but rather that the discipline is still something of a grey area for so many. Only when we understand precisely how much data there is and how it is collected can we start to consider the various ways in which we can work with it. We have reached a point in our technological development where information can be efficiently collected and stored for making improvements across all manner of industries and disciplines – as evidenced in the quantity of publicly available databases and government-funded projects to aggregate data across cultural and political institutions – but there are comparatively few people who know how to access and analyse it. Without workers knowing why data is useful, these beautiful datasets will only gather dust. This chapter makes the case for why data science matters right now, why it is not just a trend that will soon go out of style, and why you should consider implementing its practices as a key component of your work tasks.
Lastly, this chapter details how the soaring trajectory of technology gives us no room for pause in the field of data science. Whatever fears we may have about the world towards which we are headed, we cannot put a stop to data being collected, prepared and used. Nevertheless, it is impossible to ignore the fact that data itself is not concerned with questions of morality, and this has left it open to exploitation and abuse. Those of you who are concerned can take charge of these developments and enter into discussion with global institutions that are dealing with issues surrounding data ethics, an area that I find so gripping I gave it its own section in Chapter 3, AI and our future.

The future is data

Everything, every process, every sensor, will soon be driven by data. This will dramatically change the way in which business is carried out. Ten years from now, I predict that every employee of every organization in the world will be expected to have a level of data literacy and be able to work with data and derive some insights to add value to the business. Not such a wild thought if we consider how, at the time of this book’s publication, many people are expected to know how to use the digital wallet service Apple Pay, which was only brought onto the market in 2014.
Chapter 2, How data fulfils our needs, makes clear that data is endemic to every aspect of our lives. It governs us, and it gathers power in numbers. While technology has only been important in recent human history, data has always played a seminal role in our existence. Our DNA provides the most elementary forms of data about us. We are governed by it: it is responsible for the way we look, for the shape of our limbs, for the way our brains are structured and their processing capabilities, and for the range of emotions we experience. We are vessels of this data, walking flash drives of biochemical information, passing it on to our children and ‘coding’ them with a mix of data from us and our partner. To be uninterested in data is to be uninterested in the most fundamental principles of existence.
This chapter explains how data is used across so many fields, and to illustrate this I use examples that directly respond to Abraham Maslow’s hierarchy of needs, a theory that will be familiar to many students and practitioners in the field of business and management. If this hierarchy is news to you, don’t worry – I will explain its structure and how it applies to us in Chapter 2.

Arresting developments

The final chapter in Part One will explore the current state of AI, its potential applications and its dangers. Many of the developments made within the field have had knock-on effects on other subjects. They have raised questions about the future for data scientists as well as for scholars and practitioners beyond its disciplinary boundaries. If you want to develop your career in data science, this chapter could even fire up ideas for subject niches that sorely need qualified personnel.
To add further weight to the examples offered in Chapter 2, which make a compelling case for data’s supportive role across many walks of life, in Chapter 3 I highlight the five most promising AI developments in the world of business. AI’s broad application can make it difficult to penetrate. This chapter gives you a grounding in the subject’s key applications, and where change is happening.
The positive effects of AI are clear, but it is also important not to be blinded by it. Thus, Chapter 3 also addresses the security threats that data, and AI’s use of it, can pose, and how data practitioners can address current and future issues. Ethics is a compelling area, as it has the power to alter and direct future developments in data science. From what we understand of information collection, to the extent to which it can be used within machines and online, data ethics is setting the stage for how humans and technology communicate.
01

Defining data

Think about the last film you saw at the cinema. How did you first hear about it? You might have clicked on the trailer when YouTube recommended it to you, or it may have appeared as an advertisement before YouTube showed you the video you actually wanted to see. You may have seen a friend sing its praises on your social network, or had an engaging clip from the film interrupt your newsfeed. If you’re a keen moviegoer, it could have been picked out for you on an aggregate movie website as a film you might enjoy. Even outside the comfort of the internet, you may have found an advertisement for the film in your favourite magazine, or you could have taken an idle interest in the poster on your way to that coffeehouse with the best Wi-Fi.
None of these touchpoints was coincidental. The stars didn’t just happen to align for you and the film at the right moment. Let’s leave the idealistic serendipity to the onscreen encounters. What got you into the cinema was less a desire to see the film and more of a potent concoction of data-driven evidence that had marked you out as a likely audience member before you even realized you wanted to see the film.
When you interacted with each of these touchpoints, you left a little bit of data about yourself behind. We call this ‘data exhaust’. It isn’t confined to your online presence, nor is it only for the social media generation. Whether or not you use social media platforms, whether you like it or not, you’re contributing data.
It has always been this way; we’ve just become better at recording and collecting it. Any number of your day-to-day interactions stand to contribute to this exhaust. On your way to the London Underground, CCTV cameras are recording you. Hop onto the Tube, and you’re adding to Transport for London’s statistical data about peak times and usage. When you bookmark or highlight the pages of a novel on your Kindle, you are helping distributors to understand what readers particularly enjoyed about it, what they could put in future marketing material and how far their readers tend to get into the novel before they stop.
When you finally decide to forgo the trials and punishments of public transport and instead drive your car to the supermarket, the speed you’re going is helping GPS services to show their users in real time how much traffic there is in an area, and it also helps your car gauge how much more time you have left before you’ll need to find a petrol station.
And today, when you emerge from these touchpoints, the data you leave behind is swept up and added to a blueprint about you that details your interests, actions and desires.
But this is only the beginning of the data story. This book will teach you about how absolutely pervasive data really is. You will learn the essential concepts you need to be on your way to mastering data science, as well as the key definitions, tools and techniques that will enable you to apply data skills to your own work. This book will broaden your horizons by showing you how data science can be applied to areas in ways that you may previously have never thought possible. I’ll describe how data skills can give a boost to your career and transform the way you do business – whether that’s through impressing top executives with your ideas or even starting up on your own.

Data is everywhere

Before we move any further, we should clarify what we mean by data. When people think of data, they think of it being actively collected, stashed away in databases on inscrutable corporate servers and funnelled into research. But this is an outdated view. Today, data is much more ubiquitous.
Quite simply, data is any unit of information. It is the by-product of any and every action, pervading every part of our lives, not just within the sphere of the internet, but also in history, place and culture. A cave painting is data. A chord of music is data. The speed of a car is data. A ticket to a football match is data. A response to a survey question is data. A book is data, as is a chapter within that book, as is a word within that chapter, as is a letter within that word. It doesn’t have to be collected for it to be considered data. It doesn’t have to be stored in a vault of an organization for it to be considered data. Much of the world’s data probably doesn’t (yet) belong to any database at all.
Let’s say that in this definition of data being a unit of information, data is the tangible past. This is quite profound when you think about it. Data is the past, and the past is data. The record of things to which data contributes is called a database. And data scientists can use it to better understand our present and future operations. They’re applying the very same principle that historians have been telling us about for ages: we can learn from history. We can learn from our successes – and our mistakes – in order to improve the present and future.
The only aspect of data that has dramatically changed in recent years is our ability to collect, organize, analyse and visualize it in contexts that are only limited by our imagination. Wherever we go, whatever we buy, whatever interests we have, this data is all being collected and remodelled into trends that help advertisers and marketers push their products to the right peopl...
