Advanced Data Science and Analytics with Python
eBook - ePub

  1. 384 pages
  2. English
  3. ePUB (mobile friendly)
  4. Available on iOS & Android

About this book

Advanced Data Science and Analytics with Python enables data scientists to continue developing their skills and apply them in business as well as academic settings. The subjects discussed in this book are complementary and a follow-up to the topics discussed in Data Science and Analytics with Python. The aim is to cover important advanced areas in data science using tools developed in Python such as scikit-learn, pandas, NumPy, Beautiful Soup, NLTK, NetworkX and others. The model development is supported by the use of frameworks such as Keras, TensorFlow and Core ML, as well as Swift for the development of iOS and macOS applications.

Features:

  • Targets readers with a background in programming, who are interested in the tools used in data analytics and data science
  • Uses Python throughout
  • Presents tools, alongside solved examples, with steps that the reader can easily reproduce and adapt to their needs
  • Focuses on the practical use of the tools rather than on lengthy explanations
  • Provides the reader with the opportunity to use the book whenever needed rather than following a sequential path

The book can be read independently from the previous volume and each of the chapters in this volume is sufficiently independent from the others, providing flexibility for the reader. Each of the topics addressed in the book tackles the data science workflow from a practical perspective, concentrating on the process and results obtained. The implementation and deployment of trained models are central to the book.

Time series analysis, natural language processing, topic modelling, social network analysis, neural networks and deep learning are comprehensively covered. The book discusses the need to develop data products and addresses the subject of bringing models to their intended audiences – in this case, literally to the users' fingertips in the form of an iPhone app.

About the Author

Dr. Jesús Rogel-Salazar is a lead data scientist in the field, working for companies such as Tympa Health Technologies, Barclays, AKQA, IBM Data Science Studio and Dow Jones. He is a visiting researcher at the Department of Physics at Imperial College London, UK and a member of the School of Physics, Astronomy and Mathematics at the University of Hertfordshire, UK.


1

No Time to Lose: Time Series Analysis

HAVE YOU EVER WONDERED WHAT the weather, financial prices, home energy usage, and your weight all have in common? Well, apart from the obvious, the data to analyse these phenomena can be collected at regular intervals over time. Common sense, right? Well, there is no time to lose; let us take a deeper look into this exciting kind of data. Are you ready?
Not obvious? Oh… well, read on!
Or is it Toulouse, like “Toulouse” in France?
A time series is defined as a sequence of data readings taken in successive order, and it can be recorded for any variable that changes over time. So, if a time series is a set of data collected over time, then a lot of things, not just our weight or the weather, would be classed as time series, and perhaps that is true. There are, obviously and quite literally, millions of data points that can be collected over time. However, time series analysis is not necessarily immediately employed.
A lot of data is collected over time, but that does not make the data set a time series.
Time series analysis encapsulates the methods used to understand the sequence of data points mentioned above and extract useful information from it. A main goal is that of forecasting successive future values of the series. In this chapter we will cover some of these methods. Let us take a look.

1.1 Time Series

KNOWING HOW TO MODEL TIME series is surely an important tool in our Jackalope data scientist toolbox. Jackalopes? Yes! Long story… You can get further information in Chapter 1 of Data Science and Analytics with Python. But I digress; the key point about time series data is that the ordering of the data points in time matters. For many datasets it is not important in which order the data are obtained or listed. One order is as good as another, and although the ordering may tell us something about the dataset, it is not an inherent attribute of the set.
See for instance the datasets analysed in the book mentioned above.
However, for time series data the ordering is absolutely crucial. The order imposes a certain structure on the data, which in turn is of relevance to the underlying phenomenon studied. So, what is different about time series? Well, Time! Furthermore, we will see later on in this chapter that in some cases future observations are influenced by past data points. All in all, this is not a surprising statement; we are well acquainted with causal relationships.
What is different about time series? —Time!
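To make the point about ordering concrete, here is a minimal sketch in which the same values are kept but their order is scrambled. It assumes a simulated series where each value depends on the previous one (a simple autoregressive model chosen purely for illustration, not taken from the text); the names `phi`, `n` and `lag1_autocorr` are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Assumption: a simulated series in which each value depends on the
# previous one (an AR(1) process with hypothetical parameters).
n = 500      # number of observations
phi = 0.9    # strength of the dependence on the previous value
x = np.zeros(n)
for t in range(1, n):
    x[t] = phi * x[t - 1] + rng.normal()

def lag1_autocorr(series):
    """Correlation between each value and the one immediately before it."""
    return np.corrcoef(series[:-1], series[1:])[0, 1]

# Shuffling keeps exactly the same values but destroys their order
shuffled = rng.permutation(x)

print("Same mean:", np.isclose(x.mean(), shuffled.mean()))
print("Lag-1 autocorrelation, original:", lag1_autocorr(x))
print("Lag-1 autocorrelation, shuffled:", lag1_autocorr(shuffled))
```

Order-free summaries such as the mean are unchanged by the shuffle, whereas the link between consecutive values should largely disappear, which is the structure that time series methods rely on.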
Let us have a look at an example of a time series. In Figure 1.1 we can see a financial time series corresponding to the log returns of Apple for a year starting in April 2017. The log returns are used to determine the proportional amount you might get on a given day compared to the previous one. With that description in mind, we can see how we are relating the value on day n to the one on day n − 1.
The log return is given by $\log\left(\frac{FV}{PV}\right)$, where FV is the future value and PV is the past value.
Figure 1.1: A time series of the log returns for Apple Inc. for a year since April 2017.
In that way, a...
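The log return described above, relating the value on day n to the one on day n − 1, can be sketched as follows. This is a minimal illustration using NumPy; the function name `log_returns` and the price values are hypothetical and are not the Apple data shown in Figure 1.1.

```python
import numpy as np

def log_returns(prices):
    """Return log(P_n / P_{n-1}) for each pair of consecutive prices."""
    prices = np.asarray(prices, dtype=float)
    # The difference of the log prices equals log(P_n / P_{n-1})
    return np.diff(np.log(prices))

# Hypothetical daily closing prices, for illustration only
prices = [100.0, 102.0, 101.0, 105.0]
returns = log_returns(prices)
print(returns)
```

Note that the result has one fewer entry than the input, since the first day has no previous value to compare against. A convenient property of log returns is that they add up over time: the sum of the daily values equals the log return over the whole period.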

Table of contents

  1. Cover
  2. Half Title
  3. Series Page
  4. Title Page
  5. Copyright Page
  6. Dedication
  7. Table of Contents
  8. List of Figures
  9. List of Tables
  10. Preface
  11. Reader’s Guide
  12. About the Author
  13. Other Books by the Same Author
  14. 1 No Time to Lose: Time Series Analysis
  15. 2 Speaking Naturally: Text and Natural Language Processing
  16. 3 Getting Social: Graph Theory and Social Network Analysis
  17. 4 Thinking Deeply: Neural Networks and Deep Learning
  18. 5 Here Is One I Made Earlier: Machine Learning Deployment
  19. A Information Criteria
  20. B Power Iteration
  21. C The Softmax Function and Its Derivative
  22. D The Derivative of the Cross-Entropy Loss Function
  23. Bibliography
  24. Index