Advances in Financial Machine Learning
eBook - ePub

About this book

Learn to understand and implement the latest machine learning innovations to improve your investment performance

Machine learning (ML) is changing virtually every aspect of our lives. Today, ML algorithms accomplish tasks that – until recently – only expert humans could perform. And finance is ripe for disruptive innovations that will transform how the following generations understand money and invest.

In the book, readers will learn how to:

  • Structure big data in a way that is amenable to ML algorithms
  • Conduct research with ML algorithms on big data
  • Use supercomputing methods and backtest their discoveries while avoiding false positives

Advances in Financial Machine Learning addresses real-life problems faced by practitioners every day, and explains scientifically sound solutions using math, supported by code and examples. Readers become active users who can test the proposed solutions in their individual setting.

Written by a recognized expert and portfolio manager, this book will equip investment professionals with the groundbreaking tools needed to succeed in modern finance.

By Marcos Lopez de Prado. Category: Business, Investments & Securities.

Information

Publisher
Wiley
Year
2018
Print ISBN
9781119482086
eBook ISBN
9781119482109

PART 1
Data Analysis

  1. Chapter 2 Financial Data Structures
  2. Chapter 3 Labeling
  3. Chapter 4 Sample Weights
  4. Chapter 5 Fractionally Differentiated Features

CHAPTER 2
Financial Data Structures

2.1 MOTIVATION

In this chapter we will learn how to work with unstructured financial data, and from that to derive a structured dataset amenable to ML algorithms. In general, you do not want to consume someone else’s processed dataset, as the likely outcome will be that you discover what someone else already knows or will figure out soon. Ideally your starting point is a collection of unstructured, raw data that you are going to process in a way that will lead to informative features.

2.2 ESSENTIAL TYPES OF FINANCIAL DATA

Financial data comes in many shapes and forms. Table 2.1 shows the four essential types of financial data, ordered from left to right in terms of increasing diversity. Next, we will discuss their different natures and applications.
TABLE 2.1 The Four Essential Types of Financial Data

Fundamental Data  Market Data                     Analytics                Alternative Data
Assets            Price/yield/implied volatility  Analyst recommendations  Satellite/CCTV images
Liabilities       Volume                          Credit ratings           Google searches
Sales             Dividend/coupons                Earnings expectations    Twitter/chats
Costs/earnings    Open interest                   News sentiment           Metadata
Macro variables   Quotes/cancellations            . . .                    . . .
. . .             Aggressor side
                  . . .

2.2.1 Fundamental Data

Fundamental data encompasses information that can be found in regulatory filings and business analytics. It is mostly accounting data, reported quarterly. A particular aspect of this data is that it is reported with a lag. You must confirm exactly when each data point was released, so that your analysis uses that information only after it was publicly available. A common beginner’s error is to assume that this data was published at the end of the reporting period. That is never the case.
For example, fundamental data published by Bloomberg is indexed by the last date included in the report, which precedes the date of the release (often by 1.5 months). In other words, Bloomberg is assigning those values to a date when they were not known. You would not believe how many papers are published every year using misaligned fundamental data, especially in the factor-investing literature. Once you align the data correctly, a substantial number of findings in those papers cannot be reproduced.
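The misalignment described above can be illustrated with a minimal sketch, assuming pandas is available. The column names (`period_end`, `release_date`, `eps`) and every date and value are hypothetical, chosen only for illustration; they are not taken from the book or from any vendor's schema. The sketch joins the same toy fundamentals to a set of observation dates twice: once keyed on the period end (the misaligned approach) and once keyed on the release date, using a strict "released before the observation date" rule.

```python
import pandas as pd

# Hypothetical quarterly fundamentals; dates and values are made up.
fundamentals = pd.DataFrame({
    "period_end": pd.to_datetime(["2017-03-31", "2017-06-30"]),
    "release_date": pd.to_datetime(["2017-05-15", "2017-08-14"]),
    "eps": [1.10, 1.25],
})

# Dates on which a model would observe the data.
obs = pd.DataFrame({"date": pd.to_datetime(
    ["2017-04-03", "2017-05-15", "2017-07-03", "2017-08-15"])})

# Misaligned join: keyed on period end, so each value appears to be
# known before it was actually released (look-ahead bias).
naive = pd.merge_asof(
    obs, fundamentals.sort_values("period_end"),
    left_on="date", right_on="period_end",
)

# Point-in-time join: keyed on release date. allow_exact_matches=False
# makes a value usable only strictly after its release date.
pit = pd.merge_asof(
    obs, fundamentals.sort_values("release_date"),
    left_on="date", right_on="release_date",
    allow_exact_matches=False,
)

print(naive[["date", "eps"]])
print(pit[["date", "eps"]])
```

In the point-in-time result the first two observation dates carry no value at all, because nothing had been released yet. The naive join fills them with the first-quarter figure anyway. Whether a value should count as known on its release date or only on the following day is a modeling choice; the strict rule here is one conservative option.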
A second aspect of fundamental data is that it is often backfilled or reinstated. “Backfilling” means that missing data is assigned a value, even if those values were unknown at that time. A “reinstated value” is a corrected value that amends an incorrect initial release. A company may issue multiple corrections for a past quarter’s results long after the first publication, and data vendors may overwrite the initial values with their corrections. The problem is, the corrected values were not known on that first release date. Some data vendors circumvent this problem by storing multiple release dates and values for each variable. For example, we typically have three values for a single quarterly GDP release: the original released value and two monthly revisions. Still, it is very common to find studies that use the final released value and assign it to the time of the first release, or even to the last day in the reporting period. We will revisit this mistake, and its implications, when we discuss backtesting errors in Chapter 11.
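The idea of storing multiple release dates and values for each variable can be sketched as follows, again assuming pandas. The table layout, the helper `value_as_of`, and all dates and numbers are hypothetical illustrations, not a vendor's actual format. Each row is one vintage of the same quarterly figure (an initial release and two revisions), and the helper returns the latest value that had been published strictly before a given date.

```python
import pandas as pd

# Hypothetical vintages of a single quarterly figure: one initial
# release plus two revisions. All dates and values are made up.
vintages = pd.DataFrame({
    "period_end": pd.to_datetime(["2017-03-31"] * 3),
    "release_date": pd.to_datetime(
        ["2017-04-28", "2017-05-26", "2017-06-29"]),
    "value": [0.7, 1.2, 1.4],
})

def value_as_of(vintages, period_end, as_of):
    """Latest value for `period_end` released strictly before `as_of`.

    Returns None if nothing had been published by then.
    """
    known = vintages[
        (vintages["period_end"] == pd.Timestamp(period_end))
        & (vintages["release_date"] < pd.Timestamp(as_of))
    ]
    if known.empty:
        return None
    return float(known.sort_values("release_date")["value"].iloc[-1])

for as_of in ["2017-04-15", "2017-05-01", "2017-06-01", "2017-12-31"]:
    print(as_of, value_as_of(vintages, "2017-03-31", as_of))
```

A study that assigns the final value (1.4 in this toy table) to the first release date, or to the period end, is using information that was not available at that time. Querying by `as_of` avoids that by construction.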
Fundamental data is extremely regularized and low freque...

Table of contents

  1. Cover
  2. Praise
  3. Title page
  4. Copyright
  5. Dedication
  6. About the Author
  7. PREAMBLE
  8. PART 1 DATA ANALYSIS
  9. PART 2 MODELLING
  10. PART 3 BACKTESTING
  11. PART 4 USEFUL FINANCIAL FEATURES
  12. PART 5 HIGH-PERFORMANCE COMPUTING RECIPES
  13. Index
  14. EULA