Survival Analysis with Python
eBook - ePub

Avishek Nag

  1. 84 pages
  2. English
  3. ePUB (mobile-friendly)
  4. Available on iOS and Android

About This Book

Survival analysis uses statistics to calculate time to failure. Survival Analysis with Python takes a fresh look at this complex subject by explaining how to use the Python programming language to perform this type of analysis. As the subject itself is very mathematical and full of expressions and formulations, the book provides detailed explanations and examines practical implications. The book begins with an overview of the concepts underpinning statistical survival analysis. It then delves into:

  • Parametric models with coverage of
    • Concept of maximum likelihood estimate (MLE) of a probability distribution parameter
    • MLE of the survival function
    • Common probability distributions and their analysis
    • Analysis of exponential distribution as a survival function
    • Analysis of Weibull distribution as a survival function
    • Derivation of Gumbel distribution as a survival function from Weibull

  • Non-parametric models including
    • Kaplan–Meier (KM) estimator, a derivation of expression using MLE
    • Fitting KM estimator with an example dataset, Python code and plotting curves
    • Greenwood's formula and its derivation

  • Models with covariates explaining
    • The concept of time shift and the accelerated failure time (AFT) model
    • Weibull-AFT model and derivation of parameters by MLE
    • Proportional Hazard (PH) model
    • Cox-PH model and Breslow's method
    • Significance of covariates
    • Selection of covariates

The Python lifelines library is used for coding examples. By mapping theory to practical examples featuring datasets, this book is a hands-on tutorial as well as a handy reference.



Chapter 1 Introduction

DOI: 10.1201/9781003255499-1
We will start our discussion with a few events that can be observed: the death of a person due to a disease, the attrition of an employee from an organization and the occurrence of a natural calamity (an earthquake or a flood). These examples come from completely different domains, but they have one thing in common: time or, more precisely, the time until an event occurs. Time is crucial in all these situations. If we know beforehand that a certain event may occur at a specific time, then many lives and resources can be saved. Survival analysis is defined as a collection of statistical techniques for longitudinal data analysis in which time is a major factor. It is used in biology, medicine, engineering, marketing, and the social and behavioral sciences. Survival analysis is also sometimes called reliability theory in operations research and engineering. It is a complex subject, and the reader needs expertise in probability, statistics, calculus and optimization to grasp it fully.
In this chapter, we will explore some basic concepts of survival analysis, its nomenclature and sample datasets.

Concept of Failure Time

We have already talked about events. In general, survival analysis deals with events related to failure. Failure can, of course, occur one or more times for any subject. For the topics discussed in this book, it is assumed that failure occurs only once for a subject. We will use the term subject throughout this book to represent the entity that goes through some phases and to which the failure (or the event) is attached. A subject may be a person, a machine, a river or even an entire geographic region. There are numerous use cases where survival analysis can be applied to estimate the chance of an event occurring. Some of them are:
  • Death of a person by any disease
  • Suicide
  • Failure of machine tools
  • Attrition of employees from organization
  • Divorce
  • Occurrence of any natural catastrophe (flood, earthquake, volcanic eruption, etc.)
In this book, we will mostly discuss the death-by-disease use case, as this is where survival analysis is used most often. This use case is typically analyzed in drug development, where survival analysis plays a crucial role in identifying the right drug through a comparative study of several options.
We are talking about time a lot. But what does it signify? By time, we mean the years, months, weeks or days from the beginning of the analysis of the data until an event (such as death, the exit of an employee or an earthquake) occurs. As said earlier, an event is also termed a failure. So, the time taken until failure is referred to as the failure time or survival time. Time need not always be a physical unit; there are cases where it can be used as a logical indicator. The following points must be settled before defining a time scale:
  • Origin of the time must be unambiguously defined.
  • The scale for measuring the time difference must be defined.
  • Definition of failure must be clear.
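
The three points above can be made concrete with a minimal Python sketch. It is only an illustration, not code from the book: the function name `failure_time`, the `unit_days` parameter and the two dates are hypothetical, and calendar dates are assumed as the physical time scale.

```python
from datetime import date

def failure_time(origin: date, event: date, unit_days: int = 1) -> float:
    """Time elapsed from the origin to the failure event.

    origin    -- the unambiguous starting point of the time scale
    event     -- the date on which the (clearly defined) failure occurred
    unit_days -- the measuring scale: 1 for days, 7 for weeks, and so on
    """
    if event < origin:
        raise ValueError("failure cannot occur before the time origin")
    return (event - origin).days / unit_days

# Hypothetical subject: observation starts on 1 Jan 2020, failure on 1 Mar 2020.
start = date(2020, 1, 1)
failed = date(2020, 3, 1)

days = failure_time(start, failed)                 # scale: days
weeks = failure_time(start, failed, unit_days=7)   # scale: weeks
print(days, weeks)
```

Changing the origin or the unit changes the numeric failure time of the same subject, which is why both have to be fixed before any analysis begins.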

Concept of Survival

When we speak about survival, we mean probabilities. The probability that an event has not occurred up to a given time is the survival probability. In other words, it is the probability that the event occurs after that time. For example, when we say that the survival probability of a heart patient at age 71 is 0.23, we mean that there is a probability of 0.23 that the patient will survive beyond age 71. Age is the time scale here. Similarly, there could be a probability of 0.40 that he/she will survive beyond age 50. The reason is clear: at a younger age, the chance of collapsing from a heart attack is lower, and thus the survival probability is higher. So, we can have a survival probability distribution over the random variable time (here, age), as shown below:
Table 1.1 A Sample Survival Probability Distribution
Time (Age)           | 40   | 45   | 50   | 60   | 65   | 70
Survival Probability | 0.51 | 0.42 | 0.38 | 0.36 | 0.28 | 0.24
One of the purposes of survival analysis is to find this probability distribution. Many other domain-specific statistical inferences can also be drawn from it. It can be observed that the survival probability decreases over time. This is a very important feature of the distribution, and we will discuss it in greater detail in Chapter 2. As with the heart patient use case, the same analysis can be done for employee attrition in an organization. There, the purpose is to find the survival probability distribution of employee exit at various times after the employee joins. The interesting part is that the term survival is very generic here. It need not always mean saving yourself from something, nor is it always related to disease, patients or healthcare. Survival means the non-occurrence of an event up to some time. The event could be any one from the list in the section ‘Concept of Failure Time’ or something else.
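
As a rough illustration, the values of Table 1.1 can be stored in Python to check the decreasing property and to derive the probability of failure between two ages. This is a sketch under stated assumptions rather than the book's method: the helper names are hypothetical, and reading the table as a step function (each value holds until the next tabulated age) is an assumption made only to allow lookups between tabulated ages.

```python
from bisect import bisect_right

# Values copied from Table 1.1
ages = [40, 45, 50, 60, 65, 70]
surv = [0.51, 0.42, 0.38, 0.36, 0.28, 0.24]

def is_non_increasing(values):
    """True if no value is larger than the one before it."""
    return all(a >= b for a, b in zip(values, values[1:]))

def survival_at(age):
    """Survival probability at the latest tabulated age <= age.

    Assumption: the table is treated as a step function.
    """
    i = bisect_right(ages, age)
    if i == 0:
        raise ValueError("age is earlier than the first tabulated age")
    return surv[i - 1]

def failure_between(a, b):
    """Probability that the event occurs after age a but by age b."""
    return survival_at(a) - survival_at(b)

print(is_non_increasing(surv))            # survival never rises over time
print(survival_at(47))                    # falls back to the value at age 45
print(round(failure_between(40, 50), 2))  # drop in survival from 40 to 50
```

The difference of two survival probabilities is the probability mass of failure in that interval, which is one example of the further inferences that can be drawn from the distribution.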

Censoring

Most survival analyses must consider a very important analytical problem called censoring. It is caused by not observing some subjects fo...
