Survival Analysis with Python

Avishek Nag

  1. 84 pages
  2. English
  3. ePUB (mobile-friendly)
  4. Available on iOS and Android

About this book

Survival analysis uses statistics to calculate time to failure. Survival Analysis with Python takes a fresh look at this complex subject by explaining how to use the Python programming language to perform this type of analysis. As the subject itself is very mathematical and full of expressions and formulations, the book provides detailed explanations and examines practical implications. The book begins with an overview of the concepts underpinning statistical survival analysis. It then delves into



  • Parametric models with coverage of
    • Concept of maximum likelihood estimate (MLE) of a probability distribution parameter
    • MLE of the survival function
    • Common probability distributions and their analysis
    • Analysis of exponential distribution as a survival function
    • Analysis of Weibull distribution as a survival function
    • Derivation of Gumbel distribution as a survival function from Weibull
  • Non-parametric models including
    • Kaplan–Meier (KM) estimator, a derivation of expression using MLE
    • Fitting KM estimator with an example dataset, Python code and plotting curves
    • Greenwood's formula and its derivation
  • Models with covariates explaining
    • The concept of time shift and the accelerated failure time (AFT) model
    • Weibull-AFT model and derivation of parameters by MLE
    • Proportional Hazard (PH) model
    • Cox-PH model and Breslow's method
    • Significance of covariates
    • Selection of covariates

The Python lifelines library is used for coding examples. By mapping theory to practical examples featuring datasets, this book is a hands-on tutorial as well as a handy reference.


Information

Year: 2021
ISBN: 9781000520699

Chapter 1 Introduction

DOI: 10.1201/9781003255499-1
We start our discussion with a few observable events: the death of a person due to a disease, the attrition of an employee from an organization and the occurrence of a natural calamity (an earthquake or a flood). These examples come from completely different domains, but they have one thing in common: time or, more precisely, the time until an event occurs. Time is crucial in all these situations. If we know beforehand that a certain event is likely to occur at a specific time, many lives and resources can be saved. Survival analysis is defined as a collection of statistical techniques for analyzing longitudinal data in which time is a major factor. It is used in biology, medicine, engineering, marketing and the social and behavioral sciences. In operations research and engineering, survival analysis is sometimes called reliability theory. It is a complex subject, and the reader needs a background in probability, statistics, calculus and optimization to grasp it fully.
In this chapter, we will explore some basic concepts of survival analysis, its nomenclature and sample datasets.

Concept of Failure Time

We have already talked about events. In general, survival analysis deals with events related to failure. A failure can, of course, occur one or more times for any subject. For the topics discussed in this book, it is assumed that failure occurs only once for a subject. We use the term subject throughout this book to mean the entity that passes through some phases and to which the failure (or the event) is attached. A subject may be a person, a machine, a river or even an entire geographic region. There are numerous use cases where survival analysis can be applied to estimate the chances of an event occurring. Some of them are:
  • Death of a person by any disease
  • Suicide
  • Failure of machine tools
  • Attrition of employees from an organization
  • Divorce
  • Occurrence of any natural catastrophe (flood, earthquake, volcanic eruption, etc.)
In this book, we mostly discuss death-by-disease use cases, since this is where survival analysis is used most often. Such cases are typically analyzed during drug development, where survival analysis plays a crucial role in identifying the right drug through a comparative study of several options.
We are talking about time a lot, but what does it signify? By time, we mean the years, months, weeks or days from the beginning of the analysis of the data until an event (such as a death, the exit of an employee or an earthquake) occurs. As said earlier, an event is also termed a failure, so the time taken until failure is referred to as the failure time or survival time. Time need not always be a physical unit; in some cases it can be used as a logical indicator. The following points must be taken care of when defining a time scale:
  • Origin of the time must be unambiguously defined.
  • The scale for measuring the time difference must be defined.
  • Definition of failure must be clear.
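The three requirements above can be made concrete with a short sketch. It assumes a hypothetical record layout in which each subject has a time origin and an event date, and it fixes days as the scale. The function name `failure_time_days` and the sample dates are illustrative assumptions, not taken from the book.

```python
from datetime import date


def failure_time_days(origin: date, event: date) -> int:
    """Failure time: days elapsed from the time origin to the failure event."""
    if event < origin:
        raise ValueError("the event cannot precede the time origin")
    return (event - origin).days


# Hypothetical subjects: (time origin, date of failure). Illustrative values only.
subjects = {
    "A": (date(2020, 1, 1), date(2020, 3, 1)),
    "B": (date(2020, 1, 15), date(2021, 1, 15)),
}

# One failure time per subject, all measured on the same scale (days).
failure_times = {
    name: failure_time_days(origin, event)
    for name, (origin, event) in subjects.items()
}
```

Changing the origin (for example, date of diagnosis instead of date of enrolment) or the scale (weeks instead of days) changes every failure time, which is why both need to be fixed before any analysis begins.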

Concept of Survival

When we speak about survival, we mean probabilities. The probability that an event does not occur up to some time is the survival probability. In other words, the survival probability is the probability that the event occurs after a certain time. For example, when we say that the survival probability of a heart patient at age 71 is 0.23, we mean that there is a probability of 0.23 that the patient will survive beyond age 71. Age is the time scale here. Similarly, there could be a probability of 0.40 that he/she will survive beyond age 50. The reason is clear: at a younger age, the chance of collapsing from a heart attack is lower, and thus the survival probability is higher. So, we can have a survival probability distribution over the random variable time (here, age) like the one below:
Table 1.1 A Sample Survival Probability Distribution
Time (Age)             40    45    50    60    65    70
Survival Probability   0.51  0.42  0.38  0.36  0.28  0.24
One of the purposes of survival analysis is to find this probability distribution. Many other domain-specific statistical inferences can also be drawn from it. Note that the survival probability decreases over time. This is a very important feature of the distribution, which we will discuss in greater detail in Chapter 2. As with the heart patient use case, the same analysis can be done for employee attrition in an organization. There, the purpose is to find the survival probability distribution of an employee's exit at various times after he/she joins. Interestingly, the term survival is very generic here. It need not always mean saving yourself from something, nor is it always related to disease, patients or healthcare. Survival means the non-occurrence of an event up to some time. The event could be any one from the list in the section ‘Concept of Failure Time’ or something else.
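As a minimal sketch, the values of Table 1.1 can be stored and checked in Python. The notation S(t) for the probability of surviving beyond time t, the helper names, and the choice to hold the probability constant between tabulated ages are assumptions made for illustration and are not part of this chapter.

```python
from bisect import bisect_right

# Values copied from Table 1.1 (age -> survival probability).
ages = [40, 45, 50, 60, 65, 70]
survival = [0.51, 0.42, 0.38, 0.36, 0.28, 0.24]


def is_non_increasing(values):
    """True when no value is larger than the one before it."""
    return all(a >= b for a, b in zip(values, values[1:]))


def survival_at(age):
    """Look up S(age) from the table.

    Assumption: S is held constant between tabulated ages (a step function).
    Ages below the first tabulated age are rejected rather than guessed.
    """
    idx = bisect_right(ages, age) - 1
    if idx < 0:
        raise ValueError("age is below the first tabulated age")
    return survival[idx]


def failure_probability_between(start, end):
    """P(start < T <= end) = S(start) - S(end)."""
    return survival_at(start) - survival_at(end)


monotone = is_non_increasing(survival)
p_fail_50_to_70 = failure_probability_between(50, 70)
```

With the table's values, the probability that the event occurs after age 50 but no later than age 70 is S(50) - S(70) = 0.38 - 0.24 = 0.14. This subtraction only makes sense because the survival probabilities never increase with time.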

Censoring

Most survival analyses must consider a very important analytical problem called censoring. It is caused by not observing some subjects fo...
