# 22 Jul Survival analysis

• When the response variable is binary and it is possible to switch permanently from one state to the other, we can carry out a survival analysis.
• This type of analysis can take into account patients who are lost to follow-up.
• The most commonly used statistical model for survival analysis in medical studies is the Cox model.
• When there is only one explanatory variable that is categorical, a Cox model produces a similar result to a log-rank test.
To keep things simple, we call Y the response variable that we want to explain with the explanatory variables X. (Think back to your distant memories of Y = aX + b.)
For example, if we want to explain the probability of dying according to the treatment, Y is the state (dead/alive) and X is the treatment.

### When to perform a survival analysis?

Survival analyses take both events and non-events into account. Observations without an event are said to be censored. There are two types of censoring: patients lost to follow-up, who leave the study before it ends, and patients who have not experienced the event by the end of the study.
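As an illustration, the two types of censoring can be told apart from the follow-up time and the event indicator. The sketch below is only an assumption about how such data might be laid out: the `Patient` class, the `classify` helper and the 24-month study duration are hypothetical and do not come from pvalue.io.

```python
from dataclasses import dataclass

STUDY_DURATION = 24  # months; hypothetical study length


@dataclass
class Patient:
    follow_up: float  # months under observation
    died: bool        # True if the event was observed


def classify(p: Patient) -> str:
    """Label an observation as an event or as one of the two censoring types."""
    if p.died:
        return "event"
    if p.follow_up < STUDY_DURATION:
        return "lost to follow-up"
    return "no event at end of study"


patients = [Patient(10, True), Patient(7, False), Patient(24, False)]
labels = [classify(p) for p in patients]

# For the analysis itself, both censoring types are encoded the same way:
# a follow-up time and an event indicator equal to 0.
encoded = [(p.follow_up, int(p.died)) for p in patients]
```

Whatever the reason for censoring, the analysis only needs the pair (follow-up time, event indicator).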
The simplest survival analyses are conducted using the log-rank test. This test relates the probability of an event to a single risk factor: it is a univariate test.
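To make the idea concrete, here is a minimal sketch of a two-group log-rank test written with the Python standard library only. It uses the usual observed-versus-expected formulation with a chi-square statistic on one degree of freedom; the function name and the toy data are hypothetical, and this is not necessarily how pvalue.io computes the test.

```python
import math


def logrank_test(times_a, events_a, times_b, events_b):
    """Two-group log-rank test; returns (chi2, p_value)."""
    data = [(t, e, 0) for t, e in zip(times_a, events_a)]
    data += [(t, e, 1) for t, e in zip(times_b, events_b)]
    event_times = sorted({t for t, e, _ in data if e == 1})

    observed_a = expected_a = variance = 0.0
    for t in event_times:
        n = sum(1 for u, _, _ in data if u >= t)                 # at risk, both groups
        n_a = sum(1 for u, _, g in data if u >= t and g == 0)    # at risk, group A
        d = sum(1 for u, e, _ in data if u == t and e == 1)      # events at time t
        d_a = sum(1 for u, e, g in data if u == t and e == 1 and g == 0)
        observed_a += d_a
        expected_a += d * n_a / n
        if n > 1:
            variance += d * (n_a / n) * (1 - n_a / n) * (n - d) / (n - 1)

    if variance == 0:
        return 0.0, 1.0
    chi2 = (observed_a - expected_a) ** 2 / variance
    # Upper tail of a chi-square distribution with 1 degree of freedom
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Toy data: follow-up times and event indicators (1 = event, 0 = censored)
control_times, control_events = [2, 3, 4, 5, 6], [1, 1, 1, 1, 0]
treated_times, treated_events = [6, 8, 9, 10, 12], [1, 0, 1, 0, 0]
chi2, p_value = logrank_test(control_times, control_events,
                             treated_times, treated_events)
```

Censored patients contribute to the number at risk until they leave the study, which is how the test makes use of incomplete follow-up.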
Unless patients have been randomly assigned to two groups to be exposed or not to a factor (such as a drug) in a randomized trial, there are most likely confounding factors. A log-rank test is then not suitable.
The Cox model is a multivariable model and can therefore take into account confounding factors. It is simple to implement, and is very often used in clinical studies; however, its conditions are quite strict.

### How to perform survival analyses with pvalue.io?

1. Choose to perform an explanatory analysis
2. Select your variable to explain (Y), which is the event, and your explanatory variables (X)
3. Define the direction of the event (from Y=0 to Y=1 or from Y=1 to Y=0)
4. Define the variable representing the follow-up time or the patient’s entry date and exit date
5. Check that there are no errors according to the descriptive analysis (using figures and tables)
6. Check the conditions and assumptions, and cut your variables into classes if necessary
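When the data contain an entry date and an exit date rather than a follow-up time (step 4), the follow-up time is simply the difference between the two. The sketch below shows one way to compute it and to catch an obvious data-entry error along the way; the record layout and the `follow_up_days` helper are hypothetical and not part of pvalue.io.

```python
from datetime import date


def follow_up_days(entry: date, exit_: date) -> int:
    """Follow-up time in days; rejects records whose exit precedes entry."""
    if exit_ < entry:
        raise ValueError("exit date precedes entry date")
    return (exit_ - entry).days


# Hypothetical records: "dead" is the event indicator (1 = event, 0 = censored)
records = [
    {"entry": date(2020, 1, 15), "exit": date(2021, 1, 15), "dead": 1},
    {"entry": date(2020, 3, 1), "exit": date(2020, 9, 1), "dead": 0},
]
durations = [follow_up_days(r["entry"], r["exit"]) for r in records]
```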

At the end of this verification, pvalue.io may refer you to a statistician. This happens when the conditions are not met. In that case pvalue.io cannot propose a solution, because one would require more or less complex mathematical transformations that are not standardized and/or are difficult to interpret.

### How to interpret the results of a survival analysis?

By default, statistical software provides, for each explanatory variable X included in the Cox model, a coefficient, the confidence interval of this coefficient and a p-value. It is however common practice to present the exponential of this coefficient, which is the hazard ratio (HR).
If the hazard ratio is greater than 1, the factor increases the risk of the event occurring; if it is less than 1, the factor decreases it.
If the 95% confidence interval of the hazard ratio does not include 1, the association is said to be statistically significant, and the p-value is lower than 0.05.
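The conversion from a coefficient to a hazard ratio can be sketched as follows. The coefficient and standard error are hypothetical values chosen for illustration, and the interval assumes a 95% confidence level built from a normal approximation (coefficient ± 1.96 standard errors), which may differ from what a given software package reports.

```python
import math

Z_95 = 1.96  # normal quantile for a 95% interval (assumption)


def hazard_ratio(coef: float, se: float):
    """Exponentiate a Cox coefficient and the bounds of its confidence interval."""
    hr = math.exp(coef)
    lower = math.exp(coef - Z_95 * se)
    upper = math.exp(coef + Z_95 * se)
    return hr, lower, upper


# Hypothetical coefficient and standard error
hr, lower, upper = hazard_ratio(coef=-0.45, se=0.08)
```

A negative coefficient gives a hazard ratio below 1, and a coefficient of 0 gives a hazard ratio of exactly 1, that is, no effect.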

### For a numerical variable

An increase of one unit in a numerical variable, with all other variables unchanged, multiplies the event risk by the hazard ratio.
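Because the effect is multiplicative, an increase of several units multiplies the risk by the hazard ratio raised to that number of units. The short sketch below illustrates this with a hazard ratio of 1.04 per unit, the value reported for Nodes in the colon cancer example on this page; the helper name is hypothetical.

```python
def hazard_ratio_for_increase(hr_per_unit: float, units: float) -> float:
    """Hazard ratio for an increase of `units`, given the per-unit hazard ratio."""
    return hr_per_unit ** units


# Five additional units with a per-unit hazard ratio of 1.04
hr_five_units = hazard_ratio_for_increase(1.04, 5)
```

Five additional units therefore correspond to a risk multiplied by about 1.22, not by 5 × 1.04.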

### For a categorical variable

Compared with the reference class, the risk associated with a given class is multiplied by the hazard ratio.

In the following table, we wanted to know the risk factors of death in patients with colon cancer.

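| Variable | Comparison | Hazard ratio [CI] | p | p global |
|---|---|---|---|---|
| Age | | 1.00 [0.996, 1.01] | 0.5 | |
| Nodes | | 1.04 [1.02, 1.06] | <0.001 | |
| Rx | Lev vs Obs | 0.974 [0.84, 1.1] | 0.7 | <0.001 |
| | Lev+5FU vs Obs | 0.637 [0.54, 0.75] | <0.001 | |
| Sex | 1 vs 0 | 0.970 [0.85, 1.1] | 0.7 | |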

We conclude as follows:

• Age is not significantly associated with the risk of death (p >0.05)
• Rx = Lev+5FU is a protective factor (p <0.001): compared to Rx = Obs, the risk of the event is multiplied on average by 0.637 [0.54, 0.75]
• The event risk is significantly different according to Rx (all classes combined) (p global <0.001)
• Nodes is a risk factor for death (p <0.001): when Nodes increases by one unit (for example from 1 to 2), the risk of the event is multiplied on average by 1.04 [1.02, 1.06]
• Sex is not significantly associated with the risk of death (p >0.05)
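The same reading can be reproduced mechanically from the confidence intervals in the table. The sketch below assumes the intervals are 95% intervals, which the table does not state explicitly; the `interpret` helper is hypothetical.

```python
# (label, hazard ratio, lower bound, upper bound), copied from the table
rows = [
    ("Age", 1.00, 0.996, 1.01),
    ("Nodes", 1.04, 1.02, 1.06),
    ("Rx: Lev vs Obs", 0.974, 0.84, 1.1),
    ("Rx: Lev+5FU vs Obs", 0.637, 0.54, 0.75),
    ("Sex: 1 vs 0", 0.970, 0.85, 1.1),
]


def interpret(lower: float, upper: float) -> str:
    """Classify an effect according to where its interval lies relative to 1."""
    if lower > 1:
        return "risk factor"
    if upper < 1:
        return "protective factor"
    return "not significant"


conclusions = {label: interpret(lower, upper) for label, _, lower, upper in rows}

for label, conclusion in conclusions.items():
    print(f"{label}: {conclusion}")
```

Note that this check works row by row: it cannot reproduce the global p-value for Rx, which tests all classes of the variable together.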