# How to perform a multivariable analysis

27 Nov

Several steps are necessary to perform a multivariable (multivariate) analysis:

1. Formalizing the research hypothesis
   - Select the outcome variable Y
   - Select the explanatory variables X
2. Check that the conditions are met
3. Check the robustness of the model

## Formalizing the research hypothesis

All research starts from a hypothesis. It may be phrased as a question, for example: “Does our new service protocol reduce the number of readmissions?” Sometimes the hypothesis is straightforward to state, sometimes it is more complex.

pvalue.io helps you formalize it in the following form: “X has an influence on Y”. It is up to you to determine the relevant X and Y. In the readmission example, X would be the service protocol (coded, for example, in 2 classes: “former” and “new”) and Y the occurrence of a 30-day readmission (e.g. yes/no or 0/1). The hypothesis would therefore be: “The service protocol has an influence on 30-day readmission”.
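To make the coding of X and Y concrete, here is a minimal sketch in plain Python. The patient counts are invented for illustration, and `encode_protocol` and `odds_ratio` are hypothetical helper names, not part of pvalue.io. It codes the protocol in 2 classes and computes an unadjusted odds ratio of 30-day readmission.

```python
# Hypothetical records: one (protocol, readmitted_30d) pair per hospital stay.
# The counts are made up for illustration only.
records = (
    [("former", 1)] * 30 + [("former", 0)] * 70
    + [("new", 1)] * 18 + [("new", 0)] * 82
)

def encode_protocol(label):
    """Code X in 2 classes: 0 = former protocol, 1 = new protocol."""
    return {"former": 0, "new": 1}[label]

def odds_ratio(pairs):
    """Unadjusted odds ratio of Y = 1 for X = 1 versus X = 0."""
    counts = {(0, 0): 0, (0, 1): 0, (1, 0): 0, (1, 1): 0}
    for x, y in pairs:
        counts[(x, y)] += 1
    return (counts[(1, 1)] * counts[(0, 0)]) / (counts[(1, 0)] * counts[(0, 1)])

coded = [(encode_protocol(protocol), y) for protocol, y in records]
crude_or = odds_ratio(coded)
print(f"Unadjusted odds ratio (new vs former): {crude_or:.2f}")
```

An odds ratio below 1 would point towards fewer readmissions under the new protocol, but this estimate ignores every other variable, which is why the adjustment discussed next matters.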

However, many biases are possible, so it is often necessary to adjust for confounding factors. Here, confounding factors are the other variables related to Y. pvalue.io therefore prompts you to select the other variables that are known or assumed to be related to Y.

Here, we call adjustment variables the variables that are statistically related to Y in the data (i.e. with a p-value below a threshold) but that are not known or assumed beforehand to be related to Y. Typically, we do not need an estimate of the influence of these variables. pvalue.io selects these variables automatically and suggests that you deselect those that are not relevant.
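The screening idea can be sketched as follows. This is an illustration under stated assumptions, not pvalue.io's actual selection rule: the test (a Pearson chi-square on 2x2 tables), the threshold `ALPHA`, the variable names and the data are all hypothetical.

```python
import math

def chi2_pvalue_2x2(xs, ys):
    """Pearson chi-square test (1 df, no continuity correction) for two 0/1 variables."""
    n = len(xs)
    a = sum(1 for x, y in zip(xs, ys) if x == 1 and y == 1)
    b = sum(1 for x, y in zip(xs, ys) if x == 1 and y == 0)
    c = sum(1 for x, y in zip(xs, ys) if x == 0 and y == 1)
    d = n - a - b - c
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        return 1.0  # a constant variable carries no information
    chi2 = n * (a * d - b * c) ** 2 / denom
    # Survival function of a chi-square with 1 df: erfc(sqrt(chi2 / 2)).
    return math.erfc(math.sqrt(chi2 / 2))

ALPHA = 0.05  # hypothetical threshold, chosen for the example

# Hypothetical data: Y = 30-day readmission for 200 stays (50 readmitted).
y = [1] * 50 + [0] * 150
candidates = {
    "age_over_75": [1] * 35 + [0] * 15 + [1] * 45 + [0] * 105,
    "diabetes": [1] * 30 + [0] * 20 + [1] * 30 + [0] * 120,
    "weekend_admission": [1] * 10 + [0] * 40 + [1] * 30 + [0] * 120,
}
# Variables the user already declared as known or assumed to be related to Y.
known_related = {"age_over_75"}

p_values = {name: chi2_pvalue_2x2(values, y) for name, values in candidates.items()}
adjustment = [
    name for name, p in p_values.items()
    if p < ALPHA and name not in known_related
]
print(adjustment)
```

The resulting list is only a suggestion: as described above, variables that are statistically related to Y but not relevant should still be deselected by hand.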

## Check that the conditions are met

This verification step is essential. It relies on automatic detection mechanisms (for example, the residuals normality check) and on manual checks (for example, checking that the relationship between X and Y is linear, or that the proportional hazards assumption holds).
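As an illustration of what an automatic residuals normality check can look like, here is a minimal sketch using the Jarque-Bera statistic on the residuals of a simple linear regression. The choice of test is an assumption made for this example (the text does not say which test pvalue.io uses), and the data are simulated.

```python
import math
import random

def fit_line(xs, ys):
    """Ordinary least squares with one predictor; returns (intercept, slope)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

def jarque_bera(residuals):
    """Jarque-Bera statistic and its asymptotic p-value (chi-square, 2 df)."""
    n = len(residuals)
    mean = sum(residuals) / n
    m2 = sum((r - mean) ** 2 for r in residuals) / n
    m3 = sum((r - mean) ** 3 for r in residuals) / n
    m4 = sum((r - mean) ** 4 for r in residuals) / n
    skewness = m3 / m2 ** 1.5
    kurtosis = m4 / m2 ** 2
    jb = n / 6 * (skewness ** 2 + (kurtosis - 3) ** 2 / 4)
    # Survival function of a chi-square with 2 df is exp(-x / 2).
    return jb, math.exp(-jb / 2)

# Simulated data: a linear relationship with Gaussian noise (seeded for repeatability).
rng = random.Random(0)
xs = [i / 20 for i in range(200)]
ys = [1.0 + 2.0 * x + rng.gauss(0, 1) for x in xs]

intercept, slope = fit_line(xs, ys)
residuals = [y - (intercept + slope * x) for x, y in zip(xs, ys)]
jb, p_value = jarque_bera(residuals)
print(f"JB = {jb:.2f}, p = {p_value:.3f}")
```

A small p-value suggests that the residuals depart from normality. The p-value here is asymptotic, so it is unreliable on small samples, and a visual check such as a Q-Q plot remains a useful complement.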

## Check the robustness of the model

After a multivariable analysis, it is necessary to check the robustness of the model by removing the most influential variables from the statistical model. This procedure has not yet been implemented.
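Since this step is not implemented, the following is only one possible reading of it, sketched in plain Python: refit the model with each adjustment variable removed in turn and look at how much the coefficient of X moves. The helper `ols`, the variable names `z1` and `z2`, and the noise-free data are all hypothetical, and a linear model is used purely to keep the example short.

```python
def ols(columns, y):
    """Least-squares coefficients [intercept, b1, b2, ...] via the normal equations."""
    n = len(y)
    rows = [[1.0] + [col[i] for col in columns] for i in range(n)]
    k = len(rows[0])
    # Augmented normal-equation matrix [X'X | X'y].
    a = [
        [sum(r[i] * r[j] for r in rows) for j in range(k)]
        + [sum(r[i] * yi for r, yi in zip(rows, y))]
        for i in range(k)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda idx: abs(a[idx][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for idx in range(k):
            if idx != col:
                factor = a[idx][col] / a[col][col]
                a[idx] = [v - factor * w for v, w in zip(a[idx], a[col])]
    return [a[i][k] / a[i][i] for i in range(k)]

# Hypothetical, noise-free data: y = 1 + 2*x + 3*z1, with z1 correlated with x.
x = [0, 0, 0, 0, 1, 1, 1, 1]
covariates = {
    "z1": [0, 0, 1, 0, 1, 1, 0, 1],
    "z2": [0, 1, 0, 1, 0, 1, 0, 1],
}
y = [1 + 2 * xi + 3 * zi for xi, zi in zip(x, covariates["z1"])]

full = ols([x] + list(covariates.values()), y)
shift = {}
for dropped in covariates:
    kept = [values for name, values in covariates.items() if name != dropped]
    reduced = ols([x] + kept, y)
    # Index 1 is the coefficient of X (index 0 is the intercept).
    shift[dropped] = reduced[1] - full[1]

for name, delta in shift.items():
    print(f"Dropping {name} changes the coefficient of X by {delta:+.2f}")
```

If removing a single variable changes the estimate for X substantially, the conclusion depends heavily on that variable and deserves a closer look.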