# 03 Jul How to perform a multivariable analysis when you have too few subjects

It is sometimes surprising to be unable to carry out a multivariable analysis because the number of subjects is too small, even though the file contains several hundred observations (patients, subjects).

## Linear regressions

For linear regressions, i.e. multivariable analyses in which the outcome variable is numerical, it is necessary to have at least 10 observations per explanatory variable.

One subtlety: when an explanatory variable is categorical with N classes, it counts as N-1 variables. For example, take the categorical variable “satisfaction” with the following 5 classes:

- Not at all satisfied
- Somewhat dissatisfied
- Moderately satisfied
- Somewhat satisfied
- Very satisfied

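When this variable is used in a statistical model, it is automatically recoded into 4 binary variables, each of which is 0 or 1. The remaining class (here “Not at all satisfied”) serves as the reference and is coded 0 on all four.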

| Satisfaction | Very satisfied | Somewhat satisfied | Moderately satisfied | Somewhat dissatisfied |
| --- | --- | --- | --- | --- |
| Very satisfied | 1 | 0 | 0 | 0 |
| Somewhat satisfied | 0 | 1 | 0 | 0 |
| Moderately satisfied | 0 | 0 | 1 | 0 |
| Somewhat dissatisfied | 0 | 0 | 0 | 1 |
| Not at all satisfied | 0 | 0 | 0 | 0 |
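
The recoding and the resulting variable count can be sketched in a few lines of Python. This is only an illustration of the rule described above, not what any particular statistical package does internally. The function names (`dummy_code`, `variables_used`, `min_observations`) and the `age` predictor are hypothetical, and the threshold of 10 observations per variable is the rule of thumb from the text.

```python
# Classes of the "satisfaction" variable; the last one is used as the reference.
LEVELS = [
    "Very satisfied",
    "Somewhat satisfied",
    "Moderately satisfied",
    "Somewhat dissatisfied",
    "Not at all satisfied",
]


def dummy_code(value, levels, reference):
    """Return the N-1 binary indicators (0 or 1) for one observation."""
    if value not in levels or reference not in levels:
        raise ValueError("value and reference must both be known levels")
    # One indicator per non-reference class; the reference class is all zeros.
    return [1 if value == level else 0 for level in levels if level != reference]


def variables_used(predictors):
    """Count model variables: 1 per numerical predictor, N-1 per categorical one.

    `predictors` maps a name to its number of classes (None if numerical).
    """
    return sum(1 if n is None else n - 1 for n in predictors.values())


def min_observations(predictors, per_variable=10):
    """Minimum number of observations under the 10-per-variable rule of thumb."""
    return per_variable * variables_used(predictors)


# Hypothetical model: one numerical predictor plus the 5-class satisfaction variable.
predictors = {"age": None, "satisfaction": 5}

print(dummy_code("Moderately satisfied", LEVELS, "Not at all satisfied"))  # [0, 0, 1, 0]
print(variables_used(predictors))    # 5
print(min_observations(predictors))  # 50
```

In this sketch, a model with one numerical predictor and the 5-class satisfaction variable uses 5 variables, so it would need at least 50 observations.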

## Logistic regressions and survival analyses

For logistic regressions and survival analyses, i.e. when the outcome variable is binary, the rule is slightly more complex. There must still be at least 10 observations per variable, but the count is not based on the total number of subjects. It is based on the number of subjects for whom the outcome variable is 0 and on the number for whom it is 1, so the smaller of the two groups sets the limit.

Thus, if the 179 subjects are distributed as 29 patients with Y = 0 and 150 with Y = 1, the smaller group (29) sets the limit: 29 / 10 = 2.9, so the maximum number of explanatory variables is 2.
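
The same arithmetic can be written as a small helper. This is a minimal sketch of the rule of thumb above; the function name `max_explanatory_variables` and its parameters are hypothetical, and the default of 10 observations per variable comes from the text.

```python
def max_explanatory_variables(n_outcome_0, n_outcome_1, per_variable=10):
    """Maximum number of explanatory variables for a binary outcome.

    The limit is set by the smaller of the two outcome groups,
    not by the total number of subjects.
    """
    limiting_group = min(n_outcome_0, n_outcome_1)
    # Integer division rounds down: 29 subjects allow 2 variables, not 2.9.
    return limiting_group // per_variable


# Example from the text: 29 patients with Y = 0 and 150 with Y = 1.
print(max_explanatory_variables(29, 150))  # 2
```

Remember that a categorical variable with N classes uses up N-1 of these variables, so with a limit of 2 a single 5-class variable would already exceed it.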
