Coefficient of Determination (r²)
The coefficient of determination (r²) measures how well a linear model fits a set of data. It represents the proportion of variance in the dependent variable that is explained by the independent variables. For a least-squares fit that includes an intercept term, it lies between 0 and 1.
Formula:
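r² = 1 - SSres / SSTot
where:
- r²: Coefficient of determination
- SSres: Residual sum of squares, Σ (y_i - ŷ_i)², where ŷ_i is the model's prediction for observation i
- SSTot: Total sum of squares, Σ (y_i - ȳ)², where ȳ is the mean of the observed values
- n: Number of observations; both sums run over i = 1, ..., n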
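As a minimal sketch, the formula can be computed directly from the observed values and the model's predictions. The function name `r_squared` and the sample numbers are illustrative assumptions, not part of any particular library.

```python
def r_squared(y_true, y_pred):
    """Return 1 - SSres/SSTot for paired observations and predictions."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    mean_y = sum(y_true) / len(y_true)
    # Residual sum of squares: squared gaps between observations and predictions.
    ss_res = sum((y - p) ** 2 for y, p in zip(y_true, y_pred))
    # Total sum of squares: squared gaps between observations and their mean.
    ss_tot = sum((y - mean_y) ** 2 for y in y_true)
    if ss_tot == 0:
        raise ValueError("r² is undefined when all observed values are equal")
    return 1 - ss_res / ss_tot


# Made-up data for illustration: SSres = 1 and SSTot = 10.
observed = [1.0, 2.0, 3.0, 4.0, 5.0]
predicted = [1.0, 2.0, 3.0, 4.0, 4.0]
print(r_squared(observed, predicted))
```

With these numbers the result is approximately 0.9. The explicit check on SSTot is a design choice in this sketch, since the ratio is undefined when the dependent variable does not vary.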
Interpretation:
- r² = 1: The model perfectly fits the data, explaining all variability in the dependent variable.
- r² = 0: The model does not explain any variability in the dependent variable.
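The two boundary cases can be checked numerically. In this sketch (a hypothetical helper `r_squared` and made-up data), predictions identical to the observations give r² = 1, while a model that always predicts the mean of the observations gives r² = 0.

```python
def r_squared(y_true, y_pred):
    # Hypothetical helper for this sketch: 1 - SSres/SSTot.
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((y - p) ** 2 for y, p in zip(y_true, y_pred))
    ss_tot = sum((y - mean_y) ** 2 for y in y_true)
    return 1 - ss_res / ss_tot


observed = [2.0, 4.0, 6.0, 8.0]
mean_obs = sum(observed) / len(observed)

# Perfect fit: every residual is zero, so SSres = 0.
perfect = r_squared(observed, observed)
# Mean-only model: residuals equal deviations from the mean, so SSres = SSTot.
mean_only = r_squared(observed, [mean_obs] * len(observed))

print(perfect, mean_only)  # prints: 1.0 0.0
```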
Significance:
- A high coefficient of determination indicates a good fit between the model and the data.
- A low coefficient of determination indicates a poor fit.
- The coefficient of determination is an important metric for model evaluation, but should not be the only factor considered.
Example:
An r² of 0.85 means that the model explains 85% of the variance in the dependent variable; the remaining 15% is left unexplained.
Additional Notes:
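- Despite the squared notation, r² computed as 1 - SSres/SSTot can be negative. This happens when the model fits worse than always predicting the mean of the dependent variable, for example a model without an intercept or one evaluated on data it was not fitted to.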
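- For a least-squares fit, r² never decreases when another independent variable is added, so it tends to overstate the fit of models with many independent variables. Adjusted r², which penalizes the number of predictors, is commonly used to compensate.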
- It is important to consider other model evaluation metrics, such as mean squared error (MSE) and root mean squared error (RMSE), in addition to r².
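The following sketch reports MSE and RMSE alongside r². The function name `fit_metrics` and the data are assumptions made for illustration; the predictions deliberately reverse the trend in the observations, so the model does worse than predicting the mean and r² comes out negative.

```python
import math


def fit_metrics(y_true, y_pred):
    """Return (r2, mse, rmse) for paired observations and predictions."""
    n = len(y_true)
    mean_y = sum(y_true) / n
    ss_res = sum((y - p) ** 2 for y, p in zip(y_true, y_pred))
    ss_tot = sum((y - mean_y) ** 2 for y in y_true)
    mse = ss_res / n            # mean squared error
    rmse = math.sqrt(mse)       # same units as the dependent variable
    r2 = 1 - ss_res / ss_tot    # unitless
    return r2, mse, rmse


# Made-up data: SSres = 8 and SSTot = 2, so r2 = 1 - 8/2 = -3.
observed = [1.0, 2.0, 3.0]
predicted = [3.0, 2.0, 1.0]
r2, mse, rmse = fit_metrics(observed, predicted)
print(r2, mse, rmse)
```

Because RMSE is expressed in the units of the dependent variable while r² is unitless, the two answer different questions: how large the typical error is, and how much of the variance is explained.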