Skip to contents

Detect overall deviations from the expected distribution.

Usage

check_residuals(object, n_simulations = 30)

Arguments

object

a model object

n_simulations

number of simulations (defaults to 30)

Value

Invisibly returns the p-value of the test statistics. A p-value < 0.05 indicates a significant deviation from expected distribution.

Details

Misspecifications in GLMs cannot reliably be diagnosed with standard residual plots, and GLMs are thus often not as thoroughly checked as LMs. One reason why GLMs residuals are harder to interpret is that the expected distribution of the data changes with the fitted values. As a result, standard residual plots, when interpreted in the same way as for linear models, seem to show all kind of problems, such as non-normality, heteroscedasticity, even if the model is correctly specified. check_residuals() aims at solving these problems by creating readily interpretable residuals for GLMs that are standardized to values between 0 and 1, and that can be interpreted as intuitively as residuals for the linear model. This is achieved by a simulation-based approach, similar to the Bayesian p-value or the parametric bootstrap, that transforms the residuals to a standardized scale. This explanation is adopted from DHARMa::simulateResiduals().

It might happen that in the fitted model for a data point all simulations have the same value (e.g. zero), this returns the error message Error in approxfun: need at least two non-NA values to interpolate*. If that is the case, it could help to increase the number of simulations.

References

Dunn, K. P., and Smyth, G. K. (1996). Randomized quantile residuals. Journal of Computational and Graphical Statistics 5, 1-10.

Gelman, A. & Hill, J. Data analysis using regression and multilevel/hierarchical models Cambridge University Press, 2006

Hartig, F. (2020). DHARMa: Residual Diagnostics for Hierarchical (Multi-Level / Mixed) Regression Models. R package version 0.3.0. https://CRAN.R-project.org/package=DHARMa

Author

Martin Haringa

Examples

if (FALSE) { # \dontrun{
m1 <- glm(nclaims ~ area, offset = log(exposure), family = poisson(),
data = MTPL2)
check_residuals(m1, n_simulations = 50) |> autoplot()
} # }