Extract residuals, fitted values, or the spatial trend — residuals.geostan

Extract model residuals, fitted values, or spatial trend from a fitted geostan_fit model.

# S3 method for geostan_fit
residuals(object, summary = TRUE, rates = TRUE, detrend = TRUE, ...)

# S3 method for geostan_fit
fitted(object, summary = TRUE, rates = TRUE, trend = TRUE, ...)

spatial(object, summary = TRUE, ...)

# S3 method for geostan_fit
spatial(object, summary = TRUE, ...)

Arguments

object: A fitted model object of class geostan_fit.
summary: Logical; should the values be summarized by their mean, standard deviation, and quantiles (probs = c(.025, .2, .5, .8, .975)) for each observation? Otherwise, a matrix containing samples from the posterior distributions is returned.
rates: For Poisson and Binomial models, should the fitted values be returned as rates, as opposed to raw counts? Defaults to TRUE; see the Details section for more information.
detrend: For auto-normal models (CAR and SAR models with Gaussian likelihood only); if detrend = FALSE, the residuals will equal y - Mu, where Mu does not contain the implicit spatial trend. If detrend = TRUE (the default), the residuals will equal Resid = y - Mu - Trend. For CAR models and for SAR models of type SEM, the implicit spatial trend is Trend = rho * C %*% (Y - Mu) (see stan_car or stan_sar). For SAR models of type SLM, the spatial trend is rho * C * y.
...: Not used
trend: For auto-normal models (CAR and SAR models with Gaussian likelihood only); if trend = FALSE, the fitted values will not include the implicit spatial trend term. Rather, the fitted values will equal Mu: the intercept plus any other terms explicitly included in the model formula and the re argument ('random effects'). If trend = TRUE (the default), fitted = Mu + Trend. For CAR models and SAR models of type SEM, the spatial trend is rho * C (y - Mu). For SAR models of type SLM, the spatial trend is rho * C * y.

Value

By default, these methods return a data.frame. The column named mean is what most users will be looking for. These contain the fitted values (for the fitted method), the residuals (fitted values minus observed values, for the resid method), or the spatial trend (for the spatial method). The mean column is the posterior mean of each value, and the column sd contains the posterior standard deviation for each value. The posterior distributions are also summarized by select quantiles.

If summary = FALSE then the method returns an S-by-N matrix of MCMC samples, where S is the number of MCMC samples and N is the number of observations in the data.

For spatial GLMs, the spatial method returns values on the scale of the linear predictor. Hence, for a Poisson with log link function, the spatial trend is on the log scale (in the same way that the intercept and coefficients are on the log scale).

Details

When rates = FALSE and the model is Poisson or Binomial, the fitted values returned by the fitted method are the expected value of the response variable. The rates argument is used to translate count outcomes to rates by dividing by the appropriate denominator. The behavior of the rates argument depends on the model specification. Consider a Poisson model of disease incidence, such as the following intercept-only case:

fit <- stan_glm(y ~ offset(log(E)),
               data = data,
               family = poisson())

If the fitted values are extracted using rates = FALSE, then fitted(fit) will return the expectation of \(y\). If rates = TRUE (the default), then fitted(fit) will return the expected value of the rate \(\frac{y}{E}\).

If a binomial model is used instead of the Poisson, then using rates = TRUE with the fitted method will return the expectation of \(\frac{y}{N}\) where \(y\) is the number of 'successes' and \(N\) is the sum of the number of 'successes' and 'failures', as in:

fit <- stan_glm(cbind(successes, failures) ~ 1,
               data = data,
               family = binomial())

Examples

sar_list <- prep_sar_data2(row = 10, col = 10, quiet = TRUE)
W <- sar_list$W
x <- sim_sar(w = W, rho = 0.8)
y <- sim_sar(w = W, rho = 0.7, mu = 1 + 0.5 * x)
dat <- data.frame(x = x, y = y)

fit <- stan_sar(y ~ x, data = dat, sar = sar_list,
                    chains = 1, iter = 800)

# Residuals
r = resid(fit)
head(r)
moran_plot(r$mean, W)

# when residuals are not detrended, they exhibit spatial autocorrelation
r2 = resid(fit, detrend = FALSE)
mc(r$mean, W)
mc(r2$mean, W)

r_mat <- resid(fit, summary = FALSE)
r_mean <- apply(r_mat, 2, mean)
head(r_mean)
mean(r_mean)

# Fitted values
f <- fitted(fit)
head(f)

# Spatial trend
sp_trend  <- spatial(fit)
head(sp_trend)

# another way to obtain detrended residuals 
sp <- spatial(fit, summary = FALSE)
r <- resid(fit, detrend = FALSE, summary = FALSE)
# detrended residuals (matrix of samples):
r_det <- r - sp
# summarize (take posterior means):
r_det <- apply(r_det, 2, mean)
r <- apply(r, 2, mean)
# Moran coefficients:
mc(r_det, W)
mc(r, W)