R/geostan_fit-methods.R
resid_geostan_fit.Rd
Extract model residuals, fitted values, or spatial trend from a fitted geostan_fit
model.
A fitted model object of class geostan_fit
.
Logical; should the values be summarized by their mean, standard deviation, and quantiles (probs = c(.025, .2, .5, .8, .975)
) for each observation? Otherwise, a matrix containing samples from the posterior distributions is returned.
For Poisson and Binomial models, should the fitted values be returned as rates, as opposed to raw counts? Defaults to TRUE
; see the Details
section for more information.
For auto-normal models (CAR and SAR models with Gaussian likelihood only); if detrend = TRUE
, the implicit spatial trend will be removed from the residuals. The implicit spatial trend is Trend = rho * C %*% (Y - Mu)
(see stan_car or stan_sar). I.e., resid = Y - (Mu + Trend)
.
Not used
For auto-normal models (CAR and SAR models with Gaussian likelihood only); if trend = TRUE
, the fitted values will include the implicit spatial trend term. The implicit spatial trend is Trend = rho * C %*% (Y - Mu)
(see stan_car or stan_sar). I.e., if trend = TRUE
, fitted = Mu + Trend
.
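The formulas given for detrend and trend can be checked by hand with base R. The sketch below uses a made-up three-area adjacency matrix and made-up values for rho, Y, and Mu (in a fitted model, rho and Mu would be posterior draws); it does not call geostan and is only meant to show how the pieces relate.

```r
# Toy illustration of the implicit spatial trend in an auto-normal model.
# All numbers are hypothetical; nothing here comes from a fitted model.
C <- matrix(c(0, 1, 0,
              1, 0, 1,
              0, 1, 0), nrow = 3, byrow = TRUE)  # binary adjacency, 3 areas in a row
rho <- 0.5            # spatial autocorrelation parameter (assumed value)
Y   <- c(10, 12, 9)   # observed outcomes
Mu  <- c(9, 11, 10)   # mean from the regression part of the model

# Trend = rho * C %*% (Y - Mu)
Trend <- as.numeric(rho * C %*% (Y - Mu))

fitted_with_trend <- Mu + Trend        # what trend = TRUE describes
resid_detrended   <- Y - (Mu + Trend)  # what detrend = TRUE describes
resid_raw         <- Y - Mu            # residual with the trend left in

print(data.frame(Y, Mu, Trend, fitted_with_trend, resid_detrended, resid_raw))
```

By construction, fitted_with_trend plus resid_detrended reproduces Y exactly.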
By default, these methods return a data.frame
. The column named mean
is what most users will be looking for. It contains the fitted values (for the fitted
method), the residuals (observed values minus fitted values, for the resid
method), or the spatial trend (for the spatial
method). The mean
column is the posterior mean of each value, and the column sd
contains the posterior standard deviation for each value. The posterior distributions are also summarized by selected quantiles (including 2.5% and 97.5%).
If summary = FALSE
then the method returns an S-by-N matrix of MCMC samples, where S is the number of MCMC samples and N is the number of observations in the data.
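To make the relationship between the two return types concrete, the sketch below summarizes an S-by-N matrix of draws by column, using the same probs as the summary argument. The matrix is simulated with rnorm rather than taken from a fitted model, and the column names of the result are an assumption; the names geostan actually returns may differ.

```r
# Summarize an S-by-N matrix of posterior draws, one row per observation.
# 'samples' is simulated here; in practice it would be the matrix returned
# when summary = FALSE.
set.seed(1)
S <- 200  # number of MCMC samples (hypothetical)
N <- 4    # number of observations (hypothetical)
samples <- matrix(rnorm(S * N, mean = rep(1:N, each = S)), nrow = S, ncol = N)

probs <- c(.025, .2, .5, .8, .975)
post_summary <- data.frame(
  mean = colMeans(samples),
  sd   = apply(samples, 2, sd),
  t(apply(samples, 2, quantile, probs = probs)),  # one row per observation
  check.names = FALSE                             # keep names such as "2.5%"
)
print(post_summary)
```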
When rates = FALSE
and the model is Poisson or Binomial, the fitted values returned by the fitted
method are the expected value of the response variable. The rates
argument is used to translate count outcomes to rates by dividing by the appropriate denominator. The behavior of the rates
argument depends on the model specification. Consider a Poisson model of disease incidence, such as the following intercept-only case:
fit <- stan_glm(y ~ offset(log(E)),
                data = data,
                family = poisson())
If the fitted values are extracted using rates = FALSE
, then fitted(fit)
will return the expectation of \(y\). If rates = TRUE
(the default), then fitted(fit)
will return the expected value of the rate \(\frac{y}{E}\).
If a binomial model is used instead of the Poisson, then using rates = TRUE
will return the expectation of \(\frac{y}{N}\) where \(N\) is the sum of the number of 'successes' and 'failures', as in:
fit <- stan_glm(cbind(successes, failures) ~ 1,
                data = data,
                family = binomial())
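The arithmetic linking rates and counts in the two cases can be sketched with base R. All numbers below are invented for illustration, and the variable names (rate, p, expected_count, expected_successes) are hypothetical rather than anything returned by geostan.

```r
# Poisson case: the denominator is the offset term E.
E    <- c(100, 250, 80)        # e.g., population at risk (made-up values)
rate <- c(0.02, 0.01, 0.05)    # expected rate y / E, as with rates = TRUE
expected_count <- rate * E     # expected value of y, as with rates = FALSE

# Binomial case: the denominator is N = successes + failures.
successes <- c(3, 10, 7)
failures  <- c(17, 40, 13)
N <- successes + failures
p <- c(0.2, 0.25, 0.3)         # expected proportion y / N, as with rates = TRUE
expected_successes <- p * N    # expected number of successes

print(data.frame(E, rate, expected_count))
print(data.frame(N, p, expected_successes))
```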
# \donttest{
data(georgia)
C <- shape2mat(georgia, "B")
fit <- stan_esf(deaths.male ~ offset(log(pop.at.risk.male)),
                C = C,
                re = ~ GEOID,
                data = georgia,
                family = poisson(),
                chains = 1, iter = 600) # for speed only
# Residuals
r <- resid(fit)
head(r)
moran_plot(r$mean, C)
# Fitted values
f <- fitted(fit)
head(f)
f2 <- fitted(fit, rates = FALSE)
head(f2)
# Spatial trend
esf <- spatial(fit)
head(esf)
# }