Visual diagnostics for areal data and model residuals

sp_diag(y, shape, ...)

# S3 method for geostan_fit
sp_diag(
  y,
  shape,
  name = "Residual",
  plot = TRUE,
  mc_style = c("scatter", "hist"),
  style = c("W", "B"),
  w,
  rates = TRUE,
  binwidth = function(x) 0.5 * stats::sd(x, na.rm = TRUE),
  size = 0.1,
  ...
)

# S3 method for numeric
sp_diag(
  y,
  shape,
  name = "y",
  plot = TRUE,
  mc_style = c("scatter", "hist"),
  style = c("W", "B"),
  w = shape2mat(shape, match.arg(style)),
  binwidth = function(x) 0.5 * stats::sd(x, na.rm = TRUE),
  ...
)

Arguments

y

A numeric vector, or a fitted geostan model (class geostan_fit).

shape

An object of class sf or another spatial object coercible to sf with sf::st_as_sf such as SpatialPolygonsDataFrame.

...

Additional arguments passed to residuals.geostan_fit. For binomial and Poisson models, this includes the option to view the outcome variable as a rate (the default) rather than a count; for stan_car models with auto-Gaussian likelihood (fit$family$family = "auto_gaussian"), the residuals will be detrended by default, but this can be changed using detrend = FALSE`.

name

The name to use on the plot labels; default to "y" or, if y is a geostan_fit object, to "Residuals".

plot

If FALSE, return a list of gg plots.

mc_style

Character string indicating how to plot the residual Moran coefficient (only used if y is a fitted model): if mc = "scatter", then moran_plot will be used with the marginal residuals; if mc = "hist", then a histogram of Moran coefficient values will be returned, where each plotted value represents the degree of residual autocorrelation in a draw from the join posterior distribution of model parameters.

style

Style of connectivity matrix; if w is not provided, style is passed to shape2mat and defaults to "W" for row-standardized.

w

An optional spatial connectivity matrix; if not provided and y is a numeric vector, one will be created using shape2mat. If w is not provided and y is a fitted geostan model, then the spatial connectivity matrix that is stored with the fitted model (y$C) will be used.

rates

For Poisson and binomial models, convert the outcome variable to a rate before plotting fitted values and residuals. Defaults to rates = TRUE.

binwidth

A function with a single argument that will be passed to the binwidth argument in geom_histogram. The default is to set the width of bins to 0.5 * sd(x).

size

Point size and linewidth for point-interval plot of observed vs. fitted values (passed to geom_pointrange).

Value

A grid of spatial diagnostic plots. If plot = TRUE, the ggplots are drawn using grid.arrange; otherwise, they are returned in a list. For the geostan_fit method, the underlying data for the Moran coefficient (as required for mc_style = "hist") will also be returned if plot = FALSE.

Details

When provided with a numeric vector, this function plots a histogram, Moran scatter plot, and map.

When provided with a fitted geostan model, the function returns a point-interval plot of observed values against fitted values (mean and 95 percent credible interval), either a Moran scatter plot of residuals or a histogram of Moran coefficient values calculated from the joint posterior distribution of the residuals, and a map of the mean posterior residuals (means of the marginal distributions).

When y is a fitted CAR or SAR model with family = auto_gaussian(), the fitted values will include implicit spatial trend term, i.e. the call to fitted.geostan_fit will use the default trend = TRUE and the call to residuals.geostan_fit will use the default detrend = TRUE. (See stan_car or stan_sar for additional details on their implicit spatial trend components.)

See also

Examples

# \donttest{
data(georgia)
sp_diag(georgia$college, georgia)

bin_fn <- function(y) mad(y, na.rm = TRUE)
sp_diag(georgia$college, georgia, binwidth = bin_fn)

fit <- stan_glm(log(rate.male) ~ log(income),
                data = georgia,
                chains = 2, iter = 800) # for speed only
sp_diag(fit, georgia)
# }