Obtain predicted values from a fitted model by providing new covariate values.
A fitted model object of class geostan_fit
.
A data frame in which to look for variables with which to predict, presumably for the purpose of viewing marginal effects. Note that if the model formula includes an offset term, newdata
must contain the offset. Note also that any spatially-lagged covariate terms will be ignored if they were provided using the slx
argument. If covariates in the model were centered using the centerx
argument, the predict.geostan_fit
method will automatically center the predictors in newdata
using the values stored in object$x_center
. If newdata
is missing, the fitted values of the model will be returned.
An N-by-1 matrix of MCMC samples for the intercept; this is provided by default. However, this argument might be used if there is a need to incorporate the spatial trend term, in which case it may be thought of as a spatially-varying intercept. If used, note that the intercept needs to be provided on the scale of the linear predictor.
Optional vector of numeric values or a logical scalar to pass to scale
. Defaults to using object$x_center
. If the model was fit using centerx = TRUE
, then covariates were centered and their mean values are stored in object$x_center
and the predict
method will use them to automatically center newdata
; if the model was fit with centerx = FALSE
, then object$x_center = FALSE
and newdata
will not be centered.
If FALSE
, a matrix containing samples from the posterior distribution at each observation is returned. The default, TRUE
, will summarize results by providing an estimate (mean) and credible interval (formed by taking quantiles of the MCMC samples).
By default, results from predict
are on the scale of the linear predictor (type = "link")
). The alternative (type = "response"
) is on the scale of the response variable. For example, the default return values for a Poisson model on the log scale, and using type = "response"
will return the original scale of the outcome variable (by exponentiating the log values).
Not used
If summary = FALSE
, a matrix of samples is returned. If summary = TRUE
(the default), a data frame is returned.
The purpose of the predict method is to explore marginal effects of (combinations of) covariates.
The model formula will be taken from object$formula
, and then a model matrix will be created by passing newdata
to the model.frame function (as in: model.frame(newdata, object$formula
). Parameters are taken from as.matrix(object, pars = c("intercept", "beta"))
.
Be aware that in generalized linear models (such as Poisson and Binomial models) marginal effects plots on the response scale may be sensitive to the level of other covariates in the model. If the model includes any spatially-lagged covariates (introduced using the slx
argument) or a spatial autocorrelation term (for example, you used a spatial CAR, SAR, or ESF model), these terms will essentially be fixed at zero for the purposes of calculating marginal effects. If you want to change this, you can introduce spatial trend values by specifying a varying intercept using the alpha
argument.
data(georgia)
georgia$income <- georgia$income / 1e3
fit <- stan_glm(deaths.male ~ offset(log(pop.at.risk.male)) + log(income),
data = georgia,
centerx = TRUE,
family = poisson(),
chains = 2, iter = 600) # for speed only
# note: pop.at.risk.male=1 leads to log(pop.at.risk.male)=0
# so that the predicted values are rates
newdata <- data.frame(
income = seq(min(georgia$income),
max(georgia$income),
length.out = 100),
pop.at.risk.male = 1)
preds <- predict(fit, newdata, type = "response")
head(preds)
plot(preds$income,
preds$mean * 10e3,
type = "l",
ylab = "Deaths per 10,000",
xlab = "Income ($1,000s)")
# here the predictions are rates per 10,000
newdata$pop.at.risk.male <- 10e3
preds <- predict(fit, newdata, type = "response")
head(preds)
plot(preds$income,
preds$mean,
type = "l",
ylab = "Deaths per 10,000",
xlab = "Income ($1,000s)")