Theil's entropy-based index of inequality
theil(x)
theil2(Count, Population, rates, total = TRUE)
# S3 method for surveil
theil(x)
# S3 method for list
theil(x)
Conceicao, P. and P. Ferreira (2000). The young person's guide to the Theil Index: Suggesting intuitive interpretations and exploring analytical applications. University of Texas Inequality Project. UTIP Working Paper Number 14. Accessed May 1, 2021 from https://utip.gov.utexas.edu/papers.html
Conceicao, P, Galbraith, JK, Bradford, P. (2001). The Theil Index in sequences of nested and hierarchic grouping structures: implications for the measurement of inequality through time, with data aggregated at different levels of industrial classification. Eastern Economic Journal. 27(4): 491-514.
Theil, Henri (1972). Statistical Decomposition Analysis. Amsterdam, The Netherlands and London, UK: North-Holland Publishing Company.
Shannon, Claude E. and Weaver, Warren (1963). The Mathematical Theory of Communication. Urbana and Chicago, USA: University of Illinois Press.
x: A fitted surveil model, from stan_rw; or, a list of fitted surveil models, where each model represents a different geographic area (e.g., states).
Count: Case counts, integers
Population: Population at risk, integers
rates: If Count is not provided, then rates must be provided (Count = rates * Population).
total: If total = TRUE, Theil's index will be returned. Each unit contributes to Theil's index; if total = FALSE, all of the elements that sum to Theil's index will be returned.
If total = TRUE (the default), theil2 returns Theil's index as a numeric value. Else, theil2 returns a vector of values that sum to Theil's index.
A named list with the following elements:
A data.frame summarizing the posterior probability distribution for Theil's T, including the mean and 95 percent credible interval for each time period
A data.frame with MCMC samples for Theil's T
A list (also of class theil_list) containing a summary data frame and a tbl_df containing MCMC samples for Theil's index at each time period.
The summary data frame includes the following columns:
time period
Posterior mean for Theil's index; equal to the sum of Theil_between and Theil_within.
The between-areas component to Theil's inequality index
The within-areas component to Theil's inequality index
Additional columns contain the upper and lower limits of the 95 percent credible intervals for each component of Theil's index.
The data frame of samples contains the following columns:
Time period indicator
An id for each MCMC sample; note that samples are from the joint distribution
The between-geographies component of Theil's index
The within-geographies component of Theil's index
Theil's inequality index (T = Between + Within)
Theil's index is a useful measure of inequality in disease and mortality burdens when multiple groups are being considered. It provides a summary measure of inequality across a set of demographic groups that may be tracked over time (and/or space). The index is also additive, so it admits simple decompositions.
The index measures discrepancies between a population's share of the disease burden, omega, and its share of the population, eta. A situation of zero inequality would imply that each population's share of cases is equal to its population share, or, omega = eta. Each population's contribution to total inequality is calculated as:
T_i = omega_i * [log(omega_i / eta_i)],
the log-ratio of case-share to population-share, weighted by their share of cases. Theil's index for all areas is the sum of each area's T_i:
T = sum_(i=1)^n T_i.
Theil's T is thus a weighted mean of log-ratios of case shares to population shares, where each log-ratio (which we may describe as a raw inequality score) is weighted by its share of total cases. The index has a minimum of zero and a maximum of log(N), where N is the number of units (e.g., number of states).
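The calculation described above can be sketched in a few lines of base R. This is a minimal illustration, not the package's implementation: the helper name theil_sketch and the example data are hypothetical, and the treatment of units with zero cases (a contribution of zero) is an assumption.

```r
# Hypothetical helper (not part of surveil): Theil's T from counts and populations.
theil_sketch <- function(count, population, total = TRUE) {
  stopifnot(length(count) == length(population),
            all(population > 0), all(count >= 0))
  omega <- count / sum(count)            # each unit's share of cases
  eta   <- population / sum(population)  # each unit's share of the population
  # Assumption: a unit with zero cases contributes zero,
  # the limit of x * log(x) as x approaches 0.
  Ti <- ifelse(omega > 0, omega * log(omega / eta), 0)
  if (total) sum(Ti) else Ti
}

# Illustrative (made-up) data for three units of equal population
count      <- c(10, 20, 70)
population <- c(1000, 1000, 1000)

theil_sketch(count, population)                 # Theil's T
theil_sketch(count, population, total = FALSE)  # per-unit contributions T_i

# Zero inequality: case shares equal population shares, so T is zero
theil_sketch(c(10, 20, 30), c(100, 200, 300))
```

With total = FALSE the sketch returns the individual T_i values, which sum to T. Individual T_i values can be negative (where a unit's case share is below its population share), while their sum is never negative.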
Theil's index, which is based on Shannon's information theory, can be extended to measure inequality across multiple groups nested within non-overlapping geographies (e.g., states).
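The nested case can be illustrated with the standard between/within decomposition of Theil's T. The sketch below is an assumption about how such a decomposition may be computed, not a description of the package's internals. The helper name theil_decomp_sketch, the area labels, and the data are hypothetical, and all counts are assumed to be positive.

```r
# Hypothetical helper (not part of surveil): between/within decomposition.
# Each element of count/population is one demographic group in one area.
theil_decomp_sketch <- function(count, population, area) {
  omega <- count / sum(count)            # share of all cases
  eta   <- population / sum(population)  # share of total population
  # Area-level shares of cases and population
  omega_a <- tapply(omega, area, sum)
  eta_a   <- tapply(eta, area, sum)
  # Between-areas component: Theil's T computed on area totals
  between <- sum(omega_a * log(omega_a / eta_a))
  # Within each area: Theil's T across groups, with shares taken
  # relative to that area's own totals
  within_a <- sapply(names(omega_a), function(a) {
    idx <- area == a
    w <- omega[idx] / omega_a[[a]]
    e <- eta[idx] / eta_a[[a]]
    sum(w * log(w / e))
  })
  # Within-areas component: area-level values weighted by case share
  within <- sum(omega_a * within_a)
  c(between = between, within = within, total = between + within)
}

# Illustrative (made-up) data: two groups in each of two areas
count      <- c(30, 10, 20, 40)
population <- c(500, 500, 400, 600)
area       <- c("A", "A", "B", "B")

res <- theil_decomp_sketch(count, population, area)
res

# The two components sum to Theil's T computed over all four units directly
omega <- count / sum(count)
eta   <- population / sum(population)
all.equal(res[["total"]], sum(omega * log(omega / eta)))
```

The final check reflects the additive property: the between-areas and within-areas components add up to the index computed over all units at once.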