balance {polymatching} R Documentation

Evaluating the Balance of Covariates After Matching

Description

The function balance computes the standardized mean differences and the ratio of the variances among treatment groups, before and after matching. The function computes the two measures of balance for each pair of treatment groups.
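As a rough illustration of the two measures for a continuous covariate and a single pair of groups, the sketch below uses one common definition of the standardized mean difference: the difference in group means divided by the square root of the average of the two group variances. This denominator is an assumption, and balance may use a different one. The helpers smd_pair and var_ratio_pair are hypothetical and are not part of polymatching.

```r
# Hypothetical helpers, not part of polymatching.
# Standardized mean difference between groups a and b
# (assumed denominator: square root of the average of the two variances).
smd_pair <- function(x, g, a, b) {
  xa <- x[g == a]
  xb <- x[g == b]
  (mean(xa) - mean(xb)) / sqrt((var(xa) + var(xb)) / 2)
}

# Ratio of the variances of x in groups a and b.
var_ratio_pair <- function(x, g, a, b) {
  var(x[g == a]) / var(x[g == b])
}

# Toy data: two groups with three units each.
x <- c(1, 2, 3, 2, 4, 6)
g <- rep(c("A", "B"), each = 3)

smd_AB <- smd_pair(x, g, "A", "B")
ratio_AB <- var_ratio_pair(x, g, "A", "B")
```

With more than two groups, the same helpers could be applied to every pair returned by combn(unique(g), 2).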

Usage

balance(
  formulaBalance,
  match_id,
  data,
  weights_before = NULL,
  weights_after = NULL
)

Arguments

formulaBalance

Formula with form group ~ x_1 + ... + x_p. group is the variable identifying the treatment groups/exposures. The balance is evaluated for the covariates x_1,...,x_p. Numeric and integer variables are treated as continuous. Factor variables are treated as categorical. Factor variables with two levels are treated as binary.

match_id

Vector identifying the matched sets: units in the same matched set must share the same identifier. It is generated by polymatch.
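A quick sanity check on such a vector is to cross-tabulate it against the group variable. The sketch below uses hypothetical toy vectors and assumes that each matched set holds one unit per group and that unmatched units are coded as NA; neither assumption is confirmed by this page.

```r
# Hypothetical toy data: three groups, two matched sets, one unmatched unit.
group    <- c("A", "A", "B", "B", "C", "C", "C")
match_id <- c(1, 2, 1, 2, 1, 2, NA)  # NA for the unmatched unit is an assumption

# Count units per matched set and group; table() drops NA by default.
set_sizes <- table(match_id, group)

# TRUE if every matched set contains exactly one unit from each group.
one_per_group <- all(set_sizes == 1)
```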

data

The data.frame object with the data.

weights_before

Optional vector of weights of the observations to be considered in the unmatched dataset. To compute the unweighted standardized mean differences, set weights_before to NULL (default).

weights_after

Optional vector of weights of the observations to be considered in the matched dataset. To compute the unweighted standardized mean differences, set weights_after to NULL (default).

Value

A data.frame containing the standardized differences and ratios of the variances (only for continuous variables) for each pair of treatment groups. A graphical representation of the results can be generated with plotBalance.

See Also

polymatch to generate matched samples and plotBalance to graphically represent the indicators of balance.

Examples

#Generate a dataset with a group indicator and four variables:
#- var1, continuous, sampled from normal distributions;
#- var2, continuous, sampled from beta distributions;
#- var3, categorical with 4 levels;
#- var4, binary.
set.seed(1234567)
dat <- data.frame(group = c(rep("A",10),rep("B",20),rep("C",30)),
               var1 = c(rnorm(10,mean=0,sd=1),
                        rnorm(20,mean=1,sd=2),
                        rnorm(30,mean=-1,sd=2)),
               var2 = c(rbeta(10,shape1=1,shape2=1),
                        rbeta(20,shape1=2,shape2=1),
                        rbeta(30,shape1=1,shape2=2)),
               var3 = factor(c(rbinom(10,size=3,prob=.4),
                               rbinom(20,size=3,prob=.5),
                               rbinom(30,size=3,prob=.3))),
               var4 = factor(c(rbinom(10,size=1,prob=.5),
                               rbinom(20,size=1,prob=.3),
                               rbinom(30,size=1,prob=.7))))

#Match on propensity score
#-------------------------

#With more than two groups, a multinomial model is needed for the propensity score (PS)
library(VGAM)
psModel <- vglm(group ~ var1 + var2 + var3 + var4,
                family=multinomial, data=dat)
#Estimated logits - 2 for each unit: log(P(group=A)/P(group=C)), log(P(group=B)/P(group=C))
logitPS <- predict(psModel, type = "link")
dat$logit_AvsC <- logitPS[,1]
dat$logit_BvsC <- logitPS[,2]

#Match on logits of PS
resultPs <- polymatch(group ~ logit_AvsC + logit_BvsC, data = dat,
                    distance = "euclidean")
dat$match_id_ps <- resultPs$match_id

#Evaluate balance in covariates
tabBalancePs <- balance(group ~ var1 + var2 + var3 + var4,
                        match_id = dat$match_id_ps, data = dat)
tabBalancePs

#You can also represent the standardized mean differences with 'plotBalance'
#plotBalance(tabBalancePs, ratioVariances = TRUE)


[Package polymatching version 1.0.1]