est_irt {irtQ}    R Documentation

Item parameter estimation using MMLE-EM algorithm

Description

This function fits unidimensional item response theory (IRT) models to mixed-format data comprising both dichotomous and polytomous items, using marginal maximum likelihood estimation via the expectation–maximization (MMLE-EM) algorithm (Bock & Aitkin, 1981). It also supports fixed item parameter calibration (FIPC; Kim, 2006), a practical method for pretest (or newly developed) item calibration in computerized adaptive testing (CAT). FIPC enables the parameter estimates of pretest items to be placed on the same scale as those of operational items (Ban et al., 2001). For dichotomous items, the function supports the one-, two-, and three-parameter logistic models. For polytomous items, it supports the graded response model (GRM) and the (generalized) partial credit model (GPCM).

Usage

est_irt(
  x = NULL,
  data,
  D = 1,
  model = NULL,
  cats = NULL,
  item.id = NULL,
  fix.a.1pl = FALSE,
  fix.a.gpcm = FALSE,
  fix.g = FALSE,
  a.val.1pl = 1,
  a.val.gpcm = 1,
  g.val = 0.2,
  use.aprior = FALSE,
  use.bprior = FALSE,
  use.gprior = TRUE,
  aprior = list(dist = "lnorm", params = c(0, 0.5)),
  bprior = list(dist = "norm", params = c(0, 1)),
  gprior = list(dist = "beta", params = c(5, 16)),
  missing = NA,
  Quadrature = c(49, 6),
  weights = NULL,
  group.mean = 0,
  group.var = 1,
  EmpHist = FALSE,
  use.startval = FALSE,
  Etol = 1e-04,
  MaxE = 500,
  control = list(iter.max = 200),
  fipc = FALSE,
  fipc.method = "MEM",
  fix.loc = NULL,
  fix.id = NULL,
  se = TRUE,
  verbose = TRUE
)

Arguments

x

A data frame containing item metadata. This metadata is required to retrieve essential information for each item (e.g., number of score categories, IRT model type, etc.) necessary for calibration. You can create an empty item metadata frame using the function shape_df().

When use.startval = TRUE, the item parameters specified in the metadata will be used as starting values for parameter estimation. If x = NULL, both model and cats arguments must be specified. Note that when fipc = TRUE to implement FIPC, item metadata for the test form must be supplied via the x argument. See below for more details. Default is NULL.

data

A matrix of examinees' item responses corresponding to the items specified in the x argument. Rows represent examinees and columns represent items.

D

A scaling constant used in IRT models to make the logistic function closely approximate the normal ogive function. A value of 1.7 is commonly used for this purpose. Default is 1.
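As a rough numerical illustration of why 1.7 is the conventional choice, the sketch below compares the logistic curve with the normal ogive over a grid of values. It uses base R only and is not part of irtQ; the variable names are illustrative.

```r
# Grid of latent trait values
z <- seq(-6, 6, by = 0.001)

# Largest absolute gap between the logistic curve and the normal ogive
max_diff_D1  <- max(abs(plogis(1.0 * z) - pnorm(z)))  # no scaling (D = 1)
max_diff_D17 <- max(abs(plogis(1.7 * z) - pnorm(z)))  # with D = 1.7

c(D_1 = max_diff_D1, D_1.7 = max_diff_D17)
```

With D = 1.7 the two curves differ by less than 0.01 everywhere on this grid, whereas with D = 1 the largest gap exceeds 0.1.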

model

A character vector specifying the IRT model to fit each item. Available values are:

  • "1PLM", "2PLM", "3PLM", "DRM" for dichotomous items

  • "GRM", "GPCM" for polytomous items

Here, "GRM" denotes the graded response model and "GPCM" the (generalized) partial credit model. Note that "DRM" serves as a general label covering all three dichotomous IRT models. If a single model name is provided, it is recycled for all items. This argument is only used when x = NULL and fipc = FALSE. Default is NULL.
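The three dichotomous models are nested, which is why a single label such as "DRM" can cover them. The sketch below writes the standard 3PL response function in base R and obtains the 2PL and 1PL forms by restricting its arguments. The helper p_3pl() is hypothetical and is not an irtQ function.

```r
# Probability of a correct response under the 3PL model.
# a: slope, b: difficulty, g: guessing (lower asymptote), D: scaling constant.
p_3pl <- function(theta, a, b, g = 0, D = 1) {
  g + (1 - g) * plogis(D * a * (theta - b))
}

theta <- c(-2, 0, 2)
p_3pl(theta, a = 1.0, b = 0)            # 1PL-type item: common slope, g = 0
p_3pl(theta, a = 1.5, b = 0)            # 2PL-type item: item-specific slope, g = 0
p_3pl(theta, a = 1.5, b = 0, g = 0.2)   # 3PL item: lower asymptote of 0.2
```

At theta = b the probability equals g + (1 - g) / 2, so it is 0.5 for the 1PL and 2PL items and 0.6 for the 3PL item above.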

cats

Numeric vector specifying the number of score categories per item. For dichotomous items, this should be 2. If a single value is supplied, it will be recycled across all items. When cats = NULL and all models specified in the model argument are dichotomous ("1PLM", "2PLM", "3PLM", or "DRM"), the function defaults to 2 categories per item. This argument is used only when x = NULL and fipc = FALSE. Default is NULL.

item.id

Character vector of item identifiers. If NULL, IDs are generated automatically. When fipc = TRUE, a provided item.id will override any IDs present in x. Default is NULL.

fix.a.1pl

Logical. If TRUE, the slope parameters of all 1PLM items are fixed to a.val.1pl; otherwise, they are constrained to be equal and estimated. Default is FALSE.

fix.a.gpcm

Logical. If TRUE, GPCM items are calibrated as PCM with slopes fixed to a.val.gpcm; otherwise, each item's slope is estimated. Default is FALSE.

fix.g

Logical. If TRUE, all 3PLM guessing parameters are fixed to g.val; otherwise, each guessing parameter is estimated. Default is FALSE.

a.val.1pl

Numeric. Value to which the slope parameters of 1PLM items are fixed when fix.a.1pl = TRUE. Default is 1.

a.val.gpcm

Numeric. Value to which the slope parameters of GPCM items are fixed when fix.a.gpcm = TRUE. Default is 1.

g.val

Numeric. Value to which the guessing parameters of 3PLM items are fixed when fix.g = TRUE. Default is 0.2.

use.aprior

Logical. If TRUE, applies a prior distribution to all item discrimination (slope) parameters during calibration. Default is FALSE.

use.bprior

Logical. If TRUE, applies a prior distribution to all item difficulty (or threshold) parameters during calibration. Default is FALSE.

use.gprior

Logical. If TRUE, applies a prior distribution to all 3PLM guessing parameters during calibration. Default is TRUE.

aprior, bprior, gprior

Lists specifying the prior distributions for the item discrimination (slope), difficulty (or threshold), and guessing parameters, respectively. Three distributions are supported: Beta, Log-normal, and Normal. Each list must have two elements:

  • dist: A character string, one of "beta", "lnorm", or "norm".

  • params: A numeric vector of length two giving the distribution’s parameters. For details on each parameterization, see stats::dbeta(), stats::dlnorm(), and stats::dnorm().

Defaults are:

  • aprior = list(dist = "lnorm", params = c(0.0, 0.5))

  • bprior = list(dist = "norm", params = c(0.0, 1.0))

  • gprior = list(dist = "beta", params = c(5, 16))

for discrimination, difficulty, and guessing parameters, respectively.
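To get a feel for what the default priors imply, the densities can be evaluated directly with the stats functions referenced above. The helper prior_density() below is hypothetical and written only for this illustration; it assumes the params vector is passed in the same order as the arguments of dbeta(), dlnorm(), and dnorm().

```r
# Hypothetical helper: evaluate a prior given as list(dist = , params = )
prior_density <- function(x, prior) {
  switch(prior$dist,
    beta  = dbeta(x, shape1 = prior$params[1], shape2 = prior$params[2]),
    lnorm = dlnorm(x, meanlog = prior$params[1], sdlog = prior$params[2]),
    norm  = dnorm(x, mean = prior$params[1], sd = prior$params[2]),
    stop("unsupported distribution: ", prior$dist)
  )
}

aprior <- list(dist = "lnorm", params = c(0, 0.5))
bprior <- list(dist = "norm", params = c(0, 1))
gprior <- list(dist = "beta", params = c(5, 16))

prior_density(1.0, aprior)   # density at a slope of 1
prior_density(0.0, bprior)   # density at a difficulty of 0
prior_density(0.2, gprior)   # density at a guessing value of 0.2

# Mean and mode of the Beta(5, 16) guessing prior
g_mean <- 5 / (5 + 16)
g_mode <- (5 - 1) / (5 + 16 - 2)
c(mean = g_mean, mode = g_mode)
```

The Beta(5, 16) prior has mean 5/21 (about 0.24) and mode 4/19 (about 0.21), so it concentrates guessing estimates near the default g.val of 0.2.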

missing

A value indicating missing responses in the data set. Default is NA.
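If missing responses are stored under a numeric code rather than NA, one option that stays entirely in base R is to recode them before calibration so that the default missing = NA applies. The toy matrix and the code 9 below are illustrative only.

```r
# Toy response matrix in which 9 marks a missing response
resp <- matrix(c(1, 0, 9,
                 0, 9, 1,
                 1, 1, 0), nrow = 3, byrow = TRUE)

# Recode the missing-data code to NA
resp_na <- resp
resp_na[resp_na == 9] <- NA

colSums(is.na(resp_na))  # number of missing responses per item
```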

Quadrature

A numeric vector of length two:

  • first element: number of quadrature points

  • second element: symmetric bound (absolute value) for those points

For example, c(49, 6) specifies 49 evenly spaced points from –6 to 6. These points are used in the E-step of the EM algorithm. Default is c(49, 6).
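The grid implied by Quadrature = c(49, 6) can be reproduced in base R as a sketch. The normalization of the normal densities into weights shown here is an assumption about how such a grid is typically used, not a statement about irtQ internals.

```r
quadrature <- c(49, 6)

# 49 evenly spaced points on [-6, 6]
nodes <- seq(-quadrature[2], quadrature[2], length.out = quadrature[1])

# Normal prior weights at the nodes, rescaled to sum to 1
# (mean 0 and variance 1 mirror the group.mean and group.var defaults)
wts <- dnorm(nodes, mean = 0, sd = sqrt(1))
wts <- wts / sum(wts)

head(data.frame(theta = nodes, weight = wts))
```

Adjacent points are 12 / 48 = 0.25 apart.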

weights

A two-column matrix or data frame containing the quadrature points (in the first column) and their corresponding weights (in the second column) for the latent variable prior distribution. If not NULL, the scale of the latent ability distribution is fixed to match the scale of the provided quadrature points and weights. The weights and points can be conveniently generated using the function gen.weight().

If NULL, a normal prior density is used instead, based on the information provided in the Quadrature, group.mean, and group.var arguments. Default is NULL.

group.mean

A numeric value specifying the mean of the latent variable prior distribution when weights = NULL. Default is 0. This value is fixed to resolve the indeterminacy of the item parameter scale during calibration. However, the scale of the prior distribution is updated when FIPC is implemented.

group.var

A positive numeric value specifying the variance of the latent variable prior distribution when weights = NULL. Default is 1. This value is fixed to resolve the indeterminacy of the item parameter scale during calibration. However, the scale of the prior distribution is updated when FIPC is implemented.

EmpHist

Logical. If TRUE, the empirical histogram of the latent variable prior distribution is estimated simultaneously with the item parameters using the approach proposed by Woods (2007). Item calibration is conducted relative to the estimated empirical prior. See below for details. Default is FALSE.

use.startval

Logical. If TRUE, the item parameters provided in the item metadata (i.e., the x argument) are used as starting values for item parameter estimation. Otherwise, internally generated starting values are used. Default is FALSE.

Etol

A positive numeric value specifying the convergence criterion for the E-step of the EM algorithm. Default is 1e-4.

MaxE

A positive integer specifying the maximum number of iterations for the E-step in the EM algorithm. Default is 500.

control

A list of control parameters to be passed to the optimization function stats::nlminb(). These parameters control the M-step of the EM algorithm. For example, the maximum number of iterations in each M-step can be specified using control = list(iter.max = 200). The default maximum number of iterations per M-step is 200. See stats::nlminb() for additional control options.
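For readers unfamiliar with stats::nlminb(), the sketch below shows how a control list of this form is consumed by the optimizer. The objective is a toy quadratic function, not an IRT likelihood, and is used only to show the call pattern.

```r
# Toy objective with its minimum at (2, -1)
objective <- function(par) (par[1] - 2)^2 + (par[2] + 1)^2

fit <- stats::nlminb(
  start = c(0, 0),
  objective = objective,
  control = list(iter.max = 200, eval.max = 400)
)

fit$par          # parameter values at the solution
fit$convergence  # 0 signals successful convergence
fit$iterations   # number of iterations actually used
```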

fipc

Logical. If TRUE, fixed item parameter calibration (FIPC) is applied during item parameter estimation. When fipc = TRUE, the items to be fixed must be identified via either fix.loc or fix.id. See below for details. Default is FALSE.

fipc.method

A character string specifying the FIPC method. Available options are:

  • "OEM": No Prior Weights Updating and One EM Cycle (NWU-OEM; Wainer & Mislevy, 1990)

  • "MEM": Multiple Prior Weights Updating and Multiple EM Cycles (MWU-MEM; Kim, 2006)

When fipc.method = "OEM", the maximum number of E-steps is automatically set to 1, regardless of the value specified in MaxE.

fix.loc

A vector of positive integers specifying the row positions of the items to be fixed in the item metadata (i.e., x) when FIPC is implemented (i.e., fipc = TRUE). For example, suppose that five items located in the 1st, 2nd, 4th, 7th, and 9th rows of x should be fixed. Then use fix.loc = c(1, 2, 4, 7, 9). Note that if fix.id is not NULL, the information provided in fix.loc is ignored. See below for details.

fix.id

A character vector specifying the IDs of the items to be fixed when FIPC is implemented (i.e., fipc = TRUE). For example, suppose five items with IDs "CMC1", "CMC2", "CMC3", "CMC4", and "CMC5" are to be fixed, and that all item IDs are supplied via the x or item.id argument. Then use fix.id = c("CMC1", "CMC2", "CMC3", "CMC4", "CMC5"). Note that if fix.id is not NULL, the information in fix.loc is ignored. See below for details.
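The two ways of identifying fixed items are interchangeable as long as the IDs can be mapped to row positions. A minimal base R sketch of that mapping follows; the ID vector item_ids is invented for illustration and stands in for the ID column of the metadata.

```r
# Illustrative item IDs, in the same order as the rows of the metadata
item_ids <- c("CMC1", "CMC2", "FR1", "CMC3", "FR2", "FR3", "CMC4", "FR4", "CMC5")
fix.id <- c("CMC1", "CMC2", "CMC3", "CMC4", "CMC5")

# Row positions corresponding to the IDs
fix.loc <- match(fix.id, item_ids)
fix.loc

# Guard against IDs that do not occur in the metadata
stopifnot(!anyNA(fix.loc))
```

Here the five IDs map to rows 1, 2, 4, 7, and 9.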

se

Logical. If FALSE, standard errors of the item parameter estimates are not computed. Default is TRUE.

verbose

Logical. If FALSE, all progress messages, including information about the EM algorithm process, are suppressed. Default is TRUE.

Details

The data frame supplied to the argument x must follow a specific format. The first column should contain item IDs, the second column should contain the number of unique score categories for each item, and the third column should specify the IRT model to be fitted to each item. Available IRT models are:

  • "1PLM", "2PLM", "3PLM", "DRM" for dichotomous items

  • "GRM", "GPCM" for polytomous items

Note that "DRM" serves as a general label covering all dichotomous IRT models (i.e., "1PLM", "2PLM", and "3PLM"), while "GRM" and "GPCM" represent the graded response model and (generalized) partial credit model, respectively.

The subsequent columns should contain the item parameters for the specified models. For dichotomous items, the fourth, fifth, and sixth columns represent item discrimination (slope), item difficulty, and item guessing parameters, respectively. When "1PLM" or "2PLM" is specified in the third column, NAs must be entered in the sixth column for the guessing parameters.

For polytomous items, the item discrimination (slope) parameter should appear in the fourth column, and the item difficulty (or threshold) parameters for category boundaries should occupy the fifth through the last columns. When the number of unique score categories differs across items, unused parameter cells should be filled with NAs.

In the irtQ package, the threshold parameters for GPCM items are expressed as the item location (or overall difficulty) minus the threshold values for each score category. Note that when a GPCM item has K unique score categories, K - 1 threshold parameters are required, since the threshold for the first category boundary is always fixed at 0. For example, if a GPCM item has five score categories, four threshold parameters must be provided.
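The parameterization can be made concrete with a small base R sketch. It assumes the standard GPCM category probability formula and an invented decomposition into a location and four thresholds; gpcm_prob() is a hypothetical helper, not an irtQ function, and the IRT Models section of the irtQ-package documentation remains the authority on the exact form used by the package.

```r
# Hypothetical decomposition: overall location and category thresholds
location <- 0.4
d <- c(1.2, 0.3, -0.5, -1.0)   # four values for a five-category item

# Parameters in the form described above: location minus threshold
steps <- location - d
steps

# GPCM category probabilities for scores 0, ..., K - 1
gpcm_prob <- function(theta, a, steps, D = 1) {
  z <- c(0, cumsum(D * a * (theta - steps)))  # first term is fixed at 0
  ez <- exp(z - max(z))                       # subtract the max for stability
  ez / sum(ez)
}

p <- gpcm_prob(theta = 0, a = 1.1, steps = steps)
round(p, 3)
sum(p)
```

Four step parameters yield five category probabilities that sum to 1. When theta equals a step parameter, the two adjacent categories are equally likely.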

An example of a data frame for a single-format test is shown below:

ITEM1  2  1PLM  1.000   1.461     NA
ITEM2  2  2PLM  1.921  -1.049     NA
ITEM3  2  3PLM  1.736   1.501  0.203
ITEM4  2  3PLM  0.835  -1.049  0.182
ITEM5  2  DRM   0.926   0.394  0.099

An example of a data frame for a mixed-format test is shown below:

ITEM1  2  1PLM  1.000   1.461      NA      NA      NA
ITEM2  2  2PLM  1.921  -1.049      NA      NA      NA
ITEM3  2  3PLM  0.926   0.394   0.099      NA      NA
ITEM4  2  DRM   1.052  -0.407   0.201      NA      NA
ITEM5  4  GRM   1.913  -1.869  -1.238  -0.714      NA
ITEM6  5  GRM   1.278  -0.724  -0.068   0.568   1.072
ITEM7  4  GPCM  1.137  -0.374   0.215   0.848      NA
ITEM8  5  GPCM  1.233  -2.078  -1.347  -0.705  -0.116
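The mixed-format layout can also be typed by hand with data.frame(). In the sketch below, the column names id, cats, and model match those used in the Examples, while par.1 through par.5 are assumed names chosen for illustration; shape_df() remains the more convenient route.

```r
# Column names par.1, ..., par.5 are an assumption made for this sketch
x_manual <- data.frame(
  id    = paste0("ITEM", 1:8),
  cats  = c(2, 2, 2, 2, 4, 5, 4, 5),
  model = c("1PLM", "2PLM", "3PLM", "DRM", "GRM", "GRM", "GPCM", "GPCM"),
  par.1 = c(1.000, 1.921, 0.926, 1.052, 1.913, 1.278, 1.137, 1.233),
  par.2 = c(1.461, -1.049, 0.394, -0.407, -1.869, -0.724, -0.374, -2.078),
  par.3 = c(NA, NA, 0.099, 0.201, -1.238, -0.068, 0.215, -1.347),
  par.4 = c(NA, NA, NA, NA, -0.714, 0.568, 0.848, -0.705),
  par.5 = c(NA, NA, NA, NA, NA, 1.072, NA, -0.116),
  stringsAsFactors = FALSE
)
x_manual

# For polytomous items, one slope plus (cats - 1) thresholds are filled in
n_filled <- rowSums(!is.na(x_manual[, 4:8]))
n_filled[5:8]
```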

See the IRT Models section in the irtQ-package documentation for more details about the IRT models used in the irtQ package. A convenient way to create a data frame for the argument x is by using the function shape_df().

To fit IRT models to data, the item response data must be accompanied by information on the IRT model and the number of score categories for each item. There are two ways to provide this information:

  1. Supply item metadata to the argument x. As explained above, such metadata can be easily created using shape_df().

  2. Specify the IRT models and score category information directly through the arguments model and cats.

If x = NULL, the function uses the information specified in model and cats.

To implement FIPC, the item metadata must be provided via the x argument. This is because the item parameters of the fixed items in the metadata are used to estimate the characteristics of the underlying latent variable prior distribution when calibrating the remaining (freely estimated) items. More specifically, the latent prior distribution is estimated based on the fixed items, and then used to calibrate the new (pretest) items so that their parameters are placed on the same scale as those of the fixed items (Kim, 2006). The full item metadata, including both fixed and non-fixed items, can be conveniently created using the shape_df_fipc() function.

Kim (2006) described five FIPC methods, two of which are available in the est_irt() function. The first method is "NWU-OEM", which uses a single E-step in the EM algorithm (involving only the fixed items) followed by a single M-step (involving only the non-fixed items). This method was proposed by Wainer and Mislevy (1990) in the context of online calibration and can be implemented by setting fipc.method = "OEM".

The second method is "MWU-MEM", which iteratively updates the latent variable prior distribution and estimates the parameters of the non-fixed items. In this method, the same procedure as the NWU-OEM approach is applied during the first EM cycle. From the second cycle onward, both the parameters of the non-fixed items and the weights of the prior distribution are concurrently updated. This method can be implemented by setting fipc.method = "MEM". See Kim (2006) for more details.
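The idea of updating prior weights from fixed items can be sketched in base R for the simplest case, in which every item is fixed and only the latent distribution is estimated. This is a simplified illustration and not irtQ's implementation: it uses five 2PL items with invented parameters, simulated responses, and a fixed number of cycles in place of a convergence check.

```r
set.seed(1)

# Invented 2PL parameters for five fixed items
a <- c(1.0, 1.2, 0.8, 1.5, 1.1)
b <- c(-1.0, -0.5, 0.0, 0.5, 1.0)

# Simulate 500 examinees from N(0.5, 1)
theta_true <- rnorm(500, mean = 0.5, sd = 1)
p_true <- plogis(outer(theta_true, seq_along(a),
                       function(t, j) a[j] * (t - b[j])))
resp <- (matrix(runif(length(p_true)), nrow = 500) < p_true) * 1

# Quadrature grid and N(0, 1) starting weights
nodes <- seq(-6, 6, length.out = 49)
w <- dnorm(nodes)
w <- w / sum(w)

# Likelihood of each examinee's responses at each node (500 x 49)
P <- plogis(outer(nodes, seq_along(a), function(t, j) a[j] * (t - b[j])))
L <- exp(resp %*% t(log(P)) + (1 - resp) %*% t(log(1 - P)))

# Repeatedly: posterior over nodes per examinee, then average into new weights
for (cycle in 1:50) {
  post <- sweep(L, 2, w, "*")
  post <- post / rowSums(post)
  w <- colMeans(post)
}

# Mean and variance implied by the updated weights
mu <- sum(nodes * w)
sigma2 <- sum((nodes - mu)^2 * w)
c(mean = mu, var = sigma2)
```

Because the item parameters never change, only the weights move, which is what places the latent distribution on the scale of the fixed items. With only five items the recovered mean and variance should be read as rough estimates.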

When fipc = TRUE, information about which items are to be fixed must be provided via either the fix.loc or fix.id argument. For example, suppose that five items with IDs "CMC1", "CMC2", "CMC3", "CMC4", and "CMC5" should be fixed, and all item IDs are provided via the x or item.id argument. Also, assume these five items are located in the 1st through 5th rows of the item metadata (i.e., x). In this case, the fixed items can be specified using either fix.loc = c(1, 2, 3, 4, 5) or fix.id = c("CMC1", "CMC2", "CMC3", "CMC4", "CMC5"). Note that if both fix.loc and fix.id are not NULL, the information in fix.loc is ignored.

When EmpHist = TRUE, the empirical histogram of the latent variable prior distribution (i.e., the densities at the quadrature points) is estimated simultaneously with the item parameters. If EmpHist = TRUE and fipc = TRUE, the scale parameters of the empirical prior distribution (e.g., mean and variance) are also estimated. If EmpHist = TRUE and fipc = FALSE, the scale parameters are fixed to the values specified in group.mean and group.var. When EmpHist = FALSE, a normal prior distribution is used instead. If fipc = TRUE, the scale parameters of this normal prior are estimated along with the item parameters. If fipc = FALSE, they are fixed to the values specified in group.mean and group.var.

Value

This function returns an object of class est_irt. The returned object contains the following components:

estimates

A data frame containing both the item parameter estimates and their corresponding standard errors.

par.est

A data frame of item parameter estimates, structured according to the item metadata format.

se.est

A data frame of standard errors for the item parameter estimates, computed using the cross-product approximation method (Meilijson, 1989).

pos.par

A data frame indicating the position index of each estimated item parameter. The position information is useful for interpreting the variance-covariance matrix of item parameter estimates.

covariance

A variance-covariance matrix of the item parameter estimates.

loglikelihood

The marginal log-likelihood, calculated as the sum of the log-likelihoods across all items.

aic

Akaike Information Criterion (AIC) based on the log-likelihood.

bic

Bayesian Information Criterion (BIC) based on the log-likelihood.

group.par

A data frame containing the mean, variance, and standard deviation of the latent variable prior distribution.

weights

A two-column data frame of quadrature points (column 1) and corresponding weights (column 2) of the (updated) latent prior distribution.

posterior.dist

A matrix of normalized posterior densities for all response patterns at each quadrature point. Rows and columns represent response patterns and quadrature points, respectively.

data

A data frame of examinees' response data.

scale.D

The scaling factor used in the IRT model.

ncase

The total number of response patterns.

nitem

The total number of items in the response data.

Etol

The convergence criterion for the E-step of the EM algorithm.

MaxE

The maximum number of E-steps allowed in the EM algorithm.

aprior

A list describing the prior distribution used for discrimination parameters.

bprior

A list describing the prior distribution used for difficulty parameters.

gprior

A list describing the prior distribution used for guessing parameters.

npar.est

The total number of parameters estimated.

niter

The number of completed EM cycles.

maxpar.diff

The maximum absolute change in parameter estimates at convergence.

EMtime

Time (in seconds) spent on EM cycles.

SEtime

Time (in seconds) spent computing standard errors.

TotalTime

Total computation time (in seconds).

test.1

First-order test result indicating whether the gradient sufficiently vanished for solution stability.

test.2

Second-order test result indicating whether the information matrix is positive definite, a necessary condition for identifying a local maximum.

var.note

A note indicating whether the variance-covariance matrix was successfully obtained from the information matrix.

fipc

Logical. Indicates whether FIPC was used.

fipc.method

The method used for FIPC.

fix.loc

A vector of integers specifying the row locations of fixed items when FIPC was applied.

Note that you can easily extract components from the output using the getirt() function.
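As a reminder of how the aic and bic components relate to loglikelihood and npar.est, the sketch below applies the textbook definitions to invented numbers. Whether the package uses the number of examinees as the sample size in the BIC penalty is an assumption here.

```r
# Hypothetical values standing in for a fitted model
loglik <- -2468.3   # marginal log-likelihood
npar   <- 10        # number of estimated parameters
n      <- 1000      # sample size in the BIC penalty (assumed: examinees)

aic <- -2 * loglik + 2 * npar
bic <- -2 * loglik + npar * log(n)
c(AIC = aic, BIC = bic)
```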

Author(s)

Hwanggyu Lim hglim83@gmail.com

References

Ban, J. C., Hanson, B. A., Wang, T., Yi, Q., & Harris, D. J. (2001). A comparative study of on-line pretest item calibration/scaling methods in computerized adaptive testing. Journal of Educational Measurement, 38(3), 191-212.

Bock, R. D., & Aitkin, M. (1981). Marginal maximum likelihood estimation of item parameters: Application of an EM algorithm. Psychometrika, 46, 443-459.

Kim, S. (2006). A comparative study of IRT fixed parameter calibration methods. Journal of Educational Measurement, 43(4), 355-381.

Meilijson, I. (1989). A fast improvement to the EM algorithm on its own terms. Journal of the Royal Statistical Society: Series B (Methodological), 51, 127-138.

Stocking, M. L. (1988). Scale drift in on-line calibration (Research Rep. 88-28). Princeton, NJ: ETS.

Wainer, H., & Mislevy, R. J. (1990). Item response theory, item calibration, and proficiency estimation. In H. Wainer (Ed.), Computer adaptive testing: A primer (Chap. 4, pp. 65-102). Hillsdale, NJ: Lawrence Erlbaum.

Woods, C. M. (2007). Empirical histograms in item response theory with ordinal data. Educational and Psychological Measurement, 67(1), 73-87.

See Also

shape_df(), shape_df_fipc(), getirt()

Examples



## --------------------------------------------------------------
## 1. Item parameter estimation for dichotomous item data (LSAT6)
## --------------------------------------------------------------
# Fit the 1PL model to LSAT6 data and estimate a common slope parameter
# (i.e., constrain slope parameters to be equal)
(mod.1pl.c <- est_irt(data = LSAT6, D = 1, model = "1PLM", cats = 2,
                      fix.a.1pl = FALSE))

# Display a summary of the estimation results
summary(mod.1pl.c)

# Extract the item parameter estimates
getirt(mod.1pl.c, what = "par.est")

# Extract the standard error estimates
getirt(mod.1pl.c, what = "se.est")

# Fit the 1PL model to LSAT6 data and fix slope parameters to 1.0
(mod.1pl.f <- est_irt(data = LSAT6, D = 1, model = "1PLM", cats = 2,
                      fix.a.1pl = TRUE, a.val.1pl = 1))

# Display a summary of the estimation results
summary(mod.1pl.f)

# Fit the 2PL model to LSAT6 data
(mod.2pl <- est_irt(data = LSAT6, D = 1, model = "2PLM", cats = 2))

# Display a summary of the estimation results
summary(mod.2pl)

# Assess the model fit for the 2PL model using the S-X2 fit statistic
(sx2fit.2pl <- sx2_fit(x = mod.2pl))

# Compute item and test information functions at a range of theta values
theta <- seq(-4, 4, 0.1)
(info.2pl <- info(x = mod.2pl, theta = theta))

# Plot the test characteristic curve (TCC)
(trace.2pl <- traceline(x = mod.2pl, theta = theta))
plot(trace.2pl)

# Plot the item characteristic curve (ICC) for the first item
plot(trace.2pl, item.loc = 1)

# Fit the 2PL model and simultaneously estimate an empirical histogram
# of the latent variable prior distribution
# Also apply a looser convergence threshold for the E-step
(mod.2pl.hist <- est_irt(data = LSAT6, D = 1, model = "2PLM", cats = 2,
                         EmpHist = TRUE, Etol = 0.001))
(emphist <- getirt(mod.2pl.hist, what = "weights"))
plot(emphist$weight ~ emphist$theta, type = "h")

# Fit the 3PL model and apply a Beta prior to the guessing parameters
(mod.3pl <- est_irt(
  data = LSAT6, D = 1, model = "3PLM", cats = 2, use.gprior = TRUE,
  gprior = list(dist = "beta", params = c(5, 16))
))

# Display a summary of the estimation results
summary(mod.3pl)

# Fit the 3PL model and fix the guessing parameters at 0.2
(mod.3pl.f <- est_irt(data = LSAT6, D = 1, model = "3PLM", cats = 2,
                      fix.g = TRUE, g.val = 0.2))

# Display a summary of the estimation results
summary(mod.3pl.f)

# Fit different dichotomous models to each item in the LSAT6 data:
# Fit the constrained 1PL model to items 1–3, the 2PL model to item 4,
# and the 3PL model with a Beta prior on guessing to item 5
(mod.drm.mix <- est_irt(
  data = LSAT6, D = 1, model = c("1PLM", "1PLM", "1PLM", "2PLM", "3PLM"),
  cats = 2, fix.a.1pl = FALSE, use.gprior = TRUE,
  gprior = list(dist = "beta", params = c(5, 16))
))

# Display a summary of the estimation results
summary(mod.drm.mix)

## -------------------------------------------------------------------
## 2. Item parameter estimation for mixed-format data (simulated data)
## -------------------------------------------------------------------
## Import the "-prm.txt" output file from flexMIRT
flex_sam <- system.file("extdata", "flexmirt_sample-prm.txt", package = "irtQ")

# Extract item metadata
x <- bring.flexmirt(file = flex_sam, "par")$Group1$full_df

# Modify the item metadata so that the 39th and 40th items use the GPCM
x[39:40, 3] <- "GPCM"

# Generate 1,000 examinees' latent abilities from N(0, 1)
set.seed(37)
score1 <- rnorm(1000, mean = 0, sd = 1)

# Simulate item response data
sim.dat1 <- simdat(x = x, theta = score1, D = 1)

# Fit the 3PL model to all dichotomous items, the GPCM to items 39 and 40,
# and the GRM to items 53, 54, and 55.
# Use a Beta prior for guessing parameters, a log-normal prior for slope
# parameters, and a normal prior for difficulty (threshold) parameters.
# Also, specify the argument `x` to provide IRT model and score category information.
item.meta <- shape_df(item.id = x$id, cats = x$cats, model = x$model,
  default.par = TRUE)
(mod.mix1 <- est_irt(
  x = item.meta, data = sim.dat1, D = 1, use.aprior = TRUE, use.bprior = TRUE,
  use.gprior = TRUE,
  aprior = list(dist = "lnorm", params = c(0.0, 0.5)),
  bprior = list(dist = "norm", params = c(0.0, 2.0)),
  gprior = list(dist = "beta", params = c(5, 16))
))

# Display a summary of the estimation results
summary(mod.mix1)

# Estimate examinees' latent scores using MLE and the estimated item parameters
(score.mle <- est_score(x = mod.mix1, method = "ML", range = c(-4, 4), ncore = 2))

# Compute traditional model-fit statistics
(fit.mix1 <- irtfit(
  x = mod.mix1, score = score.mle$est.theta, group.method = "equal.width",
  n.width = 10, loc.theta = "middle"
))

# Residual plot for the first item (dichotomous)
plot(
  x = fit.mix1, item.loc = 1, type = "both", ci.method = "wald",
  show.table = TRUE, ylim.sr.adjust = TRUE
)

# Residual plot for the last item (polytomous)
plot(
  x = fit.mix1, item.loc = 55, type = "both", ci.method = "wald",
  show.table = FALSE, ylim.sr.adjust = TRUE
)

# Fit the 2PL model to all dichotomous items, the GPCM to items 39 and 40,
# and the GRM to items 53, 54, and 55.
# Provide IRT model and score category information via `model` and `cats`
# arguments.
(mod.mix2 <- est_irt(
  data = sim.dat1, D = 1,
  model = c(rep("2PLM", 38), rep("GPCM", 2), rep("2PLM", 12), rep("GRM", 3)),
  cats = c(rep(2, 38), rep(5, 2), rep(2, 12), rep(5, 3))
))

# Display a summary of the estimation results
summary(mod.mix2)

# Fit the 2PL model to all dichotomous items, the GPCM to items 39 and 40,
# and the GRM to items 53, 54, and 55.
# Also estimate the empirical histogram of the latent prior distribution.
# Provide IRT model and score category information via `model` and `cats` arguments.
(mod.mix3 <- est_irt(
  data = sim.dat1, D = 1,
  model = c(rep("2PLM", 38), rep("GPCM", 2), rep("2PLM", 12), rep("GRM", 3)),
  cats = c(rep(2, 38), rep(5, 2), rep(2, 12), rep(5, 3)), EmpHist = TRUE
))
(emphist <- getirt(mod.mix3, what = "weights"))
plot(emphist$weight ~ emphist$theta, type = "h")

# Fit the 2PL model to all dichotomous items, the PCM to items 39 and 40 by
# fixing slope parameters to 1, and the GRM to items 53, 54, and 55.
# Provide IRT model and score category information via `model` and `cats` arguments.
(mod.mix4 <- est_irt(
  data = sim.dat1, D = 1,
  model = c(rep("2PLM", 38), rep("GPCM", 2), rep("2PLM", 12), rep("GRM", 3)),
  cats = c(rep(2, 38), rep(5, 2), rep(2, 12), rep(5, 3)),
  fix.a.gpcm = TRUE, a.val.gpcm = 1
))

# Display a summary of the estimation results
summary(mod.mix4)

## ----------------------------------------------------------------
## 3. Fixed item parameter calibration (FIPC) for mixed-format data
##    (simulated)
## ----------------------------------------------------------------
## Import the "-prm.txt" output file from flexMIRT
flex_sam <- system.file("extdata", "flexmirt_sample-prm.txt", package = "irtQ")

# Select item metadata
x <- bring.flexmirt(file = flex_sam, "par")$Group1$full_df

# Generate 1,000 examinees' latent abilities from N(0.4, 1.3^2)
# (i.e., mean 0.4 and standard deviation 1.3)
set.seed(20)
score2 <- rnorm(1000, mean = 0.4, sd = 1.3)

# Simulate response data
sim.dat2 <- simdat(x = x, theta = score2, D = 1)

# Fit the 3PL model to all dichotomous items and the GRM to all polytomous items
# Fix five 3PL items (1st–5th) and three GRM items (53rd–55th)
# Also estimate the empirical histogram of the latent variable distribution
# Use the MEM method
fix.loc <- c(1:5, 53:55)
(mod.fix1 <- est_irt(
  x = x, data = sim.dat2, D = 1, use.gprior = TRUE,
  gprior = list(dist = "beta", params = c(5, 16)), EmpHist = TRUE,
  Etol = 1e-3, fipc = TRUE, fipc.method = "MEM", fix.loc = fix.loc
))

# Extract group-level parameter estimates
(prior.par <- mod.fix1$group.par)

# Visualize the empirical prior distribution
(emphist <- getirt(mod.fix1, what = "weights"))
plot(emphist$weight ~ emphist$theta, type = "h")

# Display a summary of the estimation results
summary(mod.fix1)

# Alternatively, fix the same items by providing their item IDs
# using the `fix.id` argument. In this case, set `fix.loc = NULL`
fix.id <- c(x$id[1:5], x$id[53:55])
(mod.fix1 <- est_irt(
  x = x, data = sim.dat2, D = 1, use.gprior = TRUE,
  gprior = list(dist = "beta", params = c(5, 16)), EmpHist = TRUE,
  Etol = 1e-3, fipc = TRUE, fipc.method = "MEM", fix.loc = NULL,
  fix.id = fix.id
))

# Display a summary of the estimation results
summary(mod.fix1)

# Fit the 3PL model to all dichotomous items and the GRM to all polytomous items
# Fix the same items as before (1st–5th and 53rd–55th)
# This time, do not estimate the empirical histogram of the latent prior
# Instead, estimate the scale of the normal prior distribution
# Use the MEM method
fix.loc <- c(1:5, 53:55)
(mod.fix2 <- est_irt(
  x = x, data = sim.dat2, D = 1, use.gprior = TRUE,
  gprior = list(dist = "beta", params = c(5, 16)), EmpHist = FALSE,
  Etol = 1e-3, fipc = TRUE, fipc.method = "MEM", fix.loc = fix.loc
))

# Extract group-level parameter estimates
(prior.par <- mod.fix2$group.par)

# Visualize the prior distribution
(emphist <- getirt(mod.fix2, what = "weights"))
plot(emphist$weight ~ emphist$theta, type = "h")

# Fit the 3PL model to all dichotomous items and the GRM to all polytomous items
# Fix only the five 3PL items (1st–5th) and estimate the empirical histogram
# Use the OEM method (i.e., only one EM cycle is used)
fix.loc <- c(1:5)
(mod.fix3 <- est_irt(
  x = x, data = sim.dat2, D = 1, use.gprior = TRUE,
  gprior = list(dist = "beta", params = c(5, 16)), EmpHist = TRUE,
  Etol = 1e-3, fipc = TRUE, fipc.method = "OEM", fix.loc = fix.loc
))

# Extract group-level parameter estimates
(prior.par <- mod.fix3$group.par)

# Visualize the prior distribution
(emphist <- getirt(mod.fix3, what = "weights"))
plot(emphist$weight ~ emphist$theta, type = "h")

# Display a summary of the estimation results
summary(mod.fix3)

# Fit the 3PL model to all dichotomous items and the GRM to all polytomous items
# Fix all 55 items and estimate only the latent ability distribution
# Use the MEM method
fix.loc <- c(1:55)
(mod.fix4 <- est_irt(
  x = x, data = sim.dat2, D = 1, EmpHist = TRUE,
  Etol = 1e-3, fipc = TRUE, fipc.method = "MEM", fix.loc = fix.loc
))

# Extract group-level parameter estimates
(prior.par <- mod.fix4$group.par)

# Visualize the prior distribution
(emphist <- getirt(mod.fix4, what = "weights"))
plot(emphist$weight ~ emphist$theta, type = "h")

# Display a summary of the estimation results
summary(mod.fix4)

# Alternatively, fix all 55 items by providing their item IDs
# using the `fix.id` argument. In this case, set `fix.loc = NULL`
fix.id <- x$id
(mod.fix4 <- est_irt(
  x = x, data = sim.dat2, D = 1, EmpHist = TRUE,
  Etol = 1e-3, fipc = TRUE, fipc.method = "MEM", fix.loc = NULL,
  fix.id = fix.id
))

# Display a summary of the estimation results
summary(mod.fix4)




[Package irtQ version 1.0.0 Index]