sim.data.Rd
Randomly generates a response data matrix according to specified conditions, including the attribute distribution, item quality, sample size, Q-matrix, and cognitive diagnosis model (CDM).
sim.data(
Q = NULL,
N = NULL,
IQ = list(P0 = NULL, P1 = NULL),
model = "GDINA",
distribute = "uniform",
control = NULL,
verbose = TRUE
)
The Q-matrix. A random 30 × 5 Q-matrix (sim.Q) will be used if NULL.
Sample size. Default = 500.
A list containing two vectors of length I: P0 and P1.
Type of model used to generate the data; can be "GDINA", "LCDM", "DINA", "DINO", "ACDM", "LLM", or "rRUM".
Attribute distribution; can be "uniform" for the uniform distribution, "mvnorm" for the multivariate normal distribution (Chiu, Douglas, & Li, 2009), or "horder" for the higher-order distribution (Tu et al., 2022).
A list of control parameters with elements:
sigma
A positive-definite symmetric matrix specifying the variance-covariance matrix when distribute = "mvnorm". Default = 0.5 (Chiu, Douglas, & Li, 2009).
cutoffs
A vector giving the cutoff for each attribute when distribute = "mvnorm". Default = \(k/(1+K)\) (Chiu, Douglas, & Li, 2009).
theta
A vector of length N representing the higher-order ability of each examinee when distribute = "horder". By default, values are generated randomly from the normal distribution (Tu et al., 2022).
a
The slopes for the higher-order model when distribute = "horder". Default = 1.5 (Tu et al., 2022).
b
The intercepts for the higher-order model when distribute = "horder". By default, equally spaced values between -1.5 and 1.5 are selected according to the number of attributes (Tu et al., 2022).
Logical indicating whether to print information. Default = TRUE.
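To make the "mvnorm" option more concrete, the following base R sketch shows one way binary attribute patterns can be obtained by dichotomizing correlated normal variables at per-attribute cutoffs. It is an illustration under stated assumptions, not the package's implementation: sketch_mvnorm_attributes is a hypothetical helper, a scalar rho is assumed to act as a common correlation between attributes, an attribute is assumed to be mastered when its latent value exceeds the cutoff, and the cutoffs are placed on the normal-quantile scale as in Example 2.

```r
# Hypothetical helper, not part of Qval or GDINA: draw binary attribute
# patterns by dichotomizing correlated standard normal variables.
sketch_mvnorm_attributes <- function(N, K, rho = 0.5,
                                     cutoffs = qnorm(seq_len(K) / (K + 1))) {
  # Assumption: unit variances and a common correlation rho.
  Sigma <- matrix(rho, K, K)
  diag(Sigma) <- 1
  # chol() returns an upper-triangular R with t(R) %*% R == Sigma, so the
  # rows of Z %*% R have covariance Sigma when Z holds independent N(0, 1).
  Z <- matrix(rnorm(N * K), N, K)
  latent <- Z %*% chol(Sigma)
  # Assumption: attribute k is mastered when latent value k exceeds cutoff k.
  sweep(latent, 2, cutoffs, FUN = ">") * 1L
}

set.seed(1)
alpha <- sketch_mvnorm_attributes(N = 2000, K = 5)
colMeans(alpha)  # mastery proportions should decrease as the cutoffs rise
```

With increasing cutoffs, later attributes are harder to master, which is the pattern this kind of thresholding is typically used to produce.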
Object of class simGDINA. A simGDINA object obtained from the simGDINA function in the GDINA package.
Elements that can be extracted using method extract include:
An N × I simulated item response matrix.
The Q-matrix.
An N × K matrix of individuals' attribute patterns.
A list of non-zero category success probabilities for each latent group.
A list of delta parameters.
Higher-order parameters.
Multivariate normal distribution parameters.
A matrix of item/category success probabilities for each latent class.
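As a rough illustration of how attribute patterns, the Q-matrix, and item quality combine into a response matrix, the sketch below simulates responses under the DINA model only. It is not how simGDINA is implemented. sketch_dina_responses is a hypothetical helper, and it assumes that P0 is the success probability for examinees lacking at least one required attribute and P1 the success probability for examinees mastering all required attributes.

```r
# Hypothetical helper, not part of Qval or GDINA: simulate DINA responses.
sketch_dina_responses <- function(alpha, Q, P0, P1) {
  N <- nrow(alpha)
  I <- nrow(Q)
  # n_missing[n, i] counts the attributes required by item i that examinee n
  # has not mastered; eta is TRUE when none are missing.
  n_missing <- (1 - alpha) %*% t(Q)
  eta <- n_missing == 0
  # Assumption: success probability is P1[i] when eta is TRUE, else P0[i].
  prob <- matrix(P0, N, I, byrow = TRUE)
  P1_mat <- matrix(P1, N, I, byrow = TRUE)
  prob[eta] <- P1_mat[eta]
  # Bernoulli draws, filled column-wise to match the layout of prob.
  dat <- matrix(rbinom(N * I, size = 1, prob = prob), N, I)
  list(dat = dat, prob = prob)
}

set.seed(2)
Q <- rbind(c(1, 0), c(0, 1), c(1, 1))       # 3 items, 2 attributes
alpha <- rbind(c(0, 0), c(1, 0), c(1, 1))   # 3 examinees
out <- sketch_dina_responses(alpha, Q, P0 = rep(0.1, 3), P1 = rep(0.9, 3))
out$prob  # per-examinee success probabilities
out$dat   # simulated 0/1 responses
```

In this toy case the examinee with pattern (1, 0) has a high success probability only on the first item, because the other two items require the second attribute.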
Chiu, C.-Y., Douglas, J. A., & Li, X. (2009). Cluster Analysis for Cognitive Diagnosis: Theory and Applications. Psychometrika, 74(4), 633-665. DOI: 10.1007/s11336-009-9125-0.
Tu, D., Chiu, J., Ma, W., Wang, D., Cai, Y., & Ouyang, X. (2022). A multiple logistic regression-based (MLR-B) Q-matrix validation method for cognitive diagnosis models: A confirmatory approach. Behavior Research Methods. DOI: 10.3758/s13428-022-01880-x.
################################################################
# Example 1 #
# generate data following the uniform distribution #
################################################################
library(Qval)
set.seed(123)
K <- 5
I <- 10
Q <- sim.Q(K, I)
IQ <- list(
  P0 = runif(I, 0.0, 0.2),
  P1 = runif(I, 0.8, 1.0)
)
data <- sim.data(Q = Q, N = 10, IQ = IQ, model = "GDINA", distribute = "uniform")
#> distribute = uniform
#> model = GDINA
#> number of attributes: 5
#> number of items: 10
#> num of examinees: 10
#> average of P0 = 0.116
#> average of P1 = 0.926
print(data$dat)
#> [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
#> [1,] 1 0 0 1 0 1 0 0 0 1
#> [2,] 1 0 1 1 1 1 1 1 0 0
#> [3,] 1 1 1 1 0 0 0 1 0 0
#> [4,] 0 0 1 0 0 1 1 1 1 0
#> [5,] 1 1 0 1 0 0 1 0 1 0
#> [6,] 0 0 1 0 0 1 1 1 0 1
#> [7,] 1 1 0 1 0 0 0 0 1 0
#> [8,] 1 1 1 1 1 1 1 1 1 1
#> [9,] 0 1 0 0 1 0 1 0 1 1
#> [10,] 0 1 1 0 0 0 0 1 1 1
################################################################
# Example 2 #
# generate data following the mvnorm distribution #
################################################################
set.seed(123)
K <- 5
I <- 10
Q <- sim.Q(K, I)
IQ <- list(
  P0 = runif(I, 0.0, 0.2),
  P1 = runif(I, 0.8, 1.0)
)
example_cutoffs <- sample(qnorm(c(1:K)/(K+1)), ncol(Q))
data <- sim.data(Q = Q, N = 10, IQ = IQ, model = "GDINA", distribute = "mvnorm",
                 control = list(sigma = 0.5, cutoffs = example_cutoffs))
#> distribute = mvnorm
#> model = GDINA
#> number of attributes: 5
#> number of items: 10
#> num of examinees: 10
#> average of P0 = 0.116
#> average of P1 = 0.926
#> sigma = 0.5
#> cutoffs = -0.967 0 0.431 0.967 -0.431
print(data$dat)
#> [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
#> [1,] 1 1 1 1 1 0 1 1 1 0
#> [2,] 0 1 0 0 1 0 1 0 0 0
#> [3,] 1 1 1 1 1 0 1 1 1 1
#> [4,] 1 1 1 1 1 0 1 1 1 1
#> [5,] 1 1 1 1 1 0 1 1 1 0
#> [6,] 1 1 1 1 1 0 1 1 1 0
#> [7,] 0 0 0 0 0 0 0 0 0 1
#> [8,] 0 1 0 0 1 0 1 0 1 0
#> [9,] 1 1 1 1 0 1 1 1 1 1
#> [10,] 0 0 1 0 0 0 1 1 0 0
#################################################################
# Example 3 #
# generate data following the horder distribution #
#################################################################
set.seed(123)
K <- 5
I <- 10
Q <- sim.Q(K, I)
IQ <- list(
  P0 = runif(I, 0.0, 0.2),
  P1 = runif(I, 0.8, 1.0)
)
example_theta <- rnorm(10, 0, 1)
example_b <- seq(-1.5, 1.5, length.out = K)
data <- sim.data(Q = Q, N = 10, IQ = IQ, model = "GDINA", distribute = "horder",
                 control = list(theta = example_theta, a = 1.5, b = example_b))
#> distribute = horder
#> model = GDINA
#> number of attributes: 5
#> number of items: 10
#> num of examinees: 10
#> average of P0 = 0.116
#> average of P1 = 0.926
#> theta_mean = -0.625 , theta_sd = 0.906
#> a = 1.5
#> b = -1.5 -0.75 0 0.75 1.5
print(data$dat)
#> [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
#> [1,] 0 0 0 0 0 1 0 0 0 0
#> [2,] 1 1 0 1 1 1 1 1 1 1
#> [3,] 0 0 0 0 0 0 0 0 0 0
#> [4,] 0 0 0 0 0 0 0 0 1 0
#> [5,] 1 1 0 0 0 0 0 0 1 1
#> [6,] 1 0 0 1 0 1 0 0 0 1
#> [7,] 0 1 0 1 0 1 0 0 1 1
#> [8,] 0 1 0 0 0 0 0 0 1 0
#> [9,] 0 0 0 0 0 0 0 0 0 0
#> [10,] 0 0 1 0 1 1 1 1 0 1
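Example 3 passes theta, a, and b to the higher-order distribution. The base R sketch below shows one common way such parameters can map to attribute mastery. It is an assumption-laden illustration rather than the package's code: sketch_horder_attributes is a hypothetical helper, and the slope-intercept logistic form, logit P(alpha_k = 1 | theta) = a_k * theta + b_k, is assumed here. The package may use a different parametrization, for example a difficulty form such as a_k * (theta - b_k).

```r
# Hypothetical helper, not part of Qval or GDINA: higher-order attribute draw.
sketch_horder_attributes <- function(theta, a, b) {
  K <- length(b)
  a <- rep_len(a, K)  # recycle a scalar slope across the K attributes
  # Assumed slope-intercept form: logit = a_k * theta_n + b_k.
  logit <- outer(theta, a) + matrix(b, length(theta), K, byrow = TRUE)
  prob <- plogis(logit)
  # Bernoulli draws, filled column-wise to match the layout of prob.
  alpha <- matrix(rbinom(length(prob), size = 1, prob = prob),
                  nrow(prob), K)
  list(prob = prob, alpha = alpha)
}

set.seed(3)
b <- seq(-1.5, 1.5, length.out = 5)
out <- sketch_horder_attributes(theta = c(0, 1, -1), a = 1.5, b = b)
round(out$prob, 3)  # mastery probabilities per examinee and attribute
out$alpha           # sampled 0/1 attribute patterns
```

Under this assumed form, an examinee with theta = 0 has mastery probability plogis(b_k) for attribute k, and higher theta raises every mastery probability when the slopes are positive.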