This function extracts the features required by the pre-trained neural networks (DNN or LSTM) for determining the number of factors; see NN.

extractor.feature.NN(
  response,
  model = "DNN",
  cor.type = "pearson",
  use = "pairwise.complete.obs"
)

Arguments

response

A required N × I matrix or data.frame consisting of the responses of N individuals to I items.

model

A character string indicating the model type. Possible values are "DNN" (default) or "LSTM".

cor.type

A character string indicating which correlation coefficient (or covariance) is to be computed. One of "pearson" (default), "kendall", or "spearman". See cor for details.

use

An optional character string giving a method for computing covariances in the presence of missing values. Must be one of "everything", "all.obs", "complete.obs", "na.or.complete", or "pairwise.complete.obs" (default). See cor for details.

Value

A 1×54 matrix containing the features for the DNN, or a 1×20 matrix containing the features for the LSTM.

Details

For "DNN", a total of two types of features (6 kinds, making up 54 features in total) will be extracted, and they are as follows: 1. Clustering-Based Features

(1)

Hierarchical clustering is performed with correlation coefficients as the dissimilarity measure. The top 9 tree-node heights are calculated, and all heights are divided by the maximum height. The heights of the 2nd to 9th nodes are used as features (see EFAhclust).

(2)

Hierarchical clustering is performed with Euclidean distance as the dissimilarity measure. The top 9 tree-node heights are calculated, and all heights are divided by the maximum height. The heights of the 2nd to 9th nodes are used as features (see EFAhclust).

(3)

K-means clustering is applied with the number of clusters ranging from 1 to 9. The within-cluster sums of squares (WSS) for 2 to 9 clusters are divided by the WSS of a single cluster (see EFAkmeans).

These three kinds of features are based on clustering algorithms. The division normalizes the data: raw clustering metrics often carry information unrelated to the number of factors, such as the number of items and the number of respondents, and normalization removes it. Only the 2nd to 9th values are used because only the top F − 1 values are needed to determine F factors, and the first value is always fixed at 1 after division, so it is uninformative. This keeps the model simple.
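Below is a minimal sketch of these three computations in base R, using a hypothetical toy response matrix. The choice of 1 − r as the dissimilarity, the clustering of items (columns) rather than respondents, and the kmeans() settings are illustrative assumptions; EFAhclust and EFAkmeans implement the package's actual procedures.

library(stats)

set.seed(123)
## hypothetical toy data: 200 respondents, 10 items (illustration only)
response <- matrix(sample(1:6, 200 * 10, replace = TRUE), nrow = 200)
I <- ncol(response)

## (1) hierarchical clustering with correlation as dissimilarity
R <- cor(response, method = "pearson", use = "pairwise.complete.obs")
h.cor <- hclust(as.dist(1 - R))           # 1 - r is one common choice
top.cor <- sort(h.cor$height, decreasing = TRUE)[1:9]
rheight <- (top.cor / max(top.cor))[2:9]  # rheight2, ..., rheight9

## (2) hierarchical clustering with Euclidean distance between items
h.euc <- hclust(dist(t(response)))        # items are columns, so transpose
top.euc <- sort(h.euc$height, decreasing = TRUE)[1:9]
eheight <- (top.euc / max(top.euc))[2:9]  # eheight2, ..., eheight9

## (3) k-means WSS for k = 1..9, normalized by the single-cluster WSS
wss <- sapply(1:9, function(k)
  kmeans(t(response), centers = k, nstart = 10)$tot.withinss)
wss.ratio <- (wss / wss[1])[2:9]          # wss2, ..., wss9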

2. Traditional Exploratory Factor Analysis Features (Eigenvalues)

(4)

The top 10 largest eigenvalues.

(5)

The ratios of the top 10 largest eigenvalues to the corresponding reference eigenvalues from the Empirical Kaiser Criterion (EKC; Braeken & van Assen, 2017). See EKC.

(6)

The cumulative variance proportion of the top 10 largest eigenvalues.

Only the top 10 elements are used to simplify the model.
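A minimal sketch of the three eigenvalue-based kinds, reusing response, R, and I from the sketch above. The EKC reference values follow the formula in Braeken and van Assen (2017); the package's EKC function computes them directly.

N <- nrow(response)
eig <- sort(eigen(R, symmetric = TRUE, only.values = TRUE)$values,
            decreasing = TRUE)

## (4) top 10 largest eigenvalues
eigen.value <- eig[1:10]

## (5) ratios of the eigenvalues to their EKC reference values
ref <- numeric(10)
for (j in 1:10) {
  rest <- I - sum(eig[seq_len(j - 1)])  # variance left after the first j - 1 eigenvalues
  ref[j] <- max(rest / (I - j + 1) * (1 + sqrt(I / N))^2, 1)
}
eigen.ref <- eigen.value / ref

## (6) cumulative variance proportions of the top 10 eigenvalues
var.account <- cumsum(eig)[1:10] / I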

For "LSTM", a total of 2 types of features. These features are as follows:

(1)

The top 10 largest eigenvalues.

(2)

The differences between the top 10 largest eigenvalues and the corresponding reference eigenvalues from Parallel Analysis (PA). See PA.
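A minimal sketch of the LSTM features, reusing response, N, I, and eig from the sketches above. Taking the PA reference eigenvalues as the mean eigenvalues of correlation matrices computed from simulated standard-normal data of the same size is an illustrative assumption; the package's PA function implements the actual procedure.

## PA reference eigenvalues: means over simulated datasets
n.sim <- 100
ref.PA <- rowMeans(replicate(n.sim, {
  sim <- matrix(rnorm(N * I), nrow = N)
  sort(eigen(cor(sim), symmetric = TRUE, only.values = TRUE)$values,
       decreasing = TRUE)
}))

## (1) top 10 eigenvalues and (2) their differences from the PA references
features.LSTM <- c(eig[1:10], eig[1:10] - ref.PA[1:10])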

See also

NN, EFAhclust, EFAkmeans, EKC, PA, cor

Author

Haijiang Qin <Haijiang133@outlook.com>

Examples

library(EFAfactors)
set.seed(123)

## Take the data.bfi dataset as an example.
data(data.bfi)

response <- as.matrix(data.bfi[, 1:25]) ## loading data
response <- na.omit(response) ## Remove samples with NA/missing values

## Transform the scores of reverse-scored items to normal scoring
rev.items <- c(1, 9, 10, 11, 12, 22, 25)
response[, rev.items] <- (6 + 1) - response[, rev.items]

## Run the extractor.feature.NN function.
features <- extractor.feature.NN(response, model="DNN")

print(features)
#>      eigen.value1 eigen.value2 eigen.value3 eigen.value4 eigen.value5
#> [1,]     5.134311     2.751887     2.142702     1.852328     1.548163
#>      eigen.value6 eigen.value7 eigen.value8 eigen.value9 eigen.value10
#> [1,]     1.073582    0.8395389    0.7992062    0.7189892     0.6880888
#>      eigen.ref1 eigen.ref2 eigen.ref3 eigen.ref4 eigen.ref5 eigen.ref6
#> [1,]   4.233181   2.741087   2.142702   1.852328   1.548163   1.073582
#>      eigen.ref7 eigen.ref8 eigen.ref9 eigen.ref10 var.account1 var.account2
#> [1,]  0.8395389  0.7992062  0.7189892   0.6880888    0.2053724    0.3154479
#>      var.account3 var.account4 var.account5 var.account6 var.account7
#> [1,]     0.401156    0.4752491    0.5371756    0.5801189    0.6137005
#>      var.account8 var.account9 var.account10  rheight2  rheight3  rheight4
#> [1,]    0.6456687    0.6744283     0.7019518 0.8153791 0.6602762 0.5712842
#>       rheight5  rheight6  rheight7  rheight8  rheight9  eheight2 eheight3
#> [1,] 0.4645855 0.4052522 0.3707678 0.3372263 0.3159913 0.4994727  0.39638
#>       eheight4  eheight5  eheight6  eheight7  eheight8 eheight9      wss2
#> [1,] 0.3596445 0.2752157 0.2400672 0.2196391 0.1997694  0.18719 0.8184622
#>           wss3      wss4     wss5      wss6      wss7      wss8      wss9
#> [1,] 0.7254026 0.6644626 0.584799 0.5419842 0.5027078 0.4682304 0.4274335
#> attr(,"class")
#> [1] "features.DNN"

features <- extractor.feature.NN(response, model="LSTM")

print(features)
#>          [,1]     [,2]     [,3]     [,4]     [,5]     [,6]      [,7]      [,8]
#> [1,] 5.134311 2.751887 2.142702 1.852328 1.548163 1.073582 0.8395389 0.7992062
#>           [,9]     [,10]    [,11]    [,12]    [,13]     [,14]     [,15]
#> [1,] 0.7189892 0.6880888 3.922225 1.570314 0.987676 0.7187979 0.4313707
#>            [,16]      [,17]      [,18]      [,19]      [,20]
#> [1,] -0.02644165 -0.2478548 -0.2725414 -0.3390408 -0.3547972
#> attr(,"class")
#> [1] "features.LSTM"