|
- 2018
Application of the Spectral Clustering Algorithm to Diagnostic Assessment under Different Attribute Hierarchical Structures
Abstract: Cluster analysis has been applied successfully in cognitive diagnostic assessment (CDA). The most widely used cluster analysis method is the K-means algorithm, and previous research has shown that K-means clusters well in CDA. Because the spectral clustering algorithm usually classifies better than K-means, this study introduces the spectral clustering algorithm into CDA and examines how the attribute hierarchical structure, the number of attributes, the sample size, and the slip rate affect the method. The findings are: (1) the spectral clustering algorithm yields better clustering results than K-means and is more robust, especially under harsher experimental conditions; (2) the linear structure gives the best clustering results, the convergent and divergent structures perform similarly, and the independent structure performs relatively poorly; (3) clustering performance declines as the number of attributes and the slip rate increase; (4) clustering performance improves somewhat as the sample size increases, although the K-means method sometimes shows the opposite result.
Abstract: Cluster analysis is an important approach in cognitive diagnostic assessment for classifying examinees into categories that match their attribute profiles, which indicate mastery or nonmastery of each attribute. These methods are nonparametric techniques that do not require parameter estimation, and they are less restrictive and often computationally more efficient than parametric techniques such as cognitive diagnostic models. In addition, many nonparametric classification algorithms can be implemented easily in most statistical software packages, such as R or MATLAB. K-means is the most classical of the cluster analysis algorithms and is widely applied in practice. K-means cluster analysis for cognitive diagnostic assessment requires only the Q-matrix, which describes the relationship between attributes and items. A previous study has shown that the K-means algorithm classifies examinees fairly well in cognitive diagnostic assessment compared with cognitive diagnostic models. However, the spectral clustering algorithm (SCA), a powerful clustering algorithm, has been applied broadly in many fields, including image segmentation, neural information processing, biology, and large-scale assessment in psychology. The SCA is easy to operate and often outperforms traditional clustering algorithms such as K-means. In this article, we introduce the SCA for classifying examinees into attribute-homogeneous groups based on their responses.
However, the starting values have a large effect on the classification performance of both the SCA and the K-means algorithm. We therefore adopted Ward’s and random starting values for the SCA, and best, Ward’s, and random starting values for the K-means algorithm. In total, five methods were considered in this article: SCA-Ward’s, SCA-R, K-means-best, K-means-Ward’s, and K-means-R. Simulation studies were conducted to compare the classification performance of the SCA and the K-means algorithm using two indices, agreement between partitions and within-cluster homogeneity, under four factors: the attribute hierarchical structure (linear, convergent, divergent, or independent), the number of
|
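The abstract evaluates clustering by "agreement between partitions" but does not name the exact statistic. A common choice for this kind of comparison is the adjusted Rand index (ARI), so the sketch below assumes ARI; the function name `adjusted_rand_index` and the toy label vectors are illustrative and do not come from the study. It compares a clustering result with the true attribute-profile grouping and is invariant to how the clusters happen to be numbered.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(labels_a, labels_b):
    """Chance-corrected agreement between two partitions of the same examinees.

    Assumption: ARI stands in for the abstract's "agreement between
    partitions" index; the abstract itself does not specify the formula.
    """
    if len(labels_a) != len(labels_b):
        raise ValueError("both partitions must label the same examinees")
    n = len(labels_a)
    if n < 2:
        return 1.0  # no pairs to disagree on

    # Contingency counts: how many examinees fall in each (cluster_a, cluster_b) cell.
    cell_counts = Counter(zip(labels_a, labels_b))
    a_counts = Counter(labels_a)
    b_counts = Counter(labels_b)

    # Number of examinee pairs placed together by both, by A, and by B.
    together_both = sum(comb(c, 2) for c in cell_counts.values())
    together_a = sum(comb(c, 2) for c in a_counts.values())
    together_b = sum(comb(c, 2) for c in b_counts.values())

    expected = together_a * together_b / comb(n, 2)  # agreement expected by chance
    maximum = (together_a + together_b) / 2          # largest attainable agreement
    if maximum == expected:
        return 1.0  # degenerate case, e.g. both partitions have a single cluster
    return (together_both - expected) / (maximum - expected)


# Hypothetical example: true latent classes versus a clustering that
# recovers the same groups under different cluster numbers.
true_classes = [0, 0, 1, 1, 2, 2]
relabelled = [2, 2, 0, 0, 1, 1]
print(adjusted_rand_index(true_classes, relabelled))
```

Values near 1 indicate that the clustering reproduces the reference grouping, values near 0 indicate chance-level agreement, and negative values indicate agreement below chance.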