Multi-Label Learning Based on Conditional Random Fields
Abstract:
Multi-label learning aims to assign each sample a set of one or more labels. In practice, the labels often depend on one another in complex ways, which makes model construction challenging. Recasting multi-label learning as a sequence labeling problem makes it possible to exploit the sequential dependencies among labels and offers a new approach to the task. Within this framework, the conditional random field (CRF) has proved effective because of its strong sequence modeling ability and its probabilistic inference framework. A CRF captures the relationship between input features and labels by modeling their conditional probability, and it models the dependencies among labels through label-to-label transition features. Compared with methods that treat each label independently, a CRF accounts for the interactions between labels, which improves the accuracy and consistency of predictions. With further theoretical study and empirical validation, CRFs are expected to find wider use in multi-label learning and to provide strong support for such tasks.
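The idea of treating a label set as a sequence can be illustrated with a minimal sketch. It assumes a linear chain over three binary labels in a fixed order, with hand-picked unary scores (standing in for feature-based scores of the input) and pairwise transition scores. All names and numbers here are hypothetical and chosen for illustration; this is not necessarily the formulation used in the paper.

```python
import itertools
import math

def chain_score(unary, trans, y):
    """Unnormalised log-score of a binary label vector y under a chain CRF."""
    score = sum(unary[k][y[k]] for k in range(len(y)))
    score += sum(trans[k][y[k]][y[k + 1]] for k in range(len(y) - 1))
    return score

def viterbi(unary, trans):
    """Highest-scoring label vector via max-sum dynamic programming."""
    best = list(unary[0])  # best[s]: best score of a prefix ending in state s
    back = []              # back[k-1][s]: best previous state for state s at step k
    for k in range(1, len(unary)):
        new_best, ptr = [], []
        for s in (0, 1):
            cands = [best[p] + trans[k - 1][p][s] for p in (0, 1)]
            p_star = 0 if cands[0] >= cands[1] else 1
            ptr.append(p_star)
            new_best.append(cands[p_star] + unary[k][s])
        back.append(ptr)
        best = new_best
    s = 0 if best[0] >= best[1] else 1
    y = [s]
    for ptr in reversed(back):  # follow back-pointers from the last label
        s = ptr[s]
        y.append(s)
    return y[::-1]

def log_partition(unary, trans):
    """log Z(x) by brute-force enumeration (only viable for a few labels)."""
    scores = [chain_score(unary, trans, y)
              for y in itertools.product((0, 1), repeat=len(unary))]
    m = max(scores)
    return m + math.log(sum(math.exp(s - m) for s in scores))

# Hypothetical scores: unary[k][v] is the score of label k taking value v
# (in a real model this would be computed from the input features).
unary = [[0.0, 2.0], [0.2, 0.0], [0.0, -0.5]]
# trans[k][a][b] scores label k = a followed by label k+1 = b;
# these values reward neighbouring labels that agree.
trans = [[[0.5, 0.0], [0.0, 1.5]],
         [[0.5, 0.0], [0.0, 1.0]]]

# Baseline that ignores label dependencies: pick each label on its own.
independent = [0 if u[0] >= u[1] else 1 for u in unary]
joint = viterbi(unary, trans)
prob_joint = math.exp(chain_score(unary, trans, joint) - log_partition(unary, trans))

print(independent)  # [1, 0, 0]
print(joint)        # [1, 1, 1]
```

In this toy setting the independent baseline switches on only the first label, while the chain model also switches on the second and third because the transition scores favour agreement with a confidently predicted neighbour. That is the kind of label interaction the abstract refers to.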