Search Results: 1 - 10 of 21018 matches for " multi-label classification problem "
All listed articles are free for downloading (OA Articles)
Page 1 /21018
Extracting Hierarchies from Data Clusters for Better Classification
German Sapozhnikov,Alexander Ulanov
Algorithms , 2012, DOI: 10.3390/a5040506
Abstract: In this paper we present the PHOCS-2 algorithm, which extracts a “Predicted Hierarchy Of ClassifierS”. The extracted hierarchy helps to improve the performance of flat classification. Nodes in the hierarchy contain classifiers: each intermediate node corresponds to a set of classes, and each leaf node corresponds to a single class. PHOCS-2 makes an estimate for each node, which gives a more precise computation of false positives, true positives and false negatives. The stopping criteria are based on the results of the flat classification. The proposed algorithm is validated on nine datasets.
Semantic Similarity over Gene Ontology for Multi-Label Protein Subcellular Localization
Shibiao Wan, Man-Wai Mak, Sun-Yuan Kung
Engineering (ENG) , 2013, DOI: 10.4236/eng.2013.510B014
Abstract: As one of the essential topics in proteomics and molecular biology, protein subcellular localization has been extensively studied in previous decades. However, most methods are limited to predicting single-location proteins. In many studies, multi-location proteins are either not considered or assumed not to exist. This paper proposes a novel multi-label subcellular-localization predictor based on the semantic similarity between Gene Ontology (GO) terms. Given a protein, the accession numbers of its homologs are obtained via a BLAST search. These homologous accession numbers are then used as keys to search the gene ontology annotation database and obtain a set of GO terms. The semantic similarity between GO terms is used to build semantic similarity vectors for classification. A support vector machine (SVM) classifier with a new decision scheme is proposed to classify the multi-label GO semantic similarity vectors. Experimental results show that the proposed multi-label predictor significantly outperforms state-of-the-art predictors such as iLoc-Plant and Plant-mPLoc.
Cost-sensitive AdaBoost Algorithm for Multi-class Classification Problems (多分类问题代价敏感AdaBoost算法)
FU Zhong-Liang (付忠良)
Acta Automatica Sinica (自动化学报), 2011
Abstract: Reducing multi-class cost-sensitive classification to two-class cost-sensitive classification requires merging costs. To avoid this cost-merging problem, a cost-sensitive AdaBoost algorithm that can be applied directly to multi-class classification is constructed. The proposed algorithm resembles the real AdaBoost algorithm in its algorithm flow and error estimation formula. When all costs are equal, it becomes a new real AdaBoost algorithm for multi-class classification, which guarantees that the training error of the combined classifier can be reduced as the number of trained classifiers increases. The new real AdaBoost algorithm does not require the classifiers to be independent; the independence condition can be derived from the new algorithm rather than being a prerequisite, as it is for the current real AdaBoost algorithm for multi-class classification. Experimental results show that the new algorithm always steers the classification result toward the class with the smallest cost, whereas existing multi-class cost-sensitive learning algorithms may fail when the costs of misclassification into other classes are imbalanced but the average cost of every class is equal. The research method also suggests a way to construct new ensemble learning algorithms, and an AdaBoost algorithm for multi-label classification is given that is easy to operate and approximately achieves the smallest classification error rate.
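The idea that a cost-sensitive classifier should lean toward the class with the smallest cost can be illustrated with a plain expected-cost decision rule. This is a minimal sketch and not the paper's AdaBoost construction: the function name, the probability vector and both cost matrices are hypothetical, chosen only to show how an imbalanced cost matrix changes the decision.

```python
from typing import Sequence


def min_expected_cost_class(probs: Sequence[float],
                            cost: Sequence[Sequence[float]]) -> int:
    """Return the class index with the smallest expected misclassification cost.

    probs[i]   : estimated probability that the true class is i
    cost[i][j] : cost of predicting class j when the true class is i
    """
    n_classes = len(probs)
    expected = [
        sum(probs[i] * cost[i][j] for i in range(n_classes))
        for j in range(n_classes)
    ]
    return min(range(n_classes), key=expected.__getitem__)


# Hypothetical three-class example: missing class 2 is ten times as costly.
probs = [0.5, 0.3, 0.2]
equal_cost = [[0, 1, 1], [1, 0, 1], [1, 1, 0]]
skewed_cost = [[0, 1, 1], [1, 0, 1], [10, 10, 0]]

print(min_expected_cost_class(probs, equal_cost))   # 0: the most probable class
print(min_expected_cost_class(probs, skewed_cost))  # 2: the costly-to-miss class
```

With equal costs the rule reduces to picking the most probable class; with the skewed matrix it prefers class 2 even though that class is the least probable.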
Improved Algorithms for Multi-label AdaBoost (多标签AdaBoost算法的改进算法)
Fu Zhongliang (付忠良), Zhang Danpu (张丹普), Wang Lili (王莉莉)
- , 2015, DOI: 10.15961/j.jsuese.2015.05.015
Abstract: To reduce the learning error of the multi-label AdaBoost family of algorithms as far as possible, two improvement strategies are proposed, and improved multi-label AdaBoost algorithms are constructed from them. The first strategy modifies how the algorithm adjusts the sample distribution, breaking the uniformity of the sample distribution in the existing AdaBoost algorithm, so that every added weak classifier lowers the upper-bound estimate of the learning error. The second strategy takes into account, when training the current weak classifier, the effect that subsequent weak classifiers will have on the learning error; existing algorithms consider only the current weak classifier and ignore the later ones entirely. In theory, every weak classifier added to the improved multi-label AdaBoost algorithms further reduces the learning error. Both theoretical analysis and experimental results show that the proposed algorithms are an improvement.
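For orientation, the baseline that such improvements start from can be sketched as one round of the standard AdaBoost reweighting step, applied here to (sample, label) pairs as multi-label variants commonly do. This is textbook AdaBoost under assumed inputs, not the improved strategies of the paper; the function name and the toy data are hypothetical.

```python
import math
from typing import List, Sequence, Tuple


def adaboost_round(weights: Sequence[float],
                   correct: Sequence[bool]) -> Tuple[float, List[float], float]:
    """One standard AdaBoost reweighting step over (sample, label) pairs.

    weights : current distribution over the pairs, summing to 1
    correct : True where the weak classifier decided that pair correctly
    Returns the classifier weight alpha, the new distribution, and the
    normaliser z, whose product over all rounds bounds the training error.
    """
    error = sum(w for w, ok in zip(weights, correct) if not ok)
    alpha = 0.5 * math.log((1 - error) / error)  # requires 0 < error < 1
    updated = [w * math.exp(-alpha if ok else alpha)
               for w, ok in zip(weights, correct)]
    z = sum(updated)
    return alpha, [w / z for w in updated], z


# Hypothetical round: four equally weighted pairs, the last one misclassified.
alpha, new_weights, z = adaboost_round([0.25] * 4, [True, True, True, False])
print(alpha, new_weights, z)
```

After the update the misclassified pairs carry half of the total weight, and z is below 1 whenever the weak classifier is better than chance, which is why each added weak classifier lowers the error bound in the standard analysis.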
Novel Apriori-Based Multi-Label Learning Algorithm by Exploiting Coupled Label Relationship
Zhenwu Wang,Longbing Cao
- , 2017, DOI: 10.15918/j.jbit1004-0579.201726.0209
Abstract: Exploiting the label coupling relationship is a key challenge in multi-label classification (MLC) problems. Most previous work has focused on pairwise label relations, and generally uses only global statistical information to analyze the coupled label relationship. In this work, Bayesian and hypothesis testing methods are first applied to predict the label set size of each test sample from its k nearest neighbors, which combines global and local statistical information. The Apriori algorithm is then used to mine the coupling relationship among multiple labels rather than pairs of labels, which exploits the label coupling relations more accurately and comprehensively. Experimental results on text, biology and audio datasets show that, compared with the state-of-the-art algorithm, the proposed algorithm obtains better performance on 5 common criteria.
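The step of mining couplings among several labels at once, rather than only pairs, can be sketched with a small level-wise Apriori search over the label sets of the training samples. This is a generic Apriori sketch under assumed inputs, not the authors' implementation; the function name, the toy label sets and the support threshold are hypothetical.

```python
from itertools import combinations
from typing import Dict, FrozenSet, List, Set


def frequent_label_sets(label_sets: List[Set[str]],
                        min_support: int) -> Dict[FrozenSet[str], int]:
    """Find label combinations that co-occur in at least `min_support` samples."""
    def support(candidate: FrozenSet[str]) -> int:
        return sum(1 for labels in label_sets if candidate <= labels)

    # Level 1: frequent single labels.
    singles = {frozenset([label]) for labels in label_sets for label in labels}
    current = {c: support(c) for c in singles}
    current = {c: s for c, s in current.items() if s >= min_support}
    frequent = dict(current)

    k = 2
    while current:
        # Join step: merge frequent (k-1)-sets into size-k candidates.
        candidates = {a | b for a in current for b in current if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must itself be frequent.
        candidates = {
            c for c in candidates
            if all(frozenset(sub) in current for sub in combinations(c, k - 1))
        }
        current = {c: support(c) for c in candidates}
        current = {c: s for c, s in current.items() if s >= min_support}
        frequent.update(current)
        k += 1
    return frequent


# Hypothetical training label sets.
train = [{"sports", "news"}, {"sports", "news", "video"},
         {"news", "video"}, {"sports", "news"}, {"music"}]
for labels, count in sorted(frequent_label_sets(train, 2).items(),
                            key=lambda item: (len(item[0]), sorted(item[0]))):
    print(sorted(labels), count)
```

On this toy data the pair sports/news co-occurs three times and news/video twice, while sports/video falls below the threshold, so the three-label set is pruned without being counted.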
Towards Multi Label Text Classification through Label Propagation
Shweta C. Dharmadhikari,Maya Ingle,Parag Kulkarni
International Journal of Advanced Computer Sciences and Applications , 2012,
Abstract: Classifying text data has been an active area of research for a long time. A text document is a multifaceted object and is often inherently ambiguous. Multi-label learning deals with such ambiguous objects. This ambiguity makes it difficult for a classifier to assign the relevant classes to an input document, and traditional single-label and multi-class text classification paradigms cannot efficiently classify such a multifaceted text corpus. In this paper we propose a novel label propagation approach, based on semi-supervised learning, for multi-label text classification. The proposed approach models the relationship between class labels and also represents the input text documents effectively. Semi-supervised learning is used to make effective use of both labeled and unlabeled data for classification. The proposed approach promises better classification accuracy and better handling of complexity, and is evaluated on standard datasets such as Enron, Slashdot and Bibtex.
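Graph-based label propagation, the general technique the abstract above builds on, can be sketched as repeatedly mixing each document's label scores with those of its neighbors while pulling labeled documents back toward their known labels. This is a generic sketch, not the authors' model: the similarity matrix, the mixing factor `alpha` and the iteration count are assumptions.

```python
from typing import List


def propagate_labels(weights: List[List[float]],
                     seed: List[List[float]],
                     alpha: float = 0.8,
                     iterations: int = 50) -> List[List[float]]:
    """Spread label scores over a document similarity graph.

    weights[i][j] : similarity between documents i and j (0 if unconnected)
    seed[i][k]    : 1.0 if document i is known to carry label k, else 0.0
    """
    n, n_labels = len(weights), len(seed[0])
    # Row-normalise the similarities into transition probabilities.
    trans = []
    for row in weights:
        total = sum(row)
        trans.append([w / total if total else 0.0 for w in row])

    scores = [row[:] for row in seed]
    for _ in range(iterations):
        scores = [
            [alpha * sum(trans[i][j] * scores[j][k] for j in range(n))
             + (1 - alpha) * seed[i][k]
             for k in range(n_labels)]
            for i in range(n)
        ]
    return scores


# Hypothetical chain of four documents: 0 and 3 are labeled, 1 and 2 are not.
weights = [[0.0, 1.0, 0.0, 0.0],
           [1.0, 0.0, 0.5, 0.0],
           [0.0, 0.5, 0.0, 1.0],
           [0.0, 0.0, 1.0, 0.0]]
seed = [[1.0, 0.0], [0.0, 0.0], [0.0, 0.0], [0.0, 1.0]]
for doc, row in enumerate(propagate_labels(weights, seed)):
    print(doc, [round(score, 3) for score in row])
```

Each unlabeled document ends up with a score for every label, so thresholding the scores gives it a label set rather than a single class; in this toy graph document 1 scores highest on the label of document 0 and document 2 on the label of document 3.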
Multi-label Data Mining: A Survey (多标签数据挖掘技术:研究综述)
LI Si-nan (李思男), LI Ning (李宁), LI Zhan-huai (李战怀)
Computer Science (计算机科学), 2013
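Abstract: In traditional single-label data mining, each sample belongs to exactly one class label, but in real applications a sample often has several attributes at once, that is, it is multi-label data. Multi-label data mining has become a research hotspot within data mining, and its results are widely applied in fields such as semantic annotation of images and video, functional genomics, music emotion classification and marketing guidance. This survey gives a systematic and detailed account of multi-label data mining from two angles, its methods and its evaluation measures, then summarizes the open problems and challenges in current research and discusses the development trends of the field.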
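Two evaluation measures that are standard in the multi-label literature, Hamming loss and subset accuracy, can be sketched as follows. The abstract above does not list which measures the survey covers, so these two are chosen as common examples; the function names and the toy predictions are hypothetical.

```python
from typing import List, Set


def hamming_loss(true_sets: List[Set[str]],
                 pred_sets: List[Set[str]],
                 n_labels: int) -> float:
    """Fraction of (sample, label) decisions that are wrong."""
    # The symmetric difference holds the labels that were missed or wrongly added.
    wrong = sum(len(t ^ p) for t, p in zip(true_sets, pred_sets))
    return wrong / (len(true_sets) * n_labels)


def subset_accuracy(true_sets: List[Set[str]],
                    pred_sets: List[Set[str]]) -> float:
    """Fraction of samples whose predicted label set matches exactly."""
    exact = sum(1 for t, p in zip(true_sets, pred_sets) if t == p)
    return exact / len(true_sets)


# Hypothetical predictions over a label space of three labels: a, b, c.
y_true = [{"a", "b"}, {"c"}, {"a"}, set()]
y_pred = [{"a"}, {"c"}, {"a", "c"}, set()]
print(hamming_loss(y_true, y_pred, n_labels=3))
print(subset_accuracy(y_true, y_pred))
```

Hamming loss gives partial credit for a nearly correct label set, while subset accuracy counts a sample as correct only when every label matches, so the two often rank classifiers differently.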
Modified KNN algorithm for multi-label learning (用于多标记学习的K近邻改进算法)
ZHANG Shun (张顺), ZHANG Hua-xiang (张化祥)
Application Research of Computers (计算机应用研究), 2011
Abstract: ML-KNN is an approach that employs KNN to solve multi-label problems, but it suffers from high time complexity and low classification accuracy. This paper proposes a modified algorithm, WML-KNN, to address these problems. It combines data sampling and weighting in one approach, which reduces the time complexity and improves the classification accuracy on minority-class data. Experimental results show that WML-KNN works better than other commonly used multi-label algorithms.
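The basic idea of using nearest neighbors for multi-label prediction can be sketched with a simple majority vote per label. This is deliberately simpler than ML-KNN, which estimates per-label posterior probabilities, and it includes neither the sampling nor the weighting of WML-KNN; the function name, the feature vectors and the labels are hypothetical.

```python
import math
from typing import Dict, List, Sequence, Set


def knn_multilabel_predict(train_x: List[Sequence[float]],
                           train_labels: List[Set[str]],
                           query: Sequence[float],
                           k: int = 3) -> Set[str]:
    """Assign every label carried by more than half of the k nearest neighbors."""
    by_distance = sorted(range(len(train_x)),
                         key=lambda i: math.dist(train_x[i], query))
    votes: Dict[str, int] = {}
    for i in by_distance[:k]:
        for label in train_labels[i]:
            votes[label] = votes.get(label, 0) + 1
    return {label for label, count in votes.items() if count > k / 2}


# Hypothetical two-dimensional training data with label sets.
train_x = [(0, 0), (0, 1), (1, 0), (5, 5), (5, 6)]
train_labels = [{"red", "round"}, {"red"}, {"red", "round"},
                {"blue"}, {"blue", "round"}]
print(sorted(knn_multilabel_predict(train_x, train_labels, (0.2, 0.2))))
print(sorted(knn_multilabel_predict(train_x, train_labels, (5.0, 5.4))))
```

Every query scans the whole training set, which is the kind of time cost that sampling-based modifications aim to reduce.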
A Novel Multi label Text Classification Model using Semi supervised learning
Shweta C. Dharmadhikari,Maya Ingle,Parag Kulkarni
International Journal of Data Mining & Knowledge Management Process , 2012,
Abstract: Automatic text categorization (ATC) is a prominent research area within information retrieval. This paper discusses a classification model for ATC in the multi-label domain. We propose a new multi-label text classification (MLTC) model that assigns a more relevant set of categories to every input text document. Our model is strongly influenced by the graph-based framework and by semi-supervised learning. We demonstrate its effectiveness on the Enron, Slashdot, Bibtex and RCV1 datasets. Our experimental results indicate that using semi-supervised learning in MLTC greatly improves the decision-making capability of the classifier.
Fast Classification of Meat Spoilage Markers Using Nanostructured ZnO Thin Films and Unsupervised Feature Learning
Martin Längkvist, Silvia Coradeschi, Amy Loutfi, John Bosco Balaguru Rayappan
Sensors , 2013, DOI: 10.3390/s130201578
Abstract: This paper investigates a rapid and accurate detection system for spoilage in meat. We use unsupervised feature learning techniques (stacked restricted Boltzmann machines and auto-encoders) that consider only the transient response from undoped zinc oxide, manganese-doped zinc oxide, and fluorine-doped zinc oxide in order to classify three categories: the type of thin film that is used, the type of gas, and the approximate ppm level of the gas. The main advantage of these models is that features are learned from data instead of being hand-designed. We compare our results to a feature-based approach using samples with various ppm levels of ethanol and trimethylamine (TMA), which are good markers for meat spoilage. The result is that deep networks give a better and faster classification than the feature-based approach, and we thus conclude that fine-tuning our deep models is more efficient for this kind of multi-label classification task.
Copyright © 2008-2017 Open Access Library. All rights reserved.