%0 Journal Article
%T Constructing Binary Classification Trees with High Intelligibility
具有高可理解性的二分决策树生成算法研究
%A JIANG Yan-Huang
%A YANG Xue-Jun
%A ZHAO Qiang-Li
%A 蒋艳凰
%A 杨学军
%A 赵强利
%J Journal of Software (软件学报)
%D 2003
%X Binarization is the most popular discretization method in decision tree generation. For domains with many continuous attributes, however, it typically yields a large, incomprehensible tree that cannot be expressed as knowledge. To obtain a more intelligible decision tree, this paper presents RCAT, a new discretization algorithm for continuous attributes in the generation of binary classification trees. RCAT uses simple binarization to solve the multisplitting problem by mapping a continuous attribute into a probability attribute based on statistical information. Two pruning methods are introduced to simplify the constructed tree. Empirical results on several domains show that, for two-class problems with a preponderance of continuous attributes, RCAT efficiently generates a much smaller and more intelligible decision tree than binarization while retaining predictive accuracy.
%K machine learning
%K binary classification tree
%K information gain
%K pruning
%K range-splitting based on continuous attributes transform (RCAT) algorithm
%K 机器学习
%K 二分决策树
%K 信息熵增益
%K 剪枝
%K RCAT算法
%U http://www.alljournals.cn/get_abstract_url.aspx?pcid=5B3AB970F71A803DEACDC0559115BFCF0A068CD97DD29835&cid=8240383F08CE46C8B05036380D75B607&jid=7735F413D429542E610B3D6AC0D5EC59&aid=9BBD1EA328285D73&yid=D43C4A19B2EE3C0A&vid=F3583C8E78166B9E&iid=59906B3B2830C2C5&sid=8A15F8B0AA0E5323&eid=2DD7160C83D0ACED&journal_id=1000-9825&journal_name=软件学报&referenced_num=3&reference_num=16