%0 Journal Article
%T A Statistical Analysis of Textual E-Commerce Reviews Using Tree-Based Methods
%A Jessica Kubrusly
%A Ana Luiza Neves
%A Thamires Louzada Marques
%J Open Journal of Statistics
%P 357-372
%@ 2161-7198
%D 2022
%I Scientific Research Publishing
%R 10.4236/ojs.2022.123023
%X With the increasing interest in e-commerce shopping, customer reviews have become one of the most important elements that determine customer satisfaction regarding products. This demonstrates the importance of working with Text Mining. This study is based on The Women's Clothing E-Commerce Reviews database, which consists of reviews written by real customers. The aim of this paper is to conduct a Text Mining approach on a set of customer reviews. Each review was classified as either a positive or negative review by employing a classification method. Four tree-based methods were applied to solve the classification problem, namely Classification Tree, Random Forest, Gradient Boosting and XGBoost. The dataset was categorized into training and test sets. The results indicate that the Random Forest method displays an overfitting, XGBoost displays an overfitting if the number of trees is too high, Classification Tree is good at detecting negative reviews and bad at detecting positive reviews and the Gradient Boosting shows stable values and quality measures above 77% for the test dataset. A consensus between the applied methods is noted for important classification terms.
%K Text Mining
%K Supervised Classification
%K Tree-Based Methods
%K Classification Trees
%K Random Forest
%K Gradient Boosting
%K XGBoost
%U http://www.scirp.org/journal/PaperInformation.aspx?PaperID=117776