The development of the Web has drastically changed human interaction and communication, leading to exponential growth in the data that users generate across digital media. This mass of data offers opportunities to understand people’s opinions about products, services, processes, events, political movements, and organizational strategies. In this context, it is important for companies to be able to assess customer satisfaction with their products or services. One way to evaluate customer sentiment is Sentiment Analysis, also known as Opinion Mining. This research compares the efficiency of an automatic dictionary-based classifier with classification by human jurors on a set of customer comments written in Portuguese. The data consist of opinions from service users of one of the largest Brazilian online employment agencies. The classification models were evaluated using the Kappa index and a confusion matrix. The main finding is that agreement between the classifier and the human jurors was moderate, with better performance for the dictionary-based classifier. This result was considered satisfactory, given that Sentiment Analysis in Portuguese is a complex task that demands further research and development.
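As an illustration of the evaluation described above, the following minimal sketch builds a confusion matrix from two sets of labels and derives Cohen's Kappa from it. The label set, the function names, and the ten example labels are hypothetical and are not taken from the study's data. The interpretation bands follow the widely used Landis and Koch scale, which is assumed here to be the scale behind the word "moderate".

```python
# Hypothetical three-class label set; the study's actual classes may differ.
LABELS = ["negative", "neutral", "positive"]


def confusion_matrix(human, machine, labels=LABELS):
    """Rows are the human jurors' labels, columns are the classifier's labels."""
    index = {label: i for i, label in enumerate(labels)}
    matrix = [[0] * len(labels) for _ in labels]
    for h, m in zip(human, machine):
        matrix[index[h]][index[m]] += 1
    return matrix


def cohen_kappa(matrix):
    """Cohen's Kappa: (observed - expected agreement) / (1 - expected agreement)."""
    total = sum(sum(row) for row in matrix)
    observed = sum(matrix[i][i] for i in range(len(matrix))) / total
    row_totals = [sum(row) for row in matrix]
    col_totals = [sum(col) for col in zip(*matrix)]
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / total ** 2
    if expected == 1:
        # Both raters used a single identical class; Kappa is undefined,
        # so this sketch returns 1.0 by convention.
        return 1.0
    return (observed - expected) / (1 - expected)


def landis_koch(kappa):
    """Map a Kappa value to the Landis and Koch (1977) agreement bands."""
    if kappa < 0:
        return "poor"
    if kappa <= 0.20:
        return "slight"
    if kappa <= 0.40:
        return "fair"
    if kappa <= 0.60:
        return "moderate"
    if kappa <= 0.80:
        return "substantial"
    return "almost perfect"


# Invented example labels for ten comments (not the study's data).
human = ["positive", "positive", "positive", "positive", "negative",
         "negative", "negative", "neutral", "neutral", "neutral"]
machine = ["positive", "positive", "positive", "neutral", "negative",
           "negative", "positive", "neutral", "neutral", "negative"]

matrix = confusion_matrix(human, machine)
kappa = cohen_kappa(matrix)
print(matrix)
print(round(kappa, 3), landis_koch(kappa))
```

In this invented example the two raters agree on 7 of 10 comments, while agreement expected by chance is 0.34, so Kappa is lower than the raw agreement rate. This correction for chance is why Kappa is generally preferred to simple accuracy when comparing a classifier with human judges.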