In this study, we used Python as the programming platform to develop an automatic Bengali document summarizer. English has ample tools for processing documents and producing summaries. However, no such tool is
specifically applicable to Bengali, because Bengali has a great deal of ambiguity and
differs from English in its grammar. Nevertheless,
the language holds an important place, as it is spoken by 26 crore (260 million)
people all over the world. We have therefore adopted a new method to summarize Bengali documents.
The proposed system has been designed in
the following stages: pre-processing of the input document,
word tagging, pronoun replacement, sentence ranking, and summary generation. Pronoun replacement has been used to
reduce the incidence of dangling
pronouns in the output summary. We ranked sentences based on
sentence frequency, numerical figures, and pronoun replacement. We also checked the
similarity between pairs of sentences and excluded one sentence of each similar pair to reduce duplication. We took 3000 data samples from newspaper and book documents as input
and learned words that are appropriate to the syntax. In addition, to evaluate the performance of the
designed summarizer, the system
was tested on different documents. According to the assessment method, the recall, precision, and F-score were 0.70, 0.82, and 0.74, respectively.
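
The ranking, duplicate-removal, and evaluation stages described above can be illustrated with a minimal Python sketch. It is not the system's actual implementation: the abstract does not give the scoring formulas, so the frequency score, the bonus for sentences containing numerals (`numeric_bonus`), the Jaccard similarity measure, the threshold `max_sim`, and all function names are assumptions made for illustration. Word tagging and pronoun replacement are omitted.

```python
import re
from collections import Counter


def split_sentences(text):
    # Bengali sentences usually end with the danda "।"; "?" and "!" also occur.
    return [s.strip() for s in re.findall(r"[^।?!]+[।?!]?", text) if s.strip()]


def tokenize(sentence):
    # \w alone splits Bengali words at vowel signs, so the Bengali block
    # (U+0980 to U+09FF) is added explicitly.
    return re.findall(r"[\w\u0980-\u09FF]+", sentence.lower())


def score_sentences(sentences, numeric_bonus=1.0):
    # Assumed scoring: mean corpus frequency of the sentence's words,
    # plus a fixed bonus if the sentence contains any digit.
    tokens = [tokenize(s) for s in sentences]
    freq = Counter(w for t in tokens for w in t)
    scores = []
    for t in tokens:
        if not t:
            scores.append(0.0)
            continue
        base = sum(freq[w] for w in t) / len(t)
        has_number = any(re.search(r"\d", w) for w in t)
        scores.append(base + (numeric_bonus if has_number else 0.0))
    return scores


def jaccard(a, b):
    # Word-overlap similarity between two sentences (an assumed measure).
    sa, sb = set(tokenize(a)), set(tokenize(b))
    if not sa or not sb:
        return 0.0
    return len(sa & sb) / len(sa | sb)


def summarize(text, k=2, max_sim=0.6):
    # Pick up to k top-scoring sentences, skipping any sentence that is too
    # similar to one already chosen, then restore document order.
    sentences = split_sentences(text)
    scores = score_sentences(sentences)
    order = sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)
    chosen = []
    for i in order:
        if len(chosen) == k:
            break
        if all(jaccard(sentences[i], sentences[j]) <= max_sim for j in chosen):
            chosen.append(i)
    return [sentences[i] for i in sorted(chosen)]


def precision_recall_f1(selected, reference):
    # Sentence-level overlap between a system summary and a reference summary.
    sel, ref = set(selected), set(reference)
    tp = len(sel & ref)
    precision = tp / len(sel) if sel else 0.0
    recall = tp / len(ref) if ref else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1
```

Calling `summarize(document_text, k=5)` would return up to five sentences in their original order, and `precision_recall_f1` compares such a selection against a human-written reference selection. When scores are averaged over many documents, the mean F-score need not equal the harmonic mean of the mean precision and mean recall.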