LbrBart: A Text Summarizer for Liberia News Outlets

DOI: 10.4236/oalib.1112923, PP. 1-15

Subject Areas: Online Social Network Computing, Information Management, Sociology

Keywords: LbrBart, Pseudo-Summarization, Text Summarization, Low-Resource, Abstractive Summarization


Abstract

The deluge of digital information in Liberia necessitates efficient content consumption. LbrBart, a proposed text summarization tool, is designed specifically for Liberian news outlets, including the Liberian Observer, FrontPage Africa, AllAfrica, The Inquirer, and the Liberia News Agency (LINA). Combining pseudo-summarization, centrality-based sentence recovery, and low-resource abstractive summarization built on the BART model, LbrBart aims to improve the readability and accessibility of Liberian news content. This study explores how these methodologies can be adapted to the linguistic and cultural nuances of Liberian writing. Using established evaluation metrics, we assess LbrBart against state-of-the-art summarization baselines, verifying that the core meaning of each article is preserved while reading time is substantially reduced; in these experiments, LbrBart outperforms the baselines. By addressing information overload and improving information dissemination in Liberia, LbrBart benefits individual readers and supports broader public engagement and informed citizenship. This research is a step toward natural language processing tailored to the specific needs of local news consumers in emerging markets.
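
This page does not reproduce the paper's implementation, but the pipeline named above can be made concrete with a rough sketch. The Python example below scores sentences by similarity to a TF-IDF document centroid (one common form of centrality-based sentence recovery) and then rewrites the recovered sentences with a pretrained BART checkpoint through Hugging Face Transformers. The checkpoint (facebook/bart-large-cnn), the TF-IDF scoring, and the top-k sentence budget are illustrative assumptions, not the authors' actual configuration.

# Hypothetical sketch of a two-stage pipeline: centrality-based sentence
# recovery followed by abstractive compression with BART. The scoring
# scheme and model checkpoint are assumptions, not taken from the paper.
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from transformers import pipeline

def recover_central_sentences(sentences, k=5):
    # Embed sentences as TF-IDF vectors and compute the document centroid.
    vectorizer = TfidfVectorizer(stop_words="english")
    matrix = vectorizer.fit_transform(sentences)            # (n_sents, vocab)
    centroid = np.asarray(matrix.mean(axis=0)).ravel()      # (vocab,)
    # Cosine similarity of each sentence vector to the centroid.
    dots = matrix @ centroid
    norms = np.sqrt(matrix.multiply(matrix).sum(axis=1)).A1
    sims = dots / (norms * np.linalg.norm(centroid) + 1e-12)
    # Keep the k most central sentences, restored to document order.
    return [sentences[i] for i in sorted(np.argsort(sims)[-k:])]

def summarize(article_sentences, max_len=130):
    # Stage 1: build a pseudo-summary from the most central sentences.
    pseudo_summary = " ".join(recover_central_sentences(article_sentences))
    # Stage 2: rewrite the pseudo-summary abstractively with BART.
    summarizer = pipeline("summarization", model="facebook/bart-large-cnn")
    result = summarizer(pseudo_summary, max_length=max_len, min_length=30,
                        do_sample=False, truncation=True)
    return result[0]["summary_text"]

Output from such a pipeline would normally be scored against reference summaries with ROUGE, the standard metric family for summarization benchmarks, though the abstract does not name the specific metrics used.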

Cite this paper

Mangou, M.T., Wu, C. and Sangary, O. (2025) LbrBart: A Text Summarizer for Liberia News Outlets. Open Access Library Journal, 12, e2923. doi: https://doi.org/10.4236/oalib.1112923
