Algorithms, 2012

Better Metrics to Automatically Predict the Quality of a Text Summary

DOI: 10.3390/a5040398

Keywords: multi-document summarization, update summarization, evaluation, computational linguistics, text processing


Abstract:

In this paper we demonstrate a family of metrics for estimating the quality of a text summary relative to one or more human-generated summaries. The improved metrics are based on features automatically computed from the summaries to measure content and linguistic quality. The features are combined using one of three methods—robust regression, non-negative least squares, or canonical correlation, an eigenvalue method. The new metrics significantly outperform the previous standard for automatic text summarization evaluation, ROUGE.
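The combination step lends itself to a short illustration. Below is a minimal sketch of the non-negative least squares variant, one of the three combination methods named in the abstract, written in Python; the feature layout, feature values, and human scores are hypothetical placeholders, not the paper's actual feature set.

# A sketch of combining summary features via non-negative least squares
# (one of the three combination methods named in the abstract).
# All data below are illustrative placeholders.
import numpy as np
from scipy.optimize import nnls

# Each row: automatically computed features for one system summary,
# e.g. a ROUGE-style overlap score, a content-coverage feature, and a
# linguistic-quality proxy (hypothetical values).
X = np.array([
    [0.12, 0.34, 0.71],
    [0.08, 0.29, 0.55],
    [0.15, 0.41, 0.80],
    [0.05, 0.22, 0.43],
])

# Human judgments (e.g. overall responsiveness) for the same summaries.
y = np.array([3.2, 2.5, 3.9, 1.8])

# Solve min ||Xw - y||_2 subject to w >= 0, so the learned metric is a
# non-negative weighted sum of the features.
w, residual = nnls(X, y)

# Score a new summary from its feature vector.
x_new = np.array([0.10, 0.33, 0.66])
print("weights:", w)
print("predicted quality:", float(x_new @ w))

The non-negativity constraint keeps the learned metric interpretable: each feature can only add evidence of quality, never subtract it.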

References

[1]  Luhn, H.P. The Automatic Creation of Literature Abstracts. In Advances in Automatic Text Summarization; The MIT Press: Cambridge, MA, USA, 1999; pp. 58–63.
[2]  McKeown, K.; Radev, D.R. Generating Summaries of Multiple News Articles. In Proceedings of the 18th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR '95); ACM: New York, NY, USA, 1995; pp. 74–82.
[3]  Text Analysis Conference, NIST, 2011. Available online: http://www.nist.gov/tac (accessed on 19 September 2012).
[4]  Lin, C.Y. ROUGE: A Package for Automatic Evaluation of Summaries. In Proceedings of the ACL-04 Workshop: Text Summarization Branches Out, Barcelona, Spain, 22–24 July 2004; pp. 74–81.
[5]  Conroy, J.M.; Dang, H.T. Mind the Gap: Dangers of Divorcing Evaluations of Summary Content from Linguistic Quality. In Proceedings of the 22nd International Conference on Computational Linguistics (Coling 2008), Stroudsburg, PA, USA, 18–22 August 2008; pp. 145–152.
[6]  Conroy, J.M.; Schlesinger, J.D.; O’Leary, D.P. Nouveau-ROUGE: A novelty metric for update summarization. Comput. Linguist. 2011, 37, 1–8, doi:10.1162/coli_a_00033.
[7]  De Oliveira, P.C.F.; Torrens, E.W.; Cidral, A.; Schossland, S.; Bittencourt, E. Evaluating Summaries Automatically: A System Proposal. In Proceedings of the Sixth International Conference on Language Resources and Evaluation (LREC'08); European Language Resources Association (ELRA): Marrakech, Morocco, 28–30 May 2008. Available online: http://www.lrec-conf.org/proceedings/lrec2008/ (accessed on 19 September 2012).
[8]  Giannakopoulos, G.; Karkaletsis, V.; Vouros, G.A.; Stamatopoulos, P. Summarization system evaluation revisited: N-gram graphs. ACM Trans. Speech Lang. Process. 2008, 5, 1–39.
[9]  Giannakopoulos, G.; Vouros, G.A.; Karkaletsis, V. MUDOS-NG: Multi-Document Summaries Using N-gram Graphs. Technical Report, 2010; arXiv:1012.2042. Available online: http://arxiv.org/abs/1012.2042 (accessed on 19 September 2012).
[10]  Giannakopoulos, G.; Karkaletsis, V. AutoSummENG and MeMoG in Evaluating Guided Summaries. In Proceedings of the Text Analysis Conference (TAC 2011); NIST, Gaithersburg, MD, USA, 14–15 November 2011.
[11]  Pitler, E.; Louis, A.; Nenkova, A. Automatic Evaluation of Linguistic Quality in Multi-Document Summarization. In Proceedings of the 48th Annual Meeting of the Association for Computational Linguistics; Association for Computational Linguistics, Uppsala, Sweden, 11–16 July 2010; pp. 544–554.
[12]  Kumar, N.; Srinathan, K.; Varma, V. Using graph based mapping of co-occurring words and closeness centrality score for summarization evaluation. Comput. Linguist. Intell. Text Process. 2012, 7182, 353–365, doi:10.1007/978-3-642-28601-8_30.
[13]  Steinberger, J.; Ježek, K. Evaluation measures for text summarization. Comput. Inf. 2009, 28, 251–275.
[14]  Saggion, H.; Torres-Moreno, J.; Cunha, I.; SanJuan, E. Multilingual Summarization Evaluation Without Human Models. In Proceedings of the 23rd International Conference on Computational Linguistics: Posters; Association for Computational Linguistics: Stroudsburg, PA, USA, 23–27 August 2010; pp. 1059–1067.
[15]  Louis, A.; Nenkova, A. Automatically Evaluating Content Selection in Summarization without Human Models. In Proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing; Association for Computational Linguistics, Singapore, Singapore, 6–7 August 2009; pp. 306–314.
[16]  Document Understanding Conference, NIST, 2004. Available online: http://duc.nist.gov (accessed on 19 September 2012).
[17]  Over, P. Introduction to DUC-2001: An Intrinsic Evaluation of Generic News Text Summarization Systems. Technical Report, Retrieval Group, Information Access Division, National Institute of Standards and Technology, 2001.
[18]  Nenkova, A.; Passonneau, R.; Mckeown, K. The pyramid method: Incorporating human content selection variation in summarization evaluation. ACM Trans. Speech Lang. Process. 2007, 4, 1–4.
[19]  Conroy, J.M.; Schlesinger, J.D.; Rankel, P.A.; O'Leary, D.P. Guiding CLASSY Toward More Responsive Summaries. In Proceedings of the TAC 2010 Workshop, Gaithersburg, MD, USA, 15–16 November 2010. Available online: http://www.nist.gov/tac/publications/index.html (accessed on 19 September 2012).
[20]  Seber, G. Multivariate Observations (Wiley Series in Probability and Statistics); Wiley-Interscience: Weinheim, Germany, 2004.
[21]  Tavernier, J.; Bellot, P. Combining Relevance and Readability for INEX 2011 Question-Answering Track. In Pre-Proceedings of INEX 2011; IR Publications: Amsterdam, The Netherlands, 2011; pp. 185–195.
