Visualization methods for single documents are either too simple, considering only word frequency, or depend on external syntactic and semantic information bases to be more useful. This paper presents an intermediate approach, based on H. P. Luhn’s automatic abstract creation algorithm, that aims to convey more information in a document visualization than word-counting methods do, without requiring external sources. The method takes pairs of relevant words and computes the linkage force between them. Relevant words become vertices and links become edges in the resulting graph.
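As a rough illustration of the graph construction described above, the following Python sketch builds a weighted word graph from a single text. It is not the method's actual implementation. The names `relevant_words`, `linkage_graph`, `top_n`, `max_gap` and the small stop-word list are hypothetical. The abstract does not define the linkage force, so the sketch assumes a simple stand-in: the sum of inverse token distances over all co-occurrences of two relevant words that lie at most `max_gap` tokens apart.

```python
import re
from collections import Counter
from itertools import combinations

# Assumption: a tiny illustrative stop-word list, not taken from the paper.
STOP_WORDS = {"the", "a", "an", "and", "of", "to", "in", "is", "on", "by"}


def relevant_words(text, top_n=5):
    """Tokenize the text and pick the top_n most frequent non-stop words."""
    tokens = re.findall(r"[a-z]+", text.lower())
    counts = Counter(t for t in tokens if t not in STOP_WORDS)
    return tokens, {word for word, _ in counts.most_common(top_n)}


def linkage_graph(text, top_n=5, max_gap=4):
    """Return (vertices, edges), where edges maps frozenset({u, v}) to a force.

    Assumed force measure (not from the paper): the sum of 1/distance over
    every pair of occurrences of u and v at most max_gap tokens apart.
    """
    tokens, relevant = relevant_words(text, top_n)
    # Keep only the positions where a relevant word occurs.
    positions = [(i, t) for i, t in enumerate(tokens) if t in relevant]
    edges = Counter()
    for (i, u), (j, v) in combinations(positions, 2):
        if u != v and j - i <= max_gap:
            edges[frozenset((u, v))] += 1.0 / (j - i)
    return relevant, dict(edges)


if __name__ == "__main__":
    sample = "graph words link graph words"
    vertices, edges = linkage_graph(sample, top_n=2)
    for pair, force in edges.items():
        print(sorted(pair), force)
```

Under these assumptions, relevant words that occur close together and often receive a stronger edge, while words that never appear within `max_gap` tokens of each other stay unconnected.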