%0 Journal Article
%T Extending Automatic Transcripts in a Unified Data Representation towards a Prosodic-based Metadata Annotation and Evaluation
%A Batista, F.
%A Moniz, H.
%A Trancoso, I.
%A Mamede, N.
%J Journal of Speech Sciences
%D 2012
%X This paper describes a framework that extends automatic speech transcripts in order to accommodate relevant information coming from manual transcripts, the speech signal itself, and other resources, like lexica. The proposed framework automatically collects, relates, computes, and stores all relevant information together in a self-contained data source, making it possible to easily provide a wide range of interconnected information suitable for speech analysis, training, and evaluating a number of automatic speech processing tasks. The main goal of this framework is to integrate different linguistic and paralinguistic layers of knowledge for a more complete view of their representation and interactions in several domains and languages. The processing chain is composed of two main stages, where the first consists of integrating the relevant manual annotations in the speech recognition data, and the second consists of further enriching the previous output in order to accommodate prosodic information. The described framework has been used for the identification and analysis of structural metadata in automatic speech transcripts. Initially put to use for automatic detection of punctuation marks and for capitalization recovery from speech data, it has also been recently used for studying the characterization of disfluencies in speech. It was already applied to several domains of Portuguese corpora, and also to English and Spanish Broadcast News corpora.
%K Automatic speech processing
%K speech alignment
%K structural metadata
%K speech prosody
%K speech data representation
%K multiple-domain speech corpora
%K cross-language speech processing
%U http://www.journalofspeechsciences.org/index.php/journalofspeechsciences/article/view/60/50