|
A large available oral corpus: the Orléans corpus 1968-2012. Un grand corpus oral disponible : le Corpus d'Orléans 1968-2012. Keywords: oral corpus, variationist corpus, anonymisation, transcription, corpus annotation, variation. Abstract: This article presents the construction and online publication of the ESLO oral corpus. Our purpose is to show that it is important not only to collect and make available language data and metadata, but also to make the whole processing chain explicit. In the first part, we present the project and the corpus, then describe the legal and methodological constraints that shaped every stage of corpus processing, in particular the anonymisation procedures required to make this kind of resource freely available. In the second part, we present the different annotations applied to the raw data, with some examples of their use. We explain the methodology we followed, which is always guided by the nature of the data and by the final objective: to build a large sociolinguistic variationist oral corpus of French. Finally, we discuss the issues involved in putting the corpus online.
|