publication . Conference object . 2015

TEI across corpora, languages and genres: Towards a standard for the representation of social media and computer-mediated communication

Beißwenger, Michael; Chanier, Thierry; Ehrhardt, Eric; Herold, Axel; Lüngen, Harald; Poudat, Céline; Storrer, Angelika
  • Published: 28 Oct 2015
  • Publisher: HAL CCSD
  • Country: France
International audience; The panel presents results and ongoing work from corpus projects in which TEI-P5 has been adopted for the representation and linguistic annotation of genres of social media and computer-mediated communication (CMC). It relates to the work of the TEI-SIG “computer-mediated communication” which is developing TEI models for the representation of CMC genres and testing these models for a broad range of genres (ranging from “text-only” genres such as chat and SMS to multimodal genres such as learning environments and Second Life) and in corpus building initiatives for various European languages. The goal of the panel is to give an overview of models a...
free text keywords: TEI, Text Encoding Initiative, CMC, computer-mediated communication, corpora, [SHS.LANGUE]Humanities and Social Sciences/Linguistics
Digital Humanities and Cultural Heritage (DH-CH) communities: CLARIN

