publication . Article . 2014

The CoMeRe corpus for French: structuring and annotating heterogeneous CMC genres

Thierry Chanier; Céline Poudat; Benoît Sagot; Georges Antoniadis; Ciara R. Wigham; Linda Hriba; Julien Longhi; Djamé Seddah
  • Published: 01 Jan 2014
  • Publisher: HAL CCSD
  • Country: France
Final version for the Special Issue of the Journal of Language Technology and Computational Linguistics (JLCL), "Building and Annotating Corpora of Computer-Mediated Discourse: Issues and Challenges at the Interface of Corpus and Computational Linguistics" (ed. by Michael Beißwenger, Nelleke Oostdijk, Angelika Storrer & Henk van den Heuvel); International audience; The CoMeRe project aims to build a kernel corpus of different Computer-Mediated Communication (CMC) genres with interactions in French as the main language, by assembling interactions stemming from networks such as the Internet or telecommunication, as well as mono and multimodal, synchr...
free text keywords: Computer Mediated Communication, CMC, CoMeRe, corpus, [SHS.LANGUE] Humanities and Social Sciences/Linguistics
39 references, page 1 of 3

BEIßWENGER, M., ERMAKOVA, M., GEYKEN, A., LEMNITZER, L. & STORRER, A. (2012). „A TEI Schema for the Representation of Computer-mediated Communication”. In Journal of the Text Encoding Initiative (jTEI), 3. DOI: 10.4000/jtei.476.

BEIßWENGER, M., CHANIER, T., CHIARI, I., ERMAKOVA, M., VAN GOMPEL, M., HENDRICKX, I., HEROLD, A., VAN DEN HEUVEL, H., LEMNITZER, L. & STORRER, A. (2013). „Computer-Mediated Communication in TEI: What Lies Ahead”. Special Topic Panel, TEI Conference and Members Meeting 2013, 2-5 October 2013, Rome, Italy.

BELLIK, Y. & TEIL, D. (1992). „Définitions terminologiques pour la communication multimodale”. Conference Interaction Humain-Machine IHM'92, Paris.

BURNARD, L. & BAUMAN, S. (2013). TEI P5: Guidelines for electronic text encoding and interchange [Document]. TEI Consortium, doc/en/Guidelines.pdf

CHABERT, G., ZAMPA, V., ANTONIADIS, G. & MALLEN, M. (2012). Des SMS Alpins. Éditions de la Bibliothèque départementale des Hautes-Alpes: Gap.

CHANIER, T. & VETTER, A. (2006). „Multimodalité et expression en langue étrangère dans une plate-forme audio-synchrone”. Apprentissage des Langues et Systèmes d'Information et de Communication (ALSIC), 9. DOI: 10.4000/alsic.270.

CODATA/ICSTI Task Force on Data Citation (2013). „Out of cite, out of mind: The Current State of Practice, Policy and Technology for Data Citation”. Data Science Journal, 12, pp. 1-75. DOI: 10.2481/dsj.OSOM13-043.

CoMeRe (2014). Communication Médiée par les Réseaux, project documentation [website].

CoMeRe Repository (2014). Repository for the CoMeRe corpora [website].

COOK, P. & STEVENSON, S. (2009). „An Unsupervised Model for Text Message Normalization”. In Feldman, A. & Lönneker-Rodman, B. (Eds.). Proceedings of the Workshop on Computational Approaches to Linguistic Creativity, pp. 71-78. 2000.pdf

Corpus-écrits (2013). Consortium Corpus-écrits [website].

FAIRON, C. & PAUMIER, S. (2006). „A translated corpus of 30,000 French SMS”. In Proceedings of LREC 2006, 22-28 May 2006, Genova, Italy.

FALAISE, A. (in print). „Corpus de français tchaté getalp_org” [corpus]. In Chanier, T. (Ed.). Banque de corpus CoMeRe. Nancy. [cmr-getalp_org-tei-v1]

FALAISE, A. (2005). „Constitution d'un corpus de français tchaté”. In Actes de RECITAL 2005, 6-10 June, Dourdan, France.

GEYKEN, A. (2007). „The DWDS-Corpus: A reference corpus for the German language of the 20th century”. In C. Fellbaum (Ed.). Collocations and idioms: linguistic, lexicographic, and computational aspects. London: Continuum Press.
