The following results are related to DARIAH EU. For more results, visit OpenAIRE - Explore.
1 research product, page 1 of 1

  • DARIAH EU
  • Other research products
  • 2013-2022
  • Other ORP type
  • HAL-Pasteur
  • HAL-Inserm

  • Other research product . Other ORP type . 2016
    French
    Authors: Panckhurst, Rachel; Détrie, Catherine; Lopez, Cédric; Moïse, Claudine; Roche, Mathieu; Verine, Bertrand
    Publisher: HAL CCSD
    Country: France

    The first version of the corpus (ISLRN: 024-713-187-947-8) was produced in 2014 as part of the "sud4science LR project". More than 88,000 authentic SMS messages, sent by hundreds of donors living mainly in the Montpellier area, were collected in 2011 and then anonymised by the researchers, their student interns and a legal adviser (CIL). The initial corpus was then converted to the TEI standard in the CoMeRe project (Communication Médiée par les Réseaux). This project aims to build a kernel corpus that assembles existing corpora of different CMC (Computer-Mediated Communication) genres and new corpora built from data extracted from the Internet. These heterogeneous corpora will be structured and processed in a uniform way and complemented with metadata. CoMeRe will be released as open data through the national infrastructure Ortolang, following constraints which will be reused for the forthcoming "Corpus de Référence du Français". The project is supported by the national consortium Corpus-écrits, a sub-part of Huma-Num, and by Ortolang (the French counterpart to DARIAH).