The following results are related to DARIAH EU. To view more results, visit OpenAIRE - Explore.
12 Research products, page 1 of 2

  • DARIAH EU
  • 2013-2022
  • Open Access
  • Part of book or chapter of book
  • Hyper Article en Ligne - Sciences de l'Homme et de la Société
  • DARIAH EU

  • Open Access
    Authors: 
    Angela Cossu;
    Country: France

    International audience

  • Publication . Part of book or chapter of book . 2019
    Open Access English
    Authors: 
    Gelati, Francesco;
    Publisher: HAL CCSD
    Project: EC | EHRI (654164)

    The European Holocaust Research Infrastructure (EHRI) portal website aims to aggregate digitally available archival descriptions concerning the Holocaust. This portal is actually a meta-catalogue, or an information aggregator, whose main goal is to keep its information up to date by building sustainable data pipelines between EHRI and its content providers. Just as in similar archival information aggregators (e.g. Archives Portal Europe or Monasterium), the XML-based metadata standard Encoded Archival Description (EAD) plays a key role. The article presents how EADs are imported into the portal, mainly thanks to the Open Archives Initiative protocols.

  • Open Access German
    Authors: 
    Christof Schöch;
    Publisher: HAL CCSD
    Country: France

    Licence Creative Commons Attribution 4.0 (CC-BY); The digital age, by making large amounts of text available to us, prompts us to develop new and additional reading strategies supported by the use of computers and enabling us to deal with such amounts of text. One such "distant reading" strategy is stylometry, a method of quantitative text analysis which relies on the frequencies of certain linguistic features such as words, letters or grammatical units to statistically assess the relative similarity of texts to each other and to classify texts on this basis. This method is applied here to French drama of the seventeenth century, more precisely to the now famous "Corneille / Molière controversy". In this controversy, some researchers claim that Pierre Corneille wrote several of the plays traditionally attributed to Molière. The methodological challenge, it is shown here, lies in the fact that categories such as authorship, genre (comedy vs. tragedy) and literary form (prose vs. verse) all have an influence on stylometric distance measures and classification. Cross-genre and cross-form authorship attribution needs to distinguish such competing signals if it is to produce reliable attribution results. This contribution describes two attempts to accomplish this: parameter optimization and feature-range selection. The contribution concludes with some more general remarks about the use of quantitative methods in a hermeneutic discipline such as literary studies.

  • Publication . Part of book or chapter of book . 2017
    Open Access English
    Authors: 
    Laurent Romary; Conny Kristel; Tobias Blanke;
    Publisher: HAL CCSD
    Country: France

    International audience; The humanities have convincingly argued that they need transnational research opportunities and that, through the digital transformation of their disciplines, they also have the means to pursue them on a hitherto unknown scale. The digital transformation of research and its resources means that many of the artifacts, documents, materials, etc. that interest humanities research can now be combined in new and innovative ways. Due to these digital transformations, (big) data and information have become central to the study of culture and society. Humanities research infrastructures manage, organise and distribute this kind of information and many more data objects as they become relevant for social and cultural research.

  • Publication . Other literature type . Part of book or chapter of book . 2019
    Open Access French
    Authors: 
    Bergounioux, Gabriel;
    Publisher: HAL CCSD
    Country: France

    International audience; In place of the traditional distinction between transcription (a generally alphabetic rendering of oral language data) and annotation (enrichment, by a system of marks, of the text obtained through transcription), this article treats transcription as a first annotation, whether in the choices made for the spelling of words, their separation, or the use of punctuation and capital letters, etc.

  • Publication . Part of book or chapter of book . 2019
    Open Access English
    Authors: 
    Elisa Nury;
    Publisher: HAL CCSD
    Country: Switzerland

    International audience; This paper describes the workflow of the Grammateus project, from gathering data on Greek documentary papyri to the creation of a web application. The first stage is the selection of a corpus and the choice of metadata to record: papyrology specialists gather data from printed editions, existing online resources and digital facsimiles. In the next step, this data is transformed into the EpiDoc standard of XML TEI encoding, to facilitate its reuse by others, and processed for HTML display. We also reuse existing text transcriptions available on . Since these transcriptions may be regularly updated by the scholarly community, we aim to access them dynamically. Although the transcriptions follow the EpiDoc guidelines, the wide diversity of the papyri as well as small inconsistencies in encoding make data reuse challenging. Currently, our data is available on an institutional GitLab repository, and we will archive our final dataset according to the FAIR principles.

  • Publication . Other literature type . Part of book or chapter of book . 2017
    Open Access French
    Authors: 
    Julien Longhi;
    Publisher: HAL CCSD
    Country: France

    International audience; The analysis of political discourse is undergoing a significant renewal, due in particular to new media and formats of expression such as digital social networks (réseaux sociaux numériques, RSN). Yet these sites of written production are most often approached by disciplines that treat them as social data rather than as discourse. This article aims to describe the philological and hermeneutic issues, as well as the institutional and interdisciplinary ones, involved in building a corpus of political tweets. The Polititweets corpus (Longhi et al. 2014: 34273 messages, 205 users) was developed in the TEI format (with avenues for extension to CMC formats proposed by a European group formed around this question), in order to account for the spatio-temporal, contextual, technological, interactional, thematic, dialogic, etc. elements of the messages produced. We first describe the context in which the corpus was developed, the methodology, and legal considerations. We then detail the philological issues involved in building the corpus, setting out the criteria that governed its structuring in moving from a database to a TEI-format corpus. Finally, we describe the process of making the corpus available and the questions of open access.

  • Open Access French
    Authors: 
    Olivier Ertzscheid; gabriel gallezot; Brigitte Simonnot;
    Publisher: HAL CCSD
    Country: France

    In our societies, marked by a process of informationalisation and by the increasing and accelerating circulation of information flows, edited or not, in the private sphere, in the workplace and in public space, the web has become a source for researchers in the humanities and social sciences. The plurality of forms of the web, its depth and its dynamic dimension call for reflection on the notion of "documents", especially when it comes to building corpora for researchers. The authors examine the conditions for constituting and collecting observable memories and their engrammation devices, and the passage from document to digital traces, in their epistemological, pragmatic, ethical and societal dimensions.

  • Open Access English
    Authors: 
    Thierry Chanier; Ciara R. Wigham;
    Publisher: HAL CCSD
    Country: France

    International audience; This chapter gives an overview of one possible staged methodology for structuring LCI data by presenting a new scientific object, LEarning and TEaching Corpora (LETEC). Firstly, the chapter clarifies the notion of corpora, used in so many different ways in language studies, and underlines how corpora differ from raw language data. Secondly, using examples taken from actual online learning situations, the chapter illustrates the methodology that is used to collect, transform and organize data from online learning situations in order to make them sharable through open-access repositories. The ethics and rights for releasing a corpus as OpenData are discussed. Thirdly, the authors suggest how the transcription of interactions may become more systematic, and what benefits may be expected from analysis tools, before opening the CALL research perspective applied to LCI towards its applications to teacher-training in Computer-Mediated Communication (CMC), and the common interests the CALL field shares with researchers in the field of Corpus Linguistics working on CMC.

  • Publication . Part of book or chapter of book . Other literature type . 2018
    Open Access German
    Authors: 
    Baillot, Anne;
    Publisher: HAL CCSD
    Country: France

    International audience; The thesis put forward here locates the conversion manoeuvres in a diametrically opposed community of faith: there may well be a religion of Big Data, and one of its gospels is called network visualization. Nothing works without a network; everything is a network. Certainly, the flood of data on the one hand and the links between these very data on the other make it necessary to find one's bearings. In the course of this, recourse to network analysis has been imposed on the history of ideas and of literature. The cross to be borne is precisely the network. But do networks and their visualizations really provide the orientation that the humanities need? What are networks good for in literary studies? What do they allow us to do that we could not otherwise accomplish? Approaches to answering these questions are presented in three steps. First, I begin at a basic level with the question "What is a network?". My concern here is to outline what makes a "good" network, i.e. a network from which meaningful information can be gained from the perspective of literary studies. In the second part, I present digital editions of letters (my own in particular) and the points of connection they offer for network models. Finally, in a third part, I address the embedding of network models in concrete work with texts.
