14 Research products, page 1 of 2
- Publication . Article . Preprint . 2020 . Open Access . English . Authors: Del Gratta, Riccardo
In this article, we propose a Category Theory approach to (syntactic) interoperability between linguistic tools. The resulting category consists of textual documents, including any linguistic annotations, NLP tools that analyze texts and add additional linguistic information, and format converters. Format converters are necessary to make the tools both able to read and to produce different output formats, which is the key to interoperability. The idea behind this document is the parallelism between the concepts of composition and associativity in Category Theory with the NLP pipelines. We show how pipelines of linguistic tools can be modeled into the conceptual framework of Category Theory and we successfully apply this method to two real-life examples. Paper submitted to Applied Category Theory 2020 and accepted for Virtual Poster Session
- Publication . Article . Preprint . 2019 . Embargo End Date: 01 Jan 2019 . Open Access . Authors: Kolar, Jana; Cugmas, Marjan; Ferligoj, Anuška . Publisher: arXiv . Project: EC | ACCELERATE (731112)
In 2018, the European Strategy Forum on Research Infrastructures (ESFRI) was tasked by the Competitiveness Council, a configuration of the Council of the EU, to develop a common approach for monitoring the performance of Research Infrastructures. To this end, ESFRI established a working group, which has proposed 21 Key Performance Indicators (KPIs) to monitor the progress of the Research Infrastructures (RIs) addressed towards their objectives. The RIs were then asked to assess the relevance of these indicators for their institution. The paper aims to identify the relevance of certain indicators for particular groups of RIs by using cluster and discriminant analysis. This could contribute to the development of a monitoring system tailored to particular RIs. To obtain a typology of the RIs, we first performed cluster analysis of the RIs according to their properties, which revealed clusters of RIs with similar characteristics, based on the domain of operation, such as food, environment or engineering. Then, discriminant analysis was used to study how the relevance of the KPIs differs among the obtained clusters. This analysis revealed that the percentage of RIs correctly classified into five clusters, using the KPIs, is 80%. Such a high percentage indicates that there are significant differences in the relevance of certain indicators, depending on the ESFRI domain of the RI. The indicators therefore need to be adapted to the type of infrastructure. It is therefore proposed that the Strategic Working Groups of ESFRI addressing specific domains should be involved in the tailored development of the monitoring of pan-European RIs. Comment: 15 pages, 8 tables, 3 figures
- Publication . Preprint . Article . 2019 . Open Access . English . Authors: Rizza, Ettore; Chardonnens, Anne; Van Hooland, Seth . Publisher: HAL CCSD . Countries: France, Belgium
More and more cultural institutions use Linked Data principles to share and connect their collection metadata. In the archival field, initiatives are emerging to exploit data contained in archival descriptions and adapt encoding standards to the semantic web. In this context, online authority files can be used to enrich metadata. However, relying on a decentralized network of knowledge bases such as Wikidata, DBpedia or even VIAF has its own difficulties. This paper aims to offer a critical view of these linked authority files by adopting a close-reading approach. Through a practical case study, we intend to identify and illustrate the possibilities and limits of RDF triples compared to institutions' less structured metadata. Comment: DARIAH workshop "Trust and Understanding: the value of metadata in a digitally joined-up world" (14/05/2018, Brussels), preprint of the submission to the journal "Archives et Bibliothèques de Belgique"
- Publication . Article . Preprint . Conference object . 2019 . Open Access . Authors: Lilia Simeonova; Kiril Simov; Petya Osenova; Preslav Nakov . Publisher: Incoma Ltd., Shoumen, Bulgaria
We propose a morphologically informed model for named entity recognition, which is based on LSTM-CRF architecture and combines word embeddings, Bi-LSTM character embeddings, part-of-speech (POS) tags, and morphological information. While previous work has focused on learning from raw word input, using word and character embeddings only, we show that for morphologically rich languages, such as Bulgarian, access to POS information contributes more to the performance gains than the detailed morphological information. Thus, we show that named entity recognition needs only coarse-grained POS tags, but at the same time it can benefit from simultaneously using some POS information of different granularity. Our evaluation results over a standard dataset show sizable improvements over the state-of-the-art for Bulgarian NER. Comment: named entity recognition; Bulgarian NER; morphology; morpho-syntax
- Publication . Article . Preprint . 2020 . Embargo End Date: 01 Jan 2020 . Open Access . Authors: Zamani, Maryam; Tejedor, Alejandro; Vogl, Malte; Krautli, Florian; Valleriani, Matteo; Kantz, Holger . Publisher: arXiv
We investigated the evolution and transformation of scientific knowledge in the early modern period, analyzing more than 350 different editions of textbooks used for teaching astronomy in European universities from the late fifteenth century to mid-seventeenth century. These historical sources constitute the Sphaera Corpus. By examining different semantic relations among individual parts of each edition on record, we built a multiplex network consisting of six layers, as well as the aggregated network built from the superposition of all the layers. The network analysis reveals the emergence of five different communities. The contribution of each layer in shaping the communities and the properties of each community are studied. The most influential books in the corpus are found by calculating the average age of all the out-going and in-coming links for each book. A small group of editions is identified as a transmitter of knowledge as they bridge past knowledge to the future through a long temporal interval. Our analysis, moreover, identifies the most disruptive books. These books introduce new knowledge that is then adopted by almost all the books published afterwards until the end of the whole period of study. The historical research on the content of the identified books, as an empirical test, finally corroborates the results of all our analyses. Comment: 19 pages, 9 figures
- Publication . Other literature type . Article . Preprint . 2021 . Open Access . Authors: Christof Schöch . Publisher: Zenodo
The concept of literary genre is a highly complex one: not only are different genres frequently defined on several, but not necessarily the same, levels of description, but consideration of genres as cognitive, social, or scholarly constructs with a rich history further complicates the matter. This contribution focuses on thematic aspects of genre with a quantitative approach, namely Topic Modeling. Topic Modeling has proven to be useful for discovering thematic patterns and trends in large collections of texts, with a view to classifying or browsing them on the basis of their dominant themes. It has rarely if ever, however, been applied to collections of dramatic texts. In this contribution, Topic Modeling is used to analyze a collection of French Drama of the Classical Age and the Enlightenment. The general aim of this contribution is to discover what semantic types of topics are found in this collection, whether different dramatic subgenres have distinctive dominant topics and plot-related topic patterns, and conversely, to what extent clustering methods based on topic scores per play produce groupings of texts which agree with more conventional genre distinctions. This contribution shows that interesting topic patterns can be detected which provide new insights into the thematic, subgenre-related structure of French drama as well as into the history of French drama of the Classical Age and the Enlightenment. Comment: 11 figures
- Publication . Article . Preprint . 2018 . Open Access . English . Authors: Nadia Boukhelifa; Michael Bryant; Natasa Bulatovic; Ivan Čukić; Jean-Daniel Fekete; Milica Knežević; Jörg Lehmann; David I. Stuart; Carsten Thiel . Publisher: HAL CCSD . Countries: United Kingdom, France . Project: EC | CENDARI (284432)
The CENDARI infrastructure is a research-supporting platform designed to provide tools for transnational historical research, focusing on two topics: medieval culture and World War I. It exposes to the end users modern Web-based tools relying on a sophisticated infrastructure to collect, enrich, annotate, and search through large document corpora. Supporting researchers in their daily work is a novel concern for infrastructures. We describe how we gathered requirements through multiple methods to understand historians' needs and derive an abstract workflow to support them. We then outline the tools that we have built, tying their technical descriptions to the user requirements. The main tools are the note-taking environment and its faceted search capabilities; the data integration platform including the Data API, supporting semantic enrichment through entity recognition; and the environment supporting the software development processes throughout the project to keep both technical partners and researchers in the loop. The outcomes are technical together with new resources developed and gathered, and the research workflow that has been described and documented.
- Publication . Article . Conference object . Preprint . 2018 . Embargo End Date: 01 Jan 2018 . Open Access . Authors: Christoph Hube; Besnik Fetahu . Publisher: arXiv . Project: EC | DESIR (731081), EC | ALEXANDRIA (339233), EC | AFEL (687916)
Biased language commonly occurs around topics which are of a controversial nature, thus stirring disagreement between the different involved parties of a discussion. This is due to the fact that for language and its use, specifically, the understanding and use of phrases, the stances are cohesive within the particular groups. However, such cohesiveness does not hold across groups. In collaborative environments or environments where impartial language is desired (e.g. Wikipedia, news media), statements and the language therein should represent equally the involved parties and be neutrally phrased. Biased language is introduced through the presence of inflammatory words or phrases, or statements that may be incorrect or one-sided, thus violating such consensus. In this work, we focus on the specific case of phrasing bias, which may be introduced through specific inflammatory words or phrases in a statement. For this purpose, we propose an approach that relies on recurrent neural networks in order to capture the inter-dependencies between words in a phrase that introduces bias. We perform a thorough experimental evaluation, where we show the advantages of a neural-based approach over competitors that rely on word lexicons and other hand-crafted features in detecting biased language. We are able to distinguish biased statements with a precision of P=0.92, thus significantly outperforming baseline models with an improvement of over 30%. Finally, we release the largest corpus of statements annotated for biased language. Comment: The Twelfth ACM International Conference on Web Search and Data Mining, February 11--15, 2019, Melbourne, VIC, Australia
- Publication . Article . Preprint . 2021 . Embargo End Date: 01 Jan 2021 . Open Access . Authors: Papadopoulou, Maria; Smyrnaiou, Zacharoula . Publisher: arXiv
Digital technologies, such as the Internet and Artificial Intelligence, are part of our daily lives, influencing broader aspects of our way of life, as well as the way we interact with the past. Having dramatically changed the ways in which knowledge is produced and consumed, the algorithmic age has also radically changed the relationship that the general public has with History. Fields of History such as Public and Oral History have particularly benefitted from the rise of digital culture. How does our digital culture affect the way we think, study, research and teach the past, as historical evidence spreads rapidly in the public sphere? How do digital technologies promote the study, writing and teaching of History? What should historians, students of history and pre-service history teachers be critically aware of when swamped with digitized or born-digital content, constantly growing on the Internet? And while these changes are now visible globally, how is the discipline of History situated within the digital transformation rapidly advancing in Greece? Finally, what are the consequences of these changes for History as a subject taught at Greek secondary schools? These are some of the issues raised in the text that follows, which is part of the course materials of the undergraduate course offered during the winter semester 2020-2021 at the University of Athens, School of Philosophy, Pedagogy, Psychology. Course Title: 'Pedagogics of History: Theory and Practice', Academic Institution: School of Philosophy-Pedagogy-Psychology, University of Athens. Comment: 47 pages, in Greek, 8 figures
- Publication . Preprint . Article . 2019 . Open Access . English . Authors: Bamman, David; Lewke, Olivia; Mansoor, Anya
We present in this work a new dataset of coreference annotations for works of literature in English, covering 29,103 mentions in 210,532 tokens from 100 works of fiction. This dataset differs from previous coreference datasets in containing documents whose average length (2,105.3 words) is four times longer than other benchmark datasets (463.7 for OntoNotes), and contains examples of difficult coreference problems common in literature. This dataset allows for an evaluation of cross-domain performance for the task of coreference resolution, and analysis into the characteristics of long-distance within-document coreference.
14 Research products, page 1 of 2
Loading
- Publication . Article . Preprint . 2020Open Access EnglishAuthors:Del Gratta, Riccardo;Del Gratta, Riccardo;
In this article, we propose a Category Theory approach to (syntactic) interoperability between linguistic tools. The resulting category consists of textual documents, including any linguistic annotations, NLP tools that analyze texts and add additional linguistic information, and format converters. Format converters are necessary to make the tools both able to read and to produce different output formats, which is the key to interoperability. The idea behind this document is the parallelism between the concepts of composition and associativity in Category Theory with the NLP pipelines. We show how pipelines of linguistic tools can be modeled into the conceptual framework of Category Theory and we successfully apply this method to two real-life examples. Paper submitted to Applied Category Theory 2020 and accepted for Virtual Poster Session
Average popularityAverage popularity In bottom 99%Average influencePopularity: Citation-based measure reflecting the current impact.Average influence In bottom 99%Influence: Citation-based measure reflecting the total impact.add Add to ORCIDPlease grant OpenAIRE to access and update your ORCID works.This Research product is the result of merged Research products in OpenAIRE.
You have already added works in your ORCID record related to the merged Research product. - Publication . Article . Preprint . 2019 . Embargo End Date: 01 Jan 2019Open AccessAuthors:Kolar, Jana; Cugmas, Marjan; Ferligoj, Anuška;Kolar, Jana; Cugmas, Marjan; Ferligoj, Anuška;Publisher: arXivProject: EC | ACCELERATE (731112)
In 2018, the European Strategic Forum for research infrastructures (ESFRI) was tasked by the Competitiveness Council, a configuration of the Council of the EU, to develop a common approach for monitoring of Research Infrastructures' performance. To this end, ESFRI established a working group, which has proposed 21 Key Performance Indicators (KPIs) to monitor the progress of the Research Infrastructures (RIs) addressed towards their objectives. The RIs were then asked to assess their relevance for their institution. The paper aims to identify the relevance of certain indicators for particular groups of RIs by using cluster and discriminant analysis. This could contribute to development of a monitoring system, tailored to particular RIs. To obtain a typology of the RIs, we first performed cluster analysis of the RIs according to their properties, which revealed clusters of RIs with similar characteristics, based on to the domain of operation, such as food, environment or engineering. Then, discriminant analysis was used to study how the relevance of the KPIs differs among the obtained clusters. This analysis revealed that the percentage of RIs correctly classified into five clusters, using the KPIs, is 80%. Such a high percentage indicates that there are significant differences in the relevance of certain indicators, depending on the ESFRI domain of the RI. The indicators therefore need to be adapted to the type of infrastructure. It is therefore proposed that the Strategic Working Groups of ESFRI addressing specific domains should be involved in the tailored development of the monitoring of pan-European RIs. Comment: 15 pages, 8 tables, 3 figures
Average popularityAverage popularity In bottom 99%Average influencePopularity: Citation-based measure reflecting the current impact.Average influence In bottom 99%Influence: Citation-based measure reflecting the total impact.add Add to ORCIDPlease grant OpenAIRE to access and update your ORCID works.This Research product is the result of merged Research products in OpenAIRE.
You have already added works in your ORCID record related to the merged Research product. - Publication . Preprint . Article . 2019Open Access EnglishAuthors:Rizza, Ettore; Chardonnens, Anne; Van Hooland, Seth;Rizza, Ettore; Chardonnens, Anne; Van Hooland, Seth;Publisher: HAL CCSDCountries: France, Belgium
More and more cultural institutions use Linked Data principles to share and connect their collection metadata. In the archival field, initiatives emerge to exploit data contained in archival descriptions and adapt encoding standards to the semantic web. In this context, online authority files can be used to enrich metadata. However, relying on a decentralized network of knowledge bases such as Wikidata, DBpedia or even Viaf has its own difficulties. This paper aims to offer a critical view of these linked authority files by adopting a close-reading approach. Through a practical case study, we intend to identify and illustrate the possibilities and limits of RDF triples compared to institutions' less structured metadata. Comment: Workshop "Dariah "Trust and Understanding: the value of metadata in a digitally joined-up world" (14/05/2018, Brussels), preprint of the submission to the journal "Archives et Biblioth\`eques de Belgique"
Average popularityAverage popularity In bottom 99%Average influencePopularity: Citation-based measure reflecting the current impact.Average influence In bottom 99%Influence: Citation-based measure reflecting the total impact.add Add to ORCIDPlease grant OpenAIRE to access and update your ORCID works.This Research product is the result of merged Research products in OpenAIRE.
You have already added works in your ORCID record related to the merged Research product. - Publication . Article . Preprint . Conference object . 2019Open AccessAuthors:Lilia Simeonova; Kiril Simov; Petya Osenova; Preslav Nakov;Lilia Simeonova; Kiril Simov; Petya Osenova; Preslav Nakov;Publisher: Incoma Ltd., Shoumen, Bulgaria
We propose a morphologically informed model for named entity recognition, which is based on LSTM-CRF architecture and combines word embeddings, Bi-LSTM character embeddings, part-of-speech (POS) tags, and morphological information. While previous work has focused on learning from raw word input, using word and character embeddings only, we show that for morphologically rich languages, such as Bulgarian, access to POS information contributes more to the performance gains than the detailed morphological information. Thus, we show that named entity recognition needs only coarse-grained POS tags, but at the same time it can benefit from simultaneously using some POS information of different granularity. Our evaluation results over a standard dataset show sizable improvements over the state-of-the-art for Bulgarian NER. Comment: named entity recognition; Bulgarian NER; morphology; morpho-syntax
Average popularityAverage popularity In bottom 99%Average influencePopularity: Citation-based measure reflecting the current impact.Average influence In bottom 99%Influence: Citation-based measure reflecting the total impact.add Add to ORCIDPlease grant OpenAIRE to access and update your ORCID works.This Research product is the result of merged Research products in OpenAIRE.
You have already added works in your ORCID record related to the merged Research product. - Publication . Article . Preprint . 2020 . Embargo End Date: 01 Jan 2020Open AccessAuthors:Zamani, Maryam; Tejedor, Alejandro; Vogl, Malte; Krautli, Florian; Valleriani, Matteo; Kantz, Holger;Zamani, Maryam; Tejedor, Alejandro; Vogl, Malte; Krautli, Florian; Valleriani, Matteo; Kantz, Holger;Publisher: arXiv
We investigated the evolution and transformation of scientific knowledge in the early modern period, analyzing more than 350 different editions of textbooks used for teaching astronomy in European universities from the late fifteenth century to mid-seventeenth century. These historical sources constitute the Sphaera Corpus. By examining different semantic relations among individual parts of each edition on record, we built a multiplex network consisting of six layers, as well as the aggregated network built from the superposition of all the layers. The network analysis reveals the emergence of five different communities. The contribution of each layer in shaping the communities and the properties of each community are studied. The most influential books in the corpus are found by calculating the average age of all the out-going and in-coming links for each book. A small group of editions is identified as a transmitter of knowledge as they bridge past knowledge to the future through a long temporal interval. Our analysis, moreover, identifies the most disruptive books. These books introduce new knowledge that is then adopted by almost all the books published afterwards until the end of the whole period of study. The historical research on the content of the identified books, as an empirical test, finally corroborates the results of all our analyses. Comment: 19 pages, 9 figures
Average popularityAverage popularity In bottom 99%Average influencePopularity: Citation-based measure reflecting the current impact.Average influence In bottom 99%Influence: Citation-based measure reflecting the total impact.add Add to ORCIDPlease grant OpenAIRE to access and update your ORCID works.This Research product is the result of merged Research products in OpenAIRE.
You have already added works in your ORCID record related to the merged Research product. - Publication . Other literature type . Article . Preprint . 2021Open AccessAuthors:Christof Schöch;Christof Schöch;Publisher: Zenodo
The concept of literary genre is a highly complex one: not only are different genres frequently defined on several, but not necessarily the same levels of description, but consideration of genres as cognitive, social, or scholarly constructs with a rich history further complicate the matter. This contribution focuses on thematic aspects of genre with a quantitative approach, namely Topic Modeling. Topic Modeling has proven to be useful to discover thematic patterns and trends in large collections of texts, with a view to class or browse them on the basis of their dominant themes. It has rarely if ever, however, been applied to collections of dramatic texts. In this contribution, Topic Modeling is used to analyze a collection of French Drama of the Classical Age and the Enlightenment. The general aim of this contribution is to discover what semantic types of topics are found in this collection, whether different dramatic subgenres have distinctive dominant topics and plot-related topic patterns, and inversely, to what extent clustering methods based on topic scores per play produce groupings of texts which agree with more conventional genre distinctions. This contribution shows that interesting topic patterns can be detected which provide new insights into the thematic, subgenre-related structure of French drama as well as into the history of French drama of the Classical Age and the Enlightenment. Comment: 11 figures
Average popularityAverage popularity In bottom 99%Average influencePopularity: Citation-based measure reflecting the current impact.Average influence In bottom 99%Influence: Citation-based measure reflecting the total impact.add Add to ORCIDPlease grant OpenAIRE to access and update your ORCID works.This Research product is the result of merged Research products in OpenAIRE.
You have already added works in your ORCID record related to the merged Research product. - Publication . Article . Preprint . 2018Open Access EnglishAuthors:Nadia Boukhelifa; Michael Bryant; Natasa Bulatovic; Ivan Čukić; Jean-Daniel Fekete; Milica Knežević; Jörg Lehmann; David I. Stuart; Carsten Thiel;Nadia Boukhelifa; Michael Bryant; Natasa Bulatovic; Ivan Čukić; Jean-Daniel Fekete; Milica Knežević; Jörg Lehmann; David I. Stuart; Carsten Thiel;Publisher: HAL CCSDCountries: United Kingdom, FranceProject: EC | CENDARI (284432)
International audience; The CENDARI infrastructure is a research-supporting platform designed to provide tools for transnational historical research, focusing on two topics: medieval culture and World War I. It exposes to the end users modern Web-based tools relying on a sophisticated infrastructure to collect, enrich, annotate, and search through large document corpora. Supporting researchers in their daily work is a novel concern for infrastructures. We describe how we gathered requirements through multiple methods to understand historians' needs and derive an abstract workflow to support them. We then outline the tools that we have built, tying their technical descriptions to the user requirements. The main tools are the note-taking environment and its faceted search capabilities; the data integration platform including the Data API, supporting semantic enrichment through entity recognition; and the environment supporting the software development processes throughout the project to keep both technical partners and researchers in the loop. The outcomes are technical together with new resources developed and gathered, and the research workflow that has been described and documented.
- Publication . Article . Conference object . Preprint . 2018 . Embargo End Date: 01 Jan 2018 . Open Access . Authors: Christoph Hube; Besnik Fetahu . Publisher: arXiv . Project: EC | DESIR (731081), EC | ALEXANDRIA (339233), EC | AFEL (687916)
Biased language commonly occurs around topics of a controversial nature, stirring disagreement between the parties involved in a discussion. This is because the understanding and use of phrases is cohesive within particular groups, but such cohesiveness does not hold across groups. In collaborative environments or environments where impartial language is desired (e.g. Wikipedia, news media), statements and the language therein should represent the involved parties equally and be neutrally phrased. Biased language is introduced through the presence of inflammatory words or phrases, or statements that may be incorrect or one-sided, thus violating such consensus. In this work, we focus on the specific case of phrasing bias, which may be introduced through specific inflammatory words or phrases in a statement. For this purpose, we propose an approach that relies on recurrent neural networks to capture the inter-dependencies between words in a phrase that introduce bias. We perform a thorough experimental evaluation, where we show the advantages of a neural-based approach over competitors that rely on word lexicons and other hand-crafted features in detecting biased language. We are able to distinguish biased statements with a precision of P=0.92, thus significantly outperforming baseline models with an improvement of over 30%. Finally, we release the largest corpus of statements annotated for biased language. Comment: The Twelfth ACM International Conference on Web Search and Data Mining, February 11--15, 2019, Melbourne, VIC, Australia
- Publication . Article . Preprint . 2021 . Embargo End Date: 01 Jan 2021 . Open Access . Authors: Papadopoulou, Maria; Smyrnaiou, Zacharoula . Publisher: arXiv
Digital technologies, such as the Internet and Artificial Intelligence, are part of our daily lives, influencing broader aspects of our way of life, as well as the way we interact with the past. Having dramatically changed the ways in which knowledge is produced and consumed, the algorithmic age has also radically changed the relationship that the general public has with History. Fields of History such as Public and Oral History have particularly benefitted from the rise of digital culture. How does our digital culture affect the way we think, study, research and teach the past, as historical evidence spreads rapidly in the public sphere? How do digital technologies promote the study, writing and teaching of History? What should historians, students of history and pre-service history teachers be critically aware of when inundated with digitized or born-digital content that is constantly growing on the Internet? And while these changes are now visible globally, how is the discipline of History situated within the digital transformation rapidly advancing in Greece? Finally, what are the consequences of these changes for History as a subject taught at Greek secondary schools? These are some of the issues raised in the text that follows, which is part of the course materials of the undergraduate course offered during the winter semester 2020-2021 at the University of Athens, School of Philosophy, Pedagogy, Psychology. Course Title: 'Pedagogics of History: Theory and Practice', Academic Institution: School of Philosophy-Pedagogy-Psychology, University of Athens. Comment: 47 pages, in Greek, 8 figures
- Publication . Preprint . Article . 2019 . Open Access . English . Authors: Bamman, David; Lewke, Olivia; Mansoor, Anya
We present in this work a new dataset of coreference annotations for works of literature in English, covering 29,103 mentions in 210,532 tokens from 100 works of fiction. This dataset differs from previous coreference datasets in containing documents whose average length (2,105.3 words) is four times longer than other benchmark datasets (463.7 for OntoNotes), and in containing examples of difficult coreference problems common in literature. This dataset allows for an evaluation of cross-domain performance for the task of coreference resolution, and for analysis of the characteristics of long-distance within-document coreference.