22 Research products, page 1 of 3
- Publication . Article . Other literature type . Conference object . 2020 . Open Access . English
  Authors: Stefan Bornhofen; Marten Düring
  Publisher: HAL CCSD
  Country: France
  Project: ANR | BLIZAAR (ANR-15-CE23-0002)
Abstract: The paper presents Intergraph, a graph-based visual analytics technical demonstrator for the exploration and study of content in historical document collections. The prototype is motivated by a practical use case on a corpus of circa 15,000 digitized resources about European integration since 1945. From this corpus, a dynamic multilayer network was generated that represents different kinds of named entities appearing and co-appearing in the collections. To our knowledge, Intergraph is one of the first interactive tools to visualize dynamic multilayer graphs for collections of digitized historical sources. Graph visualization and interaction methods have been designed based on user requirements for content exploration by non-technical users without a strong background in network science, and to compensate for common flaws in the annotation of named entities. Users work with self-selected subsets of the overall data by interacting with a scene of small graphs which can be added, altered and compared. This allows an interest-driven navigation of the corpus and the discovery of the interconnections of its entities across time.
- Publication . Conference object . 2019 . Open Access . English
  Authors: Lamé, M.; Pittet, P.; Ponchio, F.; Markhoff, B.; Emilio Maria Sanfilippo
  Publisher: HAL CCSD
  Countries: France, Italy
In this paper, we present Heterotoki, an online communication-driven decision support system for aligning terms from one dataset with terms of another dataset (whether or not it is a standardized controlled vocabulary). Heterotoki differs from existing proposals in that it takes place at the interface with humans, inviting the experts to commit to their definitions, so as to either agree to validate the mapping or propose some enrichment to the terminologies. More precisely, unlike most existing proposals that support terminology alignment, Heterotoki sustains the negotiation of meaning through semantic coordination support within its interface design. This negotiation involves domain experts who have produced multiple datasets.
- Publication . 2019 . Open Access . English
  Authors: Marlet, Olivier; Francart, Thomas; Markhoff, Béatrice; Rodier, Xavier
  Publisher: HAL CCSD
  Country: France
  Project: EC | ARIADNEplus (823914)
CIDOC CRM is an ontology intended to facilitate the integration, mediation and interchange of heterogeneous cultural heritage information. The Semantic Web with its Linked Open Data cloud enables scholars and cultural institutions to publish their data in RDF, using CIDOC CRM as an interlingua that enables a semantically consistent re-interpretation of their data. More and more projects have now undertaken the task of mapping legacy datasets to CIDOC CRM, and successful Extract-Transform-Load data-integration processes have been performed in this way. A next step is enabling people and applications to dynamically explore autonomous datasets using the semantic mediation offered by CIDOC CRM. This is the purpose of OpenArchaeo, a tool for querying archaeological datasets on the LOD cloud. We present its main features: the principles behind its user-friendly query interface and its SPARQL endpoint for programs, together with its overall architecture, designed to be extendable and scalable, for handling transparent interconnections with evolving distributed sources while achieving good efficiency.
- Publication . Article . Conference object . 2020 . Open Access . English
  Authors: Ivan Kratchanov
The National Library Ivan Vazov in Plovdiv is the second largest library in Bulgaria. It serves as the second national legal depository of Bulgarian printed works. In addition, it has contributed significantly to the preservation and the digital accessibility of the national cultural and historical heritage. This article offers an overview of the library’s history and current developments in the field of automation and digitization.
- Publication . Presentation . Other literature type . Conference object . 2018 . Open Access
  Authors: Tóth-Czifra, Erzsébet
  Publisher: figshare
Slides presented at the EADH conference in Galway, 9 December 2018. OpenMethods (https://openmethods.dariah.eu) is a metablog that republishes and brings together all sorts of Open Access publications (e.g. research articles, preprints, blogs, videos, podcasts) about Digital Humanities methods and tools, in order to spread knowledge of them and raise peer recognition for them. The metablog has been developed under the supervision of the DARIAH community.
- Publication . Conference object . 2019 . Open Access . English
  Authors: Adeline Joffres; Mike Priddy; Francesca Morselli; Thomas Lebarbé; Xavier Granier; Paul Bertrand; Xavier Rodier; Fabrice Melka; Jason Camlot; Stéfan Sinclair; Idmhand Fatiha; Caroline Abéla; Mehdi Chayani; Christophe Parisse; Céline Poudat; Véronique Ginouvès; Sinatra, Michael E.; Emmanuel Chateau Dutier; Gimena del Rio Riande; Paula Ricaurte; Isabel Galina Russel; José Francisco Barron Tovar; Ernesto Priani Saiso; Martin Grandjean; Aurélien Berra; Olivier Baude; Stéphane Pouyllau
  Publisher: HAL CCSD
  Country: France
Knowledge production has always acted globally, and when it comes to the humanities, early networks of scholars can still be traced in their letter correspondence. With the emergence of digital humanities, more prominently in the 1970s, research communities have organized themselves in many different ways. The enthusiasm generated by the promises of what was sometimes perceived as a "new field" was to some extent echoed in new forms of institutionalization, to the point of defining a discipline in its own right. But this enthusiasm was also accompanied by a certain resistance from communities reluctant to introduce digital technology into their field.
The term "digital humanities", in these earlier days of adopting digital methods in the humanities, created an area, a niche, inside which pioneers in Digital Humanities could gain critical mass. Today, when digital methods are far more widely applied, one can observe an almost opposite trend: the abandoning of a "specific label" and a much broader advocacy concerning all humanities.
What remains specific to DH communities is the close alliance between content providers (which are themselves in the process of digitising content and access), humanities scholars applying digital methods, and computer scientists linking to new methodological achievements in their field. However, this alliance can express itself in very different forms of national and international organisation, and is far from following a specific model.
This panel examines different ways of "forming a community" among digital humanities scholars, scholars in other fields, and other actors in DH. The contributions range from generic ways to design digital research infrastructures in the SSH, through national solutions, to supranational coordination.
The purpose of this panel is to unfold the diversity of the current "digital humanist movement", not only to compare, but also to understand what is at stake for the actors involved and what impact the different forms of organisation have on the creation and evolution of research communities. We further discuss issues of cohesion and durability. Through the papers presented, we will examine the impact of bottom-up, top-down and horizontal strategies as well as the adoption of hybrid solutions (organizational, disciplinary, methodological, scalar) in the design of research communities. This approach will allow us to put convergences and challenges into perspective and to question the recompositions at work within SSH communities.
This panel will highlight the experiences of SSH research communities from different cultures and organizations rooted at different levels of governance: French communities structured around institutional nodes such as the Maisons des Sciences de l'Homme (MSH), or around research infrastructures at the national (TGIR Huma-Num) or European level (DARIAH ERIC); project-based collaboration of research infrastructures in the Netherlands (DANS) and Canada (CRIHN); and professional networks and transnational associations related to digital humanities (e.g. Humanistica, the French-speaking association of digital humanities, or the Latin American network for digital humanities under construction). The comparison of the experiences presented will not produce a homogeneous and smooth image but will highlight differences in approaches and organisation. Even though it seems nearly impossible to give an account of every association that could be representative of a way to build community in DH, the chair of the session will give an introduction with a brief summary of this landscape. That said, besides the geographical aspect that we try to include, we are also giving voice to both formal and informal associations, such as the LatamHD network, which is at an early stage and has not yet defined its goals. We decided to propose several solutions to deal with the diversity of needs and practices inside our communities, and we want to present some of them during this panel to share our experiences and initiate discussions, in order to develop collaborations with colleagues facing the same kind of constraints.
Thus, the objective is to have a broad discussion with the audience, broadening the perspectives to other experiences.
This panel aims to contribute to the reflective work in the wider DH context on the factors of constitution, consolidation and evolution of its research communities.
- Publication . Article . Conference object . Preprint . 2019 . Open Access . English
  Authors: Lilia Simeonova; Kiril Simov; Petya Osenova; Preslav Nakov
We propose a morphologically informed model for named entity recognition, which is based on an LSTM-CRF architecture and combines word embeddings, Bi-LSTM character embeddings, part-of-speech (POS) tags, and morphological information. While previous work has focused on learning from raw word input, using word and character embeddings only, we show that for morphologically rich languages, such as Bulgarian, access to POS information contributes more to the performance gains than the detailed morphological information. Thus, we show that named entity recognition needs only coarse-grained POS tags, but at the same time it can benefit from simultaneously using some POS information of different granularity. Our evaluation results over a standard dataset show sizable improvements over the state-of-the-art for Bulgarian NER.
Keywords: named entity recognition; Bulgarian NER; morphology; morpho-syntax
- Publication . Conference object . Part of book or chapter of book . 2018 . Open Access
  Authors: Paola Puma
  Publisher: Springer International Publishing
  Country: Italy
Here we address the roadmap of the Digital Cultural Heritage research group (DigitCH), which was set up in 2013 at the Department of Architecture, University of Florence. The aim of the DigitCH group was to realize the link between scientifically validated methodologies and contents, innovative storytelling, and technological instrumentation. The spread of electronic devices has enabled a rapid and easy technological fallout of research in the field of acquisition and representation of survey data, expanding audiences and even accelerating an innovative approach to the overall knowledge of CH.
- Publication . Article . 2020 . Open Access . English
  Authors: Luca Foppiano; Laurent Romary
  Publisher: HAL CCSD
  Country: France
  Project: EC | HIRMEOS (731102)
This paper presents an attempt to provide a generic named-entity recognition and disambiguation module (NERD) called entity-fishing as a stable online service that demonstrates the possible delivery of sustainable technical services within DARIAH, the European digital research infrastructure for the arts and humanities. Deployed as part of the national infrastructure Huma-Num in France, this service provides an efficient state-of-the-art implementation coupled with standardised interfaces allowing easy deployment in a variety of potential digital humanities contexts. The topics of accessibility and sustainability have long been discussed in the attempt to provide some best practices in the widely fragmented ecosystem of the DARIAH research infrastructure. The history of entity-fishing has been mentioned as an example of good practice: initially developed in the context of the FP7 CENDARI project, it was well received by the user community and continued to be developed within the H2020 HIRMEOS project, where several open access publishers have integrated the service into their collections of published monographs as a means to enhance retrieval and access.
entity-fishing implements entity extraction as well as disambiguation against Wikipedia and Wikidata entries. The service is accessible through a REST API, which allows easier and seamless integration through a language-independent and stable convention and a widely used service-oriented architecture (SOA) design. Input and output data are carried over a query data model with a defined structure, providing the flexibility to support the processing of partially annotated text or the distribution of text over several queries. The interface implements a variety of functionalities, such as language recognition, sentence segmentation and modules for accessing and looking up concepts in the knowledge base. The API itself integrates more advanced contextual parametrisation and ranked outputs, allowing for resilient integration in various possible use cases. The entity-fishing API has been used as a concrete use case to draft the experimental stand-off proposal, which has been submitted for integration into the TEI guidelines. The representation is also compliant with the Web Annotation Data Model (WADM).
In this paper we aim to describe the functionalities of the service as a reference contribution to the subject of web-based NERD services. In order to cover all aspects, the architecture is structured to provide two complementary viewpoints. First, we discuss the system from the data angle, detailing the workflow from input to output and unpacking each building block in the processing flow. Secondly, with a more academic approach, we provide a transversal schema of the different components, taking into account non-functional requirements in order to facilitate the discovery of bottlenecks, hotspots and weaknesses. The attempt here is to give a description of the tool and, at the same time, a technical software engineering analysis which will help the reader understand our choices for the resources allocated in the infrastructure.
Thanks to the work of millions of volunteers, Wikipedia has today reached a stability and completeness that leave no usable alternatives on the market (considering also the licence aspect). The launch of Wikidata in 2012 has completed the picture with a complementary language-independent meta-model which is becoming the scientific reference for many disciplines. After providing an introduction to Wikipedia and Wikidata, we describe the knowledge base: the data organisation, the entity-fishing process to exploit it, and the way it is built from nightly dumps using an offline process.
We conclude the paper by presenting our solution for the service deployment: which resources were allocated and how. The service has been in production since Q3 of 2017, and was extensively used by the H2020 HIRMEOS partners during the integration with the publishing platforms. We believe we have strived to provide the best performance with the minimum amount of resources. Thanks to the Huma-Num infrastructure we still have the possibility to scale up the infrastructure as needed, for example to support an increase in demand or temporary needs to process a huge backlog of documents. In the long term, thanks to this sustainable environment, we are planning to keep delivering the service far beyond the end of the H2020 HIRMEOS project.
- Publication . 2019 . Open Access . English
  Authors: Raciti, Marco; Gabay, Simon; Moranville, Yoann; Jorge, Maria Do Rosário; Fernandes, João
  Publisher: HAL CCSD
  Country: France
  Project: EC | DESIR (731081)
Europe has a long and rich tradition as a centre of research and teaching in the arts and humanities. However, the huge digital transformation that affects the arts and humanities research landscape all over the world requires that we set up sustainable research infrastructures, new and refined techniques, state-of-the-art methods and an expanded skills base. Responding to these challenges, the Digital Research Infrastructure for Arts and Humanities (DARIAH) was launched as a pan-European network and research infrastructure. After expansion and consolidation, which involved DARIAH’s inclusion in the ESFRI roadmap, DARIAH became a European Research Infrastructure Consortium (ERIC) in 2014. The Horizon 2020 funded project DESIR (DARIAH ERIC Sustainability Refined) sets out to strengthen the sustainability of DARIAH and help establish it as a reliable long-term partner within our communities. Sustaining existing digital expertise, tools and resources in Europe in the context of DESIR involves a goal-oriented set of measures in order, first, to maintain, expand and develop DARIAH in its capacities as an organisation and technical research infrastructure; secondly, to engage its members further, as well as measure and increase their trust in DARIAH; and thirdly, to expand the network in order to integrate new regions and communities. The DESIR consortium is composed of core DARIAH members, representatives from potential new DARIAH members and external technical experts. The sustainability of a research infrastructure is the capacity to remain operative, effective and competitive over its expected lifetime. In DESIR, this definition is translated into an evolving 6-dimensional process, divided into the following challenges:
• Dissemination
• Growth
• Technology
• Robustness
• Trust
• Education
With our poster, we would like to show how the project helps sustain DARIAH.
Within DESIR, dissemination is the ability to communicate DARIAH’s strategy and benefits effectively within the DARIAH community and in new areas, spreading out to new communities. Through the international workshops held at Stanford University and at the Library of Congress, DARIAH has been introduced to many non-European DH scholars. These events were an important first step to foster international cooperation between US and European colleagues as well as a catalyst for ongoing collaborations in the future. A third workshop took place in Canberra at the Australian Research Data Commons in March 2019.
DARIAH currently has 17 members from all over Europe. Nevertheless, efforts should be made to include as many countries as possible to bring in and scale, to a European level, even more state-of-the-art DH activities. Six candidates ready for building strong national consortia have been identified, enabling a substantial expansion of DARIAH’s country coverage. Additionally, thematic workshops are organised in each country as well as tailored training measures.
DESIR widens the research infrastructure in core areas which are vital for DARIAH’s sustainability but are not yet covered by the existing set-up. As DARIAH expands across Europe, continuously enhancing and further developing the ERIC exceeds DARIAH’s internal technological capacities. Two notable results have been achieved so far. The first is the publication of a technical reference as a result of a workshop organised in October 2017 with CESSDA and CLARIN. It is a collection of basic guidelines and references for the development and maintenance of infrastructure services within DARIAH and beyond, addressing an ongoing issue for research infrastructures, namely software sustainability. The second is the organisation of a Code Sprint focusing on bibliographical and citation metadata, which helped shape DARIAH’s profile in four technology areas (visualisation, text analytic services, entity-based search and scholarly content management). Another Code Sprint is expected to take place in summer 2019.
Another output is the implementation of a centralized helpdesk. This helpdesk is hosted by CLARIN-D and was integrated into the existing DARIAH website by means of a WordPress plugin. This plugin connects our website with the OTRS server and allows users unfamiliar with OTRS to create issues easily.
Sustaining a research infrastructure also involves two important aspects: trust and education. For DARIAH, it is crucial to increase trust and confidence among its users. In DESIR we develop recommendations and strategies accordingly, targeting new cross-disciplinary communities, based on the results of a survey and interviews addressed to the scientific community at different levels: national, institutional and individual.
In addition, education is a key area, and the project contributes to the ongoing discussions about the role and modalities of training and education in the development, consolidation and sustainability of digital research infrastructures. We believe that investing time and effort into training and educating users is a way of securing the social sustainability of a research infrastructure.
22 Research products, page 1 of 3
Loading
- Publication . Article . Other literature type . Conference object . 2020Open Access EnglishAuthors:Stefan Bornhofen; Marten Düring;Stefan Bornhofen; Marten Düring;Publisher: HAL CCSDCountry: FranceProject: ANR | BLIZAAR (ANR-15-CE23-0002)
AbstractThe paper presents Intergraph, a graph-based visual analytics technical demonstrator for the exploration and study of content in historical document collections. The designed prototype is motivated by a practical use case on a corpus of circa 15.000 digitized resources about European integration since 1945. The corpus allowed generating a dynamic multilayer network which represents different kinds of named entities appearing and co-appearing in the collections. To our knowledge, Intergraph is one of the first interactive tools to visualize dynamic multilayer graphs for collections of digitized historical sources. Graph visualization and interaction methods have been designed based on user requirements for content exploration by non-technical users without a strong background in network science, and to compensate for common flaws with the annotation of named entities. Users work with self-selected subsets of the overall data by interacting with a scene of small graphs which can be added, altered and compared. This allows an interest-driven navigation in the corpus and the discovery of the interconnections of its entities across time.
Average popularityAverage popularity In bottom 99%Average influencePopularity: Citation-based measure reflecting the current impact.Average influence In bottom 99%Influence: Citation-based measure reflecting the total impact.add Add to ORCIDPlease grant OpenAIRE to access and update your ORCID works.This Research product is the result of merged Research products in OpenAIRE.
You have already added works in your ORCID record related to the merged Research product. - Publication . Conference object . 2019Open Access EnglishAuthors:Lamé, M.; Pittet, P.; Ponchio, F.; Markhoff, B.; EMILIO MARIA SANFILIPPO;Lamé, M.; Pittet, P.; Ponchio, F.; Markhoff, B.; EMILIO MARIA SANFILIPPO;Publisher: HAL CCSDCountries: France, Italy
International audience; In this paper, we present an online communication-driven decision support system to align terms from a dataset with terms of another dataset (standardized controlled vocabulary or not). Heterotoki differs from existing proposals in that it takes place at the interface with humans, inviting the experts to commit on their definitions, so as to either agree to validate the mapping or to propose some enrichment to the terminologies. More precisely, differently to most of existing proposals that support terminology alignment, Heterotoki sustains the negotiation of meaning thanks to semantic coordination support within its interface design. This negotiation involves domain experts having produced multiple datasets.
- Publication . 2019Open Access EnglishAuthors:Marlet , Olivier; Francart, Thomas; Markhoff, Béatrice; Rodier, Xavier;Marlet , Olivier; Francart, Thomas; Markhoff, Béatrice; Rodier, Xavier;Publisher: HAL CCSDCountry: FranceProject: EC | ARIADNEplus (823914)
International audience; CIDOC CRM is an ontology intended to facilitate the integration, mediation and interchange of heterogeneous cultural heritage information. The Semantic Web with its Linked Open Data cloud enables scholars and cultural institutions to publish their data in RDF, using CIDOC CRM as an interlingua that enables a semantically consistent re-interpretation of their data. Nowadays more and more projects have done the task of mapping legacy datasets to CIDOC CRM, and successful Extract-Transform-Load data-integration processes have been performed in this way. A next step is enabling people and applications to actually dynamically explore autonomous datasets using the semantic mediation offered by CIDOC CRM. This is the purpose of OpenArchaeo, a tool for querying archaeological datasets on the LOD cloud. We present its main features: the principles behind its user friendly query interface and its SPARQL Endpoint for programs, together with its overall architecture designed to be extendable and scalable, for handling transparent interconnections with evolving distributed sources while achieving good efficiency.
- Publication . Article . Conference object . 2020Open Access EnglishAuthors:Ivan Kratchanov;Ivan Kratchanov;
International audience; The National Library Ivan Vazov in Plovdiv is the second largest library in Bulgaria. It serves asthe second national legal depository of Bulgarian printed works. In addition, it has contributedsignificantly to the preservation and the digital accessibility of the national cultural andhistorical heritage. This article offers an overview of the library’s history and currentdevelopments in the field of automation and digitization.
Average popularityAverage popularity In bottom 99%Average influencePopularity: Citation-based measure reflecting the current impact.Average influence In bottom 99%Influence: Citation-based measure reflecting the total impact.add Add to ORCIDPlease grant OpenAIRE to access and update your ORCID works.This Research product is the result of merged Research products in OpenAIRE.
You have already added works in your ORCID record related to the merged Research product. - Publication . Presentation . Other literature type . Conference object . 2018Open AccessAuthors:Tóth-Czifra, Erzsébet;Tóth-Czifra, Erzsébet;Publisher: figshare
Slides presented at the EADH conference in Galway, 09.12.2018. OpenMethods (https://openmethods.dariah.eu) is a metablog aimed at republishing and bringing together all sorts of Open Access publications (e.g. research articles, preprints, blogs, videos, podcasts) about Digital Humanities methods and tools to spread the knowledge and raise peer recognition for them. The has been developed in the supervision of the DARIAH community.
Average popularityAverage popularity In bottom 99%Average influencePopularity: Citation-based measure reflecting the current impact.Average influence In bottom 99%Influence: Citation-based measure reflecting the total impact.add Add to ORCIDPlease grant OpenAIRE to access and update your ORCID works.This Research product is the result of merged Research products in OpenAIRE.
You have already added works in your ORCID record related to the merged Research product. - Publication . Conference object . 2019Open Access EnglishAuthors:Adeline Joffres; Mike Priddy; Francesca Morselli; Thomas Lebarbé; Xavier Granier; Paul Bertrand; Xavier Rodier; Fabrice Melka; Jason Camlot; Stéfan Sinclair; +17 moreAdeline Joffres; Mike Priddy; Francesca Morselli; Thomas Lebarbé; Xavier Granier; Paul Bertrand; Xavier Rodier; Fabrice Melka; Jason Camlot; Stéfan Sinclair; Idmhand Fatiha; Caroline Abéla; Mehdi Chayani; Christophe Parisse; Céline Poudat; Véronique Ginouvès; Sinatra, Michael E.; Emmanuel Chateau Dutier; Gimena del Rio Riande; Paula Ricaurte; Isabel Galina Russel; José Francisco Barron Tovar; Ernesto Priani Saiso; Martin Grandjean; Aurélien Berra; Olivier Baude; Stéphane Pouyllau;Publisher: HAL CCSDCountry: France
International audience; Knowledge production has always acted globally, and when it comes to the humanities, early networks of scholars can still be traced in their letter correspondence. With the more prominent emergence of digital humanities in the 1970s, research communities have organized themselves in many different ways. The enthusiasm generated by the promises of what was sometimes perceived as a "new field" was to some extent echoed in new forms of institutionalization, to the point of defining a discipline in its own right. But this enthusiasm was also accompanied by a certain resistance from communities reluctant to introduce digital technology into their field. In these earlier days of adopting digital methods in the humanities, the term "digital humanities" created an area, a niche, inside which pioneers in Digital Humanities could gain critical mass. Today, when digital methods are far more widely applied, one can observe an almost opposite trend: the abandoning of a "specific label" and a much broader advocacy concerning all humanities. What remains specific to DH communities is the close alliance between content providers (which are themselves in the process of digitising content and access), humanities scholars applying digital methods, and computer scientists linking to new methodological achievements in their field. However, this alliance can express itself in very different forms of national and international organisation, and is far from following a specific model. This panel examines different ways of "forming a community" among digital humanities scholars, scholars in other fields, and other actors in DH.
The contributions range from generic ways to design digital research infrastructures in the SSH, through national solutions, to supranational coordination. The purpose of this panel is to unfold the diversity of the current "digital humanist movement", not only to compare, but also to understand what is at stake for the actors involved and what impact the different forms of organisation have on the creation and evolution of research communities. We further discuss issues of cohesion and durability. Through the papers presented, we will examine the impact of bottom-up, top-down and horizontal strategies as well as the adoption of hybrid solutions (organizational, disciplinary, methodological, scalar) in the design of research communities. This approach will allow us to put convergences and challenges into perspective and to question the recompositions at work within SSH communities. This panel will highlight the experiences of SSH research communities from different cultures and organizations rooted at different levels of governance, such as French communities structured around institutional nodes such as the Maisons des Sciences de l'Homme (MSH), or research infrastructures at the national (TGIR Huma-Num) or European level (DARIAH ERIC); project-based collaboration of research infrastructures in the Netherlands (DANS) and Canada (CRIHN); and professional networks and transnational associations related to digital humanities (e.g. Humanistica, the French-speaking association of digital humanities, or the Latin American network for digital humanities under construction). The comparison of the experiences presented will not produce a homogeneous and smooth image but will highlight differences in approaches and organisation. Even if it seems nearly impossible to give an account of every association that could be representative of a way to build community in DH, the chair of the session will open with a brief summary of this landscape.
That said, besides the geographical aspect that we try to include, another aspect is that we are giving voice to both formal and informal associations, such as the LatamHD network, which is still at an early stage and whose goals are not yet defined. We decided to propose several solutions to deal with the diversity of needs and practices inside our communities, and we want to present some of them during this panel to share our experiences and initiate discussions, in order to develop collaborations with colleagues facing the same kind of constraints. Thus, the objective is to have a broad discussion with the audience and to widen the perspective to other experiences. This panel aims to contribute to the reflective work in the wider DH context about the factors of constitution, consolidation and evolution of its research communities.
- Publication . Article . Conference object . Preprint . 2019 . Open Access . English . Authors: Lilia Simeonova; Kiril Simov; Petya Osenova; Preslav Nakov
We propose a morphologically informed model for named entity recognition, which is based on an LSTM-CRF architecture and combines word embeddings, Bi-LSTM character embeddings, part-of-speech (POS) tags, and morphological information. While previous work has focused on learning from raw word input, using word and character embeddings only, we show that for morphologically rich languages, such as Bulgarian, access to POS information contributes more to the performance gains than the detailed morphological information. Thus, we show that named entity recognition needs only coarse-grained POS tags, but at the same time it can benefit from simultaneously using some POS information of different granularity. Our evaluation results over a standard dataset show sizable improvements over the state-of-the-art for Bulgarian NER. Keywords: named entity recognition; Bulgarian NER; morphology; morpho-syntax
- Publication . Conference object . Part of book or chapter of book . 2018 . Open Access . Authors: Paola Puma . Publisher: Springer International Publishing . Country: Italy
Here we address the roadmap of the Digital Cultural Heritage research group (DigitCH), which was set up in 2013 at the Department of Architecture, University of Florence. The aim of the DigitCH group was to link scientifically validated methodologies and contents, innovative storytelling, and technological instrumentation. The spread of electronic devices has enabled a rapid and easy technological spin-off of research in the field of survey data acquisition and representation, expanding audiences and accelerating an innovative approach to the overall knowledge of CH.
- Publication . Article . 2020 . Open Access . English . Authors: Luca Foppiano; Laurent Romary . Publisher: HAL CCSD . Country: France . Project: EC | HIRMEOS (731102)
International audience; This paper presents an attempt to provide a generic named-entity recognition and disambiguation module (NERD) called entity-fishing as a stable online service that demonstrates the possible delivery of sustainable technical services within DARIAH, the European digital research infrastructure for the arts and humanities. Deployed as part of the national infrastructure Huma-Num in France, this service provides an efficient state-of-the-art implementation coupled with standardised interfaces allowing easy deployment in a variety of potential digital humanities contexts. The topics of accessibility and sustainability have long been discussed in the attempt to provide some best practices in the widely fragmented ecosystem of the DARIAH research infrastructure. The history of entity-fishing has been mentioned as an example of good practice: initially developed in the context of the FP7 CENDARI project, it was well received by the user community and continued to be developed within the H2020 HIRMEOS project, where several open access publishers have integrated the service into their collections of published monographs as a means to enhance retrieval and access. entity-fishing implements entity extraction as well as disambiguation against Wikipedia and Wikidata entries. The service is accessible through a REST API, which allows easier and seamless integration through a language-independent and stable convention and a widely used service-oriented architecture (SOA) design. Input and output data are carried by a query data model with a defined structure, providing the flexibility to support the processing of partially annotated text or the splitting of text across several queries. The interface implements a variety of functionalities, such as language recognition, sentence segmentation and modules for accessing and looking up concepts in the knowledge base.
The API itself integrates more advanced contextual parametrisation and ranked outputs, allowing for resilient integration in various possible use cases. The entity-fishing API has been used as a concrete use case to draft the experimental stand-off proposal, which has been submitted for integration into the TEI guidelines. The representation is also compliant with the Web Annotation Data Model (WADM). In this paper we aim to describe the functionalities of the service as a reference contribution to the subject of web-based NERD services. In order to cover all aspects, the architecture is structured to provide two complementary viewpoints. First, we discuss the system from the data angle, detailing the workflow from input to output and unpacking each building block in the processing flow. Secondly, with a more academic approach, we provide a transversal schema of the different components, taking into account non-functional requirements in order to facilitate the discovery of bottlenecks, hotspots and weaknesses. The attempt here is to give a description of the tool and, at the same time, a technical software engineering analysis which will help the reader to understand our choices for the resources allocated in the infrastructure. Thanks to the work of millions of volunteers, Wikipedia has today reached a stability and completeness that leave no usable alternatives on the market (considering also the licence aspect). The launch of Wikidata in 2012 has completed the picture with a complementary language-independent meta-model which is becoming the scientific reference for many disciplines. After providing an introduction to Wikipedia and Wikidata, we describe the knowledge base: the data organisation, the entity-fishing process to exploit it and the way it is built from nightly dumps using an offline process. We conclude the paper by presenting our solution for the service deployment: which resources were allocated and how.
The service has been in production since Q3 of 2017, and has been extensively used by the H2020 HIRMEOS partners during the integration with the publishing platforms. We believe we have strived to provide the best performance with the minimum amount of resources. Thanks to the Huma-Num infrastructure we still have the possibility to scale up the infrastructure as needed, for example to support an increase in demand or temporary needs to process a huge backlog of documents. In the long term, thanks to this sustainable environment, we are planning to keep delivering the service far beyond the end of the H2020 HIRMEOS project.
- Publication . 2019 . Open Access . English . Authors: Raciti, Marco; Gabay, Simon; Moranville, Yoann; Jorge, Maria Do Rosário; Fernandes, João . Publisher: HAL CCSD . Country: France . Project: EC | DESIR (731081)
International audience; Europe has a long and rich tradition as a centre of research and teaching in the arts and humanities. However, the huge digital transformation that affects the arts and humanities research landscape all over the world requires that we set up sustainable research infrastructures, new and refined techniques, state-of-the-art methods and an expanded skills base. Responding to these challenges, the Digital Research Infrastructure for Arts and Humanities (DARIAH) was launched as a pan-European network and research infrastructure. After expansion and consolidation, which involved DARIAH's inclusion in the ESFRI roadmap, DARIAH became a European Research Infrastructure Consortium (ERIC) in 2014. The Horizon 2020 funded project DESIR (DARIAH ERIC Sustainability Refined) sets out to strengthen the sustainability of DARIAH and help establish it as a reliable long-term partner within our communities. Sustaining existing digital expertise, tools and resources in Europe in the context of DESIR involves a goal-oriented set of measures in order, first, to maintain, expand and develop DARIAH in its capacities as an organisation and technical research infrastructure; secondly, to engage its members further, as well as measure and increase their trust in DARIAH; thirdly, to expand the network in order to integrate new regions and communities. The DESIR consortium is composed of core DARIAH members, representatives from potential new DARIAH members and external technical experts. The sustainability of a research infrastructure is the capacity to remain operative, effective and competitive over its expected lifetime. In DESIR, this definition is translated into an evolving 6-dimensional process, divided into the following challenges: Dissemination, Growth, Technology, Robustness, Trust and Education. With our poster, we would like to show how the project helps sustain DARIAH.
Within DESIR, dissemination is the ability to communicate DARIAH's strategy and benefits effectively within the DARIAH community and in new areas, spreading out to new communities. Through the international workshops held at Stanford University and at the Library of Congress, DARIAH has been introduced to many non-European DH scholars. These events were an important first step to foster international cooperation between US and European colleagues as well as a catalyst for ongoing collaborations in the future. A third workshop took place in Canberra at the Australian Research Data Commons in March 2019. DARIAH currently has 17 members from all over Europe. Nevertheless, efforts should be made to include as many countries as possible to bring in and scale, to a European level, even more state-of-the-art DH activities. Six candidates ready for building strong national consortia have been identified, enabling a substantial expansion of DARIAH's country coverage. Additionally, thematic workshops are organised in each country as well as tailored training measures. DESIR widens the research infrastructure in core areas which are vital for DARIAH's sustainability but are not yet covered by the existing set-up. As DARIAH expands across Europe, continuously enhancing and further developing the ERIC exceeds DARIAH's internal technological capacities. Two notable results have been achieved so far: firstly, the publication of a technical reference as a result of a workshop organised in October 2017 with CESSDA and CLARIN. It is a collection of basic guidelines and references for the development and maintenance of infrastructure services within DARIAH and beyond, addressing an ongoing issue for research infrastructures, namely software sustainability.
Secondly, the organisation of a Code Sprint, focusing on bibliographical and citation metadata, which helped shape DARIAH's profile in four technology areas (visualisation, text analytic services, entity-based search and scholarly content management). Another Code Sprint is expected to take place in Summer 2019. Another output is the implementation of a centralized helpdesk. This helpdesk is hosted by CLARIN-D, and it was integrated into the existing DARIAH website by creating a WordPress plugin. This plugin connects our website with the OTRS server and allows users unfamiliar with OTRS to create issues easily. Sustaining a research infrastructure also involves two important aspects: trust and education. For DARIAH, it is crucial to increase trust and confidence from its users. In DESIR we develop recommendations and strategies accordingly, targeting new cross-disciplinary communities, based on the results of a survey and interviews addressed to the scientific community, with different levels of approach: national, institutional and individual. In addition, education is a key area, and the project contributes to the ongoing discussions about the role and modalities of training and education in the development, consolidation and sustainability of digital research infrastructures. We believe that investing time and effort into training and educating users is a way of securing the social sustainability of a research infrastructure.