19 Research products, page 1 of 2
- Publication . Other literature type . Article . 2018 . Open Access . English
Authors: Atherton, Christopher John; Barton, Thomas; Basney, Jim; Broeder, Daan; Costa, Alessandro; Daalen, Mirjam Van; Dyke, Stephanie; Elbers, Willem; Enell, Carl-Fredrik; Fasanelli, Enrico Maria Vincenzo; Fernandes, João; Florio, Licia; Gietz, Peter; Groep, David L.; Junker, Matthias Bernhard; Kanellopoulos, Christos; Kelsey, David; Kershaw, Philip; Knapic, Cristina; Kollegger, Thorsten; Koranda, Scott; Linden, Mikael; Marinic, Filip; Matyska, Ludek; Nyrönen, Tommi Henrik; Paetow, Stefan; Paglione, Laura A D; Parlati, Sandra; Phillips, Christopher; Prochazka, Michal; Rees, Nicholas; Short, Hannah; Stevanovic, Uros; Tartakovsky, Michael; Venekamp, Gerben; Vitez, Tom; Wartel, Romain; Whalen, Christopher; White, John; Zwölf, Carlo Maria
Country: Germany
Project: EC | GN4-2 (731122), EC | IS-ENES2 (312979), EC | IS-ENES (228203), EC | CALIPSOplus (730872), EC | CORBEL (654248), EC | AARC2 (730941), EC | EOSC-hub (777536), EC | ELIXIR-EXCELERATE (676559), NSF | Data Handling and Analysi... (1700765)
The authors also acknowledge the support and collaboration of many other colleagues in their respective institutes, research communities and IT Infrastructures, together with the funding received by these from many different sources. These include but are not limited to the following: (i) The Worldwide LHC Computing Grid (WLCG) project is a global collaboration of more than 170 computing centres in 43 countries, linking up national and international grid infrastructures. Funding is acknowledged from many national funding bodies and we acknowledge the support of several operational infrastructures including EGI, OSG and NDGF/NeIC. (ii) EGI acknowledges the funding and support received from the European Commission and the many National Grid Initiatives and other members. EOSC-hub receives funding from the European Union's Horizon 2020 research and innovation programme under grant agreement No 777536. (iii) The work leading to these results has received funding from the European Union's Horizon 2020 research and innovation programme under Grant Agreement No. 730941 (AARC2). (iv) Work on the development of ESGF's identity management system has been supported by The UK Natural Environment Research Council and funding from the European Union's Seventh Framework Programme for research, technological development and demonstration through projects IS-ENES (grant agreement no 228203) and IS-ENES2 (grant agreement no 312979). (v) Ludek Matyska and Michal Prochazka acknowledge funding from the RI ELIXIR CZ project funded by MEYS Czech Republic No. LM2015047. (vi) Scott Koranda acknowledges support provided by the United States National Science Foundation under Grant No. PHY-1700765. (vii) GÉANT Association on behalf of the GN4 Phase 2 project (GN4-2). The research leading to these results has received funding from the European Union's Horizon 2020 research and innovation programme under Grant Agreement No. 731122 (GN4-2). (viii) ELIXIR acknowledges support from Research Infrastructure programme of Horizon 2020 grant No 676559 EXCELERATE. (ix) CORBEL life science cluster acknowledges support from Horizon 2020 research and innovation programme under grant agreement No 654248. (x) Mirjam van Daalen acknowledges that the research leading to this result has been supported by the project CALIPSOplus under the Grant Agreement 730872 from the EU Framework Programme for Research and Innovation HORIZON 2020. (xi) EISCAT is an international association supported by research organisations in China (CRIRP), Finland (SA), Japan (NIPR), Norway (NFR), Sweden (VR), and the United Kingdom (NERC).
This white paper expresses common requirements of Research Communities seeking to leverage Identity Federation for Authentication and Authorisation. Recommendations are made to Stakeholders to guide the future evolution of Federated Identity Management in a direction that better satisfies research use cases. The authors represent research communities, Research Services, Infrastructures, Identity Federations and Interfederations, with a joint motivation to ease collaboration for distributed researchers. The content has been edited collaboratively by the Federated Identity Management for Research (FIM4R) Community, with input sought at conferences and meetings in Europe, Asia and North America.
- Publication . 2018 . French
Authors: Ginouvès, Véronique; Gras, Isabelle
Publisher: HAL CCSD
Country: France
International audience; By way of an afterword, we felt it necessary to look back at the collaborative process behind the making of this book and to share with you how the project began. It all started from a pragmatic observation drawn from our everyday working situations: researchers who produce or use data need concrete answers to the questions they face in the field and throughout their research work. Producing, using, disseminating, sharing or publishing digital sources is now part of our ordinary work. The break brought about by the development of the web, together with the arrival of digital formats, has greatly facilitated the dissemination and sharing of resources (documentary, textual, photographic, sound or audiovisual...) within the research world and, beyond it, among citizens who are increasingly curious about and interested in the documents produced by scientists.
- Publication . Article . Preprint . Conference object . 2019 . Open Access
Authors: Lilia Simeonova; Kiril Simov; Petya Osenova; Preslav Nakov
Publisher: Incoma Ltd., Shoumen, Bulgaria
We propose a morphologically informed model for named entity recognition, which is based on LSTM-CRF architecture and combines word embeddings, Bi-LSTM character embeddings, part-of-speech (POS) tags, and morphological information. While previous work has focused on learning from raw word input, using word and character embeddings only, we show that for morphologically rich languages, such as Bulgarian, access to POS information contributes more to the performance gains than the detailed morphological information. Thus, we show that named entity recognition needs only coarse-grained POS tags, but at the same time it can benefit from simultaneously using some POS information of different granularity. Our evaluation results over a standard dataset show sizable improvements over the state-of-the-art for Bulgarian NER. Comment: named entity recognition; Bulgarian NER; morphology; morpho-syntax
- Publication . Article . 2014 . Open Access . English
Authors: Bridgette Wessels; Rachel Finn; Peter Linde; Paolo Mazzetti; Stefano Nativi; Susan Riley; Rod Smallwood; Mark J. Taylor; Victoria Tsoukala; Kush Wadhwa; Sally Wyatt
Countries: Italy, Netherlands, Sweden
Project: EC | OPENAIRE (246686), EC | DRIVER II (212147), EC | APARSEN (269977), EC | RECODE (321463), EC | ETTIS (285593)
This paper explores key issues in the development of open access to research data. The use of digital means for developing, storing and manipulating data is creating a focus on ‘data-driven science’. One aspect of this focus is the development of ‘open access’ to research data. Open access to research data refers to the way in which various types of data are openly available to public and private stakeholders, user communities and citizens. Open access to research data, however, involves more than simply providing easier and wider access to data for potential user groups. The development of open access requires attention to the ways data are considered in different areas of research. We identify how open access is being unevenly developed across the research environment and the consequences this has in terms of generating data gaps. Data gaps refer to the way data becomes detached from published conclusions. To address these issues, we examine four main areas in developing open access to research data: stakeholder roles and values; technological requirements for managing and sharing data; legal and ethical regulations and procedures; institutional roles and policy frameworks. We conclude that problems of variability and consistency across the open access ecosystem need to be addressed within and between these areas to ensure that risks surrounding a data gap are managed in open access. 11 authors. Missing: Sally Wyatt
- Publication . Article . 2020 . Open Access
Authors: Riccardo Pozzo; Andrea Filippetti; Mario Paolucci; Vania Virgili
Publisher: Oxford University Press (OUP)
Country: Italy
This article introduces the notion of cultural innovation, which requires adapting our approach to co-creation. The argument opens with a first conceptualization of cultural innovation as an additional and autonomous category of the complex processes of co-creation. The dimensions of cultural innovation are contrasted against other forms of innovation. In a second step, the article makes an unprecedented attempt in describing processes and outcomes of cultural innovation, while showing their operationalization in some empirical case studies. In the conclusion, the article considers policy implications resulting from the novel definition of cultural innovation as the outcome of complex processes that involve the reflection of knowledge flows across the social environment within communities of practices while fostering the inclusion of diversity in society. First and foremost, cultural innovation takes a critical stance against inequalities in the distribution of knowledge and builds innovation for improving the welfare of individuals and communities.
- Publication . 2020 . English
Authors: Kristanti, Tanti; Romary, Laurent
Publisher: HAL CCSD
Country: France
International audience; This article presents an overview of approaches and results during our participation in the CLEF HIPE 2020 NERC-COARSE-LIT and EL-ONLY tasks for English and French. For these two tasks, we use two systems: 1) DeLFT, a Deep Learning framework for text processing; 2) entity-fishing, generic named entity recognition and disambiguation service deployed in the technical framework of INRIA.
- Publication . Article . 2020 . Open Access . English
Authors: Luca Foppiano; Laurent Romary
Publisher: HAL CCSD
Country: France
Project: EC | HIRMEOS (731102)
International audience; This paper presents an attempt to provide a generic named-entity recognition and disambiguation module (NERD) called entity-fishing as a stable online service that demonstrates the possible delivery of sustainable technical services within DARIAH, the European digital research infrastructure for the arts and humanities. Deployed as part of the national infrastructure Huma-Num in France, this service provides an efficient state-of-the-art implementation coupled with standardised interfaces allowing easy deployment in a variety of potential digital humanities contexts. The topics of accessibility and sustainability have long been discussed in the attempt to provide some best practices in the widely fragmented ecosystem of the DARIAH research infrastructure. The history of entity-fishing has been mentioned as an example of good practice: initially developed in the context of the FP7 CENDARI project, it was well received by the user community and continued to be developed within the H2020 HIRMEOS project, where several open access publishers have integrated the service into their collections of published monographs as a means to enhance retrieval and access.
entity-fishing implements entity extraction as well as disambiguation against Wikipedia and Wikidata entries. The service is accessible through a REST API, which allows easier and seamless integration, a language-independent and stable convention, and a widely used service-oriented architecture (SOA) design. Input and output data are carried by a query data model with a defined structure, providing flexibility to support the processing of partially annotated text or the repartition of text over several queries. The interface implements a variety of functionalities, such as language recognition, sentence segmentation and modules for accessing and looking up concepts in the knowledge base. The API itself integrates more advanced contextual parametrisation and ranked outputs, allowing for resilient integration in various possible use cases. The entity-fishing API has been used as a concrete use case to draft the experimental stand-off proposal, which has been submitted for integration into the TEI guidelines. The representation is also compliant with the Web Annotation Data Model (WADM).
In this paper we aim to describe the functionalities of the service as a reference contribution to the subject of web-based NERD services. In order to cover all aspects, the architecture is structured to provide two complementary viewpoints. First, we discuss the system from the data angle, detailing the workflow from input to output and unpacking each building block in the processing flow. Secondly, with a more academic approach, we provide a transversal schema of the different components, taking into account non-functional requirements in order to facilitate the discovery of bottlenecks, hotspots and weaknesses. The attempt here is to give a description of the tool and, at the same time, a technical software engineering analysis which will help the reader to understand our choice of resources allocated in the infrastructure.
Thanks to the work of millions of volunteers, Wikipedia has today reached a stability and completeness that leave no usable alternatives on the market (considering also the licence aspect). The launch of Wikidata in 2012 has completed the picture with a complementary language-independent meta-model which is becoming the scientific reference for many disciplines. After providing an introduction to Wikipedia and Wikidata, we describe the knowledge base: the data organisation, the entity-fishing process to exploit it and the way it is built from nightly dumps using an offline process.
We conclude the paper by presenting our solution for the service deployment: how, and which, resources were allocated. The service has been in production since Q3 of 2017, and has been extensively used by the H2020 HIRMEOS partners during the integration with the publishing platforms. We have strived to provide the best performance with the minimum amount of resources. Thanks to the Huma-Num infrastructure we still have the possibility to scale up the infrastructure as needed, for example to support an increase in demand or temporary needs to process a huge backlog of documents. In the long term, thanks to this sustainable environment, we are planning to keep delivering the service far beyond the end of the H2020 HIRMEOS project.
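The query data model mentioned in this abstract (a structured query that can carry partially annotated text, or a text spread over several queries) can be illustrated with a minimal Python sketch. The listing does not give the actual schema, so every field name used here (`text`, `language`, `entities`, `rawName`, `offsetStart`, `offsetEnd`) and both helper functions are assumptions made for illustration only; the sketch builds JSON payloads and does not call any service.

```python
import json


def build_query(text, language=None, entities=None):
    """Assemble one query document (all field names are illustrative assumptions)."""
    query = {"text": text}
    if language is not None:
        query["language"] = {"lang": language}
    if entities:
        for entity in entities:
            start, end = entity["offsetStart"], entity["offsetEnd"]
            # A pre-annotated mention must match the text it points at.
            if text[start:end] != entity["rawName"]:
                raise ValueError(f"offsets do not match mention {entity['rawName']!r}")
        query["entities"] = list(entities)
    return query


def split_text(text, max_chars):
    """Spread a long text over several chunks, breaking only on whitespace."""
    chunks, current = [], ""
    for word in text.split():
        candidate = word if not current else current + " " + word
        if len(candidate) <= max_chars or not current:
            current = candidate
        else:
            chunks.append(current)
            current = word
    if current:
        chunks.append(current)
    return chunks


if __name__ == "__main__":
    # Partially annotated text: one mention is supplied up front.
    annotated = build_query(
        "Paris is in France",
        language="en",
        entities=[{"rawName": "Paris", "offsetStart": 0, "offsetEnd": 5}],
    )
    print(json.dumps(annotated, ensure_ascii=False))

    # A longer text repartitioned over several queries.
    for chunk in split_text("Wikipedia and Wikidata form the knowledge base", 20):
        print(json.dumps(build_query(chunk, language="en")))
```

Validating the supplied offsets before sending anything is a design choice of this sketch, not something the abstract describes; it simply catches mismatched annotations early.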
- Publication . Article . Conference object . 2016 . Open Access . English
Authors: Marcin Pciennik; Sandro Fiore; Giacinto Donvito; M. Owsiak; Marco Fargetta; Roberto Barbera; Riccardo Bruno; Emidio Giorgio; Dean N. Williams; Giovanni Aloisio
handle: 11587/403865
Publisher: Elsevier
Country: Italy
Project: EC | INDIGO-DataCloud (653549)
In this paper we present the approach proposed by the EU H2020 INDIGO-DataCloud project to orchestrate dynamic workflows over a cloud environment. The main focus of the project is on the development of open source Platform as a Service (PaaS) solutions targeted at scientific communities, deployable on multiple hardware platforms, and provisioned over hybrid e-Infrastructures. The project is addressing many challenging gaps in current cloud solutions, responding to specific requirements coming from scientific communities including Life Sciences, Physical Sciences and Astronomy, Social Sciences and Humanities, and Environmental Sciences. We present the ongoing work on implementing the whole software chain on the Infrastructure as a Service, PaaS and Software as a Service layers, focusing on the scenarios involving scientific workflows and big data analytics frameworks. The INDIGO module for the Kepler workflow system is introduced, along with the underlying INDIGO services exploited by the workflow components. A climate change data analytics use case, a precipitation trend analysis on CMIP5 data that makes use of Kepler and big data analytics services, is described.
- Publication . Article . 2013 . Open Access . English
Authors: Mark Hedges; Heike Neuroth; Kathleen M. Smith; Tobias Blanke; Laurent Romary; Marc Wilhelm Küster; Malcolm Illingworth
doi: 10.4000/jtei.774
Publisher: TEI Consortium
Country: France
International audience; In recent years, a variety of initiatives have been funded with the aim of producing software tools or environments of a type variously known as virtual research environments, research infrastructures, or cyberinfrastructures. These initiatives vary in their scale, specialization, scope, and level of funding. One issue that they face in common, however, is that of sustainability: how can the continued, and useful, existence of a system or tool be guaranteed, or at least facilitated, once a project's funding has been spent? In this paper, we examine how such sustainability has been enabled, in the particular case of infrastructures for textual scholarship, in the context of three international projects: TextGrid, TEXTvre, and DARIAH. Firstly, we will address the inter-project collaboration and cross-fertilization between TextGrid and TEXTvre, including architectural decisions and shared data infrastructures, and investigate how the projects benefited from the exchange. We will then discuss how this existing collaboration can be taken forward by the loosely-coupled and distributed framework being developed by the DARIAH community, and how it can serve as a model for the sort of collaborations that DARIAH plans to enable.
- Publication . Article . 2010 . Open Access . English
Authors: Benoit Habert; Claude Huc
Publisher: HAL CCSD
Country: France
To help readers understand the possible interactions between transmission and digitisation, a pilot project in long-term digital archiving is presented by its two coordinators. The article describes the current context in which past and present research in the humanities and social sciences (SHS) is transmitted in digital form. It highlights the gap between the growing role of digital data and their fragility. It presents the standard abstract model for long-term digital archiving and the way it was instantiated in the pilot project. It closes with a reflective look at the factors that will shape the future of similar projects: organisational choices and behaviour, the respective roles of data and knowledge, the formation and behaviour of user communities, and the status of collective memory in the humanities and social sciences.
Average popularityAverage popularity In bottom 99%Average influencePopularity: Citation-based measure reflecting the current impact.Average influence In bottom 99%Influence: Citation-based measure reflecting the total impact.add Add to ORCIDPlease grant OpenAIRE to access and update your ORCID works.This Research product is the result of merged Research products in OpenAIRE.
You have already added works in your ORCID record related to the merged Research product.
19 Research products, page 1 of 2
Loading
- Publication . Other literature type . Article . 2018Open Access EnglishAuthors:Atherton, Christopher John; Barton, Thomas; Basney, Jim; Broeder, Daan; Costa, Alessandro; Daalen, Mirjam Van; Dyke, Stephanie; Elbers, Willem; Enell, Carl-Fredrik; Fasanelli, Enrico Maria Vincenzo; +30 moreAtherton, Christopher John; Barton, Thomas; Basney, Jim; Broeder, Daan; Costa, Alessandro; Daalen, Mirjam Van; Dyke, Stephanie; Elbers, Willem; Enell, Carl-Fredrik; Fasanelli, Enrico Maria Vincenzo; Fernandes, João; Florio, Licia; Gietz, Peter; Groep, David L.; Junker, Matthias Bernhard; Kanellopoulos, Christos; Kelsey, David; Kershaw, Philip; Knapic, Cristina; Kollegger, Thorsten; Koranda, Scott; Linden, Mikael; Marinic, Filip; Matyska, Ludek; Nyrönen, Tommi Henrik; Paetow, Stefan; Paglione, Laura A D; Parlati, Sandra; Phillips, Christopher; Prochazka, Michal; Rees, Nicholas; Short, Hannah; Stevanovic, Uros; Tartakovsky, Michael; Venekamp, Gerben; Vitez, Tom; Wartel, Romain; Whalen, Christopher; White, John; Zwölf, Carlo Maria;Country: GermanyProject: EC | GN4-2 (731122), EC | IS-ENES2 (312979), EC | IS-ENES (228203), EC | CALIPSOplus (730872), EC | CORBEL (654248), EC | AARC2 (730941), EC | EOSC-hub (777536), EC | ELIXIR-EXCELERATE (676559), NSF | Data Handling and Analysi... (1700765)
The authors also acknowledge the support and collaboration of many other colleagues in their respective institutes, research communities and IT Infrastructures, together with the funding received by these from many different sources. These include but are not limited to the following: (i) The Worldwide LHC Computing Grid (WLCG) project is a global collaboration of more than 170 computing centres in 43 countries, linking up national and international grid infrastructures. Funding is acknowledged from many national funding bodies and we acknowledge the support of several operational infrastructures including EGI, OSG and NDGF/NeIC. (ii) EGI acknowledges the funding and support received from the European Commission and the many National Grid Initiatives and other members. EOSC-hub receives funding from the European Union's Horizon 2020 research and innovation programme under grant agreement No 777536. (iii) The work leading to these results has received funding from the European Union's Horizon 2020 research and innovation programme under Grant Agreement No. 730941 (AARC2). (iv) Work on the development of ESGF's identity management system has been supported by The UK Natural Environment Research Council and funding from the European Union's Seventh Framework Programme for research, technological development and demonstration through projects IS-ENES (grant agreement no 228203) and IS-ENES2 (grant agreement no 312979). (v) Ludek Matyska and Michal Prochazka acknowledge funding from the RI ELIXIR CZ project funded by MEYS Czech Republic No. LM2015047. (vi) Scott Koranda acknowledges support provided by the United States National Science Foundation under Grant No. PHY-1700765. (vii) GÉANT Association on behalf of the GN4 Phase 2 project (GN4-2).The research leading to these results has received funding from the European Union's Horizon 2020 research and innovation programme under Grant Agreement No. 731122(GN4-2). 
(viii) ELIXIR acknowledges support from Research Infrastructure programme of Horizon 2020 grant No 676559 EXCELERATE. (ix) CORBEL life science cluster acknowledges support from Horizon 2020 research and innovation programme under grant agreement No 654248. (x) Mirjam van Daalen acknowledges that the research leading to this result has been supported by the project CALIPSOplus under the Grant Agreement 730872 from the EU Framework Programme for Research and Innovation HORIZON 2020. (xi) EISCAT is an international association supported by research organisations in China (CRIRP), Finland (SA), Japan (NIPR), Norway (NFR), Sweden (VR), and the United Kingdom (NERC). This white-paper expresses common requirements of Research Communities seeking to leverage Identity Federation for Authentication and Authorisation. Recommendations are made to Stakeholders to guide the future evolution of Federated Identity Management in a direction that better satisfies research use cases. The authors represent research communities, Research Services, Infrastructures, Identity Federations and Interfederations, with a joint motivation to ease collaboration for distributed researchers. The content has been edited collaboratively by the Federated Identity Management for Research (FIM4R) Community, with input sought at conferences and meetings in Europe, Asia and North America.
Average popularityAverage popularity In bottom 99%Average influencePopularity: Citation-based measure reflecting the current impact.Average influence In bottom 99%Influence: Citation-based measure reflecting the total impact.add Add to ORCIDPlease grant OpenAIRE to access and update your ORCID works.This Research product is the result of merged Research products in OpenAIRE.
You have already added works in your ORCID record related to the merged Research product. - Publication . 2018FrenchAuthors:Ginouvès, Véronique; Gras, Isabelle;Ginouvès, Véronique; Gras, Isabelle;Publisher: HAL CCSDCountry: France
International audience; En guise de postface, il nous a semblé nécessaire de revenir sur le processus collaboratif de la fabrication de cet ouvrage et de vous confier la genèse de ce projet. Tout est parti d'un constat pragmatique, de nos situations quotidiennes de travail : le/la chercheur·e qui produit ou utilise des données a besoin de réponses concrètes aux questions auxquelles il/elle est confronté·e sur son terrain comme lors de tous ses travaux de recherche. Produire, exploiter, diffuser, partager ou éditer des sources numériques fait aujourd'hui partie de notre travail ordinaire. La rupture apportée par le développement du web et l'arrivée du format numérique ont largement facilité la diffusion et le partage des ressources (documentaires, textuelles, photographiques, sonores ou audiovisuelles...) dans le monde de la recherche et, au-delà, auprès des citoyens de plus en plus curieux et intéressés par les documents produits par les scientifiques.
- Publication . Article . Preprint . Conference object . 2019Open AccessAuthors:Lilia Simeonova; Kiril Simov; Petya Osenova; Preslav Nakov;Lilia Simeonova; Kiril Simov; Petya Osenova; Preslav Nakov;Publisher: Incoma Ltd., Shoumen, Bulgaria
We propose a morphologically informed model for named entity recognition, which is based on LSTM-CRF architecture and combines word embeddings, Bi-LSTM character embeddings, part-of-speech (POS) tags, and morphological information. While previous work has focused on learning from raw word input, using word and character embeddings only, we show that for morphologically rich languages, such as Bulgarian, access to POS information contributes more to the performance gains than the detailed morphological information. Thus, we show that named entity recognition needs only coarse-grained POS tags, but at the same time it can benefit from simultaneously using some POS information of different granularity. Our evaluation results over a standard dataset show sizable improvements over the state-of-the-art for Bulgarian NER. Comment: named entity recognition; Bulgarian NER; morphology; morpho-syntax
Average popularityAverage popularity In bottom 99%Average influencePopularity: Citation-based measure reflecting the current impact.Average influence In bottom 99%Influence: Citation-based measure reflecting the total impact.add Add to ORCIDPlease grant OpenAIRE to access and update your ORCID works.This Research product is the result of merged Research products in OpenAIRE.
You have already added works in your ORCID record related to the merged Research product. - Publication . Article . 2014Open Access EnglishAuthors:Bridgette Wessels; Rachel Finn; Peter Linde; Paolo Mazzetti; Stefano Nativi; Susan Riley; Rod Smallwood; Mark J. Taylor; Victoria Tsoukala; Kush Wadhwa; +1 moreBridgette Wessels; Rachel Finn; Peter Linde; Paolo Mazzetti; Stefano Nativi; Susan Riley; Rod Smallwood; Mark J. Taylor; Victoria Tsoukala; Kush Wadhwa; Sally Wyatt;Countries: Italy, Netherlands, SwedenProject: EC | OPENAIRE (246686), EC | DRIVER II (212147), EC | APARSEN (269977), EC | RECODE (321463), EC | ETTIS (285593)
This paper explores key issues in the development of open access to research data. The use of digital means for developing, storing and manipulating data is creating a focus on ‘data-driven science’. One aspect of this focus is the development of ‘open access’ to research data. Open access to research data refers to the way in which various types of data are openly available to public and private stakeholders, user communities and citizens. Open access to research data, however, involves more than simply providing easier and wider access to data for potential user groups. The development of open access requires attention to the ways data are considered in different areas of research. We identify how open access is being unevenly developed across the research environment and the consequences this has in terms of generating data gaps. Data gaps refer to the way data becomes detached from published conclusions. To address these issues, we examine four main areas in developing open access to research data: stakeholder roles and values; technological requirements for managing and sharing data; legal and ethical regulations and procedures; institutional roles and policy frameworks. We conclude that problems of variability and consistency across the open access ecosystem need to be addressed within and between these areas to ensure that risks surrounding a data gap are managed in open access. 11 authors. Missing: Sally Wyatt
- Publication . Article . 2020 . Open Access
Authors: Riccardo Pozzo; Andrea Filippetti; Mario Paolucci; Vania Virgili
Publisher: Oxford University Press (OUP)
Country: Italy
This article introduces the notion of cultural innovation, which requires adapting our approach to co-creation. The argument opens with a first conceptualization of cultural innovation as an additional and autonomous category of the complex processes of co-creation. The dimensions of cultural innovation are contrasted with other forms of innovation. In a second step, the article makes an unprecedented attempt at describing processes and outcomes of cultural innovation, while showing their operationalization in some empirical case studies. In the conclusion, the article considers policy implications resulting from the novel definition of cultural innovation as the outcome of complex processes that involve the reflection of knowledge flows across the social environment within communities of practice while fostering the inclusion of diversity in society. First and foremost, cultural innovation takes a critical stance against inequalities in the distribution of knowledge and builds innovation for improving the welfare of individuals and communities.
- Publication . 2020 . English
Authors: Kristanti, Tanti; Romary, Laurent
Publisher: HAL CCSD
Country: France
This article presents an overview of our approaches and results in the CLEF HIPE 2020 NERC-COARSE-LIT and EL-ONLY tasks for English and French. For these two tasks, we use two systems: 1) DeLFT, a deep learning framework for text processing; 2) entity-fishing, a generic named-entity recognition and disambiguation service deployed in the technical framework of INRIA.
- Publication . Article . 2020 . Open Access . English
Authors: Luca Foppiano; Laurent Romary
Publisher: HAL CCSD
Country: France
Project: EC | HIRMEOS (731102)
This paper presents an attempt to provide a generic named-entity recognition and disambiguation module (NERD) called entity-fishing as a stable online service that demonstrates the possible delivery of sustainable technical services within DARIAH, the European digital research infrastructure for the arts and humanities. Deployed as part of the national infrastructure Huma-Num in France, this service provides an efficient state-of-the-art implementation coupled with standardised interfaces allowing easy deployment in a variety of potential digital humanities contexts. The topics of accessibility and sustainability have long been discussed in the attempt to provide some best practices in the widely fragmented ecosystem of the DARIAH research infrastructure. The history of entity-fishing has been mentioned as an example of good practice: initially developed in the context of the FP7 CENDARI project, it was well received by the user community and continued to be further developed within the H2020 HIRMEOS project, where several open access publishers have integrated the service into their collections of published monographs as a means to enhance retrieval and access. entity-fishing implements entity extraction as well as disambiguation against Wikipedia and Wikidata entries. The service is accessible through a REST API, which allows easy and seamless integration through a language-independent, stable convention and a widely used service-oriented architecture (SOA) design. Input and output data are carried by a query data model with a defined structure, providing flexibility to support the processing of partially annotated text or the distribution of text across several queries. The interface implements a variety of functionalities, like language recognition, sentence segmentation and modules for accessing and looking up concepts in the knowledge base.
The API itself integrates more advanced contextual parametrisation or ranked outputs, allowing for resilient integration in various possible use cases. The entity-fishing API has been used as a concrete use case to draft the experimental stand-off proposal, which has been submitted for integration into the TEI guidelines. The representation is also compliant with the Web Annotation Data Model (WADM). In this paper we aim at describing the functionalities of the service as a reference contribution to the subject of web-based NERD services. In order to cover all aspects, the architecture is structured to provide two complementary viewpoints. First, we discuss the system from the data angle, detailing the workflow from input to output and unpacking each building block in the processing flow. Secondly, with a more academic approach, we provide a transversal schema of the different components taking into account non-functional requirements in order to facilitate the discovery of bottlenecks, hotspots and weaknesses. The attempt here is to give a description of the tool and, at the same time, a technical software engineering analysis which will help the reader to understand our choice for the resources allocated in the infrastructure. Thanks to the work of millions of volunteers, Wikipedia has today reached a stability and completeness that leave no usable alternatives on the market (considering also the licence aspect). The launch of Wikidata in 2012 has completed the picture with a complementary language-independent meta-model which is becoming the scientific reference for many disciplines. After providing an introduction to Wikipedia and Wikidata, we describe the knowledge base: the data organisation, the entity-fishing process to exploit it and the way it is built from nightly dumps using an offline process. We conclude the paper by presenting our solution for the service deployment: how the resources were allocated, and which ones.
The service has been in production since Q3 of 2017, and has been extensively used by the H2020 HIRMEOS partners during the integration with the publishing platforms. We have strived to provide the best performance with the minimum amount of resources. Thanks to the Huma-Num infrastructure we still have the possibility to scale up the infrastructure as needed, for example to support an increase in demand or temporary needs to process a huge backlog of documents. In the long term, thanks to this sustainable environment, we are planning to keep delivering the service far beyond the end of the H2020 HIRMEOS project.
- Publication . Article . Conference object . 2016 . Open Access . English
Authors: Marcin Płóciennik; Sandro Fiore; Giacinto Donvito; M. Owsiak; Marco Fargetta; Roberto Barbera; Riccardo Bruno; Emidio Giorgio; Dean N. Williams; Giovanni Aloisio
handle: 11587/403865
Publisher: Elsevier
Country: Italy
Project: EC | INDIGO-DataCloud (653549)
In this paper we present the approach proposed by the EU H2020 INDIGO-DataCloud project to orchestrate dynamic workflows over a cloud environment. The main focus of the project is on the development of open source Platform as a Service (PaaS) solutions targeted at scientific communities, deployable on multiple hardware platforms, and provisioned over hybrid e-Infrastructures. The project is addressing many challenging gaps in current cloud solutions, responding to specific requirements coming from scientific communities including Life Sciences, Physical Sciences and Astronomy, Social Sciences and Humanities, and Environmental Sciences. We present the ongoing work on implementing the whole software chain on the Infrastructure as a Service, PaaS and Software as a Service layers, focusing on the scenarios involving scientific workflows and big data analytics frameworks. An INDIGO module for the Kepler workflow system is introduced, along with the underlying INDIGO services exploited by the workflow components. A climate change data analytics use case, a precipitation trend analysis on CMIP5 data that makes use of Kepler and big data analytics services, is described.
- Publication . Article . 2013 . Open Access . English
Authors: Mark Hedges; Heike Neuroth; Kathleen M. Smith; Tobias Blanke; Laurent Romary; Marc Wilhelm Küster; Malcolm Illingworth
doi: 10.4000/jtei.774
Publisher: TEI Consortium
Country: France
In recent years, a variety of initiatives have been funded with the aim of producing software tools or environments of a type variously known as virtual research environments, research infrastructures, or cyberinfrastructures. These initiatives vary in their scale, specialization, scope, and level of funding. One issue that they face in common, however, is that of sustainability: how can the continued, and useful, existence of a system or tool be guaranteed, or at least facilitated, once a project's funding has been spent? In this paper, we examine how such sustainability has been enabled, in the particular case of infrastructures for textual scholarship, in the context of three international projects: TextGrid, TEXTvre, and DARIAH. Firstly, we will address the inter-project collaboration and cross-fertilization between TextGrid and TEXTvre, including architectural decisions and shared data infrastructures, and investigate how the projects benefited from the exchange. We will then discuss how this existing collaboration can be taken forward by the loosely coupled and distributed framework being developed by the DARIAH community, and how it can serve as a model for the sort of collaborations that DARIAH plans to enable.
- Publication . Article . 2010 . Open Access . English
Authors: Benoit Habert; Claude Huc
Publisher: HAL CCSD
Country: France
To help understand the possible interactions between transmission and digitization, a pilot project for long-term digital archiving is presented by its two coordinators. The article outlines the current context in which past and present research in the humanities and social sciences (SHS) is transmitted in digital form. It highlights the gap between the growing role of digital data and their fragility. It presents the standard abstract model of long-term digital archiving and the way it was instantiated in the pilot project. It concludes with a reflective look at the factors that will shape the future of similar projects: organizational choices and behaviours, the respective roles of data and knowledge, the formation and behaviour of user communities, and the status of collective memory in SHS.