The panel presents results and ongoing work from corpus projects in which TEI-P5 has been adopted for the representation and linguistic annotation of genres of social media and computer-mediated communication (CMC). It relates to the work of the TEI SIG "Computer-Mediated Communication", which is developing TEI models for the representation of CMC genres and testing these models on a broad range of genres (ranging from "text-only" genres such as chat and SMS to multimodal genres such as learning environments and Second Life) and in corpus-building initiatives for various European languages. The goal of the panel is to give an overview of models and practices for representing CMC in TEI, using German and French CMC corpora as examples. Documentation and ODD files of the schemas developed by the group will be made available in the TEI wiki and announced via the TEI mailing list before the conference, so that everybody who is interested in participating in the discussion can examine the CMC models in advance. The discussion in the panel shall serve as an opportunity for collecting feedback on these models and schema drafts from the broader TEI community interested in adapting TEI-P5 for the representation of new (digital) genres. This feedback will be taken into consideration when revising the models and, as a next step after the conference, preparing feature requests for adapting the TEI for CMC.
Publication (other literature type): part of a book or chapter of a book, 2011.
The social sciences and the humanities together comprise a heterogeneous range of research disciplines; almost all existing research methods can be found within these two domains. Data handling (collecting, processing, selecting, preserving) and publication methods differ greatly, and attitudes in the field towards Open Access, both to publications and to research data, vary as well. It is not possible to cover the full breadth and complexity of all the disciplines within these two domains. Our observations will therefore be based upon a number of case studies. Taken together, these case studies give a fairly representative picture of the domains, at least of the most common research environments. The main dividing line runs between those disciplines creating empirical data, such as survey data in the social sciences, and those, especially in the humanities, using existing source material, such as history or textual studies. This source material can be either analogue or digital in nature. As the case studies will show, many disciplines combine created and existing data.
The paper introduces a new Science Gateway developed in the framework of the European Horizon 2020 project EGI Engage - DARIAH Competence Centre, which started in March 2015, is co-funded by the European Union, and involves about 70 (research) units in over 30 countries. In this paper the authors focus on trans-disciplinary collaboration in the framework of explorative lexicography in a cultural context. On the one hand, they give a short overview of the architecture of the Science Gateway, the techniques used, and the specific applications and services developed within the DARIAH Competence Centre. On the other hand, they focus mainly on the possible added value and workflow changes for lexicographers and researchers working on lexical resources. This is exemplified by the European network of COST Action IS 1305, "European Network of electronic lexicography (ENeL)".
This paper is about data in the humanities. Most of my colleagues in literary and cultural studies would not necessarily speak of their objects of study as "data." If you ask them what it is they are studying, they would rather speak of books, paintings, and movies; of drama and crime fiction, of still lifes and action painting; of German expressionist movies and romantic comedy. They would mention Denis Diderot or Toni Morrison, Chardin or Jackson Pollock, Fritz Lang or Diane Keaton. Maybe they would talk about what they are studying as texts, images, and sounds. But rarely would they consider their objects of study to be "data." However, in the humanities, just as in other areas of research, we are increasingly dealing with "data." With digitization efforts in the private and public sectors going on around the world, more and more data relevant to our fields of study exists and, if licensed appropriately, is available for research. The digital humanities aim to rise to the challenge and realize the potential of this data for humanistic inquiry. As Christine Borgman has shown in her book Scholarship in the Digital Age, this is as much a theoretical, methodological, and social issue as it is a technical one. Indeed, the existence of all this data raises a host of questions, some of which I would like to address here. For example: What is the relation between the data we have and our objects of study? – Does data replace books, paintings, and movies? In what way can data be said to represent them? What difference does it make to analyze the digital representation or version of a novel or a painting instead of the printed book, the manuscript, or the original painting? What types of data are there in the humanities, and what difference does it make? – I will argue that one can distinguish two types of data: "big" data and "smart" data. What, then, does it mean to deal with big data, or smart data, in the humanities?
What new ways of dealing with data do we need to adopt in the humanities? – How are big data and smart data dealt with in the process of scholarly knowledge generation, that is, when data is created, enriched, analyzed, and interpreted?
This paper describes the achievements of the H2020 project INDIGO-DataCloud. The project has provided e-infrastructures with tools, applications, and cloud framework enhancements to manage the demanding requirements of scientific communities, either locally or through enhanced interfaces. The middleware developed makes it possible to federate hybrid resources and to easily write, port, and run scientific applications in the cloud. In particular, we have extended existing PaaS (Platform as a Service) solutions, allowing public and private e-infrastructures, including those provided by EGI, EUDAT, and Helix Nebula, to integrate their existing services and make them available through AAI services compliant with GEANT interfederation policies, thus guaranteeing transparency and trust in the provisioning of such services. Our middleware facilitates the execution of applications using containers on Cloud- and Grid-based infrastructures, as well as on HPC clusters. Our developments are freely downloadable as open-source components and are already being integrated into many scientific applications. 39 pages, 15 figures. Version accepted in the Journal of Grid Computing.
We investigated the evolution and transformation of scientific knowledge in the early modern period, analyzing more than 350 different editions of textbooks used for teaching astronomy in European universities from the late fifteenth to the mid-seventeenth century. These historical sources constitute the Sphaera Corpus. By examining different semantic relations among individual parts of each edition on record, we built a multiplex network consisting of six layers, as well as the aggregated network built from the superposition of all the layers. The network analysis reveals the emergence of five different communities. The contribution of each layer in shaping the communities and the properties of each community are studied. The most influential books in the corpus are found by calculating the average age of all the out-going and in-coming links for each book. A small group of editions is identified as transmitters of knowledge, as they bridge past knowledge to the future across a long temporal interval. Our analysis, moreover, identifies the most disruptive books: these introduce new knowledge that is then adopted by almost all the books published afterwards, until the end of the period of study. Historical research on the content of the identified books, as an empirical test, finally corroborates the results of all our analyses. 19 pages, 9 figures.
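The "average age of out-going and in-coming links" measure described above can be sketched in a few lines. The following is an illustrative reconstruction, not the Sphaera project's actual code; the book names, publication years, and links below are invented for demonstration, and an edge (src, tgt) is read as a later book src drawing on an earlier book tgt.

```python
# Illustrative sketch: average link age per node in a directed book network,
# where each node (book) carries a publication year. Invented toy data.
from collections import defaultdict

years = {"A": 1490, "B": 1510, "C": 1550, "D": 1600}
edges = [("B", "A"), ("C", "A"), ("C", "B"), ("D", "C")]  # later -> earlier

def average_link_ages(years, edges):
    """For each book, average the year gaps across its links.
    out-age: how far back this book reaches (its year minus its targets').
    in-age: how far forward it is transmitted (its sources' years minus its own)."""
    out_gaps, in_gaps = defaultdict(list), defaultdict(list)
    for src, tgt in edges:
        gap = years[src] - years[tgt]
        out_gaps[src].append(gap)
        in_gaps[tgt].append(gap)
    result = {}
    for book in years:
        out_age = sum(out_gaps[book]) / len(out_gaps[book]) if out_gaps[book] else 0.0
        in_age = sum(in_gaps[book]) / len(in_gaps[book]) if in_gaps[book] else 0.0
        result[book] = (out_age, in_age)
    return result

print(average_link_ages(years, edges))
```

In this toy example, a book with a large average in-coming link age (here "C", cited fifty years after publication) is the kind of edition the abstract describes as a transmitter, bridging past knowledge to much later works.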
Research infrastructures (RIs) include major scientific equipment, scientific collections, archives, structured information, and ICT-based infrastructures and services. They support top-level research and can be organized at the regional, national (EU Member State), European, and global levels. RIs have become a topic of interest and priority for funders, political bodies, and (increasingly) institutional decision-makers. In Europe, the European Commission funds RIs, complementing funding by EU Member States at the national level. Over the last ten years, hundreds of RI projects have been planned, and some received funding for the design, extension, and improvement of operations and services to scientific communities. The ESFRI roadmap for research infrastructures represents a financial volume of approximately 20 billion EUR over ten years to construct 44 RIs. From the perspective of realizing the objectives set for RIs, 2012 is an essential milestone, as the HORIZON 2020 programmes will be discussed at the European level, along with consultations with Member States. The following overview is by no means complete. It focuses on some RIs strongly shaped by the production and management of scientific information and relevant to the European political and funding agenda. RI projects cover a variety of typologies, ranging from hard, single-site facilities to distributed, soft facilities relying on networks. Typically they have emerged from discipline-specific or cross-disciplinary requirements. RIs produce, process, or manage heterogeneous volumes of information, both big and small. They are the so-called 'scientific data factories' of the 21st century. They comprise various types of information resources, such as publications, digitized collections, learning objects, and research data. Key issues on today's agenda for RIs are their uptake by researchers, and their viability, sustainability, and interoperability.
Research libraries' engagement with RIs has been low. While this may have been understandable in 2005, when the first priorities for RI investments were defined, it now represents a big gap in the European strategy. Key initiatives such as the ESFRI Research Infrastructures involve no participation by research libraries, except for DARIAH. Participation in EC-funded projects (through LIBER or directly through institutions) has focused, with a few exceptions, on the areas of digitization, cultural heritage, and publications. Research libraries need to become visible actors in strategic discussions on RIs and should actively explore their engagement in research data infrastructures. Open Access, open science (data), and research data infrastructures and management are the catalysts that can bring research libraries back into the awareness of researchers beyond the humanities and social sciences. 'Open Access is global — but implementation is local.' This popular slogan of the OpenAIRE project gives local research libraries an important role in the European context. Research data are discipline-specific, but policies, workflows, and standards also need to be implemented at the local level. Creating participatory infrastructures by involving institutional, national, and disciplinary actors has been identified by the EC as a key task for the current decade. The term 'participatory' is also considered to be of fundamental relevance for European policy strategy, as it matches well with national and European coordination for cost efficiency and is instrumental in avoiding duplication of work. The primary challenges to building a coherent, fundable, and sustainable ecosystem lie not in ICT technology but in governance, law, organization, socio-cultural aspects, trust, and, of course, costs. Peer reviewed.
Davide Salomoni; Isabel Campos; Luciano Gaido; J. Marco de Lucas; P. Solagna; Jorge Gomes; Luděk Matyska; P. Fuhrman; Marcus Hardt; Giacinto Donvito; Lukasz Dutka; Marcin Plociennik; Roberto Barbera; Ignacio Blanquer; Andrea Ceccanti; Eva Cetinic; Mario David; Cristina Duma; Álvaro López-García; Germán Moltó; Pablo Orviz; Zdeněk Šustr; M. Viljoen; Fernando Aguilar; L. Alves; Marica Antonacci; Louis Antonelli; S. Bagnasco; Alexandre M. J. J. Bonvin; Riccardo Bruno; Y. Chen; Alessandro Costa; Davor Davidović; B. Ertl; Marco Fargetta; Sandro Fiore; S. Gallozzi; Zeynep Kurkcuoglu; Lara Lloret; João Martins; Alessandra Nuzzo; Paola Nassisi; Cosimo Palazzo; João Murta Pina; Eva Sciacca; Daniele Spiga; Marco Antonio Tangaro; Michal Urbaniak; S. Vallero; Bas Wegh; Valentina Zaccolo; Federico Zambelli; Tomasz Zok;
INDIGO-DataCloud has been funded by the European Commission H2020 research and innovation programme under grant agreement RIA 653549. Peer reviewed.
Project: EC | FOSTER Plus (741839)
To foster responsible research and innovation, research communities, institutions, and funders are shifting their practices and requirements towards Open Science. Open Science skills are becoming increasingly essential for researchers. General awareness of Open Science has indeed grown among EU researchers, but practical adoption can still be improved. Recognizing a gap between the training needed and the training offered, the FOSTER project provides practical guidance and training to help researchers learn how to open up their research within a particular domain or research environment. Aiming for a sustainable approach, FOSTER focused on strengthening Open Science training capacity by establishing and supporting a community of trainers. The creation of an Open Science training handbook was a first step towards bringing trainers together to share their experiences and to create an open and living knowledge resource. A subsequent series of train-the-trainer bootcamps helped trainers find inspiration, improve their skills, and intensify exchange within a peer group. Four trainers who attended one of the bootcamps contributed a case study on their experiences and on how they rolled out Open Science training within their own institutions. On its platform, the project provides a range of online courses and resources for learning about key Open Science topics. FOSTER awards users gamification badges when they complete courses, in order to provide incentives and rewards and to spur them on to even greater achievements in learning. The paper at hand describes FOSTER Plus' training strategies, shares lessons learnt, and provides guidance on how to re-use the project's materials and training approaches. Peer reviewed.
In order to channel and align the efforts within the COREF project, the Registry of Research Data Repositories – re3data is revising its conceptual service model according to the most important use cases of the various stakeholders working with re3data. Adopting and reflecting current developments in the research data landscape, the update of the service architecture in COREF is based on a bottom-up approach that addresses the results from a stakeholder survey and a stakeholder workshop in November 2020. The findings from the survey and workshop sessions presented in this report informed the development of a Conceptual Model for User Stories, which embeds the registry within the research community and the infrastructure landscape to meet the emerging needs for a trusted repository reference.