The European Holocaust Research Infrastructure (EHRI) started in October 2010 to build on a network that connects both people (Holocaust researchers, archivists, curators, librarians and digital humanists) and dispersed Holocaust source material and collections. EHRI’s aim is making sources visible in a systematic way in order to counteract the fragmentation of the sources and to reveal interconnections. EHRI focuses on Archive and collection descriptions, which are available through the EHRI Portal. EHRI is currently in its second phase and is on the ESFRI Roadmap2 for a more sustainable future. EHRI has developed a set of controlled vocabularies that serves both as a retrieval and cataloguing tool for the multilingual and highly heterogeneous data of the EHRI portal. These vocabularies were partly implemented in the first phase of the project. In the current phase of EHRI the vocabularies are in the process of quality improvement improve and enrich the existing terms, add new terms, disambiguate and remove the mistakes (deduplication, merging, adding multilingual labels, consistency checks, multiple parent relations, etc.) and increase their coverage. In the EHRI portal the subject terms are currently not available for the public, as they are used only for retrieval purposes.
AbstractThe paper presents Intergraph, a graph-based visual analytics technical demonstrator for the exploration and study of content in historical document collections. The designed prototype is motivated by a practical use case on a corpus of circa 15.000 digitized resources about European integration since 1945. The corpus allowed generating a dynamic multilayer network which represents different kinds of named entities appearing and co-appearing in the collections. To our knowledge, Intergraph is one of the first interactive tools to visualize dynamic multilayer graphs for collections of digitized historical sources. Graph visualization and interaction methods have been designed based on user requirements for content exploration by non-technical users without a strong background in network science, and to compensate for common flaws with the annotation of named entities. Users work with self-selected subsets of the overall data by interacting with a scene of small graphs which can be added, altered and compared. This allows an interest-driven navigation in the corpus and the discovery of the interconnections of its entities across time.
Van Der Eycken, Johan; Styven, Dorien; Gheldof, Tom; Depoortere, Rolande;
Van Der Eycken, Johan; Styven, Dorien; Gheldof, Tom; Depoortere, Rolande;
Publisher: HAL CCSD
Countries: France, Belgium
This article shows that metadata plays a central role in our society and concludes that through collaborative work, it is possible to pool solutions and to establish relationships of cooperation, both at the level of practical tool development and with regard to sharing and creating knowledge and know-how. ispartof: ABB: Archives et Bibliothèques de Belgique - Archief- en Bibliotheekwezen in België vol:106 pages:135-144 status: published
More and more cultural institutions use Linked Data principles to share and connect their collection metadata. In the archival field, initiatives emerge to exploit data contained in archival descriptions and adapt encoding standards to the semantic web. In this context, online authority files can be used to enrich metadata. However, relying on a decentralized network of knowledge bases such as Wikidata, DBpedia or even Viaf has its own difficulties. This paper aims to offer a critical view of these linked authority files by adopting a close-reading approach. Through a practical case study, we intend to identify and illustrate the possibilities and limits of RDF triples compared to institutions' less structured metadata. Comment: Workshop "Dariah "Trust and Understanding: the value of metadata in a digitally joined-up world" (14/05/2018, Brussels), preprint of the submission to the journal "Archives et Biblioth\`eques de Belgique"
Reinhard Altenhöner; Ina Blümel; Franziska Boehm; Jens Bove; Katrin Bicher; Christian Bracht; Ortrun Brand; Lisa Dieckmann; Maria Effinger; Malte Hagener; +15 more
Reinhard Altenhöner; Ina Blümel; Franziska Boehm; Jens Bove; Katrin Bicher; Christian Bracht; Ortrun Brand; Lisa Dieckmann; Maria Effinger; Malte Hagener; Andrea Hammes; Lambert Heller; Angela Kailus; Hubertus Kohle; Jens Ludwig; Andreas Münzmay; Sarah Pittroff; Matthias Razum; Daniel Röwenstrunk; Harald Sack; Holger Simon; Dörte Schmidt; Torsten Schrade; Annika-Valeska Walzel; Barbara Wiermann;
Digital data on tangible and intangible cultural assets is an essential part of daily life, communication and experience. It has a lasting influence on the perception of cultural identity as well as on the interactions between research, the cultural economy and society. Throughout the last three decades, many cultural heritage institutions have contributed a wealth of digital representations of cultural assets (2D digital reproductions of paintings, sheet music, 3D digital models of sculptures, monuments, rooms, buildings), audio-visual data (music, film, stage performances), and procedural research data such as encoding and annotation formats. The long-term preservation and FAIR availability of research data from the cultural heritage domain is fundamentally important, not only for future academic success in the humanities but also for the cultural identity of individuals and society as a whole. Up to now, no coordinated effort for professional research data management on a national level exists in Germany. NFDI4Culture aims to fill this gap and create a user-centered, research-driven infrastructure that will cover a broad range of research domains from musicology, art history and architecture to performance, theatre, film, and media studies. The research landscape addressed by the consortium is characterized by strong institutional differentiation. Research units in the consortium's community of interest comprise university institutes, art colleges, academies, galleries, libraries, archives and museums. This diverse landscape is also characterized by an abundance of research objects, methodologies and a great potential for data-driven research. In a unique effort carried out by the applicant and co-applicants of this proposal and ten academic societies, this community is interconnected for the first time through a federated approach that is ideally suited to the needs of the participating researchers. To promote collaboration within the NFDI, to share knowledge and technology and to provide extensive support for its users have been the guiding principles of the consortium from the beginning and will be at the heart of all workflows and decision-making processes. Thanks to these principles, NFDI4Culture has gathered strong support ranging from individual researchers to high-level cultural heritage organizations such as the UNESCO, the International Council of Museums, the Open Knowledge Foundation and Wikimedia. On this basis, NFDI4Culture will take innovative measures that promote a cultural change towards a more reflective and sustainable handling of research data and at the same time boost qualification and professionalization in data-driven research in the domain of cultural heritage. This will create a long-lasting impact on science, cultural economy and society as a whole.
International audience; The National Library Ivan Vazov in Plovdiv is the second largest library in Bulgaria. It serves asthe second national legal depository of Bulgarian printed works. In addition, it has contributedsignificantly to the preservation and the digital accessibility of the national cultural andhistorical heritage. This article offers an overview of the library’s history and currentdevelopments in the field of automation and digitization.
We propose a morphologically informed model for named entity recognition, which is based on LSTM-CRF architecture and combines word embeddings, Bi-LSTM character embeddings, part-of-speech (POS) tags, and morphological information. While previous work has focused on learning from raw word input, using word and character embeddings only, we show that for morphologically rich languages, such as Bulgarian, access to POS information contributes more to the performance gains than the detailed morphological information. Thus, we show that named entity recognition needs only coarse-grained POS tags, but at the same time it can benefit from simultaneously using some POS information of different granularity. Our evaluation results over a standard dataset show sizable improvements over the state-of-the-art for Bulgarian NER. named entity recognition; Bulgarian NER; morphology; morpho-syntax