You have already added 0 works in your ORCID record related to the merged Research product.
You have already added 0 works in your ORCID record related to the merged Research product.
<script type="text/javascript">
<!--
document.write('<div id="oa_widget"></div>');
document.write('<script type="text/javascript" src="https://www.openaire.eu/index.php?option=com_openaire&view=widget&format=raw&projectId=undefined&type=result"></script>');
-->
</script>
Enabling complex analysis of large-scale digital collections: Humanities research, high-performance computing, and transforming access to British Library digital collections
doi: 10.1093/llc/fqx020
Although there has been a drive in the cultural heritage sector to provide large-scale, open data sets for researchers, we have not seen a commensurate rise in humanities researchers undertaking complex analysis of these data sets for their own research purposes. This article reports on a pilot project at University College London, working in collaboration with the British Library, to scope out how best high-performance computing facilities can be used to facilitate the needs of researchers in the humanities. Using institutional data-processing frameworks routinely used to support scientific research, we assisted four humanities researchers in analysing 60,000 digitized books, and we present two resulting case studies here. This research allowed us to identify infrastructural and procedural barriers and make recommendations on resource allocation to best support non-computational researchers in undertaking ‘big data’ research. We recommend that research software engineer capacity can be most efficiently deployed in maintaining and supporting data sets, while librarians can provide an essential service in running initial, routine queries for humanities scholars. At present there are too many technical hurdles for most individuals in the humanities to consider analysing at scale these increasingly available open data sets, and by building on existing frameworks of support from research computing and library services, we can best support humanities scholars in developing methods and approaches to take advantage of these research opportunities.
- London Library United Kingdom
- University of Sussex United Kingdom
- University of Sheffield United Kingdom
- University College London United Kingdom
- British Library United Kingdom
Microsoft Academic Graph classification: Service (systems architecture) Scope (project management) business.industry Computer science Big data Data science Cultural heritage World Wide Web Open data Digital humanities Scale (social sciences) Resource allocation (computer) business Humanities
Linguistics and Language, Language and Linguistics, Computer Science Applications, Information Systems
Linguistics and Language, Language and Linguistics, Computer Science Applications, Information Systems
Microsoft Academic Graph classification: Service (systems architecture) Scope (project management) business.industry Computer science Big data Data science Cultural heritage World Wide Web Open data Digital humanities Scale (social sciences) Resource allocation (computer) business Humanities
citations This is an alternative to the "Influence" indicator, which also reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).17 popularity This indicator reflects the "current" impact/attention (the "hype") of an article in the research community at large, based on the underlying citation network.Top 10% influence This indicator reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).Top 10% impulse This indicator reflects the initial momentum of an article directly after its publication, based on the underlying citation network.Top 10% citations This is an alternative to the "Influence" indicator, which also reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).17 popularity This indicator reflects the "current" impact/attention (the "hype") of an article in the research community at large, based on the underlying citation network.Top 10% influence This indicator reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).Top 10% impulse This indicator reflects the initial momentum of an article directly after its publication, based on the underlying citation network.Top 10% Powered byBIP!