publication . Article . 2017

Enabling complex analysis of large-scale digital collections: humanities research, high-performance computing, and transforming access to British Library digital collections

Terras, Melissa; Baker, James; Hetherington, James; Beavan, David; Zaltz Austwick, Martin; Welsh, Anne; O'Neill, Helen; Finley, Will; Duke-Williams, Oliver; Farquhar, Adam
Open Access
  • Published: 02 May 2017
  • Publisher: Oxford University Press (OUP)
  • Country: United Kingdom
Abstract
Although there has been a drive in the cultural heritage sector to provide large-scale, open data sets for researchers, we have not seen a commensurate rise in humanities researchers undertaking complex analysis of these data sets for their own research purposes. This article reports on a pilot project at University College London, working in collaboration with the British Library, to scope out how best high-performance computing facilities can be used to facilitate the needs of researchers in the humanities. Using institutional data-processing frameworks routinely used to support scientific research, we assisted four humanities researchers in analysing 60,000 digitized books, and we present two resulting case studies here. This research allowed us to identify infrastructural and procedural barriers and make recommendations on resource allocation to best support non-computational researchers in undertaking ‘big data’ research. We recommend that research software engineer capacity can be most efficiently deployed in maintaining and supporting data sets, while librarians can provide an essential service in running initial, routine queries for humanities scholars. At present there are too many technical hurdles for most individuals in the humanities to consider analysing at scale these increasingly available open data sets, and by building on existing frameworks of support from research computing and library services, we can best support humanities scholars in developing methods and approaches to take advantage of these research opportunities.
Subjects
free text keywords: Computer Science Applications, Linguistics and Language, Information Systems, Data science, Scope (project management), Scale (social sciences), Computer science, Big data, World Wide Web, Service (systems architecture), Cultural heritage, Humanities, Digital humanities, Resource allocation (computer), Open data