publication . Preprint . 2018

Explorations in an English Poetry Corpus: A Neurocognitive Poetics Perspective

Jacobs, Arthur M.
Open Access English
  • Published: 06 Jan 2018
This paper describes a corpus of about 3000 English literary texts, totaling about 250 million words, extracted from Project Gutenberg. The texts span a range of fiction and non-fiction genres and were written by more than 130 authors (e.g., Darwin, Dickens, Shakespeare). Quantitative Narrative Analysis (QNA) is used to explore a cleaned subcorpus, the Gutenberg English Poetry Corpus (GEPC), which comprises over 100 poetic texts with around 2 million words from about 50 authors (e.g., Keats, Joyce, Wordsworth). Some exemplary QNA studies show author similarities based on latent semantic analysis, significant topics for each author or various text-analytic metrics for Ge...
free text keywords: Computer Science - Computation and Language
