publication . Article . 2016

REDEN: Named Entity Linking in Digital Literary Editions Using Linked Data Sets

Carmen Brando; Francesca Frontini; Jean-Gabriel Ganascia
Open Access English
  • Published: 29 Jul 2016
  • Publisher: HAL CCSD
  • Country: France
Abstract
This paper proposes a graph-based Named Entity Linking (NEL) algorithm named REDEN for the disambiguation of authors' names in French literary criticism texts and scientific essays from the 19th and early 20th centuries. The algorithm is described and evaluated according to the two phases of NEL reported in the current state of the art, namely candidate retrieval and candidate selection. REDEN leverages knowledge from different Linked Data sources to select candidates for each author mention, subsequently crawls data from other Linked Data sets using equivalence links (e.g., owl:sameAs), and, finally, fuses graphs of homologous i...
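The pipeline outlined in the abstract (retrieve candidates, follow equivalence links, fuse the graphs, then pick the best-connected candidate) can be illustrated with a minimal Python sketch. This is not REDEN's implementation: the function names, the `a:`/`b:` URIs, the toy triples, and the scoring rule (a simple degree-style count of nodes shared with candidates of other mentions) are all assumptions made for illustration, and the centrality measure actually used by REDEN may differ.

```python
from collections import defaultdict


def fuse_equivalents(triples, same_as):
    """Rewrite every URI to one representative of its owl:sameAs cluster."""
    parent = {}

    def find(u):
        # Union-find lookup with path halving.
        parent.setdefault(u, u)
        while parent[u] != u:
            parent[u] = parent[parent[u]]
            u = parent[u]
        return u

    for a, b in same_as:
        ra, rb = find(a), find(b)
        if ra != rb:
            # Arbitrary choice: the lexicographically smaller URI represents the cluster.
            lo, hi = sorted((ra, rb))
            parent[hi] = lo
    fused = {(find(s), p, find(o)) for s, p, o in triples}
    return fused, find


def select_candidates(candidates, triples, same_as):
    """For each mention, pick the candidate sharing most nodes with other mentions."""
    fused, find = fuse_equivalents(triples, same_as)
    neighbours = defaultdict(set)
    for s, _p, o in fused:
        neighbours[s].add(o)

    # Record which mentions can reach each object node through any candidate.
    mentions_by_node = defaultdict(set)
    for mention, uris in candidates.items():
        for uri in uris:
            for node in neighbours[find(uri)]:
                mentions_by_node[node].add(mention)

    chosen = {}
    for mention, uris in candidates.items():
        def score(uri):
            # Count links to nodes that candidates of *other* mentions also reach.
            return sum(len(mentions_by_node[n] - {mention})
                       for n in neighbours[find(uri)])
        # Ties are resolved in favour of the first candidate listed.
        chosen[mention] = max(uris, key=score)
    return chosen


# Hypothetical toy data: two mentions, two candidates each, two data sets (a:, b:).
candidates = {
    "Hugo": ["a:Hugo_Grotius", "a:Victor_Hugo"],
    "Balzac": ["a:Honore_de_Balzac", "a:Guez_de_Balzac"],
}
triples = [
    ("a:Victor_Hugo", "movement", "Romanticism"),
    ("b:VictorHugo", "occupation", "novelist"),
    ("a:Honore_de_Balzac", "occupation", "novelist"),
    ("a:Hugo_Grotius", "occupation", "jurist"),
    ("a:Guez_de_Balzac", "occupation", "essayist"),
]
same_as = [("a:Victor_Hugo", "b:VictorHugo")]

result = select_candidates(candidates, triples, same_as)
print(result)
```

In this toy data the fact that links the two novelists is stored only in the second data set, so the fusion step is what lets the "Hugo" mention resolve to `a:Victor_Hugo`; without the `same_as` pair the scores tie and the first listed candidate is returned.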
Subjects
free text keywords: digital humanities, Named Entity Linking, graph centrality, linked data, data fusion, Computer Science / Artificial Intelligence [cs.AI], Computer Science / Computation and Language [cs.CL], Computer Science / Document and Text Processing, Computer Science / Data Structures and Algorithms [cs.DS], lcsh:Information technology, lcsh:T58.5-58.64, Equivalence (measure theory), RDF, Referent, Literary criticism, Workflow, Information retrieval, Computer science, Semantic Web, Cultural heritage
Funded by
ANR| SUPER
Project
SUPER
Sorbonne Universités à Paris pour l'Enseignement et la Recherche
  • Funder: French National Research Agency (ANR)
  • Project Code: ANR-11-IDEX-0004
Communities
Digital Humanities and Cultural Heritage

https://github.com/cvbrandoe/REDEN

See http://obvil-dev.paris-sorbonne.fr/reden/RedenOnline/site/input-tei.html for beta version.