publication . Conference object . 2018

Enhancing Usability for Automatically Structuring Digitised Dictionaries

Khemakhem, Mohamed; Herold, Axel; Romary, Laurent
English
  • Published: 08 May 2018
  • Publisher: HAL CCSD
  • Country: France
Abstract
The last decade has seen rapid growth in the number of NLP tools made available to the community. However, the limited usability of several e-lexicography tools remains a serious obstacle for researchers with little or no background in computer science. In this paper we present our efforts to overcome this issue for a machine learning system that automatically segments and semantically annotates digitised dictionaries. Our approach reduces the burden of managing the tool's setup across different execution environments and simplifies the training process. We illustrate the possibility to r...
Subjects
free text keywords: Docker, TEI, Usability, Electronic lexicography, Digitised dictionaries, [INFO.INFO-TT]Computer Science [cs]/Document and Text Processing, [INFO.INFO-AI]Computer Science [cs]/Artificial Intelligence [cs.AI], [SHS.LANGUE]Humanities and Social Sciences/Linguistics, [STAT.ML]Statistics [stat]/Machine Learning [stat.ML], [INFO.INFO-HC]Computer Science [cs]/Human-Computer Interaction [cs.HC], [INFO.INFO-CL]Computer Science [cs]/Computation and Language [cs.CL]
Communities
Other Communities
  • DARIAH EU
Funded by
EC | PARTHENOS
Project
PARTHENOS
Pooling Activities, Resources and Tools for Heritage E-research Networking, Optimization and Synergies
  • Funder: European Commission (EC)
  • Project Code: 654119
  • Funding stream: H2020 | RIA
