publication . Conference object . 2018

Enhancing Usability for Automatically Structuring Digitised Dictionaries

Khemakhem, Mohamed; Herold, Axel; Romary, Laurent
English
  • Published: 08 May 2018
  • Publisher: HAL CCSD
  • Country: France
Abstract
The last decade has seen rapid growth in the number of NLP tools made available to the community. The usability of several e-lexicography tools nevertheless represents a serious obstacle for researchers with little or no background in computer science. In this paper we present our efforts to overcome this issue for a machine learning system for the automatic segmentation and semantic annotation of digitised dictionaries. Our approach is based on reducing the burden of managing the tool's setup in different execution environments and on lowering the complexity of the training process. We show that this goal can be reached by adapting existing functionalities and by using out-of-the-box software deployment technology. We also report on the community's feedback after exposing the new setup to real users from different professional backgrounds.
Subjects
free text keywords: Docker, TEI, Usability, Electronic lexicography, Digitised dictionaries, [INFO.INFO-TT]Computer Science [cs]/Document and Text Processing, [INFO.INFO-AI]Computer Science [cs]/Artificial Intelligence [cs.AI], [SHS.LANGUE]Humanities and Social Sciences/Linguistics, [STAT.ML]Statistics [stat]/Machine Learning [stat.ML], [INFO.INFO-HC]Computer Science [cs]/Human-Computer Interaction [cs.HC], [INFO.INFO-CL]Computer Science [cs]/Computation and Language [cs.CL]

