research data . Dataset . 2006

ACE 2005 Multilingual Training Corpus

Walker, Christopher; Strassel, Stephanie; Medero, Julie; Maeda, Kazuaki
  • Published: 15 Feb 2006
  • Publisher: Linguistic Data Consortium
Abstract
Introduction

ACE 2005 Multilingual Training Corpus was developed by the Linguistic Data Consortium (LDC) and contains approximately 1,800 files of mixed-genre text in English, Arabic, and Chinese annotated for entities, relations, and events. This represents the complete set of training data in those languages for the 2005 Automatic Content Extraction (ACE) technology evaluation. The genres include newswire, broadcast news, broadcast conversation, weblog, discussion forums, and conversational telephone speech. The data was annotated by LDC with support from the ACE Program and additional assistance from LDC.

The objective of the ACE pro...