QALD-3 » Task2
September 2013 ⋅ Co-located with: CLEF 2013  

Task 2: Ontology lexicalization

Multilingual information access can be facilitated by the availability of lexica in different languages, for example allowing for an easy mapping of Spanish, German, and French natural language expressions to English ontology labels.

Task

The task is to find English lexicalizations of a set of classes and properties from the DBpedia ontology in a Wikipedia corpus. Submitted lexicalizations are expected to follow the lemon ontology lexicon format.

Full description: qald3_openchallenge.pdf (Last updated: March 25, 2013)

Training data

The training data consists of a set of 10 classes and 30 properties from the DBpedia ontology, together with a lemon lexicon containing lexicalizations of those classes and properties. A suitable corpus for finding lexicalizations is Wikipedia: you can either download one of its data dumps or directly download an already cleaned-up part of the English Wikipedia (1.54 GB).

Test data

The test data consists of a similar set of 10 additional classes and 30 additional properties from the DBpedia ontology, for which lexicalizations have to be found.

Evaluation

Submitted lexica will be evaluated with respect to the reference data along three main criteria:

  • lexical precision (How many of the lexical entries in the submitted lexicon are also in the gold standard lexicon?)
  • lexical recall (How many of the lexical entries in the gold standard lexicon are also in the submitted lexicon?)
  • lexical accuracy (checking the correctness of the frames and argument mappings for each lexical entry in the submitted lexicon with respect to the gold standard lexicon)
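As a rough illustration of the first two criteria, the following Python sketch computes lexical precision and recall as set overlaps. It is not the official evaluation script. The `Entry` type, the rule that two entries match when both their written form and their ontology reference are identical, and the sample entries are all assumptions made for this example; the actual evaluation may compare lemon entries differently.

```python
from typing import NamedTuple, Set


class Entry(NamedTuple):
    """Hypothetical, simplified lexical entry (not the full lemon model)."""
    written_form: str  # e.g. "wife"
    reference: str     # ontology element the entry verbalizes, e.g. "dbo:spouse"


def lexical_precision(submitted: Set[Entry], gold: Set[Entry]) -> float:
    """Share of submitted entries that also occur in the gold standard lexicon."""
    if not submitted:
        return 0.0  # assumption: an empty submission scores 0
    return len(submitted & gold) / len(submitted)


def lexical_recall(submitted: Set[Entry], gold: Set[Entry]) -> float:
    """Share of gold standard entries that also occur in the submitted lexicon."""
    if not gold:
        return 0.0  # assumption: an empty gold standard scores 0
    return len(submitted & gold) / len(gold)


# Illustrative data only; these entries are not taken from the task files.
gold = {
    Entry("spouse", "dbo:spouse"),
    Entry("wife", "dbo:spouse"),
    Entry("husband", "dbo:spouse"),
    Entry("city", "dbo:City"),
}
submitted = {
    Entry("spouse", "dbo:spouse"),
    Entry("wife", "dbo:spouse"),
    Entry("partner", "dbo:spouse"),
}

precision = lexical_precision(submitted, gold)  # 2 of 3 submitted entries are in gold
recall = lexical_recall(submitted, gold)        # 2 of 4 gold entries were found
print(f"precision={precision:.2f} recall={recall:.2f}")
```

Lexical accuracy is left out of the sketch because it depends on how frames and argument mappings are represented in the lexicon, which this simplified entry type does not model.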

For both the training and test phases, results can be uploaded using the online evaluation form.