Vectors derived from patterns of word co-occurrence in large bodies of text have often been used to represent some aspects of word meaning. Generally, the distance between such vectors is used as a measure of the semantic similarity between the word meanings they represent. One important way of evaluating the performance of these vectors has been to use them to answer vocabulary multiple-choice questions (MCQs), in which the participant is asked to judge which of several choice words is closest in meaning to a stem word. The existing vocabulary MCQ tests used in this way have been very useful, but their use as general evaluation measures raises some practical problems. Here, we discuss why such tests remain useful evaluation measures, introduce a new vocabulary test, evaluate several current sets of semantic vectors using the new test, and compare their performance to human data.
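The evaluation procedure described in the abstract can be illustrated with a minimal sketch. It assumes cosine similarity as the closeness measure, which the abstract does not specify; the function names (`cosine_similarity`, `answer_mcq`, `accuracy`), the three-dimensional toy vectors, and the two questions are hypothetical and are not taken from the paper or its test.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors (assumed measure)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # treat a zero vector as similar to nothing
    return dot / (norm_u * norm_v)


def answer_mcq(stem, choices, vectors):
    """Pick the choice word whose vector is most similar to the stem's vector."""
    return max(choices, key=lambda w: cosine_similarity(vectors[stem], vectors[w]))


def accuracy(questions, vectors):
    """Proportion of questions where the selected choice matches the keyed answer."""
    correct = sum(
        1 for stem, choices, key in questions
        if answer_mcq(stem, choices, vectors) == key
    )
    return correct / len(questions)


# Hypothetical toy vectors, invented for illustration only.
toy_vectors = {
    "large": [0.9, 0.1, 0.0],
    "big":   [0.8, 0.2, 0.1],
    "tiny":  [-0.8, 0.1, 0.0],
    "quick": [0.1, 0.9, 0.0],
    "fast":  [0.2, 0.8, 0.1],
    "green": [0.0, 0.1, 0.9],
}

# Each question: (stem word, choice words, keyed correct answer).
toy_questions = [
    ("large", ["green", "big", "fast", "tiny"], "big"),
    ("quick", ["big", "tiny", "fast", "green"], "fast"),
]

score = accuracy(toy_questions, toy_vectors)
```

With real semantic vectors, the same accuracy score could be compared against the proportion of questions that human participants answer correctly.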
|Title of host publication|Proceedings of the Annual Conference of the Cognitive Science Society|
|Subtitle of host publication|CogSci 2017 London: “Computational Foundations of Cognition”|
|Publisher|Cognitive Science Society|
|Number of pages|6|
|Publication status|Accepted/In press - 11 Apr 2017|
- Distributional semantics; vocabulary MCQ.