From language models to distributional semantics
Time: Monday, February 04, 2013, 01:00pm - 02:00pm
Place: Memorial Hall 401
Distributional semantics represents what an expression means as a vector that summarizes the contexts where it occurs. This approach has successfully extracted semantic relations such as similarity and entailment from large corpora. However, it remains unclear how to take advantage of syntactic structure, pragmatic context, and multiple information sources to overcome data sparsity. These issues also confront language models used for statistical parsing, machine translation, and text compression.
Thus, we seek guidance by converting language models into distributional semantics. We propose to convert any probability distribution over expressions into a denotational semantics in which each phrase denotes a distribution over contexts. Exploratory data analysis led us to hypothesize that the more accurate the expression distribution is, the more accurate the distributional semantics tends to be. We tested this hypothesis on two expression distributions that can be estimated using a tiny corpus: a bag-of-words model, and a lexicalized probabilistic context-free grammar a la Collins.
|In category: Computational linguistics|
JEvents v3.0.9 Stable Copyright © 2006-2013