Assignment 4: Learning Words

Due Sunday, 2006-09-24

Download these two Python modules. Notice that utils.py has been updated again with a few more functions on sequences and sets that you may want to use.

In this assignment you'll start with a long-term, case-based memory (LTM) of all of the utterances that a learner has heard. You'll take a new, incomplete test utterance, which is an attempt either to refer to something or to understand a reference, and use the LTM to complete it, filling in the form when referring and the referent when understanding.

Read what's at the top of world.py to make sure you understand how Individuals and Episodes (utterances) are represented and compared for this assignment. Utterances are a kind of Episode, and in fact all of the Episodes in the assignment will be utterances. Note that five Individuals are pre-defined. These people, who are the potential speakers and hearers of utterances, are the only Individuals in the LTM.

The section at the bottom of world.py is for artificially creating an LTM of utterances (instances of the Episode class) as if the learner (ego) had been exposed to them and remembered them. Using templates ("fake lexical entries") for each word, the procedure gen_from_templates adds instantiations of the templates to the LTM, each a separate utterance (Episode). This procedure is called in world.py, specifying 6 utterances for each word. (You can change this number if you want.) At the bottom of world.py, eight test Episodes are created, each an utterance with ego either as speaker or hearer. When ego is speaker (testeps 1-4), there is no form (no value for the 'sound' feature), and when ego is hearer (testeps 5-8), the 'referent' value is either missing or incomplete.

Write a function process that takes a test utterance and uses the LTM of previous utterances to complete it. For example,

# What to say when you refer to something sweet, white, grainy, etc.
>>> process(testep1)
{'sound': 'sugar', 'referent': {'color': 'white', 'taste': 'sweet', 'viscosity': 'runny', 'reflectivity': 'medrefl', 'texture': 'grainy'}, 'hearer': {'gender': 'female', 'age': 45, 'legs': 2, 'rellength': 6, 'intelligence': 'smart'}, 'speaker': {'intelligence': 'smart', 'legs': 2}}
# Here's the part that matters in all of that gunk:
>>> _['sound']
'sugar'
# How to interpret the word 'snake'
>>> process(testep7)
{'sound': 'snake', 'referent': {'intelligence': 'medintell', 'shape': 'cylinder', 'texture': 'smooth', 'flexibility': 'flexible', 'reflectivity': 'shiny'}, 'hearer': {'intelligence': 'smart', 'legs': 2}, 'speaker': {'gender': 'female', 'age': 45, 'legs': 2, 'rellength': 6, 'intelligence': 'smart'}}
# Aha, so this is what a 'snake' is like:
>>> _['referent']
{'intelligence': 'medintell', 'shape': 'cylinder', 'texture': 'smooth', 'flexibility': 'flexible', 'reflectivity': 'shiny'}

process works like this:

  1. It first finds all of the utterances in LTM that closely match the test utterance. You can either do this by finding all of those that are very similar to the test utterance or by finding those that are most similar to it. (In the latter case, you'll need to sort the utterances, or a copy of the utterances, by their similarity to the input test utterance.) I did it the second way; I haven't tested the first way. Constants for use both ways are provided at the top of world.py.
  2. Next it takes the resulting set of retrieved utterances and blends them by producing a single utterance (or dict) with the values that predominate for each feature in the retrieved utterances. If no value predominates for a feature, the resulting dict stores nothing for that feature. For example, if the retrieved utterances have referents with colors ['green', 'green', 'red', 'black', 'white'], then no value for 'color' would be recorded in the blended utterance.
  3. Finally it takes the blended utterance from the closely matching retrieved utterances and incorporates the feature values in it into the original test utterance. In this process, the test utterance has priority: if it already has a particular string, int, or Individual as the value of a feature, this feature's value in the test utterance is not replaced or updated with a value from the blended utterance.
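
The three steps above can be sketched roughly as follows. This is only an illustration under several assumptions, not the intended solution: utterances are treated as plain nested dicts with hashable leaf values rather than the Episode and Individual classes in world.py, retrieval uses the "most similar" approach with a made-up similarity measure, and "predominates" is taken to mean "occurs in more than half of the retrieved utterances that have the feature". The names similarity, retrieve, blend, incorporate, and process_sketch, and the parameters k and threshold, are all hypothetical; world.py supplies its own comparison code and constants, which you should use instead.

```python
from collections import Counter

def similarity(test, stored):
    """Hypothetical measure: fraction of the test's features that stored matches."""
    scores = []
    for feature, value in test.items():
        other = stored.get(feature)
        if isinstance(value, dict):
            # Nested feature structure (e.g. a referent): compare recursively.
            scores.append(similarity(value, other) if isinstance(other, dict) else 0.0)
        else:
            scores.append(1.0 if value == other else 0.0)
    return sum(scores) / len(scores) if scores else 0.0

def retrieve(test, ltm, k=3):
    """Step 1: the k stored utterances most similar to the test utterance."""
    # sorted() builds a new list, so the LTM itself is left in its original order.
    ranked = sorted(ltm, key=lambda ep: similarity(test, ep), reverse=True)
    return ranked[:k]

def blend(episodes, threshold=0.5):
    """Step 2: keep, for each feature, only a value that predominates."""
    blended = {}
    features = {feature for ep in episodes for feature in ep}
    for feature in features:
        values = [ep[feature] for ep in episodes if feature in ep]
        nested = [v for v in values if isinstance(v, dict)]
        if nested:
            # Blend nested structures feature by feature.
            sub = blend(nested, threshold)
            if sub:
                blended[feature] = sub
        else:
            value, count = Counter(values).most_common(1)[0]
            # Assumption: "predominates" means a strict majority.
            if count / len(values) > threshold:
                blended[feature] = value
    return blended

def incorporate(test, blended):
    """Step 3: fill gaps in the test utterance; its existing values win."""
    result = dict(test)
    for feature, value in blended.items():
        if feature not in result:
            result[feature] = value
        elif isinstance(result[feature], dict) and isinstance(value, dict):
            result[feature] = incorporate(result[feature], value)
    return result

def process_sketch(test, ltm, k=3):
    return incorporate(test, blend(retrieve(test, ltm, k)))

# A toy LTM, much smaller than the one generated in world.py.
toy_ltm = [
    {'sound': 'sugar', 'referent': {'color': 'white', 'taste': 'sweet'}},
    {'sound': 'sugar', 'referent': {'color': 'white', 'taste': 'sweet'}},
    {'sound': 'sugar', 'referent': {'color': 'brown', 'taste': 'sweet'}},
    {'sound': 'snake', 'referent': {'color': 'green', 'shape': 'cylinder'}},
]
toy_test = {'referent': {'taste': 'sweet'}}
completed = process_sketch(toy_test, toy_ltm, k=3)
print(completed['sound'])
```

With the toy data, the three 'sugar' utterances are retrieved, 'white' predominates over 'brown' for 'color', and the test utterance's own 'taste' value is left as it was. With the color list from step 2 (two 'green' out of five), blend records nothing for 'color', since no value reaches a majority.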
