Homework 4: Natural Language Generation

Due Friday, 18 April

This assignment is worth 10% of your grade. You will write a simple recursive transition network to handle a set of English sentences and use it to randomly generate sentences.
  1. Write a recursive transition network grammar and an accompanying lexicon which will generate or parse the following sentences. Your grammar and lexicon should be in a declarative format, not in the form of a set of procedures.
    1. A doctor snored.
    2. The doctor looked sinister.
    3. The doctor squirmed.
    4. The doctor doubted a sinister lawyer.
    5. The lawyer resented a very lovable teacher.
    6. The lawyer doubted the rather sinister, pretty silly doctor. [Don't worry about the comma in your grammar.]
    7. Every teacher resented a lovable doctor.
    8. The lawyer believed that every teacher looked rather silly.
    9. The lawyer believed that every teacher wished that the very silly lawyer resented the doctor.
    10. The teacher doubted that every doctor snored.

    Of course your grammar should be general enough to handle sentences other than these, but it should not generate or parse sentences like the following.

    1. *A doctor slept sinister.
    2. *The lawyer wished the teacher.
    3. *Teacher slept.
    4. *The teacher looked rather.
    5. *The doctor believed.

  2. Write the help procedures that you will need to access networks within the grammar, states within networks, and transitions within states and to test whether a state is final, whether a transition is a jump transition, and whether a label is a lexical or a phrasal category. You will also need a procedure to randomly select a transition out of a state when there is more than one. You should bias this procedure to pick particular transitions; otherwise, your sentences could end up very long. One way to do this is to arrange the transitions in your grammar in order of preference and have the procedure pick the first ones with greater likelihood than the later ones. The transitions that lead to recursion should be relatively late so that the sentences don't end up too long.

  3. Write a procedure generate which generates a sentence from the grammar. Starting with the S network, it traverses the network, randomly selecting a transition out of a state when there is more than one. When a transition with a phrasal category label (such as NP) is selected, the program recursively traverses the network for that category, returning to the higher network when it is done. The output of the program should be a structured representation of a sentence, for example, for A doctor slept:

    (s (np (det a) (n doctor)) (vp (iv slept)))

  4. (5 points extra credit)