Homework 5: Genetic Algorithms and Simulation

Due: Friday, 2 May (no late handins accepted)

This assignment is worth 20% of your grade. You will write a genetic algorithm (GA) to solve a simplified problem for a robot: navigating through a maze. The simulation of the robot in the maze has been done for you. You will also use your GA to maximize a function. For extra credit, you can simulate the iterated prisoner's dilemma (IPD) and have the GA evolve strategies for playing the game.

The Robot's Task

The robot is a simple machine. It can move north, south, east or west (NSE or W). It has no sensory input and does not have conditional moves; its behavior is determined by a list of which moves it is to take while in the maze. If it hits a wall, that move is simply wasted. The maze has one entrance (where the robot starts) and one exit (the robot's goal). The closer the robot gets to the exit in the fewest number of moves, the better its score will be. The robot will simply be represented as a list of moves:
	(n e e e s s)
The maze might look like this:

	######
	#   x#
	### ##
	#e  R#
	######

where x is the exit, e is the entrance, and R is where the robot currently is (in fact, that is where it will end up after executing the moves listed above if it starts from e). Since the robot cannot see anything, it merely moves until it reaches the exit (at which point all subsequent moves in the list are ignored) or runs out of moves. How close it is the the exit determines its score. If it reaches the exit, it gets a high score and is also given extra points for using fewer moves to get to the exit.

Some Definitions

genome
The list of a population member's genes. Genes are 1s and 0s. You can represent this as a list, e.g.,
			(1 0 0 1 1 0 1 1)
		
phenome
The expression of an individual's genome in the "world." In this case, the phenome is the form of what's being evolved, or a description of the behavior of the individual, and it comes directly from the genome. The robot's genome would be a list of its moves, decoded from the genome. If every 2 bits in the genome encoded for one move (00 = n, 01 = s, 10 = e, 11 = w), then the above genome's phenome would be:
		GENOME= (1 0   0 1   1 0   1 1)
		PHENOME=( e     s     e     w )
		
fitness
How good a genome (or phenome) is. Genomes are sometimes scored directly, but often they are converted into a phenome and the phenome is given a score. For the example genome/phenome above, if the robot's moves of e, s, e, w lead it closer to the exit, it would have a higher score than one that leads it to some dark corner of the maze far away from the exit.
individual
Your GA will operate on individuals. An individual will be a list of the fitness, genome and corresponding phenome. For example:
		(4   (1 0 0 1 1 0 1 1)   (e s e w))
	
population
A group of individuals. The GA operates on these, producing their phenomes from genomes, evaluating each one's fitness, then selecting mates and producing new individuals for the next generation. This will be a list of individuals.

The Genetic Algorithm

The GA begins with an initial population with randomly-generated genomes. It generates phenomes for each genome, then evaluates each individual's fitness based on its phenome. You will need to write a routine, run-ga, which takes a population size, a genome size (how many genes), the number of generations to run, and two problem-specific routines, genome->phenome and eval-pop-fitness. The problem-specific routines follow these guidelines:
genome->phenome
INPUT: a genome
OUTPUT: a phenome based on the genome
eval-pop-fitness
INPUT: population of individuals
OUTPUT: same population, but fitness scores have been added
The following are suggestions as to what routines to write.
  1. Routines to make and access individual's components

    make-indiv takes a fitness, genome and phenome and returns a list containing these items. indiv->fitness and complementary access routines get the parts of an "indiv" record.

    These routines are simple and can help make the rest of your coding less cumbersome.

  2. mutate

    Takes a genome, returns the same genome with some of the bits flipped. Flip with a small probability. Usually this is 0.00001 or less. A small population or few generations might benefit from higher values, but 0.5 is a bad choice.

  3. crossover

    Takes 2 genomes, performs one-point crossover on them to produce two new genomes, which are returned in a list.

  4. mate

    Takes 2 individuals, each looking like (fitness genome phenome), and performs crossover on their genomes to get 2 new genomes. It then mutates the new genomes. Finally, it makes 2 new individuals with dummy values for fitness and phenome and returns a list of these new individuals. For example:

    		((0 (1 1 0 0 1 0 1 0) junk)
    		 (0 (1 0 0 1 0 1 1 1) junk))
    	

  5. select-indiv

    Takes a population of individuals. Chooses a single individual randomly and returns that individual. Important: this random choice must be biased based on the individual's fitness relative to the population. So, if individual #1 had fitness 8 and individual #2 had fitness 4, #1 should be chosen twice as often as #2.

    Use the rule that higher fitness is better. Also, beware of loser populations with all 0 fitnesses! You might want to randomly select in such a case.

  6. random-pop

    Takes a population size and number of genes in each genome and generates a population of individuals with random genomes. The phenomes and fitnesses can be left with dummy values for now.

    A useful helper might be random-genome, which takes a genome-length and returns a random genome of that length.

  7. run-ga

    This takes a pop-size, genome-size, num-generations, and the problem-specific routines genome->phenome and eval-pop-fitness.

    It creates an initial population with random-pop and then does the following things for each generation:

    1. Creates the phenome for each individual in the population.
    2. Evaluates the fitness of each individual.
    3. Selects individuals and mates them to produce the new generation
    4. Prints out some info (highest fitness or total population fitness, plus the generation#)
    Once a new generation is produced, this process begins anew - the old population is discarded and the new one is then evaluated, etc., according the the above steps.

    Helper routines might be make-pop-phenomes which takes a list of individuals and generates a list of the same individuals with their phenome "fields" filled in. make-new-pop would take the population that has just been evaluated and calls select-indiv and mate until a new population has been produced.

    When done - when the last generation is reached - run-ga outputs the final population so that you can see the resulting phenomes and fitnesses. It would also be good to output the highest and lowest fitness, the total fitness and the best individual.

Testing the GA

The GA is general-purpose, so you will have to provide problem-specific routines to it:
	genome->phenome
	eval-pop-fitness
To see if the GA is working, use something simple, like the following that selects genomes for having lots of 1s. If your test is too complex, you won't know if your GA is working properly. Run it many times to be sure.

;;;  phenome made by adding 1's in genome.
;;;  fitness = phenome's value, squared
;;;  should select for all 1's in genome
(define simple:gen->phen
  (lambda (g)
    (apply + g)))

(define simple:eval-pop-fit
  (lambda (pop)
    (let* ([indiv-fit (lambda (indiv)
                        (let ([p (indiv->phenome indiv)])
                          (* p p)))]
           [new-indiv (lambda (indiv)
                        (make-indiv (indiv-fit indiv)
                                    (indiv->genome indiv)
                                    (indiv->phenome indiv)))])
      (map new-indiv pop))))
Small populations work terribly, so try using at least 10 individuals. Small numbers of genes also aren't helpful, so use a genome-length of 10 or 15.

A simple GA task

Now try your hand at developing an encoding for a GA problem and evaluating a population's fitness.

Devise a max:genome->phenome and a max:eval-fitness for "maximizing" a function. The function is f(x) = 1 + 8x - x^2. What integer value for x will give f(x) its greatest value?

Hint: use a binary-coded decimal.

Run your GA on a population of 16 with a genome-length of 8.

Using the Robot-Maze code with the GA

The robot-maze problem code is here. This code also uses some matrix operations which you will need. The robot's routines are
	robot:genome->phenome
	robot:eval-pop-fitness
You may also make your own maze with make-maze (see the code for details). You would then have to run (init-maze my-maze) if you called your maze "my-maze". Then, you can run the ga.

The following would make a small maze for the robot:

        ;; *'s are walls; #s are the fitness value for getting to
	;; that part of the maze (the exit has the highest score
	;; in my maze, and surrounding spaces are higher scores
	;; since I want my robot to get as close to the exit as it can)
	(define maze-list 
		'(5 6 * 0 0
		  4 * * * 0
		  3 2 1 0 0))
        ;; make-maze takes #rows, #cols, start-coord (dotted list)
        ;;  exit-coord (dotted list) and a list of the maze items
	(define *my-maze* (make-maze 3 5 '(2 . 4) (0 . 1) maze-list))

	(init-maze *my-maze*)

Use at least a population of 10, if not 20 individuals. Diversity is the key here - using a ga on a population of 4 or 6 individuals can give bizarre results (try it!). The genome length must be even (every 2 genes codes for a cardinal direction). Make sure there are enough genes to at least allow the robot to finish the maze. Twice as many genes as there are steps would be a good number. The number of generations may vary, but expect to run at least 10.

A mutation rate of 0.01 or 0.05 or even 0.1 might help with a small population of robots.

Setting *verbose* to #t will cause the fitness evaluator to show each robot in each population moving through the maze, one frame at a time. This is recommended only for use with small populations and short genomes as it gives quite a bit of output. It may help you to see what is going on, however.

Extra Credit: The Iterated Prisoner's Dilemma

The prisoner's dilemma is a game between 2 players. Imagine that both have committed a crime and have been caught. Alone in a cell, prisoner1 is told that if he testifies against his partner and his partner says nothing, then he gets off free while his partner gets 5 years in jail. If he rats and his partner rats as well, they both get 3 years. If prisoner1 says nothing and his partner rats on him, then prisoner1 gets 5 years in jail while his "buddy" goes free. Finally, if both keep quiet, they both get 2 years in jail.

Ideally, both will keep quiet - they know each is being offered the same deal. However, a better deal for each is that they defect and the other stays quiet! Always defecting is the best strategy, as it safely gives the best payoff regardless of what the other prisoner does.

The iterated prisoner's dilemma is a bit more interesting. Now, in addition to playing the game, each prisoner has a memory of previous games played with the other. So, decisions on whether or not to defect can be based on previous interactions. Does the other prisoner always defect? Then defect. If he always keeps quiet, then maybe keeping quiet is less risky than it would be with no prior history.

What to do

Using your same GA routines, develop a genome->phenome and eval-pop-fitness that will evolve a population of good players for the iterated prisoner's dilemma. Allow for a memory of 2 games back.

The encoding is up to you. It's not too hard if you give it some thought. Hint: the non-iterated prisoner's dilemma can be solved with one bit. The IPD takes a few more bits than that, but not much. Describe your encoding and why you chose that encoding.

Discuss the strategies that arise and what seemed to work best for the GA (population size, number of generations, mutation rate?).