Cognitive Studies/Psychology/Visual Studies 201:
COGNITIVE SCIENCE IN CONTEXT

Laboratory Manual for Neural Net Module

Module by Elvis Au
Manual by Elvis Au and Whitney Tabor
Edited and updated by Doug Elrod

February 10, 1998

New for Spring 2001: MATLAB project by Mark Andrews

Introduction

This module is centered around an artificial "animal" that lives in a two-dimensional environment and must struggle for survival. The "animal" is implemented as a neural network, a simple model of a brain attached to perceptual sensors and motor controls.

You will be able to design the network "animal" by setting a few parameters; then you can design some typical situations in the environment and teach the network how to behave in those situations; finally, you'll be able to test the trained network in a randomly created environment and assess its survival skills.

The module will raise a number of intriguing questions at the interface of biology and psychology: In what sense is learning an advantageous capability for a species? What do we mean when we say that an organism has learned well? What is the nature of the tradeoff between complexity and efficiency for organisms that must struggle to survive? (Note that a more complex organism can usually handle a greater variety of situations than a simple one, but it pays a price in the form of the energy it must expend on maintaining, or growing, its complex structures). The module will also provide an introduction to the properties of artificial neural networks.

We can think of an analogous, biological animal at many levels (macro-organism level, micro-organism level, cellular level, molecular level...). This neural network organism can be analyzed at three different levels, crudely analogous to the biological organism:

  1. Interaction between organism and environment
  2. Theory of the mechanism of the organism
  3. Implementation details of the organism

Each of these levels will be investigated in turn and expanded in greater detail.

About Artificial Neural Networks

A neural network is made up of many smaller parts (neurons) that are highly interconnected with each other. The activity of a neuron in a neural network depends on the activity of the neurons it is connected to, or on the environment (if it is a sensor, otherwise known as a receptor). A feedforward neural network is a neural network that does not have any recurrent connections (connections that loop back into the system); the connections go strictly in one direction (input to output).

[Figure: neural net with four input nodes, two hidden nodes, and three output nodes]

One of the main reasons for being interested in neural networks is that there are general purpose algorithms which allow them to learn how to interact with an environment in an appropriate way. The abilities of a neural network are determined by the weights on the connections between its neurons. A learning algorithm is thus a method of adjusting these weights so that the network becomes capable of responding appropriately to the range of situations it encounters in its environment. The most popular neural network learning algorithm is called backpropagation. This algorithm is explained more fully in Section II. In this module, we are using neural networks as an engine for the decision-making processes that our artificial animal will use for a specific virtual environment.
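To make this concrete, here is a minimal sketch in Python of activation flowing through a three-layer feedforward network. The layer sizes match the figure above; the weights are arbitrary illustrative values, not ones taken from the module's software:

```python
import math
import random

random.seed(0)

def sigmoid(x):
    # Squashing function: maps any real-valued input into (0, 1).
    return 1.0 / (1.0 + math.exp(-x))

def layer(inputs, weights):
    # Each node sums its weighted inputs, then squashes the sum.
    return [sigmoid(sum(w * a for w, a in zip(row, inputs)))
            for row in weights]

# Four input nodes, two hidden nodes, three output nodes.
w_in_hid  = [[random.uniform(-1, 1) for _ in range(4)] for _ in range(2)]
w_hid_out = [[random.uniform(-1, 1) for _ in range(2)] for _ in range(3)]

stimulus = [1, 0, 0, 1]              # activations of the four input nodes
hidden   = layer(stimulus, w_in_hid)   # activation flows input -> hidden...
output   = layer(hidden, w_hid_out)    # ...then hidden -> output
print(output)                        # three values, each strictly between 0 and 1
```

The network's "abilities" live entirely in the two weight lists; training replaces these random numbers with useful ones.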

We will build a small neural network "brain" for an animal that is free to wander in an environment populated by several basic objects. The goal for this autonomous agent is to collect "food" and avoid objects that can "hurt" it. This animal will be able to experience the environment through a perceptual field and interact with it using its motor units. You will play the part of evolution to help the animal learn appropriate behaviors that will promote its survival, or allow it to perform specific tasks.

Neural networks have been used to predict weather and stock markets and to recognize handwriting. They are also used in modeling speech production/parsing, visual recognition, and vision and movement in robotics. They have even been used to model neurological disorders. There is a large market for computer games with built-in intelligence (Command&Conquer: Red Alert, Creatures, etc.).

Reading Assignment 1

This reading is a good introduction to artificial neural networks and how they were derived from their biological counterparts. Different types of neural networks are described, including their history and applications in the real world. This reading should give you a good idea of what neural networks basically are.

Also, visit these world wide web sites to get an idea of artificial neural network commercial applications in the real world:

BioComp Systems, Inc. "Systems that model, predict, and optimize"
http://www.bio-comp.com/

Computer Associates, Inc. NEUGENTS™: Software that can think™
"Neural network-based solution for predictive enterprise management"
http://www.cai.com/neugents/

NeuralWare, Inc. "Solving today's problems with tomorrow's technologies"
http://www.neuralware.com/

Pegasus Technologies Ltd (a subsidiary of KFX) "NeuSIGHT uses proven advanced neural network technology to achieve optimal performance on fossil-fueled furnaces"
http://www.pegasustec.com/


About the Software

The software accompanying this module allows for the simulation of a simple feedforward neural network. This simple feedforward network serves as a brain for a virtual animal. The brain links the animal's receptors to its output motors.

The Software (3 modules)

The network is made up of three layers with adjustable numbers of neurons:

The Scenario: Characters, Objects, and the Environment

The environment is a two-dimensional grid populated by several types of objects.

The Objective

To construct an animal that is able to survive in its environment (i.e. collect food, avoid rocks, and avoid predators) or perform special goal-oriented tasks. It is important to consider the configuration of the neural network in reaching this goal (i.e. certain configurations might be wasteful, others more efficient).


Section I. Interaction between organism and environment

This section will allow you to observe and experiment with the behaviors of the simulated animal in its environment. We are not going to explore the neural network's internal mechanisms in this section; instead, we are going to observe it from an ethological, or behavioral, point of view, just as you would if you were doing fieldwork observing an actual animal. For now, we are more interested in the behavior of the neural network and how we can modify it by presenting certain stimuli or by teaching the animal new behaviors.

The actual architecture and representation of the neural network will be the focus of Section II. For now, it is important to get a feeling for use of the software and the range of behaviors this simulated animal is capable of. This section will also explain how to define examples of appropriate behavior for the network in specific circumstances.

This section also brings in a touch of artificial life. By viewing this neural network as a creature with behaviors that we can observe and manipulate, we are heading in the direction of the study of artificial life: the idea that life does not have to be realized in the traditional organic materials we know. Researchers in artificial life observe the behaviors of animals to inspire new ideas about the causes and mechanisms of animal behavior. The goal is artificial life and, ultimately, perhaps, the evolution of artificial intelligence.

In this laboratory, we are aiming to build a neural network that produces autonomous-like behaviors, things that animals do. Our objective is to teach the animal "rules" that it should follow in order to survive or perform some task. Of course, we cannot possibly teach all possible combinations of what the neural network may encounter in its environment (it would take too long). Therefore, we will have to come up with the most basic and logical rules that the network should learn, and hope that the network can generalize appropriately. Basically, you are playing the role of evolution.

Reading Assignment 2

For additional historical information, beyond that in the reading assignment, you may want to look into

For more information on artificial life, please refer to the following www links:

http://alife.fusebox.com/
This site includes several artificial life demos, the parameters of which you can manipulate and run. It offers an excellent introduction to artificial life and the simulation of animal behaviors.

http://alife.org/
http://www.alcyone.com/max/links/alife.html
These two sites offer more in-depth information of current A-life research, and links to material on neural networks, genetic algorithms, cellular automata, etc.

Part I. Choosing a default architecture and training the network

  1. At the top of the screen, find the System menu and select open session. (You may have to click on the main menu window to bring up the menus.) A file-selection window should pop up. Change your current directory to the one above this one (click on A.N.N.A. and select nn module release 3.5.2). Find default.net and press OPEN to load the file. The default network that comes with the program should be loaded.
  2. This default neural network consists of:
  3. You should see the Training Kit in front of you. The middle panel with the grid represents the animal (in the center of the grid). The area in the grid above the animal is what is in front of it. The area to its right is what is on the right side of the animal, and so on. We are viewing the animal as if we were looking down upon it from above. The red-shaded region in its immediate area is the animal's perceptual field. It can see objects within this field, but not objects outside its perceptual field.
  4. The network has also been set up so it can be crudely trained to approach food. Double-Click with the mouse on each of the scenarios in the scenario list.
    Note: Each type of object is represented by a different color. The presence of food is represented by a shade of light blue.
    You can see that each of the scenarios consists of having a "food" stimulus in front of the animal somewhere in its perceptual field. Note that the output selected to be trained on is "move forward". This should produce some food-oriented behavior, in which the animal moves toward food sources (approach behavior).
  5. A neural network is a device which can be exposed repeatedly to examples and "trained" to improve its behavior. As it sees more and more examples, it gets better and better at responding in the appropriate way to each one. Error is a measure of how well the net is performing: the smaller the error term, the better the animal performs. Error is explained in more detail in Section II.
    The current settings of the training control panel should be:
    learning rate: 0.05
    min error: 0.05
    presentations: 5000
    (Change them to these values for this part of the experiment, if necessary).

Part II. Assessing the network's behavior

Train the animal with the current settings by hitting the train button. Note the number of epochs it needs before it is done training. One epoch means that the neural network was trained on the entire set of scenarios once. A small window on the right should be displaying the current error of the neural net in trying to learn these scenarios. The error should rapidly decrease over several hundred epochs of training set presentations. When the training is finally done, press OK in the window that pops up to inform you that the training is done.
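The exact error formula the software uses is not documented here, but a typical choice for backprop networks is the mean squared difference between the network's outputs and the target outputs over the whole scenario set. A sketch in Python (the outputs and targets below are made-up illustrative numbers, not values from the program):

```python
def mean_squared_error(outputs, targets):
    # Average of the squared differences between what the net produced
    # and what the scenarios said it should produce.
    total = 0.0
    count = 0
    for out, tgt in zip(outputs, targets):
        for o, t in zip(out, tgt):
            total += (o - t) ** 2
            count += 1
    return total / count

# One epoch = the whole scenario set presented once.
net_outputs = [[0.9, 0.1], [0.2, 0.8]]   # hypothetical outputs after training
targets     = [[1.0, 0.0], [0.0, 1.0]]   # target behaviors from the scenarios
print(mean_squared_error(net_outputs, targets))  # small value -> good performance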

Click on the environment button in the main menu to bring up the Environment Kit. In the Environment Kit, enter the following parameters into the world control panel:

Food density (%): 10
Rock density (%): 0
Predator density (%): 0

Press the generate world button to build the world with these new parameters.

This should sparsely fill the environment with food. This should also eliminate any rocks or predators in the environment. In the environment control panel over the world control panel, enter 100 in the steps text field. At the top of the screen, select the Environment menu and select animate OFF to turn the animation on (this menu item toggles from one setting to another when you release on it).

Click start test to activate the animal for 100 actions at a time.

The log panel on the left of the environment display should list the current direction the animal is facing and what action it is performing (move forward, turn CW, turn C-CW).

Note that the animal ages with every action step. This is recorded and displayed in the statistics window. The distance the animal travels and the amount of food it has obtained are also displayed.

Exercise 1

Do you see any evidence that the animal is approaching food or is it just randomly wandering? Remember that the animal cannot see too well because of its limited perceptual field. A piece of food must be immediately in front of it before it is prompted to move forward onto it and eat it.

What does the animal do if it doesn't see food? Is there any pattern to its movements if the animal happens to wander into a particularly barren area of the environment?

Make a rough estimate of how many time steps or actions the animal takes to obtain a piece of food.

Click on the Training Kit button in the main menu to bring up the training kit.

In the training kit control panel, change the numbers to:

learning rate: 0.05
min error: 0.005
presentations: 2000

The important change here is that the minimum error level that the neural network has to be trained to is lowered by a factor of 10. This means that the neural network will take much longer to train before it reaches this minimum error.

Retrain the network by pressing the train button. Once the training is complete, note the number of training set presentations required (epochs). Bring up the Environment Kit and repeat the testing steps outlined above in Part II.

Now, do you see any evidence that the animal is approaching food or is it just randomly wandering? What does the animal do if it doesn't see food? Is there any pattern to its movements when the animal happens to wander into a particularly barren area of the environment? Enter 100 into the steps text field and click on start test to automatically test the animal for 100 time steps. Make a rough estimate of how many time steps or actions the animal takes to obtain a piece of food. Is there any difference this time through the simulation? Does more training help the animal in its food-oriented behavior?

Part III. Defining Scenarios

Click on the Training kit button in the main menu.

Now we are going to add a new object in the environment, namely the rock, and train the animal to avoid bumping into rocks because rocks can hurt it. What kind of situations do you think the animal could encounter in the environment that is now made up of rocks and food?

Double-Click on each of the training scenarios already built-in for food-orientation to get an idea. Remember, you might want to APPROACH food and AVOID rocks.

Enter this scenario:

Fill the perceptual field with rocks by selecting the rocks button on the left, and clicking on all the areas in the perceptual field until they change color. Then go to the output selection and highlight "turn CW". This is telling the animal that if it sees a wall of rocks right in front of it, it should NOT move forward, but it should turn clockwise away from the rocks. Add the scenario and call it "wall of rocks", by clicking on the add button next to the scenario list.

Exercise 2

Think of more scenarios like the one above and add them to the animal's list of things to learn. And think about why you would add these scenarios. Are they necessary to help train the animal? What function do these scenarios perform?

Exercise 3

  1. Similar to what you did above, train the animal to the 0.05 error level, test it in the environment, and then train it again to 0.005, to see if the training really helps. Fill the environment with 10% density of rocks and 0% food and 0% predators. Observe the behavior of the animal, and gather some statistics on its performance to determine if it is successfully avoiding rocks. Step the animal one action at a time to see what it actually does and finally let it run for a few hundred time steps. Note how many rocks the animal encountered. You may want to test the animal several times. Your animal may have just been "lucky" or "unlucky" this one trial. Use more trials to confirm the performance of the animal, and collect data each time.
  2. You may want to save your data during each run. Go to the Environment menu at the top of the screen (you have to be in the Environment Kit). Find stats options and select it. Select the kinds of data you want the simulation to record by checking the checkboxes in the window that opens. Press OK to confirm your selection.
    When you are ready to record, go to the same menu and select record stats. The program should ask you where you want to save your data. Later, when you are done, you can open up this file in any text editor to view the data or use copy-and-paste to enter the data into a spread-sheet/statistics program.
    Note: On the Macintosh, double-clicking on these data files will cause Java Runner to interpret them as Java programs, which they aren't (so you will get a Java error). To view them in a text editor, either drag the file and drop it on top of the text editor icon of your choice, or start the text editor and open the data file from the text editor's FILE menu.
  3. Now go back into the environment and set the density of food AND rocks to 10% each and click on rebuild world. Now you have a world filled with food and rocks. Test your animal with this environment. Any problems? What can you do to fix them? Did the animal come across any situations you did not foresee? For example, what did it do when there was both a rock and food in its perceptual field?
  4. You might want to open up the Training Kit and revise your scenarios.
    Note: you may also want to reset weights. This completely wipes out any experience from previous trainings, and lets you start with a totally brand new baby net.
  5. When you think you have a robust improvement in the animal's performance, compared to the original network, test this hypothesis with a t-test. Comment on the kinds of situations in which the animal's performance improved.

Section II. Theory of the mechanism of the organism

In this section, we will look inside this animal and try to understand the neural network engine that is driving its behaviors. This section is filled with exercises designed to introduce the properties of neural networks -- what they can do and what their limitations are. You will also be able to build your own neural network model, configuring its perceptual field and behaviors. In many cases, you will also be able to use your own model to run through these exercises.

What's an artificial neural network?

Now that you've seen a bit of what artificial neural networks can do, we will describe what they actually are. Artificial neural networks are models of biological neural networks found in organisms. They model basic properties of biological neural networks, including parallel distributed processing and graceful degradation. Artificial neural networks are not "real"; they do not even possess "real" neurons. But they are models of real neural networks, composed of models of real neurons.

Artificial neural networks do display many properties of biological neural networks. They demonstrate parallel distributed processing through the use of many interconnected artificial neurons that process and transfer information in parallel. Just as there is no one neuron in your brain that stores an image of your mother, the memory of artificial neural networks is also distributed across the different interconnections in the net.

Another property artificial neural nets demonstrate is graceful degradation. With the passing of each day, we lose more and more brain cells, yet we are still capable of functioning. In extreme cases, people have lost, traumatically or surgically, pieces of their brain, but are still capable of functioning (maybe not as optimally as before). Artificial neural networks are the same way -- they can be artificially lesioned or lose neurons or connections. Yet they are still able to function, but perhaps not at their best.

Artificial neural networks also demonstrate the ability to generalize. This property allows the net to learn to predict future outcomes from examples it has encountered in the past. You can train a neural network to produce certain behaviors under specific circumstances, but hopefully it can also learn to generalize from the instances you taught it and produce similar behavior under similar, but nonidentical, circumstances.

In this model, there are three layers of neurons (or "nodes") connected to each other. One layer serves as an input or receptor layer, receiving information from the environment. Another serves as the output layer, performing motor functions. The third layer bridges the input and output layers and serves as an additional layer of processing. Information flowing through this network is strictly one-way or feed-forward, meaning from input to output. Of course there are other models, more complicated, with all sorts of recurrent connections producing feedback, and capable of discriminating temporal patterns. But, as an introduction, our model is a basic feed-forward neural network.

The Parts of a Neural Network

In the literature, most neural networks are represented by circle-and-line diagrams (sometimes boxes or ovals are used instead). The circles represent the neurons and the lines represent the connections between the neurons. How neural networks are actually represented in a computer will be explained later.

[Figure: neural net with four input nodes, two hidden nodes, and three output nodes]

In this particular example of a neural network, there are three layers: output, hidden, and input layers. The circles in each layer represent the neurons, or nodes, in that layer. The terms neuron and node will be used interchangeably in this laboratory. The lines between the layers are the connections between the neurons. Information in this feed-forward network flows in only one direction, from input to output.

Note: in the sketch of the neural network model in the Construction Kit of the program, the order is reversed with the input layer at the top, and the output layer at the bottom.

The connections themselves can also be represented as weights. Weights are numbers assigned to each of the links to represent the strength of the connections between neurons. You will learn more about weights and connections in Section III, when you will actually run through the calculations involved in the processing of input within a net. Usually, the larger the number, the stronger the connection. If the value of a weight is negative, then the connection is inhibitory.

Another important component of the neural network is the activation, or squashing, function, which in this model is the sigmoid function. The sigmoid function is a non-linear function and introduces non-linearity into the network model. The function relates the input to the activation in the neurons of the hidden and output layers. For example, when the neurons in the input layer are activated by a particular stimulus, the signal gets passed through the connections between the input and hidden layers, where it is amplified or reduced (depending on the connection weights) and accumulated in the hidden layer to produce a new signal. This new signal is run through the (non-linear) sigmoid function, which "squashes" the original value to produce a new signal with a magnitude between 0.0 and 1.0. This new value is then processed in the same way through the next set of connections until an output is produced.

[Figure: the sigmoid function, y(x) = 1/(1+exp(-x)), rising from 0 toward 1]

The sigmoid function is often found in other systems, especially biological ones. In erythrocytes (red blood cells), the ease with which these cells pick up oxygen can be described using a sigmoid function. In biochemistry, there are numerous examples of molecular interactions which can be described by this function. We will discuss in depth in a later exercise (the XOR problem) why this non-linear component is necessary.
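A quick numerical check of the squashing behavior, in plain Python with no special libraries:

```python
import math

def sigmoid(x):
    # y(x) = 1 / (1 + exp(-x))
    return 1.0 / (1.0 + math.exp(-x))

# However large the accumulated signal is, the output stays between 0 and 1:
for x in (-10, -1, 0, 1, 10):
    print(x, round(sigmoid(x), 4))
# sigmoid(0) is exactly 0.5; large negative inputs approach 0,
# large positive inputs approach 1.
```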

Neural Network Training Algorithm

So far, you have been able to train the animal to produce certain behaviors upon encountering certain stimuli. You have also been able to modify scenarios and add new ones to the list and successfully train the neural network on these as well. But, what actually happens when you hit the Train button? And how do the parameters like learning rate and error rate fit in?

When a neural network "learns", what actually happens is that the weights of the net are adjusted so that a certain stimulus produces a certain output. When you first build your neural network, all the weights of the connections are random numbers. When you train the net, the weights are gradually adjusted over many trials (the many epochs you see that are required to train the net) until the network produces the behavior you had specified. This is a gradient-descent algorithm, meaning that error between the network's output and your target output (the behavior to be learned) is gradually being minimized.

This is where the error rate parameter comes in. In the Training Kit, you can specify the minimum error threshold to which the neural network should be trained. Training stops when the network reduces its error to this value. The smaller the number, the better the neural network performs, but the longer the network may take to train. The learning rate determines how much each of the connections is adjusted during each trial of learning. High learning rates can allow the neural network to learn faster in certain cases, but may actually cause problems on more complex problems (searching using large steps can sometimes overlook a good set of weights). In our own experimentation, we typically left the learning rate at 0.05. You may want to experiment on your own to determine what learning rates work best.

In this neural network implementation, the algorithm used to adjust the weights of the network is called the backward-error propagation training algorithm, or backprop for short. Backprop is a gradient-descent algorithm. As the name implies, the difference between what the network produces and what it should produce is used as an error signal, which is sent backward through each of the preceding layers and used to slowly adjust the weights. Criticism of neural networks and their biological implausibility is usually directed at these "unnatural" training algorithms: biological neural systems are not known to propagate explicit error signals backward to adjust their connection strengths.
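A full derivation of backprop is beyond this manual, but the flavor of the gradient-descent step can be shown for a single output neuron with a sigmoid activation. This is a simplified delta-rule sketch, not the module's actual implementation (which also propagates the error back through the hidden layer); the inputs and initial weights are arbitrary illustrative values:

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

learning_rate = 0.05        # same role as the Training Kit parameter
inputs  = [1.0, 0.0, 1.0]   # activations feeding one output neuron
weights = [0.2, -0.4, 0.1]  # initial connection strengths
target  = 1.0               # the behavior the scenario calls for

for epoch in range(10000):
    out = sigmoid(sum(w * a for w, a in zip(weights, inputs)))
    error = target - out
    if abs(error) < 0.05:   # the 'min error' stopping criterion
        break
    # Gradient-descent step: nudge each weight in proportion to the error,
    # the sigmoid's slope at the current output, and the input activation.
    delta = error * out * (1.0 - out)
    weights = [w + learning_rate * delta * a for w, a in zip(weights, inputs)]

print(epoch, round(out, 3))  # the error shrinks gradually over many epochs
```

Lowering the 0.05 stopping threshold, as you did in Part II of Section I, makes this loop run for many more epochs before it stops -- the same tradeoff you observed in the Training Kit.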

Reading Assignment 3

Finally! Let's build a brain!

Click on the Construction Kit in the main menu to bring up the construction kit window. There should be three panels visible to you.

The architecture of the model is a basic feedforward neural network. Information flow from one layer to the next is only in one direction, from the input layer to the output layer. In this model, there are three layers that need to be defined.

Building the receptor (input) layer

  1. Click on the visual button next to receptors in the left panel. A 5x5 grid should be displayed in the center panel. The receptors that are provided for this model allow for it to "see" the environment.
  2. Click on several of the tiles surrounding the animal. The tiles that turn red have been "activated", and the neural network can see those spaces relative to it in the environment. Note that as each tile is selected, a corresponding sketch of the neuron is drawn on the sketch panel in the input layer (top layer). Also note that one node is displayed even before any tiles are activated: it represents the receptor at the animal itself (so the animal will know when it is on top of a rock or food, etc.). The receptor at the animal's position is always present.

Building the output layer

  1. Click on the motor outputs button at the bottom of the left panel. A set of motor options should be displayed in the center panel.
  2. Select the different types of motor outputs you wish to endow your model with. Remember that motor units, like other neurons, require energy.
  3. A corresponding neuron is displayed as each motor output is selected.

Building the hidden layer

  1. Click on the text field in the left panel next to interneurons.
  2. Enter the number of interneurons in the text field.
  3. Click submit. The corresponding number of interneurons entered will be displayed as part of the neural network sketch, complete with lines showing the connections from one layer to the next.

Exercise 4

Part of the experimentation involves trying to figure out a good configuration of receptors, interneurons, and output neurons. Remember that each neuron in the network has a cost to the animal's health, since it needs energy to maintain -- the more neurons you use, the more energy the neural network will need. For example, you may want to concentrate all the sensors at the front of the animal, or place sensory receptors all around it. These decisions will ultimately affect the optimal choice of motor outputs and the training necessary to teach the animal how to survive.

Configure your neural network and discuss why you chose to build the neural network the way you did (i.e. how your choice of receptor configuration affects your output configuration, etc.).

Part I. Object encoding: distributed representation vs. localist representation

1. Go to the main menu and open the Construction Kit.

2. Select stimulus encode from the System menu. A window should pop up displaying the binary strings the neural network uses to distinguish one object from another.

They should read:

Rock       1000
Food       0100
Predator   0010
Clear      0001

These four strings denote the four types of objects that can be in the environment. (Note: Clear is an object representing an empty space in the environment, i.e. not filled with predator, nor food, nor rock. So "figure" and "ground" are not "hardwired" into the animal's perceptual system. The animal can just as easily be trained to run away from clear spaces as from predators).

In the Construction Kit, each time you added a new receptor to the network, the sketchpad displaying a circle-and-line drawing of the neural network responded by drawing one additional node. Actually, every time you add a neuron to the receptive layer, you are adding several nodes. Using the orthogonal representation outlined above, each addition to the perceptual field adds four separate nodes to the input layer, one for each element of the binary stimulus string.
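So the size of the input layer grows with both the perceptual field and the length of the stimulus code. A quick arithmetic check (this sketch assumes the default four-bit code and that the always-present receptor at the animal's own position uses the same code; the function name is ours, not the software's):

```python
BITS_PER_TILE = 4   # one element per object type: Rock, Food, Predator, Clear

def input_layer_size(selected_tiles):
    # Each selected tile of the perceptual field contributes one group of
    # four input nodes, plus the group for the animal's own position.
    return (selected_tiles + 1) * BITS_PER_TILE

print(input_layer_size(3))   # 3 tiles selected -> 16 input nodes
print(input_layer_size(8))   # 8 tiles selected -> 36 input nodes
```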

In the coding scheme described above, you would use four nodes to represent an object in the environment, with each node specifically coding for a specific object. The object rock would be coded by activation of one node, while another object such as food would be coded by the activation of another node. This is called localist encoding.

An alternative way to code a stimulus in this environment would be to use a distributed representation. Another way of thinking of these binary strings is by viewing them as features. For example (using the above orthogonal representation), you can think of the first element in the string as coding for Rock. The second element for Food. The third for Predator, and so on.

           Rock   Food   Predator   Clear
Rock         X
Food                X
Predator                     X
Clear                                  X

(X = 1, blank = 0)


In this orthogonal, or localist, encoding, each object is represented by a particular node. (Note that we are using a fact about our world: a location can contain rock, food, predator, or clear, but not more than one of these.) In distributed encoding, each object is represented over several nodes, so a particular object is represented by a combination of features, each of which may be coded by a particular node. For example,

    Element 1: Round
    Element 2: Blue
    Element 3: Sharp
    Element 4: Large

(each element is either 1 or 0)


where

Rock may be represented as: 1001 (it's Round and Large).
Food as: 1000 (it's just Round).
Predator represented as: 0111 (it's Blue, Sharp, and Large).
Clear as: 0000 (not Round, nor Blue, nor Sharp, nor Large).

Whether an encoding is localist or distributed depends on whether the properties we choose to encode are mutually exclusive in the world (leading to localist encoding), or not (leading to a more distributed encoding).
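The difference between the two schemes can be sketched in a few lines of Python (the codes follow the tables above; the feature names are the illustrative ones used there):

```python
# Localist code: one dedicated node per object.
localist = {
    "Rock":     (1, 0, 0, 0),
    "Food":     (0, 1, 0, 0),
    "Predator": (0, 0, 1, 0),
    "Clear":    (0, 0, 0, 1),
}

# Distributed code over features (Round, Blue, Sharp, Large).
distributed = {
    "Rock":     (1, 0, 0, 1),
    "Food":     (1, 0, 0, 0),
    "Predator": (0, 1, 1, 1),
    "Clear":    (0, 0, 0, 0),
}

def hamming(a, b):
    """Number of positions on which two codes differ."""
    return sum(x != y for x, y in zip(a, b))

# In the localist code every pair of objects differs in exactly 2
# positions; in the distributed code Rock and Food differ in only 1,
# so the network sees them as similar and may confuse them.
print(hamming(localist["Rock"], localist["Food"]))        # 2
print(hamming(distributed["Rock"], distributed["Food"]))  # 1
```

This overlap is exactly what lets a net trained only on predators generalize (or overgeneralize) to rocks.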

In this scenario, the neural network may have trouble discriminating between Rocks and Food, because they are both round; so it might start nibbling on pieces of Rock. You can also adjust the length of these binary strings. You can have as many features as you want to represent objects in the environment. For example, you could try an encoding scheme such as:

    Element 1: Bad
    Element 2: Good

(each element is either 1 or 0)


where the objects can be represented as:

    Rock     = 11
    Food     = 01
    Predator = 10
    Clear    = 00


(This gets into another problem. We'll come back to it later in Part V.)

Exercise 5

  1. Use localist encoding (the default), where each object is explicitly represented by one input node. Train on just avoiding predators. Test on just rocks. Does the net generalize from predators to rocks?
  2. Now design a neural network and an encoding scheme in which the rocks in the environment look surprisingly similar, but not identical, to the predators. To change the encoding scheme, return to the Construction Kit and select stimulus code from the System menu. Feel free to use as many nodes as you think are necessary, and document why you designed it the way you did. Train your neural network animal to become really good at avoiding predators. Do NOT train it on any rocks at all. Clear the environment, but leave the rocks, and see if the neural network animal behaves as if it mistakes the rocks for predators. Document your observations and record some data as evidence to show whether your distributed encoding scheme worked.

Part II: Hidden units

The hidden layer is the only layer we have not explored yet. There are certain problems that neural networks CANNOT solve without a hidden layer of neurons (see PDP handbook, pp. 318-322). The perceptron is an example of a feed-forward neural network without a hidden layer, in which the input layer is connected directly to the output layer. A hidden layer offers an additional stage of processing, recoding the stimulus before it is passed to the output units.

In this section, we will experiment to see if changing the size of the hidden layer affects the performance of neural networks trained on the same set of scenarios. And if it does, how does the size of the hidden layer affect the training process? This experiment is more open-ended than the other experiments. You may want to divide into groups, assign people to run experiments with different hidden layer sizes, and later compare notes.

Exercise 6

Design some experiments to explore the functionality of the hidden layers.

Include:

  1. Discuss how the number of nodes in the interneuron layer changes the amount of time required to reach a given error criterion. Does the neural network train better with more nodes or fewer nodes? Make sure you test a neural network with only one neuron in the hidden layer to test one extreme.
  2. Now consider the effect of altering the number of hidden nodes on the network's performance in its environment. Does the neural network perform better with more or fewer nodes? Justify the choice you make for the environment to test the net.

Note: Make sure the design of your experiments is controlled, and enough tests are run to prove that your claim is statistically valid, or that it isn't.

Part III. Translation invariance

Modeling, like this neural network model, is a way of studying complex systems. By extracting key elements and using them to build a less-complex model of the system, we hope to be able to understand the larger, more complex system through a simplified model of it. This also means that certain details (that are deemed insignificant) may be lost in the modeling, therefore making the model less valid in the real world. In this section, we are going to explore a deficit of this model.

Most animals that perceive their environment can recognize objects independently of the objects' locations. This is important because objects tend to move, and mis-identifying an object because it appears in another part of your perceptual field can be problematic (especially if the object is a predator that wants to eat you). In many animals, visual recognition is separated into two tasks or pathways -- one for object identification and one for spatial location. The ability to recognize an object regardless of where it falls in the perceptual field is called translation invariance. In this section, we investigate ways of training the neural network so that it exhibits translation invariance. It turns out that this property is not an easy one for the neural network to learn.

Neural networks are generally good generalizers, but not in this case, as the following examples will show.

  1. Load transinvariance.net. You should see the training kit when it loads.
  2. Make sure the neural network is not trained on anything. If it is, clear its weights by pressing the reset weights buttons in the scenario panel.
  3. Add a scenario to the list to teach the neural network to move away from a predator object that is placed in the left-hand uppermost corner of its perceptual field.
  4. Now do the same to teach the neural network to move away from a predator object that is placed in the right-hand uppermost corner of its perceptual field.
  5. Now also add a scenario to teach the neural network to move toward a food object that is placed right in front (in the center) of its perceptual field.
  6. Train the neural network to about 0.005 error threshold on just these three scenarios.

Exercise 7

What are your predictions on how this neural network will perform with training on just these three scenarios? For example,

  1. How do you think the neural network will react to a food object directly in front of it? What does the neural network do when there is a predator in front of it on either side?
  2. Has the neural network generalized enough to realize that it should move forward toward any food element in its perceptual field, no matter where exactly it is? Run some tests using the above setup and come up with evidence to support your theory.
  3. What does the neural network do when you test it on a stimulus it has never seen before? (rock object and clear object) And why does it behave the way it does when presented with the new stimulus?

Note: you may want to manually move the neural network in the environment to test it on certain stimulus patterns. Click the reposition net button in the Environment Kit. Use it to help you position the neural network so it can receive the stimulus input pattern you want to test. You may also want to use the center button to center the neural network in the environment.

Translation invariance for neural networks?

The way the neural network is built, it has trouble generalizing across object positions. It may be able to discriminate between similar and non-similar objects at a particular location, but it has a difficult time understanding spatial locations of one object relative to another. For example, if it sees an item of food in front of it and reacts by moving toward it, it does not realize that it should also move toward a food item directly in front of it but slightly further away. Fortunately, this is only an artifact of the neural network's architecture, in which each receptor in the input layer literally represents a certain location in the immediate surroundings of the neural network.
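A small illustrative sketch makes the problem concrete: with location-specific receptors, the "same" object at two positions produces input vectors that share no active nodes, so learning at one position says nothing about the other. (The 3-cell field and weight values here are invented for illustration.)

```python
# A 3-cell perceptual field with a localist food code: the same food
# object at two positions activates entirely different input nodes.
food_at_left   = [1, 0, 0]
food_at_center = [0, 1, 0]

# Weights a net might learn for "food on the left -> respond":
weights = [5.0, 0.0, 0.0]

def net_input(stimulus, w):
    """Weighted sum of the stimulus through the weight vector."""
    return sum(s * wi for s, wi in zip(stimulus, w))

print(net_input(food_at_left, weights))    # 5.0: strong response
print(net_input(food_at_center, weights))  # 0.0: no response at all
```

Because the two stimulus vectors do not overlap, the trained weights transfer nothing to the new position, which is the deficit explored in this section.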

Exercise 8

Think of ways of solving this translation invariance problem. You may want to propose ways of altering the architecture of the neural network, or augmenting the way the neural network is being trained, or the way the stimulus is presented, or any other creative approach. Come up with a solution and theoretically integrate it with the current neural network model. You do not have to actually implement your ideas.

Part IV. Ambiguous signals or conflicting stimulus

In this section, we will explore how neural networks deal with ambiguous signals. This is one of the major appeals of neural networks: they can process information too overwhelming or ambiguous for a human observer and still produce a result that is generally in the right direction. Humans, when they can do this, often call it a "hunch" or a "gut feeling". Because of these properties, neural networks have been applied to predicting weather, and even stocks. In our model, however, we are going to experiment with some simple ambiguous signals.

Exercise 9

Design several training scenarios for seeking food and several for avoiding predators. First train your neural network model to seek food. Then train the model to avoid predators. (You should turn to the Environment Kit and toggle predators on in the Environment menu at the top of the screen so that the predators stop moving.)

Make sure you test the neural network on its ability to avoid predators and its ability to approach food independently. You may want to populate the world with predators for the first test, and then test its ability to avoid predators. Do the same for food-oriented behavior.

Now that the neural network model is both good at avoiding predators and seeking food, create a new environment which has both predators and food. Set up the neural network in the environment so it receives both a predator and a food signal. Now what will the neural network do? If you trained the neural network properly, it should have the tendency to run away from the predator, but it should also have the tendency to move toward the piece of food. What actually happens?

Note: You may want to look at the actual normalized values of the neural network output at the bottom of the screen. Each number represents the probability that the neural network will produce a certain motor output at each time step: the higher the activation of an output node, the higher the probability of that motor output being selected.
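The note above can be sketched in code. We assume here that the simulator normalizes the output activations by their sum and then samples one motor action with those probabilities; the program's exact scheme may differ, and the action names are illustrative.

```python
import random

ACTIONS = ["left", "forward", "right"]

def normalize(activations):
    """Scale activations so they sum to 1 (assumed normalization)."""
    total = sum(activations)
    return [a / total for a in activations]

def choose_action(activations):
    """Sample one action in proportion to its normalized activation."""
    r, cum = random.random(), 0.0
    for action, p in zip(ACTIONS, normalize(activations)):
        cum += p
        if r < cum:
            return action
    return ACTIONS[-1]

# A net pulled toward food (forward) but also nudged away by a predator:
outs = [0.2, 0.6, 0.2]
print(normalize(outs))   # forward gets the largest share (0.6)
```

This is why a conflicted net usually, but not always, moves toward the stronger tendency: the weaker output still wins some fraction of the time.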

Can you think of any way of improving the net's performance? Try out your idea and report on the results.

Part V. Hard-to-learn tasks: the XOR problem

The XOR problem is an example of a function neural networks have difficulty learning. The problem is defined by the following pairs of input-output patterns:

    Input   Output
     11       0
     01       1
     10       1
     00       0
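The XOR mapping above can be written as a one-line function of the two input bits:

```python
# XOR: output is 1 exactly when the two input bits differ.
def xor(a, b):
    return (a + b) % 2

print([xor(1, 1), xor(0, 1), xor(1, 0), xor(0, 0)])  # [0, 1, 1, 0]
```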


  1. Load XOR.net. You should see the Training Kit.
  2. If you click on the Construction Kit (don't press Continue), you should see a neural network with 1 receptor, 2 hidden nodes, and 2 output nodes. Press Cancel to return to the Training Kit. Note that the receptor sits at the position of the neural network itself (and, because of the two-bit stimulus code, corresponds to two input nodes).
  3. The session has been set up so that the neural network can demonstrate the difficulty neural networks have in learning how to perform XOR functions.
  4. To represent the XOR problem we have already set up the neural network with:

    Object     Input code   Output action
    Rock           11       0 (forward)
    Food           01       1 (backward)
    Predator       10       1 (backward)
    Clear          00       0 (forward)


If the neural network encounters a food or predator, it should run backward. If it encounters rock or clear, it should run forward.

Exercise 10

Introduction to non-linearly separable problems
(why is XOR so hard?)

If you were to graph these input patterns as points in a two-dimensional space of all possible input patterns, it would look something like this:

[Figure: plot with X's at (0,1) and (1,0), and O's at (0,0) and (1,1)]

Notice that the set of input points is divided into two groups (represented by x's and o's), because the points belong to two different output classes: 0 and 1.

What a neural network strives to do is find a linear function that separates these two groups of input-output patterns. This is not always possible: as you can see in the graph, no linear function can separate these sets of points.

For example, in this graph of a linearly separable problem, the data points (represented by the x's and o's) can be separated by type by a linear function, which in this case is a straight line (of the form y = mx + b). (Strictly speaking, a line with nonzero intercept b is affine rather than linear, but for separating points any straight line will do.) A linear function f(x) satisfies:

f(cx) = c f(x) where c is a real number

f(x1 + x2) = f(x1) + f(x2)

Note that x can be an n-dimensional vector.

Basically, the graph of a linear function is a flat surface: for a function of one variable it is a line, for two variables a plane, and for three or more variables a hyperplane.

[Figure: plot with a separating line running approximately from (0,0) to (1,1)]
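The two defining properties can be checked directly. In this small sketch, f(x) = 2x passes both tests, while g(x) = 2x + 1 (a straight line, but with a nonzero intercept) fails them:

```python
# Linearity checks for f(x) = 2x versus the affine g(x) = 2x + 1.
f = lambda x: 2 * x
g = lambda x: 2 * x + 1

# Homogeneity: f(cx) = c f(x)
assert f(3 * 5) == 3 * f(5)
assert g(3 * 5) != 3 * g(5)      # 31 != 33

# Additivity: f(x1 + x2) = f(x1) + f(x2)
assert f(2 + 4) == f(2) + f(4)
assert g(2 + 4) != g(2) + g(4)   # 13 != 14
```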

So now the XOR problem becomes a question of linear-separability; that is, whether the data is separable by some linear function. Two-layered feedforward neural networks cannot solve the XOR task because there is no linear function that separates the 1 outputs from the 0 outputs. (Try to draw some sort of linear function that separates the two sets of points in the above XOR graph.) The XOR problem is a non-linearly separable problem. It requires a non-linear solution.

Three-layer feedforward neural networks have the capability to represent non-linear functions and can perform tasks such as the XOR task. This capability arises from the extra layer of hidden neurons that allow the neural network to internally create a model of the input patterns presented to it, and thus recode the stimulus so it becomes a linearly separable problem. Crucially, a non-linear squashing function (sigmoid function) is implemented at each layer of activation in the network to provide a source of non-linearity.
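To see the recoding idea concretely, here is a hand-built sketch (the weights and thresholds are chosen by hand, not learned, and a hard threshold stands in for the sigmoid for readability) of a two-hidden-unit network that solves XOR:

```python
def step(x):
    """Hard threshold standing in for the sigmoid squashing function."""
    return 1 if x > 0 else 0

def xor_net(a, b):
    h_or  = step(a + b - 0.5)     # hidden unit 1 computes a OR b
    h_and = step(a + b - 1.5)     # hidden unit 2 computes a AND b
    # In (h_or, h_and) space the four patterns ARE linearly separable:
    # output fires when h_or = 1 and h_and = 0.
    return step(h_or - h_and - 0.5)

for a in (0, 1):
    for b in (0, 1):
        print(a, b, "->", xor_net(a, b))
```

The hidden layer maps the four input patterns onto points that a single straight line in hidden-unit space can separate, which is exactly the recoding described above.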

Reading Assignment 4

Exercise 11

In the Training Kit, double-click with the mouse pointer on each of the scenarios already provided and verify that all of the XOR cases have been incorporated into them.

Make sure the error threshold is set to 0.005 and the learning rate to 0.05.

Press train to train the neural network.

  1. Do you observe any difference in how fast the neural network is learning in this particular problem? (Take note of the error level of the network and how quickly or slowly it is decreasing.)
  2. How many epochs of training do you need to reach the minimum error threshold of 0.005? (hint: set the max trainings to a high number...e.g. 30000) Right now, the best idea would be to stretch, grab some coffee, do the readings, or take a nap while the computer is working hard.
  3. During some point in training, stop and open up the Environment Kit. Test your net to see if it has developed any semblance of the XOR problem. Go to the Environment menu at the top of the screen (you may have to click on the main menu window) and select world and select edit world.
    A small window should open up at the bottom of the screen. You can now add or delete objects in the environment world itself with your mouse.
    Clear the world with the clear world button, and put objects directly on top of the neural network (that is where its one receptor is) and start the simulation with one step. Is it behaving, or starting to behave the way it should be if it had learned XOR?
  4. At some point during the training, you may want to start gathering some data. Go to the Training menu at the top of the screen and select data options. A small window should appear. Select epochs and training set error.
    Leave the others unchecked. Close the window. Go back to the Training menu and select record data. The program will ask you where to save it. Make sure the extension of the file is .dat .
  5. This will significantly slow down your training process (but go stretch, get coffee, etc.). Record for approximately 20000 epochs worth of data and stop recording. Then continue your training until the error reaches 0.005.
  6. Open up the data file in a text editor and cut and paste the data into an appropriate statistical analysis package. Graph the data with the number of epochs at the x-axis and the error levels at the y-axis. Describe and explain what you see.

Exercise 12

Now that you are done with the XOR net, try the AND net. Go back to the Training Kit and redefine the scenarios in the following manner:

    Object     Input code   Output action
    Rock           11       1 (backward)
    Food           01       0 (forward)
    Predator       10       0 (forward)
    Clear          00       0 (forward)


How does the network learn this task, as compared to the XOR net? Graph the data points in two-dimensional space of all possible input patterns. Is the AND problem linearly-separable? If so, draw some linear function on the graph to separate the input patterns. What do you think will happen if you train a network with only one hidden node on the AND problem? Test your prediction.
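Because AND is linearly separable, even a network with no hidden layer can learn it. Here is a minimal sketch of the classic perceptron learning rule on the AND patterns (illustrative only; this is not the module's own training algorithm, which uses error backpropagation):

```python
# Perceptron learning on AND: inputs (a, b), target a AND b.
data = [((1, 1), 1), ((0, 1), 0), ((1, 0), 0), ((0, 0), 0)]
w, b, lr = [0.0, 0.0], 0.0, 0.1

def predict(x):
    """Threshold unit: fire if the weighted sum exceeds zero."""
    return 1 if w[0] * x[0] + w[1] * x[1] + b > 0 else 0

for _ in range(20):                  # a few passes are plenty for AND
    for x, target in data:
        err = target - predict(x)    # 0 when correct, +/-1 otherwise
        w[0] += lr * err * x[0]
        w[1] += lr * err * x[1]
        b    += lr * err

print([predict(x) for x, _ in data])   # [1, 0, 0, 0]
```

The same loop never converges on the XOR targets, no matter how long it runs, because no single weighted threshold separates them.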

Section III. Implementation details of the organism

So far, you have seen the neural network implemented as the brain of a virtual animal and have observed its behaviors in different circumstances. You have also been introduced to the idea of neural networks -- what they can do and what they cannot. In this section, we will explain how a neural network is actually represented in computer memory, and you will have the opportunity to simulate the computations a computer performs in a neural network simulation.

How is the neural network model represented in computer memory?

The neural network model is actually represented by matrices of numbers, which are the values of the connections between the layers of neurons. In this example, there are two layers of connections, so there are actually two layers of weights, one from the input layer to the hidden layer, and one from the hidden layer to the output layer.

[Figure: neural net with four input nodes, two hidden nodes, and three output nodes (the same network as in the previous node pictures)]

Each connection, or weight, is a number which represents the strength of the connection between the neurons. A high, positive number would denote a strong, excitatory connection between a pair of neurons. A negative number represents an inhibitory connection between a pair of neurons.

Basically, for each neuron, all the input coming in from other neurons is adjusted by the connection-weights and summed to produce a net input for that neuron. A squashing or sigmoid function (explained in Section II) is applied to the input, resulting in the amount of activation for that neuron. This activation is passed through the weights between this neuron and the neurons in the next layer to produce activations in them.

[Figure: three circular input nodes with arrows labelled w1, w2, w3 pointing to a large node labelled 'neuron', which has an arrow leading away labelled w4; the sigmoid 'squashing' function is displayed below]

For example, to figure out the output activation for the neuron shown here, we would multiply each of the input activations from the input neurons by their weight connection values:

input 1 * w1 = weighted activation 1

input 2 * w2 = weighted activation 2

input 3 * w3 = weighted activation 3

Then these weighted activations from each input neuron are summed to produce the net input, which passes through the squashing function to produce an activation from 0.0 to 1.0. This number is the activation of the current neuron.

net input = weighted activation 1 + weighted activation 2 + weighted activation 3

activation of current neuron = Sigmoid( net input )

Finally, if this neuron were to pass this signal to another neuron through weight connection w4, the activation of the receiving neuron would be calculated the same way as described above (multiply current activation by w4 to produce weighted activation sent to receiving neuron).
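The calculation just described can be written out in a few lines of Python for the three-input neuron in the figure (the weight and input values here are invented for illustration):

```python
import math

def sigmoid(x):
    """The squashing function: maps any net input into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

inputs = [1.0, 0.0, 1.0]
w1, w2, w3, w4 = 0.8, -0.4, 0.3, -1.2

# Weighted activations summed into the net input:
net_in = inputs[0] * w1 + inputs[1] * w2 + inputs[2] * w3   # 1.1

activation = sigmoid(net_in)       # about 0.75, always between 0 and 1
signal_to_next = activation * w4   # what travels on through w4

print(round(activation, 3))
```

The same two steps (weighted sum, then sigmoid) are repeated at every layer, which is exactly the hand calculation Exercise 13 asks for.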

Part I. Manual Run-through of Model

Exercise 13

In this exercise, you will actually see the numbers that represent the connections (weights) between each of the neurons in the net. We are going to use the weights from the XOR net, since it is such a small net, and we are going to manually run through the calculations that occur in one or two of the scenarios from the XOR problem.

  1. In your XOR.net, go to the Training menu and select save weights. It should ask you where to save your file. Name it with a .dat extension.
  2. Open the file in a text editor. You should see two matrices of numbers. The top matrix represents connections from the input neurons (receptors) to the hidden layer of neurons. The bottom matrix represents the connections from the hidden layer of neurons to the output units. Rows correspond to the nodes the connection is leaving, and columns correspond to the nodes the connection is going to. The Saved Weights key will further help you understand the weight represented by each of these numbers.
  3. Select an object from the environment, and run its stimulus code (11, 01, 10, or 00) through the first matrix to produce hidden layer activations. You may want to sketch the network out on paper first (use the sketch in the Construction Kit to help). Remember that although the net has one receptor in its perceptual field, that receptor corresponds to two input nodes, because each object is encoded with two bits (11, 01, 10, or 00).
  4. Now use the hidden layer activations that you calculated in Step 3 to run the second matrix to produce output activations. This is your output. Make sure you use the sigmoid function at each layer.
Sigmoid formula: y(x) = 1 / (1 + exp(-x)), where x = net input

  5. In the program, the simulation normalizes the final output activations and probabilistically selects one as the motor output.

Final Project

So far you have been experimenting with different aspects of neural networks: trying to get your model to perform certain tasks, or trying to elucidate the mechanisms behind the behaviors your model produced. This module has tried to give you a general overview of neural network mechanisms and their use in the world. Along the way, we hope that you have picked up insights into why neural networks behave the way they do, and how they resemble and differ from biological neural networks.

In this section, your goal is to build and train a neural network model capable of surviving in an environment filled with randomly positioned food elements (which the network needs to sustain itself) and predators (which it must avoid). The fitness of your neural network will be measured in health points. If the number of health points drops below zero, your neural network has been incapacitated.

You may need to play with the cost function to find the balance of values for each object in the world. For example, walking over a rock might cause your neural network to lose 1000 health points. That is probably too steep, so you may want to adjust that value lower. You access the cost function through the Environment menu at the top of the screen. Select world and then cost function. A window should pop up displaying current damage values for each of the objects in the environment, and the cost of maintaining each neuron in the network for each timestep.

Of course, don't cheat by setting the damage by rocks or predators to zero. But as you are building your model, you are allowed to "tinker" with the physics of the environment.

Note: Don't forget, the more neurons you add to your neural network, the more energy your neural network will consume per time unit. (This value can also be adjusted in the cost function menu.)
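The bookkeeping described above can be sketched as follows. Note that the object names, damage values, and per-neuron upkeep below are illustrative assumptions, not the simulator's actual defaults:

```python
# Hypothetical per-contact health changes; food heals (negative damage).
damage = {"rock": 10, "predator": 50, "food": -20, "clear": 0}
upkeep_per_neuron = 0.5   # health paid per neuron, per timestep (assumed)

def step_health(health, object_under_net, n_neurons):
    health -= damage[object_under_net]        # cost (or gain) from contact
    health -= upkeep_per_neuron * n_neurons   # metabolic cost of the brain
    return health

# A 10-neuron net that eats food this timestep: 100 + 20 - 5 = 115
print(step_health(100, "food", 10))
```

This makes the complexity tradeoff from the Introduction concrete: every extra neuron raises the per-timestep upkeep, so a bigger brain must earn its keep by finding food more reliably.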

However...

you may just want to concentrate on one extreme aspect of survival. For example, you may want to build a neural network model that is capable of escaping from a mob of predators. Or turn the tables and build a carnivorous neural network model that eats the predators instead. You may want to explore models that can follow some sort of path in an obstacle course. Or see whether you can turn your neural network into something plant-like that just sits there and regenerates its own health, but occasionally needs to run from a marauding predator. There are many possibilities.

Whatever you choose, you should explain the reasoning behind your design/training/environment decisions in your final report.

You should turn in a final report describing, briefly, what you found in the earlier experiments, in addition to what you discovered with your own neural network. As much as possible you should follow the usual format for a scientific report (Introduction, Methods, Results, Discussion). When you mention the earlier experiments, you should be clear which experiment you are describing. You should include enough details in the latter part so that an interested reader could re-create your neural network organism, do the same procedures you did, and attempt to replicate the results you saw.

It would be useful to try to explain why your neural network organism behaves as it does. How did the particular training regimen affect its behavior? Can you give meaningful interpretations to the weights of the connections in the neural network? Why is the network succeeding/failing in the environment you chose (because of the weights in the network? because of the training regimen that established those weights?). Do you find similarities or differences from biological organisms? (These are just suggestions. You may think of other interesting questions to ask.)

You should include at least one statistical test addressing a hypothesis about the behavior of your organism.

Because even small neural networks can show complicated behaviors, it will likely be useful to focus on particular aspects that seem interesting.

Grades will be enhanced by demonstrating understanding rather than just repeating the readings.

Class Demonstration

Because each of you has been working on your own project, you may be curious about the organisms created by your classmates.

So, time will be set aside near the end of the module for you to demonstrate the features of your neural network organism, and how it functions. You should be prepared to answer questions about it.

Good luck!





Bibliography

Emmeche, Claus. The Garden in the Machine: The Emerging Science of Artificial Life. Princeton, NJ: Princeton University Press, 1994.

Levy, Steven. Artificial Life: The Quest for a New Creation. New York: Pantheon Books, 1992.

McClelland, J.L., Rumelhart, D.E., and Hinton, G.E. "The Appeal of Parallel Distributed Processing," in Rumelhart, D.E. and McClelland, J.L. (eds.), Parallel Distributed Processing: Explorations in the Microstructure of Cognition, Vol. 1: Foundations. Cambridge, MA: MIT Press, 1986, pp. 3-31.

Rumelhart, D.E., Hinton, G.E., and Williams, R.J. "Learning Internal Representations by Error Propagation," in Collins, A. and Smith, E.E. (eds.), Readings in Cognitive Science: A Perspective from Psychology and Artificial Intelligence. San Mateo, CA: Morgan Kaufmann, 1988, pp. 399-407.

Sipper, M. "Fifty Years of Research on Self-Replication: An Overview." Artificial Life, 4 (1998), 237-257.


World Wide Web Sites and Links

Introduction to neural networks

http://www-dse.doc.ic.ac.uk/~nd/surprise_96/journal/vol4/cs11/report.html

Artificial neural network applications

http://www.bio-comp.com/

http://www.cai.com/neugents/

http://www.neuralware.com/

http://www.pegasustec.com/

A football play-predictor based on neural networks:
http://controls.ame.nd.edu/football/

Artificial life demos

http://alife.fusebox.com/

Artificial life research sites and links

http://www.alife.org/

http://www.alcyone.com/max/links/alife.html

http://www.cogs.susx.ac.uk/lab/adapt/index.html


Last revised: November 8, 2004

This page maintained by Doug Elrod