Conditioning using TDSynapse

This simulation demonstrates the use of a TD synapse. Use the following steps to explore this simulation:
1. Clear the workspace (File > Clear Workspace).
2. Open the workspace file named TDSynapse.sim located in simulations/sims/conditioning (File > Open Workspace). The following network should open in your workspace:

In this network, the five horizontally arranged neurons are the input neurons. They are connected to a single output neuron, which will learn to predict the reward. The single neuron placed to the right of the output neuron provides the reward signal for training. The following Data World should also open in your workspace:

This table has six columns. The first five columns provide the five inputs to the network, and the sixth column provides the reward value. Each row of the table supplies a state vector to the network, with successive rows corresponding to successive time steps. Note that only the 5th time step (5th row) has a non-zero reward value. The network must learn to predict the occurrence of this reward at earlier time steps.
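The exact values in the table are not reproduced here, but a layout consistent with the description above would look something like the following (a hypothetical illustration, which assumes each input is active only on its own time step):

    Row (time step)   Input1  Input2  Input3  Input4  Input5  Reward
    1                 1       0       0       0       0       0
    2                 0       1       0       0       0       0
    3                 0       0       1       0       0       0
    4                 0       0       0       1       0       0
    5                 0       0       0       0       1       1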
3. In the Data World, turn on iteration mode (Edit > Iteration Mode).
4. Make sure that all the neurons and weights in the network are cleared (set to zero). If they are not, select the network by clicking in the network window, then press "n" followed by "c" to clear the neurons, and "w" followed by "c" to clear the weights.
5. Double click on one of the gray colored synapses (these are the TD synapses) to explore its properties. Make sure that you understand the meaning of these properties (refer to the TDSynapse documentation for details). A sketch of the learning rule these properties control is given after these steps.
6. Now you are ready to train the network. Press the Network Update button once. This feeds the first row of the data table to the network; the network should not produce any output. Press the Network Update button again to feed in the second row. Continue in this manner for a total of five presses. On the fifth press, the synapse going from the 4th input neuron to the output neuron turns red. Double click on this synapse to check its value; it should be 1, which means the network has learned to predict the reward one time step ahead of the actual reward delivery. If you go through five more Network Update cycles, the synapse from the 3rd input neuron to the output neuron turns red, meaning the network has learned to predict the reward two time steps ahead of its delivery. If you continue to iterate in this manner, the network learns to predict the reward earlier and earlier, until it starts predicting the reward at the very first time step, as the sketch below illustrates.
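To make these training dynamics concrete, here is a minimal Python sketch (not Simbrain source code) of a TD-style update rule consistent with the behaviour described above: at each time step a TD error, delta = r + gamma * V(t) - V(t-1), is computed and used to adjust the weights of the inputs that were active on the previous step. The one-hot input pattern, the per-pass reset of the previous state, and the parameter values (a learning rate and discount of 1) are all assumptions chosen for illustration, not values read from the .sim file.

    # Minimal sketch of a TD-style synapse update (assumed rule, not Simbrain source).
    import numpy as np

    alpha, gamma = 1.0, 1.0            # learning rate and reward discount (assumed values)
    w = np.zeros(5)                    # one weight per input neuron

    # Assumed Data World contents: input i is active on time step i (one-hot rows),
    # and a reward of 1 arrives only on the 5th time step.
    inputs = np.eye(5)
    rewards = np.array([0.0, 0.0, 0.0, 0.0, 1.0])

    for trial in range(5):                        # repeated passes through the 5 rows
        prev_x = np.zeros(5)                      # treat each pass as a fresh trial (assumption)
        for x, r in zip(inputs, rewards):
            v_prev = w @ prev_x                   # prediction made on the previous step
            v_now = w @ x                         # prediction on the current step
            delta = r + gamma * v_now - v_prev    # TD error
            w += alpha * delta * prev_x           # credit the inputs active one step earlier
            prev_x = x
        print(f"pass {trial + 1}: weights = {np.round(w, 2)}")

Under these assumptions, the weight from the 4th input reaches 1 after the first pass (the fifth network update), and each subsequent pass pushes the reward prediction one time step earlier, mirroring the way the red synapse moves from the 4th input toward the 1st input in step 6 above.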