Projection Methods
There are numerous ways to project high-dimensional data to lower dimensions. The projection menu in the projection component window lets one switch between projections and thereby compare how neural network data look under each of them.
Coordinate Projection
Coordinate projection is perhaps the simplest possible projection technique. If one has a list of datapoints with 40 components each, coordinate projection to two dimensions simply ignores all but two of these components, which are then used to display the data in two-space.
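As a minimal sketch of the idea, the following Python snippet keeps two chosen components of each datapoint and drops the rest. The function name `coordinate_projection` and the index parameters `i` and `j` are illustrative assumptions, not names taken from Simbrain.

```python
def coordinate_projection(points, i=0, j=1):
    """Keep only components i and j of each point; ignore all others."""
    return [(p[i], p[j]) for p in points]


# Two hypothetical datapoints with four components each.
data = [
    [1.0, 2.0, 3.0, 4.0],
    [5.0, 6.0, 7.0, 8.0],
]

projected = coordinate_projection(data, i=0, j=2)
print(projected)  # [(1.0, 3.0), (5.0, 7.0)]
```

Which two components are kept matters a great deal: information carried only by the ignored components is lost entirely.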
Principal Component Analysis (PCA)
PCA builds on coordinate projection by making use of the "principal axes" of the dataset. The principal axes of an object are the directions in space about which the object is most balanced or evenly spaced. PCA selects the two principal axes along which the dataset is the most spread out and projects the data onto these two axes.
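The following sketch shows one common way to compute such a projection with NumPy: center the data, take the eigenvectors of the covariance matrix as the principal axes, and project onto the two axes with the largest variance. The function name `pca_project` and the synthetic data are assumptions made for illustration; Simbrain's own implementation may differ in detail.

```python
import numpy as np


def pca_project(data, n_components=2):
    """Project data onto the principal axes with the largest variance."""
    data = np.asarray(data, dtype=float)
    centered = data - data.mean(axis=0)
    # Covariance matrix of the components (columns are variables).
    cov = np.cov(centered, rowvar=False)
    # eigh returns eigenvalues in ascending order for symmetric matrices.
    eigvals, eigvecs = np.linalg.eigh(cov)
    order = np.argsort(eigvals)[::-1][:n_components]
    axes = eigvecs[:, order]  # columns are the selected principal axes
    return centered @ axes


# Synthetic 5-component data, spread out most along the first two components.
rng = np.random.default_rng(0)
data = rng.normal(size=(100, 5)) * np.array([10.0, 5.0, 1.0, 0.5, 0.1])

projected = pca_project(data)
print(projected.shape)  # (100, 2)
```

The first projected coordinate always has at least as much variance as the second, since the axes are ordered by how spread out the data are along them.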
Sammon Map
The Sammon map is an iterative technique for making interpoint distances in the low-dimensional projection as close as possible to the interpoint distances in the high-dimensional object. Two points close together in the high-dimensional space should appear close together in the projection, while two points far apart in the high-dimensional space should appear far apart in the projection. By minimizing an error function between the high- and low-dimensional sets of interpoint distances, the Sammon map does its best to preserve these distances in the projection. This iterative procedure can be watched in the projection component window by loading a dataset and pressing the "play" button on the interface.
Note: Before the Sammon map is used, it is useful to Randomize the data points, as overlapping points cause the algorithm to blow up. One would typically run the Sammon map after the data points have first been projected by PCA or coordinate projection.
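To make the iterative procedure concrete, here is a hedged sketch that minimizes the standard Sammon stress, E = (1/c) Σ (d*ij − dij)² / d*ij, where d*ij are the high-dimensional distances, dij the projected distances, and c the sum of all d*ij. For simplicity it uses plain gradient descent with a step size that shrinks whenever a step fails to reduce the error; this is an assumption for illustration and not necessarily the update rule Simbrain uses. All function names are hypothetical. Note that the error divides by interpoint distances, which is why overlapping points are a problem.

```python
import numpy as np


def pairwise_distances(x):
    """Matrix of Euclidean distances between all rows of x."""
    diff = x[:, None, :] - x[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))


def sammon_stress(d_high, low):
    """Sammon error between high-dimensional and projected distances."""
    iu = np.triu_indices(len(low), k=1)  # each pair counted once
    dh = d_high[iu]
    dl = pairwise_distances(low)[iu]
    return float(((dh - dl) ** 2 / dh).sum() / dh.sum())


def sammon_gradient(d_high, low):
    """Gradient of the Sammon error with respect to the projected points."""
    n = len(low)
    dh = d_high.copy()
    dl = pairwise_distances(low)
    # Put 1 on the diagonals so the i == j terms are 0 instead of 0/0.
    np.fill_diagonal(dh, 1.0)
    np.fill_diagonal(dl, 1.0)
    weights = (dh - dl) / (dh * np.maximum(dl, 1e-12))
    c = d_high[np.triu_indices(n, k=1)].sum()
    return (-2.0 / c) * (weights.sum(axis=1)[:, None] * low - weights @ low)


def sammon_map(data, n_iter=300, step=1.0, seed=0):
    """Iteratively adjust a random 2-D layout to reduce the Sammon error."""
    data = np.asarray(data, dtype=float)
    d_high = pairwise_distances(data)
    rng = np.random.default_rng(seed)
    low = rng.normal(size=(len(data), 2))  # randomized starting positions
    stress = sammon_stress(d_high, low)
    for _ in range(n_iter):
        candidate = low - step * sammon_gradient(d_high, low)
        new_stress = sammon_stress(d_high, candidate)
        if new_stress < stress:
            low, stress = candidate, new_stress
            step *= 1.2  # step worked: try a slightly larger one next time
        else:
            step *= 0.5  # step overshot: retry with a smaller one
    return low, stress


rng = np.random.default_rng(1)
data = rng.normal(size=(20, 5))  # 20 distinct points with 5 components
low, stress = sammon_map(data)
print(low.shape, stress)
```

Each pass of the loop corresponds to one visible update of the projection when the iteration is played in the interface.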
Adding Points
In some cases it is useful to be able to add new points to an existing dataset without running the projection method on the whole dataset again. Methods exist for quickly adding new data points based on data that have already been projected. These methods work best when a certain amount of data has already been collected and projected using, for example, PCA or the Sammon map. Note that most uses of Simbrain will rarely need these methods.
Nearest Neighbor Subspace Method
(1) Takes each new point and determines the three points in the current data set that are closest to it.
(2) Finds the projection of the new point into the two-dimensional subspace that contains the three nearest neighbors in the high-dimensional space.
(3) Uses the three nearest neighbors and their corresponding points in the low-dimensional dataset to find an affine map that approximates the full projection method (whichever one is currently being used).
(4) Applies the affine map to the new datapoint.
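The four steps above can be sketched as follows. This is one possible reading, under the assumption that the affine map is the unique one sending the three neighbors to their projected images: the new point is expressed in coordinates (s, t) relative to the plane through the neighbors, and the same coordinates are then reused in the low-dimensional space. The function name `nn_subspace_add` and the example data are hypothetical.

```python
import numpy as np


def nn_subspace_add(new_point, high, low):
    """Place a new point using its three nearest already-projected neighbors."""
    high = np.asarray(high, dtype=float)
    low = np.asarray(low, dtype=float)
    x = np.asarray(new_point, dtype=float)

    # (1) Three nearest neighbors in the high-dimensional space.
    idx = np.argsort(np.linalg.norm(high - x, axis=1))[:3]
    a, b, c = high[idx]
    a2, b2, c2 = low[idx]

    # (2) Orthogonal projection of x onto the plane through a, b, c,
    #     expressed as a + s*(b - a) + t*(c - a) via least squares.
    basis = np.column_stack([b - a, c - a])
    coeffs, *_ = np.linalg.lstsq(basis, x - a, rcond=None)
    s, t = coeffs

    # (3) + (4) The affine map sending a, b, c to a2, b2, c2, applied
    #     to the projected point.
    return a2 + s * (b2 - a2) + t * (c2 - a2)


# Hypothetical dataset: four 3-component points and their 2-D projections.
high = [[0, 0, 0], [1, 0, 0], [0, 1, 0], [5, 5, 0]]
low = [[0, 0], [1, 0], [0, 1], [5, 5]]

placed = nn_subspace_add([0.25, 0.25, 3.0], high, low)
print(placed)  # approximately [0.25, 0.25]
```

If the three neighbors happen to be collinear, the least-squares solve still returns a result, but the plane is not uniquely defined and the placement should be treated with caution.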
Triangulate
The Triangulate method takes each new point and determines which two points in the current data set are closest to it. Then, if possible, it places the projected image of the new point so that its distances from the projected images of its two nearest neighbors are the same as they were in the high-dimensional space. When it is not possible to project the point such that both distances are preserved, the projected image of the new point is placed on the line connecting the projected images of its two nearest neighbors. In this case the position of the new point's image on this line is determined by the relative sizes of the distances between the new point and its two nearest neighbors in the current data set.
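A hedged sketch of this procedure: finding a point at prescribed distances from two fixed points amounts to intersecting two circles, which is possible only when the distances satisfy the triangle inequality. Two details below are assumptions, since the description above does not fix them: of the two mirror-image intersection points, the sketch arbitrarily picks one, and in the fallback case it places the point at the fraction d1 / (d1 + d2) of the way from the first neighbor's image to the second's. The function name `triangulate_add` is hypothetical.

```python
import math


def triangulate_add(new_point, high, low):
    """Place a new point using its two nearest already-projected neighbors."""
    # Two nearest neighbors in the high-dimensional space.
    order = sorted(range(len(high)), key=lambda k: math.dist(new_point, high[k]))
    i, j = order[0], order[1]
    d1 = math.dist(new_point, high[i])
    d2 = math.dist(new_point, high[j])

    (ax, ay), (bx, by) = low[i], low[j]
    base = math.dist(low[i], low[j])  # distance between the two images

    # Both distances can be preserved only if the two circles intersect.
    if base > 0 and abs(d1 - d2) <= base <= d1 + d2:
        along = (d1 ** 2 - d2 ** 2 + base ** 2) / (2 * base)
        height = math.sqrt(max(d1 ** 2 - along ** 2, 0.0))
        ex, ey = (bx - ax) / base, (by - ay) / base  # unit vector a -> b
        # Assumption: choose the intersection on one fixed side of the line.
        return (ax + along * ex - height * ey, ay + along * ey + height * ex)

    # Fallback (assumed rule): interpolate along the line between the images.
    total = d1 + d2
    frac = d1 / total if total > 0 else 0.5
    return (ax + frac * (bx - ax), ay + frac * (by - ay))


# Hypothetical dataset: three 3-component points and their 2-D projections.
high = [(0, 0, 0), (4, 0, 0), (100, 100, 100)]
low = [(0.0, 0.0), (4.0, 0.0), (50.0, 50.0)]

p = triangulate_add((2, 0, 3), high, low)
print(p)  # approximately (2, 3); both neighbor distances are preserved
```

In the example the new point lies at distance √13 from both neighbors, and the placed image is at that same distance from both of their projections.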