This window pops up as soon as OpendTect has created or retrieved the train and test sets. To create a training set, OpendTect must compute all selected attributes at all picked locations. This may take some time. The user is notified when the data has been collected. Acknowledging the notification automatically starts the training phase.
Training can be stopped and restarted with the Pause (Resume) button. Clear randomizes the weights of the current network, which means that all training results are lost. The Clear option can be used when overfitting (see below) has occurred and you wish to restart training from scratch. The randomized network is then re-trained, and training is stopped before overfitting occurs. This can be done manually (by pressing Pause) or by specifying a number in the number of training vectors field.
In supervised mode, the network's performance is tracked during training in two graphs: Normalized RMS and % Misclassification. The normalized RMS error (see below) curves indicate the overall error on the train and test sets, in red and blue respectively, on a scale from 0 (no error) to 1 (maximum error). Both curves should go down during training. When the test curve goes up again, the network is overfitting. Training should be stopped when (preferably before) this happens. Typically an RMS value in the 0.8 range is considered reasonable, between 0.8 and 0.6 is good, between 0.6 and 0.4 is excellent and below 0.4 is perfect. The normalized error is calculated as follows:
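As an illustration, a normalized RMS error can be sketched in Python as the RMS difference between target and predicted values, divided by the RMS deviation of the targets from their mean. This normalization is an assumption made for the example and may differ from the one OpendTect applies; the function name normalized_rms is hypothetical.

```python
import math


def normalized_rms(targets, predictions):
    """RMS prediction error divided by the RMS spread of the targets.

    Assumed normalization for illustration only; not taken from OpendTect.
    """
    if not targets or len(targets) != len(predictions):
        raise ValueError("targets and predictions must be non-empty and of equal length")
    n = len(targets)
    mean_target = sum(targets) / n
    # RMS difference between the target and the predicted values.
    rms_error = math.sqrt(sum((t - p) ** 2 for t, p in zip(targets, predictions)) / n)
    # RMS deviation of the targets from their own mean.
    rms_spread = math.sqrt(sum((t - mean_target) ** 2 for t in targets) / n)
    if rms_spread == 0.0:
        raise ValueError("targets are constant; the normalization is undefined")
    return rms_error / rms_spread
```

Under this assumed definition a perfect prediction gives 0, and a network that always outputs the mean target value gives 1.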
The percentage misclassification shown in the lower left corner is a much easier quality control parameter to interpret. It simply shows the percentage of the training and test sets that is classified in the wrong class.
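The misclassification percentage can be sketched in a few lines of Python. The function name percent_misclassified and the use of plain lists of class labels are assumptions made for the example.

```python
def percent_misclassified(true_classes, predicted_classes):
    """Percentage of examples whose predicted class differs from the true class."""
    if not true_classes or len(true_classes) != len(predicted_classes):
        raise ValueError("class lists must be non-empty and of equal length")
    # Count the examples that ended up in the wrong class.
    wrong = sum(1 for t, p in zip(true_classes, predicted_classes) if t != p)
    return 100.0 * wrong / len(true_classes)
```

For example, one wrong prediction out of four examples gives 25%. The same function would be applied separately to the train set and to the test set.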
On the right-hand side of the window a graphical representation of the input attributes is shown. The circle in front of each attribute name changes color during training. The colors reflect the weights attached to each input node and are therefore indicative of the relative importance of each attribute for the classification task at hand. Colors range from red (large weights, high importance) via yellow to white (relatively small weights, less important). This feature is very useful when you wish to design small networks to increase processing speed.
Optionally, the neural network can be stored immediately. To do this, first enter a neural network name in the appropriate field at the bottom of the NN training window, then press the OK button.
The Save misclassified toggle allows saving the misclassified picks in a new Pickset. This Pickset is automatically loaded in OpendTect again. The Pickset can be indicative of picking errors. It is not recommended to bluntly remove the misclassified picks from a Pickset, since good picks, although misclassified during training, still help neural network training.
The supervised training window for well data is very similar to the training window for a Pickset (see above). The only difference is that a scatter plot is displayed instead of a % Misclassification plot.
The scatter plot shows the actual target data on the horizontal axis and the target data predicted by the neural network, as it is at that moment, on the vertical axis. Not all data points are plotted. Only a random selection of the used train and test data is shown. Ideally, after sufficient training, all data points should lie on the diagonal. That would mean that the trained neural network predicted all examples correctly. However, this will rarely be the case. In most cases, the data will cluster along the diagonal. The narrower this cloud, the better the neural network is trained.
Overtraining (overfitting) occurs when the Normalized RMS of the test set increases while the Normalized RMS of the train set decreases. This usually also means that the cloud of train points becomes narrower, while the cloud of test points becomes wider again.
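The stopping rule described above can be sketched as a search for the point where the test error was lowest before it started to rise. This is an illustration only: the function best_stopping_index and its patience parameter are hypothetical and do not correspond to any OpendTect setting.

```python
def best_stopping_index(test_rms, patience=3):
    """Index of the lowest test error seen before it failed to improve
    for `patience` consecutive steps (or the overall lowest if it never did)."""
    if not test_rms:
        raise ValueError("test_rms must not be empty")
    if patience < 1:
        raise ValueError("patience must be at least 1")
    best_idx = 0
    for i, value in enumerate(test_rms):
        if value < test_rms[best_idx]:
            # The test error is still improving; remember this step.
            best_idx = i
        elif i - best_idx >= patience:
            # No improvement for `patience` steps: treat this as overtraining.
            break
    return best_idx
```

With a recorded test curve of 0.9, 0.7, 0.6, 0.55, 0.58, 0.6, 0.65 the function returns index 3, the step with error 0.55, after which the test error only increased.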
In unsupervised mode, the network performance is tracked in a graph that shows the average match (confidence) of the clustered input. Typically the average match increases stepwise, like a step function. Each step indicates that the network has found a new cluster. Training can be stopped as soon as the average match has stabilized. Usually this will be around 90%.
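A simple way to express "has reached a stable situation" is to check that the most recent average match values barely change. The sketch below assumes the values are available as a list of percentages; the function has_stabilized and its window and tolerance parameters are hypothetical choices made for the example.

```python
def has_stabilized(average_match, window=5, tolerance=0.5):
    """True when the last `window` values differ by at most `tolerance`
    percentage points."""
    if len(average_match) < window:
        # Not enough history yet to judge stability.
        return False
    recent = average_match[-window:]
    return max(recent) - min(recent) <= tolerance
```

Because the curve rises in steps, a flat stretch between two steps can also pass this check. A longer window, or an additional requirement that the value lies near the expected final level, reduces the chance of stopping too early.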
The colors of the input nodes in an unsupervised network will also change during training. In unsupervised mode these colors do not indicate that one attribute is more important than another. All attributes in a clustering experiment are equally important.
Optionally, the neural network can be stored immediately by pressing OK. To do this, first enter a neural network name in the appropriate field at the bottom of the NN training window.