Laboratory work #4 (nn2)

Topics covered:
- Selecting Input and Target Columns. Input columns are marked with a light-blue header background. The validation set is the part of your data used to tune network topology or network parameters other than weights.
- Working with the Preprocessing Window. Input Columns (optional) - input columns from your dataset; they are displayed only if the "Select all inputs" checkbox is selected on the local toolbar.
- Working with the Design Window
- Designing Network Architecture Manually
Working with Design Window
The Design window automatically becomes active when you select Design Architecture, Define Properties or Search Architecture on the main toolbar or in the menu. You can also switch to the Design window using the Stages tab at the bottom of the main window.
In the Design window you will see:
- Active Network pane - contains the network architectures selected manually or found by an architecture search method.
- Network Properties pane - contains the network activation and error functions.
- Architecture Search table - filled with the parameters of the tested network architectures. You can activate the best network found using the Activate Best Network button on the local toolbar, or select any network in the table and activate it using the Activate Selected button. Note: Every network in the Architecture Search table can be retrained using the Retrain button on the local toolbar. "Retrain" means training a network with another initial randomization of weights.
- Training Graph - displays the network error (AE or CCR) on the training set for the architecture(s) currently tested by a search method.
- Network Parameters - contains data from the Architecture Search table for the selected network.
Designing Network Architecture Manually
Click the Design button on the main toolbar to define the network architecture manually.
In the Architecture Design dialog box you can manually enter the number of hidden layers (up to 5) and the number of hidden units in each layer (up to 256).
Note: By default, NeuroIntelligence proposes a topology with one hidden layer and with the number of hidden units equal to the number of inputs divided by 2. This default may be very wrong in some cases.
Note: A network with too few hidden units only roughly discovers the hidden dependencies in your data, so the network produces a significant number of errors. A network with too many hidden units will tend to memorize your data instead of finding relations, which also leads to bigger network errors. You need to find the best solution for your problem and your dataset.
Hint: In our experience, the majority of problems are solved best with 1 hidden layer, another part is solved best with 2 layers, and only some problems require 3 layers or more.
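If you want to reproduce this default heuristic outside NeuroIntelligence, here is a minimal sketch using scikit-learn (an illustrative assumption; NeuroIntelligence itself is a GUI tool, and the variable names are hypothetical):

```python
# Minimal sketch of the default topology heuristic: one hidden layer
# with (number of inputs / 2) hidden units.
from sklearn.neural_network import MLPClassifier

n_inputs = 10                            # number of input columns in the dataset
default_hidden = max(1, n_inputs // 2)   # the proposed default: inputs / 2

net = MLPClassifier(hidden_layer_sizes=(default_hidden,), max_iter=500)
```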
Selecting Network Properties
To change network properties, click Define Properties on the main toolbar or in the menu.
Activation functions
NeuroIntelligence supports three activation functions for hidden layers and four activation functions for the output layer. For hidden layers you can select Linear, Logistic or Hyperbolic tangent.
For the output layer you can select the same functions if you have a regression-type problem, and the Logistic or Softmax function if you have a classification-type problem.
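For reference, these activation functions can be sketched in plain NumPy (hypothetical helper names, not NeuroIntelligence code):

```python
import numpy as np

def linear(x):
    return x                          # identity: f(x) = x

def logistic(x):
    return 1.0 / (1.0 + np.exp(-x))   # sigmoid, output in (0, 1)

def hyperbolic_tangent(x):
    return np.tanh(x)                 # output in (-1, 1)

def softmax(x):
    e = np.exp(x - np.max(x))         # shift for numerical stability
    return e / e.sum()                # outputs sum to 1 (class probabilities)
```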
Error functions
NeuroIntelligence also supports two network error functions: Sum-of-Squares and Cross-Entropy.
Sum-of-Squares - the error is the sum of the squared differences between the actual value (target column value) and the neural network output. Hint: Sum-of-Squares is the most common error function.
Cross-Entropy - the error is the negative sum of the products of the target value and the logarithm of the network output value on each output unit.
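A minimal NumPy sketch of the two error functions (illustrative only; the exact scaling NeuroIntelligence uses may differ):

```python
import numpy as np

def sum_of_squares(target, output):
    # Sum of squared differences between target values and network outputs.
    return np.sum((target - output) ** 2)

def cross_entropy(target, output, eps=1e-12):
    # Negative sum of target * log(output) over the output units;
    # eps guards against log(0).
    return -np.sum(target * np.log(output + eps))
```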
Classification model
NeuroIntelligence can use two classification models:
Winner-takes-all - NeuroIntelligence performs classification by selecting the output unit with the biggest activation level.
Confidence limits - NeuroIntelligence performs classification by checking each output unit's activation against two levels: the Accept level and the Reject level.
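Both models can be sketched as follows (a hypothetical illustration; the Accept/Reject values below are assumptions, since in NeuroIntelligence they are configured in the options):

```python
import numpy as np

def winner_takes_all(outputs):
    # Pick the class whose output unit has the biggest activation.
    return int(np.argmax(outputs))

def confidence_limits(activation, accept=0.7, reject=0.3):
    # Check one output unit's activation against the two levels
    # (the 0.7 / 0.3 thresholds here are illustrative assumptions).
    if activation >= accept:
        return "accepted"
    if activation <= reject:
        return "rejected"
    return "undecided"
```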
Running an Architecture Search Method
To run an automatic architecture search, click the Search Architecture button on the main toolbar.
Note: You can change architecture search parameters in the Architecture Search Options.
You can monitor the architecture search process using the Architecture Search Table and Architecture Search Graph.
Two search methods are available:
Heuristic Search - performs a heuristic search inside the specified search range; the method works only for 3-layer (1 hidden layer) networks.
Exhaustive Search - performs an exhaustive search among ALL topologies in the search range you specified; the method works for networks with up to 5 hidden layers.
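Conceptually, the exhaustive method is a brute-force loop over every topology in the range. A minimal sketch, assuming scikit-learn and a hypothetical search range (1-2 hidden layers, 2-8 units each), with validation accuracy standing in for the CCR:

```python
from itertools import product
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

# Hypothetical data split into training and validation sets.
X, y = make_classification(n_samples=300, n_features=10, random_state=0)
X_train, X_val, y_train, y_val = train_test_split(X, y, random_state=0)

best_score, best_topology = -1.0, None
for n_layers in (1, 2):
    for sizes in product(range(2, 9, 2), repeat=n_layers):
        net = MLPClassifier(hidden_layer_sizes=sizes, max_iter=300)
        net.fit(X_train, y_train)
        score = net.score(X_val, y_val)   # validation accuracy for this topology
        if score > best_score:
            best_score, best_topology = score, sizes

print("Best topology:", best_topology, "validation accuracy:", best_score)
```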
To train a network, click the Train button on the main toolbar or select Network > Train in the menu.
NeuroIntelligence will train the network defined as active during the design stage. The training algorithm and parameters can be specified in the Network Training Options.
Note: To retrain the network with another initial weights randomization, click the Train button again. The training will begin again with another weights initialization. The results of the previous training will be lost.
Note: One more time-saving feature is Continue Training. After training completes, you can change the Training Options and click Continue Training to proceed further from the point where training ended.
Note: You can Jog Weights and then train the network further. The Jog Weights feature is used to help a network escape a local minimum in error space. Jogging weights means adding a small random number to each network weight. Jog Weights parameters are set in the Jog Options; a minimal sketch follows.
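Jogging weights can be pictured like this (a hypothetical illustration on a trained scikit-learn MLP; the amplitude parameter stands in for the Jog Options settings):

```python
import numpy as np

def jog_weights(net, amplitude=0.01, rng=np.random.default_rng()):
    # Add a small random number to each network weight and bias so that
    # further training can escape a local minimum in error space.
    # 'amplitude' plays the role of the Jog Options setting.
    for w in net.coefs_:
        w += rng.uniform(-amplitude, amplitude, size=w.shape)
    for b in net.intercepts_:
        b += rng.uniform(-amplitude, amplitude, size=b.shape)
```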
Selecting Training Algorithms
NeuroIntelligence has 7 training algorithms available:
Quick propagation
Conjugate Gradient Descent
Quasi-Newton
Limited Memory Quasi-Newton
Levenberg-Marquardt
Incremental back propagation
Batch back propagation
There is no single best training algorithm for neural networks. You need to choose a training algorithm based on the characteristics of the problem. The following simple rules have proved quite effective for most practical purposes (a sketch of these rules follows the list):
If you have a network with a small number of weights (usually up to 300), the Levenberg-Marquardt algorithm is efficient. Levenberg-Marquardt often performs considerably faster and finds better optima than other algorithms, but its memory requirements are proportional to the square of the number of weights. Another Levenberg-Marquardt limitation is that it is specifically designed to minimize the sum-of-squares error and cannot be used for other types of network error.
If you have a network with a moderate number of weights, the Quasi-Newton and Limited Memory Quasi-Newton algorithms are efficient. Quasi-Newton's memory requirements are also proportional to the square of the number of weights; the Limited Memory variant, as its name suggests, reduces this memory cost.
If your network has a large number of weights, we recommend using Conjugate Gradient Descent. Conjugate Gradient Descent has nearly the convergence speed of second-order methods while avoiding the need to compute and store the Hessian matrix. Its memory requirements are proportional to the number of weights.
Conjugate Gradient Descent and Quick Propagation are good general-purpose training algorithms.
You can use Incremental and Batch Back Propagation for networks of any size. The back propagation algorithm is the most popular algorithm for training multi-layer perceptrons and is often used by researchers and practitioners.
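These rules of thumb can be condensed into a small helper (a sketch; the 300-weight threshold comes from the text above, while the 5000 cutoff for "moderate" vs. "large" is an illustrative assumption):

```python
def suggest_algorithm(n_weights):
    # Rule-of-thumb algorithm selection based on network size.
    if n_weights <= 300:
        return "Levenberg-Marquardt"        # fast, but O(weights^2) memory
    if n_weights <= 5000:                   # assumed cutoff for "moderate"
        return "Quasi-Newton"               # also O(weights^2) memory
    return "Conjugate Gradient Descent"     # O(weights) memory

print(suggest_algorithm(250))   # -> Levenberg-Marquardt
```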
NeuroIntelligence automatically performs network testing after training completion. To view testing results, click Test on the main toolbar or select Network > Test in the menu. You can also select the Test window using the Stages tab at the bottom of the main window.
In the Testing window you will see:
Actual vs. Output Table - error values for each record from the input dataset. The Actual vs. Output Table can be saved to a CSV file or copied to the Clipboard using the Save and Copy All toolbar buttons.
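Outside NeuroIntelligence, an equivalent actual-vs-output table can be assembled and saved with pandas (a minimal sketch; the column names and values are assumptions for illustration):

```python
import pandas as pd

# Hypothetical per-record target values and network outputs.
actual = [0.2, 0.8, 0.5]
output = [0.25, 0.7, 0.55]

table = pd.DataFrame({
    "Actual": actual,
    "Output": output,
    "Error": [a - o for a, o in zip(actual, output)],  # per-record error
})
table.to_csv("actual_vs_output.csv", index=False)      # same idea as the Save button
```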
The table consists of the following columns:
