
Using External Knowledge in Neural Network Models

Researchers: Jouko Lampinen, Arto Selonen, Leena Ikonen, Harri Hakkarainen and Mika Kataikko

The project is a subproject in the MENES project described above.


One of the most important properties of neural networks is their generality: the same network can be trained to solve rather different tasks, depending on the training data. This generality is also one of the most prominent problems when neural networks are applied to practical real-world problems, as existing domain knowledge is difficult to incorporate into the models.

The goal of the research project is to develop methods for adding prior knowledge to neural network modeling. The approach we are studying is based on teaching the knowledge to the network during training, instead of hard-coding it into the connections or weights in advance. The knowledge is specified as target values or constraints for partial derivatives of various orders of the network mapping. This approach can be viewed as a flexible regularization method that directly controls the characteristics of the resulting mapping.
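As a minimal sketch of this idea, the combined objective below adds a penalty term that pulls one partial derivative of a model toward a target value. The function names, the finite-difference estimate, and the weighting parameter `lam` are illustrative assumptions, not the project's actual implementation:

```python
import numpy as np

def constrained_loss(model, x, y, dk, r, lam=1.0, eps=1e-4):
    """Data error plus a penalty tying the model's partial derivative
    with respect to input dimension dk to the target value r.
    `model` maps an (n, d) array of inputs to an (n,) array of outputs.
    Illustrative sketch: derivatives are estimated by central finite
    differences rather than backpropagation."""
    data_err = np.mean((model(x) - y) ** 2)
    step = np.zeros(x.shape[1])
    step[dk] = eps
    deriv = (model(x + step) - model(x - step)) / (2 * eps)
    rule_err = np.mean((deriv - r) ** 2)
    return data_err + lam * rule_err
```

Minimizing this loss trades off fitting the data samples against respecting the specified derivative behavior, which is the regularization effect described above.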

A concrete goal of the project is to develop a neural network training and simulation system that supports modular network design and domain knowledge represented with fuzzy-like terms. Together with the industrial partners, this prototype system will be developed further into a process modeling and control tool.

Recent Results

Our research group has developed a prototype neural modeling tool that supports advanced regularization by background knowledge, modular network design in which networks are built from smaller blocks, and committees of networks to improve generalization. The tool supports the following model construction principles:

1. Modular network design.
Subprocesses of the whole process are encapsulated into modules that can be trained on data, expert rules, and simulation models.

The modules are connected with adaptive layers, and the whole model can then be trained together, so that the interconnection layers compensate for the errors of the individual modules. This facilitates the use of inaccurate, simplified models for the subprocesses.
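The following toy sketch illustrates the principle under stated assumptions: two hypothetical frozen submodels (`module_a`, deliberately biased, and `module_b`) are composed through a single trainable scalar gain `w` acting as the adaptive interconnection layer, and training the gain on whole-process data compensates the bias of `module_a`. The modules, the process, and the training loop are all invented for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical frozen submodels: module_a is a simplified (biased)
# model of the first subprocess; module_b models the second.
module_a = lambda x: 0.8 * x           # underestimates the true gain of 1.0
module_b = lambda z: z ** 2

# Data from the whole process: true gain 1.0 followed by squaring.
x = rng.uniform(0.5, 1.5, 200)
y = x ** 2

# Adaptive interconnection layer: a scalar gain w trained by plain
# gradient descent so the composed model matches the process data.
w = 1.0
for _ in range(500):
    z = w * module_a(x)
    err = module_b(z) - y
    grad = np.mean(2 * err * 2 * z * module_a(x))   # d(mean err^2)/dw
    w -= 0.01 * grad

# w converges near 1 / 0.8 = 1.25, correcting module_a's bias.
```

The interconnection parameter absorbs the modeling error, so each submodule can remain a rough, simplified model of its subprocess.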

2. Training background knowledge on the network.
The training database consists of data samples and an additional set of rule data. The rules are of the form:

If the system is in state $\Phi$, then the effect of input $x_k$ on output $o_i$ is ${\partial o_i \over \partial x_k} = R_{ik}$

where the state $\Phi$ and the target $R_{ik}$ can be specified with uncertain, "fuzzy-like" terms. Thus a rule contains both the target values and membership functions that are one for acceptable derivative values and zero for forbidden values.
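One possible way to turn such a rule into a training penalty is sketched below, assuming (as a simplification of the fuzzy-like terms) a crisp membership band $[low, high]$ of acceptable derivative values: inside the band the derivative is pulled softly toward the target $R_{ik}$, while values in the forbidden region incur a strong violation penalty. The band representation and the weighting are illustrative assumptions, not the project's actual formulation:

```python
import numpy as np

def rule_penalty(deriv, r_target, low, high):
    """Penalty for one derivative rule. `deriv` is an array of
    estimated partial derivatives at the training points.
    Membership is modeled crisply here: 1 inside [low, high]
    (acceptable values), 0 outside (forbidden values)."""
    inside = (deriv >= low) & (deriv <= high)
    soft = 0.1 * (deriv - r_target) ** 2               # weak pull to target
    dist = np.minimum(np.abs(deriv - low), np.abs(deriv - high))
    hard = dist ** 2                                    # violation of the band
    return np.where(inside, soft, soft + hard).mean()
```

Summing such penalties over all rules, and adding the ordinary data error, yields a single objective in which the rule data act as regularizers on the network's derivatives.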

Figure 1: Example of training knowledge on a network. The figures show the model errors of a simple multilayer perceptron network for an artificial test function consisting of three steps. In the rightmost figure, the monotonicity rule ${\partial f \over \partial x} \ge 0$ was trained into the network. The result shows that the additional rule prevents overfitting (an increase of the test error) by constraining the solution space to monotonic functions.
Figure 1
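The monotonicity rule of Figure 1 is a special case of the derivative constraints above, with the target band $[0, \infty)$. A minimal sketch of such a penalty, using central finite differences for a scalar-input model (an illustrative stand-in for backpropagated derivatives), could look like:

```python
import numpy as np

def monotonicity_penalty(model, x, eps=1e-4):
    """Penalty enforcing df/dx >= 0 for a scalar-input model.
    Derivatives are estimated by central finite differences;
    only negative slopes contribute to the penalty."""
    deriv = (model(x + eps) - model(x - eps)) / (2 * eps)
    return np.mean(np.maximum(-deriv, 0.0) ** 2)
```

Adding this term to the training loss constrains the solution space to (approximately) monotonic functions, which is the overfitting-prevention effect shown in the rightmost panel of Figure 1.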

The theoretical background of the system, together with experiments on real process data, is reported in [41] and [25].

Figure 2: Overview of the prototype neural modeling tool Q-OPT 2, showing the main menu, variable control window, and network display window.
Figure 2
