DF_CrossEntropy Class Reference

Error measure for classification tasks that can be used as the objective function for training. More...

#include <DF_CrossEntropy.h>

Inheritance diagram for DF_CrossEntropy:

ErrorFunction

List of all members.

Public Member Functions

double error (Model &model, const Array< double > &input, const Array< double > &target)
 Calculates the cross entropy error.
double errorDerivative (Model &model, const Array< double > &input, const Array< double > &target, Array< double > &derivative)
 Calculates the derivatives of the cross entropy error (see error) with respect to the parameters ErrorFunction::w.


Detailed Description

Error measure for classification tasks that can be used as the objective function for training.

If your model should return a vector whose components reflect the class-conditional probabilities of class membership given an input vector, 'DF_CrossEntropy' is the appropriate error measure for training. For an output dimension C>1, where every output component represents the probability of one class for the given input vector, the error measure is defined as

\[ E = - \sum_{i=1}^N \sum_{k=1}^C \left\{target^i_k \cdot\ln \frac{\exp{(model_k(input^i))}} {\sum_{k^{\prime}=1}^C \exp{(model_{k^{\prime}}(input^i))}} \right\} \]

where i runs over all input patterns. A term of the sum is taken to be zero if its coefficient is zero, since x ln(x) tends to zero as x tends to zero. The argument of the logarithm is the so-called softmax activation, which guarantees that the outputs sum to one, i.e.

\[ \sum_{k=1}^C \frac{\exp{(model_k(input^i))}}{\sum_{k^{\prime}=1}^C \exp{(model_{k^{\prime}}(input^i))}} = 1 \]
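As an illustration of the multi-class formula, the contribution of a single pattern can be sketched as follows. This is not the code from DF_CrossEntropy.h: the function name and the use of std::vector instead of Array< double > are assumptions made for this example, and subtracting the maximum output before exponentiating is a common numerical safeguard that the original implementation may or may not apply.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper, not part of DF_CrossEntropy.h.
// Returns -sum_k target[k] * ln(softmax(output)[k]) for one pattern,
// where 'output' holds the linear (pre-softmax) model outputs.
double softmaxCrossEntropy(const std::vector<double>& output,
                           const std::vector<double>& target)
{
    assert(!output.empty() && output.size() == target.size());

    // Subtracting the maximum leaves the softmax unchanged
    // but keeps exp() from overflowing for large outputs.
    const double maxOut = *std::max_element(output.begin(), output.end());
    double sumExp = 0.0;
    for (double o : output)
        sumExp += std::exp(o - maxOut);
    const double logNorm = maxOut + std::log(sumExp);

    double e = 0.0;
    for (std::size_t k = 0; k < output.size(); ++k) {
        // Terms with a zero coefficient contribute nothing to the sum.
        if (target[k] != 0.0)
            e -= target[k] * (output[k] - logNorm);
    }
    return e;
}
```

Summing this value over all N patterns gives E. For two equal outputs and a one-of-C target, the result is ln 2, since the softmax assigns probability 1/2 to each class.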

This is necessary in order to interpret the output values as probabilities. The error functional is differentiable and can therefore be used for training. For a single output dimension, 'DF_CrossEntropy' returns the corresponding cross entropy for two classes, using the formula

\[ E = - \sum_{i=1}^N \left\{target^i\cdot \ln model(input^i) + (1-target^i) \cdot\ln (1-model(input^i))\right\} \]

In this implementation every target value has to be chosen from {0,1} (binary encoding). For theoretical reasons, neural networks with a single output neuron should use the logistic sigmoid activation function at the output. For multiple outputs, a linear output activation is required, since this error implementation transforms the linear outputs into the softmax activation as described above. For detailed information refer to C.M. Bishop, Neural Networks for Pattern Recognition, Clarendon Press 1996, Chapter 6.9.
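For the single-output case, the per-pattern term of the two-class formula can be sketched as below. Both function names are hypothetical and chosen for this example only. The sketch assumes the model output y already lies strictly between 0 and 1, for instance because a logistic sigmoid is used at the output as suggested above.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical helper: logistic sigmoid, maps any real activation to (0,1).
double logisticSigmoid(double a)
{
    return 1.0 / (1.0 + std::exp(-a));
}

// Hypothetical helper, not part of DF_CrossEntropy.h.
// Returns -( t * ln(y) + (1 - t) * ln(1 - y) ) for one pattern,
// with model output y in (0,1) and binary target t in {0,1}.
double binaryCrossEntropy(double y, double t)
{
    assert(y > 0.0 && y < 1.0);
    assert(t == 0.0 || t == 1.0);
    // With t restricted to {0,1}, exactly one of the two terms is non-zero.
    return (t == 1.0) ? -std::log(y) : -std::log(1.0 - y);
}
```

An output of 0.5 yields ln 2 for either target value, and the error grows without bound as the output approaches the wrong class label.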

This implementation of the cross entropy is more efficient than the implementation given in 'CrossEntropy.h' for more than one output dimension, because redundant calculations of the outer derivatives are avoided.

Status:
stable

Definition at line 98 of file DF_CrossEntropy.h.


Member Function Documentation

double DF_CrossEntropy::error ( Model &  model,
const Array< double > &  input,
const Array< double > &  target 
) [inline, virtual]

Calculates the cross entropy error.

The cross entropy function for N patterns and C>1 class-dimensions within the output vector is calculated via

\[ E = - \sum_{i=1}^N \sum_{k=1}^C \left\{target^i_k\cdot \ln \frac{\exp{(model_k(input^i))}} {\sum_{k^{\prime}=1}^C \exp{(model_{k^{\prime}}(input^i))}}\right\} \]

and, for a single output dimension and two classes, via

\[ E = - \sum_{i=1}^N \left\{target^i\cdot \ln model(input^i) + (1-target^i) \cdot\ln (1-model(input^i))\right\} \]

Parameters:
model the model.
input Input vector for the model.
target Target vector.
Returns:
The error E.
Author:
M. Huesken
Date:
1999
Changes
Revision 2003/06/03 (S. Wiegand): softmax activation introduced
Status
stable

Implements ErrorFunction.

Definition at line 138 of file DF_CrossEntropy.h.

References Model::getOutputDimension(), and Model::model().

double DF_CrossEntropy::errorDerivative ( Model &  model,
const Array< double > &  input,
const Array< double > &  target,
Array< double > &  derivative 
) [inline, virtual]

Calculates the derivatives of the cross entropy error (see error) with respect to the parameters ErrorFunction::w.

The derivatives of the cross entropy for N patterns and C>1 class-dimensions within the output vector with respect to model parameters w are calculated via

\[ \frac{\partial E}{\partial w} = - \sum_{i=1}^N \sum_{k=1}^C \left\{ \left(target^i_k - \frac{\exp{(model_k(input^i))}}{\sum_{k^{\prime}=1}^C \exp{(model_{k^{\prime}}(input^i))}} \right) \cdot\frac{\partial model_k(input^i)}{\partial w}\right\} \]

and, for a single output dimension, via

\[ \frac{\partial E}{\partial w} = - \sum_{i=1}^N \left\{ \frac{target^i - model(input^i)}{model(input^i)\cdot(1-model(input^i))} \cdot\frac{\partial model(input^i)}{\partial w}\right\} \]
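The outer derivatives in both formulas, i.e. the factors multiplying the derivative of the model output with respect to w, can be sketched for a single pattern as follows. The function names and the use of std::vector are assumptions made for this example and do not reflect the actual code in DF_CrossEntropy.h. The multi-class expression assumes that the target components of a pattern sum to one, as is the case for a one-of-C encoding.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper: derivative of the per-pattern error with respect to
// each linear output, dE/d(output_k) = softmax_k(output) - target_k.
// Assumes the target components sum to one.
std::vector<double> softmaxCrossEntropyOutputGradient(
    const std::vector<double>& output, const std::vector<double>& target)
{
    assert(!output.empty() && output.size() == target.size());

    // Shift by the maximum for numerical stability; the softmax is unchanged.
    const double maxOut = *std::max_element(output.begin(), output.end());
    std::vector<double> grad(output.size());
    double sumExp = 0.0;
    for (std::size_t k = 0; k < output.size(); ++k) {
        grad[k] = std::exp(output[k] - maxOut);
        sumExp += grad[k];
    }
    for (std::size_t k = 0; k < output.size(); ++k)
        grad[k] = grad[k] / sumExp - target[k];
    return grad;
}

// Hypothetical helper: single-output case,
// dE/dy = -(t - y) / (y * (1 - y)) for model output y in (0,1).
double binaryCrossEntropyOutputGradient(double y, double t)
{
    assert(y > 0.0 && y < 1.0);
    return -(t - y) / (y * (1.0 - y));
}
```

Multiplying these factors by the derivatives of the model outputs with respect to w and summing over all patterns gives the full gradient. The leading minus sign of the formulas is already folded into the returned values.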

Parameters:
model the model.
input Input vector for the model.
target Target vector.
derivative error derivative
Returns:
The cross entropy error
Author:
M. Huesken
Date:
1999
Changes
Revision 2003/06/03 (S. Wiegand): softmax activation introduced, bugs in calculation of derivatives corrected
Status
stable

Reimplemented from ErrorFunction.

Definition at line 250 of file DF_CrossEntropy.h.

References Model::generalDerivative(), Model::getOutputDimension(), Model::getParameterDimension(), and Model::model().


The documentation for this class was generated from the following file: