Overfitting and underfitting are common problems in network training.
- Overfitting: the training error is small, but the error on the test set is large. The model is complex enough to "remember" the training samples, so its generalization error is high.
- Underfitting: the training error is large; the model cannot find a suitable function to describe the data set.
Here are some commonly used tricks for these two situations.
1. How to prevent overfitting
Overfitting is usually caused by too many feature dimensions, an overly complex model, too many parameters, too little training data, or too much noise; the fitted function then performs well on the training set but poorly on the test set. Starting from these causes, we can consider the following methods:
- Data augmentation: add noise to the data and increase the amount of source data.
- Use an appropriately sized network model; reducing the number of layers and neurons limits the fitting capacity of the network.
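As a sketch of the noise-injection variant mentioned above, the helper below (`augment_with_noise` is a made-up name, not a library function) creates jittered copies of each sample with Gaussian noise, multiplying the effective size of a small vector dataset:

```python
import numpy as np

def augment_with_noise(X, y, copies=2, sigma=0.05, seed=0):
    """Append `copies` noise-jittered versions of each sample.
    Labels are duplicated unchanged; a toy sketch for vector data."""
    rng = np.random.default_rng(seed)
    X_aug = [X] + [X + rng.normal(0.0, sigma, X.shape) for _ in range(copies)]
    y_aug = [y] * (copies + 1)
    return np.concatenate(X_aug), np.concatenate(y_aug)

X = np.arange(6, dtype=float).reshape(3, 2)
y = np.array([0, 1, 0])
X_big, y_big = augment_with_noise(X, y)  # 3 samples become 9
```

For images, the same idea is usually realized with flips, crops, and rotations instead of raw noise.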
- Dropout: randomly discard output nodes of the network during training, which is equivalent to training many different sub-networks on the same data.
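A minimal NumPy sketch of "inverted" dropout (the function name is mine; frameworks such as PyTorch provide this as a built-in layer): units are zeroed with probability `p_drop` during training, and the survivors are rescaled so the expected activation is unchanged, which lets the layer become the identity at test time.

```python
import numpy as np

def dropout_forward(x, p_drop=0.5, training=True, rng=None):
    """Inverted dropout: zero each unit with probability p_drop and
    scale survivors by 1/(1 - p_drop) to keep the expectation fixed."""
    if not training or p_drop == 0.0:
        return x  # at test time the layer does nothing
    rng = rng if rng is not None else np.random.default_rng(0)
    mask = (rng.random(x.shape) >= p_drop) / (1.0 - p_drop)
    return x * mask

out = dropout_forward(np.ones((4, 8)), p_drop=0.5)
# each entry is either 0.0 (dropped) or 2.0 (kept and rescaled)
```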
- Weight regularization: add an L1 norm or L2 norm penalty to the loss function (weight decay is the coefficient of the regularization term).
i. L1 norm: the sum of the absolute values of all parameters.
L1 regularization can drive some parameters to exactly 0, achieving feature sparsity or feature selection.
ii. L2 norm: the square root of the sum of the squares of all parameters.
L2 regularization makes the parameters smaller and closer to zero; the smaller the parameters, the simpler the model and the less likely it is to overfit.
Compared with L1, L2 tends to keep more features, all with small values close to 0, while L1 selects a few features and sets the rest to 0.
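The two penalties can be written in a few lines (the function names are mine; note that the L2 penalty is conventionally the *squared* norm, so that its gradient `2·λ·w` shrinks every weight toward zero, while the L1 subgradient pushes small weights to exactly zero):

```python
import numpy as np

def l1_penalty(w, lam):
    # L1: lam * sum(|w_i|) -- promotes sparsity (exact zeros)
    return lam * np.sum(np.abs(w))

def l2_penalty(w, lam):
    # L2 / weight decay: lam * sum(w_i^2) -- shrinks all weights
    return lam * np.sum(w ** 2)

w = np.array([0.5, -2.0, 0.0])
data_loss = 1.23                         # placeholder data loss
total_l1 = data_loss + l1_penalty(w, lam=0.01)
total_l2 = data_loss + l2_penalty(w, lam=0.01)
```

In practice the penalty is simply added to the training loss before backpropagation, with `lam` chosen on a validation set.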
- Early stopping: stop training once performance on a validation set stops improving, before overfitting sets in.
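A minimal sketch of the patience-based version of early stopping (both callbacks are placeholders I made up for illustration; frameworks ship this as a ready-made callback):

```python
def train_with_early_stopping(step_fn, val_loss_fn, max_epochs=100, patience=5):
    """Stop when the validation loss has not improved for `patience`
    epochs. step_fn(epoch) runs one training epoch; val_loss_fn()
    returns the current validation loss."""
    best, best_epoch = float("inf"), 0
    for epoch in range(max_epochs):
        step_fn(epoch)
        loss = val_loss_fn()
        if loss < best:
            best, best_epoch = loss, epoch
        elif epoch - best_epoch >= patience:
            break  # no improvement for `patience` epochs: stop
    return best_epoch, best

# simulate a validation curve that bottoms out at epoch 2
losses = iter([1.0, 0.8, 0.7, 0.75, 0.74, 0.73, 0.72, 0.71, 0.9])
best_epoch, best = train_with_early_stopping(
    step_fn=lambda e: None, val_loss_fn=lambda: next(losses), max_epochs=9)
```

One usually also checkpoints the weights at `best_epoch` and restores them after stopping.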
2. How to prevent underfitting
The model has not captured the characteristics of the data and does not fit the data set well.
- Add more features
- Reduce the regularization parameter
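A toy illustration of the "add more features" remedy, under the assumption that the data is quadratic: a plain linear fit underfits, while adding an `x**2` feature removes the underfit (all names here are illustrative, using only NumPy least squares):

```python
import numpy as np

rng = np.random.default_rng(0)
x = np.linspace(-1, 1, 50)
y = 3 * x**2 + 0.01 * rng.normal(size=x.size)  # quadratic data, tiny noise

def fit_mse(features):
    """Least-squares fit, then mean squared error on the training data."""
    w, *_ = np.linalg.lstsq(features, y, rcond=None)
    return np.mean((features @ w - y) ** 2)

linear = np.column_stack([np.ones_like(x), x])        # features [1, x]
quad   = np.column_stack([np.ones_like(x), x, x**2])  # features [1, x, x^2]
err_lin, err_quad = fit_mse(linear), fit_mse(quad)    # err_quad << err_lin
```

The large training error of the linear model is the signature of underfitting: no amount of extra training helps until the feature set (or model capacity) grows.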