Journal of University of Chinese Academy of Sciences

Research on solutions of cross-entropy loss with spectral decoupling regularization

HU Yinhan, GUO Tiande, HAN Congying   

  1. School of Mathematical Sciences, University of Chinese Academy of Sciences, Beijing 100049, China.
  • Received: 2023-12-29   Revised: 2023-09-01   Online: 2023-09-01

Abstract: When a neural network is trained with the cross-entropy loss for a classification task, the resulting classifier exhibits gradient starvation: the model relies only on the most salient features and ignores other useful ones. Previous work found that adding the l2 norm of the model output to the cross-entropy loss as a regularizer, a technique termed spectral decoupling, alleviates this phenomenon. This paper studies how the strength of spectral decoupling affects the trained model in the over-parameterized setting. Without weight decay, we show that the models obtained with different spectral decoupling strengths are equivalent. With a small weight decay, we use a second-order Taylor expansion of the objective function to obtain an approximate solution. Analysis of this approximate solution shows that reducing the spectral decoupling strength has the effect of strengthening the weight decay, and that the two are directly equivalent in binary classification. Finally, we verify these analytical conclusions through experiments.
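
As a rough illustration of the objective the abstract describes, the following sketch computes the cross-entropy loss for one sample and adds a spectral decoupling penalty on the model output plus an optional weight decay term. The abstract does not give the exact form of the penalties, so several details are assumptions made for illustration: the penalty is taken as half the squared l2 norm times a coefficient, and the names `cross_entropy`, `sd_objective`, `sd_strength`, and `weight_decay` are hypothetical.

```python
import math


def cross_entropy(logits, label):
    """Softmax cross-entropy for one sample, using a stable log-sum-exp."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_z - logits[label]


def sd_objective(logits, label, weights, sd_strength, weight_decay=0.0):
    """Cross-entropy + spectral decoupling penalty + optional weight decay.

    Assumption: each penalty is (coefficient / 2) * squared l2 norm; the
    abstract does not state the exact scaling.
    """
    sd_penalty = 0.5 * sd_strength * sum(z * z for z in logits)   # on model output
    wd_penalty = 0.5 * weight_decay * sum(w * w for w in weights)  # on parameters
    return cross_entropy(logits, label) + sd_penalty + wd_penalty


# Hypothetical two-class example; logits and weights are made-up numbers.
logits = [1.0, -1.0]
weights = [0.5, -0.5]
for sd_strength in (0.0, 0.1, 1.0):
    value = sd_objective(logits, label=0, weights=weights, sd_strength=sd_strength)
    print(f"sd_strength={sd_strength}: objective={value:.4f}")
```

The spectral decoupling term penalizes the outputs (logits), whereas weight decay penalizes the parameters; the paper's analysis concerns how the two coefficients interact in the minimizer of this kind of objective.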

Key words: cross-entropy loss, spectral decoupling, weight decay, gradient starvation
