A Novel Learning Algorithm to Optimize Deep Neural Networks: Evolved Gradient Direction Optimizer (EVGO)


Karabayır İ., Akbilgiç O., Taş N.

IEEE Transactions on Neural Networks and Learning Systems, vol. 31, pp. 1-10, 2020 (journal indexed in SCI Expanded)

  • Publication Type: Article / Full Article
  • Volume: 31
  • Publication Date: 2020
  • DOI: 10.1109/TNNLS.2020.2979121
  • Journal Name: IEEE Transactions on Neural Networks and Learning Systems
  • Pages: pp. 1-10

Abstract

Gradient-based algorithms have been widely used to optimize the parameters of deep neural network (DNN) architectures. However, the vanishing gradient remains one of the common issues in the parameter optimization of such networks. To cope with the vanishing gradient problem, in this article, we propose a novel algorithm, the evolved gradient direction optimizer (EVGO), which updates the weights of DNNs based on the first-order gradient and a novel hyperplane we introduce. We compare the EVGO algorithm with other gradient-based algorithms, such as gradient descent, RMSProp, Adagrad, momentum, and Adam, on the well-known Modified National Institute of Standards and Technology (MNIST) data set for handwritten digit recognition by implementing deep convolutional neural networks. Furthermore, we present empirical evaluations of EVGO on the CIFAR-10 and CIFAR-100 data sets using the well-known AlexNet and ResNet architectures. Finally, we conduct an empirical analysis of EVGO and the other algorithms to investigate the behavior of the loss functions. The results show that EVGO outperforms all the compared algorithms in all experiments. We conclude that EVGO can be used effectively in the optimization of DNNs, and the proposed hyperplane may provide a basis for future optimization algorithms.
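The abstract does not specify EVGO's update rule or the hyperplane construction. As background only, here is a minimal sketch of the plain first-order gradient-descent update that serves as the baseline in the comparison, applied to a toy quadratic loss (the loss function, step size, and iteration count are illustrative, and this is not the EVGO method itself):

```python
# Plain first-order gradient descent on a toy quadratic loss
# f(w) = (w - 3)^2, with gradient f'(w) = 2 * (w - 3).
# Baseline update rule: w <- w - lr * grad(w).
# EVGO additionally uses a hyperplane-based direction not detailed
# in the abstract, so it is not reproduced here.

def grad(w):
    """Gradient of the toy loss f(w) = (w - 3)^2."""
    return 2.0 * (w - 3.0)

def gradient_descent(w0, lr=0.1, steps=100):
    """Run first-order gradient descent from initial weight w0."""
    w = w0
    for _ in range(steps):
        w -= lr * grad(w)
    return w

# Starting far from the minimizer, the iterates converge toward w = 3.
w_star = gradient_descent(0.0)
print(round(w_star, 4))
```

On this strongly convex toy problem the baseline converges reliably; the paper's point is that on deep networks such first-order updates can suffer from vanishing gradients, which EVGO is designed to mitigate.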