Metaheuristics Approach for Hyperparameter Tuning of Convolutional Neural Network
Abstract
Deep learning is an artificial intelligence technique that has been applied to a wide range of tasks. Deep learning performance is determined by its hyperparameters, architecture, and training (connection weights and biases). Finding the right combination of these aspects is very challenging. The convolutional neural network (CNN) is a deep learning method commonly used for image classification. It has many hyperparameters; therefore, tuning them is difficult. In this research, a metaheuristic approach is proposed to optimize the hyperparameters of convolutional neural networks. Three metaheuristic methods are used: ant colony optimization (ACO), the genetic algorithm (GA), and harmony search (HS). The metaheuristics search for the best combination of 8 hyperparameters with 8 options each, which creates a solution space of 1.6 × 10^7 combinations, far too large to explore by manual tuning. Metaheuristics offer the benefit of exploring such a search space more effectively and efficiently. The performance of the metaheuristic methods is evaluated on the MNIST dataset. The experimental results show that the accuracies of ACO, GA, and HS are 99.7%, 97.7%, and 89.9%, respectively, and that their computational times are 27.9 s, 22.3 s, and 56.4 s, respectively. ACO thus performs best among the three algorithms in terms of accuracy, although its computational time is slightly longer than that of GA. The results of the experiment reveal that the metaheuristic approach is promising for the hyperparameter tuning of CNNs. Future research can be directed toward solving larger problems or improving the metaheuristic operators to improve performance.
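The discrete search described above (8 hyperparameters, 8 options each, 8^8 ≈ 1.6 × 10^7 combinations) can be sketched with a genetic algorithm, one of the three metaheuristics compared in the paper. The hyperparameter names, option values, GA settings, and the surrogate fitness function below are illustrative assumptions, not the paper's exact setup; in the real experiment, fitness would be the validation accuracy of a CNN trained on MNIST with the candidate configuration.

```python
import random

# Hypothetical search space: 8 hyperparameters x 8 options = 8**8 combinations.
# Names and values are illustrative, not the paper's exact space.
SEARCH_SPACE = {
    "conv_filters":  [8, 16, 24, 32, 48, 64, 96, 128],
    "kernel_size":   [1, 2, 3, 4, 5, 6, 7, 8],
    "conv_layers":   [1, 2, 3, 4, 5, 6, 7, 8],
    "dense_units":   [16, 32, 64, 96, 128, 192, 256, 512],
    "dropout":       [0.0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7],
    "learning_rate": [1e-1, 5e-2, 1e-2, 5e-3, 1e-3, 5e-4, 1e-4, 5e-5],
    "batch_size":    [16, 32, 48, 64, 96, 128, 192, 256],
    "epochs":        [1, 2, 3, 4, 5, 6, 8, 10],
}
KEYS = list(SEARCH_SPACE)

def fitness(indices):
    """Stand-in for training a CNN and returning validation accuracy.
    A toy surrogate that rewards mid-range option indices, so the GA has
    something to optimize without actually training a network."""
    return -sum((i - 3.5) ** 2 for i in indices)

def evolve(pop_size=20, generations=30, p_mut=0.1, seed=0):
    rng = random.Random(seed)
    # An individual is a list of 8 option indices, one per hyperparameter.
    pop = [[rng.randrange(8) for _ in KEYS] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        survivors = pop[: pop_size // 2]       # truncation selection
        children = []
        while len(survivors) + len(children) < pop_size:
            a, b = rng.sample(survivors, 2)
            cut = rng.randrange(1, len(KEYS))  # one-point crossover
            child = a[:cut] + b[cut:]
            for g in range(len(child)):        # per-gene mutation
                if rng.random() < p_mut:
                    child[g] = rng.randrange(8)
            children.append(child)
        pop = survivors + children
    best = max(pop, key=fitness)
    return {k: SEARCH_SPACE[k][i] for k, i in zip(KEYS, best)}

if __name__ == "__main__":
    print(evolve())
```

ACO and HS would replace the crossover/mutation loop with pheromone-guided construction and harmony-memory improvisation, respectively, while keeping the same encoding of one option index per hyperparameter.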
Copyright (c) 2024 Jurnal RESTI (Rekayasa Sistem dan Teknologi Informasi)
This work is licensed under a Creative Commons Attribution 4.0 International License.
Copyright in each article belongs to the author
- The author acknowledges that Jurnal RESTI (Rekayasa Sistem dan Teknologi Informasi) is the first publisher of the article, under a Creative Commons Attribution 4.0 International License.
- Authors may enter into separate, additional contractual arrangements for the non-exclusive distribution of the published version of the manuscript (e.g., depositing it in the author's institutional repository or publishing it in a book), with an acknowledgment of its initial publication in Jurnal RESTI (Rekayasa Sistem dan Teknologi Informasi).