Ya.S. Pikalyov Federal State Scientific Institution «Institute of Problems of Artificial Intelligence», Donetsk
Research interests: Digital signal processing, data analysis, pattern recognition, natural language processing, computer vision, machine learning, neural networks
T.V. Yermolenko Federal State Educational Institution of Higher Education «Donetsk State University», Donetsk
Research interests: Digital signal processing, data analysis, discrete mathematics, theory of algorithms, pattern recognition, natural language processing, computer vision, machine learning, neural networks
Annotation: This work studies the effectiveness of various neural network models for object detection and classification on devices with limited computing power. The authors use a two-stage approach based on the Faster R-CNN architecture to detect an object in an image and recognize it. The backbone network is the main block of the Faster R-CNN structure and determines the quality and performance of the entire system. The paper presents the results of numerical studies of the effectiveness of various network architectures according to criteria such as the separating ability of high-level features, classification accuracy, the amount of RAM occupied, and computational complexity. An integral assessment of model effectiveness that takes these criteria into account is proposed. The best value of the integral criterion was shown by the hybrid network EdgeNeXt-S, which indicates a good balance of this model between performance, robustness and accuracy on resource-constrained devices.
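The exact formula of the paper's integral criterion is not given in this record. A minimal sketch of one plausible scheme is shown below: min-max normalization of each criterion followed by a weighted sum, where accuracy is "higher is better" and the Davies-Bouldin index (a feature-separability measure, cf. ref. 20), RAM footprint, and GFLOPs are "lower is better". The weights, the normalization, and the four-value tuple layout are all assumptions, not the authors' method.

```python
def minmax(values):
    """Scale a sequence of numbers to [0, 1]; a constant sequence maps to 0."""
    lo, hi = min(values), max(values)
    return [0.0 if hi == lo else (v - lo) / (hi - lo) for v in values]

def integral_scores(models, weights=(0.25, 0.25, 0.25, 0.25)):
    """Compute an integral efficiency score per model.

    models:  {name: (accuracy, db_index, ram_mb, gflops)}
             accuracy is better when high; the other three when low.
    weights: relative importance of the four criteria (sums to 1).
    Returns {name: score in [0, 1]}, higher is better.
    """
    names = list(models)
    # Transpose to per-criterion columns across all models.
    cols = list(zip(*(models[n] for n in names)))
    acc = minmax(cols[0])                                     # higher is better
    rest = [[1.0 - x for x in minmax(c)] for c in cols[1:]]   # invert "lower is better"
    crit = [acc] + rest
    return {n: sum(w * c[i] for w, c in zip(weights, crit))
            for i, n in enumerate(names)}
```

With a larger accuracy weight, e.g. `weights=(0.4, 0.2, 0.2, 0.2)`, a model that leads on accuracy and feature separability but uses more memory and compute can still rank first, which is the kind of trade-off the integral criterion is meant to capture.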
References: 1. Zuev V. M. Method for learning neural network for robot control / V. M. Zuev, O. A. Butov, S. B. Ivanova, A. A. Nikitina, S. I. Ulanov. Problems of Artificial Intelligence. 2021. Vol. 2. № 21. P. 22-33.
2. Pokintelitsa A. E. Problems and features of data reduction in autonomous robotic systems / A. E. Pokintelitsa. Problems of Artificial Intelligence. 2023. Vol. 1. № 28. P. 31-41.
3. Russakovsky O. ImageNet Large Scale Visual Recognition Challenge / O. Russakovsky, J. Deng, H. Su, J. Krause, S. Satheesh, S. Ma, Z. Huang, A. Karpathy, A. Khosla, M. Bernstein, A.C. Berg, L. Fei-Fei. International Journal of Computer Vision. 2015. Vol. 115. № 3.
4. Woo S. ConvNeXt V2: Co-designing and Scaling ConvNets with Masked Autoencoders / S. Woo, S. Debnath, R. Hu, X. Chen, Z. Liu, I.S. Kweon, S. Xie. 2023.
5. Ding M. DaViT: Dual Attention Vision Transformers / M. Ding, B. Xiao, N. Codella, P. Luo, J. Wang, L. Yuan. Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics). 2022. Vol. 13684 LNCS.
6. Zhang H. ParC-Net: Position Aware Circular Convolution with Merits from ConvNets and Transformer / H. Zhang, W. Hu, X. Wang. Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics). 2022. Vol. 13686 LNCS.
7. Maaz M. EdgeNeXt: Efficiently Amalgamated CNN-Transformer Architecture for Mobile Vision Applications / M. Maaz, A. Shaker, H. Cholakkal, S. Khan, S.W. Zamir, R.M. Anwer, F. Shahbaz Khan. 2023.
8. Li Y. EfficientFormer: Vision Transformers at MobileNet Speed / Y. Li, G. Yuan, Y. Wen, J. Hu, G. Evangelidis, S. Tulyakov, Y. Wang, J. Ren. 2022.
9. Tan M. EfficientNetV2: Smaller Models and Faster Training / M. Tan, Q. V. Le. 2021.
10. Wadekar S.N. MobileViTv3: Mobile-Friendly Vision Transformer with Simple and Effective Fusion of Local, Global and Input Features / S.N. Wadekar, A. Chaurasia. 2022.
11. Li J. Next-ViT: Next Generation Vision Transformer for Efficient Deployment in Realistic Industrial Scenarios / J. Li, X. Xia, W. Li, H. Li, X. Wang, X. Xiao, R. Wang, M. Zheng, X. Pan. 2022.
12. He K. Deep residual learning for image recognition / K. He, X. Zhang, S. Ren, J. Sun. Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition. 2016.
13. Wu K. TinyViT: Fast Pretraining Distillation for Small Vision Transformers / K. Wu, J. Zhang, H. Peng, M. Liu, B. Xiao, J. Fu, L. Yuan. Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics). 2022. Vol. 13681 LNCS.
14. Barbu A. ObjectNet: A large-scale bias-controlled dataset for pushing the limits of object recognition models / A. Barbu, D. Mayo, J. Alverio, W. Luo, C. Wang, D. Gutfreund, J. Tenenbaum, B. Katz. Advances in Neural Information Processing Systems. 2019. Vol. 32.
15. Borji A. ObjectNet Dataset: Reanalysis and Correction / A. Borji. 2020.
16. Zhong Z. Random Erasing Data Augmentation / Z. Zhong, L. Zheng, G. Kang, S. Li, Y. Yang. 2017.
17. Krizhevsky A. Learning Multiple Layers of Features from Tiny Images / A. Krizhevsky. Science Department, University of Toronto, Tech. 2009.
18. McInnes L. UMAP: Uniform Manifold Approximation and Projection / L. McInnes, J. Healy, N. Saul, L. Großberger. Journal of Open Source Software. 2018. Vol. 3. № 29.
19. Coates A. Learning feature representations with K-means / A. Coates, A.Y. Ng. Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics). 2012. Vol. 7700.
20. Davies D.L. A Cluster Separation Measure / D.L. Davies, D.W. Bouldin. IEEE Transactions on Pattern Analysis and Machine Intelligence. 1979. Vol. PAMI-1. № 2.
21. Chen X. Symbolic Discovery of Optimization Algorithms / X. Chen, C. Liang, D. Huang, E. Real, K. Wang, Y. Liu, H. Pham, X. Dong, T. Luong, C.-J. Hsieh, Y. Lu, Q. V. Le. 2023.
22. Zhang M. Lookahead Optimizer: k steps forward, 1 step back / M. Zhang, J. Lucas, J. Ba, G.E. Hinton. Advances in Neural Information Processing Systems. 2019. P. 9593-9604.
Issues: 3(30)'2023
Section: Informatics, Computer Engineering and Control
Cite:
Pikalyov, Ya. S. ABOUT NEURAL ARCHITECTURES OF FEATURE EXTRACTION FOR THE PROBLEM OF OBJECT RECOGNITION ON DEVICES WITH LIMITED COMPUTING POWER / Ya. S. Pikalyov, T. V. Yermolenko // Проблемы искусственного интеллекта. 2023. № 3 (30). URL: http://search.rads-doi.org/project/13749/object/201186. DOI: 10.34757/2413-7383.2023.30.3.004