KEY OBJECTS DETECTION AND CROSS-VIEW GEOLOCATION: DATASET ANALYSIS AND METHODOLOGICAL ASPECTS

Pikalev Yaroslav Sergeevich
Candidate of Technical Sciences, Senior Researcher at the Laboratory of Intelligent Systems and Data Analysis, Federal State Budget Scientific Institution "Institute of Artificial Intelligence Problems".
283048, Donetsk, Otvazhnykh str., 19, apt. 85, phone: +7949 4287388, email: i@pikaliov.ru.
Research interests: digital signal processing, data analysis, pattern recognition, natural language processing, computer vision, machine learning, neural networks.

UDC 004.932.72
DOI 10.24412/2413-7383-2024-4-25-37
Language: Russian
Abstract: This work addresses the development of a neural-network-based system for cross-view geo-localization. The topic is highly relevant because such systems let UAVs navigate complex environments by identifying objects, obstacles, and routes, which reduces dependence on operators. Recognition systems also allow UAVs to collect and analyze data efficiently. The study analyzes existing datasets suitable for cross-view geo-localization tasks and identifies general principles and requirements for such datasets. The following datasets are highlighted: Objects365, LVIS, VisDrone, DOTA, iSAID, and GeoText. The key challenges of the task are also outlined, including: 1) image preprocessing; 2) feature extraction; 3) the feasibility of pre-training a base model; 4) accounting for the semantic characteristics of objects.
Keywords: neural networks, datasets, cross-view geo-localization, computer vision, object recognition, data augmentation, autonomous systems.

References:
1. Ronzhin, A.L., Le, V.N., & Shuvalov, N.S. (2024). Optimization of the technological map of permissible system-technical solutions for the video analytics problem of aquaculture. *Bulletin of the South Ural State University. Series "Mathematics. Mechanics. Physics"*, 2(16), 50-58. ISSN 2075-809X. DOI: 10.14529/mmph240205
2. Ronzhin, A. L. (2024). Intellectualization and robotization of scientific equipment for interdisciplinary research. *Problems of Artificial Intelligence*, 1(28), 4-10. ISSN 2413-7383
3. Durgam, A., et al. (2024). Cross-view geo-localization: a survey. *arXiv preprint arXiv:2406.09722*.
4. Zhang, X., et al. (2020). Understanding image retrieval re-ranking: a graph neural network perspective. *arXiv preprint arXiv:2012.07620*.
5. Lin, J., et al. (2022). Joint representation learning and keypoint detection for cross-view geo-localization. *IEEE Transactions on Image Processing*, 31, 3780-3792.
6. Ali, B., Sadekov, R.N., & Tsodokova, V.V. (2022). Navigation algorithms for unmanned aerial vehicles using computer vision systems. *Gyroscopy and Navigation*, 30(4), 87-105. ISSN 0869-7035. DOI: 10.17285/0869-7035.00105
7. Pikalev, Ya.S. (2022). Development of a system for normalizing text corpora. *Problems of Artificial Intelligence*, 2(25), 64-78. ISSN 2413-7383
8. Khakimov, R. S., Nizhnikova, O. L., & Blizno, M. V. (2024). On the development of a data annotation system for computer vision tasks. *Problems of Artificial Intelligence*, 3(34), 70-79. ISSN 2413-7383. DOI: 10.24412/2413-7383-2024-3-70-79
9. Gupta, A., Dollar, P., & Girshick, R. (2019). LVIS: A dataset for large vocabulary instance segmentation. In *Proceedings of the IEEE/CVF conference on computer vision and pattern recognition* (pp. 5356-5364).
10. Shao, S., et al. (2019). Objects365: A large-scale, high-quality dataset for object detection. In *Proceedings of the IEEE/CVF international conference on computer vision* (pp. 8430-8439).
11. Zamir, S.W., et al. (2019). iSAID: A large-scale dataset for instance segmentation in aerial images. In *Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops* (pp. 28-37).
12. Cao, Y., et al. (2021). VisDrone-DET2021: The vision meets drone object detection challenge results. In *Proceedings of the IEEE/CVF International conference on computer vision* (pp. 2847-2854).
13. Xia, G. S., et al. (2018). DOTA: A large-scale dataset for object detection in aerial images. In *Proceedings of the IEEE conference on computer vision and pattern recognition* (pp. 3974-3983).
14. Chu, M., et al. (2025). Towards natural language-guided drones: GeoText-1652 benchmark with spatial relation matching. In *European Conference on Computer Vision* (pp. 213-231). Springer, Cham.
15. Ermolenko, T.V. (2019). Classification of errors in text based on deep learning. *Problems of Artificial Intelligence*, 3(14), 47-57. ISSN 2413-7383
16. Zuev, V. M. (2024). Comparison of object detection using artificial intelligence methods versus classical methods. *Problems of Artificial Intelligence*, 3(34), 4-10. ISSN 2413-7383. DOI: 10.24412/2413-7383-2024-3-30-35
17. Pikalev, Ya. S., & Ermolenko, T. V. (2023). On neural architectures for feature extraction in object recognition tasks on devices with limited computing power. *Problems of Artificial Intelligence*, 3(30), 44-54. ISSN 2413-7383. DOI: 10.34757/2413-7383.2023.30.3.004
18. Pavlenko, B. V., & Bondarenko, V. I. (2024). Intellectually algorithmic method for sight calibration. *Problems of Artificial Intelligence*, 3(34), 55-63. ISSN 2413-7383. DOI: 10.24412/2413-7383-2024-3-55-63
19. Krishnan, Sh. R., & Amudha, P. (2024). Improving anomaly detection in video using enhanced UNET technology and cascade sliding window technique. *Informatics and Automation*, 6(23), 1899-1930. ISSN 2713-3192. DOI: 10.15622/ia.23.6.12
20. Soifer, V. A., Fursov, V. A., & Kharitonov, S. I. (2024). Kalman filtering for a class of images of dynamic objects. *Informatics and Automation*, 4(23), 953-968. ISSN 2713-3192. DOI: 10.15622/ia.23.4.1

Issue: 4(35)'2024
Section: ARTIFICIAL INTELLIGENCE AND MACHINE LEARNING
How to cite: Pikalyov Y. S. KEY OBJECTS DETECTION AND CROSS-VIEW GEOLOCATION: DATASET ANALYSIS AND METHODOLOGICAL ASPECTS [Text] / Y. S. Pikalyov // Problems of artificial intelligence. - 2024. № 4 (35). - P. 25-37. - http://paijournal.guiaidn.ru/ru/2024/3(34)-3.html