Towards robust visual recognition for smart cities and remote sensing: A survey of regression losses in rotated object detection

Thai Chien; Trang Mai Xuan; Anh Son Le

doi:10.22144/ctujoisd.2025.054

* Corresponding author: Trang Mai Xuan (email: trang.maixuan@phenikaa-uni.edu.vn)

Received: 30 Jun 2025

Revised: 18 Aug 2025

Accepted: 07 Oct 2025

Published: 16 Oct 2025

How to Cite

Thai, C., Trang, M. X., & Anh, S. L. (2025). Towards robust visual recognition for smart cities and remote sensing: A survey of regression losses in rotated object detection. CTU Journal of Innovation and Sustainable Development, 17(Special issue: ISDS), 64-74. https://doi.org/10.22144/ctujoisd.2025.054

Issue

Vol. 17 No. Special issue: ISDS (2025)

Section

Intelligent Systems and Data Science (ISDS 2025)

Abstract

Rotated object detection (ROD), often termed oriented object detection, is essential for numerous practical tasks, including remote sensing, self-driving systems, urban surveillance, and text recognition in natural scenes. Unlike conventional object detection, ROD must estimate object orientation, making angle regression and loss function design crucial to model performance. This paper presents a comprehensive survey of regression loss functions used in ROD, categorized into coordinate-based, approximated rotated IoU-based, and Gaussian-based approaches. We analyze their theoretical foundations, practical trade-offs, and effectiveness in addressing core challenges including angle periodicity, edge ambiguity, and metric inconsistency. Representative loss functions are benchmarked on standard datasets to evaluate their suitability for various detection frameworks. By emphasizing application contexts such as smart city monitoring and environmental analysis, this survey offers practical guidance for designing robust and efficient ROD systems that support sustainable development goals.

Keywords: Autonomous driving, regression loss functions, rotated object detection, smart city applications
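The angle periodicity and edge ambiguity issues named in the abstract can be illustrated with a small sketch. It is not the authors' implementation: the function names (`box_to_gaussian`, `l1_param_loss`, `gaussian_gap`) and the example boxes are hypothetical, and the box-to-Gaussian mapping assumes the convention commonly used by Gaussian-based losses, with mean at the box center and covariance R diag(w²/4, h²/4) Rᵀ. The same rectangle is written in two equivalent parameterizations; a naive coordinate-based L1 loss treats them as far apart, while their Gaussian representations coincide.

```python
import math


def box_to_gaussian(cx, cy, w, h, theta):
    """Map a rotated box (center, width, height, angle in radians)
    to a 2-D Gaussian: mean = center, cov = R diag(w^2/4, h^2/4) R^T.
    Returns ((mx, my), (sxx, sxy, syy)). Assumed convention, see lead-in."""
    c, s = math.cos(theta), math.sin(theta)
    a, b = (w / 2.0) ** 2, (h / 2.0) ** 2
    sxx = a * c * c + b * s * s
    sxy = (a - b) * c * s
    syy = a * s * s + b * c * c
    return (cx, cy), (sxx, sxy, syy)


def l1_param_loss(box_a, box_b):
    """Naive coordinate-based loss: L1 distance on (cx, cy, w, h, theta)."""
    return sum(abs(p - q) for p, q in zip(box_a, box_b))


def gaussian_gap(g_a, g_b):
    """Hypothetical helper: largest absolute difference between the
    mean and covariance entries of two Gaussians (not a surveyed loss)."""
    (mean_a, cov_a), (mean_b, cov_b) = g_a, g_b
    diffs = [abs(p - q) for p, q in zip(mean_a + cov_a, mean_b + cov_b)]
    return max(diffs)


# The same rectangle, described two ways: swapping width and height
# while rotating by 90 degrees leaves the box unchanged.
box_1 = (50.0, 50.0, 40.0, 10.0, 0.0)
box_2 = (50.0, 50.0, 10.0, 40.0, math.pi / 2)

param_loss = l1_param_loss(box_1, box_2)
gap = gaussian_gap(box_to_gaussian(*box_1), box_to_gaussian(*box_2))

print(f"L1 loss on raw parameters: {param_loss:.2f}")
print(f"Gap between Gaussian representations: {gap:.2e}")
```

Under these assumptions the parameter-space loss is large even though the two boxes overlap perfectly, which is one form of the metric inconsistency the survey discusses. Rotating either box by a further π also leaves its Gaussian unchanged, so the representation is insensitive to angle wrap-around.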

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.
