A Vietnamese benchmark for vehicle detection and real-time empirical evaluation
Abstract
Traffic in Vietnam faces many outstanding problems, congestion above all, because the supply of infrastructure has often failed to keep pace with the growth in mobility. Monitoring plans that help authorities make suitable and prompt decisions have therefore received considerable attention from the community. Meanwhile, applying information technology, especially advanced models that can process or analyze traffic data in real time, has recently come to be regarded as a priority solution because of the time and cost savings and the accuracy it can potentially achieve. This paper therefore presents research on three advanced real-time object detection methods, YOLOX, YOLOF, and YOLACT, together with the development of the newest Vietnamese traffic dataset, named UIT-VinaDeveS22. The work contains both theoretical and empirical analysis, which is expected to lay the groundwork for further studies addressing problems such as traffic density management, traffic separation, and traffic congestion.
Article Details
This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.