BagViT: Bagged vision transformers for classifying chest X-ray images

Truong Thi-Diem; Do Thanh-Nghi

doi:10.22144/ctujoisd.2025.050

Truong Thi-Diem and Do Thanh-Nghi*

* Corresponding author: Do Thanh-Nghi (email: dtnghi@ctu.edu.vn)

Received: 13 Jul 2025

Revised: 15 Aug 2025

Accepted: 02 Oct 2025

Published: 16 Oct 2025

How to Cite

Truong, T.-D., & Do, T.-N. (2025). BagViT: Bagged vision transformers for classifying chest X-ray images. CTU Journal of Innovation and Sustainable Development, 17(Special issue: ISDS), 29-36. https://doi.org/10.22144/ctujoisd.2025.050

Issue

Vol. 17 No. Special issue: ISDS (2025)

Section

Intelligent Systems and Data Science (ISDS 2025)

Abstract

In this paper, we propose a novel ensemble method, termed Bagged Vision Transformers (BagViT), to improve the classification accuracy of chest X-ray (CXR) images. BagViT constructs an ensemble of independent Vision Transformer (ViT) models, each trained on a bootstrap sample (drawn with replacement) from the original training dataset. To increase model diversity, we use MixUp to generate synthetic training examples and introduce training randomness by varying the number of training epochs and by selectively fine-tuning the top layers of each model. Final predictions are obtained through majority voting. Experimental results on a real-world dataset collected from Chau Doc Hospital (An Giang, Vietnam) demonstrate that BagViT significantly outperforms fine-tuned baselines such as VGG16, ResNet, DenseNet, and ViT. BagViT achieves a classification accuracy of 72.25%, highlighting the effectiveness of ensemble learning with transformer architectures on complex CXR images.

Keywords: Bagging, Deep learning, Lung disease classification, Vision transformer (ViT), X-ray images
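
The three generic building blocks named in the abstract (bootstrap sampling, MixUp, and majority voting) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the value alpha=0.2, and the use of plain Python lists in place of images and ViT models are all assumptions made for illustration.

```python
import random
from collections import Counter


def bootstrap_indices(n, rng):
    """Draw n indices with replacement from range(n): one bootstrap sample."""
    return [rng.randrange(n) for _ in range(n)]


def mixup(x1, y1, x2, y2, rng, alpha=0.2):
    """Blend two examples into one synthetic example.

    x1, x2 are feature vectors; y1, y2 are one-hot label vectors.
    The mixing weight is drawn from Beta(alpha, alpha).
    alpha=0.2 is an illustrative value, not taken from the paper.
    """
    lam = rng.betavariate(alpha, alpha)
    x = [lam * a + (1.0 - lam) * b for a, b in zip(x1, x2)]
    y = [lam * a + (1.0 - lam) * b for a, b in zip(y1, y2)]
    return x, y


def majority_vote(predictions):
    """Combine per-model predictions into one label per example.

    predictions[m][i] is the label that model m assigns to example i.
    Ties go to the label that appears first among the models.
    """
    voted = []
    for labels in zip(*predictions):
        voted.append(Counter(labels).most_common(1)[0][0])
    return voted


if __name__ == "__main__":
    rng = random.Random(0)
    # Indices of the training examples one ensemble member would see.
    print(bootstrap_indices(5, rng))
    # Three hypothetical models voting on three examples.
    print(majority_vote([[0, 1, 2], [0, 1, 1], [1, 1, 2]]))  # [0, 1, 2]
```

In an actual ensemble, each member would be a ViT fine-tuned on its own bootstrap sample, and `majority_vote` would receive the class labels predicted by each member on the test set.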

Conflict of Interest

The authors declare no conflicts of interest.
