Cross-Dataset Unified Vision Transformer Model for Diabetic Retinopathy Detection
Keywords:
Diabetic Retinopathy, Vision Transformers, EyePACS, APTOS
Abstract
Diabetic retinopathy (DR) is a leading preventable cause of blindness worldwide, caused by prolonged hyperglycemia that damages the retinal vasculature. Early detection and accurate grading of DR are essential for timely clinical intervention and improved patient outcomes. Machine learning and convolutional neural network (CNN)-based approaches have shown promise for DR screening, but their limited receptive fields and insufficient modeling of global context often restrict their ability to capture fine-grained lesions and long-range dependencies. Vision Transformers (ViTs) leverage self-attention mechanisms to capture global relationships across retinal structures. This article reviews ViT-based frameworks for DR grading. It concentrates on studies that use EyePACS and APTOS, two of the most widely used and diverse benchmarks of expert-annotated fundus images. The review covers a variety of recent advances, including hybrid CNN-ViT architectures, lesion-aware transformer modules, multi-scale feature aggregation, and federated learning strategies for privacy-preserving medical image analysis. It then highlights the role of interpretable attention maps in improving clinical trust and decision transparency. The review also examines the remaining challenges in DR grading, including extreme class imbalance across DR severity levels, the high computational cost of transformer models, and the need for robust explainability techniques to support clinical adoption. By connecting current achievements, unsolved issues, and new research directions, this review aims to guide researchers and practitioners toward designing efficient, generalizable, and clinically relevant ViT-based DR detection and grading systems. Overall, it shows that transformer-driven approaches have the potential to revolutionize automated ophthalmic diagnosis and enhance global diabetic eye care workflows.
License
This is an open access article published by the Research Center of Computing & Biomedical Informatics (RCBI), Lahore, Pakistan, under the CC BY 4.0 International License.



