VISION BASED DEEP LEARNING FRAMEWORK FOR SIGN LANGUAGE CONVERSION | IJET – Volume 12 Issue 2 | IJET-V12I2P101

International Journal of Engineering and Techniques (IJET)

Open Access • Peer Reviewed • High Citation & Impact Factor • ISSN: 2395-1303

Volume 12, Issue 2  |  Published: April 2026

Authors: S. Kumaran, S.G. Nandhini, A. Ushananthini, G. Jayanthini

DOI: https://doi.org/{{doi}}

Abstract

This paper introduces a real-time deep learning framework designed to bridge communication gaps for individuals with hearing and speech impairments. Leveraging the YOLOv8 object detection algorithm, the system accurately identifies American Sign Language (ASL) gestures in live video feeds. To balance portability with performance, the framework uses a hybrid edge-cloud architecture: at the edge, a Raspberry Pi Zero 2 W equipped with a Pi Camera captures image data; in the cloud, the data is transmitted to an AWS EC2 instance, where a FastAPI backend manages the inference process. The system demonstrated high reliability, achieving a mean Average Precision (mAP@0.5) of 95.6%. By instantly converting hand signals into readable text, the solution offers a cost-effective, portable tool for enhancing accessibility in healthcare, education, and the public sector.
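The paper does not publish its source code, but the edge-to-cloud handoff described above can be sketched in outline. In this hypothetical sketch, the Pi-side client wraps a JPEG-encoded camera frame in a base64 JSON payload and POSTs it to an assumed `/predict` endpoint served by the FastAPI backend on EC2; the endpoint URL, field names, and response shape are illustrative assumptions, not the authors' actual API.

```python
import base64
import json
import urllib.request

# Assumed endpoint on the EC2 instance; not specified in the paper.
API_URL = "http://ec2-host.example.com:8000/predict"

def build_payload(jpeg_bytes: bytes) -> dict:
    """Wrap one JPEG frame as a base64 JSON payload for the inference API."""
    return {"image_b64": base64.b64encode(jpeg_bytes).decode("ascii")}

def send_frame(jpeg_bytes: bytes) -> dict:
    """POST a single frame to the cloud endpoint and return parsed detections."""
    req = urllib.request.Request(
        API_URL,
        data=json.dumps(build_payload(jpeg_bytes)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req, timeout=5) as resp:
        # Hypothetical response shape, e.g. {"labels": ["A"], "boxes": [...]}.
        return json.loads(resp.read())
```

On the Pi, `jpeg_bytes` would come from the Pi Camera capture loop; base64-in-JSON is one simple transport choice, and a multipart file upload would work equally well with FastAPI.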

Keywords

YOLOv8, Sign Language Recognition, ASL, Deep Learning, Computer Vision, Raspberry Pi, AWS, FastAPI, Edge-cloud Architecture.

Conclusion

Using YOLOv8 and a hybrid edge-cloud architecture, this work delivers a real-time sign language recognition system. A Raspberry Pi Zero 2 W efficiently captures hand gestures, which a cloud-hosted model then processes for precise recognition. YOLOv8 provides high precision, speed, and robustness under varied environmental conditions, while offloading inference to the cloud lowers overall cost and relaxes hardware constraints on the edge device. The result is a practical way to translate sign language into text, improving communication for deaf and mute individuals, and a demonstration of how embedded systems and deep learning can be applied in assistive technology. Future work includes extending the system to full-word recognition and broader practical deployment.
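The reported mAP@0.5 rests on the IoU-at-0.5 matching rule: a predicted box counts as a true positive only if it overlaps a ground-truth box by at least 50% intersection-over-union. As a reminder of what that threshold means (this is a minimal sketch, not the authors' evaluation code):

```python
def iou(box_a, box_b):
    """Intersection-over-union of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    x1 = max(box_a[0], box_b[0])
    y1 = max(box_a[1], box_b[1])
    x2 = min(box_a[2], box_b[2])
    y2 = min(box_a[3], box_b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union else 0.0

def is_true_positive(pred_box, gt_box, threshold=0.5):
    """A detection counts toward mAP@0.5 only if IoU with ground truth >= 0.5."""
    return iou(pred_box, gt_box) >= threshold
```

For a 100x100 ground-truth box, a prediction shifted right by 25 pixels still passes (IoU = 0.6), while a 50-pixel shift fails (IoU ≈ 0.33), which gives a feel for how forgiving the 0.5 threshold is.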

Cite this article

APA
{{author}} (April 2026). {{title}}. International Journal of Engineering and Techniques (IJET), 12(2). https://doi.org/{{doi}}

IEEE
{{author}}, "{{title}}," International Journal of Engineering and Techniques (IJET), vol. 12, no. 2, April 2026, doi: {{doi}}.