International Journal of Science and Research (IJSR)

Fully Refereed | Open Access | Double Blind Peer Reviewed

ISSN: 2319-7064



Analysis Study Research Paper | Computer Science | India | Volume 12 Issue 10, October 2023


     

Image-to-Audio Captioning for the Visually Impaired

Neha Tyagi


Abstract: This article provides a comprehensive overview of the evolving landscape of image captioning, with a focus on its applications in accessibility for the visually impaired. It explores the challenges of real-time object recognition, traditional object detection methods, and the transformative impact of deep learning techniques, particularly those employing region proposal object detection algorithms. The paper introduces VisionVoice, a web application that converts text extracted from images into natural-sounding speech. The article details the image processing pipeline, including the preprocessing, segmentation, classification, and post-processing stages. It also covers the underlying mathematical concepts, image preprocessing techniques, and the shortcomings of existing models. The study highlights the ResNet-LSTM model's significant potential for generating descriptive and contextually coherent image captions, which in turn improves the quality of the synthesized speech. It also discusses the future scope of the VisionVoice project, emphasizing the potential for continued advances in accuracy and hardware capabilities and for the development of full Image-Speech conversion systems. The ultimate goal is to improve accessibility and inclusion, giving visually impaired individuals better access to information and a higher quality of life.


Keywords: Object detection, deep learning, image processing, text extraction, speech synthesis, image captioning, ResNet-LSTM, accessibility, visually impaired, future scope


Edition: Volume 12 Issue 10, October 2023


Pages: 1609 - 1613


DOI: https://www.doi.org/10.21275/SR231020171202



Neha Tyagi, "Image-to-Audio Captioning for the Visually Impaired", International Journal of Science and Research (IJSR), Volume 12 Issue 10, October 2023, pp. 1609-1613, https://www.ijsr.net/getabstract.php?paperid=SR231020171202, DOI: https://www.doi.org/10.21275/SR231020171202





