Multi-Modal Fusion for Enhanced Image and Speech Recognition in AI Systems

Ankur Tak

doi:10.21275/SR231208202748

Informative Article | Science and Technology | India | Volume 10 Issue 6, June 2021

Abstract: This research investigates how integrating multi-modal information, specifically images and speech, can enhance the recognition capabilities of artificial intelligence (AI) systems. Adopting an interpretive philosophy and a deductive approach, the study explores dynamic attention mechanisms, semi-supervised learning, and cross-domain adaptation techniques. It uses a descriptive research design based on secondary data collected from reputable academic sources. The research critically evaluates the feasibility and applicability of hardware optimization for efficient multi-modal processing, considering factors such as specialized processors and parallel computing. It analyzes dynamic attention mechanisms, emphasizing their role in allocating attention across modalities according to contextual relevance. It also examines semi-supervised learning techniques, which leverage both labeled and unlabeled data to improve recognition performance, and cross-domain adaptation techniques, which support deploying multi-modal fusion models in diverse real-world scenarios.
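
The abstract does not specify how attention is allocated across modalities, so the following is only an illustrative sketch of one common approach, not the author's method. It assumes each modality has already been encoded as a fixed-length vector and that a context vector is available; the names `fuse`, `image_vec`, `speech_vec`, and `context_vec` are hypothetical. Each modality is scored by its dot product with the context, the scores are normalized with a softmax, and the fused representation is the weighted sum of the two vectors.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def fuse(image_vec, speech_vec, context_vec):
    """Hypothetical fusion step: weight each modality by its relevance to the context.

    Relevance is assumed to be a plain dot product with the context vector;
    the source does not state which scoring function is used.
    """
    modalities = [image_vec, speech_vec]
    # One relevance score per modality.
    scores = [sum(a * b for a, b in zip(vec, context_vec)) for vec in modalities]
    # Attention weights over the two modalities; they sum to 1.
    weights = softmax(scores)
    # Fused vector: element-wise weighted sum of the modality vectors.
    fused = [
        sum(w * vec[i] for w, vec in zip(weights, modalities))
        for i in range(len(context_vec))
    ]
    return weights, fused


# Toy example: the context aligns with the image embedding, so the image
# modality receives the larger attention weight.
weights, fused = fuse([1.0, 0.0], [0.0, 1.0], [2.0, 0.0])
print(weights, fused)
```

In this toy setting the weights shift toward whichever modality better matches the context, which is the behavior the abstract describes as allocating attention by contextual relevance. A trained system would typically learn the scoring function rather than use a fixed dot product.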

Keywords: AI systems, knowledge, connecting, integrating, multi-modal classification, aural, visual information

Edition: Volume 10 Issue 6, June 2021

Pages: 1780 - 1788

DOI: https://www.doi.org/10.21275/SR231208202748

Ankur Tak, "Multi-Modal Fusion for Enhanced Image and Speech Recognition in AI Systems", International Journal of Science and Research (IJSR), Volume 10 Issue 6, June 2021, pp. 1780-1788, https://www.ijsr.net/getabstract.php?paperid=SR231208202748, DOI: https://www.doi.org/10.21275/SR231208202748