Informative Article | Science and Technology | India | Volume 10 Issue 6, June 2021
Multi-Modal Fusion for Enhanced Image and Speech Recognition in AI Systems
Ankur Tak
Abstract: This research investigates how integrating multi-modal information, specifically images and speech, can enhance the recognition capabilities of artificial intelligence (AI) systems. Adopting an interpretive philosophy and a deductive approach, the study explores the potential of dynamic attention mechanisms, semi-supervised learning, and cross-domain adaptation techniques. It uses a descriptive research design based on secondary data collected from reputable academic sources. The research critically evaluates the feasibility and applicability of hardware optimization for efficient multi-modal processing, considering factors such as specialized processors and parallel computing. It analyzes dynamic attention mechanisms, emphasizing their role in allocating attention across modalities according to contextual relevance. It also examines semi-supervised learning techniques, which leverage both labeled and unlabeled data to improve recognition performance, and cross-domain adaptation techniques, which facilitate the deployment of multi-modal fusion models in diverse real-world scenarios.
Keywords: AI systems, knowledge, connecting, integrating, multi-modal classification, aural, visual information
Edition: Volume 10 Issue 6, June 2021
Pages: 1780 - 1788
DOI: https://www.doi.org/10.21275/SR231208202748
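Illustrative sketch (not taken from the article): the abstract describes dynamic attention mechanisms that allocate attention across modalities according to contextual relevance. One common way to realize that idea is to score each modality embedding against a context vector and normalize the scores with a softmax, so that the fused representation is a weighted sum. The function names (`softmax`, `dot`, `fuse`), the scaled dot-product scoring rule, and the toy vectors below are all assumptions made for illustration; the article does not specify an implementation.

```python
import math
from typing import List, Sequence, Tuple


def softmax(scores: Sequence[float]) -> List[float]:
    """Numerically stable softmax over a short list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a: Sequence[float], b: Sequence[float]) -> float:
    """Plain dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def fuse(
    image_emb: Sequence[float],
    speech_emb: Sequence[float],
    context: Sequence[float],
) -> Tuple[List[float], List[float]]:
    """Weight two modality embeddings by their relevance to a context vector.

    Hypothetical scoring rule: scaled dot product between each embedding
    and the context. Returns ([image_weight, speech_weight], fused_embedding).
    """
    if not (len(image_emb) == len(speech_emb) == len(context)):
        raise ValueError("all vectors must share the same dimension")

    scale = math.sqrt(len(context))
    scores = [dot(image_emb, context) / scale, dot(speech_emb, context) / scale]
    w_image, w_speech = softmax(scores)

    # The fused embedding is a convex combination of the two modalities.
    fused = [w_image * i + w_speech * s for i, s in zip(image_emb, speech_emb)]
    return [w_image, w_speech], fused


if __name__ == "__main__":
    # Toy vectors, invented purely for this example.
    image_emb = [0.9, 0.1, 0.0, 0.2]
    speech_emb = [0.1, 0.8, 0.3, 0.0]
    context = [1.0, 0.0, 0.0, 0.5]  # context that aligns more with the image

    weights, fused = fuse(image_emb, speech_emb, context)
    print("modality weights (image, speech):", weights)
    print("fused embedding:", fused)
```

In this sketch the weights always sum to 1, and the modality whose embedding aligns more closely with the context receives the larger share. A trained system would typically learn the scoring function rather than use a fixed dot product.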