Research Paper | Computer Science & Engineering | India | Volume 12 Issue 1, January 2023
Enhancing Fashion Image Retrieval with Multi-Modal Query and Zero-Shot Learning for Cross-Domain
Abstract: Content-Based Image Retrieval (CBIR) systems face two main challenges: (a) generalizability and (b) retrieval on cross-domain data. Fashion Image Retrieval (FIR) encounters the cross-domain challenge because user-shot photographs differ from product photographs in viewpoint, lighting conditions, and the presence of complex backgrounds. A relevant query is crucial for retrieving the closest match, and the lack of an adequately relevant query is a major cause of low generalizability in CBIR. This research targets both challenges: multi-modal queries address the first, and a zero-shot learning model addresses the second by improving retrieval accuracy on cross-domain data for FIR. The DeepFashion dataset (Liu et al., 2016), which includes cross-domain data, will be used to propose a system that can retrieve images based on user-shot images and text queries. The model will be evaluated with Recall, Retrieval Accuracy, F1 score, and Mean Average Precision (mAP), and the metrics will be reported for each attribute type.
Keywords: CBIR, Image Retrieval, Cross-Domain, Multi-Modal Query, Zero-Shot Learning
Edition: Volume 12 Issue 1, January 2023
Pages: 131 - 141