Research Paper | Computer Science and Information Technology | United States of America | Volume 12 Issue 7, July 2023
Optimizing Efficiency and Performance: Investigating Data Pipelines for Artificial Intelligence Model Development and Practical Applications
Manoj Suryadevara, Sandeep Rangineni, Srinivas Venkata
Abstract: Because of the nature of AI, businesses find it difficult to continually build and deploy models to complex production systems while maintaining quality. The pipeline has four steps: data processing, model training, code creation, and system management. We relate the difficulties of pipeline deployment and modification to these four phases. The potential for ongoing model improvement to boost AI performance and flexibility has drawn considerable interest in both academia and industry, and this report surveys ongoing efforts in both to advance AI model development. We begin with an overview of the pipeline's most important parts: data collection and preparation, model development and assessment, rollout and monitoring, and iterative refinement. We discuss the difficulties at each stage and review recent research developments and best practices in the field. We then explore the current state of research on data collection and preprocessing, with particular emphasis on methods for gathering and cleaning large-scale datasets, working with databases, and ensuring privacy and security. To address the interpretability and fairness of models, we also review methods for training and evaluating models, such as transfer learning, reinforcement learning, and explainability approaches. We then examine the deployment phase, covering best practices for deploying models across different environments as well as the advantages and disadvantages of containerization and scalability. We address methods for updating and retraining models, as well as the need for continual monitoring and assessment to detect model drift, bias, and performance decline.
Finally, we examine feedback loops and their role in the continuous development pipeline, with special emphasis on the value of user input, human-in-the-loop strategies, and assessment methods designed with the end user in mind. We discuss the ethical concerns that arise in the ongoing development of AI models, including algorithmic bias, transparency, and accountability. We hope that this in-depth look at the AI model creation process will help academics and practitioners make more informed decisions. By addressing the obstacles and advances at each stage, we aim to support the trustworthy and beneficial deployment of AI models across a variety of fields, point the way for future research, and highlight the need for robust and responsible AI development procedures.
Keywords: Data Pipeline, Artificial Intelligence, Machine Learning Operations, Data Quality
Edition: Volume 12 Issue 7, July 2023
Pages: 1330 - 1340
DOI: https://www.doi.org/10.21275/SR23719211528