Enhancing Big Data Interoperability: Automating Schema Expansion from Parquet to BigQuery
International Journal of Science and Research (IJSR)


ISSN: 2319-7064



Informative Article | Data & Knowledge Engineering | India | Volume 8 Issue 4, April 2019



Preyaa Atri


Abstract: In the realm of data engineering, efficient data migration and transformation are pivotal. The Parquet Schema Expansion Migrator for BigQuery is a Python library designed to streamline the process of migrating column data from Parquet files to Google BigQuery tables, while expanding the BigQuery table schema to accommodate columns present in the Parquet data but missing from the BigQuery schema. This paper explores the problem of schema evolution in data warehouses, introduces the library as a solution, discusses its uses and impact, and outlines future enhancements and recommendations for robust data type management.
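The schema expansion step described in the abstract can be illustrated with a minimal sketch. This is not the library's actual API: the function names, the `(name, type)` tuple representation of a schema, and the type mapping below are all assumptions made for illustration. In practice the Parquet schema would be read from the file and the expanded schema applied through a BigQuery client, both of which are omitted here.

```python
# Hypothetical sketch of schema expansion; names and mapping are illustrative
# assumptions, not taken from the paper or the library.

# Assumed mapping from Parquet/Arrow-style type names to BigQuery column types.
PARQUET_TO_BQ = {
    "int32": "INT64",
    "int64": "INT64",
    "float": "FLOAT64",
    "double": "FLOAT64",
    "bool": "BOOL",
    "string": "STRING",
    "binary": "BYTES",
    "date32": "DATE",
}


def missing_columns(parquet_schema, bq_schema):
    """Return (name, bq_type) pairs present in the Parquet schema but absent
    from the BigQuery schema. Names are compared case-insensitively."""
    existing = {name.lower() for name, _ in bq_schema}
    missing = []
    for name, parquet_type in parquet_schema:
        if name.lower() in existing:
            continue
        if parquet_type not in PARQUET_TO_BQ:
            # Fail loudly rather than guess a type for an unmapped column.
            raise ValueError(f"no BigQuery type mapped for {name!r}: {parquet_type}")
        missing.append((name, PARQUET_TO_BQ[parquet_type]))
    return missing


def expand_schema(parquet_schema, bq_schema):
    """Return the BigQuery schema with the missing columns appended.
    Existing columns keep their order and types."""
    return list(bq_schema) + missing_columns(parquet_schema, bq_schema)


parquet_schema = [("id", "int64"), ("Name", "string"), ("score", "double")]
bq_schema = [("id", "INT64"), ("name", "STRING")]
print(expand_schema(parquet_schema, bq_schema))
```

Appending new columns rather than altering existing ones keeps the change additive. Raising an error on an unmapped type, instead of silently falling back to a default, is one possible design choice for the data type management concern the abstract raises.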


Keywords: BigQuery, Parquet, Schema Migration, Data Engineering, Cloud Storage, Data Transformation


Edition: Volume 8 Issue 4, April 2019


Pages: 2000 - 2002



Preyaa Atri, "Enhancing Big Data Interoperability: Automating Schema Expansion from Parquet to BigQuery", International Journal of Science and Research (IJSR), Volume 8 Issue 4, April 2019, pp. 2000-2002, https://www.ijsr.net/getabstract.php?paperid=SR24522144712, DOI: https://www.doi.org/10.21275/SR24522144712
