Informative Article | Data & Knowledge Engineering | India | Volume 8 Issue 4, April 2019
Enhancing Big Data Interoperability: Automating Schema Expansion from Parquet to BigQuery
Preyaa Atri
Abstract: Efficient data migration and transformation are central to data engineering. The Parquet Schema Expansion Migrator for BigQuery is a Python library that streamlines migrating column data from Parquet files to Google BigQuery tables and expands the BigQuery table schema to include columns that are present in the Parquet data but missing from the table. This paper examines the problem of schema evolution in data warehouses, introduces the library as a solution, discusses its uses and impact, and outlines future enhancements and recommendations for robust data type management.
Keywords: BigQuery, Parquet, Schema Migration, Data Engineering, Cloud Storage, Data Transformation
Edition: Volume 8 Issue 4, April 2019
Pages: 2000 - 2002
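The schema expansion step described in the abstract can be illustrated with a minimal sketch. It is not the library's actual implementation: the function name `plan_schema_expansion`, the `ARROW_TO_BQ` mapping, and the use of plain dictionaries for schemas are all assumptions made for illustration. In a real pipeline the Parquet schema would typically come from `pyarrow.parquet.read_schema` and the additions would be applied with the BigQuery client's `update_table`; both are left out here so the example runs without cloud credentials.

```python
# Hypothetical mapping from Arrow/Parquet type names to BigQuery types.
# This table is an assumption for illustration, not the library's own mapping.
ARROW_TO_BQ = {
    "int32": "INTEGER",
    "int64": "INTEGER",
    "float": "FLOAT",
    "double": "FLOAT",
    "bool": "BOOLEAN",
    "string": "STRING",
    "binary": "BYTES",
}


def plan_schema_expansion(parquet_fields, bq_fields):
    """Return (name, bq_type, mode) tuples for columns that are present in
    the Parquet schema but missing from the BigQuery schema.

    Both arguments are dicts mapping column name to type name.
    """
    # Compare names case-insensitively, since BigQuery treats column
    # names as case-insensitive.
    existing = {name.lower() for name in bq_fields}
    additions = []
    for name, arrow_type in parquet_fields.items():
        if name.lower() in existing:
            continue  # column already exists in the table
        bq_type = ARROW_TO_BQ.get(arrow_type)
        if bq_type is None:
            # Fail loudly rather than guess a type for unmapped data.
            raise ValueError(f"No BigQuery type mapped for {name!r}: {arrow_type}")
        # Columns added to an existing table cannot be REQUIRED,
        # so new columns are planned as NULLABLE.
        additions.append((name, bq_type, "NULLABLE"))
    return additions


# Example: the table has only "id"; the Parquet file adds two columns.
parquet_schema = {"id": "int64", "name": "string", "score": "double"}
table_schema = {"id": "INTEGER"}
new_columns = plan_schema_expansion(parquet_schema, table_schema)
print(new_columns)
```

Raising an error on an unmapped type, rather than defaulting to STRING, reflects the abstract's emphasis on robust data type management; a production tool might instead make that behavior configurable.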