International Journal of Science and Research (IJSR)

Fully Refereed | Open Access | Double Blind Peer Reviewed

ISSN: 2319-7064


Informative Article | Data & Knowledge Engineering | India | Volume 8 Issue 4, April 2019


Enhancing Big Data Interoperability: Automating Schema Expansion from Parquet to BigQuery

Preyaa Atri


Abstract: Efficient data migration and transformation are central to data engineering. The Parquet Schema Expansion Migrator for BigQuery is a Python library that streamlines migrating column data from Parquet files to Google BigQuery tables and expands the BigQuery table schema to accommodate columns that are present in the Parquet data but missing from the BigQuery schema. This paper examines the problem of schema evolution in data warehouses, introduces the library as a solution, discusses its uses and impact, and outlines future enhancements and recommendations for robust data type management.
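The schema expansion step described in the abstract can be illustrated with a minimal sketch. It is not the library's actual API: the function name `plan_schema_expansion`, the `PARQUET_TO_BQ` type mapping, and the fallback to `STRING` for unmapped types are all assumptions made for illustration. The sketch only computes which columns would need to be appended to the BigQuery schema; it does not read Parquet files or call BigQuery.

```python
from typing import Dict, List, Tuple

# Hypothetical mapping from Parquet-style type names to BigQuery column types.
# A real migrator would need a fuller mapping (nested types, decimals, etc.).
PARQUET_TO_BQ: Dict[str, str] = {
    "int32": "INTEGER",
    "int64": "INTEGER",
    "float": "FLOAT",
    "double": "FLOAT",
    "bool": "BOOLEAN",
    "string": "STRING",
    "binary": "BYTES",
    "timestamp": "TIMESTAMP",
}


def plan_schema_expansion(
    parquet_schema: Dict[str, str],
    bq_schema: List[Tuple[str, str]],
) -> List[Tuple[str, str]]:
    """Return (name, bq_type) pairs for columns present in the Parquet
    schema but missing from the BigQuery schema, in Parquet column order."""
    # Compare names case-insensitively, since BigQuery column names are
    # case-insensitive.
    existing = {name.lower() for name, _ in bq_schema}
    additions: List[Tuple[str, str]] = []
    for name, parquet_type in parquet_schema.items():
        if name.lower() in existing:
            continue
        # Assumption: unmapped Parquet types fall back to STRING.
        additions.append((name, PARQUET_TO_BQ.get(parquet_type, "STRING")))
    return additions


if __name__ == "__main__":
    parquet_cols = {"id": "int64", "email": "string", "score": "double"}
    bq_cols = [("id", "INTEGER")]
    for name, bq_type in plan_schema_expansion(parquet_cols, bq_cols):
        print(f"would add column {name} ({bq_type})")
```

Appending only new columns leaves existing columns and their data untouched, which is why this kind of expansion is generally a safe form of schema evolution.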


Keywords: BigQuery, Parquet, Schema Migration, Data Engineering, Cloud Storage, Data Transformation


Edition: Volume 8 Issue 4, April 2019


Pages: 2000 - 2002
