AirFlights Gold Layer – Round-Trip Analysis
Tools: Databricks | Pandas | PyArrow | PySpark | Delta Lake
Description:
This notebook transforms cleaned Silver-layer flight data into business-ready Gold-layer analytics by identifying the cheapest round-trip flight combinations across airlines and routes.
The notebook begins by loading the necessary libraries and defining strict PyArrow schemas to keep column names and types consistent throughout processing. It then reads both direct and connecting flight records from Unity Catalog’s Silver Delta tables, consolidates them into a single dataset, and categorizes trips by type.
The core logic matches outbound and return flights by airline and mirrored origin/destination pairs, calculates total round-trip pricing, and extracts the cheapest combination per airline and route. Results are surfaced through a temporary SQL view, making them immediately queryable for validation and downstream consumption.
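As a minimal sketch of the matching logic described above, the pandas example below self-joins a flights table on airline and mirrored origin/destination, sums the two prices, and keeps the cheapest pair per airline and route. The column names and the `cheapest_round_trips` helper are hypothetical, and the sketch ignores travel dates, which a real pipeline would presumably also constrain. The notebook itself may implement this step in PySpark or SQL instead.

```python
import pandas as pd


def cheapest_round_trips(flights: pd.DataFrame) -> pd.DataFrame:
    """Return the cheapest outbound/return pair per airline and route.

    Expects hypothetical columns: airline, origin, destination, price.
    """
    outbound = flights.add_prefix("out_")
    inbound = flights.add_prefix("ret_")

    # Same airline; the return leg's origin/destination mirror the outbound leg's.
    pairs = outbound.merge(
        inbound,
        left_on=["out_airline", "out_origin", "out_destination"],
        right_on=["ret_airline", "ret_destination", "ret_origin"],
    )
    pairs["total_price"] = pairs["out_price"] + pairs["ret_price"]

    # Row label of the minimum total price within each airline/route group.
    route_keys = ["out_airline", "out_origin", "out_destination"]
    cheapest_idx = pairs.groupby(route_keys)["total_price"].idxmin()

    columns = {
        "out_airline": "airline",
        "out_origin": "origin",
        "out_destination": "destination",
        "out_price": "outbound_price",
        "ret_price": "return_price",
        "total_price": "total_price",
    }
    result = pairs.loc[cheapest_idx, list(columns)].rename(columns=columns)
    return result.reset_index(drop=True)
```

Because the join is an inner join, routes with no mirrored return flight on the same airline drop out of the result. Each matched route appears once per direction, since a round trip can start from either city.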
The final output is a clean, aggregated dataset of optimal round-trip flight pairings, structured for direct use in business reporting, dashboards, and further analytics.
