{"id":1196,"date":"2026-01-03T20:55:12","date_gmt":"2026-01-03T20:55:12","guid":{"rendered":"https:\/\/ranaghazzi.com\/?page_id=1196"},"modified":"2026-04-17T18:28:24","modified_gmt":"2026-04-17T18:28:24","slug":"ibm-using-databricks-pyspar","status":"publish","type":"page","link":"https:\/\/ranaghazzi.com\/?page_id=1196","title":{"rendered":"IBM Stocks ETL \u2013 Databricks"},"content":{"rendered":"<p><style>\n    .light-font-container, .light-font-container p, .light-font-container h2, .light-font-container li {<br \/>\n        font-weight: #FFFFFF !important;<br \/>\n    }<br \/>\n<\/style>\n<\/p>\n<div class=\"light-font-container\" style=\"background-color: #2b85d9; padding: 40px; border-radius: 15px;\">\n\n\n<div style=\"height:13px\" aria-hidden=\"true\" class=\"wp-block-spacer\"><\/div>\n\n\n\n<h2 class=\"wp-block-heading has-text-align-left has-contrast-color has-text-color has-background has-link-color has-large-font-size wp-elements-4a873062d42abcc91be409a7af64b432\" style=\"background-color:#2b85d9\">Tools: Databricks | PySpark | pandas | <code>NumPy<\/code> |&nbsp;<code>delta.tables<\/code><\/h2>\n\n\n\n<div style=\"height:24px\" aria-hidden=\"true\" class=\"wp-block-spacer\"><\/div>\n\n\n\n<div class=\"wp-block-group alignwide is-content-justification-center is-nowrap is-layout-flex wp-container-core-group-is-layout-bfbbbc10 wp-block-group-is-layout-flex\">\n<div class=\"wp-block-buttons is-content-justification-left is-layout-flex wp-container-core-buttons-is-layout-fc4fd283 wp-block-buttons-is-layout-flex\">\n<div class=\"wp-block-button\"><a class=\"wp-block-button__link has-background has-large-font-size has-custom-font-size wp-element-button\" href=\"https:\/\/github.com\/Ranoush75\/Databricks_IBM.git\" style=\"background-color:#6ad92b96\"><strong>GitHub<\/strong><\/a><\/div>\n\n\n\n<div class=\"wp-block-button\"><a class=\"wp-block-button__link has-background has-large-font-size has-custom-font-size wp-element-button\" 
href=\"https:\/\/ranoush75.github.io\/Databricks_IBM\/\" style=\"background-color:#2bcdd9\">    <strong>Diagram<\/strong>      <\/a><\/div>\n<\/div>\n\n\n\n<div style=\"height:100px;width:0px\" aria-hidden=\"true\" class=\"wp-block-spacer wp-container-content-6388d5dc\"><\/div>\n<\/div>\n\n\n\n<h2 class=\"wp-block-heading has-contrast-color has-text-color has-link-color has-x-large-font-size wp-elements-3c937b15830dbe38fa7a627b2d1c8cf3\">Description:<\/h2>\n\n\n\n<div class=\"wp-block-group has-global-padding is-layout-constrained wp-block-group-is-layout-constrained\">\n<p class=\"has-contrast-color has-text-color has-background has-link-color has-large-font-size wp-elements-f1ae0293f2ead9a94b7193e06326d451\" style=\"background-color:#2b85d9\">This repo contains a Databricks notebook that is intended to run on a&nbsp;schedule&nbsp;to pull IBM daily stock data from an&nbsp;API&nbsp;and process changes through a&nbsp;layered (Bronze \u2192 Silver)&nbsp;Delta Lake design.<\/p>\n\n\n\n<p class=\"has-contrast-color has-text-color has-background has-link-color has-large-font-size wp-elements-88e7c25e5c29df24d5a506b70efb1f23\" style=\"background-color:#2b85d9\">The API returns data in JSON format, including the daily open, high, low, and close prices and the trading volume for IBM.<\/p>\n\n\n\n<p class=\"has-contrast-color has-text-color has-background has-link-color has-large-font-size wp-elements-0dfcf49df8a8f1a3ca08097667944868\" style=\"background-color:#2b85d9\">The dataset includes the latest 100 trading days. 
Data is updated every two to three days.<\/p>\n<\/div>\n\n\n\n<h2 class=\"wp-block-heading has-contrast-color has-text-color has-link-color has-x-large-font-size wp-elements-526a0ecbee4576089afedd2d6786c53d\"><strong>What it does:<\/strong><\/h2>\n\n\n\n<p class=\"has-contrast-color has-text-color has-background has-link-color has-large-font-size wp-elements-36545b4eead763f72b5b9deca34ffedd\" style=\"background-color:#2b85d9\">To ensure that business intelligence tools and dashboards display the most current data, we build a CDC (change data capture) pipeline that enables continuous, incremental updates to the data warehouse by propagating only the changes instead of reloading complete datasets. We also keep all historical data in our staging table as the 100-day API window moves forward.<\/p>\n\n\n\n<hr class=\"wp-block-separator has-alpha-channel-opacity\"\/>\n\n\n\n<h2 class=\"wp-block-heading has-text-align-center has-accent-4-color has-text-color has-background has-link-color has-x-large-font-size wp-elements-432f8170f2306ef4898e89f70ab967ed\" style=\"background-color:#2b85d9\">Bronze Layer<\/h2>\n\n\n\n<div style=\"height:0px\" aria-hidden=\"true\" class=\"wp-block-spacer\"><\/div>\n\n\n\n<div class=\"wp-block-group is-layout-grid wp-container-core-group-is-layout-e2bd5cb0 wp-block-group-is-layout-grid\">\n<hr class=\"wp-block-separator has-alpha-channel-opacity\"\/>\n<\/div>\n\n\n\n<h2 class=\"wp-block-heading has-contrast-color has-text-color has-link-color has-large-font-size wp-elements-358d8eddda449ea3940438b884bf4854\">Bronze layer (workspace.bronze.ibm) \u2014 ingestion + history<\/h2>\n\n\n\n<div class=\"wp-block-group has-global-padding is-layout-constrained wp-block-group-is-layout-constrained\">\n<div class=\"wp-block-group has-contrast-color has-text-color has-background has-link-color wp-elements-1c18bb8f18af2d162c085df4fe9b79bb has-global-padding is-layout-constrained wp-block-group-is-layout-constrained\" 
style=\"background-color:#2b86d9a1\">\n<p class=\"has-contrast-color has-text-color has-background has-link-color has-large-font-size wp-elements-842e988473fed0c5fe581af9f5e19f98\" style=\"background-color:#2b86d9a1\">Connects to the API and ingests daily IBM stock data (JSON \u2192 tabular).<\/p>\n\n\n\n<p class=\"has-contrast-color has-text-color has-background has-link-color has-large-font-size wp-elements-cd11e57c64a8d59a26655fa9f0765395\" style=\"background-color:#2b86d9a1\"><strong>Appends only new records<\/strong>&nbsp;each scheduled run (based on a watermark \/ max&nbsp;<code>Date<\/code>&nbsp;in the existing Bronze table).<\/p>\n\n\n\n<p class=\"has-contrast-color has-text-color has-background has-link-color has-large-font-size wp-elements-6e8c5128fd08415e4fcb85b6e747ff58\" style=\"background-color:#2b86d9a1\"><strong>Keeps all historical data<\/strong>&nbsp;(Bronze is the long-term, append-friendly history layer).<\/p>\n\n\n\n<p class=\"has-contrast-color has-text-color has-background has-link-color has-large-font-size wp-elements-d58bf43c91b8af23e88d7127afd7beec\" style=\"background-color:#2b86d9a1\"><strong>Key idea:<\/strong>&nbsp;Bronze grows over time and preserves what was ingested each run (historic retention).<\/p>\n<\/div>\n<\/div>\n\n\n\n<hr class=\"wp-block-separator has-alpha-channel-opacity\"\/>\n\n\n\n<h2 class=\"wp-block-heading has-text-align-center has-accent-4-color has-text-color has-background has-link-color has-x-large-font-size wp-elements-73b264cbda630581c093417a6bbfa37a\" style=\"background-color:#2b85d9\"><strong>Silver Layer<\/strong><\/h2>\n\n\n\n<hr class=\"wp-block-separator has-alpha-channel-opacity\"\/>\n\n\n\n<h3 class=\"wp-block-heading has-contrast-color has-text-color has-background has-link-color has-large-font-size wp-elements-042c4f2519a577ca0667fb88e1eb8cda\" style=\"background-color:#2b85d9\">Silver layer (workspace.silver.ibm) \u2014 curated \u201clatest version\u201d per record<a 
href=\"https:\/\/github.com\/Ranoush75\/Databricks_IBM#silver-layer-workspacesilveribm--curated-latest-version-per-record\"><\/a><\/h3>\n\n\n\n<ul style=\"background-color:#2b85d9\" class=\"wp-block-list has-contrast-color has-text-color has-background has-link-color has-large-font-size wp-elements-ee9f2d0fba2f6086fc7234869b67ab73\">\n<li class=\"has-contrast-color has-text-color has-link-color wp-elements-7be3f15f448c499c7a477b0260c4e618\">Reads from the Bronze table.<\/li>\n\n\n\n<li class=\"has-contrast-color has-text-color has-link-color wp-elements-161fa92f9e2880bdc87aa585b5e9dffc\">Applies cleaning\/validation steps (casting types, dropping nulls on critical columns, de-duplication on the key).<\/li>\n\n\n\n<li class=\"has-contrast-color has-text-color has-link-color wp-elements-8670a896b98afc31a9eaa0b67ea4a044\">Loads into Silver using a Delta&nbsp;<strong>MERGE (upsert)<\/strong>&nbsp;keyed by&nbsp;<code>Date<\/code>.<\/li>\n<\/ul>\n\n\n\n<div class=\"wp-block-group has-background has-global-padding is-layout-constrained wp-block-group-is-layout-constrained\" style=\"background-color:#2b86d9a1\">\n<div class=\"wp-block-group has-contrast-color has-text-color has-background has-link-color wp-elements-6c1f3f5245c74aa0b74482f5c637b591 has-global-padding is-layout-constrained wp-block-group-is-layout-constrained\" style=\"background-color:#2b86d9a1\">\n<p class=\"has-contrast-color has-text-color has-link-color has-large-font-size wp-elements-593fc175372ffc7fb0b681bf6a19c740\"><strong>Key idea:<\/strong>&nbsp;Silver represents the&nbsp;<strong>latest version of each record (per&nbsp;<code>Date<\/code>)<\/strong>:<\/p>\n\n\n\n<ul class=\"wp-block-list has-contrast-color has-text-color has-link-color has-large-font-size wp-elements-8f9c3d12386ae38c156bd85065f16845\">\n<li>If a&nbsp;<code>Date<\/code>&nbsp;already exists \u2192 it can be updated (latest values kept)<\/li>\n\n\n\n<li>If a&nbsp;<code>Date<\/code>&nbsp;is new \u2192 it is 
inserted<\/li>\n<\/ul>\n<\/div>\n<\/div>\n","protected":false},"excerpt":{"rendered":"<p>Tools: Databricks | Pyspark | Pandas| Numpy |&nbsp;delta.tables Description: This repo contains a Databricks notebook that is intended to run on a&nbsp;schedule&nbsp;to pull IBM daily stock data from an&nbsp;API&nbsp;and process changes through a&nbsp;layered (Bronze \u2192 Silver)&nbsp;Delta Lake design. The API returns data in JSON format, including daily open, high, low, close prices, and volume for [&hellip;]<\/p>\n","protected":false},"author":2,"featured_media":0,"parent":30,"menu_order":0,"comment_status":"closed","ping_status":"closed","template":"","meta":{"footnotes":""},"class_list":["post-1196","page","type-page","status-publish","hentry"],"yoast_head":"<!-- This site is optimized with the Yoast SEO plugin v27.2 - https:\/\/yoast.com\/product\/yoast-seo-wordpress\/ -->\n<title>IBM Stocks ETL \u2013 Databricks - Rana Nasri Ghazzi<\/title>\n<meta name=\"description\" content=\"Browse real-world data projects by Rana Ghazzi, covering data cleaning, analysis, and storytelling with Python, SQL, and Tableau Explore Rana Ghazzi&#039;s data analytics portfolio \u2014 dashboards, visualizations, and insights built with Tableau, Power BI &amp; Python\" \/>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"https:\/\/ranaghazzi.com\/?page_id=1196\" \/>\n<meta property=\"og:locale\" content=\"en_US\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"IBM Stocks ETL \u2013 Databricks - Rana Nasri Ghazzi\" \/>\n<meta property=\"og:description\" content=\"Browse real-world data projects by Rana Ghazzi, covering data cleaning, analysis, and storytelling with Python, SQL, and Tableau Explore Rana Ghazzi&#039;s data analytics portfolio \u2014 dashboards, visualizations, and insights built with Tableau, Power BI &amp; Python\" \/>\n<meta 
property=\"og:url\" content=\"https:\/\/ranaghazzi.com\/?page_id=1196\" \/>\n<meta property=\"og:site_name\" content=\"Rana Nasri Ghazzi\" \/>\n<meta property=\"article:modified_time\" content=\"2026-04-17T18:28:24+00:00\" \/>\n<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\n<meta name=\"twitter:label1\" content=\"Est. reading time\" \/>\n\t<meta name=\"twitter:data1\" content=\"2 minutes\" \/>\n<script type=\"application\/ld+json\" class=\"yoast-schema-graph\">{\"@context\":\"https:\/\/schema.org\",\"@graph\":[{\"@type\":\"WebPage\",\"@id\":\"https:\/\/ranaghazzi.com\/?page_id=1196\",\"url\":\"https:\/\/ranaghazzi.com\/?page_id=1196\",\"name\":\"IBM Stocks ETL \u2013 Databricks - Rana Nasri Ghazzi\",\"isPartOf\":{\"@id\":\"https:\/\/ranaghazzi.com\/#website\"},\"datePublished\":\"2026-01-03T20:55:12+00:00\",\"dateModified\":\"2026-04-17T18:28:24+00:00\",\"description\":\"Browse real-world data projects by Rana Ghazzi, covering data cleaning, analysis, and storytelling with Python, SQL, and Tableau Explore Rana Ghazzi's data analytics portfolio \u2014 dashboards, visualizations, and insights built with Tableau, Power BI & Python\",\"breadcrumb\":{\"@id\":\"https:\/\/ranaghazzi.com\/?page_id=1196#breadcrumb\"},\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"ReadAction\",\"target\":[\"https:\/\/ranaghazzi.com\/?page_id=1196\"]}]},{\"@type\":\"BreadcrumbList\",\"@id\":\"https:\/\/ranaghazzi.com\/?page_id=1196#breadcrumb\",\"itemListElement\":[{\"@type\":\"ListItem\",\"position\":1,\"name\":\"Home\",\"item\":\"https:\/\/ranaghazzi.com\/\"},{\"@type\":\"ListItem\",\"position\":2,\"name\":\"Home\",\"item\":\"https:\/\/ranaghazzi.com\/\"},{\"@type\":\"ListItem\",\"position\":3,\"name\":\"Projects\",\"item\":\"https:\/\/ranaghazzi.com\/?page_id=30\"},{\"@type\":\"ListItem\",\"position\":4,\"name\":\"IBM Stocks ETL \u2013 
Databricks\"}]},{\"@type\":\"WebSite\",\"@id\":\"https:\/\/ranaghazzi.com\/#website\",\"url\":\"https:\/\/ranaghazzi.com\/\",\"name\":\"Rana Nasri Ghazzi\",\"description\":\"Turning Data into Decisions\",\"publisher\":{\"@id\":\"https:\/\/ranaghazzi.com\/#\/schema\/person\/d8ee34f53cb0df9faaf816fb5363a4cc\"},\"potentialAction\":[{\"@type\":\"SearchAction\",\"target\":{\"@type\":\"EntryPoint\",\"urlTemplate\":\"https:\/\/ranaghazzi.com\/?s={search_term_string}\"},\"query-input\":{\"@type\":\"PropertyValueSpecification\",\"valueRequired\":true,\"valueName\":\"search_term_string\"}}],\"inLanguage\":\"en-US\"},{\"@type\":[\"Person\",\"Organization\"],\"@id\":\"https:\/\/ranaghazzi.com\/#\/schema\/person\/d8ee34f53cb0df9faaf816fb5363a4cc\",\"name\":\"Rana Ghazzi\",\"image\":{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\/\/ranaghazzi.com\/wp-content\/uploads\/2025\/11\/logo.png\",\"url\":\"https:\/\/ranaghazzi.com\/wp-content\/uploads\/2025\/11\/logo.png\",\"contentUrl\":\"https:\/\/ranaghazzi.com\/wp-content\/uploads\/2025\/11\/logo.png\",\"width\":1024,\"height\":1024,\"caption\":\"Rana Ghazzi\"},\"logo\":{\"@id\":\"https:\/\/ranaghazzi.com\/wp-content\/uploads\/2025\/11\/logo.png\"}}]}<\/script>\n<!-- \/ Yoast SEO plugin. 
-->","yoast_head_json":{"title":"IBM Stocks ETL \u2013 Databricks - Rana Nasri Ghazzi","description":"Browse real-world data projects by Rana Ghazzi, covering data cleaning, analysis, and storytelling with Python, SQL, and Tableau Explore Rana Ghazzi's data analytics portfolio \u2014 dashboards, visualizations, and insights built with Tableau, Power BI & Python","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"https:\/\/ranaghazzi.com\/?page_id=1196","og_locale":"en_US","og_type":"article","og_title":"IBM Stocks ETL \u2013 Databricks - Rana Nasri Ghazzi","og_description":"Browse real-world data projects by Rana Ghazzi, covering data cleaning, analysis, and storytelling with Python, SQL, and Tableau Explore Rana Ghazzi's data analytics portfolio \u2014 dashboards, visualizations, and insights built with Tableau, Power BI & Python","og_url":"https:\/\/ranaghazzi.com\/?page_id=1196","og_site_name":"Rana Nasri Ghazzi","article_modified_time":"2026-04-17T18:28:24+00:00","twitter_card":"summary_large_image","twitter_misc":{"Est. 
reading time":"2 minutes"},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"WebPage","@id":"https:\/\/ranaghazzi.com\/?page_id=1196","url":"https:\/\/ranaghazzi.com\/?page_id=1196","name":"IBM Stocks ETL \u2013 Databricks - Rana Nasri Ghazzi","isPartOf":{"@id":"https:\/\/ranaghazzi.com\/#website"},"datePublished":"2026-01-03T20:55:12+00:00","dateModified":"2026-04-17T18:28:24+00:00","description":"Browse real-world data projects by Rana Ghazzi, covering data cleaning, analysis, and storytelling with Python, SQL, and Tableau Explore Rana Ghazzi's data analytics portfolio \u2014 dashboards, visualizations, and insights built with Tableau, Power BI & Python","breadcrumb":{"@id":"https:\/\/ranaghazzi.com\/?page_id=1196#breadcrumb"},"inLanguage":"en-US","potentialAction":[{"@type":"ReadAction","target":["https:\/\/ranaghazzi.com\/?page_id=1196"]}]},{"@type":"BreadcrumbList","@id":"https:\/\/ranaghazzi.com\/?page_id=1196#breadcrumb","itemListElement":[{"@type":"ListItem","position":1,"name":"Home","item":"https:\/\/ranaghazzi.com\/"},{"@type":"ListItem","position":2,"name":"Home","item":"https:\/\/ranaghazzi.com\/"},{"@type":"ListItem","position":3,"name":"Projects","item":"https:\/\/ranaghazzi.com\/?page_id=30"},{"@type":"ListItem","position":4,"name":"IBM Stocks ETL \u2013 Databricks"}]},{"@type":"WebSite","@id":"https:\/\/ranaghazzi.com\/#website","url":"https:\/\/ranaghazzi.com\/","name":"Rana Nasri Ghazzi","description":"Turning Data into Decisions","publisher":{"@id":"https:\/\/ranaghazzi.com\/#\/schema\/person\/d8ee34f53cb0df9faaf816fb5363a4cc"},"potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"https:\/\/ranaghazzi.com\/?s={search_term_string}"},"query-input":{"@type":"PropertyValueSpecification","valueRequired":true,"valueName":"search_term_string"}}],"inLanguage":"en-US"},{"@type":["Person","Organization"],"@id":"https:\/\/ranaghazzi.com\/#\/schema\/person\/d8ee34f53cb0df9faaf816fb5363a4cc","name":"Rana 
Ghazzi","image":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/ranaghazzi.com\/wp-content\/uploads\/2025\/11\/logo.png","url":"https:\/\/ranaghazzi.com\/wp-content\/uploads\/2025\/11\/logo.png","contentUrl":"https:\/\/ranaghazzi.com\/wp-content\/uploads\/2025\/11\/logo.png","width":1024,"height":1024,"caption":"Rana Ghazzi"},"logo":{"@id":"https:\/\/ranaghazzi.com\/wp-content\/uploads\/2025\/11\/logo.png"}}]}},"_hostinger_reach_plugin_has_subscription_block":false,"_hostinger_reach_plugin_is_elementor":false,"_links":{"self":[{"href":"https:\/\/ranaghazzi.com\/index.php?rest_route=\/wp\/v2\/pages\/1196","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/ranaghazzi.com\/index.php?rest_route=\/wp\/v2\/pages"}],"about":[{"href":"https:\/\/ranaghazzi.com\/index.php?rest_route=\/wp\/v2\/types\/page"}],"author":[{"embeddable":true,"href":"https:\/\/ranaghazzi.com\/index.php?rest_route=\/wp\/v2\/users\/2"}],"replies":[{"embeddable":true,"href":"https:\/\/ranaghazzi.com\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=1196"}],"version-history":[{"count":121,"href":"https:\/\/ranaghazzi.com\/index.php?rest_route=\/wp\/v2\/pages\/1196\/revisions"}],"predecessor-version":[{"id":3456,"href":"https:\/\/ranaghazzi.com\/index.php?rest_route=\/wp\/v2\/pages\/1196\/revisions\/3456"}],"up":[{"embeddable":true,"href":"https:\/\/ranaghazzi.com\/index.php?rest_route=\/wp\/v2\/pages\/30"}],"wp:attachment":[{"href":"https:\/\/ranaghazzi.com\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=1196"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}