Building an ETL Pipeline to Load Data Incrementally from Office365 to S3 using ADF and Databricks

Written by yi | Published 2021/11/19
Tech Story Tags: databricks | delta-lake | data-factory | data-pipeline | pyspark | coding | hackernoon-top-story | tutorial

TL;DR: In this post, we will look at creating an Azure Data Factory (ADF) pipeline that loads Office 365 event data incrementally into an AWS S3 bucket, driven by the change data capture (CDC) information exposed by the Change Data Feed (CDF) of a Delta Lake table. What we'll cover:

- Create an ADF pipeline that loads Calendar events from Office 365 to a Blob container.
- Run a Databricks notebook from an activity in the ADF pipeline to transform the extracted Calendar events and merge them into a Delta Lake table.
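
To make the incremental step concrete before we start, here is a minimal PySpark sketch of what reading only the changed rows from a CDF-enabled Delta table and landing them in S3 can look like. The table name (`events.calendar`), the S3 path, and the checkpointed version are hypothetical placeholders, not names from the pipeline built below.

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# Version up to which a previous run already processed changes; a real
# pipeline would persist this checkpoint somewhere durable.
last_processed_version = 5

# The source table must have CDF enabled, e.g. created with
# TBLPROPERTIES (delta.enableChangeDataFeed = true).
changes = (
    spark.read.format("delta")
    .option("readChangeFeed", "true")
    .option("startingVersion", last_processed_version + 1)
    .table("events.calendar")  # hypothetical Delta table name
)

# CDF adds _change_type, _commit_version and _commit_timestamp columns;
# keep inserts and the post-update image of each changed row.
incremental = changes.filter(
    changes["_change_type"].isin("insert", "update_postimage")
)

(
    incremental.drop("_change_type", "_commit_version", "_commit_timestamp")
    .write.mode("append")
    .parquet("s3a://my-target-bucket/calendar-events/")  # hypothetical bucket
)
```

The rest of the post builds the pipeline that produces and consumes such a table.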
