Custom Email Notifications for Databricks Pipeline Failures

Written by hacker95231466 | Published 2025/09/29

TL;DR: This blog walks through approaches to implementing custom notifications using SMTP, SendGrid, Azure Logic Apps, and Microsoft Graph API. Using SMTP with Python inside a Databricks notebook, you can generate an Excel report and send it via email whenever a pipeline fails.

When working with Databricks pipelines and workflows, failures are inevitable. While Databricks provides built-in notifications for job failures, these alerts are often not customizable and may not fit specific reporting or formatting needs. A more flexible and cost-effective approach is to set up custom email notifications that include pipeline details and error messages in a structured format, such as an Excel attachment.

This blog walks through approaches to implementing custom notifications using SMTP, SendGrid, Azure Logic Apps, and Microsoft Graph API, with a detailed working example built on SMTP.

Why Custom Notifications?

  • Flexible formatting: Include pipeline metadata, error messages, and runtime details.
  • Attachments: Share structured reports (Excel, CSV, etc.) instead of plain text.
  • Cost efficiency: Avoid additional third-party monitoring solutions.
  • Integration options: Easily plug into existing email infrastructure.

Approach 1: SMTP-Based Notifications

Using SMTP with Python inside a Databricks notebook, you can generate an Excel report and send it via email whenever a pipeline fails.

Example Implementation

import smtplib
from email.mime.multipart import MIMEMultipart
from email.mime.base import MIMEBase
from email.mime.text import MIMEText
from email import encoders
from io import BytesIO
import pandas as pd

# Sample pipeline history
df = spark.createDataFrame(
    [
        ('pipeline1', 'success', '7min'),
        ('pipeline1', 'fail', '3min'),
        ('pipeline1', 'success', '10min'),
    ],
    ["PipelineName", "Status", "Duration"],
)

# Convert DataFrame to Excel (requires the xlsxwriter package)
output = BytesIO()
df_pd = df.toPandas()
with pd.ExcelWriter(output, engine='xlsxwriter') as writer:
    df_pd.to_excel(writer, index=False, sheet_name='Sheet1')
    workbook = writer.book
    worksheet = writer.sheets['Sheet1']

    # Apply formatting: bold yellow header row with borders
    header_format = workbook.add_format({
        'bold': True,
        'bg_color': '#FFFF00',
        'border': 1,
        'align': 'center',
        'valign': 'vcenter'
    })
    for col_num, value in enumerate(df_pd.columns):
        worksheet.write(0, col_num, value.upper(), header_format)

    # Re-write data cells with a border format
    cell_format = workbook.add_format({'border': 1})
    for row in range(1, len(df_pd) + 1):
        for col in range(len(df_pd.columns)):
            worksheet.write(row, col, df_pd.iloc[row - 1, col], cell_format)

    # Widen all columns for readability
    for i, col in enumerate(df_pd.columns):
        worksheet.set_column(i, i, 20)

output.seek(0)

# Email configuration
sender = "sender@example.com"
receiver = "receiver@example.com"
subject = "Pipeline Execution Report"
body = """Hello Team,

Please find the attachment of the latest pipeline report.

Thanks,
Pipeline Team"""

msg = MIMEMultipart()
msg['From'] = sender
msg['To'] = receiver
msg['Subject'] = subject
msg.attach(MIMEText(body, 'plain'))

part = MIMEBase('application', 'vnd.openxmlformats-officedocument.spreadsheetml.sheet')
part.set_payload(output.read())
encoders.encode_base64(part)
part.add_header('Content-Disposition', 'attachment; filename="pipeline_report.xlsx"')
msg.attach(part)

# Office 365 SMTP endpoint; adjust for your mail provider
smtp_server = "smtp.office365.com"
smtp_port = 587

with smtplib.SMTP(smtp_server, smtp_port) as server:
    server.starttls()
    server.login(sender, "sender_password")  # see the secrets sketch below
    server.send_message(msg)

print("Email sent successfully with Excel attachment")

Scheduling Notifications

You can automate the notification trigger by scheduling the notebook:

Option 1: Databricks Jobs

  • Create or edit a Databricks job.
  • Add a task dependency so the notification script runs only if the previous task fails.
  • This ensures error details are captured and reported immediately; a minimal Jobs API sketch follows this list.
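
As a rough illustration, the failure-only dependency can be declared through the Jobs API 2.1 `run_if` field. Everything below (workspace URL, token, notebook paths, cluster ID) is a placeholder you would replace with your own values:

import requests

DATABRICKS_HOST = "https://<your-workspace>.azuredatabricks.net"  # placeholder
TOKEN = "<personal-access-token>"                                  # placeholder

job_spec = {
    "name": "pipeline-with-failure-notification",
    "tasks": [
        {
            "task_key": "run_pipeline",
            "notebook_task": {"notebook_path": "/Pipelines/main_pipeline"},
            "existing_cluster_id": "<cluster-id>",
        },
        {
            "task_key": "notify_on_failure",
            "depends_on": [{"task_key": "run_pipeline"}],
            # Run the notification notebook only if the upstream task fails
            "run_if": "AT_LEAST_ONE_FAILED",
            "notebook_task": {"notebook_path": "/Notifications/send_failure_email"},
            "existing_cluster_id": "<cluster-id>",
        },
    ],
}

resp = requests.post(
    f"{DATABRICKS_HOST}/api/2.1/jobs/create",
    headers={"Authorization": f"Bearer {TOKEN}"},
    json=job_spec,
)
resp.raise_for_status()
print("Created job:", resp.json()["job_id"])

The same condition can also be set interactively in the job UI as the task's "Run if" dependency setting.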

Option 2: Azure Logic Apps

  • Configure a Logic App that listens for pipeline failures.
  • Pass pipeline details and attachments via an API call in JSON format.
  • Logic Apps handle email delivery and retry mechanisms; a minimal trigger call is sketched below.
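
A minimal sketch of the notebook-side call, assuming your Logic App starts with a "When a HTTP request is received" trigger; the URL and payload shape below are illustrative, and `output` is the in-memory Excel buffer built earlier:

import base64
import requests

# Placeholder: paste the HTTP trigger URL generated by your Logic App
LOGIC_APP_URL = "https://prod-00.eastus.logic.azure.com/workflows/<id>/triggers/manual/paths/invoke?<signature>"

payload = {
    "pipelineName": "pipeline1",
    "status": "fail",
    "errorMessage": "Task run_pipeline failed",
    # Binary attachments are typically passed base64-encoded in the JSON body
    "attachmentName": "pipeline_report.xlsx",
    "attachmentContent": base64.b64encode(output.getvalue()).decode("utf-8"),
}

resp = requests.post(LOGIC_APP_URL, json=payload, timeout=30)
resp.raise_for_status()
print("Logic App accepted the notification:", resp.status_code)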

Conclusion

While Databricks provides basic failure notifications, extending them with custom SMTP or Logic App workflows ensures:

  • Rich, formatted reports.
  • Team visibility with detailed context.
  • Seamless integration with enterprise communication tools.

This approach is cost-effective, scalable, and easily adaptable for large-scale pipeline monitoring.

