Quartz is an open-source job scheduling framework that can be integrated into a wide variety of Java applications. It is used to schedule jobs that can be executed at a later time or on a repeating basis.
Quartz is designed to be a simple and powerful scheduling framework that is easy to use and configure. But there is no built-in feature in the Quartz framework that allows for the automatic retrying of jobs that encounter exceptions during their execution.
This article is inspired by Harvey Delaney’s blog post about Job Exception Retrying in Quartz.NET. Here we consider using Quartz in a Spring application and implement an exponential random backoff policy that reduces the number of retry attempts, prevents overloading the system, allows the system to recover slowly over time, and distributes retry attempts evenly.
We will use JobDataMap
-s that are associated with JobDetail
and Trigger
objects of our job. JobDetail
’s JobDataMap
will store exponential random backoff policy parameters MAX_RETRIES
, RETRY_INITIAL_INTERVAL_SECS
, and RETRY_INTERVAL_MULTIPLIER
. JobDataMap associated with Trigger will store RETRY_NUMBER
.
Now let’s add a Spring component JobFailureHandlingListener
that implements the org.quartz.JobListener
interface. JobListener#jobToBeExecuted
is called when the job is about to be executed and JobListener#jobWasExecuted
is called after job completion. In jobToBeExecuted
we will increment the RETRY_NUMBER
counter. jobWasExecuted
will contain the main code of exception handling and creating a new trigger for the next execution of a job.
Exponential random backoff is an algorithm used to progressively introduce delays in retrying an operation that has previously failed. The idea is to increase the delay between retries in an exponential manner, with each delay being chosen randomly from a range of possible values. This helps to avoid the scenario where multiple devices or systems all attempt to retry an operation at the same time, which can further exacerbate the problem that caused the operation to fail in the first place.
Exponential random backoff is often used in distributed systems that need to retry operations that have failed. For example, it might be used in a distributed database to retry transactions that fail due to conflicts with other transactions or in a distributed file system to retry operations that fail due to network errors or the temporary unavailability of servers.
We will use an exponential random backoff algorithm to calculate the time of the next attempt to start the job.
package quartzdemo.listeners;
import org.quartz.*;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import org.springframework.stereotype.Component;
import java.util.Date;
@Component
public class JobFailureHandlingListener implements JobListener {
private final String RETRY_NUMBER_KEY = "RETRY_NUMBER";
private final String MAX_RETRIES_KEY = "MAX_RETRIES";
private final int DEFAULT_MAX_RETRIES = 5;
private final String RETRY_INITIAL_INTERVAL_SECS_KEY = "RETRY_INITIAL_INTERVAL_SECS";
private final int DEFAULT_RETRY_INITIAL_INTERVAL_SECS = 60;
private final String RETRY_INTERVAL_MULTIPLIER_KEY = "RETRY_INTERVAL_MULTIPLIER";
private final double DEFAULT_RETRY_INTERVAL_MULTIPLIER = 1.5;
private final Logger logger = LoggerFactory.getLogger(getClass());
@Override
public String getName() {
return "FailJobListener";
}
@Override
public void jobToBeExecuted(JobExecutionContext context) {
context.getTrigger().getJobDataMap().merge(RETRY_NUMBER_KEY, 1,
(oldValue, initValue) -> ((int) oldValue) + 1);
}
@Override
public void jobExecutionVetoed(JobExecutionContext context) { }
@Override
public void jobWasExecuted(JobExecutionContext context, JobExecutionException jobException) {
if (context.getTrigger().getNextFireTime() != null || jobException == null) {
return;
}
int maxRetries = (int) context.getJobDetail().getJobDataMap()
.computeIfAbsent(MAX_RETRIES_KEY, key -> DEFAULT_MAX_RETRIES);
int timesRetried = (int) context.getTrigger().getJobDataMap().get(RETRY_NUMBER_KEY);
if (timesRetried > maxRetries) {
logger.error("Job with ID and class: " + context.getJobDetail().getKey() +", " + context.getJobDetail().getJobClass() +
" has run " + maxRetries + " times and has failed each time.", jobException);
return;
}
TriggerKey triggerKey = context.getTrigger().getKey();
int initialIntervalSecs = (int) context.getJobDetail().getJobDataMap()
.computeIfAbsent(RETRY_INITIAL_INTERVAL_SECS_KEY, key -> DEFAULT_RETRY_INITIAL_INTERVAL_SECS);
double multiplier = (double) context.getJobDetail().getJobDataMap()
.computeIfAbsent(RETRY_INTERVAL_MULTIPLIER_KEY, key -> DEFAULT_RETRY_INTERVAL_MULTIPLIER);
Date newStartTime = ExponentialRandomBackoffFixtures.getNextStartDate(timesRetried, initialIntervalSecs, multiplier);
Trigger newTrigger = TriggerBuilder.newTrigger()
.withIdentity(triggerKey)
.startAt(newStartTime)
.usingJobData(context.getTrigger().getJobDataMap())
.build();
newTrigger.getJobDataMap().put(RETRY_NUMBER_KEY, timesRetried);
try {
context.getScheduler().rescheduleJob(triggerKey, newTrigger);
} catch (SchedulerException e) {
throw new RuntimeException(e);
}
}
}
package quartzdemo.listeners;
import org.apache.commons.lang3.RandomUtils;
import java.time.Duration;
import java.time.LocalDateTime;
import java.time.ZoneId;
import java.util.Date;
public class ExponentialRandomBackoffFixtures {
public static Date getNextStartDate(int timesRetried, int initialIntervalSecs, double multiplier) {
double minValue = initialIntervalSecs * Math.pow(multiplier, timesRetried - 1);
double maxValue = minValue * multiplier;
Duration duration = Duration.ofMillis((long) (RandomUtils.nextDouble(minValue, maxValue) * 1000));
LocalDateTime nextDateTime = LocalDateTime.now().plus(duration);
return Date.from(nextDateTime.atZone(ZoneId.systemDefault()).toInstant());
}
}
Now we can add the listener to a scheduler
package quartzdemo;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.scheduling.annotation.EnableScheduling;
import org.springframework.scheduling.quartz.SchedulerFactoryBean;
import quartzdemo.listeners.JobFailureHandlingListener;
@Configuration
@EnableScheduling
public class SchedulingConfiguration {
private final JobFailureHandlingListener jobFailureHandlingListener;
public SchedulingConfiguration2(JobFailureHandlingListener jobFailureHandlingListener) {
this.jobFailureHandlingListener = jobFailureHandlingListener;
}
@Bean
public SchedulerFactoryBean scheduler() {
SchedulerFactoryBean schedulerFactory = new SchedulerFactoryBean();
// ...
schedulerFactory.setGlobalJobListeners(jobFailureHandlingListener);
return schedulerFactory;
}
}
and run a job with a custom RETRY_INITIAL_INTERVAL_SECS
parameter
package quartzdemo.services;
import org.quartz.*;
import org.springframework.stereotype.Service;
import quartzdemo.jobs.SimpleJob;
import java.time.LocalDateTime;
import java.time.ZoneId;
import java.util.Date;
import static quartzdemo.listeners.JobFailureHandlingListener.RETRY_INITIAL_INTERVAL_SECS_KEY;
@Service
public class SchedulingService {
private final Scheduler scheduler;
public SchedulingService(Scheduler scheduler) {
this.scheduler = scheduler;
}
// ...
private void scheduleJob() throws SchedulerException {
Date afterFiveSeconds = Date.from(LocalDateTime.now().plusSeconds(5).atZone(ZoneId.systemDefault()).toInstant());
JobDetail jobDetail = JobBuilder.newJob(SimpleJob.class).usingJobData(RETRY_INITIAL_INTERVAL_SECS_KEY, 30).build();
Trigger trigger = TriggerBuilder.newTrigger().startAt(afterFiveSeconds).usingJobData("", "").build();
scheduler.scheduleJob(jobDetail, trigger);
}
}
Quartz is a robust framework that provides a wide range of capabilities. Even if it lacks something, it is always possible to implement the necessary functionality using the provided API. This allows developers to extend the framework to meet their specific requirements, so Quartz is a great choice for any scheduling needs.