Five years ago, same-day delivery felt like a luxury. Today, it’s a baseline expectation. Big Tech — and the machine learning boom it triggered — quietly rewired consumer habits: waiting until tomorrow is no longer an option. The market continues to expand at double-digit rates, and nearly every leap in efficiency now comes from ML models that juggle demand, routing, and pricing.

Picture the busiest day of a sales season. Express delivery (“in 30–45 minutes”) already relies on well-tested algorithms: surcharges rise and fall with spikes in demand, traffic, and weather. But the moment a shopper clicks “deliver by 6 p.m. today,” most platforms offer a discount based on gut feel — without actually accounting for how many courier hours that choice saves. Paradoxically, neither Amazon nor Uber has published an end-to-end approach to pricing these deferred delivery windows — even though they hold the richest margin upside.

Across the delivery market, dynamic pricing is a basic requirement, but most public articles focus on real-time supply–demand balancing (like surge or “busy” fees). Instacart even admits it prices delivery windows differently to shift demand, while Uber Eats documents a Busy Area Fee when orders outstrip courier supply.

For deferred same-day windows, retailers do signal that flexibility is cheaper — but none of them reveal how the math actually works. Sainsbury’s promotes cheaper 4-hour Saver slots, Tesco runs Flexi-saver pricing that varies by postcode, day, and time window, Ocado links delivery charges to slot availability, and Amazon offers Amazon Day (free scheduled consolidation) while historically running Prime Now as free 2-hour vs. paid 1-hour delivery.

All of this shows that longer windows benefit retailers. Yet the market still lacks an end-to-end solution that converts batching efficiency and courier-hour savings into the discount customers actually see. That’s the gap this piece tackles: linking the width of a same-day window to the predicted Supply-Hour saving and turning it into a clear, fair price signal.

This article distils the key ideas behind a solution I built to address the problem. The goal is to outline the framework, not describe any one company’s system. The approach links delivery-window width to the expected Supply Hour Economy (SH economy) and turns it into a price signal that’s fair for shoppers, couriers, and the platform. I’m sharing my approach to give ML engineers and last-mile delivery practitioners a concrete reference point — something practical to help move the field forward faster.
Batching as a Key to Efficiency

At a high level, a marketplace has two core components:

- Demand — delivery requests created by users in real time.
- Supply — the couriers who fulfil those requests.

In the simplest case, each courier carries just one order at a time. But in peak periods, demand often far exceeds supply. Many orders arrive within a short span, so assignments have to be made more efficiently (see picture 1). But what does “efficient” really mean? It depends on how we define it. Here, efficiency is the number of delivery hours completed divided by the number of courier hours available:

Efficiency = Delivery Hours Completed / Courier Hours Available

How can we increase that efficiency? There are multiple strategies — but one of the most important is batching. Batching means a courier delivers multiple orders in a single run. This system property significantly boosts efficiency.

Understanding Batching Delivery Logic

Let’s walk through a simple example. Suppose we have two delivery requests. Each request can be represented as a pair of points: point A (pickup) and point B (drop-off). For each request, demand hours can be estimated — the time required to travel from A to B, factoring in traffic, weather, and other conditions. These are the demand-side components.

What are our options? One approach is to assign a separate courier to each request. In this case, courier time is spent both on the actual delivery (A to B) and on the approach time — the time it takes a courier to reach point A. This works, but it’s not very efficient.

Now consider something smarter: assign one courier to serve both orders. For instance, the courier first travels to point A of the second request, picks it up, then goes to the drop-off point of the first request, delivers it, and finally delivers the second one. In this scenario, the courier follows the green route shown on the right. Using both demand and supply components — and the efficiency formula from the previous section — we can compute the batching efficiency. The intuition should be clear by now: batching is a powerful lever for improving efficiency.

SH Economy as the Principal Metric

However, it’s better not to evaluate batching quality using the overall efficiency metric. Efficiency is influenced by numerous external factors, so a measure that depends solely on batch properties is preferable. Courier-time savings — or SH economy (Supply Hour economy) — serve this purpose, capturing the reduction in supply hours attributable to batching.

A batch is a set of orders delivered simultaneously. SH economy is defined as the difference between the courier time required if the orders were handled independently and the time actually spent when the courier delivers the entire batch, expressed in relative form:

SH economy = (Σ_i SH(order_i) − SH(batch)) / Σ_i SH(order_i)

where SH(order_i) is the courier time needed to deliver order i on its own and SH(batch) is the courier time spent on the whole batch. Normalizing by the independent-delivery total yields a dimensionless ratio that can be easily compared across batches of any size or geography.

Returning to the earlier two-order example:

- Independent deliveries: 33 minutes of courier time
- Batched route: 26 minutes

An SH economy of (33 − 26) / 33 ≈ 0.21 indicates a 21% reduction in supply hours achieved purely through batching.
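To make the arithmetic concrete, here is a minimal Python sketch of the SH economy calculation for the two-order example; the 18/15-minute split of the 33 solo minutes is assumed purely for illustration.

```python
def sh_economy(solo_minutes, batch_minutes):
    """Relative courier-time saving from batching (SH economy).

    solo_minutes  -- courier minutes if each order were delivered by its own courier
    batch_minutes -- actual courier minutes spent on the batched route
    """
    independent_total = sum(solo_minutes)        # sum of SH(order_i)
    saved = independent_total - batch_minutes    # courier minutes saved by batching
    return saved / independent_total             # dimensionless, comparable across batches


# Two-order example from the text: 33 solo minutes in total vs. 26 batched minutes.
# The 18 / 15 split between the two orders is assumed purely for illustration.
print(round(sh_economy([18, 15], 26), 2))  # 0.21 -> a 21% reduction in supply hours
```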
This makes SH economy a clean, batch-specific indicator, unaffected by external fluctuations in the wider delivery network.

Increasing Batching Efficiency with Flexible-Window Delivery

Batching efficiency improves when express service is complemented by flexible-window delivery. Express requests trigger immediate courier assignment, which minimizes customer wait time — but gives the platform no time to find batching opportunities. In contrast, a flexible window — whether 30 minutes, two hours, or any other same-day interval — widens the delivery time range and allows the request to remain in the queue for a short period. During this pause, there’s a higher chance that a nearby order will arrive, allowing the two to be combined into a single batch and assigned to one courier.

The larger the pool of pairable requests, the higher the probability of successful batching. This leads to fewer supply hours, better use of courier time, and ultimately — a lower price for the customer. In practice, wider delivery windows expand the pool of pairable requests, raising batching probability and, in turn, the SH economy.

Pricing Flexible-Window Delivery

Setting a fair price for flexible-window delivery is a key product challenge. Most marketplaces already operate a dynamic pricing engine for express delivery. In such systems, the cost per order (CPO) can be represented as two multiplicative components:

- CPO_base — the dynamic base price, which already accounts for real-time signals such as demand spikes, courier availability, traffic, weather, and other operational factors.
- CPO_economy(window_width) — a discount factor tied to the width of the delivery window. As the interval widens, this component reduces the final price to reflect courier-time savings achieved via batching.

In other words, CPO(window) = CPO_base × CPO_economy(window_width). The core task is to learn the CPO_economy(window_width) function — while keeping the existing CPO_base(·) untouched, since express delivery pricing is already working as intended.

Quick Example

Let’s walk through a simple calculation. Assume a minute of courier time costs $0.20. Delivering the two orders independently takes 33 minutes of courier time, so the total courier cost is:

33 × $0.20 = $6.60

Now, if a wider delivery window allows the two orders to be batched, courier time drops to 26 minutes, bringing the cost down to:

26 × $0.20 = $5.20

The SH economy in this case is:

(33 − 26) / 33 ≈ 0.21

This results in a 21% reduction in supply hours — and the customer pays roughly 21% less.

Thus, pricing flexible-window options reduces to an analytical problem: determining the value of SH economy.
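The sketch below shows one way this multiplicative structure could look in code, treating CPO_base as a given number and turning the predicted SH economy directly into the window discount. The function name and the safety cap on the discount are illustrative assumptions, not part of any production engine.

```python
def window_price(cpo_base, predicted_sh_economy, max_discount=0.35):
    """Price of a flexible-window offer: CPO_base scaled by a window-width discount.

    cpo_base             -- dynamic express price from the existing engine (taken as given)
    predicted_sh_economy -- forecast relative courier-time saving for this window (0..1)
    max_discount         -- illustrative cap so a noisy forecast cannot over-discount
    """
    # CPO_economy(window_width): turn the predicted saving into a multiplicative factor.
    discount = min(max(predicted_sh_economy, 0.0), max_discount)
    return cpo_base * (1.0 - discount)


# Quick-example numbers: $6.60 of courier cost and an SH economy of 7/33 ≈ 0.21.
print(round(window_price(6.60, 7 / 33), 2))  # 5.2 -- matches the 26 × $0.20 = $5.20 batched cost
```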
Since prices for several same-day delivery windows must be shown to the customer up front, the courier-time savings can’t be calculated after the fact — they have to be predicted in advance.

From Baseline to ML-Based SH Economy Forecasts

A natural starting point is a constant forecast: use the historical average of the SH economy as the predicted savings for every request. This baseline is weak. The actual distribution has heavy tails on both sides and a spike at zero — coming from orders in low-density areas that can never be batched. Relying on a single average ignores these patterns and leaves money on the table.

Why ML is Needed

Improving prediction quality requires an ML-based forecast. But introducing machine learning into pricing isn’t trivial: any predicted discount becomes a real commitment shown to the customer at checkout. This means the model must balance predictive power with high reliability across all customer segments. Three key challenges typically arise when building such a system.

Defining the target

The first and most obvious question is: what exactly should the model learn to predict? Consider the SH economy formula. It compares two quantities:

- Independent delivery: how many courier minutes would be spent if each order in the batch were handled by a separate courier?
- Batched delivery: how many minutes were actually spent when those orders were delivered together in a batch?

The second value comes from real logs (with some cleanup). The first is counterfactual — it never happened, so it has to be forecast. Even in a simple example, solo delivery includes approach time to the pickup point and A → B travel time — and, in production, wait time at pickup, hand-off time, queueing, and more. Each of these elements is a separate forecast — and every forecast can be biased. To avoid cumulative distortion, the total must be bias-corrected so that the predicted solo time matches the observed average in historical data. The batch-level label is then computed as:

SH economy(batch) = (Σ_i SH(order_i) − SH(batch)) / Σ_i SH(order_i)

where:

- SH economy(batch) — the batch-level saving that occurred in reality, used as the training label
- SH(order_i) — the forecast, bias-corrected solo courier time for order i
- SH(batch) — the actual courier time spent on the batch, taken from logs

Another complication is that pricing is done per order, but the SH economy is defined per batch. So we need to allocate the economy across orders. A naive split by demand hours isn’t ideal — because customers pay for courier time, not distance. In practice, we allocate SH economy proportionally to predicted supply hours per order — the very quantities the model just forecasted.
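As an illustration of that allocation step, the sketch below splits a batch’s courier-time saving across its orders in proportion to their predicted solo minutes. The function, the proportional rule as written, and the sample numbers are an illustrative reading of the allocation described above, not an exact specification.

```python
def per_order_targets(predicted_solo_minutes, actual_batch_minutes):
    """Turn a batch-level SH economy into per-order training targets.

    predicted_solo_minutes -- bias-corrected forecasts SH(order_i) for each order in the batch
    actual_batch_minutes   -- logged courier time SH(batch) for the whole batched route
    Returns (saved_minutes, relative_economy) per order.
    """
    solo_total = sum(predicted_solo_minutes)
    saved_total = solo_total - actual_batch_minutes    # courier minutes saved by the batch

    targets = []
    for solo_i in predicted_solo_minutes:
        saved_i = saved_total * solo_i / solo_total    # share proportional to predicted supply hours
        targets.append((saved_i, saved_i / solo_i))    # the same saving expressed in relative form
    return targets


# Hypothetical batch: forecast solo times of 18 and 15 minutes, 26 minutes actually logged.
for saved, economy in per_order_targets([18, 15], 26):
    print(round(saved, 2), round(economy, 2))  # every order inherits the ≈ 0.21 batch-level economy
```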
This transformation creates the final per-order training target, which aggregates back to the true batch-level SH economy after delivery.

Volatility & drift

The second obstacle is the target’s natural volatility. When the average SH economy is plotted over an extended period, the curve is anything but stationary: a sudden dip appears in December (the holiday season), and broader seasonal waves mark early-summer lulls and year-end peaks. Such drift makes a plain “mean” model useless.

To stabilise the predictions, the feature set is split into two. Content features depend only on the offer itself: the chosen delivery window, time of day, day of week, and geography. Weekend request patterns differ sharply from weekdays, and order density in a big city is unlike that in a smaller million-plus city, so these categorical signals absorb recurring seasonality. Complementing them are statistical features that describe the marketplace’s current state: rolling densities of demand and supply, plus carefully crafted target-like features — short-term averages of recent SH economy. The latter are especially informative but must be handled with care; otherwise, the model risks leaking future information. Proper lagging and windowing keep these target-like signals safe while still letting the predictor react to real-time shifts in batching efficiency.

“Content” features – depend only on the offer itself:

- chosen delivery window
- time of day / day of week
- geography

“Statistical” features – describe the current marketplace state:

- rolling demand density
- rolling supply density
- carefully lagged target-like features (recent averages of the SH economy)
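As an example of how such target-like signals can be lagged to avoid leakage, here is a small pandas sketch; the column names, the 24-hour rolling window, and the one-row shift are assumptions made for illustration rather than the exact production configuration.

```python
import pandas as pd


def add_lagged_sh_feature(orders: pd.DataFrame) -> pd.DataFrame:
    """Add a leakage-safe, target-like feature: a lagged rolling mean of recent SH economy.

    Assumed columns (illustrative only):
      'created_at' -- order creation timestamp
      'city'       -- geography key
      'sh_economy' -- realized per-order SH economy, known only after delivery
    """
    orders = orders.sort_values(["city", "created_at"]).copy()

    lagged_parts = []
    for _, group in orders.groupby("city", sort=False):
        series = group.set_index("created_at")["sh_economy"]
        # 24-hour rolling mean of the realized target, shifted by one row so the
        # current order only sees values from strictly earlier orders.
        rolled = series.rolling("24h").mean().shift(1)
        lagged_parts.append(pd.Series(rolled.values, index=group.index))

    orders["sh_economy_24h_lagged"] = pd.concat(lagged_parts)
    return orders
```

Because the feature is strictly backward-looking, it can be computed the same way at training time and at quote time, which is what keeps it leakage-safe while still tracking recent shifts in batching efficiency.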
Feedback loops left uncontrolled

The model’s own predictions reshape the target it is asked to predict. Raise the expected SH economy, and the displayed price for flexible-window delivery falls. A lower price persuades more customers to pick the slower option, which in turn increases batching and boosts the actual SH economy. The distribution of the target keeps drifting.

A complex, hand-built correction model could try to subtract this self-influence, but any fixed formula would soon become inaccurate. Instead, a simpler and more robust approach is used:

- Continuous retraining: models are refreshed on the latest data at short, regular intervals, letting them absorb the shifting mix of slow-window orders.
- Distribution guardrails: each freshly trained model is compared with the current one. If the prediction distribution has not moved much, deployment is decided on offline metrics alone; if the shift is large, a full online A/B test is run before rollout.

This cycle keeps the pricing system stable even while customer behaviour — and therefore the SH economy target — continually evolves.
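A minimal sketch of one possible distribution guardrail is shown below, using the population stability index (PSI) to compare old and new model predictions. PSI, the equal-width binning, and the 0.1 threshold are common rules of thumb chosen for illustration, not values taken from the system described here.

```python
import numpy as np


def psi(old_preds, new_preds, bins=10):
    """Population stability index between two prediction distributions."""
    lo = min(np.min(old_preds), np.min(new_preds))
    hi = max(np.max(old_preds), np.max(new_preds))
    edges = np.linspace(lo, hi, bins + 1)
    old_frac = np.histogram(old_preds, edges)[0] / len(old_preds)
    new_frac = np.histogram(new_preds, edges)[0] / len(new_preds)
    old_frac = np.clip(old_frac, 1e-6, None)   # avoid log(0) when a bin is empty
    new_frac = np.clip(new_frac, 1e-6, None)
    return float(np.sum((new_frac - old_frac) * np.log(new_frac / old_frac)))


def deployment_path(old_preds, new_preds, threshold=0.1):
    """Guardrail: small shift -> ship on offline metrics, large shift -> run an online A/B test."""
    return "offline_metrics_only" if psi(old_preds, new_preds) < threshold else "online_ab_test"


# Synthetic stand-ins for the current and candidate models' SH economy predictions.
rng = np.random.default_rng(0)
current = rng.beta(2, 8, 50_000)
candidate = rng.beta(2, 7, 50_000)
print(round(psi(current, candidate), 3), deployment_path(current, candidate))
```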
Conclusion

Same-day last-mile delivery can’t be optimized just by nudging couriers or tweaking surge multipliers. The real unlock is batching. Once delivery windows were linked to the expected Supply-Hour Economy (SH economy) — the relative courier-time saving that a batch unlocks — dynamic pricing became a machine learning problem: forecast SH economy for every offer and translate that saving into a fair discount.

Three technical hurdles shaped the solution:

- Turn a batch metric into an order target. Predict the solo courier minutes for each request, then allocate the future batch’s savings back to its orders in proportion to those minutes.
- Stabilise a volatile target. Absorb seasonality and daily shocks with a two-tier feature set: stable content-based signals (window length, time, geography) and rolling marketplace statistics (demand, supply, lagged SH economy).
- Control the feedback loop. Continuous retraining keeps the model aligned with the behaviors it creates, while distribution guardrails trigger A/B testing when predictions drift.

Two more layers — robust experimentation and the dynamic strategies of large shippers — work behind the scenes to keep the system stable and trustworthy in production. Public, end-to-end cases of flexible-window pricing remain rare. Sharing real-world lessons helps the industry move faster toward the friction-free, on-demand marketplaces tomorrow’s customers will take for granted.