Ad budget pacing distributes a campaign’s daily budget evenly across its eligible time window. Advertisers choose it when they want stable spend across the day rather than burning through the budget early.

The original approach applied the same rule to every campaign. Regardless of its characteristics, each campaign started from the same starting point, and its exposure probability was adjusted by the same rule based on cumulative spend. It worked on average, but it broke wherever campaign characteristics varied sharply. Some campaigns burned through the budget early; others hit the initial cap and never fully spent their budget.
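To make the uniform rule concrete, here is a minimal sketch of what such a rule might look like. It is not the production code: the linear target curve, the function name, and the `start_probability` and `gain` values are all assumptions chosen for illustration.

```python
def uniform_exposure_probability(
    spent: float,
    daily_budget: float,
    elapsed_fraction: float,
    start_probability: float = 0.5,  # hypothetical: identical for every campaign
    gain: float = 1.0,               # hypothetical: identical for every campaign
) -> float:
    """Adjust exposure probability from cumulative spend with one shared rule."""
    if daily_budget <= 0:
        raise ValueError("daily_budget must be positive")
    # Assumed target: spend grows linearly over the eligible time window.
    target_spent = daily_budget * elapsed_fraction
    # Positive when the campaign is behind the target, negative when ahead.
    error = (target_spent - spent) / daily_budget
    probability = start_probability + gain * error
    return min(1.0, max(0.0, probability))


on_pace = uniform_exposure_probability(spent=50.0, daily_budget=100.0, elapsed_fraction=0.5)
ahead = uniform_exposure_probability(spent=80.0, daily_budget=100.0, elapsed_fraction=0.5)
```

Nothing in this function knows how a particular campaign tends to spend, so a campaign that needs a very different starting probability has to be dragged there by the error term alone.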

Applying a single rule uniformly was the limiting factor. I redesigned the feature as a two-layer control loop: per-campaign learning plus real-time correction.

Limits of a Single Rule

The flaw was clear: every campaign started from the same starting point. Campaign characteristics form a distribution rather than a single average, and they shift by interval and by environment. A single rule that assumes the average breaks at the tails of that distribution; it absorbs neither the fast spenders nor the under-spenders.

Applying the same probability to every campaign means ignoring what each one actually is.

Two-Layer Structure

Solving this inside a single layer ran into a contradiction. React quickly and exposure becomes jittery; react slowly and budget misses pile up. Frequent probability swings looked unstable to advertisers, while relying only on large, infrequent adjustments could not absorb the traffic variance in between.

So I split it into two layers.

Slow controller. Analyzes a campaign’s recent spend pattern over a longer interval and derives a baseline for the next one.

Fast controller. Compares expected and actual spend at a shorter interval and absorbs the residual drift.

The slow controller hypothesizes the pace each campaign should follow. The fast controller checks whether the actual pace strays from that hypothesis. The two controllers’ time scales divided the responsibility naturally.
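The division of labor between the two controllers can be sketched roughly as follows. This is an illustrative outline under stated assumptions, not the actual implementation: the class names, the proportional scaling in the slow controller, and the `gain` and `max_step` values in the fast controller are all hypothetical.

```python
from dataclasses import dataclass


def clamp(value: float, low: float, high: float) -> float:
    return max(low, min(high, value))


@dataclass
class SlowController:
    """Runs once per long interval and proposes a baseline probability."""

    def baseline(self, prev_probability: float, prev_spend: float, prev_budget: float) -> float:
        if prev_spend <= 0:
            # No usable measurement: keep the previous value (assumption).
            return prev_probability
        # Overspent at prev_probability -> scale down; underspent -> scale up.
        return clamp(prev_probability * prev_budget / prev_spend, 0.0, 1.0)


@dataclass
class FastController:
    """Runs at a short interval and absorbs the residual drift."""

    gain: float = 0.5      # hypothetical proportional gain
    max_step: float = 0.1  # hypothetical cap that keeps corrections small

    def correction(self, expected_spend: float, actual_spend: float, interval_budget: float) -> float:
        if interval_budget <= 0:
            raise ValueError("interval_budget must be positive")
        error = (expected_spend - actual_spend) / interval_budget
        return clamp(self.gain * error, -self.max_step, self.max_step)


def exposure_probability(baseline: float, correction: float) -> float:
    return clamp(baseline + correction, 0.0, 1.0)


slow = SlowController()
fast = FastController()
# The previous interval spent twice its budget at probability 0.8.
base = slow.baseline(prev_probability=0.8, prev_spend=200.0, prev_budget=100.0)
# Partway through the new interval, spend is slightly ahead of plan.
adj = fast.correction(expected_spend=50.0, actual_spend=60.0, interval_budget=100.0)
p = exposure_probability(base, adj)
```

In this example the slow controller halves the baseline to 0.4, and the fast controller only needs a nudge of -0.05 on top of it.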

Baseline and Correction, Separated

The point is correction magnitude. If the slow controller sets a poor baseline, the fast controller has to swing the probability hard at every step. The more accurate the baseline, the smaller the correction. Small corrections do not hurt exposure stability, yet they still absorb short-term shocks.
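A toy calculation illustrates the relationship between baseline accuracy and correction size. It assumes a purely proportional fast loop that can observe the gap to an ideal probability directly, which is a simplification; the function name and the numbers are hypothetical.

```python
from typing import Tuple


def correction_effort(
    baseline: float,
    ideal_probability: float,
    steps: int = 20,
    gain: float = 0.5,  # hypothetical proportional gain
) -> Tuple[float, float]:
    """Return (largest single step, total absolute correction) for a toy fast loop."""
    probability = baseline
    largest = 0.0
    total = 0.0
    for _ in range(steps):
        # The fast loop closes a fixed fraction of the remaining gap each step.
        step = gain * (ideal_probability - probability)
        probability += step
        largest = max(largest, abs(step))
        total += abs(step)
    return largest, total


# Same campaign, two different baselines from the slow controller.
poor_largest, poor_total = correction_effort(baseline=0.10, ideal_probability=0.40)
good_largest, good_total = correction_effort(baseline=0.38, ideal_probability=0.40)
```

With the poor baseline the first correction alone moves the probability by about 0.15; with the accurate one the largest step is about 0.01, so the swings an advertiser would see shrink accordingly.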

The baseline itself is measurement-driven — the pace observed in the previous interval becomes the seed for the next. A simple measure-apply-measure loop.

A familiar trap follows: what to do when the measurement is missing. Some campaigns had too little exposure in the previous interval to register a measurement at all. Without one, the baseline has to come from outside as a default seed. This is the same shape as cold-start in general control systems, not a quirk of ad pacing.
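The fallback can be sketched as a small guard in front of the measure-apply-measure loop. This is a minimal sketch under assumptions: the threshold `MIN_IMPRESSIONS`, the value of `DEFAULT_SEED`, and the function name are hypothetical, and how the default seed is chosen is left open here.

```python
from typing import Optional

MIN_IMPRESSIONS = 100  # hypothetical: below this, the measurement is not trusted
DEFAULT_SEED = 0.3     # hypothetical: externally supplied starting baseline


def next_baseline(
    measured_pace: Optional[float],
    impressions: int,
    default_seed: float = DEFAULT_SEED,
    min_impressions: int = MIN_IMPRESSIONS,
) -> float:
    """Use the previous interval's measured pace, or fall back to a default seed."""
    if measured_pace is None or impressions < min_impressions:
        # Cold start: no measurement, or too little exposure to trust it.
        return default_seed
    return measured_pace


warm = next_baseline(measured_pace=0.42, impressions=500)
cold = next_baseline(measured_pace=None, impressions=0)
sparse = next_baseline(measured_pace=0.9, impressions=3)
```

Treating a sparse measurement the same way as a missing one is a design choice in this sketch: a pace derived from a handful of impressions is likely noisier than the default.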

After

Comparing the same campaigns over the same intervals before and after, three changes stood out. Initial spend concentration eased, the exposure spike at the top of each interval subsided, and under-spending campaigns saw their fill rate rise. The three changes are different faces of the same cause — moving from a fixed rule to per-campaign learning.

Retrospective

The biggest decision in this work was the split itself. Doing learning and correction inside a single layer would have made the probability jitter at every step; doing only large-scale adjustments would have missed the short-term shocks. Separating two signals with different time scales into two controllers turned out to be the natural division of responsibility.

It was interesting to see how cleanly control-loop language fits the ad domain. Problems familiar from control engineering — cold-start, integrator windup, missing measurements — appeared in ad pacing in the same shape. Where the same problem keeps reappearing in different domains, borrowing the established vocabulary keeps the thinking stable.

Next time something with a similar grain shows up — measurement signals running at two or more time scales — splitting into layers from the start is where I would begin.