Cool Problem
During the summer, I came across a specific flavor of problem that I could not solve initially, but with some help, I eventually became comfortable with it. Fun fact: this question was also asked in a final-round interview at Squarepoint.
What made this problem particularly interesting to me was that it looked simple on the surface, but quickly forced you to think carefully about stopping times, conditioning, and how to structure expectations. At first glance, the most natural instinct is to brute force it—and that’s exactly where I started.
Problem Statement
Let be i.i.d. random variables and let
Find .
In words, we keep drawing uniform random variables until their cumulative sum exceeds , and we want to know how many draws this takes on average.
First Approach: Brute Force via Tail Probabilities
Since is a discrete random variable taking positive integer values, a natural starting point is to express its expectation in terms of tail probabilities:
To make progress, we need to understand what the event actually means. Observing the definition of , we see that
So the problem reduces to computing probabilities involving partial sums of uniform random variables.
Define the event
Our goal is to compute in a form that is easy to work with.
Recursive Formulation
We can write this probability as a volume of a region in :
Let . Rather than computing this integral directly for each , we look for a recursive structure.
Conditioning on the last variable , the event requires
This gives the recursion
To anchor the recursion, note that since ,
Using this,
At this point, a clear pattern emerges. By induction,
Computing
We can now return to the expression we started with. Using the fact that ,
Substituting the closed-form expression for ,
This is nothing but the Taylor expansion of , and hence
While this approach works nicely, it does rely on explicitly understanding the geometry of these probability regions. Fortunately, there is a cleaner way.
Second Approach: Conditioning on the First Variable
Instead of conditioning on the last variable, we now condition on the first. This perspective turns out to be much more powerful.
Using the law of total expectation,
The intuition is simple: if the very first draw already exceeds , then the process stops immediately.
Define
Since ,
Case Analysis
There are two cases to consider:
- If , then .
- If , then after drawing , we need to exceed using fresh uniform variables.
In the second case,
Putting everything together,
Differentiating both sides gives
With the boundary condition , we again obtain
This method avoids multidimensional integrals entirely and already hints at how to extend the result.
Extension to
Now suppose . The same conditioning idea still applies, but the structure of the recursion changes slightly because the remaining threshold can now lie either in or in .
Define
As before,
We analyze by splitting into cases depending on the value of .
Case Analysis
If , i.e. , then after observing , the remaining problem is still in the regime . Hence,
If , i.e. , then the remaining problem falls back into the original case. From the previous section,
Integral Equation for
Combining both cases, we obtain
We now simplify each term.
First,
and by a change of variables,
Next,
and
Putting everything together,
Solving the ODE
Differentiating both sides with respect to gives
This is a first-order linear ODE. Using the boundary condition , we solve to obtain
In particular, at
Expected Overshoot Sum
We now switch gears and look at a closely related quantity.
Define
the sum at the stopping time. Our goal is to compute .
Let
As before,
Case Analysis
- If , then the process stops immediately and .
- If , then .
Thus,
Differentiating,
Using , we get
Final Insight
Notice something elegant:
This is exactly what Wald’s lemma predicts, since is a stopping time.
Closing Thought
The key lesson here is that conditioning—when done thoughtfully—can turn a high-dimensional integration problem into a one-line differential equation.
The next time you see a chain of uniform random variables, try conditioning early. It will almost always save you from visualizing multidimensional simplices.
Pretty cool, right?