Cool Problem

Cool Problem

During the summer, I came across a specific flavor of problem that I could not solve initially, but with some help, I eventually became comfortable with it. Fun fact: this question was also asked in a final-round interview at Squarepoint.

What made this problem particularly interesting to me was that it looked simple on the surface, but quickly forced you to think carefully about stopping times, conditioning, and how to structure expectations. At first glance, the most natural instinct is to brute force it—and that’s exactly where I started.


Problem Statement

Let be i.i.d. random variables and let

Find .

In words, we keep drawing uniform random variables until their cumulative sum exceeds , and we want to know how many draws this takes on average.


First Approach: Brute Force via Tail Probabilities

Since is a discrete random variable taking positive integer values, a natural starting point is to express its expectation in terms of tail probabilities:

To make progress, we need to understand what the event actually means. Observing the definition of , we see that

So the problem reduces to computing probabilities involving partial sums of uniform random variables.

Define the event

Our goal is to compute in a form that is easy to work with.


Recursive Formulation

We can write this probability as a volume of a region in :

Let . Rather than computing this integral directly for each , we look for a recursive structure.

Conditioning on the last variable , the event requires

This gives the recursion

To anchor the recursion, note that since ,

Using this,

At this point, a clear pattern emerges. By induction,


Computing

We can now return to the expression we started with. Using the fact that ,

Substituting the closed-form expression for ,

This is nothing but the Taylor expansion of , and hence

While this approach works nicely, it does rely on explicitly understanding the geometry of these probability regions. Fortunately, there is a cleaner way.


Second Approach: Conditioning on the First Variable

Instead of conditioning on the last variable, we now condition on the first. This perspective turns out to be much more powerful.

Using the law of total expectation,

The intuition is simple: if the very first draw already exceeds , then the process stops immediately.

Define

Since ,


Case Analysis

There are two cases to consider:

  • If , then .
  • If , then after drawing , we need to exceed using fresh uniform variables.

In the second case,

Putting everything together,

Differentiating both sides gives

With the boundary condition , we again obtain

This method avoids multidimensional integrals entirely and already hints at how to extend the result.


Extension to

Now suppose . The same conditioning idea still applies, but the structure of the recursion changes slightly because the remaining threshold can now lie either in or in .

Define

As before,

We analyze by splitting into cases depending on the value of .


Case Analysis

  • If , i.e.  , then after observing , the remaining problem is still in the regime . Hence,

  • If , i.e.  , then the remaining problem falls back into the original case. From the previous section,


Integral Equation for

Combining both cases, we obtain

We now simplify each term.

First,

and by a change of variables,

Next,

and

Putting everything together,


Solving the ODE

Differentiating both sides with respect to gives

This is a first-order linear ODE. Using the boundary condition , we solve to obtain

In particular, at

Expected Overshoot Sum

We now switch gears and look at a closely related quantity.

Define

the sum at the stopping time. Our goal is to compute .

Let

As before,


Case Analysis

  • If , then the process stops immediately and .
  • If , then .

Thus,

Differentiating,

Using , we get


Final Insight

Notice something elegant:

This is exactly what Wald’s lemma predicts, since is a stopping time.


Closing Thought

The key lesson here is that conditioning—when done thoughtfully—can turn a high-dimensional integration problem into a one-line differential equation.

The next time you see a chain of uniform random variables, try conditioning early. It will almost always save you from visualizing multidimensional simplices.

Pretty cool, right?

No comment found.

Add a comment

You must log in to post a comment.