Estimating the parameter from infinitely many IID Bernoulli distributions

These are some notes working out a particular unbiased estimator that I needed for making some improvements to the AWRS algorithm in genlm-control. You can see where this gets used in practice in the AWRS implementation.

Suppose you’ve got an infinite sequence of IID $\mathrm{Bernoulli}(p)$ random variables, $X_1, X_2, \ldots$. You want an unbiased estimator for $p$. How can you efficiently read the values of the $X_i$ to get one?

We’re particularly interested in the case where sampling the $X_i$ is “expensive” in some sense, so you want to use as few values as possible. I won’t make this precise here, it’s just the motivating intuition.

The first, and most obvious, strategy is that for some fixed $n$ you unconditionally return $\frac{1}{n} \sum_{i=1}^{n} X_i$.
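In code, this might look something like the following sketch. The `draw` callable is a hypothetical stand-in for reading the next $X_i$; none of the code in these notes is the genlm-control implementation, it’s just illustration.

```python
import random

def fixed_budget_estimate(draw, n):
    """Estimate p as the mean of n unconditional draws; always costs n samples."""
    return sum(draw() for _ in range(n)) / n

# Toy usage with a known p, standing in for an expensive sampler.
p = 0.7
print(fixed_budget_estimate(lambda: random.random() < p, n=100))
```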

This approach is fine, but you always “spend” $n$ samples. One of the cases that we’re interested in is that when $p$ is close to $1$ we would like this process to be cheap, which means that spending $n$ samples each time isn’t ideal.

A natural idea is to sample until you see your first success. Unfortunately, this idea doesn’t work, for the following reason:

Theorem: Let $X$ be the number of failures we see before the first success (so $X$ is geometrically distributed with parameter $p$). The only unbiased estimator for $p$ from a single sample of $X$ is to return $1$ if $X = 0$ and $0$ if $X > 0$.

Proof:

Suppose we have some unbiased estimator $f(X)$.

Then

$$E[f(X)] = \sum_{k=0}^{\infty} f(k)\, p\, (1 - p)^k = p.$$

So, dividing through by $p$, we get $1 = \sum_{k=0}^{\infty} f(k)\, (1 - p)^k$ for all $p \in (0, 1]$.

The right hand side is a power series in $1 - p$ that is identically equal to $1$, so by the uniqueness of power series coefficients we must have that $f(0) = 1$ and $f(k) = 0$ for $k \geq 1$, as desired.

However, you can get a good estimator for $p$ by sampling until you see $r \geq 2$ successes and recording the number of failures, $k$, along the way. This gives you a negative binomial distribution, and this has a standard minimum variance unbiased estimator for $p$ of $\frac{r - 1}{r + k - 1}$.

This works just fine.

However, another consideration that we might care about is that we’d like our estimator to not be too expensive when $p$ is small. The number of failures, $k$, has expected value $\frac{r(1 - p)}{p}$, so when $p$ is small we will have to take many samples.
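Here is a hedged sketch of that scheme which also reports its cost; the `negative_binomial_estimate` helper is mine, purely for illustration.

```python
import random

def negative_binomial_estimate(draw, r):
    """Sample until r successes have been seen (r >= 2) and return the MVUE
    (r - 1) / (r + k - 1), where k is the number of failures along the way.
    Also returns the total number of draws used, to show the cost."""
    successes = failures = 0
    while successes < r:
        if draw():
            successes += 1
        else:
            failures += 1
    return (r - 1) / (r + failures - 1), successes + failures

# For small p the estimate stays unbiased, but the cost blows up.
p, r, trials = 0.01, 5, 2_000
results = [negative_binomial_estimate(lambda: random.random() < p, r) for _ in range(trials)]
print(sum(e for e, _ in results) / trials)  # close to 0.01
print(sum(d for _, d in results) / trials)  # around r / p = 500 draws per estimate
```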

The solution is to “round towards zero” at a certain point. The reason we need unboundedly many draws is that we need to distinguish between very small values of $p$. If we’re happy to return an estimate of $0$ for some small non-zero values of $p$, we can bound the cost of sampling using the following theorem:

Fix integers $m, n \geq 1$. Sample from the $X_i$ until we’ve seen either $m$ failures or $n$ successes, and let $F$ be the number of failures we saw and $S$ the number of successes.

These variables have the joint law:

$$\Pr(S = s, F = f) = \begin{cases} \binom{n + f - 1}{n - 1}\, p^n (1 - p)^f & \text{if } s = n \text{ and } 0 \leq f < m, \\[4pt] \binom{m + s - 1}{m - 1}\, p^s (1 - p)^m & \text{if } f = m \text{ and } 0 \leq s < n. \end{cases}$$

Consider the case $S = n$, $F = f$. This result arises precisely when the first $n + f$ draws contain exactly $n$ successes, and the last of those draws is a success.

Each such sequence of draws occurs with probability $p^n (1 - p)^f$, so to calculate the probability of this we just need to count sequences.

There are $\binom{n + f - 1}{n - 1}$ such sequences, distinguished only by where the successes are. One success must be at the end, so there are $n - 1$ successes to distribute among the first $n + f - 1$ positions. This can be done in $\binom{n + f - 1}{n - 1}$ ways, giving the desired result.

The other case is proved identically.
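This joint law is easy to sanity-check numerically. A minimal sketch follows; the helper names `sample_sf` and `joint_law` are mine, for illustration only.

```python
import random
from collections import Counter
from math import comb

def sample_sf(p, m, n):
    """Draw Bernoulli(p) until we have seen m failures or n successes; return (S, F)."""
    s = f = 0
    while s < n and f < m:
        if random.random() < p:
            s += 1
        else:
            f += 1
    return s, f

def joint_law(s, f, p, m, n):
    """The claimed probability that the process stops at (S, F) = (s, f)."""
    if s == n and f < m:
        return comb(n + f - 1, n - 1) * p**n * (1 - p) ** f
    if f == m and s < n:
        return comb(m + s - 1, m - 1) * p**s * (1 - p) ** m
    return 0.0

p, m, n, trials = 0.4, 3, 4, 200_000
counts = Counter(sample_sf(p, m, n) for _ in range(trials))
for (s, f), c in sorted(counts.items()):
    print((s, f), round(c / trials, 4), round(joint_law(s, f, p, m, n), 4))
```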

Now, suppose $m, n \geq 2$.

Define $\hat{p}$ as follows: If $F = m$ (i.e. we stopped because we hit the maximum number of failures), let $\hat{p} = \frac{S}{m + S - 1}$. Otherwise (i.e. we stopped because we hit the maximum number of successes), let $\hat{p} = \frac{n - 1}{n + F - 1}$.
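Concretely, the whole procedure fits in a few lines. A minimal sketch, assuming a `draw()` callable that returns the next $X_i$ (the names here are illustrative, not the genlm-control API):

```python
def estimate_p(draw, m, n):
    """Sample until m failures or n successes (m, n >= 2), then return the
    estimator defined above: S / (m + S - 1) if we hit the failure cap,
    (n - 1) / (n + F - 1) if we hit the success cap."""
    s = f = 0
    while s < n and f < m:
        if draw():
            s += 1
        else:
            f += 1
    if f == m:
        return s / (m + s - 1)
    return (n - 1) / (n + f - 1)
```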

Theorem: $\hat{p}$ is an unbiased estimator for $p$.

Proof:

This will follow from the Rao-Blackwell theorem.

First, note that $(S, F)$ is a sufficient statistic for $p$: the likelihood of the observed sequence is $p^S (1 - p)^F$, so all dependence on the individual draws is via its value.

Now, consider the trivial unbiased estimator $X_1$, i.e. we estimate $1$ if the first draw succeeds and $0$ otherwise.

Let us consider the conditional expectation $E[X_1 \mid S, F]$. Because $(S, F)$ is sufficient, this does not depend on $p$, so it is a legitimate estimator, and by the tower law its expectation is $E[X_1] = p$.

First, suppose $S = n$ and $F = f$ for some $0 \leq f < m$.

Then

$$E[X_1 \mid S = n, F = f] = \frac{\Pr(X_1 = 1, S = n, F = f)}{\Pr(S = n, F = f)} = \frac{p \cdot \binom{n + f - 2}{n - 2}\, p^{n - 1} (1 - p)^f}{\binom{n + f - 1}{n - 1}\, p^n (1 - p)^f} = \frac{n - 1}{n + f - 1}.$$

Where we used that $\Pr(S = n, F = f \mid X_1 = 1) = \binom{n + f - 2}{n - 2}\, p^{n - 1} (1 - p)^f$, because this is just exactly the same process but with one fewer success required to complete (because one success was already provided by $X_1$).

We could do the algebra for the other direction, but instead we exploit the symmetry of the process. By swapping success and failure, we get the same process but with $m$ and $n$ swapped and $p$ replaced by $1 - p$. Thus $E[1 - X_1 \mid S, F]$ is an unbiased estimator for $1 - p$, and applying the case we just proved to the swapped process gives

$$E[X_1 \mid F = m, S = s] = 1 - E[1 - X_1 \mid F = m, S = s] = 1 - \frac{m - 1}{m + s - 1} = \frac{s}{m + s - 1}$$

as desired.

These values are precisely $\hat{p}$ as we previously defined it, completing the proof.
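As a quick empirical check of the theorem, averaging many runs of the `estimate_p` sketch above should recover $p$ for any choice of $m, n \geq 2$ (again, purely illustrative):

```python
import random

for p in (0.05, 0.3, 0.9):
    trials = 200_000
    total = sum(estimate_p(lambda: random.random() < p, m=5, n=5) for _ in range(trials))
    print(p, round(total / trials, 4))  # the two printed values should roughly agree
```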
