Longest path search

A lovely counting inequality

I stumbled upon this very nice inequality bounding the number of Boolean vectors of length $n$ which have no more than $\rho n$ nonzero entries. (This has some applications to data availability sampling and, in particular, to ZODA, which I might write about soon.)

Let $n$ be any positive integer and fix $0 < \rho \le 1/2$. Assume, for simplicity, that $\rho n$ is an integer (though the general case is obvious with some floors). Then, the following is true:

$$\sum_{i=0}^{\rho n} \binom{n}{i} \le 2^{n h_2(\rho)},$$

where $h_2$ is the binary entropy function, defined

$$h_2(\rho) = -(\rho \log_2(\rho) + (1-\rho)\log_2(1-\rho)).$$

An alternative form is

$$\sum_{i=0}^{\rho n} \binom{n}{i} \le \exp(n h(\rho)),$$

where $h$ is the natural entropy function (i.e., $h_2$ except replacing the $\log_2$ with the natural log).
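
Before the proof, here is a quick numerical sanity check of both forms. This is not part of the argument, just an illustrative Python snippet for one arbitrary choice of $n$ and $\rho$ with $\rho n$ an integer:

```python
from math import comb, exp, log, log2

def h2(rho):  # binary entropy
    return -(rho * log2(rho) + (1 - rho) * log2(1 - rho))

def h(rho):   # natural entropy
    return -(rho * log(rho) + (1 - rho) * log(1 - rho))

n, rho = 100, 0.3        # arbitrary choice; any pair with rho * n an integer works
k = round(rho * n)       # rho * n = 30
tail = sum(comb(n, i) for i in range(k + 1))  # number of vectors with <= rho*n ones

assert tail <= 2 ** (n * h2(rho))   # first form of the bound
assert tail <= exp(n * h(rho))      # alternative (natural-log) form
print(f"{tail:.3e} <= {2 ** (n * h2(rho)):.3e}")
```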

The only proof of this I knew went via the moment generating function and a Chernoff bound. However, while reading chapter 12 of Cover's information theory book, I discovered a more elementary proof, which can be seen as a relatively straightforward extension/strengthening of the version given there.

Proof

Note that (via the binomial expansion)

$$1 = (\rho + (1-\rho))^n = \sum_{i=0}^{n} \binom{n}{i} \rho^i (1-\rho)^{n-i}.$$

Certainly, clipping the tail of the right-hand side at $\rho n$ means that

$$\sum_{i=0}^{\rho n} \binom{n}{i} \rho^i (1-\rho)^{n-i} \le 1.$$

Now, note that, since $\rho \le 1/2$, we have $\rho^{\rho n}(1-\rho)^{(1-\rho)n} \le \rho^i(1-\rho)^{n-i}$ for all $i \le \rho n$, since $\rho \le 1-\rho$ (each term equals $(1-\rho)^n(\rho/(1-\rho))^i$, which is nonincreasing in $i$ and therefore smallest at $i = \rho n$). From here, the punchline is obvious, but worth writing anyways:

$$\rho^{\rho n}(1-\rho)^{(1-\rho)n} \sum_{i=0}^{\rho n}\binom{n}{i} \le \sum_{i=0}^{\rho n}\binom{n}{i}\rho^i(1-\rho)^{n-i} \le 1.$$
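
(Not part of the proof, but this chain is easy to check numerically, using the same illustrative $n$ and $\rho$ as before:)

```python
from math import comb

n, rho = 100, 0.3
k = round(rho * n)                            # rho * n
smallest = rho ** k * (1 - rho) ** (n - k)    # the term at i = rho * n

# every term with i <= rho*n is at least the term at i = rho*n ...
assert all(smallest <= rho ** i * (1 - rho) ** (n - i) for i in range(k + 1))

# ... so constant * (clipped count) <= clipped binomial sum <= 1
tail = sum(comb(n, i) for i in range(k + 1))
clipped = sum(comb(n, i) * rho ** i * (1 - rho) ** (n - i) for i in range(k + 1))
assert smallest * tail <= clipped <= 1
```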

Moving the constant $\rho^{\rho n}(1-\rho)^{(1-\rho)n}$ to the right-hand side gives the result that

$$\sum_{i=0}^{\rho n}\binom{n}{i} \le \rho^{-\rho n}(1-\rho)^{-(1-\rho)n} = 2^{-n(\rho\log_2(\rho)+(1-\rho)\log_2(1-\rho))} = 2^{n h_2(\rho)},$$

as required.
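
And, for good measure, a tiny numerical double-check of that last algebraic step, i.e., that the constant $\rho^{-\rho n}(1-\rho)^{-(1-\rho)n}$ really is $2^{n h_2(\rho)}$ (same illustrative $n$ and $\rho$):

```python
from math import isclose, log2

def h2(rho):  # binary entropy, as defined above
    return -(rho * log2(rho) + (1 - rho) * log2(1 - rho))

n, rho = 100, 0.3
constant = rho ** (-rho * n) * (1 - rho) ** (-(1 - rho) * n)
assert isclose(constant, 2 ** (n * h2(rho)), rel_tol=1e-9)
```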