A lovely counting inequality
I stumbled upon this very nice inequality bounding the number of Boolean vectors of length $n$ which have no more than $\delta n$ nonzero entries. (This has some applications to data availability sampling, and, in particular to ZODA, which I might write about soon.)
Let $0 \le \delta \le 1/2$ be any number and set $k = \delta n$. Assume, for simplicity, that $k$ is an integer (though the general case is obvious with some floors). Then, the following is true:
$$
\sum_{i=0}^{\delta n} \binom{n}{i} \le 2^{nh(\delta)},
$$
where $h$ is the binary entropy function, defined
$$
h(\delta) = -\delta \log_2 \delta - (1-\delta)\log_2(1-\delta).
$$
An alternative form is
$$
\sum_{i=0}^{\delta n} \binom{n}{i} \le e^{n\bar{h}(\delta)},
$$
where $\bar{h}$ is the natural entropy function (i.e., the same as $h$, except replacing the $\log_2$ with the natural log).
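As a quick numerical sanity check (not part of the original argument), the following sketch verifies both forms of the bound for a few choices of $n$ and $\delta$; the function names `h2` and `h_nat` are my own:

```python
# Numerically check sum_{i <= dn} C(n, i) <= 2^{n h(d)} and the
# natural-log form <= e^{n h_nat(d)} for a few (n, delta) pairs.
from math import comb, exp, log, log2

def h2(d):
    """Binary entropy of d, in bits."""
    return -d * log2(d) - (1 - d) * log2(1 - d)

def h_nat(d):
    """Entropy of d in nats (log base e)."""
    return -d * log(d) - (1 - d) * log(1 - d)

for n, d in [(10, 0.2), (50, 0.3), (100, 0.5)]:
    k = round(d * n)  # delta * n, assumed an integer here
    lhs = sum(comb(n, i) for i in range(k + 1))
    assert lhs <= 2 ** (n * h2(d))   # binary-entropy form
    assert lhs <= exp(n * h_nat(d))  # natural-entropy form
    print(f"n={n}, delta={d}: {lhs} <= {2 ** (n * h2(d)):.4g}")
```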
The only proof I knew of this was via an argument using the moment generating function and a Chernoff bound. On the other hand, I was reading chapter 12 of Cover's information theory book and discovered a more elementary proof, which can be seen as a relatively straightforward extension/strengthening of Cover's version in the book.
Proof
Note that (via binomial expansion)
$$
1 = (\delta + (1-\delta))^n = \sum_{i=0}^{n} \binom{n}{i} \delta^i (1-\delta)^{n-i}.
$$
Certainly, clipping the tail of the right hand side to $i \le \delta n$ means that
$$
1 \ge \sum_{i=0}^{\delta n} \binom{n}{i} \delta^i (1-\delta)^{n-i}.
$$
Now, note that, since $\delta \le 1/2$, we have that
$$
\delta^i (1-\delta)^{n-i} = (1-\delta)^n \left(\frac{\delta}{1-\delta}\right)^i \ge (1-\delta)^n \left(\frac{\delta}{1-\delta}\right)^{\delta n} = 2^{-nh(\delta)}
$$
for all $i \le \delta n$, since $\delta/(1-\delta) \le 1$. From here, the punchline is obvious, but worth writing anyways:
$$
1 \ge \sum_{i=0}^{\delta n} \binom{n}{i} \delta^i (1-\delta)^{n-i} \ge 2^{-nh(\delta)} \sum_{i=0}^{\delta n} \binom{n}{i}.
$$
Moving the constant $2^{-nh(\delta)}$ to the right hand side gives the result that
$$
\sum_{i=0}^{\delta n} \binom{n}{i} \le 2^{nh(\delta)},
$$
as required.
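The chain of inequalities above can also be checked numerically. The sketch below (my own illustration, with an arbitrary choice of $n = 40$, $\delta = 1/4$) confirms that every term weight in the truncated sum dominates the floor weight $\delta^{\delta n}(1-\delta)^{(1-\delta)n} = 2^{-nh(\delta)}$, and that the resulting counting bound holds:

```python
# Verify the key step of the proof: for d <= 1/2 and every i <= d*n,
# d^i (1-d)^(n-i) >= d^(dn) (1-d)^((1-d)n) = 2^(-n h(d)),
# hence 1 >= 2^(-n h(d)) * sum_{i <= dn} C(n, i).
from math import comb

n, d = 40, 0.25  # arbitrary illustrative values
k = round(d * n)
floor_weight = d ** k * (1 - d) ** (n - k)  # equals 2^(-n h(d))

# Every term weight in the truncated sum dominates the floor weight,
# because (d / (1 - d))^i is nonincreasing in i when d <= 1/2.
assert all(d ** i * (1 - d) ** (n - i) >= floor_weight for i in range(k + 1))

# The truncated weighted sum is at most the full binomial expansion of 1.
partial = sum(comb(n, i) * d ** i * (1 - d) ** (n - i) for i in range(k + 1))
assert partial <= 1.0

# Hence the counting bound: sum_{i <= dn} C(n, i) <= 2^{n h(d)}.
assert sum(comb(n, i) for i in range(k + 1)) <= 1 / floor_weight
print("proof steps check out for n =", n, "delta =", d)
```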