Beyond Gaussian Mixtures: Applying Empirical Bayes to Discrete Data Problems

Written by lossfunctions | Published 2025/09/09

TL;DR: Using real-world examples from actuarial science and a classic thumbtack experiment, this section discusses key concepts such as partial identification and the transformation of binomial models to the Gaussian framework.

Table of Links

Abstract and 1. Introduction

  1. The Compound Decision Paradigm
  2. Parametric Priors
  3. Nonparametric Prior Estimation
  4. Empirical Bayes Methods for Discrete Data
  5. Empirical Bayes Methods for Panel Data
  6. Conclusion

Appendix A. Tweedie’s Formula

Appendix B. Predictive Distribution Comparison

References

5. Empirical Bayes Methods for Discrete Data

The range of empirical Bayes methods extends far beyond the Gaussian mixture settings that we have emphasized thus far. Parametric Poisson mixture models have a long history in actuarial risk analysis and ecology; see, for example, Bühlmann and Straub (1970) and Fisher et al. (1943), respectively. Given observations y1, · · · , yn, the marginal density takes the mixture form

f(y) = ∫ (θ^y e^(−θ) / y!) dG(θ),

where G is the mixing distribution of the Poisson means θ across observations.
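For the Poisson mixture, the posterior mean has a celebrated closed form, Robbins' formula: E[θ | y] = (y + 1) f(y + 1) / f(y), which depends on G only through the marginal f. Below is a minimal sketch verifying this identity against a direct posterior calculation; the two-point mixing distribution (atoms 1.0 and 3.0, equal weights) is an illustrative assumption, not taken from the paper.

```python
import math

# Illustrative two-point mixing distribution G (assumed for the demo).
THETAS = [1.0, 3.0]
WEIGHTS = [0.5, 0.5]

def pois_pmf(y, theta):
    """Poisson pmf: theta^y e^{-theta} / y!."""
    return theta ** y * math.exp(-theta) / math.factorial(y)

def marginal(y):
    """Mixture marginal f(y) = sum_j w_j * Poisson(y; theta_j)."""
    return sum(w * pois_pmf(y, t) for w, t in zip(WEIGHTS, THETAS))

def posterior_mean_direct(y):
    """E[theta | y] computed directly from the posterior over the atoms."""
    post = [w * pois_pmf(y, t) for w, t in zip(WEIGHTS, THETAS)]
    return sum(p * t for p, t in zip(post, THETAS)) / sum(post)

def posterior_mean_robbins(y):
    """Robbins' formula: E[theta | y] = (y + 1) f(y + 1) / f(y)."""
    return (y + 1) * marginal(y + 1) / marginal(y)

print(posterior_mean_robbins(2))  # posterior mean of theta given y = 2
```

The appeal of Robbins' formula is that f can be replaced by empirical frequencies of the observed counts, giving an empirical Bayes rule without ever estimating G.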

The example involves repeated rolls of a common thumbtack. A one was recorded if the tack landed point up and a zero was recorded if the tack landed point down. All tacks started point down. Each tack was flicked or hit with the fingers from where it last rested. A fixed tack was flicked 9 times. The data are recorded in Table 1. There are 320 9-tuples. These arose from 16 different tacks, 2 “flickers,” and 10 surfaces. The tacks vary considerably in shape and in proportion of ones. The surfaces varied from rugs through tablecloths through bathroom floors.

Unconditionally on the type of tack and surface, the experimental outcomes have marginal mixture density

f(y) = ∫ C(9, y) p^y (1 − p)^(9−y) dG(p),

where G is the distribution of the tack- and surface-specific success probability p.
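A standard way to estimate G nonparametrically in this binomial mixture is a fixed-grid EM iteration for the mixing weights. The sketch below is a generic fixed-grid EM, not the authors' implementation; the grid, iteration count, and simulated thumbtack-style data (two-point prior on p) are all illustrative assumptions.

```python
import math
import random

M = 9  # flips per tack, as in the thumbtack experiment

def binom_pmf(y, m, p):
    """Binomial pmf C(m, y) p^y (1-p)^(m-y)."""
    return math.comb(m, y) * p ** y * (1.0 - p) ** (m - y)

def npmle_em(counts, grid, iters=200):
    """Fixed-grid EM for the mixing weights of a binomial mixture.

    counts: success counts y_i out of M trials.
    grid:   candidate support points p_j for G.
    Returns estimated weights w_j on the grid.
    """
    J = len(grid)
    w = [1.0 / J] * J
    # Precompute the likelihood matrix L[i][j] = B(y_i; M, p_j).
    L = [[binom_pmf(y, M, p) for p in grid] for y in counts]
    for _ in range(iters):
        new_w = [0.0] * J
        for row in L:
            denom = sum(wj * lij for wj, lij in zip(w, row))
            for j in range(J):
                new_w[j] += w[j] * row[j] / denom
        w = [nw / len(counts) for nw in new_w]
    return w

# Simulate counts from an assumed two-point prior on p.
random.seed(1)
counts = []
for _ in range(300):
    p = random.choice([0.3, 0.7])
    counts.append(sum(random.random() < p for _ in range(M)))

grid = [j / 20 for j in range(1, 20)]  # grid 0.05, 0.10, ..., 0.95
w = npmle_em(counts, grid)
```

Each EM pass computes posterior responsibilities of the grid points for every tack and then averages them, which is guaranteed to increase the mixture log-likelihood at every iteration.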

The binomial mixture model is easily adapted to situations with varying numbers of trials m, but a cautionary note is required regarding identification in such models. Only m + 1 distinct frequencies can be observed for B(m, p) binomials, and this implies that only m + 1 moments of G are identifiable. Partial identification in discrete response models is discussed in more detail in Koenker and Gu (2024) in the context of the Kline and Walters (2021) model of employment discrimination.
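The moment restriction is easy to see concretely: the binomial marginal depends on G only through its moments up to order m, so two quite different priors matched on those moments are observationally equivalent. The sketch below, with illustrative m = 2, matches a two-point prior on p = {0.3, 0.7} against a Beta(2.625, 2.625) prior (chosen to share the same mean 0.5 and second moment 0.29) and shows their binomial marginals coincide exactly.

```python
import math

M = 2  # number of trials (illustrative)

def beta_fn(x, y):
    """Beta function B(x, y) = Gamma(x) Gamma(y) / Gamma(x + y)."""
    return math.gamma(x) * math.gamma(y) / math.gamma(x + y)

# Prior 1: two-point prior on p = {0.3, 0.7}, equal weights.
# Mean 0.5, second moment (0.09 + 0.49) / 2 = 0.29.
def marginal_two_point(k):
    return 0.5 * math.comb(M, k) * (0.3 ** k * 0.7 ** (M - k)
                                    + 0.7 ** k * 0.3 ** (M - k))

# Prior 2: Beta(a, a) with a solving E[p^2] = (a + 1)/(2(2a + 1)) = 0.29,
# i.e. a = 2.625, so both priors share moments of order 1 and 2.
A = 2.625
def marginal_beta(k):
    # Beta-binomial pmf: C(M, k) B(k + a, M - k + a) / B(a, a).
    return math.comb(M, k) * beta_fn(k + A, M - k + A) / beta_fn(A, A)

for k in range(M + 1):
    print(k, marginal_two_point(k), marginal_beta(k))
```

No amount of B(2, p) data can distinguish these two priors; with varying m, larger m observations carry the extra moment information.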

In binomial mixture models with a large number of trials it is often convenient to transform to the Gaussian model, as, for example, in the extensive literature on baseball batting averages; see, e.g., Gu and Koenker (2017a). In other settings it is more convenient to consider logistic models, as with the Rasch model commonly used in educational testing or the Bradley-Terry model for rating participants in pairwise competition; see Gu and Koenker (2022).
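The transformation to the Gaussian model is typically done with a variance-stabilizing arcsine transform, z = arcsin(√((y + 1/4)/(m + 1/2))), which is approximately N(arcsin(√p), 1/(4m)) for moderate m, so the Gaussian empirical Bayes machinery applies with known variance. A small sketch checking the stabilized standard deviation on simulated data; the choices m = 100 and p = 0.3 are illustrative.

```python
import math
import random
import statistics

def arcsine_transform(y, m):
    """Variance-stabilizing transform for binomial counts:
    z = arcsin(sqrt((y + 1/4) / (m + 1/2))),
    approximately N(arcsin(sqrt(p)), 1/(4m)) for moderate m."""
    return math.asin(math.sqrt((y + 0.25) / (m + 0.5)))

# Simulate binomial counts and check that the sd of the transformed
# values is close to the nominal 1 / (2 sqrt(m)), regardless of p.
random.seed(0)
m, p = 100, 0.3  # illustrative assumptions
draws = [sum(random.random() < p for _ in range(m)) for _ in range(5000)]
zs = [arcsine_transform(y, m) for y in draws]
sd = statistics.stdev(zs)
print(sd)  # should be near 1 / (2 * sqrt(100)) = 0.05
```

Because the transformed variance 1/(4m) is free of the unknown p, all observations with the same m are homoscedastic on the transformed scale, which is exactly what the Gaussian compound decision framework assumes.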

Authors:

(1) Roger Koenker;

(2) Jiaying Gu.


This paper is available on arxiv under CC BY 4.0 DEED license.

