Bayesian Analysis of Sparse Counts Obtained From the Unrelated Question Design


  •  Balgobin Nandram    
  •  Yuan Yu    

Abstract

In sample surveys with sensitive items, sampled units may not respond, or they may respond untruthfully. Usually a negative answer is given when the true answer is positive, leading to an estimate of the population proportion of positives (the sensitive proportion) that is too small. In our study, we have binary data obtained from the unrelated-question design, and both the sensitive proportion and the nonsensitive proportion are of interest. A respondent answers the sensitive item with a known probability, and to avoid non-identifiable parameters, at least two different random mechanisms are used, but only one for each cluster of respondents. The key point is that the counts are sparse (very small sample sizes), and we show how to overcome some of the problems associated with the unrelated-question design. A standard approach to this problem is the expectation-maximization (EM) algorithm. However, because we consider only small sample sizes (sparse counts), the EM algorithm may not converge, and asymptotic theory, which would permit normality assumptions for inference, is not appropriate; we therefore develop a Bayesian method. To compare the EM algorithm and the Bayesian method, we present an example with sparse data on college cheating and a simulation study that illustrates the properties of our procedure. Finally, we discuss two extensions that accommodate finite population sampling and optional responses.
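
To make the design concrete, the sketch below shows a generic data-augmentation Gibbs sampler for the unrelated-question design with known, cluster-specific design probabilities. It is only an illustration under stated assumptions, not the authors' exact procedure: the cluster sizes, counts, and design probabilities are hypothetical, and independent Beta(1, 1) priors are assumed for the sensitive and nonsensitive proportions.

    import numpy as np

    rng = np.random.default_rng(1)

    # Hypothetical example data (not from the paper):
    p = np.array([0.7, 0.3])   # known probabilities of receiving the sensitive item, one per cluster
    n = np.array([25, 20])     # cluster sample sizes (sparse counts)
    y = np.array([6, 5])       # observed "yes" counts per cluster

    # Assumed Beta(1, 1) priors on the sensitive and nonsensitive proportions
    a_s = b_s = a_u = b_u = 1.0

    pi_s, pi_u = 0.5, 0.5
    draws = []
    for it in range(6000):
        # Data augmentation: split each cluster's "yes" and "no" counts by
        # which item (sensitive vs. unrelated) was actually answered.
        lam = p * pi_s + (1 - p) * pi_u                          # P(yes) in each cluster
        yes_s = rng.binomial(y, p * pi_s / lam)                  # "yes" answers to the sensitive item
        no_s = rng.binomial(n - y, p * (1 - pi_s) / (1 - lam))   # "no" answers to the sensitive item
        yes_u = y - yes_s
        no_u = (n - y) - no_s

        # Conjugate beta updates given the completed data
        pi_s = rng.beta(a_s + yes_s.sum(), b_s + no_s.sum())
        pi_u = rng.beta(a_u + yes_u.sum(), b_u + no_u.sum())
        if it >= 1000:                                           # discard burn-in
            draws.append((pi_s, pi_u))

    draws = np.array(draws)
    print("posterior means (sensitive, nonsensitive):", draws.mean(axis=0))
    print("95% credible interval, sensitive proportion:",
          np.percentile(draws[:, 0], [2.5, 97.5]))

Because the two clusters use different design probabilities, the mixture probabilities p_j * pi_s + (1 - p_j) * pi_u identify both proportions, and the posterior summaries remain usable even when the counts are too sparse for EM-based asymptotic inference.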



This work is licensed under a Creative Commons Attribution 4.0 License.