Bayesian nonparametric methods for entropy estimation in spike data
Shannon’s entropy is a basic quantity in information theory and a useful tool for the analysis of neural codes. However, estimating entropy from data is a difficult statistical problem. In this talk, I will discuss the problem of estimating entropy in the “under-sampled regime”, where the number of samples is small relative to the number of symbols. Dirichlet and Pitman-Yor processes provide tractable priors over countably infinite discrete distributions, and have found applications in Bayesian nonparametric statistics and machine learning. I will show that they also provide natural priors for Bayesian entropy estimation. These nonparametric priors let us address two major shortcomings of previously proposed Bayesian entropy estimators: their dependence on knowledge of the total number of symbols, and their inability to account for the heavy-tailed distributions that abound in biological and other natural data. Moreover, by “centering” a Dirichlet process over a flexible parametric model, we can develop Bayesian estimators for the entropy of binary spike trains, using priors designed to exploit the statistical structure of simultaneously recorded spike responses. Finally, in applications to simulated and real neural data, I will show that these estimators perform well in comparison to traditional methods.
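To make the under-sampled regime concrete, the following is a minimal sketch, not the estimators described above: it contrasts the naive “plug-in” entropy estimate with the posterior-mean estimate under a symmetric Dirichlet prior with fixed concentration (the closed form of Wolpert & Wolf, 1995), which is the basic building block that the nonparametric priors generalize. The alphabet size, sample size, and concentration parameter are illustrative choices.

```python
# Minimal sketch: plug-in vs. fixed-alpha Dirichlet posterior-mean entropy
# estimation in the under-sampled regime (N samples << A symbols).
# This is NOT the nonparametric estimator from the talk; alphabet size A,
# sample size N, and alpha below are illustrative assumptions.
import numpy as np
from scipy.special import digamma

rng = np.random.default_rng(0)

A = 1000                                 # number of symbols (assumed known here)
N = 100                                  # far fewer samples than symbols
p = rng.dirichlet(np.full(A, 0.1))       # a sparse, heavy-tailed "true" distribution
true_H = -np.sum(p[p > 0] * np.log(p[p > 0]))

counts = np.bincount(rng.choice(A, size=N, p=p), minlength=A)

# Plug-in estimator: entropy of the empirical distribution (biased downward
# when the distribution is under-sampled).
phat = counts / N
plugin_H = -np.sum(phat[phat > 0] * np.log(phat[phat > 0]))

# Posterior mean of H under a symmetric Dirichlet(alpha) prior:
#   E[H | counts] = psi(a0 + 1) - sum_k (a_k / a0) * psi(a_k + 1),
# with a_k = n_k + alpha and a0 = N + A * alpha.  Entropies are in nats.
alpha = 0.1
a = counts + alpha
a0 = a.sum()
bayes_H = digamma(a0 + 1) - np.sum((a / a0) * digamma(a + 1))

print(f"true H      = {true_H:.3f} nats")
print(f"plug-in H   = {plugin_H:.3f} nats (typically underestimates)")
print(f"Dirichlet H = {bayes_H:.3f} nats (fixed-alpha Bayesian estimate)")
```

Note that this fixed-alpha estimator still requires the total number of symbols A and a hand-chosen concentration; the Dirichlet- and Pitman-Yor-process priors discussed in the talk are aimed precisely at removing those requirements.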