StatModeling Memorandum


Bayesian statistical modeling with Stan, R, and Python. Occasional book reviews.

Report of Michael Betancourt's Stan Lecture

We were really happy to host Michael Betancourt's Stan lecture on June 4 at DWANGO. To tell the truth, this was not the first Stan meeting in Japan: three BUGS/Stan meetings were held about 2 years ago.

DWANGO is well known for its video service (nicovideo), and the following lectures were broadcast on it. I was surprised that there were 400+ viewers even though the content was specialized and in English.

Below I summarize each talk.

Hiroki ITO "Dealing with latent discrete parameters in Stan"

The video of this talk also includes the next presentation. The slides are here.

This lecture was based on the book Hiroki translated:
BUGSで学ぶ階層モデリング入門: 個体群のベイズ解析


The first half of his presentation was on data augmentation, and the second half was on hidden Markov models. Data augmentation is an amazing technique that estimates the probability of existence and the probability of observation separately by adding data for virtual individuals. It can also estimate the number of individuals that actually exist. The technique is very flexible and widely applicable. I think data augmentation is one of the best parts of the above book.

The above book is about BUGS, but Hiroki has ported all of its models to Stan and pushed them to the official Stan repository. Ch.06 is the first directory that contains source code for data augmentation.
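The data augmentation idea described above can be illustrated with a minimal Python sketch. It is not Hiroki's Stan code; it only shows the kind of marginal likelihood one writes when the latent "exists or not" indicator is summed out, as Stan requires for discrete parameters. The names `psi` (existence probability), `p` (detection probability), the helper functions, and the toy detection histories are all assumptions made for illustration.

```python
import math


def log_sum_exp(a, b):
    """Numerically stable log(exp(a) + exp(b))."""
    m = max(a, b)
    return m + math.log(math.exp(a - m) + math.exp(b - m))


def log_lik_individual(history, psi, p):
    """Marginal log-likelihood of one real or virtual individual.

    history: list of 0/1 detections over the survey occasions
    psi:     probability that the individual exists (assumed name)
    p:       detection probability per occasion, given existence (assumed name)
    """
    n_occ = len(history)
    n_det = sum(history)
    # Log-probability of this history if the individual exists.
    lp_exists = (math.log(psi)
                 + n_det * math.log(p)
                 + (n_occ - n_det) * math.log(1.0 - p))
    if n_det > 0:
        # Detected at least once, so the individual must exist.
        return lp_exists
    # Never detected: it exists but was always missed, or it does not exist.
    return log_sum_exp(lp_exists, math.log(1.0 - psi))


def prob_exists_if_undetected(n_occ, psi, p):
    """Probability that a never-detected individual exists, given psi and p."""
    missed = psi * (1.0 - p) ** n_occ
    return missed / (missed + (1.0 - psi))


# Hypothetical data: 3 observed individuals plus 5 all-zero virtual ones.
observed = [[1, 0, 1], [0, 1, 0], [1, 1, 1]]
virtual = [[0, 0, 0] for _ in range(5)]
psi, p = 0.5, 0.4

total = sum(log_lik_individual(h, psi, p) for h in observed + virtual)
expected_n = len(observed) + len(virtual) * prob_exists_if_undetected(3, psi, p)
print("log-likelihood:", total)
print("expected number of existing individuals:", expected_n)
```

In a full analysis `psi` and `p` would be parameters estimated by the sampler rather than fixed values, and the expected population size would be averaged over their posterior draws.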

Kentaro Matsuura "Replica exchange MCMC with Stan and R"

The slides are here.

The source code is here. The details are here (in Japanese). I'd like to add to the answer given to one question: simulated annealing can find an optimum, whereas replica exchange MCMC yields the whole distribution. That is a big difference.
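To make the contrast with simulated annealing concrete, here is a minimal replica exchange sketch in plain Python. It is not the Stan and R code from the talk. The bimodal target, the inverse-temperature ladder `betas`, and the step-size rule are all assumptions chosen for illustration. Several replicas run random-walk Metropolis at different temperatures, adjacent replicas occasionally swap states, and the draws of the `beta = 1` replica approximate the target distribution, including both modes.

```python
import math
import random


def energy(x):
    """Negative log density (up to a constant) of an equal mixture of N(-4, 1) and N(4, 1)."""
    a = -0.5 * (x + 4.0) ** 2
    b = -0.5 * (x - 4.0) ** 2
    m = max(a, b)
    return -(m + math.log(math.exp(a - m) + math.exp(b - m)))


def replica_exchange(n_iter, betas, step=1.0, seed=1):
    """Return the draws of the replica with inverse temperature betas[0]."""
    rng = random.Random(seed)
    xs = [4.0 for _ in betas]  # every replica starts in the right-hand mode
    draws = []
    for _ in range(n_iter):
        # 1) Metropolis update within each replica, targeting exp(-beta * energy).
        for k, beta in enumerate(betas):
            prop = xs[k] + rng.gauss(0.0, step / math.sqrt(beta))
            log_accept = -beta * (energy(prop) - energy(xs[k]))
            if rng.random() < math.exp(min(0.0, log_accept)):
                xs[k] = prop
        # 2) Propose swapping the states of one random adjacent pair of replicas.
        if len(betas) > 1:
            k = rng.randrange(len(betas) - 1)
            log_swap = (betas[k] - betas[k + 1]) * (energy(xs[k]) - energy(xs[k + 1]))
            if rng.random() < math.exp(min(0.0, log_swap)):
                xs[k], xs[k + 1] = xs[k + 1], xs[k]
        draws.append(xs[0])
    return draws


betas = [1.0, 0.3, 0.1, 0.03]  # hypothetical ladder; betas[0] = 1 is the target
draws = replica_exchange(5000, betas)
frac_right = sum(1 for x in draws if x > 0) / len(draws)
print("fraction of target-replica draws in the right-hand mode:", frac_right)
```

The hot replicas cross the barrier between the modes easily and pass those states down the ladder through swaps, so the target replica visits both modes instead of settling into one.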

Michael Betancourt "Scalable Bayesian Inference with Hamiltonian Monte Carlo"

He started with an overview of tall data and wide data, then moved on to Hamiltonian Monte Carlo (HMC), which samples efficiently from high-dimensional models (i.e. models with many parameters). His introduction to HMC using gravity as an analogy was really nice. Xiangze, one of the participants, wrote a good blog post about it (in Japanese). NUTS is an algorithm that tunes the parameters of HMC automatically. The details of NUTS seem to have been omitted because they would have been hard for us to follow. He also introduced Adiabatic Monte Carlo (arXiv), perhaps because of my presentation. Thank you! The animation was so cool.
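For readers who have not seen HMC written out, the following is a minimal Python sketch of the basic algorithm, not a reproduction of anything shown in the lecture. It assumes a one-dimensional standard normal target and hand-picked values for the step size `eps` and the number of leapfrog steps `n_steps`, which are the kind of tuning parameters mentioned above. A "particle" with random momentum is moved through the potential by leapfrog integration, and the end point is accepted or rejected with a Metropolis step.

```python
import math
import random


def potential(q):
    """U(q) = q^2 / 2, the negative log density of N(0, 1) up to a constant."""
    return 0.5 * q * q


def grad_potential(q):
    """Gradient of the potential above."""
    return q


def leapfrog(q, p, eps, n_steps):
    """Simulate Hamiltonian dynamics with the leapfrog integrator."""
    p -= 0.5 * eps * grad_potential(q)      # initial half step for momentum
    for i in range(n_steps):
        q += eps * p                        # full step for position
        if i < n_steps - 1:
            p -= eps * grad_potential(q)    # full step for momentum
    p -= 0.5 * eps * grad_potential(q)      # final half step for momentum
    return q, p


def hmc(n_iter, eps=0.2, n_steps=10, seed=1):
    """Draw n_iter samples with fixed, hand-picked eps and n_steps (assumed values)."""
    rng = random.Random(seed)
    q = 0.0
    draws = []
    for _ in range(n_iter):
        p = rng.gauss(0.0, 1.0)             # resample the momentum
        h_old = potential(q) + 0.5 * p * p
        q_new, p_new = leapfrog(q, p, eps, n_steps)
        h_new = potential(q_new) + 0.5 * p_new * p_new
        # Metropolis correction for the integrator's energy error.
        if rng.random() < math.exp(min(0.0, h_old - h_new)):
            q = q_new
        draws.append(q)
    return draws


draws = hmc(4000)
mean = sum(draws) / len(draws)
var = sum((x - mean) ** 2 for x in draws) / len(draws)
print("sample mean:", mean, "sample variance:", var)
```

Because the leapfrog integrator nearly conserves the total energy, most proposals are accepted even though they land far from the starting point.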

Michael Betancourt "Some Bayesian Modeling Techniques in Stan"

He covered linear models, GLMs, and hierarchical models step by step and in an accessible way. The comparison of centered and non-centered parameterizations was very interesting. Centered models are ordinary hierarchical models, and non-centered models are so-called reparameterized (scale-separated) hierarchical models. Please refer to the above video or the JSS paper on Stan for details.

I'd like to quote part of the JSS paper on Stan.

Betancourt and Girolami (2013) provide a detailed analysis of the benefits of centered vs. non-centered parameterizations of hierarchical models in Euclidean and Riemannian HMC, with the conclusion that non-centered are best when data are sparse and centered when the data strongly identifies coefficients.

However, it is generally difficult to measure how sparse the data are, so the following recommendation (also from the JSS paper) may be more practical.

We recommend starting with the more natural centered parameterization and moving to the non-centered parameterization if sampling of the coefficients does not mix well.

It seems that the non-centered parameterization can also be applied partially (i.e. to only a stratum or a group).
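The difference between the two parameterizations, and the partial use mentioned above, can be sketched in Python as follows. This is an illustration under assumptions, not code from the lecture or the JSS paper: the toy model has one observation per group, the hyperparameters `mu`, `tau`, and `sigma` are held fixed, and the flag list `non_centered` is a hypothetical way to choose the parameterization group by group. In a centered group the sampler works on `theta_j ~ Normal(mu, tau)` directly. In a non-centered group it works on `theta_raw_j ~ Normal(0, 1)` and recovers `theta_j = mu + tau * theta_raw_j`.

```python
import math


def normal_lpdf(x, mu, sigma):
    """Log density of Normal(mu, sigma) at x."""
    return (-0.5 * math.log(2.0 * math.pi) - math.log(sigma)
            - 0.5 * ((x - mu) / sigma) ** 2)


def log_joint(params, y, mu, tau, sigma, non_centered):
    """Log joint density of a toy hierarchical model, one observation per group.

    params[j] is theta_j for a centered group and theta_raw_j for a
    non-centered group; non_centered[j] selects the parameterization.
    """
    lp = 0.0
    for j, y_j in enumerate(y):
        if non_centered[j]:
            # theta_raw_j ~ Normal(0, 1); theta_j = mu + tau * theta_raw_j
            lp += normal_lpdf(params[j], 0.0, 1.0)
            theta_j = mu + tau * params[j]
        else:
            # theta_j ~ Normal(mu, tau)
            lp += normal_lpdf(params[j], mu, tau)
            theta_j = params[j]
        lp += normal_lpdf(y_j, theta_j, sigma)
    return lp


# Hypothetical values for illustration only.
y = [1.0, -0.5, 2.0]
mu, tau, sigma = 0.3, 2.0, 1.0
theta = [0.8, -0.2, 1.5]
theta_raw = [(t - mu) / tau for t in theta]

lp_centered = log_joint(theta, y, mu, tau, sigma, [False, False, False])
lp_non_centered = log_joint(theta_raw, y, mu, tau, sigma, [True, True, True])
# Only the first group is non-centered here.
lp_partial = log_joint([theta_raw[0], theta[1], theta[2]], y, mu, tau, sigma,
                       [True, False, False])
print(lp_centered, lp_non_centered, lp_partial)
```

At matching points the log densities differ only by `log(tau)` per non-centered group, which is the Jacobian of the change of variables. Both forms describe the same model, and what changes is the geometry the sampler has to explore.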

Thank you very much again to DWANGO, our staff, Hiroki, and Michael Betancourt!