Research

Spatial Statistics

A central application of spatial statistics is turning geographically sparse samples into spatially continuous maps for policy-making. For example, the figure shows how PM10 data from a small number of monitoring stations in Europe (left) are used to predict the pollutant’s distribution across the region (middle) as well as the probability of it exceeding the regulatory threshold (right). Learn more about spatial statistics here.
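As a rough illustration of the exceedance map, the sketch below turns a predicted mean and predictive standard deviation at each map location into a probability of exceeding a threshold. It assumes the spatial model yields a Gaussian predictive distribution at each location, which the text above does not state. The function name `exceedance_probability`, the example predictions, and the threshold value are all hypothetical.

```python
import math


def exceedance_probability(pred_mean, pred_sd, threshold):
    """Return P(Z > threshold) for Z ~ Normal(pred_mean, pred_sd**2).

    Assumption: the predictive distribution at a location is Gaussian.
    """
    if pred_sd <= 0:
        raise ValueError("pred_sd must be positive")
    z = (threshold - pred_mean) / pred_sd
    # Standard normal CDF expressed through the error function.
    cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    return 1.0 - cdf


# Hypothetical (mean, sd) predictions at three map locations.
predictions = [(30.0, 5.0), (50.0, 10.0), (65.0, 8.0)]
threshold = 50.0  # illustrative value, not a statement about any regulation

probs = [exceedance_probability(m, s, threshold) for m, s in predictions]
for (m, s), p in zip(predictions, probs):
    print(f"mean={m:5.1f} sd={s:4.1f} -> P(exceed) = {p:.4f}")
```

A location whose predicted mean sits exactly at the threshold gets probability 0.5 whatever its uncertainty. For a predicted mean below the threshold, a larger predictive standard deviation raises the exceedance probability. That is why an exceedance map can look quite different from a map of predicted means.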

Bayesian Computation

Markov chain Monte Carlo (MCMC) has been an essential tool for Bayesian statistics, enabling posterior inference for otherwise intractable probabilistic models. While alternatives have emerged, MCMC remains one of the most reliable and broadly applicable approaches for characterizing the complex posterior distributions common in modern biomedical and public health applications. We push the computational limits of Bayesian inference in the era of Big Data through fundamental innovations in MCMC and other computational algorithms.

(Figure: the novel “zigzag” variant of Hamiltonian Monte Carlo enables the statistical phylogenetics software BEAST to infer correlations among gene mutations and biological traits of viruses.)

Precision Medicine/Health Data Analytics

For a specific disease and clinical question of interest, a single data source rarely provides a sufficient number of patients, longitudinal coverage, and breadth of information. Generating actionable clinical insights on how to treat individual patients necessitates integrating patient experiences across multiple data sources. Bayesian inference provides a natural framework to account for the hierarchical structure of such data as well as to incorporate our scientific understanding of the underlying clinical and biological processes. We build Bayesian machinery and software to support enterprises such as the Johns Hopkins inHealth Precision Medicine initiative and the Observational Health Data Science and Informatics collaborative.

(Figure: the Active Surveillance program aims to minimize the harm and waste from unwarranted surgical removal of low-risk prostate cancer by leveraging surrogate measurements to quantify the cancer state. Hopkins’s participation in the GAP3 global database provides an opportunity to further improve individual-level prediction through Bayesian hierarchical modeling.)