JHU BLAST Working Group

Bayesian Learning and Spatio-Temporal modeling
Department of Biostatistics
Johns Hopkins Bloomberg School of Public Health

Leadership

Abhi Datta

Professor

Department of Biostatistics

Abhi develops statistical and machine learning methods for large spatial datasets as well as Bayesian models for multi-source epidemiological datasets.

Aki Nishimura

Assistant Professor

Department of Biostatistics

Aki uses Bayesian methods and statistical computing to tackle methodological challenges in healthcare analytics and large-scale biomedical applications.

Upcoming Events

From Dependence to Heterogeneity: Bayesian Methods for Structured High-Dimensional Inference

Wednesday, February 11, 2026

Speaker: Soham Ghosh (University of Wisconsin, Madison)

Modern scientific studies routinely record multiple predictors alongside various correlated outcomes, often of mixed types, such as continuous measurements and binary disease indicators. Analysing such data outcome-by-outcome can ignore residual dependence, distort uncertainty quantification, and reduce power in high dimensions. This talk introduces Bayesian tools for learning multivariate structure and heterogeneous effects in high dimensions, with scalable computation and theoretical guarantees.

I will first present mixed-mSSL, a joint regression framework for mixed-type multivariate responses built on latent Gaussian augmentation. By combining spike-and-slab LASSO priors on regression effects with sparse graphical priors on the residual precision matrix, mixed-mSSL simultaneously selects predictors and learns an outcome-dependence network. A scalable Monte Carlo ECM algorithm enables MAP estimation. We establish posterior contraction rates for both the coefficient matrix and the precision matrix, as well as support recovery guarantees when the number of outcomes diverges. In extensive simulation studies and applications spanning medicine to ecology, mixed-mSSL demonstrates excellent finite-sample properties.

Next, I move beyond constant effects to settings where covariate impacts may vary with context. I will discuss sparseVCBART, which places BART ensembles on varying-coefficient functions while inducing two-way sparsity: selecting relevant covariates and identifying which modifiers drive effect heterogeneity. As a natural extension, I will also outline ongoing work on multivariate BART for multiple correlated outcomes, allowing outcome-adaptive tree structure while borrowing strength via a shared residual covariance.

Bayesian Transfer Learning Approaches for Large-scale Spatiotemporal Problems

Wednesday, February 18, 2026

Speaker: Luca Presicce (University of Milan, Bicocca)

The growing availability of large-scale geospatial and spatiotemporal data presents new opportunities and challenges for statistical modeling in environmental, technological, medical, and other complex areas that increasingly rely on massive multivariate spatiotemporal datasets. Yet Bayesian learning for such problems remains severely limited by computational bottlenecks and the lack of flexible modeling tools. Modern applications require methods that are adaptive and effective, yet computationally efficient, scalable to massive datasets, and capable of delivering reliable automated inference with principled uncertainty quantification and, where possible, minimal expert human intervention. Classical Bayesian approaches, although theoretically appealing and offering rich inferential frameworks, often become computationally infeasible in data-rich environments, especially when confronted with massive datasets or dynamic, high-dimensional dependence structures. Existing approaches often fail to scale, leaving a gap between the theoretical richness of Bayesian inference and its practical deployment in data-rich applications. This thesis develops Bayesian transfer learning methodologies to address these challenges, enabling efficient information propagation and scalable inference across large spatial and spatiotemporal domains. It provides a unified framework that merges distributional theory for matrix-variate models with computational innovations in Bayesian predictive stacking.
Through extensive simulation experiments and data applications to global and satellite monitoring of vegetation indices, sea surface temperature, and land-atmospheric climate composition, the thesis also demonstrates the potential of Bayesian transfer learning to redefine spatial and spatiotemporal multivariate modeling, providing flexible, computationally efficient solutions that open the way for scalable, automated, and truly modern tools for geospatial learning in data-rich environments.