Modern scientific studies routinely record many predictors alongside several correlated outcomes, often of mixed types, such as continuous measurements and binary disease indicators. Analysing such data one outcome at a time ignores residual dependence, which can distort uncertainty quantification and reduce power in high dimensions. This talk introduces Bayesian tools for learning multivariate structure and heterogeneous effects in high dimensions, with scalable computation and theoretical guarantees.
I will first present mixed-mSSL, a joint regression framework for mixed-type multivariate responses built on latent Gaussian augmentation. By combining spike-and-slab LASSO priors on the regression effects with sparse graphical priors on the residual precision matrix, mixed-mSSL simultaneously selects predictors and learns an outcome-dependence network. A scalable Monte Carlo ECM algorithm enables MAP estimation. We establish posterior contraction rates for both the coefficient matrix and the precision matrix, together with support recovery guarantees, in regimes where the number of outcomes diverges. Extensive simulation studies and applications ranging from medicine to ecology show that mixed-mSSL has excellent finite-sample performance.
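As a rough sketch of this kind of setup, and not necessarily the exact specification used in mixed-mSSL, a latent Gaussian regression with a spike-and-slab LASSO prior can be written as follows. All symbols here (n, p, q, B, Ω, θ, λ0, λ1) are notation assumed for illustration; the mixing weight θ and the two Laplace rates are the usual ingredients of a spike-and-slab LASSO prior.

```latex
% Illustrative notation (assumed, not taken from the abstract):
% x_i in R^p are predictors, y_i in R^q are mixed-type outcomes,
% z_i in R^q is a latent Gaussian vector, B is the p x q coefficient matrix,
% Omega is the q x q residual precision matrix.
\begin{align*}
  z_i \mid x_i &\sim \mathcal{N}_q\!\left(B^{\top} x_i,\; \Omega^{-1}\right),
      \qquad i = 1, \dots, n, \\
  y_{ik} &=
    \begin{cases}
      z_{ik}, & \text{if outcome } k \text{ is continuous}, \\
      \mathbb{1}\{z_{ik} > 0\}, & \text{if outcome } k \text{ is binary},
    \end{cases} \\
  \pi(\beta_{jk} \mid \theta) &=
      \theta\, \frac{\lambda_1}{2}\, e^{-\lambda_1 |\beta_{jk}|}
      + (1 - \theta)\, \frac{\lambda_0}{2}\, e^{-\lambda_0 |\beta_{jk}|},
      \qquad \lambda_0 \gg \lambda_1 > 0.
\end{align*}
```

Under this reading, the sharply peaked spike (rate λ0) shrinks negligible effects to zero while the diffuse slab (rate λ1) leaves large effects nearly unpenalised. An analogous prior on the off-diagonal entries of Ω would encode sparsity in the outcome-dependence network.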
Next, I move beyond constant effects to settings where covariate impacts may vary with context. I will discuss sparseVCBART, which places BART ensembles on varying-coefficient functions while inducing two-way sparsity: selecting relevant covariates and identifying which modifiers drive effect heterogeneity. As a natural extension, I will also outline ongoing work on multivariate BART for multiple correlated outcomes, allowing outcome-adaptive tree structure while borrowing strength via a shared residual covariance.
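For orientation, a generic varying-coefficient model with tree-ensemble coefficient functions might be written as below. This is a sketch under assumed notation (effect modifiers z, M trees per ensemble, tree function g with structure T and leaf values μ), not a statement of the exact sparseVCBART specification.

```latex
% Illustrative notation (assumed): x_{ij} are covariates, z_i in R^R are effect
% modifiers, g(z; T, mu) is a regression tree with structure T and leaf values mu,
% and each coefficient function is a sum of M such trees.
\begin{align*}
  y_i &= \beta_0(z_i) + \sum_{j=1}^{p} \beta_j(z_i)\, x_{ij} + \varepsilon_i,
      \qquad \varepsilon_i \sim \mathcal{N}(0, \sigma^2), \\
  \beta_j(z) &= \sum_{m=1}^{M} g\!\left(z;\, T_m^{(j)}, \mu_m^{(j)}\right),
      \qquad j = 0, \dots, p.
\end{align*}
```

In this notation, two-way sparsity means that most coefficient functions β_j are identically zero, and that each remaining β_j depends on only a few coordinates of z.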