Events

Upcoming

Check back for future events!

Past

Modeling Structure and Cross-Country Variability in Misclassification Matrices of Verbal Autopsy Cause-of-Death Classifiers

Wednesday, January 22, 2025

Speaker: Dr. Sandipan Pramanik (Johns Hopkins University)

Verbal autopsy (VA) algorithms are routinely employed in low- and middle-income countries to determine individual causes of death (COD). The CODs are then aggregated to estimate population-level cause-specific mortality fractions (CSMFs) essential for public health policymaking. However, VA algorithms often misclassify CODs, introducing bias in CSMF estimates. A recent method, VA-calibration, addresses this bias by utilizing a VA misclassification matrix derived from limited labeled COD data collected in the CHAMPS project. Because labeled samples are limited, the data are pooled across countries to improve estimation precision, thereby implicitly assuming homogeneity in misclassification rates across countries. In this presentation, I will highlight substantial cross-country heterogeneity in VA misclassification, challenging this homogeneity assumption and revealing its impact on VA-calibration’s efficacy. To address this, I will propose a comprehensive country-specific VA misclassification matrix modeling framework for data-scarce settings. The framework introduces a novel base model that parsimoniously characterizes the misclassification matrix through two latent mechanisms: intrinsic accuracy and systematic preference. We theoretically prove that these mechanisms are identifiable from the data and manifest as a form of invariance in misclassification odds, a pattern evident in the CHAMPS data. Building on this, the framework then incorporates cross-country heterogeneity through interpretable effect sizes and uses shrinkage priors to balance the bias-variance tradeoff in misclassification matrix estimation. This effort broadens VA-calibration’s applicability and strengthens ongoing efforts to use VA for mortality surveillance. I will illustrate this through simulations and applications to mortality surveillance projects, such as COMSA in Mozambique and CA CODE.
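As a simplified illustration of the idea behind calibration (a plug-in sketch, not the Bayesian VA-calibration method discussed in the talk): if the misclassification matrix M gives the probability that VA assigns cause i when the true cause is j, then the raw CSMF estimate q relates to the true CSMF p through q = M p, and calibration inverts that relation. The 3-cause matrix and fractions below are made up for illustration.

```python
import numpy as np

# Hypothetical 3-cause example: M[i, j] = P(VA assigns cause i | true cause j).
# Columns sum to 1; off-diagonal mass represents misclassification.
M = np.array([
    [0.80, 0.15, 0.10],
    [0.12, 0.75, 0.10],
    [0.08, 0.10, 0.80],
])

# Raw (uncalibrated) CSMF estimate aggregated from VA-assigned causes.
q = np.array([0.50, 0.30, 0.20])

# Plug-in calibration: invert q = M @ p to recover the true CSMF p.
p = np.linalg.solve(M, q)
p = np.clip(p, 0, None)
p /= p.sum()  # renormalize to a valid probability vector
```

In practice the matrix is estimated from limited labeled data, which is exactly why the talk's shrinkage-prior framework matters: a noisy plug-in inverse can amplify estimation error.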

Fast Bayesian Functional Principal Components Analysis

Wednesday, December 11, 2024

Speaker: Joe Sartini (Johns Hopkins University)

Functional Principal Components Analysis (FPCA) is one of the most successful and widely used analytic tools for functional data exploration and dimension reduction. Standard implementations of FPCA estimate the principal components from the data but ignore their sampling variability in subsequent inferences. To address this problem, we propose Fast Bayesian Functional Principal Components Analysis (Fast BayesFPCA), which treats principal components as parameters on the Stiefel manifold. To ensure efficiency, stability, and scalability, we introduce three innovations: (1) project all eigenfunctions onto an orthonormal spline basis, reducing modeling considerations to a smaller-dimensional Stiefel manifold; (2) induce a uniform prior on the Stiefel manifold of the principal component spline coefficients via the polar representation of a matrix with entries following independent standard Normal priors; and (3) constrain sampling by leveraging the FPCA structure to improve stability. We demonstrate the improved credible interval coverage and computational efficiency of Fast BayesFPCA in comparison to existing software solutions. We then apply Fast BayesFPCA to actigraphy data from NHANES 2011-2014, a modeling task that could not be accomplished with existing MCMC-based Bayesian approaches.
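Innovation (2) rests on a classical fact: the orthonormal polar factor of a matrix with i.i.d. standard Normal entries is uniformly distributed on the Stiefel manifold. A minimal NumPy sketch of that construction (dimensions are illustrative, not taken from the paper):

```python
import numpy as np

rng = np.random.default_rng(0)

# Polar representation: for X with iid standard Normal entries, the
# orthonormal factor U = X (X^T X)^{-1/2} is uniformly distributed on
# the Stiefel manifold V_k(R^p) of p x k matrices with orthonormal columns.
p_dim, k = 10, 3  # e.g. spline basis dimension and number of components
X = rng.standard_normal((p_dim, k))

# Compute (X^T X)^{-1/2} via an eigendecomposition of the small k x k
# Gram matrix, which is symmetric positive definite almost surely.
vals, vecs = np.linalg.eigh(X.T @ X)
inv_sqrt = vecs @ np.diag(vals ** -0.5) @ vecs.T
U = X @ inv_sqrt  # U^T U = I_k by construction
```

Sampling unconstrained Gaussian entries and mapping through the polar factor is what lets a standard MCMC sampler explore the manifold without explicit orthogonality constraints.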

Air Pollution Monitoring

Thursday, December 5, 2024

Speakers: Dr. Chris Heaney, Matthew Aubourg, Bonita Salmerón (Johns Hopkins University)

This joint seminar with the JHU Causal Inference Working Group addressed data-related challenges in air pollution monitoring and its health impacts, focusing on South Baltimore and conducted in partnership with the South Baltimore Community Land Trust (represented by Greg Galen).

Backwards sequential Monte Carlo for efficient Bayesian optimal experimental design

Wednesday, November 13, 2024

Speaker: Andrew Chin (Johns Hopkins University)

The expected information gain (EIG) is a crucial quantity in Bayesian optimal experimental design (OED), quantifying how useful an experiment is by the amount we expect the posterior to differ from the prior. However, evaluating the EIG can be computationally expensive since it requires the posterior normalizing constant. A rich literature exists on estimating this normalizing constant, with sequential Monte Carlo (SMC) approaches among the gold standards. In this work, we leverage two idiosyncrasies of OED to improve the efficiency of EIG estimation via SMC. The first is that, in OED, we simulate the data and thus know the true underlying parameters. The second is that we ultimately care about the EIG, not the individual normalizing constants. This lets us create an EIG-specific SMC method that starts with a sample from the posterior and tempers backwards towards the prior. The key lies in the observation that, in certain cases, the Monte Carlo variance of SMC for the normalizing constant of a single dataset is significantly lower than the variance of the normalizing constants themselves across datasets. This suggests the potential to slightly increase variance while drastically decreasing computation time by reducing the SMC population, and taking this idea to the extreme gives rise to our method. We demonstrate our method on a simulated coupled spring-mass system, where we observe order-of-magnitude performance improvements.
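For context, the standard nested Monte Carlo EIG estimator that SMC-based methods aim to improve upon fits in a few lines. The sketch below (a generic baseline, not the backwards-SMC method from the talk) uses a toy linear-Gaussian model whose EIG has the closed form 0.5 * log(1 + 1/sigma^2), making the estimate easy to check.

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy design: theta ~ N(0, 1), y | theta ~ N(theta, sigma^2).
sigma = 1.0
N_outer, N_inner = 2000, 2000

def log_lik(y, theta):
    """Gaussian log-likelihood log p(y | theta); broadcasts over arrays."""
    return -0.5 * np.log(2 * np.pi * sigma**2) - 0.5 * (y - theta) ** 2 / sigma**2

# Outer loop: simulate parameters from the prior and data given them.
# (As the abstract notes, in OED we know the true simulating parameters.)
theta = rng.standard_normal(N_outer)
y = theta + sigma * rng.standard_normal(N_outer)

# Inner loop: estimate the evidence log p(y) for each simulated dataset
# by averaging the likelihood over fresh prior samples. This inner loop
# is the expensive normalizing-constant computation.
theta_inner = rng.standard_normal(N_inner)
ll_inner = log_lik(y[:, None], theta_inner[None, :])      # (N_outer, N_inner)
log_evidence = np.logaddexp.reduce(ll_inner, axis=1) - np.log(N_inner)

# EIG = E_{theta, y}[ log p(y | theta) - log p(y) ]
eig_hat = np.mean(log_lik(y, theta) - log_evidence)
eig_true = 0.5 * np.log(1 + 1 / sigma**2)  # analytic value, ~0.3466
```

The cost is N_outer * N_inner likelihood evaluations; replacing the inner average with a small (or, in the extreme, single-particle) backwards-tempered SMC run is the efficiency gain the talk describes.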