Statistics Seminar - Spring 2023
Seminars are held on Thursdays from 4:00 to 5:00 pm on WebEx unless otherwise noted. For access information, please contact the Math Department via email at math@njit.edu.
For questions about the seminar schedule, please contact Zuofeng Shang and Chong Jin.
February 16
Wujuan Zhong, Associate Principal Scientist at Merck
Location: WebEx
fastGWA-GE: A Fast and Powerful Linear Mixed Model Approach for Genotype-environment Interaction Tests in Large-scale GWAS
Genotype-by-environment interaction (GEI or GxE) plays an important role in understanding complex human traits. However, it is usually challenging to detect GEI signals efficiently and accurately while adjusting for population stratification and sample relatedness in large-scale genome-wide association studies (GWAS). Here we propose a fast and powerful linear mixed model-based approach, fastGWA-GE, to test for the GEI effect and the G + GxE joint effect. Our extensive simulations show that fastGWA-GE outperforms existing GEI test methods by controlling genomic inflation better, providing greater power, and running hundreds to thousands of times faster. We performed a fastGWA-GE analysis of ~7.27 million variants on 452,249 individuals of European ancestry for 13 quantitative traits and five environment variables in the UK Biobank GWAS data and identified 96 significant signals (72 variants across 57 loci) with GEI test P-values < 1 × 10^-9, including 27 novel GEI associations, which highlights the effectiveness of fastGWA-GE in GEI signal discovery in large-scale GWAS.
March 2
Tian Tian, Children's Hospital of Philadelphia
Location: CKB 207
Model-based Deep Learning Approaches for Analyses of Single-cell and Spatial Genomics Data
With advances in single-cell and spatial sequencing techniques, analysis of these data has been touted for diverse biomedical questions. However, the analysis of single-cell data remains computationally and analytically challenging because the data are discrete, over-dispersed, and noisy. For spatial genomics data, the situation is further complicated by complex spatial dependencies. To address these limitations, we propose model-based deep learning approaches for various analyses of single-cell and spatial genomics data. First, we will discuss a model-based deep hyperbolic manifold learning approach to visualize complex hierarchical structures in single-cell genomics data. Second, we will propose a dependency-aware deep generative model for multitasking analysis of spatial genomics data, including dimensionality reduction, visualization, clustering, batch integration, denoising, differential expression, spatial imputation, resolution enhancement, and identifying spatial genes.
March 23
Ruiyi Yang, Princeton University
Location: WebEx
Optimization on Manifolds via Graph Gaussian Processes
Optimization problems on smooth manifolds are ubiquitous in science and engineering. Often the manifolds are not known analytically and are only available as an unstructured point cloud, so gradient-based methods are not directly applicable. In this talk, we shall discuss a Bayesian optimization approach, which exploits a Gaussian process over the point cloud and an acquisition function to sequentially search for the global optimizer. Regret bounds are established, and several numerical examples demonstrate the effectiveness of our method.
April 13
Reuben Adatorwovor, Assistant Professor at University of Kentucky
Location: WebEx
A Flexible Copula Model for Bivariate Survival with Dependent Censoring: An Application in Prostate Cancer Data
Independent censoring is a key assumption usually made when analyzing time-to-event data. However, this assumption is untestable and can be questionable in many cases, especially when there is a disproportionate loss to follow-up. This paper develops a likelihood approach for analyzing bivariate survival data under dependent censoring. Specifically, we use a flexible Joe-Hu copula to capture the dependence within the quadruple (two event times and two censoring times), while the marginal distribution of each event/censoring time is formulated by a Cox proportional hazards model. Our estimator possesses consistency and desirable asymptotic properties under regularity conditions. We present results from extensive simulations and an application to the Danish twin prostate cancer data.
April 27
Jiaoyang Huang, Wharton Statistics and Data Science, University of Pennsylvania
Location: WebEx
Efficient Derivative-free Bayesian Inference for Large-Scale Inverse Problems
We consider Bayesian inference for large-scale inverse problems, where computational challenges arise from the need for repeated evaluations of an expensive forward model, which is often given as a black box or is impractical to differentiate. In this talk I will propose a new derivative-free algorithm, Unscented Kalman Inversion, which uses ideas from the Kalman filter to efficiently solve these inverse problems. First, I will explain some basics of Variational Inference under general metric tensors. In particular, under the Fisher-Rao metric, Gaussian Variational Inference leads to natural gradient descent. Next, I will discuss two different views of our algorithm. It can be obtained from a Gaussian approximation of the filtering distribution of a novel mean-field dynamical system. It can also be viewed as a derivative-free approximation of natural gradient descent. I will also discuss theoretical properties for linear inverse problems. Finally, I will discuss an extension of our algorithm using a Gaussian mixture approximation, which leads to Gaussian Mixture Kalman Inversion, an efficient derivative-free Bayesian inference approach capable of capturing multiple modes. I will demonstrate the effectiveness of this approach in several numerical experiments with multimodal posterior distributions, where it typically converges within O(10) iterations. This is based on joint work with Yifan Chen, Daniel Zhengyu Huang, Sebastian Reich and Andrew Stuart.
Updated: April 13, 2023