Statistics Seminar - Fall 2021
Seminars are held on Thursdays from 4:00 - 5:00 pm on Webex unless otherwise noted. For access information, please contact the Math Department.
For questions about the seminar schedule, please contact Zuofeng Shang.
October 7
Linjun Zhang, Rutgers University
The Cost of Privacy in Generalized Linear Models: Algorithms and Optimal Rate of Convergence
In this talk, we introduce differentially private algorithms for parameter estimation in both low-dimensional and high-dimensional sparse generalized linear models (GLMs) by constructing private versions of projected gradient descent. We show that the proposed algorithms are nearly rate-optimal by characterizing their statistical performance and establishing privacy-constrained minimax lower bounds for GLMs. The lower bounds are obtained via a novel technique based on Stein's Lemma that generalizes the tracing attack technique for privacy-constrained lower bounds. This lower bound argument can be of independent interest as it applies to general parametric models. Simulated and real data experiments are conducted to demonstrate the numerical performance of our algorithms.
November 11
Shyamal D. Peddada, Biostatistics and Bioinformatics Branch, National Institute of Child Health and Human Development (NICHD), National Institutes of Health (NIH)
Analysis of Composition of Microbiome with Application to HIV-1 Data
Increasingly, researchers are interested in understanding changes in microbial compositions across two or more study groups or experimental conditions. Two common statistical parameters of interest are the differential abundance of taxa and the differential relative abundance of taxa in a unit volume of an ecosystem. A variety of statistical methods have been introduced in the literature under various assumptions. These methods are often evaluated using simulation studies. Unfortunately, the simulation studies are not always precise about the null hypothesis and thus result in misleading conclusions. One of the major challenges with these data is how to normalize them. This issue is not limited to microbiome data but also exists for other count data. In this talk, we shall describe a recent development in this field and illustrate our methodology using an unpublished HIV/AIDS microbiome dataset. Significant changes in the gut microbiome were identified several months prior to HIV infection in the early phase of the AIDS pandemic in the USA. These changes were associated with increased inflammatory biomarkers in blood and with risk of developing AIDS.
December 2
Brent Burger and Peixin Zhang, JAZZ Pharmaceuticals
*Please note, this talk will be held in-person in CULM 111 from 11:30 am - 12:30 pm followed by a brief overview of JAZZ Pharmaceuticals and their Biometrics Department from 12:30 pm - 1:30 pm.*
Count Data Regression Models - A Clinical Trial Application
Although not as common as normal linear models or time-to-event models, the use of count data regression models has gained popularity in recent years for situations where the endpoint of interest is a count (e.g., 0, 1, 2, 3, …).
This seminar will present an overview of count data regression models including examples from modeling seizure (count) data.
December 9
Molin Wang, Harvard T.H. Chan School of Public Health
Statistical Methods for Studying Disease Etiologic Heterogeneity and Dealing with Missing Subtypes
A fundamental goal of epidemiologic research is to investigate the relationship between exposures and disease risk. Cases of the disease are often considered a single outcome and assumed to share a common etiology. However, evidence indicates that many human diseases arise and evolve through a range of heterogeneous molecular pathologic processes, influenced by diverse exposures. In this talk, we will discuss analytic options for studying disease subtype heterogeneity in time-to-event data settings, emphasizing methods for evaluating whether the association of a potential risk factor with disease varies by disease subtype. Often the disease subtypes are unknown for some cases, so we will also focus on statistical methods for dealing with missing subtype data. We will illustrate our methods in an analysis of colorectal cancer data from the Nurses' Health Study cohort.
Updated: December 7, 2021