2:00-3:00pm, March 03, 2023, Distinguished Lecture:
Location: 25 Park Place, Room 1441,
Professor Zhezhen Jin,
Department of Biostatistics, Columbia University,
Statistical issues and challenges in biomedical studies
Abstract:
In this talk, I will present statistical issues and challenges that I have encountered in my biomedical research.
Through real biomedical studies in transplantation, aging, and environmental science, I will illustrate topics
including data collection, data cleaning, formulation of research questions, data analysis, and related statistical
methodology development. After a discussion on the issues and challenges, I will focus on item selection in
disease screening, comparison and identification of biomarkers that are more informative for disease diagnosis,
estimation of weights for the relative importance of exposure variables on health outcomes, subsampling,
and variable selection and dimension reduction for adjusted analysis. I will also present our
newly developed methods, tied to the real studies, that can address some of the issues and challenges.
11:00am-12:00pm, March 03, 2023, Colloquium, Location: 25 Park Place, Room 1441,
Huimin Cheng,
Department of Statistics,
University of Georgia,
Masked Mirror Validation in Graphon Estimation
Abstract:
A graphon, short for graph function, provides a generative model for networks. Accurate estimation of the graphon plays
a key role in many applications, such as link prediction. In recent decades, various methods for graphon estimation have been proposed.
The success of most graphon estimation methods depends on a proper specification of hyperparameters.
Some network cross-validation methods have been proposed, but they suffer from restrictive model assumptions,
expensive computational costs, and a lack of theoretical guarantees. To address these issues, we propose a masked
mirror validation (MMV) method. Asymptotic properties of the MMV are established. The effectiveness of the proposed method in
terms of both computation and accuracy is demonstrated by
extensive simulation experiments. We further apply MMV to drug repurposing in a real data application.
3:00-4:00pm, February 24, 2023, Statistics Seminar,
Location: 25 Park Place, Room 1441,
Dr. Shihao Yang,
School of Industrial & Systems Engineering at Georgia Tech,
Inference of dynamic systems from noisy and sparse data via manifold-constrained Gaussian processes
Abstract:
Parameter estimation for nonlinear dynamic system models, represented by ordinary differential equations (ODEs) or
partial differential equations (PDEs), using noisy and sparse experimental data is a vital task in many fields.
We propose a fast and accurate method, manifold-constrained Gaussian process inference, for this task.
Our method uses a Gaussian process model over system components, explicitly conditioned on the manifold
constraint that gradients of the Gaussian process must satisfy the ODE/PDE system. By doing so, we completely
bypass the need for numerical integration and achieve substantial savings in computational time. Our
method is also suitable for inference with unobserved system components, which often occur in real experiments.
Our method is distinct from existing approaches as we provide a principled statistical construction
under a Bayesian framework, which rigorously incorporates the ODE/PDE system through conditioning.
2:00-3:00pm, December 12, 2022, Colloquium, Location: 25 Park Place, Room 1441, Webex Meeting Link:
https://gsumeetings.webex.com/webappng/sites/gsumeetings/meeting/info/7 ,
Dr. Yi Li,
M. Anthony Schork Professor of Biostatistics and Professor of Global Public Health,
Department of Biostatistics, University of Michigan,
Distinguished Lecture: Machine Learning in the Era of Big Data: Model Selection, Estimation, and Inference
Abstract:
In the era of big data, high-throughput data are routinely collected. These high dimensional data defy classical
regression models, which are either infeasible to fit or likely to predict poorly because of overfitting.
In this talk, we will introduce several cutting-edge machine learning methods developed by my group in the last few years for
modeling (censored) outcome data with high dimensional predictors. Specifically, we will introduce a Dantzig selector for fitting survival
models with high dimensional predictors, followed by various semiparametric and nonparametric feature screening methods for handling ultra-high
dimensional predictors. We will also discuss statistical inference for regression models with high dimensional predictors.
For high-dimensional outcome data, we will introduce a new class of high-dimensional Gaussian graphical regression models with predictors.
The talk focuses on statistical principles and concepts behind these methods, which are motivated and illustrated by various biomedical
examples in precision medicine contexts.
2:00-3:00pm, November 18, 2022, Virtual Colloquium, Webex Meeting Link:
https://gsumeetings.webex.com/gsumeetings/j.php?MTID=m5315fe8659224e9317efc53c2f6c56a0 ,
Dr. Lili Yu,
Professor and Karl E. Peace endowed Chair, Department of Biostatistics, Epidemiology and Environmental Health Sciences, Georgia Southern University,
Survival data analysis with Heteroscedastic Accelerated Failure time model
Abstract:
The Buckley-James method for the classical accelerated failure time model has been extended to accommodate heteroscedastic survival
data in two ways. The first is the weighted least squares method, which estimates the heteroscedasticity nonparametrically, while the second is
the local Buckley-James method, which uses the local
Kaplan-Meier method to estimate the heteroscedasticity. In this talk, we will discuss and compare these two methods
theoretically and numerically with simulation studies. Two real data examples are used for practical illustration of the comparison.
3:00-4:00pm, November 04, 2022, Virtual Statistics Seminar, Webex Meeting Link:
https://gsumeetings.webex.com/gsumeetings/j.php?MTID=mebd181d58e6c1ea7ab546b83c1bf8e39 ,
Dr. TSZ Chai Fung,
Assistant Professor, Department of Risk Management & Insurance, Georgia State University,
A Posteriori Risk Classification and Ratemaking with Random Effects in the Mixture-of-Experts Model
Abstract:
In the underwriting and pricing of non-life insurance products, the insurer needs to utilize both policyholder
information and claim history to ensure profitability and proper risk management. While policyholder information
such as age and gender reflects observable risk characteristics, claim history may be regarded
as a manifestation of unmeasurable and unobservable risk factors, which could vary drastically across different policyholders.
This presentation introduces a flexible regression model, called the Mixed LRMoE, for a posteriori ratemaking,
which leverages policyholder information and claim history to categorize policyholders into groups with risk
profiles and to determine a premium that accurately captures the unobserved risks. Our proposed framework outperforms
the benchmark models regarding goodness-of-fit to a real, multiyear automobile insurance dataset while offering intuitive
and interpretable characterization of policyholders' risk profiles to reflect their claim history adequately.