Statistics Seminar

2:00-3:00pm, April 23, 2021, Virtual Colloquium, Webex Meeting Link: https://gsumeetings.webex.com/gsumeetings/j.php?MTID=me31dd6364114a93e00b6e9fd0e1854d3, Dr. Gang Li, Professor of Biostatistics and Computational Medicine, Director, Jonsson Comprehensive Cancer Center Biostatistics, Analytical and Evaluation Support Shared Resource, University of California at Los Angeles,

Distinguished Lecture: Prediction Accuracy Measures for a Nonlinear Model and for Right-Censored Time-to-Event Data

Abstract: In this talk, I will discuss a pair of new prediction summary measures for a nonlinear prediction function with right-censored time-to-event data. The first measure, defined as the proportion of explained variance by a corrected prediction function, quantifies the potential predictive power of the prediction function. The second measure, defined as the proportion of explained prediction error, gauges the closeness of the prediction function to its corrected version and serves as a supplementary measure to indicate (by a value less than 1) whether the correction is needed to fulfill its potential predictive power and to quantify how much prediction error reduction can be realized with the correction. The two measures together provide a complete summary of the predictive accuracy of a nonlinear prediction function. We motivate these measures by first establishing a variance decomposition and a prediction error decomposition at the population level and then deriving uncensored and censored sample versions of these decompositions. We note that for the least squares prediction function under the linear model with no censoring, the first measure reduces to the classical coefficient of determination and the second measure degenerates to 1. We show that the sample measures are consistent estimators of their population counterparts and conduct extensive simulations to investigate their finite sample properties. An R package, PAmeasures, is available to implement these measures for various nonlinear models and survival models. Real data illustrations will be given.
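
For readers who want a concrete feel for the two measures in the uncensored case, here is a minimal Python sketch. It rests on an assumption not stated in the abstract: that the "corrected" prediction function is the least squares line of the observed response on the original prediction. The function name pa_measures and the toy data are hypothetical, and this is not the interface of the PAmeasures R package.

```python
from statistics import fmean


def pa_measures(y, pred):
    """Return (r2, l2) for observed responses y and predictions pred (no censoring).

    Assumption: the corrected prediction is the least squares line of y on pred.
    Requires pred to be non-constant and to differ from y somewhere.
    """
    y_bar, p_bar = fmean(y), fmean(pred)

    # Simple linear regression of y on pred gives the corrected prediction.
    sxx = sum((p - p_bar) ** 2 for p in pred)
    sxy = sum((p - p_bar) * (v - y_bar) for p, v in zip(pred, y))
    slope = sxy / sxx
    intercept = y_bar - slope * p_bar
    corrected = [intercept + slope * p for p in pred]

    total_var = sum((v - y_bar) ** 2 for v in y)
    explained = sum((c - y_bar) ** 2 for c in corrected)
    err_corrected = sum((v - c) ** 2 for v, c in zip(y, corrected))
    err_original = sum((v - p) ** 2 for v, p in zip(y, pred))

    r2 = explained / total_var        # proportion of explained variance
    l2 = err_corrected / err_original  # proportion of explained prediction error
    return r2, l2


# Hypothetical toy data: a prediction rule that is badly scaled.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [1.0, 2.1, 2.9, 4.2, 5.1]
raw_pred = [0.5 * xi for xi in x]

r2, l2 = pa_measures(y, raw_pred)
print(f"R^2 = {r2:.3f}, L^2 = {l2:.3f}")
```

Under this assumption, a value of the second measure well below 1 signals that the raw prediction needs correction, while a prediction that is already the least squares fit returns a second measure of 1, in line with the remark in the abstract.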

3:00-4:00pm, March 26, 2021, Virtual Colloquium, Webex Meeting Link: https://gsumeetings.webex.com/gsumeetings/j.php?MTID=m1a0732f257ae59c1656beccb2614c195, Dr. Dayu Sun, Department of Biostatistics, Emory University,

Kernel meets sieve: transformed hazards model with sparse longitudinal covariates

Abstract: Regression analysis of censored failure observations via a class of transformed hazards models permits time-varying covariates and a unified statistical framework, including both the proportional hazards model and the additive hazards model. In practice, such longitudinal covariates are typically sparse, that is, covariates are only measured at infrequent and irregularly spaced follow-up times. Full likelihood analyses of joint models for longitudinal and survival data impose stringent modeling assumptions that are difficult to verify in practice and that are complicated both inferentially and computationally. Other ad hoc methods, such as the last value carried forward, may lead to biased estimation. In this paper, a simple half-kernel weighted sieve semiparametric maximum likelihood method is proposed with minimal assumptions. It is established that these estimators are consistent and asymptotically normal, though they converge at rates that are slower than the parametric rates that may be achieved with fully observed covariates. Simulation results demonstrate that the large sample approximations are adequate for practical use and may yield improved performance relative to the last value carried forward approach. The analysis of the data from a COVID-19 study in Wuhan demonstrates the utility of the proposed methods.

2:00-3:00pm, November 6, 2020, Virtual Colloquium, Webex Meeting Link: https://gsumeetings.webex.com/gsumeetings/j.php?MTID=m720f979c35e0e767286baca03df55dc6, Associate Professor Pengsheng Ji, Department of Statistics, University of Georgia,

Collaborative Spectral Clustering in Attributed Networks

Abstract: We propose a novel spectral clustering algorithm for attributed networks, where $n$ nodes split into $R$ non-overlapping communities and each node has a $p$-dimensional meta covariate in one of various formats such as text, image, speech, etc. The connectivity matrix $W_{n \times n}$ is constructed from the adjacency matrix $A_{n \times n}$ and the covariate matrix $X_{n \times p}$ as $W = (1-\alpha)A + \alpha K(X,X')$, where $\alpha \in [0,1]$ is a tuning parameter and $K$ is a kernel that measures covariate similarity. We then perform the classical $k$-means algorithm on the element-wise ratio matrix of the first $K$ leading eigenvectors of $W$. Theoretical and simulation studies show consistent performance under both the Stochastic Block Model (SBM) and the Degree-Corrected Block Model (DCBM), especially in imbalanced networks where most community detection algorithms fail.
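
The steps in this abstract can be sketched in a few lines of Python with NumPy. Several choices below are assumptions made for illustration, not details from the talk: the kernel is taken to be Gaussian, the number of leading eigenvectors is set equal to the number of communities, eigenvectors are ranked by absolute eigenvalue, the ratio matrix divides each remaining eigenvector element-wise by the first one, and alpha is kept strictly positive so that the first eigenvector has no zero entries. All function names and the toy network are hypothetical.

```python
import numpy as np


def rbf_kernel(X, gamma=1.0):
    """Gaussian kernel matrix (an assumed choice; the abstract does not fix K)."""
    sq = np.sum(X ** 2, axis=1)
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * X @ X.T, 0.0)
    return np.exp(-gamma * d2)


def kmeans(Z, k, n_iter=50):
    """Plain Lloyd iterations with a deterministic farthest-point start."""
    centers = [Z[0]]
    for _ in range(1, k):
        d = np.min([((Z - c) ** 2).sum(axis=1) for c in centers], axis=0)
        centers.append(Z[d.argmax()])
    centers = np.array(centers, dtype=float)
    for _ in range(n_iter):
        dist = ((Z[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dist.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = Z[labels == j].mean(axis=0)
    return labels


def collaborative_spectral_clustering(A, X, n_communities, alpha=0.5, gamma=1.0):
    """Cluster nodes from W = (1 - alpha) * A + alpha * K(X, X'), with alpha > 0."""
    W = (1.0 - alpha) * A + alpha * rbf_kernel(X, gamma)
    vals, vecs = np.linalg.eigh(W)  # W is symmetric
    order = np.argsort(-np.abs(vals))[:n_communities]
    lead = vecs[:, order]
    # Element-wise ratios of eigenvectors 2..R to the first eigenvector.
    ratios = lead[:, 1:] / lead[:, [0]]
    return kmeans(ratios, n_communities)


# Hypothetical toy network: two 10-node cliques joined by a single edge,
# with one-dimensional covariates centred near 0 and near 3.
n_half = 10
A = np.zeros((2 * n_half, 2 * n_half))
A[:n_half, :n_half] = 1.0
A[n_half:, n_half:] = 1.0
np.fill_diagonal(A, 0.0)
A[0, n_half] = A[n_half, 0] = 1.0
X = np.array([[0.01 * i] for i in range(n_half)]
             + [[3.0 + 0.01 * i] for i in range(n_half)])

labels = collaborative_spectral_clustering(A, X, n_communities=2, alpha=0.5)
print(labels)
```

Setting alpha closer to 0 leans on the network edges, while alpha closer to 1 leans on the covariates; in practice it would be tuned rather than fixed at 0.5 as in this sketch.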

Statistics Seminar at Georgia State University

Fall 2020-Spring 2021, Fridays 3:00-4:00pm, Virtual Seminar

Organizer: Yichuan Zhao

If you would like to give a talk in the Statistics Seminar, please send an email to Yichuan Zhao at yichuan@gsu.edu.