Statistics Seminar at Georgia State University
Fall 2015-Spring 2016, Fridays 3:00-4:00pm, Paul Erdos Conference room (796)
Organizer: Yichuan Zhao
If you would like to give a talk in the Statistics Seminar, please send an email to
Yichuan Zhao at
2:00-3:00pm, April 8, 2016, 796 COE
Dr. Min Zhang,
Department of Statistics,
Statistical Methods for Integrative Genome-Wide Analysis
We develop a variable selection framework that integrates pathway information into genome-wide association analysis.
Unlike Bayesian variable selection methods that rely on computation-intensive Markov chain Monte Carlo algorithms,
we propose an iterated conditional modes/medians (ICM/M) algorithm to implement empirical Bayes variable selection. Iterated conditional modes are first used to optimize the hyperparameters and thus implement the empirical Bayes step; iterated conditional medians are then used to estimate the model parameters and thereby perform the variable selection. In addition to the advantages of Bayesian inference, the proposed method enjoys efficient computation, increased statistical power, and improved estimation of the model parameters. Extensive simulation studies show the superior performance of the proposed approach, and the method has been applied to real data from genome-wide association studies.
This is a joint work with Vitara Pungpapong and Dabao Zhang.
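The talk's ICM/M algorithm itself is not reproduced here, but the reason conditional medians perform variable selection can be seen in a toy sketch: under a spike-and-slab prior, the posterior median of a coefficient is exactly zero whenever the spike carries enough posterior mass. The normal-means model, prior settings, and function name below are illustrative assumptions, not the authors' implementation.

```python
from statistics import NormalDist

def posterior_median(y, pi0=0.5, tau=3.0):
    """Posterior median of beta in the toy model
    y | beta ~ N(beta, 1),  beta ~ pi0 * delta_0 + (1 - pi0) * N(0, tau^2)."""
    sign, a = (1.0, y) if y >= 0 else (-1.0, -y)
    # posterior probability of the spike (beta = 0)
    m_spike = NormalDist().pdf(a)                            # N(y; 0, 1)
    m_slab = NormalDist(0.0, (1 + tau**2) ** 0.5).pdf(a)     # N(y; 0, 1 + tau^2)
    p_spike = pi0 * m_spike / (pi0 * m_spike + (1 - pi0) * m_slab)
    # slab part of the posterior: beta | y, slab ~ N(s2 * y, s2)
    s2 = tau**2 / (1 + tau**2)
    slab = NormalDist(s2 * a, s2 ** 0.5)
    # if the mixture CDF just above 0 already reaches 1/2, the median is the atom
    if p_spike + (1 - p_spike) * slab.cdf(0.0) >= 0.5:
        return 0.0
    # otherwise the median lies in the slab part of the mixture
    return sign * slab.inv_cdf((0.5 - p_spike) / (1 - p_spike))
```

Small observations are thresholded exactly to zero while large ones are only slightly shrunk, which is what lets a conditional-median step act as a selection operator inside an iterative algorithm.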
3:00-4:00pm, March 11, 2016, 796 COE
Dr. Zhongjian Lin,
Assistant Professor of Economics,
Identification and Estimation of Hierarchy Effects in Social Interactions
This paper studies status-based heterogeneous peer effects in a large network. We extend
the large network game of Xu (2011) to allow for heterogeneous peer effects. In particular,
we measure an individual's social status by the number of friendship nominations received in the network,
which determines the strength of peer effects. Hierarchy effects are characterized by
the differences in peer effects from friends of different social status. To ease the computational
burden when the data come from a single large network, we extend the nested pseudo likelihood estimation (NPLE) of Aguirregabiria
and Mira (2007) to the large network game
model. We illustrate our method through both Monte Carlo experiments and an empirical
study of high school students' college attendance decisions using the Add Health dataset.
3:00-4:00pm, February 19, 796 COE
Dr. Benjamin Haaland, Assistant Professor,
Stewart School of Industrial and Systems Engineering
Georgia Institute of Technology
Computationally efficient approximation of large-scale and high-dimensional simulations
Simulations are used by scientists and engineers to study complex real systems such as material micro-structure or
passenger flows through a new airport design. Frequently, these complex simulations are too expensive to allow full
exploration of the unknown relationship, much less optimization. A common solution is to build a computationally efficient
approximate simulation, or emulator. Here, we discuss several aspects of building an accurate and efficient emulator in the
context of large-scale and high-dimensional simulations. Specifically, we examine sources of inaccuracy related to data
collection and present two techniques well-adapted to large-scale and high-dimensional simulations, local Gaussian process
fitting and multi-resolution functional ANOVA modeling.
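As a rough illustration of the first technique, a local Gaussian process fits a small GP to only the design points nearest the prediction site, so each prediction costs O(k^3) rather than O(n^3). This is a minimal sketch with a squared-exponential kernel and hand-picked hyperparameters; the function name and defaults are assumptions, not the speaker's implementation.

```python
import numpy as np

def local_gp_predict(X, y, x_star, k=20, length=0.5, noise=1e-6):
    """Predict y(x_star) with a GP fit only to the k nearest design points."""
    d = np.linalg.norm(X - x_star, axis=1)
    idx = np.argsort(d)[:k]                 # indices of the k nearest neighbors
    Xl, yl = X[idx], y[idx]

    def kern(A, B):                         # squared-exponential kernel
        d2 = np.sum((A[:, None, :] - B[None, :, :]) ** 2, axis=-1)
        return np.exp(-d2 / (2 * length**2))

    K = kern(Xl, Xl) + noise * np.eye(k)    # local covariance plus nugget
    k_star = kern(Xl, x_star[None, :])[:, 0]
    alpha = np.linalg.solve(K, yl)
    return k_star @ alpha                   # GP posterior mean at x_star
```

Fitting only on neighbors keeps the linear solve tiny regardless of the total design size, at the cost of discarding information from distant points.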
3:00-4:00pm, February 12, 796 COE
Dr. Norou Diawara,
Department of Mathematics and Statistics, Old Dominion University
Statistical Pattern Recognition using Gaussian Copula
Abstract: Statistical pattern recognition has attracted great interest due to its applicability
and to advances in technology and computing. Significant research has been done in areas such as automatic character
recognition, medical diagnostics, and data mining. The classical discrimination rule for pattern recognition assumes normality,
but in real life this assumption is often questionable. In some situations, the pattern vector is a mixture of discrete and continuous random variables.
In this talk, we use copula densities to model the class-conditional distributions for pattern recognition with the Bayes decision rule. These densities are useful when the marginal densities of a pattern vector are not normally distributed, and the models also accommodate mixed pattern vectors. We present simulation results comparing the performance of the copula-based classifier with the classical normal-distribution-based model and the independence-based model. An application to real data is presented.
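To make the construction concrete, here is a minimal sketch of a two-class, two-feature Bayes classifier whose class-conditional densities are built from a bivariate Gaussian copula combined with arbitrary marginals. The function names and the particular marginals are illustrative assumptions, not the speaker's code.

```python
import math
from statistics import NormalDist

def gaussian_copula_logdensity(u1, u2, rho):
    """Log density of a bivariate Gaussian copula with correlation rho,
    evaluated at (u1, u2) in (0, 1)^2."""
    z1 = NormalDist().inv_cdf(u1)
    z2 = NormalDist().inv_cdf(u2)
    return (-0.5 * math.log(1 - rho**2)
            - (rho**2 * (z1**2 + z2**2) - 2 * rho * z1 * z2) / (2 * (1 - rho**2)))

def classify(x, classes):
    """Bayes rule: argmax over classes of prior * copula * marginal densities.
    Each class maps to (prior, rho, (cdf1, pdf1), (cdf2, pdf2))."""
    best, best_lp = None, -math.inf
    for label, (prior, rho, (cdf1, pdf1), (cdf2, pdf2)) in classes.items():
        lp = (math.log(prior)
              + gaussian_copula_logdensity(cdf1(x[0]), cdf2(x[1]), rho)
              + math.log(pdf1(x[0])) + math.log(pdf2(x[1])))
        if lp > best_lp:
            best, best_lp = label, lp
    return best
```

Because the copula factors out the dependence from the marginals, the same machinery works for non-normal or mixed marginal distributions by swapping in the appropriate cdf/pdf pairs.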
2:00-3:00pm, January 29, Petite Science Center (PSC) 124
(Distinguished Lecture in Statistics)
Dr. C. F. Jeff Wu, Professor and Coca-Cola Chair in Engineering Statistics,
School of Industrial
and Systems Engineering, Georgia Institute of Technology
From real world problems to esoteric research: examples and personal experience
Abstract: Young (and some not-so-young) researchers often wonder how to extract good research
ideas and develop
useful methodologies from solving real-world problems. The path is rarely straightforward, and its success depends
on circumstances, tenacity, and luck. I will use three examples to illustrate how I trod the path. The first
involved an attempt to find optimal growth conditions for nanostructures. It led to the development of a new method,
"sequential minimum energy design" (SMED), which exploits an analogy to the potential energy of charged particles.
After a few years of frustrating effort and relentless pursuit, we realized that SMED is more suitable for
generating samples adaptively to mimic an arbitrary distribution than for optimization. The main objective
of the second example was to build an efficient statistical emulator based on finite element simulation results with
two mesh densities in cast foundry operations. It eventually led to the development of a class of nonstationary Gaussian
process models that can be used to connect simulation data of different precisions and speeds. The third example is about
sequential design that works well for small samples in sensitivity testing. I will describe three major papers in a span of
30 years and how each paper had one new idea that inspired the next paper. In each example, the developed methodology has
broader applications beyond the original problem. I will explain the thought process in each example. Finally, I will share
some secrets about a "path to innovation".
3:00-4:00pm, December 4, 796 COE
Professor Betty Sao-Hou Lai,
Division of Epidemiology and Biostatistics,
Georgia State University
Children's Reactions to Trauma: Modeling Posttraumatic Stress Symptoms After
Hurricanes Andrew and Katrina
Abstract: Approximately 100 million children are exposed to disasters each year. Disaster exposure can lead to posttraumatic stress, anxiety, and depression symptoms in children.
However, how and why children differ in their reactions to disasters is poorly understood. This talk focuses on two applied examples of growth mixture
modeling, examining children's varying reactions after Hurricanes Andrew and Katrina.
2:00-3:00pm, November 20, 796 COE
Dr. Yi Li,
Professor of Biostatistics and
Director of Kidney Epidemiology and Cost Center, University of Michigan
Modeling Complex Large-scale Time-to-event Data: An Efficient Quasi-Newton Approach
Nonproportional hazards models often arise in modern biomedical studies, as
evidenced by a recent national kidney transplant study.
During follow-up, the effects of
baseline risk factors, such as patients' comorbidity conditions collected at
transplantation, may vary, resulting in a weakening or
strengthening of associations over time. Time-varying effects survival models have
emerged as a powerful means of modeling such dynamic changes in covariate effects.
Traditional methods of fitting time-varying effects survival models
rely on an expansion of the original dataset in a repeated
measurement format, which, even with a moderate sample size, leads to an
extremely large working dataset. Consequently, the computational
burden increases quickly as the sample size grows, and analyses of a large
dataset such as our motivating example defy any existing statistical
methods and software. We propose a novel application of the quasi-Newton
iteration method, via a refined line search procedure, to model the dynamic
changes of covariates' effects in survival analysis. We show that the
algorithm converges superlinearly and is computationally efficient
for large-scale datasets. We apply the proposed methods to analyze the
national kidney transplant data and study the impact of potential risk factors on post-transplant outcomes.
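The talk's refined line search and its application to time-varying effects are not reproduced here, but the quasi-Newton iteration it builds on can be sketched generically: BFGS with a backtracking (Armijo) line search, which achieves superlinear local convergence without forming the Hessian. The function names and the test problem are illustrative.

```python
import numpy as np

def bfgs(f, grad, x0, tol=1e-8, max_iter=200):
    """Generic BFGS quasi-Newton iteration with a backtracking line search."""
    x = np.asarray(x0, float)
    H = np.eye(len(x))                  # inverse-Hessian approximation
    g = grad(x)
    for _ in range(max_iter):
        if np.linalg.norm(g) < tol:
            break
        p = -H @ g                      # quasi-Newton search direction
        t, c = 1.0, 1e-4                # backtracking (Armijo) line search
        while f(x + t * p) > f(x) + c * t * (g @ p):
            t *= 0.5
        s = t * p
        x_new = x + s
        g_new = grad(x_new)
        yv = g_new - g
        sy = s @ yv
        if sy > 1e-12:                  # curvature condition: BFGS update of H
            rho = 1.0 / sy
            I = np.eye(len(x))
            H = (I - rho * np.outer(s, yv)) @ H @ (I - rho * np.outer(yv, s)) \
                + rho * np.outer(s, s)
        x, g = x_new, g_new
    return x
```

The inverse-Hessian update reuses only gradient differences, which is what keeps each iteration cheap enough to scale to large working datasets.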
3:00-4:00pm, November 6, 796 COE
Professor Tao Zha,
School of Economics,
Emory University and Federal Reserve Bank of Atlanta
Dynamic Striated Metropolis-Hastings Sampler for High-Dimensional Models
Having efficient and accurate samplers for simulating the posterior distribution
is crucial for Bayesian analysis. We develop a generic posterior simulator called the "dynamic
striated Metropolis-Hastings (DSMH)" sampler. Grounded in the Metropolis-Hastings algorithm,
it pools the strengths of the equi-energy and sequential Monte Carlo samplers
while avoiding the weaknesses of the standard Metropolis-Hastings algorithm and those of
importance sampling. In particular, the DSMH sampler possesses the capacity to cope
with extremely irregular distributions that contain winding ridges and multiple peaks; and
it is robust to how the sampling procedure progresses across stages. The high-dimensional
application studied in this paper provides a natural platform for testing any generic sampler.
3:00-4:00pm, October 23, 796 COE
Professor Enlu Zhou,
ISYE, Georgia Institute of Technology
Gradient-based Adaptive Stochastic Search (GASS)
Gradient-based adaptive stochastic search (GASS) is an algorithm for solving general optimization problems with little structure.
GASS iteratively finds high quality solutions by randomly sampling candidate solutions from a parameterized distribution model over
the solution space. The basic idea is to convert the original (possibly non-continuous, non-differentiable) problem into a
differentiable optimization problem on the parameter space of the parameterized distribution, and then use a direct gradient search method
to find improved distributions. Thus, GASS combines the robustness of stochastic search, which maintains a population of
candidate solutions, with the relatively fast convergence of classical gradient methods. The performance of the algorithm is
illustrated on a number of benchmark problems and a resource allocation problem in communication networks. If time permits,
I will also talk about the extension of GASS to simulation optimization problems, where the objective function can only
be evaluated through a stochastic simulation model.
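The flavor of the method can be sketched in a few lines: sample candidates from a Gaussian over the solution space, then do gradient ascent on the distribution's parameters using the score-function (likelihood-ratio) gradient estimator. This is a simplified illustration of the idea, not the exact GASS updates from the talk; the function name and defaults are assumptions.

```python
import numpy as np

def gass_sketch(f, dim, iters=300, pop=50, lr=0.2, seed=0):
    """Minimize f by gradient ascent on the parameters (mean, log-std) of a
    Gaussian sampling distribution, via the score-function gradient estimator."""
    rng = np.random.default_rng(seed)
    mu, log_sigma = np.zeros(dim), np.zeros(dim)
    for _ in range(iters):
        sigma = np.exp(log_sigma)
        z = rng.standard_normal((pop, dim))
        x = mu + sigma * z                     # candidate solutions
        fx = np.array([f(xi) for xi in x])
        # standardized "fitness": lower objective value -> higher weight
        w = -(fx - fx.mean()) / (fx.std() + 1e-12)
        # score-function gradient steps for the mean and log-std parameters
        mu += lr * sigma * (w[:, None] * z).mean(axis=0)
        log_sigma += 0.5 * lr * (w[:, None] * (z * z - 1.0)).mean(axis=0)
    return mu
```

Note that f is only ever sampled, never differentiated, which is why the scheme applies to non-continuous and non-differentiable objectives: the gradient lives on the parameter space of the sampling distribution.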
2:00-3:00pm, September 11, 796 COE
Dr. Tao Wang, Professor and Dean,
School of Mathematics,
Yunnan Normal University, China
The Estimation and Exact Lower Confidence Limit of the Conditional Reliability for
Weibull Distribution in the Life Tests with Fixed Stopping Time
In this talk, a new method for calculating the lower confidence limit of
the conditional reliability for the Weibull distribution in life tests
with fixed stopping time is presented. For data obtained from tests
with fixed stopping time, obtaining an accurate lower confidence limit of the
conditional reliability is a difficult problem. Based on the theory of the ordering method
in the sample space, for a prearranged confidence level and an arbitrary sample size (not less than 2),
we give the accurate lower confidence limit for the conditional reliability as well as
an efficient method for computing it. Software is also presented.
This is joint work with Jiading Chen, School of Mathematical Sciences, Peking University.
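The exact sample-space-ordering construction from the talk is not reproduced here, but the setting can be sketched: under a life test with fixed stopping time tau (type-I censoring), the Weibull parameters are estimated by maximum likelihood via the one-dimensional profile equation for the shape, and the conditional reliability follows from the fitted parameters. A lower confidence limit could then be approximated by, e.g., a parametric bootstrap. The function names below are assumptions, not the authors' software.

```python
import numpy as np

def weibull_fit_type1(times, tau):
    """MLE of Weibull (shape k, scale eta) from a life test with fixed
    stopping time tau: entries below tau are failures, the rest are censored."""
    t = np.asarray(times, float)
    fail = t[t < tau]                        # observed failure times
    r = len(fail)
    w = np.minimum(t, tau)                   # exposure times (failures + censoring)

    def g(k):                                # profile score equation for the shape
        wk = w ** k
        return 1.0 / k + np.log(fail).mean() - (wk * np.log(w)).sum() / wk.sum()

    lo, hi = 1e-3, 100.0                     # bisection: g decreases through 0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if g(mid) > 0:
            lo = mid
        else:
            hi = mid
    k = 0.5 * (lo + hi)
    eta = ((w ** k).sum() / r) ** (1.0 / k)  # profiled scale given the shape
    return k, eta

def conditional_reliability(k, eta, t0, dt):
    """P(T > t0 + dt | T > t0) for a Weibull(k, eta) lifetime."""
    return np.exp((t0 / eta) ** k - ((t0 + dt) / eta) ** k)
```

This gives only the point estimate; the talk's contribution is an exact lower confidence limit, which this sketch does not attempt.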