Statistics Seminar - Fall 2017

Seminar Schedule

Seminars are held on Thursdays at 4:00 PM. Please note the location for each event in the schedule below, which will be either Cullimore 611 (CULM 611) or the Campus Center (CTR). For questions about the seminar schedule, please contact Antai Wang.

Date Location Speaker, Affiliation, and Title Host
September 28 CTR 215 Wei Sun, Department of Management Science, University of Miami
Personalized Advertising and Ad Clustering via Sparse Tensor Methods

The tensor, as a multi-dimensional generalization of the matrix, has received increasing attention in industry due to its success in personalized recommendation systems. Traditional recommendation systems are mainly based on the user-item matrix, whose entries denote each user's preference for a particular item. To incorporate additional information into the analysis, such as the temporal behavior of users, we instead work with a user-item-time tensor. Existing tensor decomposition methods are mostly established in the non-sparse regime, where the decomposition components include all features. In online advertising, however, the ad-click tensor is usually sparse due to the rarity of ad clicks.

In this talk, I will discuss a new sparse tensor decomposition method that incorporates the sparsity of each latent component into the CP tensor decomposition. In theory, despite the non-convexity of the optimization problem, an alternating updating algorithm is proven to attain an estimator whose rate of convergence significantly improves on those of non-sparse decomposition methods. The potential business impact of our method is demonstrated via an application to click-through rate prediction for personalized advertising.
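The alternating-update idea can be sketched in a few lines of NumPy. This toy rank-1 version is illustrative only, not the speaker's method: the thresholding rule (soft-thresholding each factor at a fraction of its largest entry) and all parameter choices are assumptions made for the sketch.

```python
import numpy as np

def soft_threshold(v, lam):
    """Entrywise soft-thresholding; shrinks small entries to exact zero."""
    return np.sign(v) * np.maximum(np.abs(v) - lam, 0.0)

def sparse_rank1_cp(T, frac=0.5, n_iter=30, seed=0):
    """Rank-1 sparse CP decomposition of a 3-way tensor by alternating
    updates, soft-thresholding each factor to induce sparsity (a toy
    analogue of sparsity-regularized CP)."""
    rng = np.random.default_rng(seed)
    b = rng.normal(size=T.shape[1]); b /= np.linalg.norm(b)
    c = rng.normal(size=T.shape[2]); c /= np.linalg.norm(c)

    def step(g):
        g = soft_threshold(g, frac * np.abs(g).max())
        return g / max(np.linalg.norm(g), 1e-12)

    for _ in range(n_iter):
        a = step(np.einsum('ijk,j,k->i', T, b, c))
        b = step(np.einsum('ijk,i,k->j', T, a, c))
        c = step(np.einsum('ijk,i,j->k', T, a, b))
    weight = np.einsum('ijk,i,j,k->', T, a, b, c)
    return weight, a, b, c

# Synthetic sparse user-item-time tensor: only a few entries of each true
# factor are nonzero, mimicking the rarity of ad clicks.
a0 = np.zeros(20); a0[:3] = 1.0; a0 /= np.linalg.norm(a0)
b0 = np.zeros(15); b0[:2] = 1.0; b0 /= np.linalg.norm(b0)
c0 = np.zeros(10); c0[:2] = 1.0; c0 /= np.linalg.norm(c0)
T = 5.0 * np.einsum('i,j,k->ijk', a0, b0, c0)
w, a, b, c = sparse_rank1_cp(T)
```

Thresholding each mode's update keeps only the strongest features, which is the sense in which the latent components become sparse.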

In the second part of the talk, I will discuss an extension of the proposed sparse tensor decomposition to handle multiple sources of tensor data. In online advertising, the users’ click behavior on different ads across multiple devices forms a user-ad-device tensor, and the ad characteristics data form an ad-feature matrix. We propose a unified learning framework to extract latent features embedded in both the tensor data and the matrix data. We conduct cluster analysis of advertisements based on the extracted latent features and provide meaningful insights into the links between different ad industries.

Speaker Introduction: Will Wei Sun is currently an assistant professor of Management Science at the University of Miami School of Business Administration. Before that, he was a research scientist on the advertising science team at Yahoo Labs. He obtained his PhD in Statistics from Purdue University in 2015. Dr. Sun’s research focuses on machine learning, with applications in computational advertising, personalized recommendation systems, and neuroimaging analysis.
Yixin Fang
October 6 CULM 611 Jing Qiu, Department of Applied Economics and Statistics, University of Delaware
FDR Control of the High Dimensional TOST Tests

High-dimensional equivalence testing is an important but seldom studied problem. When researchers look for equivalently expressed genes, the common practice is to conduct differential tests and treat genes that are not differentially expressed as equivalently expressed. This is not statistically valid because it does not appropriately control the type I error. The appropriate approach is to conduct equivalence tests; a well-known equivalence test is the two one-sided tests (TOST) procedure. Existing FDR controlling methods are over-conservative for equivalence tests. We investigate the performance of existing FDR controlling methods and propose three new methods to control the FDR for equivalence tests.
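As a concrete illustration of a single TOST, the sketch below tests equivalence of two group means within a margin of +/- delta using an equal-variance two-sample t statistic, and declares equivalence when the larger of the two one-sided p-values is small. The margin and the simulated data are made up for illustration.

```python
import numpy as np
from scipy import stats

def tost_two_sample(x, y, delta):
    """Two one-sided tests (TOST) for equivalence of two means within
    +/- delta. Returns the TOST p-value (the larger one-sided p-value);
    a small value supports declaring equivalence."""
    nx, ny = len(x), len(y)
    diff = np.mean(x) - np.mean(y)
    # pooled standard error for the equal-variance two-sample t statistic
    sp2 = ((nx - 1) * np.var(x, ddof=1) + (ny - 1) * np.var(y, ddof=1)) / (nx + ny - 2)
    se = np.sqrt(sp2 * (1 / nx + 1 / ny))
    df = nx + ny - 2
    t_lower = (diff + delta) / se           # tests H0: diff <= -delta
    t_upper = (diff - delta) / se           # tests H0: diff >= +delta
    p_lower = stats.t.sf(t_lower, df)
    p_upper = stats.t.cdf(t_upper, df)
    return max(p_lower, p_upper)

rng = np.random.default_rng(1)
x = rng.normal(0.0, 1.0, 200)
y = rng.normal(0.05, 1.0, 200)      # practically equivalent means
p_equiv = tost_two_sample(x, y, delta=0.5)
y_far = rng.normal(2.0, 1.0, 200)   # clearly non-equivalent means
p_diff = tost_two_sample(x, y_far, delta=0.5)
```

Note the asymmetry the abstract points out: a large differential-test p-value is not evidence of equivalence, whereas a small TOST p-value is.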
Wenge Guo
October 12 CTR 230 Xin Yuan, Bell Labs
Bayesian Deep Generative Deconvolutional

A deep generative model is developed for representation and analysis of images, based on a hierarchical convolutional dictionary-learning framework. Stochastic unpooling is employed to link consecutive layers in the model, yielding top-down image generation. A Bayesian support vector machine is linked to the top-layer features, yielding max-margin discrimination. Deep deconvolutional inference is employed at test time to infer the latent features, and the top-layer features are connected with the max-margin classifier for discrimination tasks. The model is efficiently trained using a Monte Carlo expectation-maximization (MCEM) algorithm, implemented on graphics processing units (GPUs) to enable large-scale learning and fast testing. Excellent results are obtained on several benchmark datasets, including ImageNet, demonstrating that the proposed model achieves results that are highly competitive with similarly sized convolutional neural networks.
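Of the components above, stochastic unpooling is perhaps the least standard. A toy NumPy version (an illustrative guess at the mechanism, not the authors' code) expands each low-resolution activation into a block, placing its value at one randomly chosen position and zeros elsewhere, which is what enables top-down generation at a higher resolution:

```python
import numpy as np

def stochastic_unpool(feature_map, block=2, rng=None):
    """Stochastic unpooling: each activation in a 2-D feature map is mapped
    to one randomly chosen cell of a block x block region in the upsampled
    output; all other cells in that region are set to zero."""
    rng = np.random.default_rng() if rng is None else rng
    h, w = feature_map.shape
    out = np.zeros((h * block, w * block))
    for i in range(h):
        for j in range(w):
            di, dj = rng.integers(block), rng.integers(block)
            out[i * block + di, j * block + dj] = feature_map[i, j]
    return out

x = np.arange(1.0, 5.0).reshape(2, 2)
y = stochastic_unpool(x, block=2, rng=np.random.default_rng(0))
```

Each activation survives exactly once in the output, so total activation mass is preserved while the placement within each block is random.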
Yixin Fang
November 2 CULM 611 Wenguang Sun, Department of Data Sciences and Operations, University of Southern California
A General Framework for Information Pooling in Two-Sample Multiple Testing

In this talk, I will discuss a general framework for exploiting the sparsity information in two-sample multiple testing problems. We propose to first construct a covariate sequence, in addition to the usual primary test statistics, to capture the sparsity structure, and then incorporate the auxiliary covariates in inference via a three-step algorithm consisting of grouping, adjusting, and pooling (GAP). The GAP procedure provides a simple and effective framework for information pooling. An important advantage of GAP is its capability of handling various dependence structures, such as those arising from multiple testing for high-dimensional linear regression, differential correlation analysis, and differential network analysis. We establish general conditions under which GAP is asymptotically valid for false discovery rate control, and show that these conditions are fulfilled in a range of applications. Numerical results demonstrate that existing methods can be much improved by the proposed framework.
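A rough sketch of the grouping-adjusting-pooling idea follows. It is illustrative only: the Storey-type signal-proportion estimate and the weighted BH step stand in for the actual GAP adjustment, but the shape is the same: group hypotheses by the covariate, adjust each group's p-values by its estimated sparsity, then pool everything into one FDR step.

```python
import numpy as np

def bh_reject(pvals, alpha):
    """Benjamini-Hochberg step-up: boolean rejection mask at FDR level alpha."""
    m = len(pvals)
    order = np.argsort(pvals)
    passed = pvals[order] <= alpha * np.arange(1, m + 1) / m
    k = np.max(np.nonzero(passed)[0]) + 1 if passed.any() else 0
    mask = np.zeros(m, dtype=bool)
    mask[order[:k]] = True
    return mask

def gap_sketch(pvals, groups, alpha=0.1, tau=0.5):
    """Grouping-adjusting-pooling sketch: estimate each group's signal
    proportion (Storey-type), give groups with more estimated signal
    larger weights, then run one pooled weighted BH step."""
    pvals, groups = np.asarray(pvals), np.asarray(groups)
    weights = np.empty_like(pvals)
    for g in np.unique(groups):
        idx = groups == g
        pi0 = min(1.0, np.mean(pvals[idx] > tau) / (1.0 - tau))  # null fraction
        weights[idx] = max(1.0 - pi0, 1e-3)                      # signal fraction
    w = weights / weights.mean()       # normalize so weights average to one
    return bh_reject(pvals / w, alpha)

# Simulated screening problem: group 0 contains 200 true signals, group 1 is
# pure noise; the group label is the auxiliary covariate encoding sparsity.
rng = np.random.default_rng(2)
m = 1000
p0 = np.concatenate([rng.uniform(0, 1e-4, 200), rng.uniform(0, 1, m - 200)])
p1 = rng.uniform(0, 1, m)
pvals = np.concatenate([p0, p1])
groups = np.repeat([0, 1], m)
rejected = gap_sketch(pvals, groups, alpha=0.1)
```

Reweighting shifts the rejection budget toward the signal-rich group, which is the sense in which the covariate information is "pooled" into the inference.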
November 16 CTR 230 Weijie Su, Department of Statistics, University of Pennsylvania
Statistical Inference for Stochastic Approximation and Online Learning via Hierarchical Incremental Gradient Descent

Stochastic gradient descent (SGD) is an immensely popular approach for optimization in settings where data arrive in a stream or data sizes are very large. Despite an ever-increasing volume of work on SGD, less is known about the statistical inferential properties of predictions based on SGD solutions. In this talk, I will introduce a novel procedure termed HiGrad that conducts inference on predictions without incurring additional computational cost compared with vanilla SGD. HiGrad begins by performing SGD iterations for a while and then splits the single thread into several; it operates hierarchically in this fashion along each thread. With the predictions provided by multiple threads in place, a t-based confidence interval is constructed by decorrelating the predictions using the covariance structure given by the Ruppert–Polyak averaging scheme. Under certain regularity conditions, the HiGrad confidence interval is shown to attain asymptotically exact coverage probability. Finally, the performance of HiGrad is evaluated through extensive simulation studies and a real data example.
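A toy one-level version of the thread-splitting idea, applied to estimating a population mean with SGD on the squared loss, looks like the following. This is a simplification, not HiGrad itself: it forms a naive t-interval from the thread averages and omits the covariance-based decorrelation that the actual procedure performs; the step-size schedule and all sizes are illustrative.

```python
import numpy as np
from scipy import stats

def higrad_mean(sample, n_burn=500, n_split=4, n_thread=500):
    """One-level sketch of the HiGrad idea: a single SGD thread runs for
    n_burn steps, then splits into n_split threads; each thread's
    Ruppert-Polyak average gives one estimate, and a t-based confidence
    interval is formed from their spread."""
    lr = lambda t: 0.5 / t ** 0.55            # polynomially decaying step size
    theta, t = 0.0, 0
    for _ in range(n_burn):
        t += 1
        theta -= lr(t) * (theta - sample())   # gradient of (theta - x)^2 / 2
    estimates = []
    for _ in range(n_split):
        th, avg = theta, 0.0
        for i in range(1, n_thread + 1):
            th -= lr(t + i) * (th - sample())
            avg += (th - avg) / i             # running Ruppert-Polyak average
        estimates.append(avg)
    estimates = np.array(estimates)
    center = estimates.mean()
    half = stats.t.ppf(0.975, df=n_split - 1) * estimates.std(ddof=1) / np.sqrt(n_split)
    return center, (center - half, center + half)

rng = np.random.default_rng(3)
center, (lo, hi) = higrad_mean(lambda: rng.normal(2.0, 1.0))
```

The key point mirrors the abstract: the total number of gradient steps equals that of a single vanilla SGD run, so inference comes at no extra computational cost.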
Yixin Fang
December 7 CULM 611 Annie Qu, Department of Mathematical Sciences, University of Illinois at Urbana-Champaign
A Group-Specific Recommender System

In recent years, there has been growing demand for efficient recommender systems that track users’ preferences and recommend potential items of interest. In this talk, we propose a group-specific method that utilizes dependency information from users and items sharing similar characteristics under the singular value decomposition framework. The new approach is effective for the “cold-start” problem, where the majority of responses in the testing set are obtained from new users or for new items whose preference information is not available in the training set. One advantage of the proposed model is that we are able to incorporate information on the missing mechanism and group-specific features through clustering based on the number of ratings from each user and other variables associated with missing patterns. Our simulation studies and MovieLens data analysis both indicate that the proposed group-specific method improves prediction accuracy significantly compared to existing competitive recommender system approaches. In addition, we extend the recommender system to tensor data with multiple arrays. This is joint work with Xuan Bi, Xiaotong Shen, and Junhui Wang.
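To make the clustering-on-rating-counts idea concrete, here is a toy baseline, not the speaker's SVD-based model: the two-group split at the median rating count and the group-mean-offset predictor are illustrative stand-ins for the group-specific latent factors.

```python
import numpy as np

def group_specific_predict(R):
    """Toy group-specific baseline: split users into 'cold'/'warm' groups by
    their number of ratings (a proxy for the missing mechanism), then predict
    each entry as the item mean plus that user group's mean offset."""
    mask = ~np.isnan(R)
    counts = mask.sum(axis=1)
    groups = (counts > np.median(counts)).astype(int)  # 1 = many ratings
    item_mean = np.nanmean(R, axis=0)    # assumes every item has >= 1 rating
    pred = np.tile(item_mean, (R.shape[0], 1))
    for g in (0, 1):
        rows = groups == g
        offset = np.nanmean(R[rows] - pred[rows])      # group's mean residual
        pred[rows] += offset
    return pred, groups

# Warm users (rows 0-2) rate many items and tend to rate above the per-item
# mean; cold users (rows 3-5) rate few items and tend to rate below it.
R = np.array([
    [2.0, 3.0, 4.0, np.nan],
    [2.0, 3.0, 4.0, 5.0],
    [2.0, 3.0, 4.0, 5.0],
    [1.0, 2.0, np.nan, np.nan],
    [1.0, 2.0, np.nan, np.nan],
    [1.0, 2.0, np.nan, np.nan],
])
pred, groups = group_specific_predict(R)
```

Even this crude version gives a new "cold" user a different prediction than a "warm" one for the same item, which is the intuition behind exploiting group structure for the cold-start problem.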
Antai Wang

Updated: December 6, 2017