# Statistics Seminar - Spring 2021

Seminars are held from 4:00 - 5:00pm, unless noted otherwise. Location: WebEx (Room Information will be posted at a later date)

For questions about the seminar schedule, please contact Zuofeng Shang

**Peijun Sang**, University of Waterloo

**A Reproducing Kernel Hilbert Space Framework for Functional Data Classification**

We encounter a bottleneck when we try to borrow the strength of classical classifiers to classify functional data. The major issue is that functional data are intrinsically infinite dimensional, so classical classifiers either cannot be applied directly or perform poorly because of the curse of dimensionality. To address this concern, we propose to project functional data onto one specific direction and then build a distance-weighted discrimination (DWD) classifier upon the projection score. The projection direction is identified by minimizing, over a reproducing kernel Hilbert space, an empirical risk function that contains the particular loss function of a DWD classifier. Hence our proposed classifier can avoid overfitting and enjoys the appealing properties of DWD classifiers. This framework is further extended to accommodate functional data classification problems where scalar covariates are involved. In contrast to previous work, we establish a non-asymptotic estimation error bound on the relative misclassification rate. In the finite-sample case, we demonstrate through simulation studies and a real-world application that the proposed classifiers compare favorably with some commonly used functional classifiers in terms of prediction accuracy.

This is joint work with Professors Adam B. Kashlak and Linglong Kong from the University of Alberta.
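As a rough illustration of the projection idea in the abstract above, the sketch below projects discretized curves onto a fixed direction and evaluates an average DWD loss on the resulting scores. It is only a sketch under stated assumptions, not the speaker's method: the function names, the trapezoidal approximation of the integral, and the use of the commonly cited DWD loss with exponent 1 are all assumptions, and the reproducing kernel Hilbert space penalty and the minimization over directions are omitted.

```python
import numpy as np

def project(curves, beta, grid):
    """Approximate the integral of X_i(t) * beta(t) dt with the trapezoidal rule.

    curves: array of shape (n, m), each row a curve observed on `grid`.
    beta:   array of shape (m,), the (hypothetical) projection direction.
    grid:   array of shape (m,), increasing observation points.
    """
    curves = np.asarray(curves, dtype=float)
    integrand = curves * np.asarray(beta, dtype=float)
    dt = np.diff(np.asarray(grid, dtype=float))
    return np.sum(0.5 * (integrand[:, 1:] + integrand[:, :-1]) * dt, axis=1)

def dwd_loss(u):
    """DWD loss in its commonly cited form (assumed here):
    1 - u for u <= 1/2, and 1 / (4u) for u > 1/2."""
    u = np.asarray(u, dtype=float)
    safe = np.maximum(u, 0.5)  # avoids division by zero in the unused branch
    return np.where(u <= 0.5, 1.0 - u, 1.0 / (4.0 * safe))

def empirical_dwd_risk(curves, labels, beta, grid, intercept=0.0):
    """Average DWD loss of the margins y_i * (intercept + score_i), labels in {-1, +1}."""
    scores = project(curves, beta, grid)
    margins = np.asarray(labels, dtype=float) * (intercept + scores)
    return float(np.mean(dwd_loss(margins)))

# Toy usage: two constant curves, one per class, and a constant direction.
grid = np.linspace(0.0, 1.0, 101)
curves = np.vstack([np.ones_like(grid), -np.ones_like(grid)])
labels = np.array([1, -1])
beta = np.ones_like(grid)
risk = empirical_dwd_risk(curves, labels, beta, grid)
```

In a full method, the direction `beta` would be chosen by minimizing a penalized version of this risk rather than fixed in advance.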

**Yang Chen**, University of Michigan

**Statistical and Computational Problems in Space Weather Data Challenges**

In this talk, I will introduce several data challenges in the field of space weather, present our initial solutions, and discuss our ongoing efforts. Space weather generally refers to the conditions on the sun, in the solar wind, and within Earth's magnetosphere, ionosphere and thermosphere that can influence the performance and reliability of space-borne and ground-based technological systems and can endanger human life or health (definition given by NASA). For example, a big solar flare might cause outages in Earth's electrical power grids and disturbances of satellites in orbit. Therefore, early warnings of potential strong solar flare events are of great importance. The statistical challenges in solar flare prediction come from the sparsity of samples (rarity of events), large volumes of data of various types, and heterogeneous temporal dependency, to name a few. As a first attempt, we adopted the Long Short-Term Memory (LSTM) neural network to approach solar flare classification and prediction problems. Results and ongoing work on solar flare prediction will be presented in the first half of the talk. In the second half of the talk, I will present video imputation methods that we recently developed for the reconstruction of total electron content (TEC) maps. The TEC is the total number of electrons present along a path between a radio transmitter and receiver. For ground-to-satellite communication and satellite navigation, TEC is a good parameter to monitor for possible space weather impacts. Thus it is necessary to monitor the TEC maps both in real time and in a predictive fashion. Our proposed video imputation methods allow for spatial and temporal smoothness, signal sparsity, and the use of auxiliary data, and they yield well-reconstructed local and global features of TEC maps.

**Guan Yu**, University at Buffalo

**Locally Weighted Nearest Neighbor Classifier and Its Theoretical Properties**

Weighted nearest neighbor (WNN) classifiers are fundamental non-parametric classifiers. They have become the methods of choice in many applications where limited knowledge of the data generation process is available a priori. There is considerable flexibility in the choice of weights for the neighbors in a WNN classifier. In this talk, I will introduce a new locally weighted nearest neighbor (LWNN) classifier, which adaptively assigns weights for different test data points. Given a training data set and a test data point x0, the weights for classifying x0 in LWNN are obtained by minimizing an upper bound of the conditional expected estimation error of the regression function at x0. The resultant weights have a neat closed-form expression, and therefore the computation of LWNN is more efficient than that of some existing adaptive WNN classifiers that require estimating the marginal feature density. Like most other WNN classifiers, LWNN assigns larger weights to closer neighbors. However, in addition to the ranks of the neighbors' distances, the weights in LWNN also depend on the raw values of the distances. Our theoretical study shows that LWNN achieves the minimax rate of convergence of the excess risk when the marginal feature density is bounded away from zero. In the general case, with an additional tail assumption on the marginal feature density, the upper bound of the excess risk of LWNN matches the minimax lower bound up to a logarithmic term.
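For readers unfamiliar with WNN classifiers, the following minimal sketch shows a generic weighted nearest neighbor vote for binary labels. The closed-form LWNN weights are not given in the abstract, so the `inverse_distance_weights` helper is a hypothetical stand-in that merely depends on raw distances; it is not the LWNN weighting, and all function names here are assumptions made for illustration.

```python
import numpy as np

def wnn_predict(X_train, y_train, x0, weights):
    """Generic weighted nearest neighbor vote for labels in {0, 1}.

    weights[i] is the weight of the (i+1)-th nearest neighbor of x0;
    the weights are expected to be non-negative and sum to 1.
    Returns the predicted label and the weighted vote for class 1.
    """
    X_train = np.asarray(X_train, dtype=float)
    y_train = np.asarray(y_train, dtype=float)
    weights = np.asarray(weights, dtype=float)
    dists = np.linalg.norm(X_train - np.asarray(x0, dtype=float), axis=1)
    order = np.argsort(dists, kind="stable")[: len(weights)]
    score = float(np.dot(weights, y_train[order]))
    return int(score >= 0.5), score

def inverse_distance_weights(X_train, x0, k, eps=1e-12):
    """Hypothetical distance-dependent weights (NOT the LWNN closed form):
    proportional to 1 / distance for the k nearest neighbors of x0."""
    X_train = np.asarray(X_train, dtype=float)
    dists = np.linalg.norm(X_train - np.asarray(x0, dtype=float), axis=1)
    nearest = np.sort(dists)[:k]
    raw = 1.0 / (nearest + eps)
    return raw / raw.sum()

# Toy usage: one-dimensional features, three neighbors.
X_train = np.array([[0.0], [1.0], [2.0], [10.0]])
y_train = np.array([1, 1, 0, 0])
x0 = np.array([0.5])

uniform = np.full(3, 1.0 / 3.0)  # plain 3-nearest-neighbor vote
label_uniform, score_uniform = wnn_predict(X_train, y_train, x0, uniform)

local = inverse_distance_weights(X_train, x0, k=3)  # weights vary with x0
label_local, score_local = wnn_predict(X_train, y_train, x0, local)
```

With uniform weights this reduces to the ordinary k-nearest-neighbor rule; weights that depend on the test point, as in the second call, are what make a WNN classifier locally adaptive.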

Updated: April 6, 2021