Title: Sparse models for sparse networks
Author introduction: Chenlei Leng is Professor of Statistics at the University of Warwick, UK. He received a bachelor's degree in Mathematics from the University of Science and Technology of China and a PhD in Statistics from the University of Wisconsin-Madison. He has held regular and visiting faculty positions at Peking University, the University of Munich and the National University of Singapore. He is a Fellow of the International Statistical Institute (ISI) and of the Institute of Mathematical Statistics (IMS). He mainly works on developing methods for analyzing high-dimensional data, network data, and correlated data.
Abstract: Networks are ubiquitous in modern society and science. Stylized features of a typical network include sparsity, degree heterogeneity and homophily, among many others. This talk introduces a framework built on a class of sparse models whose parameters explicitly account for these features. In particular, degree heterogeneity is characterized by node-specific parameters, while homophily is captured through covariates. To avoid over-parametrization, a key assumption of our framework is that node-specific parameters are assigned differentially across nodes. We start with the sparse \beta model, in which no covariates are present, and then discuss a generalized model that includes covariates. Interestingly, for the former we can use \ell_0 penalization to identify and estimate the heterogeneity parameters, while for the latter we resort to logistic regression with an \ell_1 penalty, which immediately connects our methodology to the lasso literature. Along the way, we demonstrate the fallacy of what we call data-selective inference, the common practice in the literature of discarding less well-connected nodes in order to fit a model; this finding may be of independent interest.
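For readers who want the two estimation problems in symbols, the following is a minimal sketch of one common way to write such models. It is an illustration under assumed notation, not necessarily the exact formulation used in the talk: the adjacency entries A_{ij}, the global parameter \mu, the dyadic covariates x_{ij} with coefficient \gamma, the sparsity level s and the tuning parameter \lambda are all symbols introduced here and do not appear in the abstract.

```latex
% Sketch only: notation (A_{ij}, \mu, x_{ij}, \gamma, s, \lambda) is assumed, not taken from the abstract.
% Edge probabilities for an undirected network on n nodes, without covariates:
\[
  P(A_{ij} = 1) = \frac{\exp(\mu + \beta_i + \beta_j)}{1 + \exp(\mu + \beta_i + \beta_j)},
  \qquad 1 \le i < j \le n,
\]
% where only a few of the node-specific parameters \beta_i are nonzero.
% The corresponding log-likelihood is
\[
  L(\mu, \beta) = \sum_{i<j} \Bigl[ A_{ij}(\mu + \beta_i + \beta_j)
      - \log\bigl(1 + \exp(\mu + \beta_i + \beta_j)\bigr) \Bigr].
\]
% \ell_0-penalized (sparsity-constrained) estimation without covariates:
\[
  (\hat\mu, \hat\beta) \in \arg\max_{\mu,\, \beta} \; L(\mu, \beta)
  \quad \text{subject to} \quad \|\beta\|_0 \le s .
\]
% With covariates, the linear predictor becomes \mu + \beta_i + \beta_j + x_{ij}^\top \gamma,
% and an \ell_1 penalty on \beta gives a lasso-type logistic regression:
\[
  (\hat\mu, \hat\beta, \hat\gamma) \in \arg\min_{\mu,\, \beta,\, \gamma}
  \; \Bigl\{ -L(\mu, \beta, \gamma) + \lambda \sum_{i=1}^{n} |\beta_i| \Bigr\}.
\]
```

In this sketch the second problem is an ordinary penalized logistic regression over the n(n-1)/2 dyads, with one indicator column per node for \beta, which is why standard lasso software and theory become relevant.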