PhD Defense: Frederik V. Mikkelsen
Title: Model Selection and Risk Estimation with applications to nonlinear ODEs
Broadly speaking, this thesis is devoted to model selection applied to ordinary differential equations and to risk estimation under model selection. A model selection framework was developed for modelling time-course data by ordinary differential equations. The framework is accompanied by the R software package episode. This package incorporates a collection of sparsity-inducing penalties into two types of loss functions: a squared loss function that relies on numerically solving the equations, and an approximate loss function based on inverse collocation methods. The goal of this framework is to provide effective computational tools for estimating unknown structures in dynamical systems, such as gene regulatory networks, which may be used to predict downstream effects of interventions in the system. A recommended algorithm based on the computational tools is presented and thoroughly tested in various simulation studies and applications.
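To make the distinction between the two types of loss functions concrete, the following is a minimal Python sketch under stated assumptions. It is not the episode package or its API. The one-dimensional model dx/dt = theta[0]*x + theta[1]*x^2, the fixed-step Runge-Kutta solver, and all function names are hypothetical and chosen only for illustration. The approximate loss is a simple gradient-matching surrogate (finite-difference slopes of the data compared with the right-hand side), used here as a stand-in; the inverse collocation construction in the thesis may differ.

```python
def rhs(x, theta):
    # Hypothetical model: dx/dt = theta[0]*x + theta[1]*x^2.
    return theta[0] * x + theta[1] * x * x


def solve_rk4(x0, theta, times):
    # Classical fourth-order Runge-Kutta, one step per observation interval.
    xs = [x0]
    for t0, t1 in zip(times, times[1:]):
        h = t1 - t0
        x = xs[-1]
        k1 = rhs(x, theta)
        k2 = rhs(x + 0.5 * h * k1, theta)
        k3 = rhs(x + 0.5 * h * k2, theta)
        k4 = rhs(x + h * k3, theta)
        xs.append(x + h * (k1 + 2 * k2 + 2 * k3 + k4) / 6)
    return xs


def l1_penalty(theta, lam):
    # Sparsity-inducing (lasso-type) penalty on the parameters.
    return lam * sum(abs(p) for p in theta)


def integration_loss(theta, times, obs, lam=0.0):
    # Squared loss that requires solving the ODE numerically.
    # Assumption: the first observation is used as the initial value.
    path = solve_rk4(obs[0], theta, times)
    return sum((y - x) ** 2 for y, x in zip(obs, path)) + l1_penalty(theta, lam)


def matching_loss(theta, times, obs, lam=0.0):
    # Approximate loss: compare central-difference slopes of the data
    # with the right-hand side evaluated at the data. No ODE solve needed.
    total = 0.0
    for i in range(1, len(obs) - 1):
        slope = (obs[i + 1] - obs[i - 1]) / (times[i + 1] - times[i - 1])
        total += (slope - rhs(obs[i], theta)) ** 2
    return total + l1_penalty(theta, lam)
```

The integration-based loss calls the solver for every candidate parameter value, whereas the matching loss only evaluates the right-hand side at the observations, which is what makes approximate losses of this kind cheap to optimize.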
The second part of the thesis also concerns model selection, but focuses on risk estimation, i.e., estimating the error of mean estimators involving model selection. An extension of Stein's unbiased risk estimate (SURE), which applies to a class of estimators with model selection, is developed. The extension relies on studying the degrees of freedom of the estimator, which for a broad class of estimators decomposes into two terms: one ignoring the selection step and one correcting for it. The classic SURE assumes that the estimator in question is almost differentiable, and it therefore only accounts for the first term of the decomposition. In order to account for the second term, the continuum of models arising when the selection procedure has a tuning parameter is studied. By exploiting the duality between varying the tuning parameter for fixed observations and perturbing the observations for a fixed tuning parameter, an identity that supports the extension of SURE is derived for a class of estimators. The resulting corrected version of SURE is generally fast to compute, and for the lasso-OLS estimator it shows promising results when compared to risk estimation via cross-validation.
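As a point of reference for the classic SURE mentioned above, the following is a minimal Python sketch for soft thresholding in the Gaussian sequence model y_i = mu_i + noise with known noise standard deviation sigma. Soft thresholding is almost differentiable, so the classic formula applies with the number of nonzero coordinates as the degrees-of-freedom estimate. This illustrates only the first term of the decomposition; the correction term developed in the thesis for estimators such as lasso-OLS is not reproduced here. All function names and the simulation setup are assumptions made for illustration.

```python
import random


def soft_threshold(y, lam):
    # Shrink each coordinate towards zero by lam; set small ones to zero.
    return [(abs(v) - lam) * (1 if v > 0 else -1) if abs(v) > lam else 0.0
            for v in y]


def sure_soft_threshold(y, lam, sigma):
    # Classic SURE: ||y - mu_hat||^2 - n*sigma^2 + 2*sigma^2*df_hat,
    # where df_hat counts the coordinates that survive thresholding.
    n = len(y)
    mu_hat = soft_threshold(y, lam)
    rss = sum((a - b) ** 2 for a, b in zip(y, mu_hat))
    df_hat = sum(1 for v in y if abs(v) > lam)
    return rss - n * sigma ** 2 + 2 * sigma ** 2 * df_hat


def average_sure_and_loss(mu, lam, sigma, reps, seed=0):
    # Monte Carlo check: average SURE versus average squared-error loss.
    rng = random.Random(seed)
    total_sure = 0.0
    total_loss = 0.0
    for _ in range(reps):
        y = [m + rng.gauss(0.0, sigma) for m in mu]
        mu_hat = soft_threshold(y, lam)
        total_sure += sure_soft_threshold(y, lam, sigma)
        total_loss += sum((a - b) ** 2 for a, b in zip(mu_hat, mu))
    return total_sure / reps, total_loss / reps
```

Calling average_sure_and_loss with a sparse mean vector, for example ten entries equal to 3 and forty equal to 0, should give two averages that are close to each other, reflecting the unbiasedness of SURE for this estimator. For estimators that are discontinuous in the observations, the same formula would miss the selection correction described above.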
Supervisor: Prof. Niels Richard Hansen, MATH, University of Copenhagen
Assessment committee:
Prof. Carsten Wiuf (Chairman), MATH, University of Copenhagen
Prof. Michael Stumpf, Imperial College, London
Ass. Prof. Ryan Tibshirani, Carnegie Mellon University