Description

IN THE NATURAL SCIENCES

Modern natural sciences depend to a high degree on understanding the fundamental nature of randomness in the time evolution of quantities of interest. This international and interdisciplinary workshop offers four lecture courses: two in statistics and two in the mathematical theory of stochastic processes. It is aimed at master's and Ph.D. students as well as ambitious undergraduates.

The courses in statistics focus on learning theory, non-parametric estimation and their connections to the natural sciences. Both lecturers are distinguished scientists in the field: Prof. Alexandra Carpentier from the Otto-von-Guericke-University Magdeburg, Germany, and Prof. Adolfo José Quiroz from the Department of Mathematics of Universidad de los Andes, Colombia.

The courses on probability theory are held by Prof. Sylvie Roelly, vice dean of the Faculty of Natural Sciences of the University of Potsdam, winner of the Itô Prize in 2007 and a renowned expert in stochastic processes, and Prof. Anthony Reveillac from INSA Toulouse, France, a leading expert in Malliavin calculus, Gaussian processes and Stein's method.

A central problem of machine learning is classification, i.e. the problem of building an automated decision rule that assigns a class to data samples. A common way to address this problem is to consider classification rules that empirically minimise a given loss (akin to a classification error) on a training sample of labelled data, over a given large class of possible classifiers. The theoretical study of this problem uses the tools of learning theory, which provide results to control the performance of such classifiers. We will introduce this problem and the associated theory in the class.
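As a toy illustration of empirical risk minimisation (a sketch of ours, not taken from the course material): we minimise the 0-1 loss over a simple class of one-dimensional threshold classifiers 1{x > t} on a noisy labelled sample. The true threshold 0.6 and the 10% label-noise level are arbitrary choices for the example.

```python
import numpy as np

rng = np.random.default_rng(1)

# toy labelled sample: 1-D features, label 1 iff feature > 0.6, with 10% label noise
x = rng.uniform(0, 1, 200)
y = (x > 0.6).astype(int)
flip = rng.random(200) < 0.1
y[flip] = 1 - y[flip]

# class of candidate classifiers: thresholds t -> 1{x > t}
thresholds = np.linspace(0, 1, 101)

def empirical_risk(t):
    """Fraction of training points misclassified by 1{x > t} (0-1 loss)."""
    return np.mean((x > t).astype(int) != y)

# the empirical risk minimiser over the class
t_hat = thresholds[np.argmin([empirical_risk(t) for t in thresholds])]
print(t_hat, empirical_risk(t_hat))
```

With this setup the empirical minimiser typically recovers a threshold close to the true 0.6, and its empirical risk is close to the noise level; learning theory quantifies how far such an empirical minimiser can be from the best classifier in the class.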

I will consider non-parametric methods for three problems of current interest in complex data. 1. Classification via support vector machines: for this problem, we will show how neighbor methods can significantly reduce the solution time at a minimal performance cost. 2. Intrinsic dimension estimation: for data living on an unspecified submanifold of a high-dimensional Euclidean space, we will discuss graph-theoretic and other relevant methods for consistently estimating the dimension of the manifold. 3. The two-sample problem for functional data: I will give an overview of the available methods for the two-sample problem for multivariate and functional data and discuss some recent alternatives, one of which uses classical empirical process theory for the asymptotic analysis.
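As one concrete example of intrinsic dimension estimation (our illustration; the course may use different estimators), the following sketch implements the classical Levina-Bickel maximum-likelihood estimator, which infers the manifold dimension from distances to the k nearest neighbours, here tested on a curve (intrinsic dimension 1) embedded in three-dimensional space.

```python
import numpy as np

def mle_intrinsic_dimension(X, k=10):
    """Levina-Bickel maximum-likelihood intrinsic dimension estimate.

    X : (n, D) array of points; k : number of nearest neighbours used.
    """
    # pairwise Euclidean distances, sorted so column 0 is the zero self-distance
    d = np.sqrt(np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1))
    d.sort(axis=1)
    # T_j = distance to the j-th nearest neighbour, j = 1..k
    T = d[:, 1:k + 1]
    # local estimate at each point: [ (1/(k-1)) * sum_{j<k} log(T_k / T_j) ]^{-1}
    logs = np.log(T[:, -1:] / T[:, :-1])
    m_local = (k - 1) / logs.sum(axis=1)
    return m_local.mean()

rng = np.random.default_rng(0)
# 500 points on a unit circle (intrinsic dimension 1) embedded in R^3
theta = rng.uniform(0, 2 * np.pi, 500)
X = np.column_stack([np.cos(theta), np.sin(theta), np.zeros(500)])
print(mle_intrinsic_dimension(X, k=10))  # typically close to 1
```

Graph-theoretic estimators mentioned in the abstract proceed differently (e.g. via nearest-neighbour graphs), but the input is the same: only inter-point distances, not the ambient coordinates.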

One of the main issues in statistics when dealing with asymptotic estimators or tests is to derive rates of convergence for quantities of interest. This usually amounts to deriving a central limit type theorem and controlling the distance between the distribution of the statistic and the normal distribution. Motivated by giving his students a proof of the CLT that does not rely on the characteristic function, Charles Stein proposed in 1972 a very efficient paradigm for estimating the distance between the law of a given random variable and that of a standard Gaussian. Since then this paradigm, called Stein's method, has been extensively used and developed, becoming a set of efficient probabilistic methods for computing distances between probability distributions. The aim of this course is to focus on one of these methods, recently developed by Ivan Nourdin and Giovanni Peccati, which combines the core idea of Stein's method with the Malliavin calculus. We will present the main notions of Stein's method and of the Malliavin calculus and then show how they can be combined and applied.
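Schematically (in our notation, not necessarily that of the course), Stein's method rests on a characterization of the standard Gaussian law and on an associated differential equation:

```latex
% Stein's lemma: Z is standard Gaussian iff the Stein operator vanishes in expectation
Z \sim \mathcal{N}(0,1)
\quad\Longleftrightarrow\quad
\mathbb{E}\bigl[f'(Z) - Z f(Z)\bigr] = 0 \quad \text{for all suitable } f .

% Stein equation for a test function h, with Z \sim \mathcal{N}(0,1):
f_h'(x) - x f_h(x) = h(x) - \mathbb{E}[h(Z)] ,

% so that for any random variable W,
\mathbb{E}[h(W)] - \mathbb{E}[h(Z)] = \mathbb{E}\bigl[f_h'(W) - W f_h(W)\bigr] .
```

Bounding the right-hand side uniformly over the solutions f_h, for h ranging over a suitable class, yields bounds on distances such as the total variation or Wasserstein distance; the Nourdin-Peccati approach estimates this expectation using tools from the Malliavin calculus.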

In these lectures we study specific Gaussian processes. First we present various characterizations of the celebrated Brownian motion, in particular an integration by parts formula on path space related to the Malliavin calculus. Then we describe and analyse the periodic Ornstein-Uhlenbeck process, which can be constructed either as the solution of a stochastic differential equation with initial and final conditions (which prevents it from being Markovian) or as the convolution of an exponential function with Brownian motion. Generalizing this procedure, we then introduce so-called convoluted Brownian motions, obtained by convolving arbitrary deterministic functions with a Brownian motion. They exhibit interesting properties: they are Gaussian, periodic and not Markovian, but they sometimes satisfy a time-Markov field property. As particular examples we treat the trigonometric and the monomial convoluted Brownian motions.
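In schematic form (normalisation and notation are our assumptions, not the lecturers'): on a time interval [0, T] the periodic Ornstein-Uhlenbeck process can be written either as a linear SDE with matching initial and final conditions, or as a stochastic convolution, which the convoluted Brownian motions generalize:

```latex
% SDE with initial and final condition (bridge-type, hence non-Markovian)
dX_t = -\lambda X_t \, dt + dB_t , \qquad X_0 = X_T ,

% convolution representation with an exponential kernel g
X_t = \int_0^T g(t-s) \, dB_s , \qquad g \ \text{exponential, extended $T$-periodically} ,

% convoluted Brownian motion: replace g by a general deterministic function \Phi
X^{\Phi}_t = \int_0^T \Phi(t-s) \, dB_s .
```

The trigonometric and monomial examples of the course correspond to particular choices of the kernel \Phi.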

Scientific committee:

Sylvie Roelly, Universität Potsdam, Germany

Michael Hoegele, Universidad de los Andes
