My main research focus is probabilistic models for time series, including modelling, imputation, denoising, filtering and spectral estimation. This page contains brief descriptions of selected research projects.
Bayesian spectral estimation
Spectral estimation aims to identify how the energy of a time series is distributed across frequencies; this can be challenging when only partial and noisy observations are available. In this setting, I have worked on probabilistic generative models for signals (based on Gaussian processes), observations and spectra, with the aim of addressing spectral estimation as Bayesian inference. This procedure, termed Bayesian nonparametric spectral estimation (BNSE), computes posterior distributions of power spectral densities (PSDs), as shown in the figure: using only 10% of a heart-rate time series, we were able to compute a posterior distribution over PSDs (red) that successfully contained the ground truth (blue).
See [C19] in the CV & Publications tab and here for code and demo.
Multioutput signal processing
Designing Gaussian processes for multioutput data is challenging due to the intricate representation of cross-channel correlations. We have followed Cramér's theorem, the multioutput counterpart of Bochner's theorem, to design the covariance structure of a multioutput Gaussian process (MOGP) by first parameterising its cross power spectral density and then converting it, via the Fourier transform, from the frequency domain to the time domain. This model is referred to as the multioutput spectral mixture (MOSM) and, because it has several hyperparameters, training it is challenging. To facilitate its implementation, we have developed a Python toolkit (MOGPTK) for using MOSM and other standard MOGP kernels, which covers data loading, model training, validation and visualisation for multioutput data analysis. We have tested MOSM on datasets from finance, EEG, COVID-19 prevalence, mining, body sensors, and climate (see the figure).
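The frequency-to-time construction can be illustrated in the simpler single-output case, where Bochner's theorem states that a stationary kernel is the inverse Fourier transform of a PSD. The following is a minimal sketch, not the MOSM model itself: it assumes a symmetrised Gaussian PSD (the standard single-output spectral mixture component) and checks numerically that its inverse Fourier transform matches the known closed-form kernel. The function names and parameter values are illustrative only.

```python
import numpy as np

def sm_psd(xi, mu, sigma):
    """Symmetrised Gaussian PSD with components centred at +mu and -mu (assumed form)."""
    def gauss(m):
        return np.exp(-0.5 * ((xi - m) / sigma) ** 2) / (sigma * np.sqrt(2 * np.pi))
    return 0.5 * (gauss(mu) + gauss(-mu))

def kernel_from_psd(tau, mu, sigma, n_grid=20001):
    """Numerical inverse Fourier transform of the PSD using a Riemann sum."""
    half_width = abs(mu) + 10 * sigma          # the grid covers both Gaussian bumps
    xi = np.linspace(-half_width, half_width, n_grid)
    d_xi = xi[1] - xi[0]
    psd = sm_psd(xi, mu, sigma)
    # The PSD is even, so the imaginary (sine) part of the transform vanishes.
    return np.array([np.sum(psd * np.cos(2 * np.pi * xi * t)) * d_xi
                     for t in np.atleast_1d(tau)])

def sm_kernel(tau, mu, sigma):
    """Closed-form spectral mixture kernel for a single component."""
    tau = np.asarray(tau, dtype=float)
    return np.exp(-2 * np.pi ** 2 * sigma ** 2 * tau ** 2) * np.cos(2 * np.pi * mu * tau)

tau = np.linspace(0.0, 2.0, 50)
k_numeric = kernel_from_psd(tau, mu=1.5, sigma=0.3)
k_closed = sm_kernel(tau, mu=1.5, sigma=0.3)
max_abs_error = np.max(np.abs(k_numeric - k_closed))
```

In the multioutput setting the same idea is applied to a matrix of cross power spectral densities, which is what makes the parameterisation more involved.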
In the CV & Publications tab, see [C17] for the MOSM article, [C22] for an application to financial data, and [J14] for the MOGPTK toolbox.
Optimal transport for machine learning
I have recently become interested in applications of optimal transport (OT), in particular to time series and Bayesian inference. My first work on this is a technique for averaging statistical models directly on the Wasserstein space of models (rather than on the parameter space) using a posterior loss. The simplest way to understand this is to recognise the posterior mean as the point estimator obtained by minimising the Bayes risk under a quadratic loss. If this Bayes risk were instead optimised directly on the model space, we could use the Wasserstein loss, which would give a point estimate in the form of a Wasserstein barycenter with respect to the posterior law over models.
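The analogy can be written side by side. The notation below is an assumption made for illustration: $\Pi(\cdot \mid D)$ denotes the posterior given data $D$, $p_\theta$ the model with parameter $\theta$, $\mathcal{M}$ the model space, and the squared 2-Wasserstein distance $W_2^2$ is one possible choice of Wasserstein loss.

```latex
% Posterior mean: minimiser of the Bayes risk under a quadratic loss on the parameter space
\hat{\theta} = \arg\min_{\theta'} \int \lVert \theta - \theta' \rVert^2 \, \Pi(\mathrm{d}\theta \mid D)
             = \mathbb{E}\left[\theta \mid D\right]

% Same risk posed on the model space with a Wasserstein loss: a Wasserstein barycenter
\hat{p} = \arg\min_{q \in \mathcal{M}} \int W_2^2\left(p_\theta, q\right) \, \Pi(\mathrm{d}\theta \mid D)
```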
A more recent, and more applied, use of the Wasserstein distance that I have worked on is the Wasserstein-Fourier (WF) distance for time series. The WF distance between two time series is defined as the Wasserstein distance between the power spectral densities (PSDs) of the corresponding time series; it can therefore be computed between two signals regardless of their discrete or continuous nature. One key advantage of WF is that it allows any distance-based method (such as those found in regression, classification, or clustering) to operate directly on the time series, instead of using hand-crafted features. Another property of WF is that it allows geodesics to be defined in the space of time series, that is, minimum-cost trajectories between one signal and another. See, in the image on the right, two series from the *worms* dataset (blue and red) and 10 constant-cost interpolations using WF. This opens the way for spectral-based data augmentation for time series.
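The definition above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the implementation from [J15]: it uses the one-sided periodogram as a plug-in PSD estimate, normalises it to sum to one, and computes the 2-Wasserstein distance in one dimension through quantile functions evaluated on a finite grid. All function names and the test signals are hypothetical.

```python
import numpy as np

def normalised_psd(x, fs=1.0):
    """One-sided periodogram of x, normalised to sum to one (assumed PSD estimator)."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()                              # remove the DC component
    power = np.abs(np.fft.rfft(x)) ** 2
    freqs = np.fft.rfftfreq(len(x), d=1.0 / fs)
    return freqs, power / power.sum()

def quantile_function(support, weights, levels):
    """Generalised inverse CDF of a discrete distribution on a sorted support."""
    cdf = np.cumsum(weights)
    idx = np.searchsorted(cdf, levels, side="left")
    idx = np.minimum(idx, len(support) - 1)       # guard against rounding in the CDF
    return support[idx]

def wasserstein_fourier(x, y, fs=1.0, n_levels=1000):
    """Approximate W2 distance between the normalised PSDs of x and y."""
    fx, px = normalised_psd(x, fs)
    fy, py = normalised_psd(y, fs)
    levels = (np.arange(n_levels) + 0.5) / n_levels
    qx = quantile_function(fx, px, levels)
    qy = quantile_function(fy, py, levels)
    return float(np.sqrt(np.mean((qx - qy) ** 2)))

# Two pure tones sampled at 100 Hz for 10 s: the spectral mass sits at 5 Hz and 10 Hz.
t = np.arange(1000) / 100.0
tone_5hz = np.sin(2 * np.pi * 5.0 * t)
tone_10hz = np.sin(2 * np.pi * 10.0 * t)
distance = wasserstein_fourier(tone_5hz, tone_10hz, fs=100.0)
```

For these two tones the distance comes out as the 5 Hz gap between the spectral peaks, since each normalised PSD is essentially a point mass. The resulting number can be passed to any distance-based method, for instance as the metric in a nearest-neighbour classifier.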
In the CV & Publications tab, see [U3] for the definition of the Bayes-Wasserstein estimator and [J15] for the Wasserstein-Fourier distance.