Samuel Cohen, Rendani Mbuvha, Tshilidzi Marwala, Marc Deisenroth
Gaussian processes (GPs) are nonparametric Bayesian models that have been applied to regression and classification problems. One approach to alleviating their training cost, which is cubic in the number of training points, is to use local GP experts trained on subsets of the data. While these expert models allow for massively distributed computation, their predictions can suffer from erratic behaviour of the mean or unrealistic uncertainty quantification. In this paper, we provide a solution to these problems for several expert models, including the generalised product of experts and the robust Bayesian committee machine. Furthermore, we leverage the optimal transport literature and propose a new expert model that averages the predictions of local experts by computing their Wasserstein barycenter, which can be applied to both regression and classification settings.
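To make the two kinds of aggregation concrete, the following is a minimal sketch and not the implementation used in the paper. It assumes that each local expert returns a univariate Gaussian prediction at a single test input, and it uses the standard closed forms: the generalised product of experts combines weighted precisions, while the 2-Wasserstein barycenter of one-dimensional Gaussians is the Gaussian whose mean and standard deviation are the weighted averages of the experts' means and standard deviations. The function names `gpoe` and `wasserstein_barycenter_1d`, the weights, and the example numbers are illustrative assumptions.

```python
import numpy as np


def gpoe(means, variances, betas):
    """Generalised product of experts for univariate Gaussian predictions.

    Assumption: betas are non-negative expert weights chosen by the user.
    """
    means = np.asarray(means, dtype=float)
    variances = np.asarray(variances, dtype=float)
    betas = np.asarray(betas, dtype=float)
    # Aggregated precision is the weighted sum of the expert precisions.
    precision = np.sum(betas / variances)
    var = 1.0 / precision
    # Aggregated mean is the precision-weighted combination of expert means.
    mean = var * np.sum(betas * means / variances)
    return mean, var


def wasserstein_barycenter_1d(means, variances, weights):
    """2-Wasserstein barycenter of univariate Gaussians (closed form)."""
    means = np.asarray(means, dtype=float)
    stds = np.sqrt(np.asarray(variances, dtype=float))
    weights = np.asarray(weights, dtype=float)
    weights = weights / weights.sum()  # normalise so the weights sum to one
    # In 1D the barycenter averages the means and the standard deviations.
    mean = np.sum(weights * means)
    std = np.sum(weights * stds)
    return mean, std ** 2


# Hypothetical predictions from two local experts at one test input.
expert_means = [0.0, 2.0]
expert_vars = [1.0, 4.0]
uniform = [0.5, 0.5]

print(gpoe(expert_means, expert_vars, uniform))
print(wasserstein_barycenter_1d(expert_means, expert_vars, uniform))
```

In this toy setting the product of experts pulls the combined mean toward the more confident expert, whereas the barycenter places it at the weighted average of the expert means and keeps the predictive standard deviation between those of the individual experts.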