I am using sklearn to fit a Gaussian process to data.

In this reduced example, I use simple noisy sinusoidal data with an RBF kernel:

import numpy as np
import matplotlib.pyplot as plt
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, ConstantKernel

# noisy sine data: 200 points drawn uniformly from [-3, 3]
X = np.random.uniform(-3., 3., (200, 1))
Y = np.sin(X) + np.random.randn(200, 1) * 0.05

# GP with an RBF kernel; hyperparameters are optimized during fit
kernel = RBF(length_scale=1)
gp = GaussianProcessRegressor(kernel=kernel, n_restarts_optimizer=9)
gp.fit(X, Y)

# predict the mean and standard deviation on a dense grid and plot
x = np.linspace(-3, 3, 100)
y_pred, sigma = gp.predict(x.reshape(-1, 1), return_std=True)
plt.scatter(X, Y, c='r')
plt.plot(x, y_pred)
plt.show()

This results in a very non-smooth mean prediction for a relatively trivial noisy sine curve, and this appears to be the case regardless of the value of the RBF length_scale parameter or of n_restarts_optimizer (see the sketch below for the kind of variations I tried).
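For example, a sketch of the kind of variations I mean (the values here are illustrative, not an exhaustive list of everything I tried): refitting with different initial length scales and more optimizer restarts still gives a jagged mean.

# illustrative re-fits with different hyperparameters; the jaggedness persists
for ls in [0.1, 1.0, 10.0]:
    gp = GaussianProcessRegressor(kernel=RBF(length_scale=ls), n_restarts_optimizer=20)
    gp.fit(X, Y)
    y_pred, sigma = gp.predict(x.reshape(-1, 1), return_std=True)
    plt.plot(x, y_pred, label=f"length_scale={ls}")  # mean prediction stays jagged
plt.legend()
plt.show()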


From the sklearn RBF documentation, GPs with this kernel as their covariance function have mean square derivatives of all orders and are thus very smooth. Any ideas why the mean prediction is so unsmooth here?
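In case it helps, this is the kind of check I can run on the fitted model (gp.kernel_ holds the kernel with the hyperparameters the optimizer settled on, and log_marginal_likelihood_value_ the objective at that optimum):

# inspect what the optimizer actually converged to
print(gp.kernel_)                          # optimized kernel, including length_scale
print(gp.log_marginal_likelihood_value_)   # log-marginal likelihood at the optimum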

edit: image embedding does not appear to work, so here is the plotted output: https://imgur.com/kIE9hOT
