I am using sklearn to fit a Gaussian process to data.
In this reduced example, I fit simple noisy sinusoidal data with an RBF kernel:
    import numpy as np
    import matplotlib.pyplot as plt
    from sklearn.gaussian_process import GaussianProcessRegressor
    from sklearn.gaussian_process.kernels import RBF, ConstantKernel

    X = np.random.uniform(-3., 3., (200, 1))
    Y = np.sin(X) + np.random.randn(200, 1) * 0.05

    kernel = RBF(length_scale=1)
    gp = GaussianProcessRegressor(kernel=kernel, n_restarts_optimizer=9)
    gp.fit(X, Y)

    x = np.linspace(-3, 3, 100)
    y_pred, sigma = gp.predict(x.reshape(-1, 1), return_std=True)

    plt.scatter(X, Y, c='r')
    plt.plot(x, y_pred)
This results in a very non-smooth fitted mean prediction for a relatively trivial noisy sine curve, and it happens regardless of the value of the RBF length_scale parameter.
From the sklearn RBF documentation, GPs with this kernel as covariance function have mean square derivatives of all orders and are thus very smooth. Any ideas why the mean prediction is so non-smooth here?
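One way to check whether the jaggedness comes from the GP interpolating the observation noise is to model the noise explicitly. The sketch below (my own variant, not from the code above: it adds a WhiteKernel with an arbitrary starting noise_level of 1e-2 and seeds the RNG for repeatability) keeps everything else the same:

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, WhiteKernel

rng = np.random.RandomState(0)  # fixed seed so the check is repeatable
X = rng.uniform(-3., 3., (200, 1))
Y = np.sin(X) + rng.randn(200, 1) * 0.05

# RBF alone gives a noise-free model, so the posterior mean must pass
# through every noisy observation; the added WhiteKernel lets the fit
# attribute the jitter to observation noise instead.
kernel = RBF(length_scale=1.0) + WhiteKernel(noise_level=1e-2)
gp = GaussianProcessRegressor(kernel=kernel, n_restarts_optimizer=9)
gp.fit(X, Y.ravel())  # 1-D targets keep predict's output shapes simple

x = np.linspace(-3, 3, 100).reshape(-1, 1)
y_pred, sigma = gp.predict(x, return_std=True)
print(gp.kernel_)  # the fitted noise_level is learned from the data
```

Passing a nonzero `alpha` to `GaussianProcessRegressor` instead of adding a `WhiteKernel` should have a similar regularizing effect, the difference being that `alpha` is fixed rather than optimized.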
Edit: image embedding does not appear to work; here is the plotted output: https://imgur.com/kIE9hOT