Let the input vector space be \(\mathcal{X} = \mathbb{R}^2\), and define kernel \(k: \mathcal{X} \times \mathcal{X} \to \mathbb{R}, k(x,y)=\|x-y\|_2^2 = (x_1-y_1)^2+(x_2-y_2)^2\). Let's find the space of functions \(\mathcal{H}\) on \(\mathcal{X}\), for which \(k\) is the reproducing kernel. The corresponding feature map is then \(\phi: \mathcal{X}\to \mathcal{H}, \phi(x) = k(\cdot, x) = \|\cdot - x\|_2^2\)
To construct a unique RKHS given kernel, I copied and pasted the standard procedure (i.e. Moore-Aronszajn theorem) here: For all \(x\in \mathcal{X}\), define the function
Let \(\mathcal{H}_0\) be the linear span of \(k_x\), i.e. \(\mathcal{H}_0\) consists of functions of the form
Define an inner product on \(\mathcal{H}_0\) by
Let \(\mathcal{H}\) be the completion of \(\mathcal{H}_0\) with respect to this inner product. Then \(\mathcal{H}\) consists of functions of the form
Ok that's all very cool, but what does an element in \(\mathcal{H}\) actually look like for our kernel \(k(x,y)=\|x-y\|_2^2 = (x_1-y_1)^2+(x_2-y_2)^2\)? Consider a concrete \(\tilde y = (a,b) \in \mathcal{X}\), say \((a,b)=(3,4)\), then its representer \(k_{\tilde y} = k_{(a,b)} \in \mathcal{H}\) is the function such that \(k_{(a,b)} (x) = x_1^2 + x_2^2 - 2ax_1 - 2bx_2 + a^2 + b^2\). An element in \(\mathcal{H_0}\) then takes the form of a linear combination of functions \(k_{(a,b)}\), as we vary \((a,b) \in \mathcal{X}\). An element \(f\) in \(\mathcal{H}\) is some linear combination of infinitely many such \(k_{(a,b)}\), and it'll look something like \(f(x) = ax_1^2 + ax_2^2 + bx_1 + cx_2 + d\), where \(a,b,c,d\in\mathbb{R}\) have some funky convergent property.
For our earlier example \(k_{\tilde y} = k_{(3,4)}\), you may be tempted to write
then identify \(k_{\tilde y}\) with the vector \((1,1,-6,-8,25)\) and conclude that the feature map defined by \(k\) is \(\phi: \mathbb{R}^2 \to \mathbb{R}^5, \phi(x) = k_x = (x_1^2, x_2^2, x_1, x_2, 1)\). This is incorrect, as our RKHS \(\mathcal{H} \neq \mathbb{R}^5\), and
is not the inner product \eqref{eq:inner} on \({\mathcal{H}}\) (although it is the inner product on \(\mathbb{R}^5\)). If you are not convinced, take the inner product (on \(\mathbb{R}^5\)) between \((x_1^2, x_2^2, x_1, x_2, 1)\) and \((y_1^2, y_2^2, y_1, y_2, 1)\); this is certainly not equal to \(k(x,y)\).
To write kernel evaluation as inner product in the feature space \(\mathcal{H}\) defined by the kernel, we have to follow the definition \eqref{eq:inner}:
Good resources on RKHS:
-
intro to Hilbert space video; optionally application in Fourier analysis
-
Arthur Gretton's tutorials here and here (great theoretical details)