Checking the CV error in R


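library(kernlab)  # provides rbfdot() and kernelMatrix()

x1 <- c(-1.69,-1.5,-1.3,-1.1,-0.9,-0.7,-0.48,-0.42,-0.05,0.09,0.39,0.48,0.59,1.0,1.08,1.29,1.5)
y <- c(-0.15,-0.25,-0.2,-0.14,-0.21,0,0.34,0.35,0.8,0.75,0.4,0.2,0.15,-0.05,-0.1,-0.35,-0.39)
n <- length(x1)  # 17 points

# Leave-one-out CV error for an RBF kernel fit with a ridge penalty.
# b: kernel parameter (sigma of rbfdot), lambda: regularization strength.
getcv <- function(b, lambda, y) {
    rbf <- rbfdot(sigma = b)
    k <- kernelMatrix(rbf, as.matrix(x1))   # n x n kernel matrix
    k2 <- k + lambda * diag(n)
    h <- solve(k2) %*% k                    # hat matrix: fitted values are h %*% y
    # LOO residuals are the ordinary residuals divided by (1 - h_ii)
    mean(((y - (h %*% y)) / (1 - diag(h)))^2)
}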
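
The last line of getcv is the usual leave-one-out shortcut for a linear smoother. Written out as a sketch, assuming K is the kernel matrix built by rbfdot (whose documented form is exp(-sigma ||x - x'||^2), with sigma playing the role of β here) and I_n is the n x n identity:

```latex
K_{ij} = \exp\!\bigl(-\beta\,(x_i - x_j)^2\bigr), \qquad
H = (K + \lambda I_n)^{-1} K, \qquad
\mathrm{CV}(\beta, \lambda) = \frac{1}{n} \sum_{i=1}^{n}
  \left( \frac{y_i - (H y)_i}{1 - H_{ii}} \right)^{2}
```

Because K and (K + λ I_n)^{-1} commute, H is the same matrix as K (K + λ I_n)^{-1}, so the order of the product in the code does not matter.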

Fix β = 1 and check λ = 0.00000001, 0.001, 1.0.

> getcv(1,0.00000001,y)
[1] 0.3993206
> getcv(1,0.001,y)
[1] 0.01512193
> getcv(1,1.0,y)
[1] 0.02984302

Fix λ = 0.001 and check β = 0.1, 1.0, 10.

> getcv(0.1,0.001,y)
[1] 0.04060114
> getcv(1,0.001,y)
[1] 0.01512193
> getcv(10,0.001,y)
[1] 0.01612822

With only 17 points, the CV error does not seem to blow up unless λ is made very small.
The textbook also uses 10^{-6}.
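
To see where the CV error starts to grow, one option is to scan λ over a log-spaced grid with β fixed. The following is only a sketch: rbf_kernel and cv_shortcut are hypothetical helper names, and the kernel is written in base R (assuming the same exp(-sigma (a - b)^2) form that rbfdot documents) so that the block runs without kernlab.

```r
# Data copied from the note above.
x1 <- c(-1.69,-1.5,-1.3,-1.1,-0.9,-0.7,-0.48,-0.42,-0.05,0.09,0.39,0.48,0.59,1.0,1.08,1.29,1.5)
y <- c(-0.15,-0.25,-0.2,-0.14,-0.21,0,0.34,0.35,0.8,0.75,0.4,0.2,0.15,-0.05,-0.1,-0.35,-0.39)

# Hypothetical helper: Gaussian kernel matrix exp(-sigma * (a_i - b_j)^2).
rbf_kernel <- function(a, b, sigma) exp(-sigma * outer(a, b, "-")^2)

# Hypothetical helper: leave-one-out CV error via the hat-matrix shortcut.
cv_shortcut <- function(sigma, lambda) {
  n <- length(x1)
  k <- rbf_kernel(x1, x1, sigma)
  h <- solve(k + lambda * diag(n)) %*% k
  mean(((y - h %*% y) / (1 - diag(h)))^2)
}

# Scan lambda = 1e-8, 1e-7, ..., 1 with beta (sigma) fixed at 1.
lambdas <- 10^seq(-8, 0, by = 1)
cv <- sapply(lambdas, function(l) cv_shortcut(1, l))
print(data.frame(lambda = lambdas, cv = cv))
```

Printing the table makes it easy to read off the λ with the smallest CV error; lambdas[which.min(cv)] picks it out directly.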