Bugzilla – Bug 14386

predict(backSpline(x)) fails in some situations

Last modified: 2010-09-23 22:05:06 UTC

Here's an apparently sensible example where predict(backSpline(interSpline())) gives crazy answers. Greg Snow contributes a diagnosis, and I contribute a simpleminded patch based on one of his solutions. Someone with more knowledge and understanding of the context may well be able to do better ... ## original example r2 <- c(1.04409027570601, 1.09953936543359, 1.15498845516117, 1.21043754488875, 1.26588663461633, 1.32133572434391, 1.37678481407149, 1.43223390379907, 1.48768299352664, 1.54313208325422, 1.5985811729818, 1.65403026270938, 1.70947935243696, 1.76492844216454, 1.82037753189212, 1.87582662161970, 1.93127571134727, 1.98672480107485, 2.04217389080243, 2.09762298053001 ) d2 <- c(6.1610616585333, 5.70079375491741, 5.2366151167289, 4.77263065800071, 4.31310259797181, 3.86232922249189, 3.42452047126494, 3.00367670365119, 2.6034766331926, 2.22717964214416, 1.87754657382891, 1.55678176465949, 1.26649764837839, 1.00770187223770, 0.780805622450771, 0.585650849661306, 0.421553364080296, 0.287358347766713, 0.18150469048976, 0.102094654969619 ) plot(d2,r2,type="b") require(splines) sp <- interpSpline(r2,d2) psp <- predict(sp) points(psp$y,psp$x,col=5) bsp <- backSpline(sp) lines(predict(bsp,seq(0,6,length=51)),col=2,type="b") ## Greg Snow says: ## I am pretty sure that you have found a bug, I am just not sure where exactly the bug would be. ## The problem comes from the fact that your bsp object has the knots ## in decreasing order, if we reverse the order like this: bsp3 <- bsp bsp3$knots <- rev(bsp$knots) bsp3$coefficients <- bsp$coefficients[20:1,] lines(predict(bsp3,seq(0,6,length=101)), col='blue') ## Then the prediction looks like it should. ## If we trace through the predict.polySpline function with x=5.6 as ## one example, then inside the prediction function the value of i ## (the interval between knots that x belongs to) is 18, well if the ## knots are in increasing order then that is the correct interval, ## this comes from the line: i <- as.numeric(cut(x, knots)) and the ## cut function treats the knots as being in increasing order, not ## decreasing, so that in a later step when the function is supposed ## to be finding how far x is from the closest knot below it (which ## should be 5.2366151 for your example) it grabs the 18 element of ## knots, which in its decending order is instead: 0.2873583, so the ## difference ends up being much too big, when the function squares ## and cubes this much too big value, we get values that are way off ## (as you noticed), the wrong set of coefficients is being used as ## well (also row 18, which correspond to 0.28 not 5.23). I would say ## that the most surprising thing with your example is the range at ## which the prediction is actually decent, I would only expect this ## between the middle 2 knots. ## I see 2 possible ways to fix the problem: ## 1. The predict function checks the order of the knots and either ## reorders the knots and coefficients when the knots are descending, ## or does the correct computations of the interval. ## 2. The backSpline function checks and makes sure that it always ## produces the results with the knots in ascending order (and the ## coefficients to match). =================================================================== --- splineClasses.R (revision 52966) +++ splineClasses.R (working copy) @@ -568,7 +568,7 @@ bcoeff[nkm1, 3L] <- 0 bcoeff[nkm1, 2L] <- kdiff[nkm1]/adiff[nkm1] } - value <- list(knots = bknots, coefficients = bcoeff) + value <- list(knots = bknots[order(bknots)], coefficients = bcoeff[order(bknots),]) attr(value, "formula") <- do.call("~", as.list(attr(object, "formula"))[3L:2L]) class(value) <- c("polySpline", "spline") value

(In reply to comment #0) > Here's an apparently sensible example where predict(backSpline(interSpline())) > gives crazy answers. Greg Snow contributes a diagnosis, and I contribute a > simpleminded patch based on one of his solutions. Someone with more knowledge > and understanding of the context may well be able to do better ... > [.......] Thank you, Ben, I had started looking into Greg's diagnosis, and started to construct own even simpler examples but had not finished in my analysis of "the best" / "most logical" fix to the problem. So,I'm going to test your proposed patch and (hopefully after seeing no problems) commit and also port it to R 2.12.0 alpha. Martin > =================================================================== > --- splineClasses.R (revision 52966) > +++ splineClasses.R (working copy) > @@ -568,7 +568,7 @@ > bcoeff[nkm1, 3L] <- 0 > bcoeff[nkm1, 2L] <- kdiff[nkm1]/adiff[nkm1] > } > - value <- list(knots = bknots, coefficients = bcoeff) > + value <- list(knots = bknots[order(bknots)], coefficients = > bcoeff[order(bknots),]) > attr(value, "formula") <- do.call("~", as.list(attr(object, > "formula"))[3L:2L]) > class(value) <- c("polySpline", "spline") > value

used a different patch (more efficient); but thank you very much, Ben, for getting this going!