Bug 16100 - Error in logLik.nls
Summary: Error in logLik.nls
Status: NEW
Alias: None
Product: R
Classification: Unclassified
Component: Analyses (show other bugs)
Version: R 3.1.2
Hardware: x86_64/x64/amd64 (64-bit) Windows 64-bit
: P5 enhancement
Assignee: R-core
URL:
Depends on:
Blocks:
 
Reported: 2014-12-06 20:37 UTC by Arkajyoti Bhattacharya
Modified: 2014-12-06 20:37 UTC (History)
0 users

See Also:


Attachments

Note You need to log in before you can comment on or make changes to this bug.
Description Arkajyoti Bhattacharya 2014-12-06 20:37:43 UTC
Hello,

There are some bugs in the source code of logLik.nls.

(1/N) should be multiplied with sum(log(weights)) for the calculation of val
in the source code. On the other hand, "res" is taken as weighted residuals
and even after that weights are multiplied by residuals for calculating
"val".

That calculation mistake is causing the difference in AIC values for nls
model with weights.

I would expect that the R developers would recover this problem. I have
written a sample code for pointing out this problem.

You can see this by the following code:

methods(logLik)
stats:::logLik.nls

set.seed(12345)
n=2000
npar=3 # alpha, beta & sigma
yt<-rnorm(n, mean=1, sd=0.25)
xt<-rnorm(n, mean=0, sd=0.16)
wt = runif(n)

data<-cbind(yt,xt)
modt=list()
modt[[1]]<-nls(yt~alpha+beta*xt, start=list(alpha=1, beta=1), weights = wt)

i=1
sse<-deviance(modt[[i]])

######################
#residuals used in the source code
######################
res1 = modt[[1]]$m$resid()
######################
#Our calculated Residuals without weights
######################
coef = coef(modt[[1]])

yhat = coef[[1]] +coef[[2]]*xt

res2 = yt - yhat
######################
#Residuals obtained by "residuals" command
######################

res3 = residuals(modt[[1]])

######################
#Matching between the residuals
#If the result is 2000 then all are matching
#Otherwise all are not same
######################

sum(res2==res3)

sum(res1==res2)

#######################
#matching the calculated residuals mulitplied by sqrt(wt) with the residuals
obtained by the source code formula
#######################

sum((sqrt(wt)*res2)==res1)


#######################
#Calculating Loglikelihood
#######################

sigma = sqrt(sse/n)

########################
#Our likelihood
########################
loglik1<-(-(n/2)*log(2*pi)-n*log(sigma) +
sum(log(wt))/2-sse*(1/(2*sigma^2)))

########################
#Source code likelihood
########################

#Using Source code residuals

res <- res1
N <- length(res)
w = wt
zw <- w == 0

loglik2 = -N * (log(2 * pi) + 1 - log(N) - sum(log(w + zw)) +
        log(sum(w * res^2)))/2


########################
#Corrected Likelihood
########################

#Using Our Calculated Residuals
res <- res2
N <- length(res)
w = wt
zw <- w == 0

loglik3 = -N * (log(2 * pi) + 1 - log(N) - (1/N)*sum(log(w + zw)) +
                  log(sum(w * res^2)))/2
(1/N) should be multiplied with sum(log(w+zw)). It is correctly written for
logLik.lm. you can check its source code in the same manner as mentioned
above

print(loglik1)#Our Calculation

print(loglik2)#source code

print(loglik3)#corrected

logLik(modt[[i]])#Using the command "logLik"

aic1<-(-2*loglik1+2*npar)
aic2<-(-2*loglik2+2*npar)
aic3<-(-2*loglik3+2*npar)

print(aic1)#Our Calculation

print(aic2)#source code

print(aic3)#corrected

AIC(modt[[i]])#Using the command AIC





#########################################################################
#Results
#########################################################################

######################
> #Matching between the residuals
  > #If the result is 2000 then all are matching
  > #Otherwise all are not same
  > ######################
>
  > sum(res2==res3)
[1] 2000
Our Calculated residuals match with the outcome that "residuals" command
produce

>
  > sum(res1==res2)
[1] 0
Our Calculated residuals do not match with the residuals used for logLik.nls
source code

>
  > #######################
> #matching the calculated residuals mulitplied by sqrt(wt) with the
> residuals obtained by the source code formula
  > #######################
>
  > sum((sqrt(wt)*res2)==res1)
[1] 2000
Our residuals multiplied with sqrt(wt) match with the residuals used for
logLik.nls source code

   #Our Calculation
print(loglik1) [1] -372.4776

print(loglik2) #source code [1] -2010632

print(loglik3) #corrected [1] -372.4776

logLik(modt[[i]]) #Using the command "logLik" 'log Lik.' -2010632 (df=3)

aic1<-(-2*loglik1+2*npar) aic2<-(-2*loglik2+2*npar)
aic3<-(-2*loglik3+2*npar)

print(aic1) #Our Calculation [1] 750.9552

print(aic2) #source code [1] 4021270

print(aic3) #corrected [1] 750.9552

AIC(modt[[i]]) #Using the command AIC [1] 4021270



With Regards,

Arkajyoti Bhattacharya
AIG Bangalore