Likelihood and entropy. An example

Sunday, July 20, 2014

As an example of the ideas in the previous post, I am running here an example of an empirical likelihood estimation in the simplest possible setting following this example.

Asumme n=50 observations is extracted from $$x_i \sim N(4, 1)$, $i=1,…,n$:

N <- 50
x <- rnorm(N, 4, 1)

The maximum likelihood estimate of the mean is simply

## [1] 4.086

Maximizing the program in the post with respect to $$\lambda_2$1, resuts in the following condition

and we can find the zeroes of the previous implicit function by minimizing:

lambda <- function(ll, x, mu) {
    n <- length(x) 
    num <- x - mu
    denom <- 1 + ll * num
    result <- (1/n) * sum(num/denom)

The (inverse) of the probability of each observation is given by:

omega <- function(x, mu, lambda) {
    n <- length(x)
    return(n * (1 + lambda * (x - mu)))

which means that the log-likelihood program to maximize is \mu$. Therefore, we can first solve the partially optimized problem with respect to -\sum_i\ln(1/\omega_i(\lambda_2,\mu))$. Four things are worth noting. First is that, given the simplicity of the problem, grid search is a reasonable option. Second, that as in any grid search problem, we should check the behavior of the program in the boundaries. Third, that the interval in which the optimizer is going to be looking is defined by the upper and lower bounds that we can establish on x$.

In this case, the vector of candidates is defined by

sgrid <- seq(min(x) + 1e-10, max(x) - 1e-10, by = 0.01)

and the actual optimization happens here:

el <- elambda <- vector("numeric", length(grid))
ip <- matrix(NA, nrow = length(sgrid), ncol = N)
i <- 1
for (i in 1:length(sgrid)) {
    lb <- (1 - 1/length(x))/(sgrid[i] - max(x))
    ub <- (1 - 1/length(x))/(sgrid[i] - min(x))
    elambda <- optimize(lambda, interval=c(lb, ub),
	                    x=x, mu=sgrid[i])$$minimum
    ip[i, ] <- omega(x, sgrid[i], elambda)
    el[i] <- -sum(log(ip[i, ]))
    i <- i + 1

Therefore the estimate is

loc <- which(el == max(el))
## [1] 4.08

which coincides with the ML estimate shown above. As a matter of fact, we can also see that the empirical probability assigned to each observation follows a nice normal distribution:

plot(sort(std(1/ip[loc,])), (1:N)/N, 
     main='Empirical cumulative distribution and Normal CDF',
     xlab='Standardized estimated probabilities',     
curve(pnorm(x), add = TRUE, col = 'red')


And this is probably the first time I use the base library for graphics in R in 8 months.

  1. It can be shown that $$\lambda_1=1$. 

Dialogue & Discussion