Sixth Problem Set

Sampling and Importance Resampling

Due Friday, November 16, 2007

In the fifth chart set, charts 23-25, code for doing Gibbs sampling in the normal inference problem with unknown mean and standard deviation is given. You can copy this code directly into an R program and use it to sample from the posterior distribution with Jeffreys prior 1/sigma on sigma and flat prior on mu. I would recommend a fairly large sample, perhaps 30,000 points would be enough for this problem. (Note: delete the line that uses the nonexistent function results; otherwise the code won't run).

So, first use the code to generate a sample from the posterior distribution, obtaining samples mu[] and sigma[]. Use the following data set for X:

X = c(-5.36, -2.73, -0.67, 0.27, 0.35, 2.44, 3.05, 3.35, 3.78, 4.53, 5.73, 6.35, 6.74, 7.99, 8.61)

Once you have the sample on mu and sigma, you are in a position to see what will happen if you change the priors.

For example, another scientist believes that he has prior information on mu and sigma that he can use to get a better inference.

We could, of course, just write a sampler to sample Gibbs from the problem with the new prior, but resampling builds on the sample we already have so as to avoid a lot of work.

On chart set 7, p. 23, the normal conjugate prior for this problem is displayed. Our scientist believes that his prior information is represented in that equation by taking the nu0=10, Sxx=4, mu0=4 and n0=1.

You can now resample from the sample you have created as follows:

1) Generate a vector that is the ratio of the desired prior to the original Jeffreys prior, evaluated point by point for each sample. This can be done in one statement in R, using R's vector capabilities, the parameters that the scientist wants to use, and the values of the samples in mu and sigma (as vectors). You may also consider writing a short function to evaluate this.

2) Resample from the vector 1:N using the R function sample, where N is the length of your original sample, sampling with replacement, using for the prob vector the vector that you created in step 1. Thus, you are sampling components that the new prior values more highly with greater probablility. This will produce a vector resam of indices which tells you which samples will appear in the resampled sample. For example, if the first component of this sample is 391, then the first sample the resampled data is mu[391] and sigma[391]. Some values of N will be sampled more than once. This is OK.

3) Create vectors of the resampled values by using resam.mu = mu[resam] and similarly for resam.sigma. In the above example, resam.mu[1]=mu[391], etc.

4) Now you can investigate the properties of the sample with the new prior. Compare the original sample with the new one, for example, by showing the plots of the samples, histograms of the samples (use 50 bins), sample posterior means and standard deviations of mu and sigma, plot of mu vs. sigma, quantiles, etc. How has the new prior affected the results?



All materials at this website Copyright (C) 1994-2007 by William H. Jefferys. All Rights Reserved.