In the last problem, we saw how we could account for the effect of a different prior by resampling from the MCMC sample using weights that were, at each sample point, proportional to the ratio of the new prior to the old prior.
The same idea can be used to estimate the effect of new data. In this case, the weights are proportional to the likelihood that arises from the new data. This technique can be especially effective in sequential estimation, where the data comes in over time and we need a new estimate of the unknown parameters every time new data arrives. An example is tracking a space probe. At any given time, we have an estimate of the position of the probe in terms of the posterior probability of the orbital parameters. When new data arrives (for example, from a deep space radar "ping"), we want to update our estimates. To do this, we calculate the likelihood of the new data at each sample point and then resample from the current posterior sample (regarded now as the prior for the new data), with weights proportional to those likelihoods. The new sample then represents the posterior after both the old data and the new data have been observed.
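The update step just described can be sketched in R as follows. This is a minimal sketch rather than a reference solution: the function name `resample_update` and its interface are hypothetical, and the caller is assumed to supply the log-likelihood of the new data at each sample point. Working on the log scale and subtracting the maximum before exponentiating is a choice made here to avoid numerical underflow; only the ratios of the weights matter.

```r
# Hypothetical helper (name and interface are assumptions, not from the problem set).
# samples: data frame with one row per draw from the current posterior
# logw:    log-likelihood of the new data at each row of `samples`
resample_update <- function(samples, logw) {
  stopifnot(length(logw) == nrow(samples))
  # Rescale so the largest weight is 1; only weight ratios matter
  w <- exp(logw - max(logw))
  # Draw row indices with replacement, with probability proportional to the weights
  idx <- sample.int(nrow(samples), size = nrow(samples), replace = TRUE, prob = w)
  out <- samples[idx, , drop = FALSE]
  rownames(out) <- NULL
  out
}

# Toy check with made-up numbers: three sample points, the second heavily favored
set.seed(1)
toy <- data.frame(theta = c(-1, 0, 1))
toy_new <- resample_update(toy, logw = c(-50, 0, -50))
```

In the toy call the second point carries essentially all of the weight, so the resampled set consists of copies of that point. The output has the same number of rows as the input, which corresponds to keeping the sample size constant.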
Generally, one wants to do this with a fairly large sample. The reason is that, since you are sampling with replacement, each update drops some points from the sample and replaces them with duplicates of points that were drawn more than once (if, for example, you keep the sample size constant). Eventually, the number of distinct points in the sample becomes relatively small, and it becomes necessary to reconstitute a more diverse sample before proceeding. Starting with a large sample slows the rate at which this happens and thus reduces the need to reconstitute.
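Two simple diagnostics can indicate how far this depletion has gone: the number of distinct points left in the sample after resampling, and the effective sample size of the weights before resampling. The helper names below are hypothetical, and the effective sample size used is the common Kish approximation, (sum of w)^2 / (sum of w^2), which is an assumption of this sketch and not something the problem prescribes.

```r
# Hypothetical helper: number of distinct rows remaining in a resampled set
n_distinct <- function(samples) {
  nrow(unique(as.data.frame(samples)))
}

# Hypothetical helper: Kish effective sample size of (unnormalized) weights
ess <- function(w) {
  sum(w)^2 / sum(w^2)
}

# Made-up examples
ess_equal <- ess(rep(1, 4))        # equal weights: all four points count
ess_single <- ess(c(1, 0, 0, 0))   # one point carries all the weight
dup <- data.frame(mu = c(1, 1, 2), sigma = c(3, 3, 4))
k <- n_distinct(dup)               # rows 1 and 2 are identical
```

Equal weights give an effective sample size equal to the number of points, while a single dominant weight gives a value near 1. Tracking either diagnostic after each chunk shows when it is time to reconstitute the sample.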
For the following problem, I recommend using at least 10,000 points in your sample.
Use as your initial sample the one you obtained at the start of Problem Set #6. Then imagine that you observe new data in two chunks, as follows:
Chunk #2:
X2 = c(0.722, 6.105, 2.230, 0.153, 3.816, 2.675, 5.762, -7.259, -1.330, 0.065)
Chunk #3:
X3 = c(1.057, 2.913, 2.022, -4.873, 1.733, 4.919, -0.193, 3.242, 3.435, 1.545)
These chunks are assumed to come from the same distribution, with the same mean and variance, as the data used in the first MCMC sample, so the likelihood has the same form (only the data values differ).
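For concreteness, here is how the log-likelihood of a chunk might be evaluated at each sample point. The sketch assumes, purely for illustration, that the model is normal with unknown mean and standard deviation, and that the sample is stored in columns named `mu` and `sigma`. Both are assumptions; substitute the actual likelihood and parameter names from your own MCMC run if they differ.

```r
X2 <- c(0.722, 6.105, 2.230, 0.153, 3.816, 2.675, 5.762, -7.259, -1.330, 0.065)

# ASSUMPTION: normal likelihood; `mu` and `sigma` are hypothetical column names.
# Returns one log-likelihood value per row of `samples`.
chunk_loglik <- function(samples, x) {
  mapply(function(m, s) sum(dnorm(x, mean = m, sd = s, log = TRUE)),
         samples$mu, samples$sigma)
}

# Two made-up sample points, used only to show the call
pts <- data.frame(mu = c(1, 5), sigma = c(4, 4))
logw <- chunk_loglik(pts, X2)
```

The resulting vector can serve as the log-weights for the resampling step. With equal `sigma`, the point whose `mu` lies closer to the mean of the chunk receives the larger log-likelihood.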
Starting with the MCMC simulation from Problem #6, generate updates of the sample by the method proposed above, first using only chunk #2 of data (X2), and then repeating the process with chunk #3 (X3) as well. After each update, plot marginal posterior probabilities and compute quantiles, means, medians, etc., so as to characterize the evolution of the posterior probability over time.
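One possible way to organize the summaries at each stage is sketched below. The helper names are hypothetical, and the draws used in the example are placeholders generated on the spot, not the output of any MCMC run; they only illustrate the shape of the result.

```r
# Hypothetical helper: per-parameter mean, sd, and quantiles of a sample
summarize_posterior <- function(samples, probs = c(0.025, 0.25, 0.5, 0.75, 0.975)) {
  t(sapply(as.data.frame(samples), function(col) {
    c(mean = mean(col), sd = sd(col), quantile(col, probs = probs))
  }))
}

# Hypothetical helper: one histogram per parameter as a marginal posterior estimate
plot_marginals <- function(samples, stage = "") {
  samples <- as.data.frame(samples)
  old <- par(mfrow = c(1, ncol(samples)))
  on.exit(par(old))
  for (nm in names(samples)) {
    hist(samples[[nm]], breaks = 50, freq = FALSE,
         main = paste(nm, stage), xlab = nm)
  }
  invisible(NULL)
}

# Placeholder draws standing in for a posterior sample (NOT real MCMC output)
set.seed(42)
stage1 <- data.frame(mu = rnorm(10000, 1, 1), sigma = runif(10000, 2, 6))
summary1 <- summarize_posterior(stage1)
```

Calling `summarize_posterior` on the sample before any update, after chunk #2, and after chunks #2 and #3 gives three tables that can be compared row by row. A call such as `plot_marginals(stage1, "(initial)")` draws the matching histograms for that stage.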
Note: This is just one method that can be used to accomplish this end. We can discuss other related methods.