Homework 5 in STAT400: Computational Statistics @ CSU


Be sure to set.seed(400) at the beginning of your homework.

  1. The two-parameter Exponential density is \[ f(x) = \lambda e^{-\lambda(x - \gamma)}, \qquad x \ge \gamma, \ \lambda > 0, \ \gamma \ge 0. \] The cdf of the Exp(\(\lambda, \gamma\)) distribution is given by \[ F(x) = \int\limits_{\gamma}^x \lambda e^{-\lambda(t - \gamma)} dt = \int\limits_{0}^{x - \gamma} \lambda e^{-\lambda u}du = 1 - e^{-\lambda(x - \gamma)}, \qquad x \ge \gamma. \]

    1. Use the inverse transform method and write a function that returns a random sample of size \(n\) from the Exp(\(\lambda, \gamma\)) distribution for arbitrary \(n\), \(\lambda\), and \(\gamma\).
    2. Repeat a., but use the accept-reject algorithm.
    3. Generate \(n = 1000\) samples from Exp(\(\lambda, \gamma\)) with \(\lambda = 1\) and \(\gamma = 0, 1, 2\) using each method from a. and b. Compare the \(0.1, 0.2, \dots, 0.9\) sample quantiles with the theoretical quantiles.
    • The formula for the theoretical quantiles is given as \(\hat{x}_\alpha = -\frac{1}{\lambda} \log(1-\alpha) + \gamma\), where \(\hat{x}_\alpha\) is the desired \(\alpha\)-quantile (i.e. if you let \(\alpha = 0.1\), then you can use the formula to compute the 10th percentile of the distribution).
    • Make sure that you get the support correct. The two-parameter exponential density equals 0 when \(x < \gamma\). If you don’t include that when you define the density, then your results won’t be correct.
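
As an illustration of the inverse transform step and the quantile comparison described above, here is a minimal sketch in R. The function names `rexp2_inv` and `qexp2` are hypothetical, and only one parameter setting (\(\lambda = 1\), \(\gamma = 1\)) is shown; it is one possible approach under those assumptions, not a complete solution.

```r
set.seed(400)

# Hypothetical helper: inverse transform sampler for Exp(lambda, gamma).
# Solving u = 1 - exp(-lambda * (x - gamma)) for x gives
# x = gamma - log(1 - u) / lambda.
rexp2_inv <- function(n, lambda = 1, gamma = 0) {
  stopifnot(lambda > 0, gamma >= 0)
  u <- runif(n)
  gamma - log(1 - u) / lambda
}

# Hypothetical helper: theoretical alpha-quantile, using the formula in the hint.
qexp2 <- function(alpha, lambda = 1, gamma = 0) {
  -log(1 - alpha) / lambda + gamma
}

alpha <- seq(0.1, 0.9, by = 0.1)
x <- rexp2_inv(1000, lambda = 1, gamma = 1)

# Side-by-side table of sample and theoretical quantiles.
comparison <- data.frame(
  alpha = alpha,
  sample = unname(quantile(x, probs = alpha)),
  theoretical = qexp2(alpha, lambda = 1, gamma = 1)
)
print(comparison)
```

Every draw is at least \(\gamma\) by construction, which is one way to check that the support has been handled correctly.
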
  2. Continue Problem 2 from Homework 4 (the Rayleigh distribution). Goal: you will use the code that you created last week for Problem 2 to learn more about sampling distributions. Here we are going to investigate the sampling distribution of the median of the Rayleigh distribution.

    1. Generate \(n = 3\) samples from a Rayleigh(1) distribution and estimate the median of the sample. Now repeat this process to generate \(m=100\) estimated medians.
    2. Repeat part a but using \(n=100\) samples from a Rayleigh distribution.
    3. Repeat parts a and b but using \(m=1000\).
    4. Draw a histogram of the estimated medians for parts a-c. This is the sampling distribution of the median of the Rayleigh(1) distribution. Compare your histograms from parts a-c. Make sure that your x axes have the same range for all 4 plots (see ?ggplot2::xlim), and put all the plots together. You can use this format to start your code chunk to get the plots to appear together and smaller:

      {r fig.show='hold', out.width='50%'}
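
The repeated-sampling idea in this problem can be sketched as follows. The Homework 4 code is not reproduced here, so `rrayleigh_inv` is an assumed stand-in that uses the inverse transform for the Rayleigh(\(\sigma\)) cdf \(F(x) = 1 - e^{-x^2/(2\sigma^2)}\); `sim_medians` is likewise a hypothetical helper. Only the \(m = 100\) case is shown, and plotting is left out.

```r
set.seed(400)

# Assumed stand-in for the Homework 4 sampler: inverse transform for
# Rayleigh(sigma), with F(x) = 1 - exp(-x^2 / (2 * sigma^2)) for x >= 0.
rrayleigh_inv <- function(n, sigma = 1) {
  sigma * sqrt(-2 * log(runif(n)))
}

# Hypothetical helper: draw m samples of size n and return the m sample medians.
sim_medians <- function(m, n, sigma = 1) {
  replicate(m, median(rrayleigh_inv(n, sigma)))
}

med_n3   <- sim_medians(m = 100, n = 3)
med_n100 <- sim_medians(m = 100, n = 100)

# The median of Rayleigh(1) is sqrt(2 * log(2)), about 1.177.
true_median <- sqrt(2 * log(2))

# Centre and spread of the two simulated sampling distributions.
print(c(mean_n3 = mean(med_n3), mean_n100 = mean(med_n100), true = true_median))
print(c(sd_n3 = sd(med_n3), sd_n100 = sd(med_n100)))
```

Comparing the two standard deviations gives a numerical view of what the histograms should show: the estimated medians concentrate around the true median as \(n\) grows.
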

  3. A random variable \(X\) has the Lognormal\((\mu, \sigma)\) distribution if \(X = e^Y\), where \(Y \sim N(\mu, \sigma^2)\). That is, \(\log X \sim N(\mu, \sigma^2)\). Write a function to generate from a Lognormal\((\mu, \sigma)\) distribution using a transformation method and generate a sample of size \(1000\) for \(\mu = 1\), \(\sigma = 0.25\). Compare the histogram with the lognormal density curve given by the dlnorm function in R.
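
The transformation \(X = e^Y\) can be sketched in R as below. The function name `rlnorm_transform` is hypothetical, and base graphics are used for the comparison as an assumption; the number of histogram breaks is an arbitrary choice.

```r
set.seed(400)

# Hypothetical helper: draw Y ~ N(mu, sigma^2) and return X = exp(Y),
# which has the Lognormal(mu, sigma) distribution.
rlnorm_transform <- function(n, mu = 0, sigma = 1) {
  stopifnot(sigma > 0)
  y <- rnorm(n, mean = mu, sd = sigma)
  exp(y)
}

x_ln <- rlnorm_transform(1000, mu = 1, sigma = 0.25)

# Density-scaled histogram with the dlnorm curve overlaid.
hist(x_ln, freq = FALSE, breaks = 30,
     main = "Lognormal(1, 0.25) sample", xlab = "x")
curve(dlnorm(x, meanlog = 1, sdlog = 0.25), add = TRUE, lwd = 2)
```

Setting `freq = FALSE` puts the histogram on the density scale, so the overlaid curve is directly comparable.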

Turn in a PDF of your homework to Canvas. Your .Rmd file on rstudio.cloud will also be used in grading, so be sure the two are identical and reproducible.