Prior Distributions
Objective:
By the end of this lesson, students will have reviewed the following topics:
- Definition of prior distribution
- Definition of hyperparameter
- Identification of an appropriate distribution to represent a belief
Duration:
- 75 minutes
Materials:
- Handouts with exercises and problems related to basic probability
- Computer, projector, and screen
Introduction:
What should the probability distribution be for the maximum temperature 1 year from today?
What should the probability distribution be for the final grade in this course?
Introduce the lesson’s topic:
- Today we will apply our knowledge of distributions to represent our prior beliefs.
Main Content:
Vocabulary
Prior distribution: probability model for our understanding of an unknown quantity, such as \(\text{P}[\text{event}]\), before observing data.
- Hyperparameter: parameter in the specified prior distribution.
Data distribution: probability model for the outcome, \(y\).
- Parameter: parameter in the specified data distribution.
Posterior distribution: probability model that summarizes the plausibility of the parameter given both the prior information and the observed outcome, \(y\).
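As a sketch of how the three distributions relate, Bayes' rule combines the prior and the data distribution into the posterior. The generic parameter symbol \(\theta\) and the density notation \(f(\cdot)\) are assumptions made here for illustration:
\[f\left(\theta \mid y\right) = \frac{f\left(\theta\right) f\left(y \mid \theta\right)}{f\left(y\right)} \propto f\left(\theta\right) f\left(y \mid \theta\right)\]
Here \(f\left(\theta\right)\) is the prior, \(f\left(y \mid \theta\right)\) is the data distribution, and \(f\left(\theta \mid y\right)\) is the posterior.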
Beta distribution
Suppose we are looking at binary outcomes; we want to put a prior on \(\pi = P[Y=1]\), meaning \(\pi \in [0, 1]\).
The Beta model, often used to describe the variability in \(\pi\), has two shape hyperparameters, \(\alpha > 0\) and \(\beta > 0\):
\[\pi \sim \text{Beta}\left(\alpha, \beta \right)\]
The Beta model’s pdf is
\[f\left( \pi \right) = \frac{\Gamma \left( \alpha + \beta \right)}{\Gamma \left( \alpha \right) \Gamma \left( \beta \right)} \pi^{\alpha-1} (1-\pi)^{\beta-1}, \quad \pi \in [0, 1]\]
Note the following:
- \(\Gamma\left( z \right) = \int_{0}^{\infty} x^{z-1} e^{-x} dx\) for \(z > 0\)
- \(\Gamma\left( z + 1 \right) = z \Gamma\left( z \right)\)
- if \(z\in \mathbb{Z}^+\), then \(\Gamma\left( z \right) = (z-1)!\)
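The Gamma function facts and the Beta pdf above can be checked numerically. The following is a minimal Python sketch using only the standard library; the function name `beta_pdf` is hypothetical, and the sketch evaluates interior points \(0 < \pi < 1\) only, to sidestep the endpoint cases where the density can be unbounded.

```python
import math

def beta_pdf(p, alpha, beta):
    """Beta(alpha, beta) density at an interior point 0 < p < 1."""
    if alpha <= 0 or beta <= 0:
        raise ValueError("alpha and beta must be positive")
    if not 0 < p < 1:
        raise ValueError("this sketch only evaluates 0 < p < 1")
    # Normalizing constant Gamma(alpha + beta) / (Gamma(alpha) * Gamma(beta)).
    const = math.gamma(alpha + beta) / (math.gamma(alpha) * math.gamma(beta))
    return const * p ** (alpha - 1) * (1 - p) ** (beta - 1)

# Gamma(z) = (z - 1)! for positive integers z.
print(math.gamma(5), math.factorial(4))

# Gamma(z + 1) = z * Gamma(z), checked at a non-integer z.
print(math.isclose(math.gamma(3.5), 2.5 * math.gamma(2.5)))

# Beta(1, 1) is flat on (0, 1); Beta(2, 2) peaks at 0.5.
print(beta_pdf(0.3, 1, 1), beta_pdf(0.5, 2, 2))
```

With \(\alpha = \beta = 1\) the normalizing constant and both powers equal 1, so the density is constant, which is one way to see that Beta(1, 1) expresses no preference among values of \(\pi\).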
Normal distribution
Suppose we are now examining a continuous outcome. Let \(Y\) be a continuous random variable that can take any value in \(\mathbb{R}\); i.e., \(Y \in \left(-\infty, \infty\right)\).
Let us assume that the variability in \(Y\) can be represented by the normal distribution with mean parameter \(\mu \in \mathbb{R}\) and standard deviation parameter \(\sigma \in \mathbb{R}^+\).
\[Y \sim N\left(\mu, \sigma^2\right)\]
The normal model’s pdf is
\[f(y) = \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left\{-\frac{\left(y - \mu\right)^2}{2\sigma^2} \right\}\]
Note: \(\sigma\) provides a sense of scale for \(Y\); approximately 95% of \(Y\) values will be within 2 standard deviations of the mean.
- i.e., within \(\mu \pm 2 \sigma\)
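The normal pdf and the two-standard-deviation rule can be checked numerically. The following is a minimal Python sketch using only the standard library; the function names `normal_pdf`, `normal_cdf`, and `prob_within` are hypothetical, and the cdf is written in terms of the error function `math.erf`.

```python
import math

def normal_pdf(y, mu, sigma):
    """N(mu, sigma^2) density at y."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    z = (y - mu) / sigma
    return math.exp(-0.5 * z * z) / math.sqrt(2 * math.pi * sigma ** 2)

def normal_cdf(y, mu, sigma):
    """P[Y <= y] for Y ~ N(mu, sigma^2), via the error function."""
    return 0.5 * (1 + math.erf((y - mu) / (sigma * math.sqrt(2))))

def prob_within(k, mu=0.0, sigma=1.0):
    """P[mu - k*sigma <= Y <= mu + k*sigma]."""
    return normal_cdf(mu + k * sigma, mu, sigma) - normal_cdf(mu - k * sigma, mu, sigma)

# Probability of falling within 2 standard deviations of the mean.
print(round(prob_within(2), 4))
```

The result does not depend on the particular \(\mu\) and \(\sigma\), which is why the rule can be stated for any normal model.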
Calculation and Practice:
Examples for the Beta distribution:
Example 1: plotting with different parameters
Example 2: students evaluate and choose between three distributions
Example 3: students derive appropriate prior
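When comparing Beta distributions with different shape hyperparameters, it can help to tabulate a few summaries alongside the plots. The following is a minimal Python sketch, assuming the standard Beta formulas for the mean, mode, and variance; the function name `beta_summary` and the \((\alpha, \beta)\) pairs are illustrative choices, not taken from the lesson.

```python
import math

def beta_summary(alpha, beta):
    """Mean, mode, and standard deviation of Beta(alpha, beta)."""
    total = alpha + beta
    mean = alpha / total
    var = alpha * beta / (total ** 2 * (total + 1))
    # The interior-mode formula applies only when both shapes exceed 1.
    mode = (alpha - 1) / (total - 2) if alpha > 1 and beta > 1 else None
    return {"mean": mean, "mode": mode, "sd": math.sqrt(var)}

# Illustrative hyperparameter pairs (assumed, not from the lesson).
for a, b in [(1, 1), (2, 2), (5, 5), (2, 8), (8, 2)]:
    s = beta_summary(a, b)
    mode = "n/a" if s["mode"] is None else f"{s['mode']:.3f}"
    print(f"Beta({a}, {b}): mean={s['mean']:.3f} mode={mode} sd={s['sd']:.3f}")
```

Comparing Beta(2, 2) with Beta(5, 5) shows the same mean with a smaller standard deviation, while swapping the two shapes, as in Beta(2, 8) and Beta(8, 2), mirrors the distribution around 0.5.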
Examples for the normal distribution:
Example 1: what happens as variability changes?
Example 2: students construct appropriate normals with given mean
Example 3: students derive appropriate prior
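One possible way to turn a belief into a normal prior is to state a range that should contain the quantity about 95% of the time and treat it as \(\mu \pm 2\sigma\). The following is a minimal Python sketch of that heuristic; it is an assumption of this example rather than the only tuning approach, and the function name `normal_from_range` and the numbers used are hypothetical.

```python
def normal_from_range(low, high):
    """Return (mu, sigma) so that [low, high] is mu +/- 2*sigma."""
    if high <= low:
        raise ValueError("high must exceed low")
    mu = (low + high) / 2
    sigma = (high - low) / 4
    return mu, sigma

# Hypothetical belief: a quantity is very likely between 60 and 100.
mu, sigma = normal_from_range(60, 100)
print(mu, sigma)

# With the mean held fixed, a larger sigma widens the 95% range.
for s in (1.0, 5.0, 10.0):
    print(f"sigma={s}: about 95% of values in [{mu - 2 * s}, {mu + 2 * s}]")
```

Because the normal model places some probability on every real number, a prior built this way can extend slightly beyond a naturally bounded scale; whether that matters depends on the application.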
Discussion and Wrap-Up:
- Facilitate a class discussion to review the example problems, reinforce key concepts, and answer any questions the students have.
Homework:
- Assign additional problems to practice the basic probability rules.
Formative Assessment:
- Evaluate students based on their participation in discussions, their ability to solve example problems, and their performance on the assigned homework.
Conclusion:
Emphasize that a prior will not make or break an analysis.
Our goal is to carry out the analysis in the best way possible.