A continuous random variable (CRV) is another form of distribution discussed in the course (the previous topics were Discrete Random Variables and, most recently, the Normal Distribution).
There are two properties that define a CRV:
Instead of having a countable number of outcomes like the previous distributions (Binomial, Bernoulli), a continuous random variable can take any value within a range, with no gaps in between. This means that we can graph the CRV, and the distribution will look like the graph of a function.
Stay tuned, I'll get back to this in Section 3: Probability Density Functions. For now, read on.
Here is a distribution of a CRV: how long it takes a customer to receive their McDonald's meal within an hour of service:
Time (seconds) | Number of Customers | Relative Frequency |
---|---|---|
\(25 \le x < 30\) | 2 | 2/109 |
\(30 \le x < 35\) | 3 | 3/109 |
\(35 \le x < 40\) | 6 | 6/109 |
\(40 \le x < 45\) | 8 | 8/109 |
\(45 \le x < 50\) | 11 | 11/109 |
\(50 \le x < 55\) | 6 | 6/109 |
\(55 \le x < 60\) | 4 | 4/109 |
\(60 \le x < 65\) | 8 | 8/109 |
\(65 \le x < 70\) | 12 | 12/109 |
\(70 \le x < 75\) | 18 | 18/109 |
\(75 \le x < 80\) | 16 | 16/109 |
\(80 \le x < 85\) | 10 | 10/109 |
\(85 \le x < 90\) | 4 | 4/109 |
\(90 \le x < 95\) | 0 | 0/109 |
\(95 \le x < 100\) | 1 | 1/109 |
\(100 \le x < 105 \) | 0 | 0/109 |
\(105 \le x < 110\) | 0 | 0/109 |
We can see that the relative frequency of each time range is equal to its frequency (number of customers) divided by the sum of all the frequencies (total customers, here 109).
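As a quick check of the table, here is a minimal Python sketch that recomputes the relative frequencies from the customer counts. The variable names are my own, and `Fraction` is only used so the results stay as exact fractions like the ones in the table.

```python
from fractions import Fraction

# Customer counts per 5-second time range, copied from the table (25 s up to 110 s).
counts = [2, 3, 6, 8, 11, 6, 4, 8, 12, 18, 16, 10, 4, 0, 1, 0, 0]

total = sum(counts)  # total number of customers

# Relative frequency = frequency / sum of all frequencies.
relative_frequencies = [Fraction(c, total) for c in counts]

print(total)
print(relative_frequencies[0])
print(sum(relative_frequencies))
```

The relative frequencies should always add to 1, which is a handy sanity check in an exam.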
Whilst a histogram of this data may not look very curvy, as the time ranges get narrower, the graph will approach a smoother, less blocky curve.
We can estimate what this looks like by drawing a curve from the roots through the midpoint of each rectangle (with some smoothing):
This section focused primarily on background information. It would be very rare to find a question about histograms; relative frequencies, however, are asked about quite frequently in various forms.
As we have seen previously, probabilities associated with both DRVs and CRVs can be represented in the form of a table or a histogram (graph).
A probability distribution function (PDF) is any function that produces the probabilities associated with a DRV or CRV; for a CRV it is usually called a probability density function.
We can integrate this function over a range to find the probability of that range.
Probability distribution functions (\(f(x)\)) have some properties:
A PDF may very well be described as a piece-wise function, in which certain ranges of \(x\) each have their own function associated.
For instance, here is such a PDF (left), and the graph of this PDF (right):
\(f(x) = \begin{cases} \frac{3x^2}{20} & 0 \le x \le 2 \\ -\frac{3x}{10} + \frac{6}{5} & 2 \le x \le 4 \end{cases}\)
The area under the curve between two x-values is the probability of that range. Area IS Probability! If the PDF is piece-wise, break up the integral to take into consideration the different functions and domains:
1) From the above \(f(x)\), what is the probability that \(x\) is between 1 and 3?
\(\displaystyle \begin{aligned} P(1 \le X \le 3) &= \int_{1}^{2}{ \frac{3x^2}{20} \ dx} + \int_{2}^{3}{-\frac{3x}{10} + \frac{6}{5} \ dx} \\[10pt] &= \left[ \frac{3x^3}{60} \right]_{1}^{2} + \left[ -\frac{3x^2}{20} + \frac{6x}{5} \right]_{2}^{3} \\[10pt] &= \left[ \frac{x^3}{20} \right]_{1}^{2} + \left[ -\frac{3x^2}{20} + \frac{6x}{5} \right]_{2}^{3} \\[10pt] &= \left[ \left(\frac{8}{20} \right) - \left( \frac{1}{20} \right) \right] + \left[ \left(-\frac{27}{20}+\frac{18}{5} \right) -\left(-\frac{12}{20} + \frac{12}{5} \right) \right] \\[10pt] &= \left[ \frac{7}{20} \right] + \left[-\frac{15}{20} + \frac{6}{5} \right] \\[10pt] &= -\frac{8}{20} + \frac{24}{20} \\[10pt] &= \frac{16}{20} \\[10pt] &= \frac{4}{5} \end{aligned} \)
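If you want to double-check an answer like this numerically, here is a rough Python sketch. It assumes nothing beyond the PDF above; the `simpson` helper is my own stand-in for what a calculator's integral function does, not anything prescribed by the course.

```python
def f(x):
    """Piecewise PDF from the example above (zero outside 0 <= x <= 4)."""
    if 0 <= x <= 2:
        return 3 * x**2 / 20
    if 2 < x <= 4:
        return -3 * x / 10 + 6 / 5
    return 0.0


def simpson(g, a, b, n=100):
    """Approximate the integral of g over [a, b] with Simpson's rule (n even)."""
    h = (b - a) / n
    total = g(a) + g(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * g(a + i * h)
    return total * h / 3


# Split at x = 2, where the formula changes, exactly as in the working above.
prob = simpson(f, 1, 2) + simpson(f, 2, 3)

# The whole area under a PDF should come out as 1.
total_area = simpson(f, 0, 2) + simpson(f, 2, 4)

print(round(prob, 4))
print(round(total_area, 4))
```

The first value should agree with \(\frac{4}{5}\), and the second confirms that the total area under the PDF is 1.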
A cumulative distribution function (CDF) is the result of integrating a probability density function from the start of its domain up to \(x\). For instance, see the PDF (left) and its CDF (right):
For some reason the graphs didn't come out well - the first one is linear to \(x = 4\), and the other ends at co-ordinate (4, 1)
\(f(x) = \frac{x}{8} \)
\(F(x) = \frac{x^2}{16}\)
The cumulative distribution function produces the probability of the range between the beginning of the function's domain and that value of \(x\), i.e. \(F(x) = P(X \le x)\).
2) Using the above CDF, find:
\(\begin{aligned} P(0 \le X \le 3) &= F(3) \\[5pt] &= \frac{9}{16} \\[5pt] &= \int_{0}^{3}{\frac{x}{8} \ dx} \end{aligned} \)
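A small Python sketch of the same idea, assuming the CDF \(F(x) = \frac{x^2}{16}\) on \(0 \le x \le 4\) from above. The second probability, \(P(1 \le X \le 3)\), is an extra illustration of my own showing that a range that does not start at 0 is just a difference of two CDF values.

```python
from fractions import Fraction


def F(x):
    """CDF F(x) = x^2 / 16 for the PDF f(x) = x / 8 on 0 <= x <= 4."""
    x = Fraction(x)
    if x < 0:
        return Fraction(0)  # no probability below the domain
    if x > 4:
        return Fraction(1)  # all probability accumulated above the domain
    return x**2 / 16


p_0_to_3 = F(3)         # P(0 <= X <= 3)
p_1_to_3 = F(3) - F(1)  # P(1 <= X <= 3), extra illustration

print(p_0_to_3)
print(p_1_to_3)
```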
Here are some go-to formulas:
\(\mu = E(X) = \int_{- \infty}^{\infty}{xp(x) \ dx}\)
\(\sigma^2 = \int_{-\infty}^{\infty}{(x-\mu)^2 p(x) \ dx} \)
What do the infinities mean, you may ask? In practice, the lower limit is the lowest value of \(x\) in the PDF's domain and the upper limit is the largest value of \(x\) in the PDF's domain, since \(p(x) = 0\) everywhere outside that domain.
3) Solve the following, given the PDF \(f(x) = \begin{cases} \frac{x}{4} & 0 \le x \le 2 \\ \frac{x}{8} + \frac{3}{16} & 2 \le x \le 3 \end{cases}\)
Again, the graph looks pretty rough, not sure why. Really, you would rarely use the graph unless the PDF is a triangle, in which case the area of a triangle can be used to find unknown values.
a. What is the mean? (calc-assumed)
\(\begin{aligned} \mu &= \int_{-\infty}^{\infty}{xp(x) \ dx} \\[7pt] &= \int_{0}^{2}{x \times \frac{x}{4} \ dx} + \int_{2}^{3}{x \left( \frac{x}{8} + \frac{3}{16} \right) \ dx} \\[7pt] &= \int_{0}^{2}{\frac{x^2}{4} \ dx} + \int_{2}^{3}{\frac{2x^2+3x}{16} \ dx} \\[7pt] &= \frac{185}{96} \\[7pt] &= 1.9271 \\[7pt] &= 1.93 \end{aligned} \)
b. What is the variance? (calc-assumed)
\(\begin{aligned} \sigma^2 &= \int_{-\infty}^{\infty}{(x-\mu)^2 p(x) \ dx} \\[7pt] &= \int_{0}^{2}{\left( x - \frac{185}{96} \right)^2 \left(\frac{x}{4} \right) \ dx} + \int_{2}^{3}{\left( x - \frac{185}{96} \right)^2 \left(\frac{x}{8} + \frac{3}{16} \right) \ dx } \\[7pt] &= 0.5051 \\[7pt] &= 0.51 \end{aligned} \)
c. What is the standard deviation? (calc-assumed)
\(\begin{aligned} \sigma &= \sqrt{\sigma^2} \\[5pt] &= \sqrt{0.5051} \\[5pt] &= 0.7107 \\[5pt] &= 0.71 \end{aligned} \)
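Since parts a. to c. are calc-assumed, here is a hedged Python sketch of what the calculator is doing behind the scenes. The `simpson` and `expect` helpers are names I made up for illustration; the PDF is stored piece by piece so that each formula is only ever integrated over its own range.

```python
import math


def simpson(g, a, b, n=100):
    """Approximate the integral of g over [a, b] with Simpson's rule (n even)."""
    h = (b - a) / n
    total = g(a) + g(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * g(a + i * h)
    return total * h / 3


# Each piece of the PDF from question 3: (lower bound, upper bound, formula).
pieces = [
    (0, 2, lambda x: x / 4),
    (2, 3, lambda x: x / 8 + 3 / 16),
]


def expect(weight):
    """Integrate weight(x) * p(x) over the whole domain, one piece at a time."""
    return sum(
        simpson(lambda x, g=g: weight(x) * g(x), lo, hi)
        for lo, hi, g in pieces
    )


mean = expect(lambda x: x)                    # mu
variance = expect(lambda x: (x - mean) ** 2)  # sigma^2
std_dev = math.sqrt(variance)                 # sigma

print(round(mean, 4), round(variance, 4), round(std_dev, 4))
```

The three printed values should line up with the answers to a., b. and c. above.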
Continuous Random Variables can be transformed linearly in two ways; each way may change the mean, standard deviation and variance:
1. Translation of origin: \(X \pm c\)
New mean: \(\mu \pm c\)
Standard Deviation DOES NOT CHANGE
Variance DOES NOT CHANGE
2. Scale Dilation: \(kX\)
New Mean: \(\mu \times k\)
Standard Deviation: \(|k|\sigma_x\)
Variance: \(k^2\sigma^2\)
3. Both combined: \(kX \pm c\)
New Mean: \(k\mu \pm c\)
As we can see, linear changes of CRVs work exactly the same as linear changes of DRVs.
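To see where these rules come from, here is a short sketch using the go-to formulas for \(\mu\) and \(\sigma^2\) from earlier. It assumes \(p(x)\) is the PDF of \(X\) (so its total area is 1) and takes the \(+c\) case; the \(-c\) case works the same way.

\(\begin{aligned} E(kX + c) &= \int_{-\infty}^{\infty}{(kx + c) p(x) \ dx} \\[7pt] &= k\int_{-\infty}^{\infty}{x p(x) \ dx} + c\int_{-\infty}^{\infty}{p(x) \ dx} \\[7pt] &= k\mu + c \\[15pt] \text{Var}(kX + c) &= \int_{-\infty}^{\infty}{\left( kx + c - (k\mu + c) \right)^2 p(x) \ dx} \\[7pt] &= k^2 \int_{-\infty}^{\infty}{(x - \mu)^2 p(x) \ dx} \\[7pt] &= k^2\sigma^2 \end{aligned} \)

The \(c\) cancels inside the bracket, which is why a translation never changes the variance or standard deviation.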
4) \(p(x) = \begin{cases} \frac{1}{k} & 2 \le x \le (k+2) \\ 0 & \text{otherwise} \end{cases}\)
a. Find \(k\) if the mean is 25. (Calculator Assumed)
\(\begin{aligned} \int_{2}^{k+2}{x \left( \frac{1}{k} \right) \ dx} &= 25 \\ k &= 46 \end{aligned} \)
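A calculator's solve function gets there directly, but the integral is also simple enough to do by hand. A sketch of the working:

\(\begin{aligned} \int_{2}^{k+2}{\frac{x}{k} \ dx} &= \left[ \frac{x^2}{2k} \right]_{2}^{k+2} \\[7pt] &= \frac{(k+2)^2 - 4}{2k} \\[7pt] &= \frac{k^2 + 4k}{2k} \\[7pt] &= \frac{k + 4}{2} \\[15pt] \frac{k+4}{2} &= 25 \\[5pt] k &= 46 \end{aligned} \)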
b. Let \(X\) have the PDF \(p(x)\) and let \(Y = 4X-2\). Find \(\mu_Y\).
\(\begin{aligned} p(x) &= \frac{1}{46} \\[15pt] \mu_Y &= 4\mu_X - 2 \\[5pt] &= 25(4) - 2 \\[5pt] &= 98 \end{aligned} \)
c. Considering b., find the variance of \(Y\):
\(\begin{aligned} \sigma_X^2 &= \int_{2}^{48}{\frac{(x-25)^2}{46} \ dx} \\[5pt] &= 176.3333 \\[15pt] \sigma_Y^2 &= 4^2\sigma_X^2 \\[5pt] &= 16 \times 176.3333 \\[5pt] &= 2821.3333 \\[5pt] &= 2821.33 \end{aligned} \)
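Here is a small Python sketch that re-checks question 4 with exact fractions. It assumes \(k = 46\) from part a. and uses the antiderivatives of \(\frac{x}{k}\) and \(\frac{(x-\mu)^2}{k}\) written out by hand; the variable names are my own.

```python
from fractions import Fraction

k = 46
lo, hi = Fraction(2), Fraction(k + 2)  # domain of the PDF: 2 <= x <= k + 2

# Mean of X: integral of x / k from lo to hi = (hi^2 - lo^2) / (2k).
mean_x = (hi**2 - lo**2) / (2 * k)

# Variance of X: integral of (x - mean)^2 / k
#   = ((hi - mean)^3 - (lo - mean)^3) / (3k).
var_x = ((hi - mean_x) ** 3 - (lo - mean_x) ** 3) / (3 * k)

# Linear change Y = 4X - 2: the mean is scaled then shifted,
# the variance is scaled by 4^2 and ignores the shift.
mean_y = 4 * mean_x - 2
var_y = 4**2 * var_x

print(mean_x, float(var_x))
print(mean_y, float(var_y))
```

Keeping everything as fractions avoids the small rounding error that creeps in when 176.3333 is multiplied by 16.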
© 2023-2023 Aaron Fonte, all rights reserved