Tasks 07-06 - Contingency Tables & Bayes

Section 07: Probability & Statistics

Problem 1: Bayes’ Theorem Basics (x)

A medical test has the following characteristics:

True positive rate: 90%
True negative rate: 95%
Base rate: 5% (5% of the population has the disease)

What is \(P(+|D)\)?
What is \(P(-|D')\)?
Calculate the Positive Predictive Value \(P(D|+)\)
Calculate the Negative Predictive Value \(P(D'|-)\)

Solution

\(P(+|D) = 0.90\) (this is the true positive rate)
\(P(-|D') = 0.95\) (this is the true negative rate)
Positive Predictive Value using Bayes’ theorem, with false positive rate \(P(+|D') = 1 - 0.95 = 0.05\) and \(P(D') = 1 - 0.05 = 0.95\): \[P(D|+) = \frac{P(+|D) \cdot P(D)}{P(+|D) \cdot P(D) + P(+|D') \cdot P(D')}\] \[= \frac{0.90 \times 0.05}{0.90 \times 0.05 + 0.05 \times 0.95}\] \[= \frac{0.045}{0.045 + 0.0475} = \frac{0.045}{0.0925} \approx 0.486\]
Negative Predictive Value, with false negative rate \(P(-|D) = 1 - 0.90 = 0.10\): \[P(-) = P(-|D) \cdot P(D) + P(-|D') \cdot P(D') = 0.10 \times 0.05 + 0.95 \times 0.95 = 0.005 + 0.9025 = 0.9075\] \[P(D'|-) = \frac{P(-|D') \cdot P(D')}{P(-)} = \frac{0.95 \times 0.95}{0.9075} = \frac{0.9025}{0.9075} \approx 0.994\]
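The two predictive values can be checked numerically. The following is a minimal Python sketch; the function name `predictive_values` and its parameter names are illustrative choices, not part of the task.

```python
def predictive_values(tpr, tnr, base_rate):
    """Return (PPV, NPV) for a test, using Bayes' theorem.

    tpr = P(+|D), tnr = P(-|D'), base_rate = P(D).
    Function and parameter names are illustrative assumptions.
    """
    p_pos = tpr * base_rate + (1 - tnr) * (1 - base_rate)  # P(+), total probability
    p_neg = 1 - p_pos                                      # P(-)
    ppv = tpr * base_rate / p_pos                          # P(D|+)
    npv = tnr * (1 - base_rate) / p_neg                    # P(D'|-)
    return ppv, npv


ppv, npv = predictive_values(tpr=0.90, tnr=0.95, base_rate=0.05)
print(f"PPV = {ppv:.3f}, NPV = {npv:.3f}")  # PPV = 0.486, NPV = 0.994
```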

Problem 2: Contingency Table Construction (xx)

A company surveyed 200 employees about their commute method and job satisfaction:

55% commute by car
40% are highly satisfied
30% commute by car AND are highly satisfied

Construct a complete contingency table
Find \(P(\text{Car}|\text{Highly Satisfied})\)
Find \(P(\text{Highly Satisfied}|\text{Car})\)
Are commute method and satisfaction independent?

Solution

Contingency Table:

	Car	Other	Total
High Sat	60	20	80
Lower Sat	50	70	120
Total	110	90	200

\(P(\text{Car}|\text{HS}) = \frac{60}{80} = 0.75\)
\(P(\text{HS}|\text{Car}) = \frac{60}{110} \approx 0.545\)
Independence check: \(P(\text{Car}) \times P(\text{HS}) = 0.55 \times 0.40 = 0.22\), whereas \(P(\text{Car} \cap \text{HS}) = \frac{60}{200} = 0.30\). Since \(0.22 \neq 0.30\), commute method and satisfaction are NOT independent.
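The table and the independence check can be reproduced in a few lines of Python. This is a sketch under the assumption that the survey percentages translate into whole-number counts; all variable names are illustrative.

```python
import math

n = 200
p_car, p_hs, p_car_and_hs = 0.55, 0.40, 0.30

# Fill the four cells from the joint and marginal counts.
car_hs = round(n * p_car_and_hs)              # 60
other_hs = round(n * p_hs) - car_hs           # 20
car_low = round(n * p_car) - car_hs           # 50
other_low = n - car_hs - other_hs - car_low   # 70

p_car_given_hs = car_hs / (car_hs + other_hs)  # 60/80
p_hs_given_car = car_hs / (car_hs + car_low)   # 60/110

# Independent only if the joint probability equals the product of the marginals.
independent = math.isclose(p_car_and_hs, p_car * p_hs)

print(car_hs, other_hs, car_low, other_low)
print(round(p_car_given_hs, 3), round(p_hs_given_car, 3), independent)
```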

Problem 3: Medical Testing - Full Analysis (xx)

A screening test for a disease has:

True positive rate = 85%
True negative rate = 92%
The disease affects 3% of the population (base rate)

Create a contingency table for a population of 10,000
Calculate the Positive Predictive Value directly from the table
Calculate the Negative Predictive Value directly from the table
If you test positive, how worried should you be? Interpret the Positive Predictive Value.

Solution

Table for 10,000 people:

Population: 300 with disease, 9,700 without

	Condition +	Condition −	Total
Predicted +	255	776	1,031
Predicted −	45	8,924	8,969
Total	300	9,700	10,000

Calculations:

True Positive: \(300 \times 0.85 = 255\)
False Negative: \(300 \times 0.15 = 45\)
True Negative: \(9700 \times 0.92 = 8924\)
False Positive: \(9700 \times 0.08 = 776\)

\(\text{Pos. Predictive Value} = \frac{255}{1031} \approx 0.247\) or 24.7%
\(\text{Neg. Predictive Value} = \frac{8924}{8969} \approx 0.995\) or 99.5%
Interpretation: A positive result means only about a 25% chance of actually having the disease. This seems counterintuitive given the test’s 85% true positive rate and 92% true negative rate, but it follows from the low base rate (3%): the 8% false positive rate applies to the large healthy group of 9,700 people, so 776 of the 1,031 positive results are false positives. A negative result is very reassuring (99.5% chance of being healthy).
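The expected counts in the table can also be generated programmatically. The sketch below assumes the rates apply exactly to the population, so every cell is a whole number; `screening_table` is a hypothetical helper name.

```python
def screening_table(population, tpr, tnr, base_rate):
    """Expected counts for a 2x2 screening table (illustrative helper)."""
    diseased = round(population * base_rate)
    healthy = population - diseased
    tp = round(diseased * tpr)   # true positives
    fn = diseased - tp           # false negatives
    tn = round(healthy * tnr)    # true negatives
    fp = healthy - tn            # false positives
    return tp, fp, fn, tn


tp, fp, fn, tn = screening_table(10_000, tpr=0.85, tnr=0.92, base_rate=0.03)
ppv = tp / (tp + fp)   # 255 / 1031
npv = tn / (tn + fn)   # 8924 / 8969
print(tp, fp, fn, tn)
print(f"PPV = {ppv:.3f}, NPV = {npv:.3f}")
```

Reading the predictive values off the counts, rather than plugging into Bayes’ theorem, mirrors the "directly from the table" approach the problem asks for.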

Problem 4: Factory Quality (xx)

A factory has two machines:

Machine A produces 70% of output, 4% defect rate
Machine B produces 30% of output, 6% defect rate

What is the overall defect rate?
A defective item is found. What’s the probability it came from Machine A?
Create a contingency table for 1000 items
Verify your answer for the probability that a defective item came from Machine A using the table

Solution

\(P(D) = P(D|A)P(A) + P(D|B)P(B) = 0.04(0.70) + 0.06(0.30) = 0.028 + 0.018 = 0.046 = 4.6\%\)
\(P(A|D) = \frac{P(D|A)P(A)}{P(D)} = \frac{0.04 \times 0.70}{0.046} = \frac{0.028}{0.046} \approx 0.609\)
Table for 1000 items:

	Machine A	Machine B	Total
Defective	28	18	46
Good	672	282	954
Total	700	300	1000

From table: \(P(A|D) = \frac{28}{46} \approx 0.609\) ✓
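Both routes to \(P(A|D)\), the formula and the table, can be compared in code. This is a minimal sketch; the dictionary layout and variable names are illustrative assumptions.

```python
# (share of output, defect rate) per machine, taken from the problem statement
machines = {"A": (0.70, 0.04), "B": (0.30, 0.06)}

# Law of total probability: P(D) = sum of P(D|machine) * P(machine)
p_defect = sum(share * rate for share, rate in machines.values())

# Bayes' theorem: P(machine|D) = P(D|machine) * P(machine) / P(D)
posterior = {name: share * rate / p_defect for name, (share, rate) in machines.items()}

# Cross-check with expected defective counts for 1000 items
items = 1000
defective = {name: round(items * share * rate) for name, (share, rate) in machines.items()}
from_table = defective["A"] / sum(defective.values())  # 28 / 46

print(f"P(D) = {p_defect:.3f}")
print(f"P(A|D) = {posterior['A']:.3f}, from table: {from_table:.3f}")
```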

Problem 5: Exam-Style Problem - 2025 Format (xxx)

In a city, a rapid test for a virus is available:

The test correctly identifies 92% of infected people (true positive rate)
The test correctly identifies 97% of non-infected people (true negative rate)
Currently 8% of the population is infected (base rate)

A person tests positive.

Calculate the probability that this person is actually infected (Positive Predictive Value).
Now suppose the base rate increases to 20% due to an outbreak. Recalculate the Positive Predictive Value.
Explain why the Positive Predictive Value changes with the base rate.
At what base rate would the Positive Predictive Value equal 80%? (Set up the equation and solve)

Solution

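Positive Predictive Value with 8% base rate, using false positive rate \(P(+|D') = 1 - 0.97 = 0.03\) and \(P(D') = 1 - 0.08 = 0.92\) (the second 0.92 below is \(P(D')\), which happens to equal the true positive rate): \[P(D|+) = \frac{0.92 \times 0.08}{0.92 \times 0.08 + 0.03 \times 0.92}\] \[= \frac{0.0736}{0.0736 + 0.0276} = \frac{0.0736}{0.1012} \approx 0.727\]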

Positive Predictive Value ≈ 72.7%

Positive Predictive Value with 20% base rate: \[P(D|+) = \frac{0.92 \times 0.20}{0.92 \times 0.20 + 0.03 \times 0.80}\] \[= \frac{0.184}{0.184 + 0.024} = \frac{0.184}{0.208} \approx 0.885\]

Positive Predictive Value ≈ 88.5%

Explanation: When the base rate increases, a larger proportion of the tested population actually has the disease. This means:

More true positives (infected people correctly identified)
Fewer false positives relative to true positives (same false positive rate but smaller healthy population)
Result: Higher Positive Predictive Value
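The dependence on the base rate is easy to see by evaluating the same formula at several base rates. In the sketch below, the function name `ppv` is an illustrative choice, and the base rates 1% and 50% are extra values added for comparison; only 8% and 20% come from the problem.

```python
def ppv(base_rate, tpr=0.92, tnr=0.97):
    """P(D|+) for the rapid test in this problem (defaults from the task)."""
    true_pos = tpr * base_rate                # P(+ and D)
    false_pos = (1 - tnr) * (1 - base_rate)   # P(+ and D')
    return true_pos / (true_pos + false_pos)


# 0.01 and 0.50 are illustrative extras, not part of the task.
for p in (0.01, 0.08, 0.20, 0.50):
    print(f"base rate {p:.0%}: PPV = {ppv(p):.1%}")
```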

Finding base rate for Positive Predictive Value = 0.80:

Let \(p\) = base rate

\[0.80 = \frac{0.92p}{0.92p + 0.03(1-p)}\]

\[0.80(0.92p + 0.03 - 0.03p) = 0.92p\] \[0.736p + 0.024 - 0.024p = 0.92p\] \[0.024 = 0.92p - 0.712p\] \[0.024 = 0.208p\] \[p = \frac{0.024}{0.208} \approx 0.115\]

Answer: A base rate of approximately 11.5% yields a Positive Predictive Value of 80%
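The same algebra can be carried out for any target value. Rearranging the equation for \(p\) gives a closed form, sketched here in Python; `base_rate_for_ppv` is a hypothetical helper name.

```python
def base_rate_for_ppv(target_ppv, tpr=0.92, tnr=0.97):
    """Base rate p at which P(D|+) equals target_ppv.

    Rearranging  target = tpr*p / (tpr*p + fpr*(1 - p))  for p gives
    p = target*fpr / (tpr*(1 - target) + target*fpr),  with fpr = 1 - tnr.
    """
    fpr = 1 - tnr
    return target_ppv * fpr / (tpr * (1 - target_ppv) + target_ppv * fpr)


p = base_rate_for_ppv(0.80)
print(f"p = {p:.3f}")  # p = 0.115
```

Substituting the result back into the Positive Predictive Value formula should return 0.80, which is a quick way to check the rearrangement.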

Problem 6: Exam-Style Problem - 2023 Format (xxx)

A company conducts employee surveys. Based on historical data:

60% of employees are satisfied with their job
Of satisfied employees, 75% recommend the company to others
Of unsatisfied employees, 20% still recommend the company

Create a contingency table for 500 employees
What proportion of employees recommend the company?
An employee recommends the company. What’s the probability they are satisfied?
Are satisfaction and recommendation independent? Justify with calculations.

Solution

Contingency Table for 500 employees:

300 satisfied, 200 unsatisfied

	Satisfied	Unsatisfied	Total
Recommend	225	40	265
Don’t Recommend	75	160	235
Total	300	200	500

Calculations:

Satisfied & Recommend: \(300 \times 0.75 = 225\)
Satisfied & Don’t: \(300 \times 0.25 = 75\)
Unsatisfied & Recommend: \(200 \times 0.20 = 40\)
Unsatisfied & Don’t: \(200 \times 0.80 = 160\)

\(P(\text{Recommend}) = \frac{265}{500} = 0.53 = 53\%\)
\(P(\text{Satisfied}|\text{Recommend}) = \frac{225}{265} \approx 0.849\)
Independence test: \(P(\text{S}) \times P(\text{R}) = 0.60 \times 0.53 = 0.318\), whereas \(P(\text{S} \cap \text{R}) = \frac{225}{500} = 0.45\)

Since \(0.318 \neq 0.45\), the events are NOT independent.

This makes sense: satisfied employees are much more likely to recommend (75% vs 20%).
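The whole solution can be rebuilt from the three given percentages. This is a minimal sketch that assumes the percentages apply exactly to the 500 employees; all variable names are illustrative.

```python
import math

n = 500
p_sat = 0.60
p_rec_given_sat, p_rec_given_unsat = 0.75, 0.20

satisfied = round(n * p_sat)                         # 300
unsatisfied = n - satisfied                          # 200
sat_rec = round(satisfied * p_rec_given_sat)         # 225
unsat_rec = round(unsatisfied * p_rec_given_unsat)   # 40

p_rec = (sat_rec + unsat_rec) / n                    # 265/500
p_sat_given_rec = sat_rec / (sat_rec + unsat_rec)    # 225/265

# Independent only if P(S and R) equals P(S) * P(R).
independent = math.isclose(sat_rec / n, p_sat * p_rec)

print(p_rec, round(p_sat_given_rec, 3), independent)
```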