
Section 07: Probability & Statistics
From a contingency table, how do you calculate \(P(A|B)\)?
In a table with 200 total, 80 in category A, 60 in category B, and 30 in both. Find \(P(A|B)\).
How do you test if two variables are independent using a table?
A company has 1000 employees: 600 full-time, 400 with degrees, 280 full-time with degrees. Build the table.
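The table questions above can be checked numerically. The following is a minimal Python sketch, not a prescribed exam method; the helper names `conditional` and `is_independent` are illustrative assumptions. It uses the counts from the 200-total table and completes the employee table from its row and column totals.

```python
from math import isclose

def conditional(n_both, n_given):
    """P(A|B) from counts: the 'A and B' cell divided by the total for B."""
    return n_both / n_given

def is_independent(n_a, n_b, n_both, total):
    """A and B are independent exactly when P(A and B) = P(A) * P(B)."""
    return isclose(n_both / total, (n_a / total) * (n_b / total))

# Table with 200 total, 80 in A, 60 in B, 30 in both
p_a_given_b = conditional(30, 60)            # 30 / 60 = 0.5
indep_ab = is_independent(80, 60, 30, 200)   # compares 0.15 with 0.4 * 0.3

# Employee table: fill the remaining cells from the margins
total, full_time, degree, ft_degree = 1000, 600, 400, 280
table = {
    ("full-time", "degree"): ft_degree,
    ("full-time", "no degree"): full_time - ft_degree,
    ("not full-time", "degree"): degree - ft_degree,
    ("not full-time", "no degree"): total - full_time - degree + ft_degree,
}
indep_employees = is_independent(full_time, degree, ft_degree, total)

print(p_a_given_b, indep_ab)
print(table, indep_employees)
```

The independence check compares the joint relative frequency with the product of the two marginal relative frequencies, which is the same comparison one would do by hand from the table.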
Let’s clarify contingency-table logic before distributions.
Binomial distribution problems appear on every Feststellungsprüfung!
Question: Discuss with a partner: What is the key decision rule from this part, and where can students confuse it on the exam?
Binomial Experiment Conditions
Examples:
- Flipping a coin 10 times (heads = success)
- Testing 50 products (defective = success)
- Surveying 100 customers (satisfied = success)
Binomial Distribution
\[P(X = k) = \binom{n}{k} p^k (1-p)^{n-k}\]
Where:
- \(n\) = number of trials
- \(k\) = number of successes
- \(p\) = probability of success on each trial
- \(\binom{n}{k} = \frac{n!}{k!(n-k)!}\) = number of ways to choose which \(k\) of the \(n\) trials are successes
\[P(X = k) = \underbrace{\binom{n}{k}}_{\text{arrangements}} \times \underbrace{p^k}_{\text{k successes}} \times \underbrace{(1-p)^{n-k}}_{\text{n-k failures}}\]
Example: In 5 coin flips, \(P(\text{exactly 3 heads})\)?
\[P(X=3) = \binom{5}{3} \times (0.5)^3 \times (0.5)^2 = 10 \times 0.125 \times 0.25 = 0.3125\]
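The formula translates almost symbol for symbol into code. A minimal sketch using only the Python standard library; the function name `binomial_pmf` is an illustrative choice, not part of the course material.

```python
from math import comb

def binomial_pmf(k, n, p):
    """P(X = k) for X ~ Binomial(n, p): arrangements * successes * failures."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

# Exactly 3 heads in 5 fair coin flips
print(binomial_pmf(3, 5, 0.5))  # prints 0.3125
```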

Question: Discuss with a partner: What is the key decision rule from this part, and where can students confuse it on the exam?
| Question Type | Formula |
|---|---|
| Exactly k | \(P(X = k)\) |
| At most k | \(P(X \leq k) = \sum_{i=0}^{k} P(X=i)\) |
| At least k | \(P(X \geq k) = 1 - P(X < k) = 1 - P(X \leq k-1)\) |
For “at least” problems, use the complement rule!
| Phrase | Mathematical form |
|---|---|
| exactly \(k\) | \(P(X=k)\) |
| at most \(k\) | \(P(X \le k)\) |
| at least \(k\) | \(P(X \ge k)=1-P(X\le k-1)\) |
| between \(a\) and \(b\) (inclusive) | \(P(a \le X \le b)\) |
| first success on trial \(n\) | geometric: \((1-p)^{n-1}p\) |
Always check whether endpoints are included in words like “between”, “at most”, and “at least”.
Work individually
Choose binomial or geometric for each:
A machine produces items with 8% defect rate. In a batch of 15 items:
\[P(X=2) = \binom{15}{2} (0.08)^2 (0.92)^{13} = 105 \times 0.0064 \times 0.338 \approx 0.227\]
\[P(X \leq 1) = P(X=0) + P(X=1)\] \[= \binom{15}{0}(0.08)^0(0.92)^{15} + \binom{15}{1}(0.08)^1(0.92)^{14}\] \[\approx 0.2863 + 0.3734 \approx 0.660\]
\[P(X \geq 2) = 1 - P(X \leq 1) \approx 1 - 0.660 = 0.340\]
\[P(1 \leq X \leq 3) = P(X=1) + P(X=2) + P(X=3)\] \[\approx 0.373 + 0.227 + 0.086 = 0.686\]
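Hand calculations with rounded intermediate factors can drift in the third decimal, so it can help to recompute the four question types for the batch of 15 items directly. This is a sketch under the stated values \(n=15\), \(p=0.08\); the helper names are illustrative assumptions.

```python
from math import comb

def binomial_pmf(k, n, p):
    """P(X = k) for X ~ Binomial(n, p)."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

def binomial_cdf(k, n, p):
    """P(X <= k): sum the pmf from 0 up to and including k."""
    return sum(binomial_pmf(i, n, p) for i in range(k + 1))

n, p = 15, 0.08
exactly_2 = binomial_pmf(2, n, p)                              # exactly k
at_most_1 = binomial_cdf(1, n, p)                              # at most k
at_least_2 = 1 - binomial_cdf(1, n, p)                         # complement rule
between_1_and_3 = binomial_cdf(3, n, p) - binomial_cdf(0, n, p)  # inclusive

for name, value in [("P(X=2)", exactly_2), ("P(X<=1)", at_most_1),
                    ("P(X>=2)", at_least_2), ("P(1<=X<=3)", between_1_and_3)]:
    print(f"{name} = {value:.4f}")
```

Note how the inclusive "between" case subtracts \(P(X \le 0)\), not \(P(X \le 1)\), so that the endpoint \(X=1\) stays in the range.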
Binomial Mean and Variance
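For \(X \sim \text{Binomial}(n,p)\): \[\mu = np, \qquad \sigma^2 = np(1-p), \qquad \sigma = \sqrt{np(1-p)}\]
Example: If \(n=100\) and \(p=0.3\):
- Expected successes: \(\mu = 100 \times 0.3 = 30\)
- Standard deviation: \(\sigma = \sqrt{100 \times 0.3 \times 0.7} = \sqrt{21} \approx 4.58\)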
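The same example can be reproduced in a few lines. A minimal sketch; `binomial_mean_sd` is a hypothetical helper name chosen for illustration.

```python
from math import sqrt

def binomial_mean_sd(n, p):
    """Return (mean, standard deviation) of X ~ Binomial(n, p)."""
    mean = n * p
    sd = sqrt(n * p * (1 - p))
    return mean, sd

mean, sd = binomial_mean_sd(100, 0.3)
print(round(mean, 2), round(sd, 2))
```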
Question: What changes in probability models when we sample without replacement?
The binomial assumes independent trials (with replacement). But what if we draw without replacement from a finite population?
Hypergeometric Setup
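Use the hypergeometric distribution when:
- the population is finite,
- items are drawn without replacement,
- each item counts as either a success or a failure.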
Probability of Exactly \(k\) Successes
\[P(X=k)=\frac{\binom{K}{k}\binom{N-K}{n-k}}{\binom{N}{n}}\]
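Where:
- \(N\) = population size
- \(K\) = number of successes in the population
- \(n\) = sample size
- \(k\) = number of successes in the sample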
| Model | Typical wording | Replacement | Independence |
|---|---|---|---|
| Binomial | “in 20 trials” | with replacement / approximately constant \(p\) | yes |
| Hypergeometric | “from a lot of 200 items” | without replacement | no |
If sampling is without replacement from a finite population, do not use binomial automatically. Check whether hypergeometric is the correct model.
A lot has \(N=50\) items, with \(K=8\) defective. Sample \(n=5\) items without replacement.
Question: What is \(P(X=2)\), the probability of exactly 2 defectives?
\[P(X=2)=\frac{\binom{8}{2}\binom{42}{3}}{\binom{50}{5}}\]
\[=\frac{28\cdot11{,}480}{2{,}118{,}760}\approx0.152\]
From 18 applicants, 7 are international. Choose 5 without replacement.
Question: What is the probability of exactly 2 international applicants?
\[P(X=2)=\frac{\binom{7}{2}\binom{11}{3}}{\binom{18}{5}}\]
\[=\frac{3{,}465}{8{,}568}\approx0.404\]
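Both worked examples can be verified with a short function built on `math.comb`. This is a sketch; the name `hypergeom_pmf` and its argument order are illustrative assumptions that mirror the symbols \(N\), \(K\), \(n\), \(k\) used in the formula.

```python
from math import comb

def hypergeom_pmf(k, N, K, n):
    """P(X = k): k successes in a sample of n drawn without replacement
    from a population of N items that contains K successes."""
    return comb(K, k) * comb(N - K, n - k) / comb(N, n)

lot = hypergeom_pmf(2, N=50, K=8, n=5)          # 2 defectives in the lot sample
applicants = hypergeom_pmf(2, N=18, K=7, n=5)   # 2 international among 5 chosen

print(f"{lot:.3f} {applicants:.3f}")
```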
If the population is very large compared with the sample, hypergeometric probabilities are often close to binomial with \(p=K/N\).
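To see this approximation at work, one can compare the two models for the same sample size and the same success share \(p=K/N\) in a small and a large population. The sketch below is illustrative only: the large lot (\(N=5000\), \(K=800\)) is a made-up scaling of the 50-item lot, and `max_gap` is a hypothetical helper that reports the largest difference between the two probability functions.

```python
from math import comb

def hypergeom_pmf(k, N, K, n):
    """P(X = k) when sampling n items without replacement."""
    return comb(K, k) * comb(N - K, n - k) / comb(N, n)

def binomial_pmf(k, n, p):
    """P(X = k) for n independent trials with success probability p."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

def max_gap(N, K, n):
    """Largest |hypergeometric - binomial| over k = 0..n, with p = K/N."""
    p = K / N
    return max(abs(hypergeom_pmf(k, N, K, n) - binomial_pmf(k, n, p))
               for k in range(n + 1))

small_lot_gap = max_gap(N=50, K=8, n=5)       # sample is 10% of the lot
large_lot_gap = max_gap(N=5000, K=800, n=5)   # sample is 0.1% of the lot

print(f"small lot: {small_lot_gap:.5f}, large lot: {large_lot_gap:.5f}")
```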
Question: Discuss with a partner: What is the key decision rule from this part, and where can students confuse it on the exam?
Geometric Distribution
The probability that the first success occurs on trial \(n\):
\[P(X = n) = (1-p)^{n-1} \cdot p\]
Where \(p\) = probability of success on each trial.
Example: A salesperson has a 20% chance of making a sale on each call. What’s the probability the first sale is on the 4th call?
\[P(X=4) = (0.8)^3 \times 0.2 = 0.512 \times 0.2 = 0.1024\]
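As a quick check of the salesperson example, here is a minimal sketch; `geometric_pmf` is an illustrative name.

```python
def geometric_pmf(n, p):
    """P(X = n): n - 1 failures followed by the first success on trial n."""
    return (1 - p)**(n - 1) * p

first_sale_on_4th = geometric_pmf(4, 0.2)
print(round(first_sale_on_4th, 4))
```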
Expected Number of Trials
For the geometric distribution: \[E[X] = \frac{1}{p}\]
Example: If success probability is 0.25, on average how many trials until first success?
\[E[X] = \frac{1}{0.25} = 4 \text{ trials}\]
A machine produces defective items with probability 0.05.
\[P(X=10) = (0.95)^9 \times 0.05 \approx 0.630 \times 0.05 \approx 0.0315\]
\[P(X \leq 5) = 1 - P(\text{no defective in first 5}) = 1 - (0.95)^5\] \[= 1 - 0.774 = 0.226\]
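The two machine calculations correspond to the point probability and the cumulative probability of the geometric model. A sketch with illustrative helper names; the last line cross-checks the shortcut \(1-(1-p)^n\) against the explicit sum of point probabilities.

```python
def geometric_pmf(n, p):
    """P(X = n): first success exactly on trial n."""
    return (1 - p)**(n - 1) * p

def geometric_cdf(n, p):
    """P(X <= n): at least one success within the first n trials."""
    return 1 - (1 - p)**n

p = 0.05
first_defect_on_10th = geometric_pmf(10, p)
defect_within_5 = geometric_cdf(5, p)
summed = sum(geometric_pmf(i, p) for i in range(1, 6))  # same event, term by term

print(f"{first_defect_on_10th:.4f} {defect_within_5:.4f} {summed:.4f}")
```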
Question: How many items must be checked so that the probability of finding at least one defective item is at least 95%?
Given \(p=0.05\) defective probability per item:
\[1-(1-p)^n \geq 0.95\]
\[1-(0.95)^n \geq 0.95 \Rightarrow (0.95)^n \leq 0.05\]
Take logarithms:
\[n\ln(0.95) \leq \ln(0.05)\]
Because \(\ln(0.95)<0\), the inequality direction flips when dividing:
\[n \geq \frac{\ln(0.05)}{\ln(0.95)} \approx 58.4\]
So the minimum whole number is:
\[n=59\]
Always round up for minimum required sample size.
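The logarithm step and the rounding rule can be wrapped in a small function. This is a sketch, not a required method: `min_trials_log` follows the derivation above, while `min_trials_search` is an added brute-force cross-check that simply increases \(n\) until the condition holds. Both names are illustrative assumptions.

```python
from math import ceil, log

def min_trials_log(p, target):
    """Smallest n with 1 - (1 - p)**n >= target, via n >= ln(1-target)/ln(1-p)."""
    return ceil(log(1 - target) / log(1 - p))

def min_trials_search(p, target):
    """Same quantity found by trying n = 1, 2, 3, ... in turn."""
    n = 1
    while 1 - (1 - p)**n < target:
        n += 1
    return n

# Defect probability 0.05 per item, at least 95% chance of one defective
print(min_trials_log(0.05, 0.95), min_trials_search(0.05, 0.95))
```

If the quotient happens to land extremely close to a whole number, floating-point rounding could affect `ceil`, which is one reason to keep the search version as a sanity check.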
Connection: Geometric and Binomial Complement
This same setup can be seen as:
\[P(\text{at least one defective in } n \text{ trials}) = 1 - P(X=0)\]
with \(X \sim \text{Binomial}(n,p)\), so
\[1 - (1-p)^n\]
This matches the geometric waiting-time interpretation.
A call center has conversion probability \(p=0.12\) per call.
Find the minimum number of calls needed so that the probability of at least one conversion is at least 90%.
Question: Which phrase signals a waiting-time model, and which phrase signals a fixed-trials model?
For \(X \sim \text{Geometric}(p)\), the tail and cumulative probabilities are:
\[P(X>n)=(1-p)^n, \qquad P(X\le n)=1-(1-p)^n\]
These are very useful for “within first \(n\) trials” and “no success yet” questions.
Memoryless
For geometric \(X\):
\[P(X>m+n\mid X>m)=P(X>n)\]
If no success has happened yet, the process restarts probabilistically.
A support agent closes tickets with probability \(p=0.25\) per call.
Question: Given no closure in the first 4 calls, what is the probability of no closure in the next 3 calls?
By the memoryless property:
\[P(X>7\mid X>4)=P(X>3)=(0.75)^3\approx0.422\]
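The memoryless property can also be confirmed by computing the conditional probability from its definition. A minimal sketch; `geometric_tail` is an illustrative helper for \(P(X>n)=(1-p)^n\).

```python
from math import isclose

def geometric_tail(n, p):
    """P(X > n): no success in the first n trials."""
    return (1 - p)**n

p = 0.25
# Definition of conditional probability: P(X > 7 | X > 4) = P(X > 7) / P(X > 4),
# because the event X > 7 is contained in the event X > 4.
conditional = geometric_tail(7, p) / geometric_tail(4, p)
unconditional = geometric_tail(3, p)

print(round(conditional, 3), round(unconditional, 3), isclose(conditional, unconditional))
```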
These are different events.
Question: Why do we standardize to \(Z\) instead of using many different normal tables?
If \(X \sim N(\mu,\sigma)\), then
\[Z=\frac{X-\mu}{\sigma} \sim N(0,1)\]
Use this to convert any normal probability into a standard normal probability.
| Question | Convert to Z-form |
|---|---|
| \(P(X\le a)\) | \(P\left(Z\le\frac{a-\mu}{\sigma}\right)\) |
| \(P(X\ge a)\) | \(1-P\left(Z\le\frac{a-\mu}{\sigma}\right)\) |
| \(P(a\le X\le b)\) | \(P\left(\frac{a-\mu}{\sigma}\le Z\le\frac{b-\mu}{\sigma}\right)\) |
Delivery times are normal with mean 30 min and standard deviation 4 min.
Question: Find \(P(X\le 34)\).
\[z=\frac{34-30}{4}=1\]
\[P(X\le34)=P(Z\le1)\approx0.8413\]
Using the same distribution, find \(P(28\le X\le36)\).
\[z_1=\frac{28-30}{4}=-0.5,\quad z_2=\frac{36-30}{4}=1.5\]
\[P(28\le X\le36)=P(-0.5\le Z\le1.5)\]
\[\approx 0.9332-0.3085=0.6247\]
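Without a printed table, the same values can be obtained from Python's standard library class `statistics.NormalDist`. The sketch below computes the delivery-time probabilities both directly and via the standardized \(Z\) value, to show that the two routes agree; the variable names are illustrative.

```python
from statistics import NormalDist

delivery = NormalDist(mu=30, sigma=4)   # X: delivery time in minutes
standard = NormalDist()                 # Z ~ N(0, 1)

# P(X <= 34), directly and after standardizing
p_le_34 = delivery.cdf(34)
z = (34 - 30) / 4
p_le_34_via_z = standard.cdf(z)

# P(28 <= X <= 36) as a difference of two cumulative probabilities
p_between = delivery.cdf(36) - delivery.cdf(28)

print(f"{p_le_34:.4f} {p_le_34_via_z:.4f} {p_between:.4f}")
```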
Exam scores are normal with \(\mu=70\), \(\sigma=10\).
Question: What score marks the top 10%?
Need \(P(X\le x)=0.90\), so \(z_{0.90}\approx1.2816\).
\[x=\mu+z\sigma=70+1.2816\cdot10\approx82.8\]
So the cutoff is about 83 points.
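The reverse lookup (from probability to value) is available as `NormalDist.inv_cdf`. A short sketch for the exam-score cutoff, with illustrative variable names:

```python
from statistics import NormalDist

scores = NormalDist(mu=70, sigma=10)

z_90 = NormalDist().inv_cdf(0.90)   # z with P(Z <= z) = 0.90
cutoff = scores.inv_cdf(0.90)       # score with 90% of students at or below it
cutoff_via_z = 70 + z_90 * 10       # x = mu + z * sigma

print(f"{z_90:.4f} {cutoff:.1f} {cutoff_via_z:.1f}")
```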
Work individually
If \(X\sim N(100,15)\):
A multiple choice test has 20 questions with 4 options each. A student guesses randomly on all questions.
A company’s call center receives calls with 15% conversion rate.
Work individually, then compare
A process has success probability \(p=0.10\) per trial.
Think individually (2 min), pair (3 min), then work in groups of 3-4 and share
A sales rep has conversion probability \(p=0.18\) per call.
Rate your confidence for today’s goals on a 1-5 scale (1 = not confident, 5 = exam-ready):
Work individually
Homework
Complete Tasks 07-07 - focus on binomial calculation practice!
Session 07-07 - Binomial & Geometric Distributions | Dr. Nikolai Heinrichs & Dr. Tobias Vlćek