Probability Basics
What
Probability measures how likely an event is, from 0 (impossible) to 1 (certain).
Key ideas
- P(A): probability of event A
- P(A|B): probability of A given B happened — conditional probability
- P(A ∩ B): probability of both A and B
- P(A ∪ B): probability of A or B (or both)
- Independence: P(A ∩ B) = P(A) × P(B) — knowing B tells you nothing about A
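A quick way to make these ideas concrete is to enumerate a small sample space. The example below (two fair dice, made up for illustration) checks the independence condition P(A ∩ B) = P(A) × P(B) by counting outcomes:

```python
import itertools

# Hypothetical example: two fair six-sided dice, so all 36 ordered
# pairs are equally likely.
outcomes = list(itertools.product(range(1, 7), range(1, 7)))
p = 1 / len(outcomes)  # probability of each single outcome

# Event A: first die is even. Event B: the two dice sum to 7.
A = {o for o in outcomes if o[0] % 2 == 0}
B = {o for o in outcomes if sum(o) == 7}

p_A = len(A) * p              # 18/36 = 0.5
p_B = len(B) * p              # 6/36
p_AB = len(A & B) * p         # P(A ∩ B): 3/36

# Independence: P(A ∩ B) == P(A) × P(B)
print(abs(p_AB - p_A * p_B) < 1e-12)  # True — these events are independent
```

Knowing the sum is 7 doesn't change the odds that the first die is even, which is exactly what the product test verifies.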
Rules
- Sum rule: P(A) = Σᵢ P(A, Bᵢ) — marginalize over a partition of events Bᵢ
- Product rule: P(A, B) = P(A|B) × P(B)
- Bayes’ rule: P(A|B) = P(B|A) × P(A) / P(B)
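The sum and product rules combine naturally in Bayes' rule. A sketch with made-up numbers (a diagnostic test for a rare condition — all rates are hypothetical):

```python
# Hypothetical numbers: 1% prevalence, 99% true-positive rate,
# 5% false-positive rate.
p_D = 0.01                 # P(disease)
p_pos_given_D = 0.99       # P(positive | disease)
p_pos_given_noD = 0.05     # P(positive | no disease)

# Sum rule for the denominator: P(positive) = Σ P(positive, Dᵢ)
p_pos = p_pos_given_D * p_D + p_pos_given_noD * (1 - p_D)

# Bayes' rule: P(disease | positive) = P(positive | disease) × P(disease) / P(positive)
p_D_given_pos = p_pos_given_D * p_D / p_pos
print(round(p_D_given_pos, 3))  # → 0.167
```

Even with an accurate test, a positive result leaves only about a 17% chance of disease, because the prior P(disease) is so small.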
Joint vs marginal vs conditional
- Joint: P(X=x, Y=y) — probability of both
- Marginal: P(X=x) — sum out the other variable
- Conditional: P(X=x | Y=y) — probability of X given Y
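The three views above can be read off one table. A minimal sketch, using a made-up 2×2 joint distribution over binary X and Y:

```python
# Hypothetical joint distribution: joint[x][y] = P(X=x, Y=y).
# The four entries sum to 1.
joint = [[0.3, 0.1],
         [0.2, 0.4]]

# Marginal: sum out Y -> P(X=x) = Σ_y P(X=x, Y=y)
p_X = [sum(row) for row in joint]                      # [0.4, 0.6]

# Conditional via the product rule: P(X=x | Y=1) = P(X=x, Y=1) / P(Y=1)
p_Y1 = joint[0][1] + joint[1][1]                       # P(Y=1) = 0.5
p_X_given_Y1 = [joint[x][1] / p_Y1 for x in (0, 1)]    # [0.2, 0.8]

print(p_X, p_X_given_Y1)
```

Note how conditioning reweights the Y=1 column so it sums to 1, while marginalizing simply collapses it.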
Why it matters in ML
- Classification outputs are probabilities: P(cat | image)
- Bayesian methods update beliefs with data
- Generative models learn probability distributions
- Sampling from learned distributions is how models generate text and images
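The last point can be sketched in a few lines: at each step, a language model produces a categorical distribution over its vocabulary and samples from it. The vocabulary and probabilities below are invented for illustration:

```python
import random

# Toy "next-token" distribution — a stand-in for a model's softmax output.
vocab = ["the", "cat", "sat", "mat"]
probs = [0.5, 0.2, 0.2, 0.1]  # must sum to 1

random.seed(0)  # fixed seed for reproducibility
tokens = random.choices(vocab, weights=probs, k=5)
print(tokens)
```

Real generators repeat this draw step by step, feeding each sampled token back in; the sampling itself is just this categorical draw.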