Probability Basics

What

Probability measures how likely an event is, from 0 (impossible) to 1 (certain).

Key ideas

  • P(A): probability of event A
  • P(A|B): probability of A given B happened — conditional probability
  • P(A ∩ B): probability of both A and B
  • P(A ∪ B): probability of A or B (or both)
  • Independence: P(A ∩ B) = P(A) × P(B) — knowing B tells you nothing about A
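The definitions above can be checked by brute-force enumeration. A minimal sketch using two fair dice (the events A and B are made up for illustration): A is "first die is even", B is "second die shows 6".

```python
from itertools import product

# Enumerate all 36 equally likely outcomes of rolling two fair dice.
outcomes = list(product(range(1, 7), repeat=2))

# Event A: first die is even. Event B: second die shows a 6.
A = {o for o in outcomes if o[0] % 2 == 0}
B = {o for o in outcomes if o[1] == 6}

def p(event):
    # Probability of an event = favorable outcomes / total outcomes.
    return len(event) / len(outcomes)

print(p(A))      # → 0.5
print(p(B))      # 6/36 = 1/6
print(p(A & B))  # 3/36 = 1/12, which equals P(A) × P(B): A and B are independent
```

Because P(A ∩ B) = P(A) × P(B) here, knowing the second die's value tells you nothing about the first, matching the independence definition.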

Rules

  • Sum rule: P(A) = Σᵢ P(A, Bᵢ) — marginalize over all outcomes of B
  • Product rule: P(A, B) = P(A|B) × P(B)
  • Bayes’ rule: P(A|B) = P(B|A) × P(A) / P(B)
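All three rules show up in the classic diagnostic-test calculation. The numbers below are hypothetical, chosen only to illustrate: a 1% prevalence, 95% sensitivity, 5% false-positive rate.

```python
# Hypothetical numbers for illustration only.
p_disease = 0.01                # P(A): prior probability of disease
p_pos_given_disease = 0.95      # P(B|A): test sensitivity
p_pos_given_healthy = 0.05      # false-positive rate

# Sum rule: marginalize over the two hypotheses to get P(positive).
p_pos = (p_pos_given_disease * p_disease
         + p_pos_given_healthy * (1 - p_disease))

# Bayes' rule: P(disease | positive) = P(positive | disease) * P(disease) / P(positive)
p_disease_given_pos = p_pos_given_disease * p_disease / p_pos
print(round(p_disease_given_pos, 3))  # → 0.161
```

Note how unintuitive the result is: even with a 95%-sensitive test, a positive result implies only about a 16% chance of disease, because the prior P(A) is so small.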

Joint vs marginal vs conditional

  • Joint: P(X=x, Y=y) — probability of both
  • Marginal: P(X=x) — sum out the other variable
  • Conditional: P(X=x | Y=y) — probability of X given Y
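The three quantities are easy to compute from a joint table. A sketch with a made-up joint distribution over two variables (Weather and Umbrella, values invented for illustration):

```python
# Made-up joint distribution P(X, Y): X = weather, Y = carries umbrella.
joint = {
    ("rain", "yes"): 0.25, ("rain", "no"): 0.05,
    ("sun",  "yes"): 0.10, ("sun",  "no"): 0.60,
}

# Marginal P(X=rain): sum out Y.
p_rain = sum(p for (x, y), p in joint.items() if x == "rain")  # 0.25 + 0.05 = 0.30

# Conditional P(X=rain | Y=yes) = P(rain, yes) / P(yes).
p_yes = sum(p for (x, y), p in joint.items() if y == "yes")    # 0.25 + 0.10 = 0.35
p_rain_given_yes = joint[("rain", "yes")] / p_yes              # 0.25 / 0.35 ≈ 0.714
print(round(p_rain, 2), round(p_rain_given_yes, 3))
```

Conditioning renormalizes: you restrict to the rows where Y=yes, then divide by their total so the probabilities sum to 1 again.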

Why it matters in ML

  • Classification outputs are probabilities: P(cat | image)
  • Bayesian methods update beliefs with data
  • Generative models learn probability distributions
  • Sampling from distributions is how models generate text and images
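The last point can be sketched in a few lines. A toy next-token distribution (tokens and probabilities invented for illustration) sampled with the standard library; over many draws the sample frequencies approach the underlying probabilities.

```python
import random

random.seed(0)  # fixed seed so the sketch is reproducible

# A toy next-token distribution, like a language model's output layer.
tokens = ["the", "cat", "sat", "mat"]
probs = [0.5, 0.2, 0.2, 0.1]

# Draw 10,000 samples; random.choices samples with the given weights.
samples = random.choices(tokens, weights=probs, k=10_000)

freq_the = samples.count("the") / len(samples)
print(freq_the)  # close to 0.5, the probability assigned to "the"
```

Real generative models do the same thing at scale: compute a distribution over the next token (or pixel), then sample from it.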