NumPy Essentials
What
NumPy = Numerical Python. It is the foundation of most of the Python ML stack and provides fast n-dimensional arrays and vectorized operations.
Why it matters
- pandas and scikit-learn build directly on NumPy, and PyTorch tensors mirror its array concepts
- Vectorized ops are often 10-100x faster than equivalent Python loops on large arrays
- Understanding array shapes is essential for debugging ML code
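To make the speed claim concrete, here is a minimal sketch that computes the same sum of squares with a Python loop and with a vectorized expression. The function names and the array size are illustrative assumptions, and the measured ratio will vary with hardware, array size, and NumPy version.

```python
import time
import numpy as np

def sum_of_squares_loop(values):
    """Pure-Python loop: one interpreter step per element."""
    total = 0.0
    for v in values:
        total += v * v
    return total

def sum_of_squares_vectorized(arr):
    """Vectorized: the per-element work runs in compiled code inside NumPy."""
    return float(np.sum(arr * arr))

data = np.arange(200_000, dtype=np.float64)
as_list = data.tolist()  # plain Python floats, so the loop is a fair baseline

start = time.perf_counter()
loop_result = sum_of_squares_loop(as_list)
loop_seconds = time.perf_counter() - start

start = time.perf_counter()
vec_result = sum_of_squares_vectorized(data)
vec_seconds = time.perf_counter() - start

print(f"loop: {loop_seconds:.4f}s  vectorized: {vec_seconds:.4f}s")
```

Both versions return the same value up to floating-point rounding; only the time differs.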
Key concepts
Creating arrays
import numpy as np
np.array([1, 2, 3]) # from list
np.zeros((3, 4)) # 3×4 matrix of zeros
np.ones((2, 3)) # 2×3 matrix of ones
np.random.randn(5, 3) # 5×3 random normal
np.arange(0, 10, 2) # [0, 2, 4, 6, 8]
np.linspace(0, 1, 5) # 5 evenly spaced points [0, 0.25, 0.5, 0.75, 1]
Shape operations
a = np.random.randn(3, 4, 5)
a.shape # (3, 4, 5)
a.reshape(3, 20) # reshape to (3, 20); total elements must match (60 here)
a.T # transpose: reverses the axes, shape (5, 4, 3)
a.flatten() # 1D copy, shape (60,)
a[np.newaxis, :] # add dimension: (1, 3, 4, 5)
Indexing and slicing
a = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
a[0] # first row: [1, 2, 3]
a[:, 1] # second column: [2, 5, 8]
a[1:, :2] # rows 1+, first 2 cols: [[4, 5], [7, 8]]
a[a > 5] # boolean indexing: [6, 7, 8, 9]
Vectorized operations
a = np.array([1, 2, 3])
b = np.array([4, 5, 6])
a + b # [5, 7, 9]
a * b # [4, 10, 18] element-wise
a @ b # 32 (dot product)
np.sum(a) # 6
np.mean(a) # 2.0
np.std(a) # 0.816... (population std, ddof=0)
np.exp(a) # element-wise exponential
Broadcasting
When shapes differ but are compatible, NumPy virtually stretches the smaller array to match, without copying data:
M = np.ones((3, 4)) # shape (3, 4)
v = np.array([1, 2, 3, 4]) # shape (4,)
M + v # v broadcasts to (3, 4): it is added to each row
Rules: align shapes right-to-left. Each pair of dimensions must be equal, or one of them must be 1. Missing leading dimensions are treated as 1.
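The rules can be checked on a few shape pairs. The sketch below reuses `M` and `v` from above; `col` and the other variable names are added for illustration only. It assumes NumPy 1.20 or later, where `np.broadcast_shapes` is available.

```python
import numpy as np

M = np.ones((3, 4))                  # shape (3, 4)
v = np.array([1, 2, 3, 4])           # shape (4,)
col = np.array([[10], [20], [30]])   # shape (3, 1), illustrative column vector

row_sum = M + v     # (3, 4) + (4,)   -> (3, 4): v is added to every row
col_sum = M + col   # (3, 4) + (3, 1) -> (3, 4): col is added to every column
outer = col + v     # (3, 1) + (4,)   -> (3, 4): both operands are stretched

# Report the broadcast result shape without building any arrays
result_shape = np.broadcast_shapes((3, 4), (4,))

# Incompatible: the trailing dimensions are 4 and 3, and neither is 1
try:
    M + np.array([1, 2, 3])
    incompatible_failed = False
except ValueError:
    incompatible_failed = True

print(row_sum.shape, col_sum.shape, outer.shape, result_shape, incompatible_failed)
```

To add a length-3 vector to each column of `M`, reshape it to `(3, 1)` first so that the trailing dimensions line up as 4 and 1.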