Cosine Similarity
What
Cosine similarity measures the angle between two vectors, ignoring their magnitude. Two vectors pointing in the same direction have similarity 1, regardless of how long they are.
cos_sim(a, b) = (a · b) / (||a|| × ||b||)
where a · b is the Dot Product and ||a|| is the L2 norm (length) of a.
Why magnitude doesn’t matter
A document with the word “python” 100 times and one with “python” 10 times point in roughly the same direction in word-count space — they’re about the same topic, just different lengths. Cosine similarity captures this by normalizing out the magnitude.
Mathematically: if you normalize both vectors to unit length first, cosine similarity is just the dot product. That’s why many embedding systems store pre-normalized vectors — similarity search becomes a simple dot product.
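A minimal sketch of that equivalence (the vector values here are arbitrary): after dividing each vector by its L2 norm, the full cosine formula and a plain dot product give the same number.

```python
import numpy as np

a = np.array([1.0, 2.0, 3.0])
b = np.array([2.0, 4.0, 6.0])

# Full formula: dot product over the product of norms
cos_full = np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))

# Pre-normalize to unit length, then take a plain dot product
a_unit = a / np.linalg.norm(a)
b_unit = b / np.linalg.norm(b)
cos_dot = np.dot(a_unit, b_unit)

print(np.isclose(cos_full, cos_dot))  # True
```

This is the trick vector databases exploit: normalize once at index time, then every query is just dot products.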
Range and interpretation
| Value | Meaning |
|---|---|
| 1 | Identical direction |
| 0 | Orthogonal (unrelated) |
| -1 | Opposite direction |
For non-negative vectors (e.g., TF-IDF, bag-of-words), the range is [0, 1].
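A quick check of the [0, 1] claim, using hypothetical word-count vectors (any made-up non-negative counts work):

```python
import numpy as np

# Hypothetical bag-of-words counts for three short documents
docs = np.array([
    [3.0, 0.0, 1.0],
    [2.0, 1.0, 0.0],
    [0.0, 4.0, 2.0],
])

# All pairwise similarities at once: dot products over outer product of norms
norms = np.linalg.norm(docs, axis=1)
sims = (docs @ docs.T) / np.outer(norms, norms)

print(sims)
print(sims.min(), sims.max())  # both within [0, 1]
```

Negative values can only appear when some vector components are negative, which count-based representations rule out.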
Where it’s used
- Embeddings: finding similar sentences/images in vector databases
- Recommendation systems: user-item similarity in latent space
- Information retrieval: query-document matching
- Clustering: as a distance measure (cosine distance = 1 - cosine similarity; note it is not a true metric, since it can violate the triangle inequality)
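For the clustering use case, cosine distance is available directly in SciPy; a minimal sketch with arbitrary vectors:

```python
import numpy as np
from scipy.spatial.distance import cosine  # cosine *distance* = 1 - similarity

a = np.array([1.0, 2.0, 3.0])
b = np.array([2.0, 4.0, 6.0])  # same direction, 2x magnitude

print(round(cosine(a, b), 6))  # 0.0: same direction means zero distance
```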
Python example

```python
import numpy as np

def cosine_similarity(a, b):
    # Undefined for all-zero vectors (division by zero)
    return np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))

# Similar directions, different magnitudes
a = np.array([1, 2, 3])
b = np.array([2, 4, 6])  # same direction, 2x magnitude
print(cosine_similarity(a, b))  # ≈ 1.0 (up to floating-point rounding)

# Orthogonal
c = np.array([1, 0])
d = np.array([0, 1])
print(cosine_similarity(c, d))  # 0.0

# Using sklearn for batches
from sklearn.metrics.pairwise import cosine_similarity as cs

vecs = np.array([[1, 2, 3], [2, 4, 6], [3, 0, -1]])
print(cs(vecs))  # 3x3 similarity matrix
```

Links
- Dot Product — the numerator of cosine similarity
- Embeddings — primary use case in modern ML
- Vectors and Matrices — vector norms and operations