Random Walks
A technique for exploring graphs by randomly traversing connections, used to discover item relationships and generate embeddings.
A random walk is a way to explore a graph by starting at a node and repeatedly jumping to a random neighbour. In recommender systems, this technique helps discover relationships between items without needing explicit labels.
How It Works
- Start at a node (e.g., an item a user viewed)
- Randomly pick one of its connected neighbours
- Move to that neighbour
- Repeat many times
- Record the path you took
Item A → Item B → Item C → Item A → Item D → ...
Each walk produces a sequence of items, similar to a sentence of words.
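The steps above can be sketched in a few lines of Python. The adjacency-list graph and item names here are purely illustrative, not from any library:

```python
import random

def random_walk(graph, start, walk_length, seed=None):
    """Walk the graph from `start`, hopping to a uniformly random
    neighbour at each step, and return the path taken."""
    rng = random.Random(seed)
    path = [start]
    current = start
    for _ in range(walk_length - 1):
        neighbours = graph.get(current, [])
        if not neighbours:  # dead end: stop the walk early
            break
        current = rng.choice(neighbours)
        path.append(current)
    return path

# A tiny item graph as an adjacency list (hypothetical items)
item_graph = {
    "Item A": ["Item B", "Item D"],
    "Item B": ["Item A", "Item C"],
    "Item C": ["Item A", "Item B"],
    "Item D": ["Item A"],
}

walk = random_walk(item_graph, "Item A", walk_length=5, seed=42)
```

Each call produces one sequence; repeating this from many starting nodes yields the corpus of walks.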
Simple Analogy
Imagine wandering through a shopping mall by randomly walking into connected stores. If you start at the Apple Store, you’ll probably end up in electronics and tech accessories stores more often than in the food court. The stores you visit frequently together are “related.”
Why It’s Useful for Recommendations
Items that appear together frequently in random walks are likely related. If you do thousands of random walks starting from “iPhone”, and “AirPods” keeps showing up in those paths, they’re probably related products.

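One way to make "keeps showing up" concrete is to count co-occurrences across walks. This is a minimal sketch with hand-written walks; in practice you would generate thousands of them:

```python
from collections import Counter

def cooccurrence_counts(walks, anchor):
    """Count how often each item appears in walks that start at `anchor`."""
    counts = Counter()
    for walk in walks:
        if walk and walk[0] == anchor:
            counts.update(walk[1:])
    return counts

# Hypothetical walks (illustrative item names)
walks = [
    ["iPhone", "AirPods", "MacBook"],
    ["iPhone", "AirPods", "iPad"],
    ["iPhone", "MacBook", "AirPods"],
    ["Dress", "Heels", "Handbag"],
]

related = cooccurrence_counts(walks, "iPhone").most_common()
# "AirPods" co-occurs with "iPhone" most often, so it ranks first
```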
Generating Embeddings
The sequences from random walks can be fed into algorithms like Word2Vec (treating items like words in a sentence). Items that co-occur in walks end up with similar vectors.
```python
# Simplified example: generate_random_walks is a placeholder helper
from gensim.models import Word2Vec

walks = generate_random_walks(item_graph, num_walks=1000, walk_length=10)
# walks might look like: [["iPhone", "AirPods", "MacBook", ...], ...]

# Train Word2Vec on the walks (gensim 4.x API)
model = Word2Vec(walks, vector_size=64, window=5, min_count=1)

# Now items have embeddings
iphone_embedding = model.wv["iPhone"]
```
Building the Graph
The graph connects items based on user behaviour:
- Co-browsing: Items viewed in the same session
- Co-purchase: Items bought together
- Sequential: Item B viewed after Item A
User browses: Dress → Heels → Handbag → Earrings
Creates edges:
Dress ↔ Heels
Heels ↔ Handbag
Handbag ↔ Earrings
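Turning a session into edges like these can be sketched as follows; this covers the sequential/co-browsing case by linking consecutively viewed items (a production system would typically also weight edges by how often a pair co-occurs):

```python
from collections import defaultdict

def session_to_edges(session):
    """Build an undirected graph linking each consecutively
    viewed pair of items in a browsing session."""
    graph = defaultdict(set)
    for a, b in zip(session, session[1:]):
        graph[a].add(b)
        graph[b].add(a)
    return graph

session = ["Dress", "Heels", "Handbag", "Earrings"]
graph = session_to_edges(session)
# "Heels" is now connected to both "Dress" and "Handbag"
```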
Algorithms That Use Random Walks
DeepWalk
The original approach — run random walks on the graph, then train Word2Vec on the resulting sequences.
Node2Vec
An extension that controls how the walk explores:
- BFS-like: Stay close to the starting node (captures local structure)
- DFS-like: Wander far from the starting node (captures global structure)
Two parameters control this balance: the return parameter p sets how likely the walk is to revisit the node it just came from, and the in-out parameter q biases it toward BFS-like exploration (q &gt; 1) or DFS-like exploration (q &lt; 1).
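The biased transition rule can be sketched as below. This is a simplified version for illustration (the real implementation precomputes alias tables for fast sampling, and the adjacency-dict graph format is an assumption):

```python
import random

def transition_weights(graph, prev, current, p, q):
    """node2vec weights for each neighbour of `current`,
    given the walk arrived from `prev`."""
    weights = []
    for node in graph[current]:
        if node == prev:
            weights.append(1.0 / p)   # return to the previous node
        elif node in graph[prev]:
            weights.append(1.0)       # BFS-like: stays near prev
        else:
            weights.append(1.0 / q)   # DFS-like: moves outward
    return weights

def node2vec_step(graph, prev, current, p, q, rng=random):
    """Sample the next node using the biased weights above."""
    neighbours = graph[current]
    w = transition_weights(graph, prev, current, p, q)
    return rng.choices(neighbours, weights=w, k=1)[0]
```

With p = 2 and q = 0.5, for example, returning to the previous node is half as likely as usual, while stepping further away is twice as likely.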
Graph Neural Networks
More recent approaches such as GraphSAGE and GAT learn embeddings with neural message passing over the graph (and can incorporate node features), but random walks remain a simple and effective baseline.
Advantages
- No labels required: Works with just the graph structure
- Captures implicit relationships: Finds connections humans might miss
- Scalable: Can sample walks rather than processing the entire graph
- Simple to implement: Just need a graph and Word2Vec
Limitations
- Static: Doesn’t adapt to real-time behaviour
- Cold start: New items with no connections never appear in any walk, so they get no embedding
- Randomness: Results vary between runs (though averaging helps)