Random Walks
A technique for exploring graphs by randomly traversing connections, used to discover item relationships and generate embeddings.
A random walk is a way to explore a graph by starting at a node and repeatedly jumping to a random neighbour. In recommender systems, this technique helps discover relationships between items without needing explicit labels.
How It Works
- Start at a node (e.g., an item a user viewed)
- Randomly pick one of its connected neighbours
- Move to that neighbour
- Repeat many times
- Record the path you took
Item A → Item B → Item C → Item A → Item D → ...
Each walk produces a sequence of items, similar to a sentence of words.
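The steps above can be sketched in a few lines of Python. The adjacency-list graph and item names here are purely illustrative, not from any library:

```python
import random

def random_walk(graph, start, walk_length, seed=None):
    """Walk the graph from `start`, hopping to a uniformly random
    neighbour at each step, and return the path taken."""
    rng = random.Random(seed)
    path = [start]
    current = start
    for _ in range(walk_length - 1):
        neighbours = graph.get(current, [])
        if not neighbours:  # dead end: stop the walk early
            break
        current = rng.choice(neighbours)
        path.append(current)
    return path

# A tiny item graph as an adjacency list (hypothetical items)
item_graph = {
    "Item A": ["Item B", "Item D"],
    "Item B": ["Item A", "Item C"],
    "Item C": ["Item A", "Item B"],
    "Item D": ["Item A"],
}

walk = random_walk(item_graph, "Item A", walk_length=5, seed=42)
```

Each call produces one sequence; repeating this from many starting nodes yields the corpus of walks.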
Simple Analogy
Imagine wandering through a shopping mall by randomly walking into connected stores. If you start at the Apple Store, you’ll probably end up in electronics and tech accessories stores more often than in the food court. The stores you visit frequently together are “related.”
Why It’s Useful for Recommendations
Items that appear together frequently in random walks are likely related. If you do thousands of random walks starting from “iPhone”, and “AirPods” keeps showing up in those paths, they’re probably related products.

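One way to make "keeps showing up" concrete is to count co-occurrences across walks. This is a minimal sketch with hand-written walks; in practice you would generate thousands of them:

```python
from collections import Counter

def cooccurrence_counts(walks, anchor):
    """Count how often each item appears in walks that start at `anchor`."""
    counts = Counter()
    for walk in walks:
        if walk and walk[0] == anchor:
            counts.update(walk[1:])
    return counts

# Hypothetical walks (illustrative item names)
walks = [
    ["iPhone", "AirPods", "MacBook"],
    ["iPhone", "AirPods", "iPad"],
    ["iPhone", "MacBook", "AirPods"],
    ["Dress", "Heels", "Handbag"],
]

related = cooccurrence_counts(walks, "iPhone").most_common()
# "AirPods" co-occurs with "iPhone" most often, so it ranks first
```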
Generating Embeddings
The sequences from random walks can be fed into algorithms like Word2Vec (treating items like words in a sentence). Items that co-occur in walks end up with similar vectors.
```python
# Simplified example: generate_random_walks is a placeholder helper
from gensim.models import Word2Vec

walks = generate_random_walks(item_graph, num_walks=1000, walk_length=10)
# walks might look like: [["iPhone", "AirPods", "MacBook", ...], ...]

# Train Word2Vec on the walks (gensim 4.x API)
model = Word2Vec(walks, vector_size=64, window=5, min_count=1)

# Now items have embeddings
iphone_embedding = model.wv["iPhone"]
```
Building the Graph
The graph connects items based on user behaviour:
- Co-browsing: Items viewed in the same session
- Co-purchase: Items bought together
- Sequential: Item B viewed after Item A
User browses: Dress → Heels → Handbag → Earrings
Creates edges:
Dress ↔ Heels
Heels ↔ Handbag
Handbag ↔ Earrings
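Turning a session into edges like these can be sketched as follows; this covers the sequential/co-browsing case by linking consecutively viewed items (a production system would typically also weight edges by how often a pair co-occurs):

```python
from collections import defaultdict

def session_to_edges(session):
    """Build an undirected graph linking each consecutively
    viewed pair of items in a browsing session."""
    graph = defaultdict(set)
    for a, b in zip(session, session[1:]):
        graph[a].add(b)
        graph[b].add(a)
    return graph

session = ["Dress", "Heels", "Handbag", "Earrings"]
graph = session_to_edges(session)
# "Heels" is now connected to both "Dress" and "Handbag"
```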
Algorithms That Use Random Walks
DeepWalk
The original approach — run random walks on the graph, then train Word2Vec on the resulting sequences.
Node2Vec
An extension that controls how the walk explores:
- BFS-like: Stay close to the starting node (captures local structure)
- DFS-like: Wander far from the starting node (captures global structure)
Two parameters control this balance: the return parameter p sets how likely the walk is to revisit the node it just came from, and the in-out parameter q biases it toward BFS-like exploration (q &gt; 1) or DFS-like exploration (q &lt; 1).
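The biased transition rule can be sketched as below. This is a simplified version for illustration (the real implementation precomputes alias tables for fast sampling, and the adjacency-dict graph format is an assumption):

```python
import random

def transition_weights(graph, prev, current, p, q):
    """node2vec weights for each neighbour of `current`,
    given the walk arrived from `prev`."""
    weights = []
    for node in graph[current]:
        if node == prev:
            weights.append(1.0 / p)   # return to the previous node
        elif node in graph[prev]:
            weights.append(1.0)       # BFS-like: stays near prev
        else:
            weights.append(1.0 / q)   # DFS-like: moves outward
    return weights

def node2vec_step(graph, prev, current, p, q, rng=random):
    """Sample the next node using the biased weights above."""
    neighbours = graph[current]
    w = transition_weights(graph, prev, current, p, q)
    return rng.choices(neighbours, weights=w, k=1)[0]
```

With p = 2 and q = 0.5, for example, returning to the previous node is half as likely as usual, while stepping further away is twice as likely.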
Graph Neural Networks
More recent approaches such as GraphSAGE and GAT learn embeddings with neural message passing over the graph (and can incorporate node features), but random walks remain a simple and effective baseline.
Advantages
- No labels required: Works with just the graph structure
- Captures implicit relationships: Finds connections humans might miss
- Scalable: Can sample walks rather than processing the entire graph
- Simple to implement: Just need a graph and Word2Vec
Limitations
- Static: Doesn’t adapt to real-time behaviour
- Cold start: New items with no connections never appear in any walk, so they get no embedding
- Randomness: Results vary between runs (though averaging helps)