Recommendation Pipeline
The four-stage funnel that transforms millions of items into a personalised, ordered list for the user.
A production recommender system typically processes items through four stages: Retrieval → Filtering → Scoring → Ordering. Each stage narrows and refines the candidates until a final ranked list is presented to the user.
Millions of items
↓
RETRIEVAL → Thousands of candidates
↓
FILTERING → Hundreds of valid candidates
↓
SCORING → Hundreds of scored candidates
↓
ORDERING → Final ranked list (10-50 items)
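Read top to bottom, the funnel is just a chain of stage functions. A minimal orchestration sketch, where retrieve, apply_filters, score, and order are hypothetical names standing in for the four stages described below (not a specific framework's API):

def recommend(user, context, page_size=20):
    candidates = retrieve(user)              # millions -> thousands
    valid = apply_filters(user, candidates)  # thousands -> hundreds
    scored = score(user, valid, context)     # heavy model inference
    return order(scored, page_size)          # final 10-50 items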
1. Retrieval
Goal: Quickly narrow millions of items to a manageable set.
The retrieval stage prioritises speed and recall. It doesn’t need to be precise — it just needs to not miss good items.
Techniques:
- Approximate nearest neighbour (ANN) search on embeddings
- Item-to-item similarity lookup
- Graph-based retrieval
- Rule-based candidates (e.g., popular items, same category)
Output: ~1,000–10,000 candidates
# Pull candidates from several sources, then de-duplicate the merged list
candidates = ann_index.search(user_embedding, top_k=1000)
candidates += get_popular_items(category, top_k=100)
candidates += get_similar_items(user_recent_views, top_k=500)
candidates = list(dict.fromkeys(candidates))  # remove duplicates, keep order
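The ann_index above could be any approximate nearest-neighbour library. As one possible setup, a minimal sketch using FAISS, assuming item embeddings are already available as a float32 matrix (the dimensions and random vectors here are placeholders):

import numpy as np
import faiss  # one popular ANN library; ScaNN and hnswlib are common alternatives

dim = 128                                                          # embedding dimension (assumed)
item_embeddings = np.random.rand(100_000, dim).astype("float32")   # placeholder item vectors

index = faiss.IndexFlatIP(dim)  # exact inner-product search; use IndexIVFFlat or IndexHNSWFlat for true ANN
index.add(item_embeddings)

user_embedding = np.random.rand(1, dim).astype("float32")
_, candidate_ids = index.search(user_embedding, 1000)  # ids of the top-1000 nearest items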
2. Filtering
Goal: Remove invalid or irrelevant candidates based on business rules.
This stage applies hard constraints that the model shouldn’t have to learn. Items that fail these rules are removed entirely.
Common filters:
- Already purchased/watched/seen
- Out of stock
- Not available in user’s region
- Explicitly disliked or blocked
- Age-restricted content
- Doesn’t match user’s preferences (e.g., dietary restrictions)
Output: ~100–1,000 valid candidates
filtered = []
for item in candidates:
    if item in user_purchase_history:
        continue
    if not item.in_stock:
        continue
    if item.region not in user.allowed_regions:
        continue
    if item in user.blocked_items:
        continue
    filtered.append(item)
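As the list of business rules grows, a chain of if/continue checks becomes hard to maintain. One common refactor, sketched here with the same names as the loop above, is to express each rule as a predicate and keep them in a single list:

RULES = [
    lambda user, item: item not in user_purchase_history,
    lambda user, item: item.in_stock,
    lambda user, item: item.region in user.allowed_regions,
    lambda user, item: item not in user.blocked_items,
]

filtered = [item for item in candidates
            if all(rule(user, item) for rule in RULES)]

Adding a new filter then means appending one predicate rather than editing the loop body.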
3. Scoring
Goal: Predict how much the user will like each remaining item.
This is where the heavy ML model runs. The scorer takes user features, item features, and context features to predict a relevance score.
Techniques:
- Deep neural networks
- Gradient boosted trees
- Logistic regression
- Multi-objective models (predict click, purchase, watch time, etc.)
Output: Each candidate gets a score (or multiple scores)
scores = []
for item in filtered:
    features = combine_features(user, item, context)
    score = ranking_model.predict(features)
    scores.append((item, score))
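Calling the model once per item, as above, is rarely fast enough in production; most serving stacks assemble all candidates into one batch and run a single forward pass. A sketch of that, assuming combine_features returns a fixed-length feature vector and the model accepts a batched input:

import numpy as np

# Build one feature matrix of shape (num_candidates, num_features)
feature_matrix = np.stack(
    [combine_features(user, item, context) for item in filtered]
)

# One batched inference call instead of hundreds of single-item calls
batch_scores = ranking_model.predict(feature_matrix)
scores = list(zip(filtered, batch_scores))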
4. Ordering
Goal: Arrange items into the final list the user sees.
Ordering isn’t always “sort by score descending.” This stage applies business logic and optimisation objectives.
Considerations:
- Diversity: Don’t show 10 similar items in a row
- Freshness: Mix in some newer content
- Exploration: Occasionally show items the model is uncertain about
- Business rules: Promote sponsored content, new releases
- Position bias: Some slots are more valuable than others
Techniques:
- Greedy re-ranking with diversity constraints
- Maximal Marginal Relevance (MMR)
- Multi-armed bandits for exploration
- Determinantal Point Processes (DPP) for diversity
Output: Final ordered list (10–50 items)
final_list = []
category_counts = {}
for item, score in sorted(scores, key=lambda x: -x[1]):
    # Diversity: allow at most 3 items per category
    if category_counts.get(item.category, 0) >= 3:
        continue
    final_list.append(item)
    category_counts[item.category] = category_counts.get(item.category, 0) + 1
    if len(final_list) >= 20:
        break
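The category cap above is a simple greedy heuristic. Maximal Marginal Relevance, listed among the techniques earlier, makes the trade-off explicit: each pick balances an item's relevance score against its similarity to items already selected. A minimal sketch, assuming a similarity(a, b) function that returns a value in [0, 1]:

def mmr_rerank(scored_items, similarity, lambda_=0.7, k=20):
    """Greedy MMR: repeatedly pick the item maximising
    lambda * relevance - (1 - lambda) * max similarity to already-selected items."""
    remaining = dict(scored_items)  # item -> relevance score
    selected = []
    while remaining and len(selected) < k:
        def mmr_score(item):
            redundancy = max((similarity(item, s) for s in selected), default=0.0)
            return lambda_ * remaining[item] - (1 - lambda_) * redundancy
        best = max(remaining, key=mmr_score)
        selected.append(best)
        del remaining[best]
    return selected

With lambda_ = 1.0 this reduces to plain sort-by-score; lower values trade relevance for diversity.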
Why Four Stages?
Each stage has different requirements:
| Stage | Latency Budget | Items Processed | Complexity |
|---|---|---|---|
| Retrieval | ~10ms | Millions → Thousands | Low |
| Filtering | ~5ms | Thousands → Hundreds | Low |
| Scoring | ~50ms | Hundreds | High |
| Ordering | ~5ms | Hundreds → Tens | Medium |
Running a complex neural network on millions of items would be far too slow. The funnel structure lets you apply expensive computation only where it matters.
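A back-of-envelope calculation makes the point concrete; the per-item cost and catalogue size below are illustrative assumptions, not measurements:

catalog_size = 5_000_000        # items in the catalogue (assumed)
candidates_after_funnel = 500   # items that reach the scorer
cost_per_item_ms = 0.1          # assumed amortised model cost per item

print(catalog_size * cost_per_item_ms)             # ~500,000 ms: minutes to score everything
print(candidates_after_funnel * cost_per_item_ms)  # ~50 ms: fits the scoring budget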
Multi-Objective Scoring
Often the scoring stage predicts multiple outcomes:
- P(click)
- P(purchase)
- P(long watch time)
- P(share)
The ordering stage then combines these:
final_score = (
    0.3 * p_click +
    0.5 * p_purchase +
    0.2 * p_watch_time
)
This lets the business tune the balance between engagement and revenue without retraining the model.
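Because the weights live outside the model, they can be treated as plain configuration. A sketch of that idea, using hypothetical objective names that match the snippet above:

OBJECTIVE_WEIGHTS = {"click": 0.3, "purchase": 0.5, "watch_time": 0.2}  # tunable without retraining

def combine_objectives(predictions, weights=OBJECTIVE_WEIGHTS):
    """predictions: dict mapping objective name -> predicted probability for one item."""
    return sum(weights[name] * predictions[name] for name in weights)

# Shifting weight toward purchases changes the ranking, not the model
final_score = combine_objectives({"click": 0.12, "purchase": 0.03, "watch_time": 0.4})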