Recommendation Pipeline

The four-stage funnel that transforms millions of items into a personalised, ordered list for the user.

A production recommender system typically processes items through four stages: Retrieval → Filtering → Scoring → Ordering. Each stage narrows and refines the candidates until a final ranked list is presented to the user.

Millions of items

   RETRIEVAL      → Thousands of candidates

   FILTERING      → Hundreds of valid candidates

   SCORING        → Hundreds of scored candidates

   ORDERING       → Final ranked list (10-50 items)

1. Retrieval

Goal: Quickly narrow millions of items to a manageable set.

The retrieval stage prioritises speed and recall. It does not need to be precise; it only needs to avoid missing good items, because later stages discard the poor ones.

Techniques:

  • ANN search on embeddings
  • Item-to-item similarity lookup
  • Graph-based retrieval
  • Rule-based candidates (e.g., popular items, same category)

Output: ~1,000–10,000 candidates

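# Nearest neighbours of the user embedding from the ANN index
candidates = ann_index.search(user_embedding, top_k=1000)
# Rule-based and item-to-item sources widen recall
candidates += get_popular_items(category, top_k=100)
candidates += get_similar_items(user_recent_views, top_k=500)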
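
Concatenating sources this way can return the same item more than once, for example when a popular item is also an ANN neighbour. A minimal sketch of one way to merge sources while keeping first-seen order; `merge_candidates` and the item IDs are hypothetical and not part of any library:

```python
from itertools import chain
from typing import Hashable, Iterable, List


def merge_candidates(*sources: Iterable[Hashable], limit: int = 10_000) -> List[Hashable]:
    """Union several candidate lists, dropping duplicates and keeping first-seen order."""
    merged: List[Hashable] = []
    seen = set()
    for item_id in chain.from_iterable(sources):
        if item_id in seen:
            continue  # already contributed by an earlier source
        seen.add(item_id)
        merged.append(item_id)
        if len(merged) >= limit:
            break  # cap the candidate set handed to the next stage
    return merged


ann_hits = ["a", "b", "c"]
popular = ["b", "d"]
similar = ["c", "e", "a"]
merged = merge_candidates(ann_hits, popular, similar)
```

Deduplicating here means the filtering and scoring stages never do the same work twice for one item.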

2. Filtering

Goal: Remove invalid or irrelevant candidates based on business rules.

This stage applies hard constraints that the model shouldn’t have to learn. Items that fail these rules are removed entirely.

Common filters:

  • Already purchased/watched/seen
  • Out of stock
  • Not available in user’s region
  • Explicitly disliked or blocked
  • Age-restricted content
  • Doesn’t match user’s preferences (e.g., dietary restrictions)

Output: ~100–1,000 valid candidates

filtered = []
for item in candidates:
    if item in user_purchase_history:
        continue
    if not item.in_stock:
        continue
    if item.region not in user.allowed_regions:
        continue
    if item in user.blocked_items:
        continue
    filtered.append(item)
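
The same hard constraints can also be written as named rules, which makes it easy to record why an item was dropped. This is a sketch under assumptions: the `Item` and `User` fields and the rule names are hypothetical, chosen to mirror the loop above.

```python
from collections import Counter
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Set, Tuple


@dataclass(frozen=True)
class Item:
    item_id: str
    in_stock: bool
    region: str


@dataclass
class User:
    purchased: Set[str] = field(default_factory=set)
    blocked: Set[str] = field(default_factory=set)
    allowed_regions: Set[str] = field(default_factory=set)


# Each rule returns True when the item must be removed.
RULES: Dict[str, Callable[[Item, User], bool]] = {
    "already_purchased": lambda item, user: item.item_id in user.purchased,
    "out_of_stock": lambda item, user: not item.in_stock,
    "region_unavailable": lambda item, user: item.region not in user.allowed_regions,
    "blocked": lambda item, user: item.item_id in user.blocked,
}


def apply_filters(candidates: List[Item], user: User) -> Tuple[List[Item], Counter]:
    """Return surviving items plus a count of removals per rule (first failing rule wins)."""
    kept: List[Item] = []
    dropped: Counter = Counter()
    for item in candidates:
        reason = next((name for name, rule in RULES.items() if rule(item, user)), None)
        if reason is None:
            kept.append(item)
        else:
            dropped[reason] += 1
    return kept, dropped


user = User(purchased={"i1"}, blocked={"i4"}, allowed_regions={"EU"})
candidates = [
    Item("i1", True, "EU"),   # already purchased
    Item("i2", False, "EU"),  # out of stock
    Item("i3", True, "US"),   # not available in the user's region
    Item("i4", True, "EU"),   # blocked by the user
    Item("i5", True, "EU"),   # passes every rule
]
kept, dropped = apply_filters(candidates, user)
```

The per-rule counts are useful for monitoring: a sudden jump in one reason usually points at a data problem rather than a change in user behaviour.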

3. Scoring

Goal: Predict how much the user will like each remaining item.

This is where the heavy ML model runs. The scorer takes user features, item features, and context features to predict a relevance score.

Techniques:

  • Deep neural networks
  • Gradient boosted trees
  • Logistic regression
  • Multi-objective models (predict click, purchase, watch time, etc.)

Output: Each candidate gets a score (or multiple scores)

scores = []
for item in filtered:
    features = combine_features(user, item, context)
    score = ranking_model.predict(features)
    scores.append((item, score))
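
As an illustration of the simplest technique in the list, logistic regression scores a candidate as the sigmoid of a weighted feature sum. The feature names, weights, and bias below are invented for the example and are not learned values.

```python
import math
from typing import Dict, List, Tuple

# Hypothetical weights; in practice these come from training.
WEIGHTS: Dict[str, float] = {
    "user_item_similarity": 2.0,
    "item_popularity": 0.5,
    "is_new_release": 0.3,
}
BIAS = -1.0


def predict_relevance(features: Dict[str, float]) -> float:
    """Logistic regression: sigmoid(bias + sum of weight * feature value)."""
    logit = BIAS + sum(WEIGHTS[name] * features.get(name, 0.0) for name in WEIGHTS)
    return 1.0 / (1.0 + math.exp(-logit))


def score_candidates(candidates: Dict[str, Dict[str, float]]) -> List[Tuple[str, float]]:
    """Score every candidate and return (item_id, score) pairs."""
    return [(item_id, predict_relevance(feats)) for item_id, feats in candidates.items()]


candidates = {
    "close_match": {"user_item_similarity": 0.9, "item_popularity": 0.4},
    "weak_match": {"user_item_similarity": 0.1, "item_popularity": 0.4},
}
scores = score_candidates(candidates)
```

Missing features default to zero here, so an item with no features scores `sigmoid(BIAS)`. Heavier models such as gradient boosted trees or neural networks would fill the same role in place of `predict_relevance`.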

4. Ordering

Goal: Arrange items into the final list the user sees.

Ordering isn’t always “sort by score descending.” This stage applies business logic and optimisation objectives.

Considerations:

  • Diversity: Don’t show 10 similar items in a row
  • Freshness: Mix in some newer content
  • Exploration: Occasionally show items the model is uncertain about
  • Business rules: Promote sponsored content, new releases
  • Position bias: Users notice top slots far more than lower ones, so those slots are more valuable

Techniques:

  • Greedy re-ranking with diversity constraints
  • Maximal Marginal Relevance (MMR)
  • Multi-armed bandits for exploration
  • Determinantal Point Processes (DPP) for diversity

Output: Final ordered list (10–50 items)

final_list = []
category_counts = {}  # items already placed, per category

# Walk candidates from highest to lowest score
for item, score in sorted(scores, key=lambda x: x[1], reverse=True):
    # Diversity: allow at most 3 items per category
    if category_counts.get(item.category, 0) >= 3:
        continue

    final_list.append(item)
    category_counts[item.category] = category_counts.get(item.category, 0) + 1

    # Stop once the list is full
    if len(final_list) >= 20:
        break
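
Maximal Marginal Relevance takes a softer approach than a fixed per-category cap: at each step it picks the item with the best balance of relevance and dissimilarity to what has already been selected. A minimal sketch, assuming each item carries a small embedding and using cosine similarity as the similarity measure; the `lambda_` value, item names, and vectors are illustrative.

```python
import math
from typing import Dict, List, Sequence


def cosine(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def mmr_rerank(
    relevance: Dict[str, float],
    embeddings: Dict[str, Sequence[float]],
    k: int,
    lambda_: float = 0.7,
) -> List[str]:
    """Greedy MMR: maximise lambda_ * relevance - (1 - lambda_) * max similarity to selected."""
    remaining = set(relevance)
    selected: List[str] = []
    while remaining and len(selected) < k:
        def mmr_score(item: str) -> float:
            # Redundancy is the similarity to the closest already-selected item.
            redundancy = max(
                (cosine(embeddings[item], embeddings[s]) for s in selected),
                default=0.0,
            )
            return lambda_ * relevance[item] - (1.0 - lambda_) * redundancy

        # Iterate in sorted order so ties are broken deterministically.
        best = max(sorted(remaining), key=mmr_score)
        selected.append(best)
        remaining.remove(best)
    return selected


relevance = {"thriller_1": 0.90, "thriller_2": 0.85, "comedy_1": 0.70}
embeddings = {
    "thriller_1": [1.0, 0.0],
    "thriller_2": [0.98, 0.2],
    "comedy_1": [0.0, 1.0],
}
ranked = mmr_rerank(relevance, embeddings, k=3, lambda_=0.5)
```

With `lambda_=0.5` the comedy is promoted above the second thriller because the second thriller is nearly identical to the first pick. Setting `lambda_=1.0` reduces the procedure to a plain sort by relevance.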

Why Four Stages?

Each stage has different requirements:

Stage        Latency Budget    Items Processed         Complexity
Retrieval    ~10 ms            Millions → Thousands    Low
Filtering    ~5 ms             Thousands → Hundreds    Low
Scoring      ~50 ms            Hundreds                High
Ordering     ~5 ms             Hundreds → Tens         Medium

Running a complex neural network on millions of items would be far too slow. The funnel structure lets you apply expensive computation only where it matters.
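
A rough back-of-the-envelope check makes the point concrete. The figures are assumptions chosen for illustration: the ~50 ms scoring budget from the table, spread over 500 candidates, then extrapolated linearly to a hypothetical catalogue of 5 million items with no batching or parallelism.

```python
SCORING_BUDGET_MS = 50        # scoring latency budget from the table
CANDIDATES_SCORED = 500       # assumed: "hundreds" of candidates
CATALOGUE_SIZE = 5_000_000    # assumed: "millions" of items

# Linear extrapolation of the per-item scoring cost to the whole catalogue.
full_catalogue_ms = SCORING_BUDGET_MS * CATALOGUE_SIZE / CANDIDATES_SCORED
full_catalogue_s = full_catalogue_ms / 1000

print(f"Scoring everything: ~{full_catalogue_s:.0f} s per request")
```

Under these assumptions a single request would take minutes rather than milliseconds, which is why the cheap stages run first.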

Multi-Objective Scoring

Often the scoring stage predicts multiple outcomes:

  • P(click)
  • P(purchase)
  • P(long watch time)
  • P(share)

The ordering stage then combines these:

final_score = (
    0.3 * p_click +
    0.5 * p_purchase +
    0.2 * p_watch_time
)

This lets the business tune the balance between engagement and revenue without retraining the model.
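
One way to keep the weights tunable is to hold them in configuration and apply them at ordering time. A sketch assuming the model returns one probability per objective; the item names, objective names, and both weight profiles are hypothetical.

```python
from typing import Dict, List


def combine_scores(predictions: Dict[str, float], weights: Dict[str, float]) -> float:
    """Weighted sum of per-objective predictions; objectives without a weight are ignored."""
    return sum(weight * predictions[name] for name, weight in weights.items())


def rank(items: Dict[str, Dict[str, float]], weights: Dict[str, float]) -> List[str]:
    """Order item ids by combined score, highest first."""
    return sorted(
        items,
        key=lambda item_id: combine_scores(items[item_id], weights),
        reverse=True,
    )


# Model outputs stay fixed; only the weights change between the two rankings.
predictions = {
    "clickbait": {"p_click": 0.9, "p_purchase": 0.1, "p_watch_time": 0.2},
    "best_seller": {"p_click": 0.4, "p_purchase": 0.6, "p_watch_time": 0.3},
}

revenue_weights = {"p_click": 0.3, "p_purchase": 0.5, "p_watch_time": 0.2}
engagement_weights = {"p_click": 0.8, "p_purchase": 0.1, "p_watch_time": 0.1}

by_revenue = rank(predictions, revenue_weights)
by_engagement = rank(predictions, engagement_weights)
```

Swapping the weight profile reverses the order of the two items while the predictions stay unchanged, so the trade-off can be adjusted without touching the model.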
