Neural Collaborative Filtering
Neural Collaborative Filtering (NCF) is a deep learning framework for recommender systems that replaces the inner product used in traditional matrix factorization with a neural network architecture. This allows the model to learn arbitrary, non-linear user-item interactions.
Core Concept
Traditional collaborative filtering methods like matrix factorization model user-item interactions as the dot product of latent factors. NCF generalizes this by using neural networks to learn the interaction function, providing greater expressiveness.
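For intuition, the difference between the two scoring functions can be sketched in a few lines of PyTorch; the dimensions below are arbitrary and only serve to illustrate the contrast:

```python
import torch
import torch.nn as nn

# Toy latent vectors for one user and one item (arbitrary dimension of 8)
p_u = torch.randn(8)  # user latent factors
q_i = torch.randn(8)  # item latent factors

# Matrix factorization: a fixed interaction function (the inner product)
mf_score = torch.dot(p_u, q_i)

# NCF: a learned interaction function, e.g. a small MLP over the concatenation
interaction = nn.Sequential(nn.Linear(16, 8), nn.ReLU(), nn.Linear(8, 1))
ncf_score = interaction(torch.cat([p_u, q_i]))
```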
Architecture
NCF typically consists of:
- Embedding Layer - Maps sparse user and item IDs to dense vectors
- Neural CF Layers - Multi-layer perceptron that learns interactions
- Output Layer - Predicts the interaction score
PyTorch Implementation
PyTorch uses an object-oriented approach with explicit forward passes and manual training loops, offering fine-grained control over the training process.
Model Definition
```python
import torch
import torch.nn as nn

class NCF(nn.Module):
    def __init__(self, num_users, num_items, embedding_dim=32, hidden_layers=[64, 32, 16]):
        super(NCF, self).__init__()
        # Embedding layers
        self.user_embedding = nn.Embedding(num_users, embedding_dim)
        self.item_embedding = nn.Embedding(num_items, embedding_dim)
        # MLP layers
        layers = []
        input_size = embedding_dim * 2
        for hidden_size in hidden_layers:
            layers.append(nn.Linear(input_size, hidden_size))
            layers.append(nn.ReLU())
            input_size = hidden_size
        self.mlp = nn.Sequential(*layers)
        self.output = nn.Linear(hidden_layers[-1], 1)
        self.sigmoid = nn.Sigmoid()

    def forward(self, user_ids, item_ids):
        user_emb = self.user_embedding(user_ids)
        item_emb = self.item_embedding(item_ids)
        # Concatenate embeddings
        concat = torch.cat([user_emb, item_emb], dim=-1)
        # Pass through MLP
        mlp_out = self.mlp(concat)
        output = self.sigmoid(self.output(mlp_out))
        return output.squeeze()
```
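As a quick sanity check, the model can be instantiated with made-up catalogue sizes and scored on a small batch of ID pairs (the numbers here are illustrative, not from the original text):

```python
model = NCF(num_users=1000, num_items=5000)

users = torch.tensor([0, 1, 2])
items = torch.tensor([10, 20, 30])
scores = model(users, items)  # shape (3,), each score in (0, 1)
```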
Training Loop
```python
def train_ncf(model, train_loader, epochs=10, lr=0.001):
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)
    criterion = nn.BCELoss()
    for epoch in range(epochs):
        model.train()
        total_loss = 0
        for user_ids, item_ids, labels in train_loader:
            optimizer.zero_grad()
            predictions = model(user_ids, item_ids)
            loss = criterion(predictions, labels.float())
            loss.backward()
            optimizer.step()
            total_loss += loss.item()
        print(f"Epoch {epoch+1}, Loss: {total_loss/len(train_loader):.4f}")
```
Inference
```python
def get_recommendations(model, user_id, candidate_items, top_k=10):
    model.eval()
    with torch.no_grad():
        user_tensor = torch.tensor([user_id] * len(candidate_items))
        item_tensor = torch.tensor(candidate_items)
        scores = model(user_tensor, item_tensor)
        top_indices = torch.topk(scores, k=top_k).indices
        return [candidate_items[i] for i in top_indices]
```
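Calling it with a handful of made-up candidate IDs looks like this; note that torch.topk requires top_k to be at most the number of candidates:

```python
candidates = [101, 205, 307, 411, 502]
top_items = get_recommendations(model, user_id=42, candidate_items=candidates, top_k=3)
```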
TensorFlow/Keras Implementation
TensorFlow uses a more declarative approach with high-level training APIs, making it well-suited for production deployment.
Model Definition
```python
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers

class NCF(keras.Model):
    def __init__(self, num_users, num_items, embedding_dim=32, hidden_layers=[64, 32, 16]):
        super(NCF, self).__init__()
        # Embedding layers
        self.user_embedding = layers.Embedding(num_users, embedding_dim)
        self.item_embedding = layers.Embedding(num_items, embedding_dim)
        # MLP layers
        self.dense_layers = []
        for hidden_size in hidden_layers:
            self.dense_layers.append(layers.Dense(hidden_size, activation='relu'))
        self.output_layer = layers.Dense(1, activation='sigmoid')

    def call(self, inputs):
        user_ids, item_ids = inputs
        user_emb = self.user_embedding(user_ids)
        item_emb = self.item_embedding(item_ids)
        # Concatenate embeddings
        concat = tf.concat([user_emb, item_emb], axis=-1)
        # Pass through MLP
        x = concat
        for dense in self.dense_layers:
            x = dense(x)
        return self.output_layer(x)
```
Training with Keras API
```python
def train_ncf(model, train_dataset, epochs=10, lr=0.001):
    model.compile(
        optimizer=keras.optimizers.Adam(learning_rate=lr),
        loss='binary_crossentropy',
        metrics=['accuracy']
    )
    history = model.fit(
        train_dataset,
        epochs=epochs,
        verbose=1
    )
    return history
```
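Here, train_dataset is expected to yield ((user_ids, item_ids), labels) batches so that model.fit feeds the tuple straight into call. A minimal sketch using toy arrays (the values and sizes are made up for illustration):

```python
import numpy as np

user_ids = np.array([0, 1, 2, 0], dtype=np.int64)
item_ids = np.array([10, 20, 30, 40], dtype=np.int64)
labels = np.array([1.0, 0.0, 1.0, 0.0], dtype=np.float32)

train_dataset = (
    tf.data.Dataset.from_tensor_slices(((user_ids, item_ids), labels))
    .shuffle(buffer_size=1024)
    .batch(256)
)

model = NCF(num_users=1000, num_items=5000)
history = train_ncf(model, train_dataset, epochs=5)
```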
Inference
```python
def get_recommendations(model, user_id, candidate_items, top_k=10):
    user_tensor = tf.constant([user_id] * len(candidate_items))
    item_tensor = tf.constant(candidate_items)
    scores = model([user_tensor, item_tensor], training=False)
    scores = tf.squeeze(scores)
    top_indices = tf.math.top_k(scores, k=top_k).indices
    return [candidate_items[i] for i in top_indices.numpy()]
```
PyTorch vs TensorFlow Comparison
| Aspect | PyTorch | TensorFlow |
|---|---|---|
| Execution | Eager by default | Eager in TF2, @tf.function for graphs |
| Training | Manual loop required | Built-in model.fit() |
| Debugging | Standard Python debugger works throughout | Straightforward in eager mode, harder inside @tf.function graphs |
| Deployment | TorchServe, ONNX | TF Serving, TFLite, SavedModel |
| Ecosystem | Research-focused | Production & mobile-focused |
Variants
- GMF (Generalized Matrix Factorization) - Uses element-wise product of embeddings
- MLP - Uses concatenated embeddings through fully connected layers
- NeuMF - Combines GMF and MLP branches for enhanced performance (see the sketch below)
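As a rough illustration of the NeuMF variant, the PyTorch sketch below gives the GMF and MLP branches their own embedding tables and concatenates their outputs before the final prediction layer; the layer sizes here are illustrative choices, not values from the text above:

```python
class NeuMF(nn.Module):
    def __init__(self, num_users, num_items, gmf_dim=16, mlp_dim=32, hidden_layers=(64, 32, 16)):
        super().__init__()
        # Separate embedding tables for the GMF and MLP branches
        self.user_gmf = nn.Embedding(num_users, gmf_dim)
        self.item_gmf = nn.Embedding(num_items, gmf_dim)
        self.user_mlp = nn.Embedding(num_users, mlp_dim)
        self.item_mlp = nn.Embedding(num_items, mlp_dim)
        # MLP branch over concatenated embeddings
        layers, input_size = [], mlp_dim * 2
        for hidden_size in hidden_layers:
            layers += [nn.Linear(input_size, hidden_size), nn.ReLU()]
            input_size = hidden_size
        self.mlp = nn.Sequential(*layers)
        # Final layer fuses the GMF product with the MLP output
        self.output = nn.Linear(gmf_dim + hidden_layers[-1], 1)

    def forward(self, user_ids, item_ids):
        gmf = self.user_gmf(user_ids) * self.item_gmf(item_ids)  # element-wise product (GMF)
        mlp = self.mlp(torch.cat([self.user_mlp(user_ids), self.item_mlp(item_ids)], dim=-1))
        return torch.sigmoid(self.output(torch.cat([gmf, mlp], dim=-1))).squeeze()
```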
Advantages
- Learns non-linear user-item interactions
- Handles implicit feedback naturally
- Flexible architecture allows for extensions
- Integrates easily with other neural components
Limitations
- Computationally more expensive than matrix factorization
- Requires more data to train effectively
- Cold start problem remains for new users/items