Parameters vs Inputs

Parameters vs Inputs: The Core Distinction

Inputs are the data that flows through your model to get predictions. Parameters are the internal values that define how your model transforms inputs into outputs. Think of inputs as the “questions” and parameters as the “knowledge” your model uses to answer them.

Key Differences

Inputs (x)

  • Change with every prediction - Each example has different input values
  • Come from the outside world - The data you want predictions for
  • Not learned - They’re given to you
  • Flow through the model - Pass through the function to produce output
  • Examples: pixel values of an image, words in a sentence, temperature readings

Parameters (w, b, θ)

  • Stay fixed during prediction - Same values used for all examples
  • Learned during training - The model discovers these values
  • Define the model’s behavior - They ARE the model
  • Updated to minimize loss - Adjusted through optimization
  • Examples: weights, biases, convolution filters
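
To make the split concrete, here is a minimal sketch (the weights, bias, and feature values below are made up for illustration): one fixed parameter set is reused for every row of inputs.

```python
import numpy as np

# Parameters: one fixed set, shared by every example (illustrative values).
w = np.array([0.5, -1.2, 2.0])   # weights, one per input feature
b = 0.1                          # bias

# Inputs: a batch of three examples, each with three feature values.
# Every row is a different "question"; the parameters never change.
X = np.array([
    [1.0, 0.0, 3.0],
    [2.0, 1.0, 0.5],
    [0.0, 4.0, 1.0],
])

predictions = X @ w + b          # same w and b applied to every row
print(predictions)               # three different outputs, one per input
```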

Simple Analogy

Think of a recipe:

  • Inputs = ingredients (flour, eggs, milk)
  • Parameters = quantities and instructions (2 cups, 350°F, mix for 3 minutes)
  • Output = the final dish

The same recipe (parameters) can be applied to different ingredients (inputs). Training is like perfecting the recipe through trial and error.

In Mathematical Terms

f_{w,b}(x) = wx + b

  • x = input (changes for each example)
  • w, b = parameters (fixed after training)
  • When x = 5: f(5) = w(5) + b
  • When x = 10: f(10) = w(10) + b
  • Same w and b, different x
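
As a quick sketch in code, with w and b set to arbitrary “trained” values (not learned from any real data):

```python
# Illustrative trained values; in practice these come from fitting the model.
w, b = 3.0, 2.0

def f(x):
    """Linear model f_{w,b}(x) = w*x + b, with the parameters held fixed."""
    return w * x + b

print(f(5))    # 3.0*5  + 2.0 = 17.0
print(f(10))   # 3.0*10 + 2.0 = 32.0  -- same w and b, different x
```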

During Training vs Inference

Training Phase:

  • Inputs: training examples flow through
  • Parameters: continuously updated to improve predictions
  • Process: input → current parameters → prediction → compare to target → update parameters

Inference Phase:

  • Inputs: new examples flow through
  • Parameters: frozen at their trained values
  • Process: input → fixed parameters → prediction
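
A toy sketch of both phases for the linear model f_{w,b}(x) = wx + b (the training data, learning rate, and iteration count are invented for illustration): the training loop repeatedly rewrites w and b, while inference just reuses them.

```python
import numpy as np

# Toy training data (assumed for illustration): y is roughly 2x + 1.
x_train = np.array([1.0, 2.0, 3.0, 4.0])
y_train = np.array([3.0, 5.0, 7.0, 9.0])

w, b = 0.0, 0.0        # parameters start at arbitrary values
lr = 0.01              # learning rate

# Training phase: inputs flow through, parameters are repeatedly updated.
for _ in range(2000):
    y_pred = w * x_train + b                 # input -> current parameters -> prediction
    error = y_pred - y_train                 # compare to target
    w -= lr * 2 * np.mean(error * x_train)   # update parameters (MSE gradient w.r.t. w)
    b -= lr * 2 * np.mean(error)             # update parameters (MSE gradient w.r.t. b)

# Inference phase: parameters are frozen; only the inputs change.
x_new = np.array([10.0, 20.0])
print(w * x_new + b)   # roughly [21., 41.] with the learned w ≈ 2, b ≈ 1
```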

In Neural Networks

This scales up dramatically:

Inputs:

  • Image: millions of pixel values
  • Text: sequence of word tokens
  • Audio: waveform samples

Parameters:

  • Modern language model: billions of weights
  • Each layer has its own weight matrices and biases
  • All learned during training, fixed during use
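
A rough sketch of where such a count comes from, using illustrative layer sizes rather than any real model: each fully connected layer contributes a weight matrix plus a bias vector, and only those are counted.

```python
# Parameter count of a tiny fully connected network: 784 -> 128 -> 10
# (layer sizes are illustrative, e.g. flattened 28x28 images to 10 classes).
layer_sizes = [784, 128, 10]

total = 0
for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:]):
    weights = n_in * n_out   # one weight matrix per layer
    biases = n_out           # one bias per output unit
    total += weights + biases
    print(f"{n_in} -> {n_out}: {weights + biases:,} parameters")

print(f"total learned parameters: {total:,}")
# The inputs themselves (784 pixel values per image) are not counted --
# only the learned weights and biases are.
```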

Why This Distinction Matters

Generalization - Parameters capture patterns that work across many different inputs. A good model learns parameters that handle inputs it’s never seen before.

Transfer Learning - You can take learned parameters from one task and apply them to new inputs from a related task.

Model Size - When we say “GPT-3 has 175 billion parameters,” we’re counting the learned values, not the inputs it processes.

Optimization - Gradient descent updates parameters, not inputs. We’re solving for the best parameters given fixed training inputs.

Common Notation Patterns

  • Inputs: x, X, x^(i), input, features, data
  • Parameters: w, W, b, θ (theta), β (beta), weights, coefficients
  • Outputs and targets: ŷ (prediction), y (target/true value)

A Practical Example

Consider predicting house prices:

Inputs (per house):

  • Square footage: 2000
  • Bedrooms: 3
  • Location: downtown (encoded as 1)

Parameters (same for all houses):

  • Weight for sq ft: 150
  • Weight for bedrooms: 10000
  • Weight for downtown: 50000
  • Bias: 100000

Calculation: Price = 150(2000) + 10000(3) + 50000(1) + 100000 = 300,000 + 30,000 + 50,000 + 100,000 = $480,000

Different house (inputs) → same parameters → different price
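
The same computation as a short sketch; the second house is an invented example just to show the parameters being reused on new inputs.

```python
# Parameters learned once, shared by every house (values from the example above).
weights = {"sqft": 150, "bedrooms": 10000, "downtown": 50000}
bias = 100000

def predict_price(house):
    """Same parameters, different inputs -> different price."""
    return sum(weights[k] * house[k] for k in weights) + bias

house_a = {"sqft": 2000, "bedrooms": 3, "downtown": 1}   # downtown encoded as 1
house_b = {"sqft": 1500, "bedrooms": 2, "downtown": 0}   # a different, hypothetical house

print(predict_price(house_a))   # 480000, matching the calculation above
print(predict_price(house_b))   # 345000 -- same parameters, new inputs
```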

The model’s “knowledge” about how features relate to price is encoded in the parameters, while the specific house details are the inputs.