Part 3 · Core Linear Algebra · Chapter 7 · 75 min

Vectors

Arrows, lists, feature vectors, and embeddings

Learning objectives

  • Hold the geometric and the list-of-numbers views simultaneously
  • Compute sums, scalar multiples, and linear combinations
  • Compute the dot product and interpret it three ways
  • See feature vectors and embeddings as points in ℝⁿ

Why vectors are everywhere in ML

Open almost any machine-learning paper and within the first page you will meet a vector. A word becomes a vector (an embedding). An image becomes a vector of pixel intensities. A user, a molecule, a sentence — all become vectors. The reason is simple: once a thing is a list of numbers, we can measure it, compare it, transform it, and optimize over it with the machinery of linear algebra.

So our first real object of study is the vector, seen three ways at once:

  • Geometrically, as an arrow with a length and a direction.
  • Algebraically, as an ordered list of numbers.
  • Computationally, as a 1-D NumPy array of a fixed shape.

Fluency means switching between these views without friction. That is the goal of this chapter.

Intuition: an arrow and a list are the same thing

Picture the point $(3, 1)$ in the plane. Draw an arrow from the origin to it. That arrow is the vector $\mathbf{a} = (3, 1)$. The two numbers are the instructions "go 3 right, then 1 up." Every arrow from the origin corresponds to exactly one list of numbers, and vice versa. In $n$ dimensions we lose the ability to draw the arrow, but the correspondence still holds: a vector is $n$ instructions, one per axis.
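The arrow/list correspondence can be checked numerically. The sketch below is illustrative only: it stores $(3, 1)$ as a NumPy array, reads off the arrow's length and direction, and then rebuilds the list from those two geometric quantities. The variable names are assumptions made for this example.

```python
import numpy as np

# The vector a = (3, 1) as a 1-D NumPy array: "go 3 right, then 1 up".
a = np.array([3.0, 1.0])

# Geometric view: length (Euclidean norm) and direction (angle from the x-axis).
length = np.linalg.norm(a)                      # sqrt(3^2 + 1^2) = sqrt(10)
angle_rad = np.arctan2(a[1], a[0])              # direction of the arrow, in radians
angle_deg = np.degrees(angle_rad)               # roughly 18.4 degrees

# Going back: length and direction reproduce the same list of numbers.
rebuilt = length * np.array([np.cos(angle_rad), np.sin(angle_rad)])

print(a.shape)
print(length, angle_deg)
print(np.allclose(rebuilt, a))
```

The angle-based reconstruction only works in the plane; in higher dimensions the list of numbers is still available even though a single angle no longer describes the direction.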

Interactive Lab: Vector Playground

Drag the tips above. Notice that the coordinates (the list) and the arrow (the geometry) always agree — moving one moves the other.

Formal definitions

By convention a vector is a column unless stated otherwise, so $\mathbf{x} \in \mathbb{R}^n$ is really an $n \times 1$ matrix. Its transpose $\mathbf{x}^\top$ is the corresponding $1 \times n$ row vector. This column default matters the moment we multiply by matrices in the next chapter.
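In NumPy the column convention is not automatic: a plain 1-D array has a single axis and is neither a row nor a column. The following sketch, with illustrative values, shows one way to make the column and row forms explicit and inspect their shapes.

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0])   # 1-D array, shape (3,): neither row nor column

col = x.reshape(-1, 1)          # explicit column vector, shape (3, 1)
row = col.T                     # its transpose, a row vector, shape (1, 3)

print(x.shape, col.shape, row.shape)

# Transposing a 1-D array changes nothing: there is no second axis to swap.
print(x.T.shape)
```

Keeping track of which of these three shapes an array has avoids surprises once matrices enter the picture.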

The three basic operations

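Two vectors of the same dimension can be added component by component, and any vector can be scaled by a number. For $\mathbf{u}, \mathbf{v} \in \mathbb{R}^n$ and a scalar $c \in \mathbb{R}$:

$$\mathbf{u} + \mathbf{v} = (u_1 + v_1, \ldots, u_n + v_n), \qquad c\,\mathbf{v} = (c\,v_1, \ldots, c\,v_n)$$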
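As a quick illustration, addition and scaling map directly onto NumPy's `+` and `*` operators for arrays of equal shape. The vectors below are arbitrary example values.

```python
import numpy as np

u = np.array([3.0, 1.0])
v = np.array([1.0, 2.0])

total = u + v        # component-wise sum: (3 + 1, 1 + 2)
scaled = 2.0 * v     # every component multiplied by 2

print(total)
print(scaled)
```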

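A linear combination applies both at once: given scalars $c_1, \ldots, c_k$ and vectors $\mathbf{v}_1, \ldots, \mathbf{v}_k$ of the same dimension, it is the vector

$$c_1 \mathbf{v}_1 + c_2 \mathbf{v}_2 + \cdots + c_k \mathbf{v}_k = \sum_{i=1}^{k} c_i \mathbf{v}_i$$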

Linear combinations are the single most important operation in linear algebra — a neural network layer, a weighted average, and a regression prediction are all linear combinations.
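A small sketch of a linear combination, using a weighted average as the example. The vectors and coefficients are made up for illustration; because these coefficients are nonnegative and sum to 1, the result happens to be a weighted average of the three vectors.

```python
import numpy as np

# Three vectors in R^2, stored as the rows of a (3, 2) array.
V = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [2.0, 2.0]])
c = np.array([0.5, 0.25, 0.25])   # scalars c_1, c_2, c_3

# Explicit form: c_1 * v_1 + c_2 * v_2 + c_3 * v_3
explicit = sum(c_i * v_i for c_i, v_i in zip(c, V))

# Vectorized form: scale each row by its coefficient, then add the rows.
vectorized = (c[:, np.newaxis] * V).sum(axis=0)

print(explicit)
print(vectorized)
```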

The dot product, three ways

The dot product takes two vectors of the same dimension and returns a single number:

$$\mathbf{a}\cdot\mathbf{b} = \sum_{i=1}^{n} a_i b_i = a_1 b_1 + a_2 b_2 + \cdots + a_n b_n$$

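That is the algebraic view. There are two more, and holding all three together is what makes the dot product intuitive.

The second view is geometric: $\mathbf{a}\cdot\mathbf{b} = \lVert\mathbf{a}\rVert\,\lVert\mathbf{b}\rVert\cos\theta$, where $\theta$ is the angle between the two vectors.

The third view is projection: $\mathbf{a}\cdot\mathbf{b}$ measures how much of $\mathbf{a}$ lies along $\mathbf{b}$ (scaled by $\lVert\mathbf{b}\rVert$). Toggle "Show projection" in the lab above to see it.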
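The three readings of the dot product (sum of products, lengths times $\cos\theta$, and projection) can be compared on a concrete pair of vectors. This is a sketch for the 2-D case only, with example vectors chosen so that $\mathbf{b}$ lies along the x-axis and each view can be computed independently of the others.

```python
import numpy as np

a = np.array([3.0, 1.0])
b = np.array([2.0, 0.0])

# View 1 (algebraic): sum of component-wise products.
algebraic = np.sum(a * b)                       # 3*2 + 1*0

# View 2 (geometric): product of lengths times cos(theta), with theta
# measured as the difference between the two direction angles.
theta = np.arctan2(a[1], a[0]) - np.arctan2(b[1], b[0])
geometric = np.linalg.norm(a) * np.linalg.norm(b) * np.cos(theta)

# View 3 (projection): signed length of a along b, times the length of b.
# Because b points along the x-axis, the part of a along b is simply a[0].
projection = a[0] * np.linalg.norm(b)

print(algebraic, geometric, projection)
```

All three numbers agree up to floating-point rounding, which is the point of holding the views together.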

ML use case: a neuron is a dot product

Before any activation function is applied, a single artificial neuron computes exactly

$$z = \mathbf{w}^\top\mathbf{x} + b$$

where $\mathbf{x} \in \mathbb{R}^n$ is the input (a feature vector), $\mathbf{w} \in \mathbb{R}^n$ holds the learned weights, and $b \in \mathbb{R}$ is a bias. The dot product $\mathbf{w}^\top\mathbf{x}$ (the same number as $\mathbf{w}\cdot\mathbf{x}$, written in matrix notation) is a weighted sum of the features: the weights say how much each feature matters. Every dense layer of every neural network is a stack of these dot products. Understand the dot product and you understand the arithmetic core of deep learning.
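To make the neuron concrete, here is a minimal sketch with made-up numbers: the weights are illustrative rather than learned, and no activation function is applied. The second half stacks two such neurons to suggest how a dense layer repeats the same dot product with different weights.

```python
import numpy as np

x = np.array([0.5, -1.0, 2.0])   # input feature vector (illustrative values)
w = np.array([0.2, 0.4, 0.1])    # weights (illustrative, not learned here)
b = 0.3                          # bias

z = w @ x + b                    # 0.1 - 0.4 + 0.2 + 0.3

# A dense layer stacks several neurons: one weight row and one bias per neuron.
W = np.array([[0.2, 0.4, 0.1],
              [1.0, 0.0, -1.0]])
biases = np.array([0.3, 0.0])
layer = np.array([W[j] @ x + biases[j] for j in range(W.shape[0])])

print(z)
print(layer)
```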

Likewise, the similarity between two embeddings, the answer to "how alike are these two words / images / users?", is almost always a (normalized) dot product. We make that precise in the next chapter with cosine similarity.

NumPy implementation

In NumPy a vector is a 1-D array. Its shape is a one-element tuple (n,). Let us implement the dot product two ways — an explicit loop and the vectorized call — and confirm they agree. Run it:

dot_product.py
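A minimal sketch of what a `dot_product.py` along these lines might contain. The function name `dot_loop` and the sample vectors are assumptions made for this example.

```python
import numpy as np


def dot_loop(a, b):
    """Dot product via an explicit Python loop (for explanation only)."""
    if a.shape != b.shape:
        raise ValueError("vectors must have the same shape")
    total = 0.0
    for i in range(a.shape[0]):
        total += a[i] * b[i]
    return total


a = np.array([1.0, 2.0, 3.0])
b = np.array([4.0, 5.0, 6.0])

loop_result = dot_loop(a, b)   # 1*4 + 2*5 + 3*6 = 32
vec_result = a @ b             # same computation, vectorized

print(a.shape)                 # (3,)
print(loop_result, vec_result)
print(np.isclose(loop_result, vec_result))
```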

The vectorized version is not just shorter — for large n it is dramatically faster, because NumPy runs the multiply-and-add loop in optimized C over contiguous memory instead of in the Python interpreter. Prefer vectorized operations; reach for an explicit loop only to explain what an operation means.
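One way to see the speed difference is to time both versions on a larger vector. The sketch below uses the standard-library `timeit` module; the vector length and repetition count are arbitrary choices, and the measured times will vary from machine to machine, so no specific speedup is claimed here.

```python
import timeit

import numpy as np

rng = np.random.default_rng(0)
n = 100_000
a = rng.standard_normal(n)
b = rng.standard_normal(n)


def dot_loop(a, b):
    """Explicit Python loop, one multiply-add per iteration."""
    total = 0.0
    for i in range(a.shape[0]):
        total += a[i] * b[i]
    return total


# Total time for 5 calls of each version.
loop_time = timeit.timeit(lambda: dot_loop(a, b), number=5)
vec_time = timeit.timeit(lambda: a @ b, number=5)

print(f"loop: {loop_time:.4f} s, vectorized: {vec_time:.6f} s")
print(np.isclose(dot_loop(a, b), a @ b))
```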

Interactive experiment

Return to the Vector Playground and build intuition for these facts by dragging:

  • Make $\mathbf{a}\cdot\mathbf{b}$ as large as possible with fixed lengths: you will find the vectors must point the same way ($\theta = 0$).
  • Make the dot product zero — the vectors become perpendicular.
  • Make it negative — they point more than 90° apart.

These three regimes are exactly the sign behavior predicted by the geometric form.
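The same experiment can be approximated without the lab. This sketch fixes $\mathbf{a}$, sweeps a second vector of fixed length 3 through angles from 0° to 180°, and records the dot product at each step. The lengths and the 30° step are arbitrary choices for illustration.

```python
import numpy as np

a = np.array([2.0, 0.0])                       # fixed vector along the x-axis

# Second vector: fixed length 3, rotated in 30-degree steps from 0 to 180.
angles_deg = np.arange(0, 181, 30)
thetas = np.radians(angles_deg)
dots = np.array([a @ (3.0 * np.array([np.cos(t), np.sin(t)])) for t in thetas])

for deg, d in zip(angles_deg, dots):
    print(f"theta = {deg:3d} degrees -> dot product = {d: .3f}")
```

The largest value appears at 0°, the value at 90° is zero up to rounding, and every angle beyond 90° gives a negative result.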

Summary

  • A vector in $\mathbb{R}^n$ is an ordered list of $n$ numbers, equivalently an arrow with length and direction. Column form is the default.
  • Vectors of equal dimension add component-wise and scale by a number; combining both gives linear combinations, the workhorse operation of the field.
  • The dot product $\mathbf{a}\cdot\mathbf{b} = \sum_i a_i b_i$ returns a scalar and has three readings: sum of products, $\lVert\mathbf{a}\rVert\lVert\mathbf{b}\rVert\cos\theta$, and a projection. Its sign encodes directional agreement.
  • A neuron $z = \mathbf{w}^\top\mathbf{x} + b$ is a dot product plus a bias, the arithmetic core of neural networks.
  • In NumPy a vector is a 1-D array of shape (n,); use a @ b for the dot product and keep (n,) distinct from (n, 1).

Active recall

Answer from memory before checking the lesson:

  1. State the dot product of $\mathbf{a}, \mathbf{b} \in \mathbb{R}^n$ as a sum, and say what shape the result has.
  2. Two nonzero vectors have a dot product of $0$. What is the angle between them?
  3. Why is a @ b preferred over a Python for loop for the dot product?
  4. What is the shape of a NumPy vector np.array([1, 2, 3]) — is it (3,) or (3, 1)? Why does the distinction matter?

Exercises

Level A · Recall & basic calculation

Level B · Conceptual understanding

Level C · Derivation & implementation

Level D · Research-thinking challenge