Final Implementation Assessment

Implement the course's core numerical routines from scratch in pure NumPy: vector operations, matrix multiplication, a numerical derivative, gradient descent, and linear regression — each with a required signature and shape contract.

Final Implementation Assessment

Implement each task in pure NumPy (assume import numpy as np). Do not use scikit-learn, np.polyfit, np.linalg.lstsq, or np.linalg.solve for the learning tasks. np.linalg.inv is allowed only for the closed-form check in Task 5. Match the required signatures and shapes exactly, and add a shape assertion wherever one is requested.

Task 1 — Vector operations

Implement two functions on 1-D arrays of shape (n,)(n,):

  • dot(a, b) -> float — the dot product. Do not call np.dot; use an elementwise multiply and a sum.
  • l2_norm(a) -> float — the Euclidean (L2L_2) norm.

Shapes: inputs (n,)(n,); outputs are scalars. Raise ValueError if a and b have different shapes.

Task 2 — Matrix multiplication

Implement matmul(A, B) -> np.ndarray for AA of shape (m,k)(m,k) and BB of shape (k,n)(k,n), returning shape (m,n)(m,n). Do not use the @ operator or np.matmul. Raise ValueError if the inner dimensions disagree.

Task 3 — Numerical derivative

Implement numerical_derivative(f, x, h=1e-5) -> float using the central difference f(x+h)f(xh)2h\dfrac{f(x+h)-f(x-h)}{2h}, where f maps a scalar to a scalar and x is a float.

Task 4 — Gradient descent

Implement gradient_descent(grad, x0, lr, steps) -> np.ndarray. Starting from x0 of shape (d,)(d,), apply the update xxlrgrad(x)x \leftarrow x - \text{lr}\cdot \text{grad}(x) for the given number of steps and return the final x of shape (d,)(d,). Here grad(x) returns an array of shape (d,)(d,).

Task 5 — Linear regression from scratch

Given X of shape (n,d)(n,d) and y of shape (n,)(n,), implement:

  • predict(X, w, b) -> np.ndarray — shape (n,)(n,), computing y^=Xw+b\hat{y}=Xw+b with w of shape (d,)(d,) and b a float.
  • mse(yhat, y) -> float — the mean of (y^y)2(\hat{y}-y)^2.
  • gradients(X, y, w, b) -> (dw, db) — with dw of shape (d,)(d,) and db a float, using
dw=2nX(y^y),db=2ni(y^iyi). dw = \frac{2}{n}\,X^\top(\hat{y}-y), \qquad db = \frac{2}{n}\sum_i (\hat{y}_i - y_i).
  • fit(X, y, lr, epochs) -> (w, b) — batch gradient descent starting from w = np.zeros(d) and b = 0.0.

Finally, compute the closed-form solution via the normal equations and confirm the learned parameters are close to it.

Deliverable: a single module with these functions plus a short main block that fits synthetic data and prints the final MSE and the max difference between the gradient-descent and closed-form parameters.

Marking rubric

  • Task 1dot and l2_norm return correct scalars using elementwise multiply + sum (no np.dot); a shape mismatch raises ValueError.

  • Task 2matmul returns shape (m,n)(m,n) with correct values, without @/np.matmul, and raises ValueError on inner-dimension mismatch.

  • Task 3 — the central difference is used exactly; the result matches the analytic derivative to about 10610^{-6} on a test function.

  • Task 4 — the update is applied steps times in the correct direction (lrgrad-\text{lr}\cdot\text{grad}); returns shape (d,)(d,); does not mutate the caller's x0.

  • Task 5predict, mse, and gradients are correct with the right shapes; dw uses 2nX(y^y)\frac{2}{n}X^\top(\hat{y}-y) and db uses 2n(y^y)\frac{2}{n}\sum(\hat{y}-y).

  • Task 5fit converges (loss decreases each run) and lands close to the closed-form normal-equations solution.

  • Shapes & correctness — every function returns the specified shape/type; requested shape assertions are present; code is vectorized where reasonable.

  • Style — clear names, no forbidden library shortcuts, and a runnable demo that prints the final MSE and the parameter gap.