Final Implementation Assessment

Implement the course's core numerical routines from scratch in pure NumPy: vector operations, matrix multiplication, a numerical derivative, gradient descent, and linear regression — each with a required signature and shape contract.

Final Implementation Assessment

Implement each task in pure NumPy (assume import numpy as np). Do not use scikit-learn, np.polyfit, np.linalg.lstsq, or np.linalg.solve for the learning tasks. np.linalg.inv is allowed only for the closed-form check in Task 5. Match the required signatures and shapes exactly, and add a shape assertion wherever one is requested.

Task 1 — Vector operations

Implement two functions on 1-D arrays of shape $(n,)$ :

dot(a, b) -> float — the dot product. Do not call np.dot; use an elementwise multiply and a sum.
l2_norm(a) -> float — the Euclidean ( $L_2$ ) norm.

Shapes: inputs $(n,)$ ; outputs are scalars. Raise ValueError if a and b have different shapes.

Task 2 — Matrix multiplication

Implement matmul(A, B) -> np.ndarray for $A$ of shape $(m,k)$ and $B$ of shape $(k,n)$ , returning shape $(m,n)$ . Do not use the @ operator or np.matmul. Raise ValueError if the inner dimensions disagree.

Task 3 — Numerical derivative

Implement numerical_derivative(f, x, h=1e-5) -> float using the central difference $\dfrac{f(x+h)-f(x-h)}{2h}$ , where f maps a scalar to a scalar and x is a float.

Task 4 — Gradient descent

Implement gradient_descent(grad, x0, lr, steps) -> np.ndarray. Starting from x0 of shape $(d,)$ , apply the update $x \leftarrow x - \text{lr}\cdot \text{grad}(x)$ for the given number of steps and return the final x of shape $(d,)$ . Here grad(x) returns an array of shape $(d,)$ .

Task 5 — Linear regression from scratch

Given X of shape $(n,d)$ and y of shape $(n,)$ , implement:

predict(X, w, b) -> np.ndarray — shape $(n,)$ , computing $\hat{y}=Xw+b$ with w of shape $(d,)$ and b a float.
mse(yhat, y) -> float — the mean of $(\hat{y}-y)^2$ .
gradients(X, y, w, b) -> (dw, db) — with dw of shape $(d,)$ and db a float, using

dw = \frac{2}{n}\,X^\top(\hat{y}-y), \qquad db = \frac{2}{n}\sum_i (\hat{y}_i - y_i).

fit(X, y, lr, epochs) -> (w, b) — batch gradient descent starting from w = np.zeros(d) and b = 0.0.

Finally, compute the closed-form solution via the normal equations and confirm the learned parameters are close to it.

Deliverable: a single module with these functions plus a short main block that fits synthetic data and prints the final MSE and the max difference between the gradient-descent and closed-form parameters.

Marking rubric

Task 1 — dot and l2_norm return correct scalars using elementwise multiply + sum (no np.dot); a shape mismatch raises ValueError.
Task 2 — matmul returns shape $(m,n)$ with correct values, without @/np.matmul, and raises ValueError on inner-dimension mismatch.
Task 3 — the central difference is used exactly; the result matches the analytic derivative to about $10^{-6}$ on a test function.
Task 4 — the update is applied steps times in the correct direction ( $-\text{lr}\cdot\text{grad}$ ); returns shape $(d,)$ ; does not mutate the caller's x0.
Task 5 — predict, mse, and gradients are correct with the right shapes; dw uses $\frac{2}{n}X^\top(\hat{y}-y)$ and db uses $\frac{2}{n}\sum(\hat{y}-y)$ .
Task 5 — fit converges (loss decreases each run) and lands close to the closed-form normal-equations solution.
Shapes & correctness — every function returns the specified shape/type; requested shape assertions are present; code is vectorized where reasonable.
Style — clear names, no forbidden library shortcuts, and a runnable demo that prints the final MSE and the parameter gap.