Common ML Equations
MSE, linear models, cosine similarity, normalization, regularization.
Models & layers
| Name | Equation | One-line meaning |
|---|---|---|
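| Linear model | $\hat{y} = w^\top x + b$ | Weighted sum of features plus a bias. |
| Dense / affine layer | $h = Wx + b$ | Matrix $W$ maps input $x$ to a new feature vector; $b$ is the bias. |
| Sigmoid | $\sigma(z) = \frac{1}{1 + e^{-z}}$ | Squashes a real number into $(0, 1)$ (a probability). |
| Softmax | $\mathrm{softmax}(z)_i = \frac{e^{z_i}}{\sum_j e^{z_j}}$ | Turns a score vector into a probability distribution. |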
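| Attention scores (shape) | $S = \frac{QK^\top}{\sqrt{d_k}}$, with $Q \in \mathbb{R}^{n \times d_k}$, $K \in \mathbb{R}^{m \times d_k}$, so $S \in \mathbb{R}^{n \times m}$ | One score per query-key pair. |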
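
The model and layer definitions in this table can be sketched in plain Python. This is a minimal illustration, not a reference implementation: the function names (`linear`, `dense`, `sigmoid`, `softmax`) and the example numbers are assumptions chosen for this sketch, and vectors are ordinary lists so that nothing beyond the standard library is needed.

```python
import math

def linear(w, x, b):
    """Linear model: weighted sum of features plus a bias."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b

def dense(W, x, b):
    """Affine layer: one linear model per row of W (W is a list of rows)."""
    return [linear(row, x, bi) for row, bi in zip(W, b)]

def sigmoid(z):
    """Logistic function; the two branches avoid overflow in exp for large |z|."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def softmax(z):
    """Softmax; subtracting max(z) first leaves the result unchanged but keeps exp stable."""
    m = max(z)
    exps = [math.exp(zi - m) for zi in z]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical example values.
x = [1.0, 2.0]
y_hat = linear([0.5, -0.25], x, 0.1)
h = dense([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]], x, [0.0, 0.0, 0.0])
probs = softmax(h)  # non-negative entries that sum to 1
```

Subtracting the maximum inside `softmax` is a common stability trick rather than part of the definition; the output is mathematically identical to the formula in the table.
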
Losses & regularization
| Name | Equation | One-line meaning |
|---|---|---|
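| MSE | $\mathrm{MSE} = \frac{1}{n} \sum_{i=1}^{n} (y_i - \hat{y}_i)^2$ | Average squared error for regression. |
| Binary cross-entropy | $L = -\left[ y \log p + (1 - y) \log(1 - p) \right]$ | Penalizes confident wrong probabilities for a 0/1 label. |
| L2 / ridge penalty | $\lambda \lVert w \rVert_2^2$ | Shrinks weights toward zero to fight overfitting. |
| Cosine similarity | $\cos(a, b) = \frac{a^\top b}{\lVert a \rVert \, \lVert b \rVert}$ | Angle-based similarity in $[-1, 1]$, scale-invariant. |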
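
As a rough sketch of how the entries in this table translate to code, the functions below compute each quantity on plain Python lists. The names, the `eps` clipping constant, and averaging the cross-entropy over a batch are assumptions made for this illustration, not something the table prescribes.

```python
import math

def mse(y, y_hat):
    """Mean squared error over n examples."""
    return sum((a - b) ** 2 for a, b in zip(y, y_hat)) / len(y)

def binary_cross_entropy(y, p, eps=1e-12):
    """Binary cross-entropy averaged over the batch.

    Probabilities are clipped to [eps, 1 - eps] so log never sees 0.
    """
    total = 0.0
    for yi, pi in zip(y, p):
        pi = min(max(pi, eps), 1.0 - eps)
        total += -(yi * math.log(pi) + (1 - yi) * math.log(1.0 - pi))
    return total / len(y)

def l2_penalty(w, lam):
    """Ridge penalty: lam times the squared Euclidean norm of w."""
    return lam * sum(wi * wi for wi in w)

def cosine_similarity(a, b):
    """Cosine of the angle between a and b (both must be non-zero vectors)."""
    dot = sum(ai * bi for ai, bi in zip(a, b))
    norm_a = math.sqrt(sum(ai * ai for ai in a))
    norm_b = math.sqrt(sum(bi * bi for bi in b))
    return dot / (norm_a * norm_b)
```

For instance, `cosine_similarity([1.0, 2.0], [2.0, 4.0])` is 1 up to rounding because the vectors point the same way, which is what "scale-invariant" means here. A confident wrong prediction such as `binary_cross_entropy([1], [0.01])` costs far more than a confident right one such as `binary_cross_entropy([1], [0.99])`.
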
Optimization
| Name | Equation | One-line meaning |
|---|---|---|
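| Gradient-descent update | $\theta \leftarrow \theta - \eta \, \nabla_\theta L(\theta)$ | Step downhill along the loss gradient with learning rate $\eta$. |
| Gradient | $\nabla_\theta L = \left[ \frac{\partial L}{\partial \theta_1}, \dots, \frac{\partial L}{\partial \theta_n} \right]^\top$ | Vector of partials; points toward steepest increase of $L$. |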
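
To make the update rule concrete, here is a small sketch that repeatedly steps against the gradient. The toy loss $L(\theta) = (\theta_1 - 3)^2 + (\theta_2 + 1)^2$, its hand-derived gradient, the function names, and the choice of learning rate and step count are all assumptions picked for illustration.

```python
def grad_toy_loss(theta):
    """Gradient of the toy loss L(theta) = (theta[0] - 3)^2 + (theta[1] + 1)^2."""
    return [2.0 * (theta[0] - 3.0), 2.0 * (theta[1] + 1.0)]

def gradient_descent(grad_fn, theta, lr=0.1, steps=100):
    """Apply theta <- theta - lr * grad(theta) for a fixed number of steps."""
    for _ in range(steps):
        g = grad_fn(theta)
        theta = [t - lr * gi for t, gi in zip(theta, g)]
    return theta

# Starting from the origin, the iterates approach the minimizer (3, -1).
theta = gradient_descent(grad_toy_loss, [0.0, 0.0])
```

On this quadratic each step shrinks the distance to the minimizer by a factor of $1 - 2\eta = 0.8$, so 100 steps are more than enough. On real losses the learning rate usually has to be tuned, and too large a value can make the iterates diverge.
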