
Building Linear Regression From Scratch: Mastering the Fundamentals 🚀

Linear regression is the bedrock of machine learning: simple yet powerful. To truly master it, I decided to build it from scratch, implementing every core step myself.


Theory: Cost Function and Gradient Descent

1. Cost Function (Mean Squared Error - MSE)

We define the cost function as:

$$J(w, b) = \frac{1}{2n} \sum_{i=1}^{n} \left(\hat{y}_i - y_i\right)^2$$

Where:

  • $n$ is the number of training examples

  • $\hat{y}_i$ is the predicted value for example $i$

  • $y_i$ is the true value for example $i$
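
As a quick sanity check with toy numbers: for $n = 2$, predictions $\hat{y} = (3, 5)$ and targets $y = (2, 6)$, the cost is $\frac{1}{2 \cdot 2}\left((3-2)^2 + (5-6)^2\right) = 0.5$.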

2. Gradient Descent Update Rules

To minimize the cost function, we update the parameters $w$ and $b$ using:

$$w := w - \alpha \frac{\partial J}{\partial w}, \qquad b := b - \alpha \frac{\partial J}{\partial b}$$

Where:

  • $\alpha$ is the learning rate, which controls the step size of each update

  • the gradients are calculated as:

$$\frac{\partial J}{\partial w} = \frac{1}{n} \sum_{i=1}^{n} \left(\hat{y}_i - y_i\right) x_i, \qquad \frac{\partial J}{\partial b} = \frac{1}{n} \sum_{i=1}^{n} \left(\hat{y}_i - y_i\right)$$

where $x_i$ is the feature vector of the $i$-th example.
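
The $\frac{1}{2}$ in the cost is chosen precisely so that these derivatives come out clean. For a single weight $w_j$, since $\hat{y}_i = \sum_k w_k x_{ik} + b$, the chain rule gives:

$$\frac{\partial J}{\partial w_j} = \frac{1}{2n} \sum_{i=1}^{n} 2\left(\hat{y}_i - y_i\right)\frac{\partial \hat{y}_i}{\partial w_j} = \frac{1}{n} \sum_{i=1}^{n} \left(\hat{y}_i - y_i\right) x_{ij}$$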

💻 Implementation

1. Data Preprocessing

To ensure faster convergence and stable optimization, the input features were standardized using Scikit-learn’s StandardScaler.

This was crucial—without feature scaling, the gradients would oscillate or diverge.
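
As a minimal sketch of that step (the toy array and variable names here are illustrative, not the project's actual code):

import numpy as np
from sklearn.preprocessing import StandardScaler

# Toy feature matrix standing in for the raw Auto MPG feature columns
X_raw = np.array([[307.0, 130.0, 3504.0],
                  [350.0, 165.0, 3693.0],
                  [318.0, 150.0, 3436.0]])

scaler = StandardScaler()
X_scaled = scaler.fit_transform(X_raw)   # each column now has mean 0 and unit variance
print(X_scaled.mean(axis=0), X_scaled.std(axis=0))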

📊 Data: Auto MPG Dataset

🎯 Goal: Predict a car’s fuel efficiency (mpg) using engine and car specs.

🔗 Source: Available in seaborn or directly via UCI ML repo.
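
One way to load and prepare it, assuming the seaborn copy of the dataset; the exact feature columns used in the project aren't listed, so the selection below is just one reasonable choice:

import seaborn as sns

# Load the Auto MPG dataset bundled with seaborn and drop rows with missing values
df = sns.load_dataset("mpg").dropna()

# Numeric engine/car specs as features, fuel efficiency (mpg) as target
feature_cols = ["cylinders", "displacement", "horsepower",
                "weight", "acceleration", "model_year"]
X = df[feature_cols].values
y = df["mpg"].values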

2. Code

The core class LinearRegressionScratch contains:

  • fit() — for model training using gradient descent

  • predict() — for making predictions

  • update_params() — for applying gradients

Weights and bias are initialized to zeros and iteratively updated over 10,000 iterations.
Here is the implementation of linear regression from scratch, piece by piece:

2.1 Initialization

import numpy as np

class LinearRegressionScratch:
    def __init__(self, learning_rate=0.01, n_iterations=10000):
        # Gradient descent hyperparameters
        self.learning_rate = learning_rate
        self.n_iterations = n_iterations

This initializes the model with a specified learning rate and number of iterations.

2.2 Training the Model (fit)

    def fit(self, X, y):
        # n = number of training examples, m = number of features
        self.n, self.m = X.shape
        self.w = np.zeros(self.m)   # one weight per feature
        self.b = 0                  # bias term
        self.X = X
        self.y = y
        self.losses = []

        # Run gradient descent and record the loss at every iteration
        for i in range(self.n_iterations):
            self.update_params()
            loss = self.compute_loss()
            self.losses.append(loss)

        return self

This method runs gradient descent and tracks the loss at every iteration.
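
fit() relies on update_params() to apply the gradients. A version consistent with the update rules derived above would look like this (a sketch, not necessarily the exact original):

    def update_params(self):
        # Gradients of the (half) MSE cost with respect to w and b
        y_pred = self.predict(self.X)
        error = y_pred - self.y
        dw = (1 / self.n) * np.dot(self.X.T, error)
        db = (1 / self.n) * np.sum(error)

        # Gradient descent step
        self.w -= self.learning_rate * dw
        self.b -= self.learning_rate * db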

2.3 Compute Loss (compute_loss)

    def compute_loss(self):
        # Cost = (1 / 2n) * Σ (ŷ - y)², i.e. half the mean squared error
        y_pred = self.predict(self.X)
        loss = (1/(2*self.n)) * np.sum((y_pred - self.y) ** 2)
        return loss

Computes the cost (half the MSE) at the current parameters; the 1/2 factor cancels neatly when the gradient is taken.

2.4 Prediction (predict)

    def predict(self, X):
        # Linear model: ŷ = Xw + b
        return np.dot(X, self.w) + self.b

Used for generating predictions on new data.
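
Putting the pieces together, training and predicting could be wired up as follows (a sketch assuming the X and y prepared earlier; the split ratio and random seed are illustrative):

from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Hold out a test set, then standardize using statistics from the training set only
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)

# Train the scratch model and generate predictions on unseen data
model = LinearRegressionScratch(learning_rate=0.01, n_iterations=10000)
model.fit(X_train_scaled, y_train)
y_pred = model.predict(X_test_scaled)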

📉 Visualizing the Loss Curve


Loss decreases smoothly over iterations, indicating effective learning.


Monitoring loss during training was critical to detect divergence and confirm correct implementation of gradient descent.
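
The loss curve can be reproduced directly from the losses list tracked in fit(); a minimal matplotlib sketch, assuming a trained model instance:

import matplotlib.pyplot as plt

# Plot the cost recorded at every gradient descent iteration
plt.plot(model.losses)
plt.xlabel("Iteration")
plt.ylabel("Loss (half MSE)")
plt.title("Training loss over iterations")
plt.show()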

⚡ Benchmark Comparison

Model                            RMSE
Custom Scratch Model             24.4751
Scikit-learn LinearRegression    22.1532

This shows that the custom implementation reaches performance comparable to a production-grade library on the same data.
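
A comparison along these lines can be produced as follows (a sketch assuming the trained model and the scaled train/test split from above; the exact numbers will vary with the split and hyperparameters):

import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error

# RMSE of the scratch model on the held-out test set
rmse_scratch = np.sqrt(mean_squared_error(y_test, model.predict(X_test_scaled)))

# RMSE of scikit-learn's LinearRegression on the same split
sk_model = LinearRegression().fit(X_train_scaled, y_train)
rmse_sklearn = np.sqrt(mean_squared_error(y_test, sk_model.predict(X_test_scaled)))

print(f"Scratch model RMSE:  {rmse_scratch:.4f}")
print(f"scikit-learn RMSE:   {rmse_sklearn:.4f}")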

🧠 Key Takeaways

  • Gradient descent is extremely sensitive to feature scale.

  • Debugging gradient formulas built true understanding.

  • Visualizing loss is essential — it provides a heartbeat of the learning process.

  • Understanding convergence from first principles gives a deeper grasp than black-box usage.

What’s Next?

In the coming weeks:

  • Polynomial Regression (non-linearity from scratch)

  • Ridge and Lasso Regression (regularization from scratch)

  • Deriving and visualizing bias-variance trade-off

🏁 Conclusion

This project helped me internalize the fundamentals of optimization, modeling, and numerical learning — lessons that will compound as I scale deeper into advanced models.

📎 Repository

GitHub: Linear Regression from Scratch
