March 28, 2026 · 4 min read

Linear Regression Calculator — Line of Best Fit, Slope & R²

Calculate linear regression with least squares method. Find slope, y-intercept, and R² value. Interpret your line of best fit with worked examples and formulas.

Tags: linear regression, statistics, least squares, R squared, calchub

Linear regression finds the straight line that best fits a set of data points — the one minimizing the sum of squared vertical distances from the points to the line. It's the workhorse of predictive statistics, from predicting house prices to estimating crop yields. Run yours with the CalcHub Linear Regression Calculator.

The Least Squares Formulas

Given pairs (x₁, y₁), (x₂, y₂), ..., (xₙ, yₙ), the best-fit line is ŷ = mx + b where:

Slope: $$m = \frac{n\Sigma(x_i y_i) - \Sigma x_i \cdot \Sigma y_i}{n\Sigma x_i^2 - (\Sigma x_i)^2}$$

Intercept: $$b = \bar{y} - m\bar{x}$$

Where x̄ and ȳ are the means of x and y respectively.
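The two formulas translate directly into a few lines of Python. This is a minimal sketch — `least_squares` is an illustrative helper, not part of the CalcHub calculator itself:

```python
def least_squares(xs, ys):
    """Slope m and intercept b of the least-squares line y = m*x + b."""
    n = len(xs)
    sx, sy = sum(xs), sum(ys)
    sxy = sum(x * y for x, y in zip(xs, ys))   # Σ(x_i * y_i)
    sxx = sum(x * x for x in xs)               # Σ(x_i²)
    m = (n * sxy - sx * sy) / (n * sxx - sx ** 2)
    b = sy / n - m * (sx / n)                  # b = ȳ − m·x̄
    return m, b
```

Passing in the study-hours data from the example below returns the same m = 8.5 and b = 41.5 derived by hand.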

Worked Example

A study tracks hours studied vs. exam score for 5 students:

| Hours (x) | Score (y) | x² | xy |
|---|---|---|---|
| 1 | 50 | 1 | 50 |
| 2 | 60 | 4 | 120 |
| 3 | 65 | 9 | 195 |
| 4 | 75 | 16 | 300 |
| Σ = 15 | Σ = 335 | Σ = 55 | Σ = 1090 |
n = 5, x̄ = 3, ȳ = 67
m = (5×1090 − 15×335) / (5×55 − 15²) = (5450 − 5025) / (275 − 225) = 425 / 50 = 8.5
b = 67 − 8.5×3 = 67 − 25.5 = 41.5
Line of best fit: ŷ = 8.5x + 41.5

Prediction: a student studying 6 hours → ŷ = 8.5(6) + 41.5 = 92.5
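The prediction step is just substituting x into the fitted line. A quick sketch using the slope and intercept from the worked example:

```python
m, b = 8.5, 41.5  # fitted values from the worked example above

def predict(x):
    """Predicted exam score for x hours of study."""
    return m * x + b

print(predict(6))  # 92.5
```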

Interpreting R² (Coefficient of Determination)

R² measures how much of the variation in y is explained by the linear relationship with x.

Formula: $$R^2 = 1 - \frac{SS_{res}}{SS_{tot}}$$

Where SS_res = Σ(yᵢ − ŷᵢ)² and SS_tot = Σ(yᵢ − ȳ)²
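Computing R² from those two sums of squares takes only a few lines. A minimal sketch (the function name is illustrative):

```python
def r_squared(xs, ys, m, b):
    """Coefficient of determination for the fitted line y = m*x + b."""
    y_bar = sum(ys) / len(ys)
    ss_res = sum((y - (m * x + b)) ** 2 for x, y in zip(xs, ys))  # Σ(yᵢ − ŷᵢ)²
    ss_tot = sum((y - y_bar) ** 2 for y in ys)                    # Σ(yᵢ − ȳ)²
    return 1 - ss_res / ss_tot
```

On the study-hours data with m = 8.5 and b = 41.5, this gives R² ≈ 0.99.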

| R² value | Interpretation |
|---|---|
| 0.00 – 0.19 | Very weak fit |
| 0.20 – 0.39 | Weak fit |
| 0.40 – 0.59 | Moderate fit |
| 0.60 – 0.79 | Strong fit |
| 0.80 – 1.00 | Very strong fit |
In the example above, SS_res = 7.5 and SS_tot = 730, so R² ≈ 0.99 — nearly all the variation in scores is explained by study hours.

Residuals: The Gaps Between Reality and the Line

A residual is the difference between an observed y value and the predicted ŷ value:

eᵢ = yᵢ − ŷᵢ

For the student who studied 3 hours with a score of 65:
Predicted: ŷ = 8.5(3) + 41.5 = 67
Residual: 65 − 67 = −2

Plotting residuals helps spot patterns — if they're random around zero, the linear model is appropriate. If they curve, you might need polynomial regression.
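Computing all five residuals for the worked example makes the pattern check concrete. A minimal sketch:

```python
xs = [1, 2, 3, 4, 5]       # hours studied
ys = [50, 60, 65, 75, 85]  # exam scores
m, b = 8.5, 41.5           # fitted line from the worked example

# Residual for each point: observed minus predicted
residuals = [y - (m * x + b) for x, y in zip(xs, ys)]
print(residuals)  # [0.0, 1.5, -2.0, -0.5, 1.0]
```

The residuals scatter around zero with no obvious trend, which is consistent with a linear model being appropriate here.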

Assumptions and Limitations

Linear regression assumes:


  • A roughly linear relationship between x and y
  • Residuals are normally distributed with constant variance
  • Observations are independent

It breaks down when the true relationship is curved, when outliers are extreme, or when you're extrapolating far beyond your data range.

What does a negative slope mean?

A negative slope (m < 0) means y decreases as x increases — they're inversely related. For example, more hours of TV watched might correlate with fewer hours studying.

Is linear regression the same as correlation?

Related but different. Correlation (Pearson r) measures the strength and direction of the linear relationship. Regression goes further — it gives you an actual predictive equation. R² equals r² (Pearson r squared), which is why a strong correlation gives a high R².
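The identity R² = r² is easy to check numerically. A sketch computing Pearson r from its sum-of-products form (`pearson_r` is an illustrative helper):

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient r between xs and ys."""
    n = len(xs)
    sx, sy = sum(xs), sum(ys)
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    syy = sum(y * y for y in ys)
    return (n * sxy - sx * sy) / sqrt((n * sxx - sx ** 2) * (n * syy - sy ** 2))
```

For the study-hours data, r ≈ 0.995, and squaring it reproduces the R² ≈ 0.99 found earlier.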

Can I use linear regression with one variable to predict the future?

Yes, with caution. Extrapolating beyond your data range is risky — the linear trend might not hold. Always be skeptical of predictions far outside the range of x values used to build the model.
