Correlation Calculator — Pearson r Coefficient & Interpretation
Calculate Pearson correlation coefficient (r) for any two variables. Understand the -1 to +1 scale, interpret strength and direction, and avoid the causation trap.
Correlation answers a specific question: do two variables tend to move together? When ice cream sales go up, drowning incidents also rise. Before you conclude ice cream kills people, the correlation coefficient gives you a number to work with — and the context to interpret it. Use the CalcHub Correlation Calculator to calculate Pearson r for your data.
The Pearson r Formula
$$r = \frac{n\Sigma x_i y_i - \Sigma x_i \cdot \Sigma y_i}{\sqrt{[n\Sigma x_i^2 - (\Sigma x_i)^2][n\Sigma y_i^2 - (\Sigma y_i)^2]}}$$
r always falls between -1 and +1.
Equivalent form using standard deviations: $$r = \frac{\Sigma(x_i - \bar{x})(y_i - \bar{y})}{(n-1) \cdot s_x \cdot s_y}$$
Worked Example
Is temperature (°C) associated with café sales (units/day)?
| Temp (x) | Sales (y) |
|---|---|
| 20 | 40 |
| 25 | 55 |
| 30 | 70 |
| 15 | 30 |
| 35 | 80 |
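- n = 5, Σx = 125, Σy = 275
- Σx² = 3375, Σy² = 16825
- Σxy = 7525
$$r = \frac{5(7525) - (125)(275)}{\sqrt{[5(3375) - 125^2][5(16825) - 275^2]}} = \frac{3250}{\sqrt{1250 \times 8500}} \approx 0.997$$
Very strong positive correlation: warmer days, more sales.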
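The arithmetic for this example can be checked with a few lines of Python. This is a minimal sketch that applies the raw-sum formula from the section above to the table data; the variable names are illustrative only.

```python
from math import sqrt

# Data from the worked example table.
temps = [20, 25, 30, 15, 35]   # x, temperature in °C
sales = [40, 55, 70, 30, 80]   # y, sales in units/day

n = len(temps)
sum_x = sum(temps)
sum_y = sum(sales)
sum_x2 = sum(x * x for x in temps)
sum_y2 = sum(y * y for y in sales)
sum_xy = sum(x * y for x, y in zip(temps, sales))

# Raw-sum form of the Pearson formula.
numerator = n * sum_xy - sum_x * sum_y
denominator = sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
r = numerator / denominator

print(f"Σx={sum_x}, Σy={sum_y}, Σx²={sum_x2}, Σy²={sum_y2}, Σxy={sum_xy}")
print(f"r = {r:.3f}")
```

Running it prints the five sums followed by r = 0.997.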
Interpreting the r Value
| r Range | Interpretation |
|---|---|
| 0.90 to 1.00 | Very strong positive |
| 0.70 to 0.89 | Strong positive |
| 0.50 to 0.69 | Moderate positive |
| 0.30 to 0.49 | Weak positive |
| 0.10 to 0.29 | Very weak positive |
| −0.09 to 0.09 | Essentially none |
| −0.29 to −0.10 | Very weak negative |
| −0.49 to −0.30 | Weak negative |
| −0.69 to −0.50 | Moderate negative |
| −0.89 to −0.70 | Strong negative |
| −1.00 to −0.90 | Very strong negative |
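The table can be turned into a small lookup. The sketch below uses a hypothetical helper, `interpret_r`, and assumes that a value exactly on a boundary (for example 0.10 or 0.30) belongs to the stronger band; the table itself does not settle that, so treat the cut-offs as a convention rather than a rule.

```python
def interpret_r(r: float) -> str:
    """Map r to the labels in the table above (hypothetical helper)."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("r must lie between -1 and +1")
    strength = abs(r)
    if strength < 0.10:
        return "Essentially none"
    if strength >= 0.90:
        label = "Very strong"
    elif strength >= 0.70:
        label = "Strong"
    elif strength >= 0.50:
        label = "Moderate"
    elif strength >= 0.30:
        label = "Weak"
    else:
        label = "Very weak"
    direction = "positive" if r > 0 else "negative"
    return f"{label} {direction}"

labels = [interpret_r(v) for v in (0.997, 0.55, -0.35, 0.05)]
print(labels)
```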
Correlation ≠ Causation
This cannot be overstated. Spurious correlations are everywhere:
- Per capita cheese consumption correlates with deaths by bedsheet tangling (r ≈ 0.95)
- Nicolas Cage film releases per year correlate with drownings in swimming pools
- Ice cream sales and drowning both rise in summer — the hidden variable is temperature
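The hidden-variable effect is easy to reproduce with synthetic data. In the sketch below every number is made up: temperature drives both series, and neither series influences the other. The raw correlation between them still comes out high. The partial correlation, which removes the linear effect of temperature, should land near zero.

```python
import random
from math import sqrt

def pearson_r(xs, ys):
    """Pearson r via the mean-centred form of the formula."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

rng = random.Random(42)  # fixed seed so the run is repeatable

# Synthetic data (assumption): both outcomes depend only on temperature.
temperature = [rng.uniform(10, 35) for _ in range(500)]
ice_cream = [10 * t + rng.gauss(0, 30) for t in temperature]
drownings = [0.5 * t + rng.gauss(0, 2) for t in temperature]

r_raw = pearson_r(ice_cream, drownings)
r_it = pearson_r(ice_cream, temperature)
r_dt = pearson_r(drownings, temperature)

# First-order partial correlation, controlling for temperature.
r_partial = (r_raw - r_it * r_dt) / sqrt((1 - r_it ** 2) * (1 - r_dt ** 2))

print(f"raw r = {r_raw:.2f}, partial r given temperature = {r_partial:.2f}")
```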
Other Types of Correlation
| Type | Use Case |
|---|---|
| Pearson r | Continuous data with a linear relationship; approximate normality matters for significance tests |
| Spearman ρ | Ranked/ordinal data, or monotonic non-linear relationships |
| Kendall τ | Small samples, ordinal data |
| Point-Biserial | One continuous, one binary variable |
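To show how Pearson and Spearman differ in practice, the sketch below computes Spearman ρ as Pearson r applied to ranks, using made-up data that rises steadily but along a curve. The helpers `ranks` and `spearman_rho` are illustrative names, not part of any library.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson r via the mean-centred form of the formula."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

def ranks(values):
    """Assign 1-based ranks; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = avg_rank
        i = j + 1
    return result

def spearman_rho(xs, ys):
    """Spearman rho computed as Pearson r on the ranked data."""
    return pearson_r(ranks(xs), ranks(ys))

x = list(range(1, 11))
y = [2 ** v for v in x]  # made-up data: always increasing, strongly curved

r_linear = pearson_r(x, y)
rho = spearman_rho(x, y)
print(f"Pearson r = {r_linear:.2f}, Spearman rho = {rho:.2f}")
```

Because y never decreases as x increases, the ranks match perfectly and ρ is 1, while Pearson r is noticeably lower because the points do not sit on a straight line.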
R² and Explained Variance
For a straight-line fit of y on x, squaring the correlation gives R², the proportion of variance in y explained by x. If r = 0.7, then R² = 0.49: the x variable explains 49% of the variation in y, leaving 51% unexplained.
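The link between r and explained variance can be checked numerically. The sketch below reuses the café data from the worked example, fits an ordinary least-squares line, and compares 1 − SS_res/SS_tot with r². It assumes a simple one-predictor linear fit, which is the setting where the two quantities coincide.

```python
from math import sqrt

temps = [20, 25, 30, 15, 35]   # x, from the worked example
sales = [40, 55, 70, 30, 80]   # y, from the worked example

n = len(temps)
mean_x = sum(temps) / n
mean_y = sum(sales) / n
sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(temps, sales))
sxx = sum((x - mean_x) ** 2 for x in temps)
syy = sum((y - mean_y) ** 2 for y in sales)

r = sxy / sqrt(sxx * syy)

# Ordinary least-squares line: y = intercept + slope * x.
slope = sxy / sxx
intercept = mean_y - slope * mean_x

# Share of the variance in y that the fitted line accounts for.
ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(temps, sales))
ss_tot = syy
r_squared_fit = 1 - ss_res / ss_tot

print(f"r^2 = {r ** 2:.4f}, R^2 from the fit = {r_squared_fit:.4f}")
```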
What sample size do I need for reliable correlation?
As a rough guide: n ≥ 30 for a stable estimate. With n < 10, a spurious high correlation is quite likely by chance. You can test statistical significance using the t-distribution: t = r√(n-2) / √(1-r²), with n-2 degrees of freedom.
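The t formula above can be sketched as follows. The function name `t_statistic` is hypothetical, and the critical values are the usual two-tailed 5% entries from a standard t-table (df = 6 and df = 28), hard-coded here as an assumption instead of being computed. The same r = 0.7 is tested at two sample sizes to show how much n matters.

```python
from math import sqrt

def t_statistic(r: float, n: int) -> float:
    """t = r * sqrt(n - 2) / sqrt(1 - r^2), with n - 2 degrees of freedom."""
    if n < 3:
        raise ValueError("need at least 3 observations")
    if abs(r) >= 1:
        raise ValueError("t is undefined when |r| = 1")
    return r * sqrt(n - 2) / sqrt(1 - r * r)

# Two-tailed 5% critical values from a standard t-table (assumed, not computed).
T_CRIT = {6: 2.447, 28: 2.048}

r = 0.7
t_8 = t_statistic(r, 8)     # df = 6
t_30 = t_statistic(r, 30)   # df = 28

significant_8 = abs(t_8) > T_CRIT[6]
significant_30 = abs(t_30) > T_CRIT[28]

print(f"n=8:  t = {t_8:.3f}, significant at 5%: {significant_8}")
print(f"n=30: t = {t_30:.3f}, significant at 5%: {significant_30}")
```

With only 8 observations, r = 0.7 falls just short of the 5% threshold; with 30 it clears it comfortably.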
Can correlation be used with categorical data?
Not directly with Pearson r, which requires numeric data. For two categorical variables, use Cramér's V. For one categorical and one numeric variable, use point-biserial correlation if the categorical variable is binary, or ANOVA if it has more than two groups.
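As an illustration of the categorical case, here is a minimal sketch of Cramér's V computed from a contingency table of counts. The function name `cramers_v` and the 2×2 table are made up for the example, and no continuity or bias correction is applied.

```python
from math import sqrt

def cramers_v(table):
    """Cramér's V from a contingency table given as a list of rows of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)

    # Pearson chi-square statistic: sum of (observed - expected)^2 / expected.
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected

    k = min(len(table), len(table[0])) - 1
    return sqrt(chi2 / (n * k))

# Hypothetical counts: rows are two groups, columns are two outcomes.
counts = [
    [30, 10],
    [10, 30],
]
v = cramers_v(counts)
print(f"Cramér's V = {v:.2f}")
```

Like |r|, V runs from 0 (no association) to 1 (perfect association), but it carries no sign because categories have no direction.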
Why is my correlation near zero even though the variables are clearly related?
Pearson r only detects linear relationships. If your data follows a U-shape or any curved pattern, r could be close to zero while the association is actually very strong. Always plot your data before trusting any correlation figure.
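A quick way to see this is a perfect U-shape. In the sketch below (made-up data), y is completely determined by x, yet Pearson r comes out as zero because the falling left half and the rising right half cancel each other.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson r via the mean-centred form of the formula."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

x = list(range(-5, 6))     # -5, -4, ..., 5
y = [v * v for v in x]     # exact U-shape: y = x^2

r_u = pearson_r(x, y)
print(f"r = {r_u:.3f}")
```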