What is Correlation?
In everyday language, correlation simply means being related or connected. In statistics, correlation refers to an association between two or more variables: as one variable changes, the other tends to change with it.
When we try to identify the statistical relationship between different variables, we must do a correlation analysis. There are different ways in which correlation can be studied statistically.
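Correlation is measured using a coefficient of correlation. In the case of linear correlation, the measure is Karl Pearson’s coefficient of correlation, which measures how closely the variables move together along a straight line. It ranges from -1 (a perfect negative linear relationship) through 0 (no linear relationship) to +1 (a perfect positive linear relationship).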
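As a rough illustration of how Karl Pearson’s coefficient can be computed, here is a minimal sketch in plain Python. The function name pearson_r and the sample data (hours_studied, exam_score) are made up for this example and do not come from any particular library or dataset.

```python
import math

def pearson_r(x, y):
    """Pearson's correlation coefficient for two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Deviations of each observation from its own mean.
    dx = [a - mean_x for a in x]
    dy = [b - mean_y for b in y]
    # Sum of cross-products and sums of squares.
    cross = sum(a * b for a, b in zip(dx, dy))
    ss_x = sum(a * a for a in dx)
    ss_y = sum(b * b for b in dy)
    if ss_x == 0 or ss_y == 0:
        raise ValueError("correlation is undefined when a variable is constant")
    return cross / math.sqrt(ss_x * ss_y)

# Hypothetical sample data, invented for illustration only.
hours_studied = [1, 2, 3, 4, 5]
exam_score = [52, 55, 61, 64, 70]

r = pearson_r(hours_studied, exam_score)
print(f"Pearson r = {r:.3f}")  # close to +1: the two rise together
```

A value near +1 or -1 suggests a strong linear relationship, while a value near 0 suggests little or no linear relationship.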
To measure linear correlation between variables, we must make a few assumptions. The most important ones are normality, homoscedasticity (a roughly constant spread of one variable across the values of the other), and linearity.
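When the variables do not contain “normal” data, that is, when the data is not normally distributed or is ordinal (ranked categories), we should not use Karl Pearson’s coefficient of correlation. Instead, we use Spearman’s rank correlation, which is computed from the ranks of the values rather than the values themselves. Purely nominal categories, which have no natural order, cannot be ranked and call for other measures of association.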
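One common way to compute Spearman’s rank correlation is to replace each value by its rank (averaging the ranks of tied values) and then apply Pearson’s formula to the ranks. The sketch below follows that approach; the helper names average_ranks, pearson_r, and spearman_rho, as well as the sample data, are assumptions made for this example.

```python
import math

def average_ranks(values):
    """Rank values from 1 to n; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # positions are 0-based, ranks are 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def pearson_r(x, y):
    """Pearson's correlation coefficient for two equal-length sequences."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    dx = [a - mean_x for a in x]
    dy = [b - mean_y for b in y]
    cross = sum(a * b for a, b in zip(dx, dy))
    return cross / math.sqrt(sum(a * a for a in dx) * sum(b * b for b in dy))

def spearman_rho(x, y):
    """Spearman's rank correlation: Pearson's r applied to the ranks."""
    return pearson_r(average_ranks(x), average_ranks(y))

# Hypothetical data: y always increases with x, but not along a straight line.
x = [1, 2, 3, 4, 5]
y = [1, 8, 27, 64, 125]

print(f"Pearson r    = {pearson_r(x, y):.3f}")
print(f"Spearman rho = {spearman_rho(x, y):.3f}")
```

Because the ranks of x and y agree exactly here, Spearman’s coefficient reports a perfect monotonic relationship, while Pearson’s coefficient comes out lower because the relationship is curved.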
The fact that two variables are correlated does not indicate that one of them causes the other. Correlation merely indicates a relationship, and not necessarily a causal one. We should be careful when drawing conclusions from an observed strong correlation alone.
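A small simulation can make this concrete. In the sketch below, which uses entirely invented numbers, a third variable (temperature) drives both ice cream sales and drownings. Neither of the two causes the other, yet they end up strongly correlated. The variable names, coefficients, and noise levels are all assumptions chosen for illustration.

```python
import math
import random

def pearson_r(x, y):
    """Pearson's correlation coefficient for two equal-length sequences."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    dx = [a - mean_x for a in x]
    dy = [b - mean_y for b in y]
    cross = sum(a * b for a, b in zip(dx, dy))
    return cross / math.sqrt(sum(a * a for a in dx) * sum(b * b for b in dy))

random.seed(42)  # fixed seed so the simulation is repeatable

# Hypothetical data: daily temperature drives both quantities below.
temperature = [random.uniform(10, 35) for _ in range(1000)]
# Each series depends only on temperature plus its own independent noise.
ice_cream_sales = [2.0 * t + random.gauss(0, 5) for t in temperature]
drownings = [0.5 * t + random.gauss(0, 2) for t in temperature]

r = pearson_r(ice_cream_sales, drownings)
print(f"r between ice cream sales and drownings: {r:.2f}")
```

The coefficient is high even though, by construction, ice cream sales have no effect on drownings. The shared dependence on temperature is what produces the correlation.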