Monthly Archives: December 2011

Homework for my TA – week 10



Correlation and Regression


Woohoo, the last of the blogs! … for now anyway. In stats over the last couple of weeks we have been doing correlation and regression, so I thought I would write a blog on them (well, try anyway).

Correlation describes the strength and direction of the relationship between two variables. The correlation coefficient (the number calculated to describe the correlation, usually written as r) can range from -1 to +1. The closer the value is to -1 or +1, the stronger the relationship is. A value close to -1 means there is a negative correlation: as one variable increases the other decreases. For example, the more diet pills someone takes, the less they will weigh. A value close to +1 means there is a positive correlation, where both variables increase together. For example, the more hours someone works, the more money they will have earned by the end of the month. If the value is close to 0 then there is little or no straight-line (linear) relationship between the two variables.

Scatterplots are very often used to show a correlation, with one variable on each axis (i.e. the Y axis and the X axis). I think scatterplots make it a lot easier to see if there is a relationship between two variables. However, there are limitations to correlational research. The main one is that a correlation cannot say what has caused what. There are different types of correlational studies, for example the survey method. This is probably the most popular method in correlational research, but it has its own limitations, such as the problems with questionnaires and various biases.
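To make the coefficient a bit more concrete, here is a minimal Python sketch of how Pearson's r (the most commonly used correlation coefficient) could be calculated by hand. The function name pearson_r and all of the numbers are made up for illustration; they are not from a real study, and in practice SPSS or a stats library would do this for you.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Return Pearson's correlation coefficient for two equal-length lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of the same length (at least 2 values)")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # How much the two variables vary together
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # How much each variable varies on its own
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("r is undefined when one variable never changes")
    return sxy / sqrt(sxx * syy)


# Hypothetical data: more hours worked, more pay (positive correlation)
hours_worked = [10, 20, 30, 40, 50]
monthly_pay = [100, 210, 290, 405, 500]
r_positive = pearson_r(hours_worked, monthly_pay)

# Hypothetical data: more pills, lower weight (negative correlation)
pills_taken = [0, 1, 2, 3, 4]
weight_kg = [80, 79, 77, 76, 74]
r_negative = pearson_r(pills_taken, weight_kg)

# A perfect curve (y = x squared) that r misses because it is not a straight line
xs_curved = [-2, -1, 0, 1, 2]
ys_curved = [x ** 2 for x in xs_curved]
r_curved = pearson_r(xs_curved, ys_curved)

print(r_positive, r_negative, r_curved)
```

The last example is there because r only picks up straight-line relationships: the curved data are perfectly related, yet the coefficient sits at 0, which is one more reason to look at the scatterplot as well as the number.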

Regression (for me) is more complicated to understand, especially the SPSS stuff. I think I will benefit from writing this blog as it will make me sit down, read about regression and understand it so I can tell you all about it. So here I go… Regression is used after a correlation has been identified, to predict the value of one variable from another. For example, if you have only collected data on the effectiveness of a drug up to 3 months and wanted to know its effects at 6 months, regression could help to predict this (although predictions beyond the range of the data you actually collected should be treated with caution). The regression equation (Y = a + bX) describes the regression line (line of best fit): a is the intercept, the point where the line crosses the Y axis, and b is the slope, which is how much Y changes for every one-unit increase in X. Putting a value of X into the equation gives the predicted value of Y. Like correlation, though, regression on its own cannot show that one variable causes another. For example, regression could show that eating more fatty foods predicts weight gain, but it would take a controlled experiment to say that fatty foods are the cause.
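Here is a minimal Python sketch of a simple (one predictor) regression using the least-squares formulas for the slope and intercept, then using Y = a + bX to predict a later time point. The function names fit_line and predict, and the symptom scores, are hypothetical and only there to illustrate the idea.

```python
def fit_line(xs, ys):
    """Return (a, b) for the least-squares line Y = a + b*X."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of the same length (at least 2 values)")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("cannot fit a line when X never changes")
    b = sxy / sxx            # slope: change in Y per one-unit change in X
    a = mean_y - b * mean_x  # intercept: where the line crosses the Y axis
    return a, b


def predict(a, b, x):
    """Predicted Y for a given X, using Y = a + b*X."""
    return a + b * x


# Hypothetical data: symptom score measured at 0, 1, 2 and 3 months on a drug
months = [0, 1, 2, 3]
symptom_score = [20, 17, 15, 12]

a, b = fit_line(months, symptom_score)

# Predicting month 6 goes beyond the collected data (extrapolation),
# so it assumes the straight-line trend simply carries on.
score_at_6_months = predict(a, b, 6)

print(a, b, score_at_6_months)
```

Notice that the prediction for 6 months is only as good as the assumption that the trend stays linear after month 3; the data cannot confirm that.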

I believe correlation and regression can be great tools in analysing data and recognising relationships between variables. I personally find regression harder to understand and get my head around but I think it can be a really beneficial tool in research.

I looked at a couple of websites to help me understand this topic. Take a look if you want to know more: