Calculate correlation by hand, online, or with a graphing calculator

The correlation coefficient, denoted as r or ρ, is the measure of linear correlation (the relationship, in terms of both strength and direction) between two variables. It ranges from -1 to +1, with plus and minus signs used to represent positive and negative correlation. If the correlation coefficient is exactly -1, then the relationship between the two variables is a perfect negative fit; if the correlation coefficient is exactly +1, then the relationship is a perfect positive fit. Otherwise, two variables may have a positive correlation, a negative correlation, or no correlation at all. You can calculate correlation by hand, by using some free correlation calculators available online, or by using the statistical functions of a good graphing calculator.

Method 1 of 4:

Finding the Correlation Coefficient by Hand

  1. To begin calculating a correlation coefficient, first examine your data pairs. It is helpful to put them in a table, either vertically or horizontally. Label each row or column x and y. [1]
    • For example, suppose you have four data pairs for x and y . Your table may look like this:
      • x || y
      • 1 || 1
      • 2 || 3
      • 4 || 5
      • 5 || 7
  2. To calculate the mean of x, add all the values of x, then divide by the number of values. [2]
    • Using the example above, note that you have four values for x. To calculate the mean, add all the values given for x, then divide by 4. Your calculation would look like this: $\bar{x} = \frac{1 + 2 + 4 + 5}{4} = \frac{12}{4} = 3$
  3. To find the mean of y, follow the same steps, adding all the values of y together, then dividing by the number of values. [3]
    • In the example above, you also have four values for y. Add all these values, then divide by 4. Your calculations would look like this: $\bar{y} = \frac{1 + 3 + 5 + 7}{4} = \frac{16}{4} = 4$
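  4. Once you have your means, you can calculate the standard deviation of x. To do so, use the formula for the sample standard deviation: [4]
    $s_x = \sqrt{\frac{1}{n-1} \sum (x - \bar{x})^2}$
    • With the sample data, your calculations should look like this: $s_x = \sqrt{\frac{(1-3)^2 + (2-3)^2 + (4-3)^2 + (5-3)^2}{4-1}} = \sqrt{\frac{10}{3}} \approx 1.83$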
  5. Using the same basic steps, find the standard deviation of y. You will use the same formula, using the y data points. [5]
    • With the sample data, your calculations should look like this: $s_y = \sqrt{\frac{(1-4)^2 + (3-4)^2 + (5-4)^2 + (7-4)^2}{4-1}} = \sqrt{\frac{20}{3}} \approx 2.58$
  6. The formula for calculating a correlation coefficient uses means, standard deviations, and the number of pairs in your data set (represented by n). The correlation coefficient itself is represented by the lower-case letter r or the lower-case Greek letter rho, ρ. For this article, you will use the formula known as the Pearson correlation coefficient, shown below: [6]
    $r = \frac{1}{n-1} \sum \left( \frac{x - \bar{x}}{s_x} \right) \left( \frac{y - \bar{y}}{s_y} \right)$
    • You may notice slight variations in the formula, here or in other texts. For example, some will use the Greek notation with rho and sigma, while others will use r and s. Some texts may show slightly different formulas, but they will be mathematically equivalent to this one.
  7. You now have the means and standard deviations for your variables, so you can proceed to use the correlation coefficient formula. Remember that n represents the number of values you have. You have already worked out the other relevant information in the steps above. [7]
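    • Using the sample data (with $s_x \approx 1.83$ and $s_y \approx 2.58$), you would enter your data in the correlation coefficient formula and calculate as follows:
    • $r = \frac{1}{4-1} \cdot \frac{(1-3)(1-4) + (2-3)(3-4) + (4-3)(5-4) + (5-3)(7-4)}{1.83 \times 2.58} = \frac{6 + 1 + 1 + 6}{3 \times 4.7214} = \frac{14}{14.164} \approx 0.988$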
  8. For this data set, the correlation coefficient is 0.988 (about 0.990 if the standard deviations are not rounded before dividing). This number tells you two things about the data. Look at the sign of the number and the size of the number. [8]
    • Because the correlation coefficient is positive, you can say there is a positive correlation between the x-data and the y-data. This means that as the x values increase, you expect the y values to increase also.
    • Because the correlation coefficient is very close to +1, the x-data and y-data are very closely connected. If you were to graph these points, you would see that they form a very good approximation of a straight line.
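If you want to check the hand calculation, the steps above can be sketched in a few lines of Python. This is only an illustration: the variable names are made up for this example, and it assumes the sample standard deviation (dividing by n - 1), which is consistent with the result in this method.

```python
from math import sqrt

# Sample data pairs from the table in step 1
x = [1, 2, 4, 5]
y = [1, 3, 5, 7]
n = len(x)  # number of data pairs

# Steps 2 and 3: the means of x and y
mean_x = sum(x) / n
mean_y = sum(y) / n

# Steps 4 and 5: sample standard deviations (assumes an n - 1 denominator)
s_x = sqrt(sum((xi - mean_x) ** 2 for xi in x) / (n - 1))
s_y = sqrt(sum((yi - mean_y) ** 2 for yi in y) / (n - 1))

# Steps 6 and 7: Pearson correlation coefficient
r = sum(
    ((xi - mean_x) / s_x) * ((yi - mean_y) / s_y)
    for xi, yi in zip(x, y)
) / (n - 1)

print(f"mean of x = {mean_x}, mean of y = {mean_y}")
print(f"s_x = {s_x:.3f}, s_y = {s_y:.3f}")
print(f"r = {r:.3f}")
```

Because the script does not round the standard deviations along the way, it reports r as roughly 0.990 rather than 0.988.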
Method 2 of 4:

Using Online Correlation Calculators

  1. Measuring correlation is a fairly standard calculation for statisticians. The calculation can become very tedious if done by hand for large data sets. As a result, many sources have made correlation calculators available online. Use any search engine and enter the search term “correlation calculator.”
  2. Carefully review the instructions on the website so you will enter your data properly. It is important that your data pairs are kept in order, or you will generate an incorrect correlation result. Different websites use different formats to enter data.
  3. These calculation sites are popular because, after you enter your data, you generally need only to click on the button that says “Calculate,” and the result will appear automatically.
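To see why keeping the data pairs in order matters, here is a minimal Python sketch. The function name `pearson_r` is hypothetical, chosen for this example, and it uses an algebraically equivalent form of the Pearson formula; it is not the code behind any particular online calculator.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation for two equal-length lists of paired values."""
    if len(xs) != len(ys):
        raise ValueError("x and y must contain the same number of values")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of products and squares of the deviations from each mean
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(xs, ys))
    sxx = sum((a - mean_x) ** 2 for a in xs)
    syy = sum((b - mean_y) ** 2 for b in ys)
    return sxy / sqrt(sxx * syy)

x = [1, 2, 4, 5]
y_in_order = [1, 3, 5, 7]   # paired correctly with x
y_scrambled = [7, 1, 5, 3]  # same values, but the pairing is broken

r_ordered = pearson_r(x, y_in_order)
r_scrambled = pearson_r(x, y_scrambled)
print(f"pairs in order: r = {r_ordered:.3f}")
print(f"pairs scrambled: r = {r_scrambled:.3f}")
```

With the pairs in order the result is close to +1; with the same y values entered out of order it drops to about -0.28, even though no individual number changed.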
Method 3 of 4:

Using Graphing Calculators

  1. Using a handheld graphing calculator, enter your calculator’s statistics function and then select the “Edit” command. [9]
    • Each calculator will have slightly different key commands. This article will give the specific instructions for the Texas Instruments TI-86.
    • Enter the Stat function by pressing [2nd]-Stat (above the + key), then hit F2-Edit.
  2. Most calculators will keep statistical data until cleared. To make sure that you do not confuse old data with new data, you should first clear any previously stored information. [10]
    • Use the arrow keys to move the cursor to highlight the heading “xStat.” Then press Clear and Enter. This should clear all values in the xStat column.
    • Use the arrow keys to highlight the yStat heading. Press Clear and Enter to empty the data from that column as well.
  3. Using the arrow keys, move the cursor to the first space under the xStat heading. Type in your first data value and then hit Enter. You should see the space at the bottom of the screen display “xStat(1)=__,” with your value filling the blank space. When you hit Enter, the data will fill the table, the cursor will move to the next line, and the line at the bottom of the screen should now read “xStat(2)=__.” [11]
    • Continue entering all the x-data values.
    • When you complete the x-data, use the arrow keys to move to the yStat column and enter the y-data values.
    • After all the data has been entered, hit Exit to clear the screen and leave the Stat menu.
  4. The correlation coefficient is a measure of how well the data approximates a straight line. A statistical graphing calculator can very quickly calculate the best-fit line and the correlation coefficient. [12]
    • Enter the Stat function and then hit the Calc button. On the TI-86, this is [2nd][Stat][F1].
    • Choose the Linear Regression calculations. On the TI-86, this is [F3], which is labeled “LinR.” The graphic screen should then display the line “LinR _,” with a blinking cursor.
    • You now need to enter the names of the two variables that you want to calculate. These are xStat and yStat.
      • On the TI-86, select the Names list by hitting [2nd][List][F3].
      • The bottom line of your screen should now show the available variables. Choose [xStat] (this is probably button F1 or F2), then enter a comma, then [yStat].
      • Hit Enter to calculate the data.
  5. When you hit Enter, the calculator will instantly calculate the following information for the data that you entered: [13]
    • y = a + bx: This is the general formula for a straight line. However, instead of the familiar “y=mx+b,” this is presented in reverse order, with the intercept first.
    • a: This is the value of the y-intercept of the best-fit line.
    • b: This is the slope of the best-fit line.
    • corr: This is the correlation coefficient.
    • n: This is the number of data pairs that were used in the calculation.
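For comparison, the same four quantities (intercept, slope, correlation coefficient, and number of pairs) can be worked out with a short Python sketch. It uses the standard least-squares formulas and the sample data from Method 1; the variable names are chosen for this example and are not taken from the calculator.

```python
from math import sqrt

# Sample data pairs from Method 1
x = [1, 2, 4, 5]
y = [1, 3, 5, 7]
n = len(x)  # number of data pairs

mean_x = sum(x) / n
mean_y = sum(y) / n

# Sums of squared deviations and of cross-products
sxx = sum((xi - mean_x) ** 2 for xi in x)
syy = sum((yi - mean_y) ** 2 for yi in y)
sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))

b = sxy / sxx               # slope of the best-fit line
a = mean_y - b * mean_x     # y-intercept of the best-fit line
corr = sxy / sqrt(sxx * syy)  # correlation coefficient

print(f"y = {a:.2f} + {b:.2f}x")
print(f"corr = {corr:.3f}, n = {n}")
```

For this data the best-fit line has a slope of 1.4 and a y-intercept of about -0.2, with a correlation coefficient of roughly 0.990.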
Method 4 of 4:

Reviewing the Fundamentals

  1. Correlation refers to the statistical relationship between two quantities. The correlation coefficient is a single number that you can calculate for any two sets of data points. The number will always be something between -1 and +1, and it indicates how closely connected the two data sets tend to be. [14]
    • For example, if you were to measure the heights and ages of children up to the age of about 12, you would expect to find a strong positive correlation. As children get older, they tend to get taller.
    • An example of negative correlation would be data comparing a person’s time spent practicing golf shots and that person’s golf score. As the practice increases, the score should decrease.
    • Finally, you would expect very little correlation, either positive or negative, between a person’s shoe size, for example, and SAT scores.
  2. The arithmetic mean, or “average,” of a set of data is calculated by adding all of the values of the data together, then dividing by the number of values in the set. When you find the correlation coefficient for your data, you will need to calculate the mean of each set of data. [15]
    • The mean of a variable is denoted by the variable with a horizontal line above it. This is often referred to as “x-bar” or “y-bar” for the x and y data sets. Alternatively, the mean may be signified by the lower-case Greek letter mu, μ. To indicate the mean of x-data points, for example, you could write μ_x or μ(x).
    • As an example, if you have a set of x-data points (1,2,5,6,9,10), then the mean of this data is calculated as follows: $\bar{x} = \frac{1 + 2 + 5 + 6 + 9 + 10}{6} = \frac{33}{6} = 5.5$
  3. In statistics, standard deviation measures variation, showing how numbers are spread out in relation to the mean. A group of numbers with a low standard deviation is fairly tightly clustered. A group of numbers with a high standard deviation is widely scattered. [16]
    • Symbolically, standard deviation is expressed with either the lower-case letter s or the lower-case Greek letter sigma, σ. Thus, the standard deviation of the x-data is written as either s_x or σ_x.
  4. The summation operator is one of the most common operators in mathematics, indicating a sum of values. It is represented by the upper-case Greek letter sigma, ∑. [17]
    • As an example, if you have a set of x-data points (1,2,5,6,9,10), then ∑x means:
      • 1+2+5+6+9+10 = 33.
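These three building blocks (sum, mean, and standard deviation) can be tried out on the same example data with Python's built-in `statistics` module. This is just a sketch; the variable names are made up for this example.

```python
import statistics

data = [1, 2, 5, 6, 9, 10]

total = sum(data)               # the summation, ∑x
mean = statistics.mean(data)    # the arithmetic mean, "x-bar"
s = statistics.stdev(data)      # sample standard deviation (divides by n - 1)
sigma = statistics.pstdev(data) # population standard deviation (divides by n)

print(f"sum = {total}")
print(f"mean = {mean}")
print(f"sample standard deviation = {s:.3f}")
print(f"population standard deviation = {sigma:.3f}")
```

The two standard deviation functions differ only in the denominator, which is why the sample value comes out slightly larger than the population value.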

Community Q&A

  • Question
    You are given the following information about two variables x and y: Mean(x)= 315 and Mean(y)=1,103. Variance(x)=59 and Variance(y)=156. Covariance(x,y)= -54. Calculate the coefficient of correlation between X and Y. Calculate your answer to two decimal places.
    Community Answer
    This question goes beyond the level of statistics covered in this article. You can calculate the correlation coefficient from the variances and the covariance alone, without having the original data points; the means are not needed. The relationship is Correlation Coefficient = Covariance(x, y) / (Std. Dev.(x) × Std. Dev.(y)). The standard deviation is the square root of the variance, so with your data this becomes Corr. Coeff. = -54 / (sqrt(59) × sqrt(156)) ≈ -0.56.
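    The same calculation can be written as a short Python sketch. The function name `correlation_from_covariance` is hypothetical, chosen only for this example.

```python
from math import sqrt

def correlation_from_covariance(cov_xy, var_x, var_y):
    """Correlation coefficient from a covariance and two variances."""
    if var_x <= 0 or var_y <= 0:
        raise ValueError("variances must be positive")
    # Standard deviation is the square root of the variance
    return cov_xy / (sqrt(var_x) * sqrt(var_y))

# Values given in the question; the means are not used
r = correlation_from_covariance(cov_xy=-54, var_x=59, var_y=156)
print(f"r = {r:.2f}")
```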

      Tips

      • As a rule of thumb, a correlation coefficient with an absolute value above 0.8 represents a strong correlation, and one with an absolute value below 0.5 represents a weak one. The sign only tells you the direction of the relationship.
      • The correlation coefficient is sometimes called the “Pearson product-moment correlation coefficient” in honor of its developer, Karl Pearson.
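      The rule of thumb in the first tip can be expressed as a small Python sketch. The function name `describe_strength` and the label "moderate" for values between the two cutoffs are assumptions made for this example; the tip itself only names "strong" and "weak."

```python
def describe_strength(r):
    """Label a correlation coefficient using the 0.8 and 0.5 cutoffs."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("a correlation coefficient must be between -1 and +1")
    size = abs(r)  # the sign gives direction, not strength
    if size > 0.8:
        return "strong"
    if size < 0.5:
        return "weak"
    return "moderate"  # assumed label for the range the tip leaves unnamed

for value in (0.988, -0.56, 0.1):
    print(value, describe_strength(value))
```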

      Warnings

      • Correlation shows that the two sets of data are connected in some way. However, be careful not to interpret this as causation. For example, if you compare people’s shoe sizes and their height, you will probably find a strong positive correlation. Taller people generally have larger feet. However, this does not mean that growing tall causes your feet to grow, or that large feet cause you to grow tall. They just happen together.

      About This Article

      Article Summary

      To find the correlation coefficient by hand, first put your data pairs into a table with one row labeled “X” and the other “Y.” Then calculate the mean of X by adding all the X values and dividing by the number of values. Calculate the mean for Y in the same way. Next, use the formula for standard deviation to calculate it for both X and Y. Finally, use the means and standard deviations and the number of pairs in your data set as inputs to the correlation coefficient formula, and solve the resulting equation. To learn how to find the correlation coefficient with an online calculator or your own graphing calculator, read on!

      Thanks to all authors for creating a page that has been read 236,356 times.
