Hypothesis Testing - Chi Squared Test

Lisa Sullivan, PhD

Professor of Biostatistics

Boston University School of Public Health


Introduction

This module will continue the discussion of hypothesis testing, where a specific statement or hypothesis is generated about a population parameter, and sample statistics are used to assess the likelihood that the hypothesis is true. The hypothesis is based on available information and the investigator's belief about the population parameters. The specific tests considered here are called chi-square tests and are appropriate when the outcome is discrete (dichotomous, ordinal or categorical). For example, in some clinical trials the outcome is a classification such as hypertensive, pre-hypertensive or normotensive. We could use the same classification in an observational study such as the Framingham Heart Study to compare men and women in terms of their blood pressure status - again using the classification of hypertensive, pre-hypertensive or normotensive status.  

The technique to analyze a discrete outcome uses what is called a chi-square test. Specifically, the test statistic follows a chi-square probability distribution. We will consider chi-square tests here with one, two and more than two independent comparison groups.

Learning Objectives

After completing this module, the student will be able to:

  • Perform chi-square tests by hand
  • Appropriately interpret results of chi-square tests
  • Identify the appropriate hypothesis testing procedure based on type of outcome variable and number of samples

Tests with One Sample, Discrete Outcome

Here we consider hypothesis testing with a discrete outcome variable in a single population. Discrete variables are variables that take on more than two distinct responses or categories and the responses can be ordered or unordered (i.e., the outcome can be ordinal or categorical). The procedure we describe here can be used for dichotomous (exactly 2 response options), ordinal or categorical discrete outcomes and the objective is to compare the distribution of responses, or the proportions of participants in each response category, to a known distribution. The known distribution is derived from another study or report and it is again important in setting up the hypotheses that the comparator distribution specified in the null hypothesis is a fair comparison. The comparator is sometimes called an external or a historical control.   

In one sample tests for a discrete outcome, we set up our hypotheses against an appropriate comparator. We select a sample and compute descriptive statistics on the sample data. Specifically, we compute the sample size (n) and the proportions of participants in each response category.

Test Statistic for Testing H0: p1 = p10, p2 = p20, ..., pk = pk0

χ² = Σ (O − E)² / E

We find the critical value in a table of probabilities for the chi-square distribution with degrees of freedom (df) = k−1. In the test statistic, O = observed frequency and E = expected frequency in each of the response categories. The observed frequencies are those observed in the sample and the expected frequencies are computed as described below. χ² (chi-square) is another probability distribution and ranges from 0 to ∞. The test statistic formula above is appropriate for large samples, defined as expected frequencies of at least 5 in each of the response categories.

When we conduct a χ² test, we compare the observed frequencies in each response category to the frequencies we would expect if the null hypothesis were true. These expected frequencies are determined by allocating the sample to the response categories according to the distribution specified in H0. This is done by multiplying the observed sample size (n) by the proportions specified in the null hypothesis (p10, p20, ..., pk0). To ensure that the sample size is appropriate for the use of the test statistic above, we need to check that min(np10, np20, ..., npk0) ≥ 5.

The test of hypothesis with a discrete outcome measured in a single sample, where the goal is to assess whether the distribution of responses follows a known distribution, is called the χ² goodness-of-fit test. As the name indicates, the idea is to assess whether the pattern or distribution of responses in the sample "fits" a specified population (external or historical) distribution. In the next example we illustrate the test. As we work through the example, we provide additional details related to the use of this new test statistic.

A University conducted a survey of its recent graduates to collect demographic and health information for future planning purposes as well as to assess students' satisfaction with their undergraduate experiences. The survey revealed that a substantial proportion of students were not engaging in regular exercise, many felt their nutrition was poor and a substantial number were smoking. In response to a question on regular exercise, 60% of all graduates reported getting no regular exercise, 25% reported exercising sporadically and 15% reported exercising regularly as undergraduates. The next year the University launched a health promotion campaign on campus in an attempt to increase health behaviors among undergraduates. The program included modules on exercise, nutrition and smoking cessation. To evaluate the impact of the program, the University again surveyed graduates and asked the same questions. The survey was completed by 470 graduates and the following data were collected on the exercise question:

 

| | No Regular Exercise | Sporadic Exercise | Regular Exercise | Total |
|----|----|----|----|----|
| Number of Students | 255 | 125 | 90 | 470 |

Based on the data, is there evidence of a shift in the distribution of responses to the exercise question following the implementation of the health promotion campaign on campus? Run the test at a 5% level of significance.

In this example, we have one sample and a discrete (ordinal) outcome variable (with three response options). We specifically want to compare the distribution of responses in the sample to the distribution reported the previous year (i.e., 60%, 25%, 15% reporting no, sporadic and regular exercise, respectively). We now run the test using the five-step approach.  

  • Step 1. Set up hypotheses and determine level of significance.

The null hypothesis again represents the "no change" or "no difference" situation. If the health promotion campaign has no impact then we expect the distribution of responses to the exercise question to be the same as that measured prior to the implementation of the program.

H0: p1 = 0.60, p2 = 0.25, p3 = 0.15, or equivalently H0: Distribution of responses is 0.60, 0.25, 0.15

H1: H0 is false.          α = 0.05

Notice that the research hypothesis is written in words rather than in symbols. The research hypothesis as stated captures any difference in the distribution of responses from that specified in the null hypothesis. We do not specify a specific alternative distribution; instead, we are testing whether the sample data "fit" the distribution in H0 or not. With the χ² goodness-of-fit test there is no upper or lower tailed version of the test.

  • Step 2. Select the appropriate test statistic.  

The test statistic is:

χ² = Σ (O − E)² / E

We must first assess whether the sample size is adequate. Specifically, we need to check min(np10, np20, ..., npk0) ≥ 5. The sample size here is n = 470 and the proportions specified in the null hypothesis are 0.60, 0.25 and 0.15. Thus, min(470(0.60), 470(0.25), 470(0.15)) = min(282, 117.5, 70.5) = 70.5. The sample size is more than adequate so the formula can be used.

  • Step 3. Set up decision rule.  

The decision rule for the χ² test depends on the level of significance and the degrees of freedom, defined as degrees of freedom (df) = k−1 (where k is the number of response categories). If the null hypothesis is true, the observed and expected frequencies will be close in value and the χ² statistic will be close to zero. If the null hypothesis is false, then the χ² statistic will be large. Critical values can be found in a table of probabilities for the χ² distribution. Here we have df = k−1 = 3−1 = 2 and a 5% level of significance. The appropriate critical value is 5.99, and the decision rule is as follows: Reject H0 if χ² > 5.99.
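If R is available, these critical values can be pulled from qchisq() rather than a printed table; a minimal sketch (the df values correspond to the examples in this module):

```r
# Upper-tail critical values for alpha = 0.05:
qchisq(0.95, df = 2)   # 5.99  -- goodness of fit with k = 3 categories
qchisq(0.95, df = 3)   # 7.81  -- goodness of fit with k = 4 categories
qchisq(0.95, df = 6)   # 12.59 -- 4 x 3 table, df = (4 - 1)(3 - 1)
```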

  • Step 4. Compute the test statistic.  

We now compute the expected frequencies using the sample size and the proportions specified in the null hypothesis. We then substitute the sample data (observed frequencies) and the expected frequencies into the formula for the test statistic identified in Step 2. The computations can be organized as follows.

   

| | No Regular Exercise | Sporadic Exercise | Regular Exercise | Total |
|----|----|----|----|----|
| Observed Frequencies (O) | 255 | 125 | 90 | 470 |
| Expected Frequencies (E) | 470(0.60) = 282 | 470(0.25) = 117.5 | 470(0.15) = 70.5 | 470 |

Notice that the expected frequencies are taken to one decimal place and that the sum of the observed frequencies is equal to the sum of the expected frequencies. The test statistic is computed as follows:

χ² = (255 − 282)²/282 + (125 − 117.5)²/117.5 + (90 − 70.5)²/70.5 = 2.59 + 0.48 + 5.39 = 8.46

  • Step 5. Conclusion.  

We reject H0 because 8.46 > 5.99. We have statistically significant evidence at α = 0.05 to show that H0 is false, or that the distribution of responses is not 0.60, 0.25, 0.15. The p-value is p < 0.025.

In the χ² goodness-of-fit test, we conclude that either the distribution specified in H0 is false (when we reject H0) or that we do not have sufficient evidence to show that the distribution specified in H0 is false (when we fail to reject H0). Here, we reject H0 and conclude that the distribution of responses to the exercise question following the implementation of the health promotion campaign was not the same as the distribution from the prior year. The test itself does not provide details of how the distribution has shifted. A comparison of the observed and expected frequencies will provide some insight into the shift (when the null hypothesis is rejected). Does it appear that the health promotion campaign was effective?

Consider the following: 

 

| | No Regular Exercise | Sporadic Exercise | Regular Exercise | Total |
|----|----|----|----|----|
| Observed Frequencies (O) | 255 | 125 | 90 | 470 |
| Expected Frequencies (E) | 282 | 117.5 | 70.5 | 470 |

If the null hypothesis were true (i.e., no change from the prior year) we would have expected more students to fall in the "No Regular Exercise" category and fewer in the "Regular Exercise" category. In the sample, 255/470 = 54% reported no regular exercise and 90/470 = 19% reported regular exercise. Thus, there is a shift toward more regular exercise following the implementation of the health promotion campaign. There is evidence of a statistical difference; is this a meaningful difference? Is there room for improvement?
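As a check on the hand computation, the same goodness-of-fit test can be run in base R with chisq.test(); the object names below are illustrative:

```r
# Observed counts: no regular, sporadic, and regular exercise
observed   <- c(255, 125, 90)
# Distribution specified under H0 (the prior year's survey)
null_props <- c(0.60, 0.25, 0.15)

# Passing a count vector with p = runs the goodness-of-fit test
chisq.test(observed, p = null_props)
# X-squared = 8.46, df = 2, p-value = 0.015
```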

The National Center for Health Statistics (NCHS) provided data on the distribution of weight (in categories) among Americans in 2002. The distribution was based on specific values of body mass index (BMI) computed as weight in kilograms over height in meters squared. Underweight was defined as BMI< 18.5, Normal weight as BMI between 18.5 and 24.9, overweight as BMI between 25 and 29.9 and obese as BMI of 30 or greater. Americans in 2002 were distributed as follows: 2% Underweight, 39% Normal Weight, 36% Overweight, and 23% Obese. Suppose we want to assess whether the distribution of BMI is different in the Framingham Offspring sample. Using data from the n=3,326 participants who attended the seventh examination of the Offspring in the Framingham Heart Study we created the BMI categories as defined and observed the following:

 

| | Underweight (BMI < 18.5) | Normal Weight (BMI 18.5-24.9) | Overweight (BMI 25.0-29.9) | Obese (BMI ≥ 30) | Total |
|----|----|----|----|----|----|
| # of Participants | 20 | 932 | 1374 | 1000 | 3326 |

  • Step 1.  Set up hypotheses and determine level of significance.

H0: p1 = 0.02, p2 = 0.39, p3 = 0.36, p4 = 0.23     or equivalently

H0: Distribution of responses is 0.02, 0.39, 0.36, 0.23

H1: H0 is false.        α = 0.05

  • Step 2. Select the appropriate test statistic.

The formula for the test statistic is:

χ² = Σ (O − E)² / E

We must assess whether the sample size is adequate. Specifically, we need to check min(np10, np20, ..., npk0) ≥ 5. The sample size here is n = 3,326 and the proportions specified in the null hypothesis are 0.02, 0.39, 0.36 and 0.23. Thus, min(3326(0.02), 3326(0.39), 3326(0.36), 3326(0.23)) = min(66.5, 1297.1, 1197.4, 765.0) = 66.5. The sample size is more than adequate, so the formula can be used.

  • Step 3. Set up decision rule.

Here we have df = k−1 = 4−1 = 3 and a 5% level of significance. The appropriate critical value is 7.81 and the decision rule is as follows: Reject H0 if χ² > 7.81.

  • Step 4. Compute the test statistic.

We now compute the expected frequencies using the sample size and the proportions specified in the null hypothesis. We then substitute the sample data (observed frequencies) and the expected frequencies into the formula for the test statistic identified in Step 2. We organize the computations in the following table.

 

| | Underweight (BMI < 18.5) | Normal Weight (BMI 18.5-24.9) | Overweight (BMI 25.0-29.9) | Obese (BMI ≥ 30) | Total |
|----|----|----|----|----|----|
| Observed Frequencies (O) | 20 | 932 | 1374 | 1000 | 3326 |
| Expected Frequencies (E) | 66.5 | 1297.1 | 1197.4 | 765.0 | 3326 |

The test statistic is computed as follows:

χ² = (20 − 66.5)²/66.5 + (932 − 1297.1)²/1297.1 + (1374 − 1197.4)²/1197.4 + (1000 − 765.0)²/765.0 = 32.52 + 102.77 + 26.05 + 72.19 = 233.53

  • Step 5. Conclusion.

We reject H0 because 233.53 > 7.81. We have statistically significant evidence at α = 0.05 to show that H0 is false or that the distribution of BMI in Framingham is different from the national data reported in 2002, p < 0.005.

Again, the χ² goodness-of-fit test allows us to assess whether the distribution of responses "fits" a specified distribution. Here we show that the distribution of BMI in the Framingham Offspring Study is different from the national distribution. To understand the nature of the difference we can compare observed and expected frequencies or observed and expected proportions (or percentages). The frequencies are large because of the large sample size; the observed percentages of patients in the Framingham sample are as follows: 0.6% underweight, 28% normal weight, 41% overweight and 30% obese. In the Framingham Offspring sample there are higher percentages of overweight and obese persons (41% and 30% in Framingham as compared to 36% and 23% in the national data), and lower proportions of underweight and normal weight persons (0.6% and 28% in Framingham as compared to 2% and 39% in the national data). Are these meaningful differences?
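The same check works for the BMI example; the last line puts the observed and hypothesized proportions side by side (object names are illustrative):

```r
# Observed counts: underweight, normal, overweight, obese
observed <- c(20, 932, 1374, 1000)
national <- c(0.02, 0.39, 0.36, 0.23)   # 2002 NCHS distribution under H0

chisq.test(observed, p = national)      # X-squared = 233.5, df = 3, p < 0.001
round(rbind(framingham = observed / sum(observed), national = national), 3)
```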

In the module on hypothesis testing for means and proportions, we discussed hypothesis testing applications with a dichotomous outcome variable in a single population. We presented a test using a test statistic Z to test whether an observed (sample) proportion differed significantly from a historical or external comparator. The chi-square goodness-of-fit test can also be used with a dichotomous outcome and the results are mathematically equivalent.  

In the prior module, we considered the following example. Here we show the equivalence to the chi-square goodness-of-fit test.

The NCHS report indicated that in 2002, 75% of children aged 2 to 17 saw a dentist in the past year. An investigator wants to assess whether use of dental services is similar in children living in the city of Boston. A sample of 125 children aged 2 to 17 living in Boston was surveyed, and 64 reported seeing a dentist over the past 12 months. Is there a significant difference in use of dental services between children living in Boston and the national data?

We presented the following approach to the test using a Z statistic. 

  • Step 1. Set up hypotheses and determine level of significance

H0: p = 0.75

H1: p ≠ 0.75                               α = 0.05

  • Step 2. Select the appropriate test statistic.

We must first check that the sample size is adequate. Specifically, we need to check min(np0, n(1−p0)) = min(125(0.75), 125(1−0.75)) = min(93.75, 31.25) = 31.25. The sample size is more than adequate so the following formula can be used:

Z = (p̂ − p0) / √( p0(1 − p0)/n )

  • Step 3. Set up decision rule.

This is a two-tailed test, using a Z statistic and a 5% level of significance. Reject H0 if Z < −1.960 or if Z > 1.960.

  • Step 4. Compute the test statistic.

We now substitute the sample data into the formula for the test statistic identified in Step 2. The sample proportion is:

p̂ = 64/125 = 0.512

Z = (0.512 − 0.75) / √( 0.75(0.25)/125 ) = −0.238/0.0387 = −6.15


  • Step 5. Conclusion.

We reject H0 because −6.15 < −1.960. We have statistically significant evidence at α = 0.05 to show that there is a statistically significant difference in the use of dental services by children living in Boston as compared to the national data (p < 0.0001).

We now conduct the same test using the chi-square goodness-of-fit test. First, we summarize our sample data as follows:

 

| | Saw a Dentist in Past 12 Months | Did Not See a Dentist in Past 12 Months | Total |
|----|----|----|----|
| # of Participants | 64 | 61 | 125 |

H0: p1 = 0.75, p2 = 0.25     or equivalently H0: Distribution of responses is 0.75, 0.25

We must assess whether the sample size is adequate. Specifically, we need to check min(np10, np20, ..., npk0) ≥ 5. The sample size here is n = 125 and the proportions specified in the null hypothesis are 0.75 and 0.25. Thus, min(125(0.75), 125(0.25)) = min(93.75, 31.25) = 31.25. The sample size is more than adequate so the formula can be used.

Here we have df = k−1 = 2−1 = 1 and a 5% level of significance. The appropriate critical value is 3.84, and the decision rule is as follows: Reject H0 if χ² > 3.84. (Note that 1.96² = 3.84, where 1.96 was the critical value used in the Z test for proportions shown above.)

 

| | Saw a Dentist in Past 12 Months | Did Not See a Dentist in Past 12 Months | Total |
|----|----|----|----|
| Observed Frequencies (O) | 64 | 61 | 125 |
| Expected Frequencies (E) | 93.75 | 31.25 | 125 |

The test statistic is computed as follows:

χ² = (64 − 93.75)²/93.75 + (61 − 31.25)²/31.25 = 9.44 + 28.32 = 37.8

(Note that (−6.15)² = 37.8, where −6.15 was the value of the Z statistic in the test for proportions shown above.)

We reject H0 because 37.8 > 3.84. We have statistically significant evidence at α = 0.05 to show that there is a statistically significant difference in the use of dental services by children living in Boston as compared to the national data (p < 0.0001). This is the same conclusion we reached when we conducted the test using the Z test above. With a dichotomous outcome, Z² = χ²! In statistics, there are often several approaches that can be used to test hypotheses.
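The Z² = χ² identity is easy to see in base R: prop.test() with the continuity correction turned off reports Z² as a chi-square statistic, and the goodness-of-fit test on the two cells returns the same value. A minimal sketch:

```r
# One-sample test of p = 0.75 without continuity correction:
prop.test(x = 64, n = 125, p = 0.75, correct = FALSE)
# X-squared = 37.8, df = 1 -- this is (-6.15)^2

# The equivalent chi-square goodness-of-fit test on the two cells:
chisq.test(c(64, 61), p = c(0.75, 0.25))
# X-squared = 37.8, df = 1 -- identical, since Z^2 = chi-square here
```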

Tests for Two or More Independent Samples, Discrete Outcome

Here we extend that application of the chi-square test to the case with two or more independent comparison groups. Specifically, the outcome of interest is discrete with two or more responses and the responses can be ordered or unordered (i.e., the outcome can be dichotomous, ordinal or categorical). We now consider the situation where there are two or more independent comparison groups and the goal of the analysis is to compare the distribution of responses to the discrete outcome variable among several independent comparison groups.  

The test is called the χ² test of independence and the null hypothesis is that there is no difference in the distribution of responses to the outcome across comparison groups. This is often stated as follows: The outcome variable and the grouping variable (e.g., the comparison treatments or comparison groups) are independent (hence the name of the test). Independence here implies homogeneity in the distribution of the outcome among comparison groups.

The null hypothesis in the χ² test of independence is often stated in words as: H0: The distribution of the outcome is independent of the groups. The alternative or research hypothesis is that there is a difference in the distribution of responses to the outcome variable among the comparison groups (i.e., that the distribution of responses "depends" on the group). In order to test the hypothesis, we measure the discrete outcome variable in each participant in each comparison group. The data of interest are the observed frequencies (or number of participants in each response category in each group). The formula for the test statistic for the χ² test of independence is given below.

Test Statistic for Testing H0: Distribution of outcome is independent of groups

χ² = Σ (O − E)² / E

and we find the critical value in a table of probabilities for the chi-square distribution with df = (r−1)(c−1).

Here O = observed frequency, E = expected frequency in each of the response categories in each group, r = the number of rows in the two-way table and c = the number of columns in the two-way table. r and c correspond to the number of comparison groups and the number of response options in the outcome (see below for more details). The observed frequencies are the sample data and the expected frequencies are computed as described below. The test statistic is appropriate for large samples, defined as expected frequencies of at least 5 in each of the response categories in each group.

The data for the χ 2 test of independence are organized in a two-way table. The outcome and grouping variable are shown in the rows and columns of the table. The sample table below illustrates the data layout. The table entries (blank below) are the numbers of participants in each group responding to each response category of the outcome variable.

Table - Possible outcomes are listed in the columns; the groups being compared are listed in the rows.

| | Response Option 1 | Response Option 2 | ... | Response Option c | Row Totals |
|----|----|----|----|----|----|
| Group 1 | | | | | |
| Group 2 | | | | | |
| ... | | | | | |
| Group r | | | | | |
| Column Totals | | | | | N |

In the table above, the grouping variable is shown in the rows of the table; r denotes the number of independent groups. The outcome variable is shown in the columns of the table; c denotes the number of response options in the outcome variable. Each combination of a row (group) and column (response) is called a cell of the table. The table has r*c cells and is sometimes called an r x c ("r by c") table. For example, if there are 4 groups and 5 categories in the outcome variable, the data are organized in a 4 X 5 table. The row and column totals are shown along the right-hand margin and the bottom of the table, respectively. The total sample size, N, can be computed by summing the row totals or the column totals. Similar to ANOVA, N does not refer to a population size here but rather to the total sample size in the analysis. The sample data can be organized into a table like the above. The numbers of participants within each group who select each response option are shown in the cells of the table and these are the observed frequencies used in the test statistic.

The test statistic for the χ² test of independence involves comparing observed (sample data) and expected frequencies in each cell of the table. The expected frequencies are computed assuming that the null hypothesis is true. The null hypothesis states that the two variables (the grouping variable and the outcome) are independent. The definition of independence is as follows:

 Two events, A and B, are independent if P(A|B) = P(A), or equivalently, if P(A and B) = P(A) P(B).

The second statement indicates that if two events, A and B, are independent then the probability of their intersection can be computed by multiplying the probability of each individual event. To conduct the χ² test of independence, we need to compute expected frequencies in each cell of the table. Expected frequencies are computed by assuming that the grouping variable and outcome are independent (i.e., under the null hypothesis). Thus, if the null hypothesis is true, using the definition of independence:

P(Group 1 and Response Option 1) = P(Group 1) P(Response Option 1).

The above states that the probability that an individual is in Group 1 and their outcome is Response Option 1 is computed by multiplying the probability that a person is in Group 1 by the probability that a person is in Response Option 1. To conduct the χ² test of independence, we need expected frequencies and not expected probabilities. To convert the above probability to a frequency, we multiply by N. Consider the following small example.

 

| | Response 1 | Response 2 | Response 3 | Total |
|----|----|----|----|----|
| Group 1 | 10 | 8 | 7 | 25 |
| Group 2 | 22 | 15 | 13 | 50 |
| Group 3 | 30 | 28 | 17 | 75 |
| Total | 62 | 51 | 37 | 150 |

The data shown above are measured in a sample of size N=150. The frequencies in the cells of the table are the observed frequencies. If Group and Response are independent, then we can compute the probability that a person in the sample is in Group 1 and Response category 1 using:

P(Group 1 and Response 1) = P(Group 1) P(Response 1),

P(Group 1 and Response 1) = (25/150) (62/150) = 0.069.

Thus if Group and Response are independent we would expect 6.9% of the sample to be in the top left cell of the table (Group 1 and Response 1). The expected frequency is 150(0.069) = 10.4.   We could do the same for Group 2 and Response 1:

P(Group 2 and Response 1) = P(Group 2) P(Response 1),

P(Group 2 and Response 1) = (50/150) (62/150) = 0.138.

The expected frequency in Group 2 and Response 1 is 150(0.138) = 20.7.

Thus, the formula for determining the expected cell frequencies in the χ 2 test of independence is as follows:

Expected Cell Frequency = (Row Total * Column Total)/N.

The above computes the expected frequency in one step rather than computing the expected probability first and then converting to a frequency.  
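In R, every expected frequency can be computed in one step from the marginal totals with outer(); a sketch using the small table above (the direct formula gives 10.3 for the top-left cell; the 10.4 above comes from rounding the probability to 0.069 first):

```r
# Observed 3 x 3 table from the small example above
obs <- matrix(c(10,  8,  7,
                22, 15, 13,
                30, 28, 17), nrow = 3, byrow = TRUE)

# Expected cell frequency = (row total * column total) / N, for every cell at once
expected <- outer(rowSums(obs), colSums(obs)) / sum(obs)
round(expected, 1)   # top-left cell: 25 * 62 / 150 = 10.3
```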

In a prior example we evaluated data from a survey of university graduates which assessed, among other things, how frequently they exercised. The survey was completed by 470 graduates. In the prior example we used the χ 2 goodness-of-fit test to assess whether there was a shift in the distribution of responses to the exercise question following the implementation of a health promotion campaign on campus. We specifically considered one sample (all students) and compared the observed distribution to the distribution of responses the prior year (a historical control). Suppose we now wish to assess whether there is a relationship between exercise on campus and students' living arrangements. As part of the same survey, graduates were asked where they lived their senior year. The response options were dormitory, on-campus apartment, off-campus apartment, and at home (i.e., commuted to and from the university). The data are shown below.

 

| | No Regular Exercise | Sporadic Exercise | Regular Exercise | Total |
|----|----|----|----|----|
| Dormitory | 32 | 30 | 28 | 90 |
| On-Campus Apartment | 74 | 64 | 42 | 180 |
| Off-Campus Apartment | 110 | 25 | 15 | 150 |
| At Home | 39 | 6 | 5 | 50 |
| Total | 255 | 125 | 90 | 470 |

Based on the data, is there a relationship between exercise and students' living arrangement? Do you think where a person lives affects their exercise status? Here we have four independent comparison groups (living arrangement) and a discrete (ordinal) outcome variable with three response options. We specifically want to test whether living arrangement and exercise are independent. We will run the test using the five-step approach.

  • Step 1. Set up hypotheses and determine level of significance.

H0: Living arrangement and exercise are independent

H1: H0 is false.                α = 0.05

The null and research hypotheses are written in words rather than in symbols. The research hypothesis is that the grouping variable (living arrangement) and the outcome variable (exercise) are dependent or related.

  • Step 2.  Select the appropriate test statistic.  

The test statistic is:

χ² = Σ (O − E)² / E

The condition for appropriate use of the above test statistic is that each expected frequency is at least 5. In Step 4 we will compute the expected frequencies and we will ensure that the condition is met.

  • Step 3. Set up decision rule.

The decision rule depends on the level of significance and the degrees of freedom, defined as df = (r−1)(c−1), where r and c are the numbers of rows and columns in the two-way data table. The row variable is the living arrangement and there are 4 arrangements considered, thus r = 4. The column variable is exercise and 3 responses are considered, thus c = 3. For this test, df = (4−1)(3−1) = 3(2) = 6. Again, with χ² tests there are no upper, lower or two-tailed tests. If the null hypothesis is true, the observed and expected frequencies will be close in value and the χ² statistic will be close to zero. If the null hypothesis is false, then the χ² statistic will be large. The rejection region for the χ² test of independence is always in the upper (right-hand) tail of the distribution. For df = 6 and a 5% level of significance, the appropriate critical value is 12.59 and the decision rule is as follows: Reject H0 if χ² > 12.59.

  • Step 4. Compute the test statistic.

We now compute the expected frequencies using the formula,

Expected Frequency = (Row Total * Column Total)/N.

The computations can be organized in a two-way table. In each cell, the observed frequency is shown first, with the expected frequency in parentheses.

 

| | No Regular Exercise | Sporadic Exercise | Regular Exercise | Total |
|----|----|----|----|----|
| Dormitory | 32 (48.8) | 30 (23.9) | 28 (17.2) | 90 |
| On-Campus Apartment | 74 (97.7) | 64 (47.9) | 42 (34.5) | 180 |
| Off-Campus Apartment | 110 (81.4) | 25 (39.9) | 15 (28.7) | 150 |
| At Home | 39 (27.1) | 6 (13.3) | 5 (9.6) | 50 |
| Total | 255 | 125 | 90 | 470 |

Notice that the expected frequencies are taken to one decimal place and that the sums of the observed frequencies are equal to the sums of the expected frequencies in each row and column of the table.  

Recall in Step 2 a condition for the appropriate use of the test statistic was that each expected frequency is at least 5. This is true for this sample (the smallest expected frequency is 9.6) and therefore it is appropriate to use the test statistic.

The test statistic is computed as follows:

χ² = (32 − 48.8)²/48.8 + (30 − 23.9)²/23.9 + (28 − 17.2)²/17.2 + (74 − 97.7)²/97.7 + (64 − 47.9)²/47.9 + (42 − 34.5)²/34.5 + (110 − 81.4)²/81.4 + (25 − 39.9)²/39.9 + (15 − 28.7)²/28.7 + (39 − 27.1)²/27.1 + (6 − 13.3)²/13.3 + (5 − 9.6)²/9.6 = 60.5

  • Step 5. Conclusion.

We reject H0 because 60.5 > 12.59. We have statistically significant evidence at α = 0.05 to show that H0 is false or that living arrangement and exercise are not independent (i.e., they are dependent or related), p < 0.005.

Again, the χ² test of independence is used to test whether the distribution of the outcome variable is similar across the comparison groups. Here we rejected H0 and concluded that the distribution of exercise is not independent of living arrangement, or that there is a relationship between living arrangement and exercise. The test provides an overall assessment of statistical significance. When the null hypothesis is rejected, it is important to review the sample data to understand the nature of the relationship. Consider again the sample data.

Because there are different numbers of students in each living situation, it makes the comparisons of exercise patterns difficult on the basis of the frequencies alone. The following table displays the percentages of students in each exercise category by living arrangement. The percentages sum to 100% in each row of the table. For comparison purposes, percentages are also shown for the total sample along the bottom row of the table.

| | No Regular Exercise | Sporadic Exercise | Regular Exercise |
|----|----|----|----|
| Dormitory | 36% | 33% | 31% |
| On-Campus Apartment | 41% | 36% | 23% |
| Off-Campus Apartment | 73% | 17% | 10% |
| At Home | 78% | 12% | 10% |
| Total | 54% | 27% | 19% |

From the above, it is clear that higher percentages of students living in dormitories and in on-campus apartments reported regular exercise (31% and 23%) as compared to students living in off-campus apartments and at home (10% each).  
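Both the test and the row percentages can be reproduced in base R; the matrix and its dimension names are illustrative:

```r
exercise <- matrix(c( 32, 30, 28,
                      74, 64, 42,
                     110, 25, 15,
                      39,  6,  5),
                   nrow = 4, byrow = TRUE,
                   dimnames = list(c("Dormitory", "On-Campus Apartment",
                                     "Off-Campus Apartment", "At Home"),
                                   c("None", "Sporadic", "Regular")))

chisq.test(exercise)                          # X-squared = 60.5, df = 6
round(100 * prop.table(exercise, margin = 1)) # row percentages, as in the table
```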

Test Yourself

 Pancreaticoduodenectomy (PD) is a procedure that is associated with considerable morbidity. A study was recently conducted on 553 patients who had a successful PD between January 2000 and December 2010 to determine whether their Surgical Apgar Score (SAS) is related to 30-day perioperative morbidity and mortality. The table below gives the number of patients experiencing no, minor, or major morbidity by SAS category.  

| Surgical Apgar Score | No Morbidity | Minor Morbidity | Major Morbidity or Mortality |
|----|----|----|----|
| 0-4 | 21 | 20 | 16 |
| 5-6 | 135 | 71 | 35 |
| 7-10 | 158 | 62 | 35 |

Question: What would be an appropriate statistical test to examine whether there is an association between Surgical Apgar Score and patient outcome? Using 14.13 as the value of the test statistic for these data, carry out the appropriate test at a 5% level of significance. Show all parts of your test.

In the module on hypothesis testing for means and proportions, we discussed hypothesis testing applications with a dichotomous outcome variable and two independent comparison groups. We presented a test using a test statistic Z to test for equality of independent proportions. The chi-square test of independence can also be used with a dichotomous outcome and the results are mathematically equivalent.  

In the prior module, we considered the following example. Here we show the equivalence to the chi-square test of independence.

A randomized trial is designed to evaluate the effectiveness of a newly developed pain reliever designed to reduce pain in patients following joint replacement surgery. The trial compares the new pain reliever to the pain reliever currently in use (called the standard of care). A total of 100 patients undergoing joint replacement surgery agreed to participate in the trial. Patients were randomly assigned to receive either the new pain reliever or the standard pain reliever following surgery and were blind to the treatment assignment. Before receiving the assigned treatment, patients were asked to rate their pain on a scale of 0-10 with higher scores indicative of more pain. Each patient was then given the assigned treatment and after 30 minutes was again asked to rate their pain on the same scale. The primary outcome was a reduction in pain of 3 or more scale points (defined by clinicians as a clinically meaningful reduction). The following data were observed in the trial.

| Treatment Group | n | # with Reduction of 3+ Points | Proportion with Reduction of 3+ Points |
|----|----|----|----|
| New Pain Reliever | 50 | 23 | 0.46 |
| Standard Pain Reliever | 50 | 11 | 0.22 |

We tested whether there was a significant difference in the proportions of patients reporting a meaningful reduction (i.e., a reduction of 3 or more scale points) using a Z statistic, as follows. 

  • Step 1. Set up hypotheses and determine level of significance.

H0: p1 = p2

H1: p1 ≠ p2                             α = 0.05

Here the new or experimental pain reliever is group 1 and the standard pain reliever is group 2.

  • Step 2. Select the appropriate test statistic.

We must first check that the sample size is adequate. Specifically, we need to ensure that we have at least 5 successes and 5 failures in each comparison group, or that:

min(n1p̂1, n1(1 − p̂1), n2p̂2, n2(1 − p̂2)) ≥ 5.

In this example, we have min(23, 27, 11, 39) = 11. Therefore, the sample size is adequate, so the following formula can be used:

Z = (p̂1 − p̂2) / √( p̂(1 − p̂)(1/n1 + 1/n2) )

  • Step 3. Set up decision rule.

Reject H0 if Z < −1.960 or if Z > 1.960.

  • Step 4. Compute the test statistic.

We now substitute the sample data into the formula for the test statistic identified in Step 2. We first compute the overall proportion of successes:

p̂ = (23 + 11) / (50 + 50) = 34/100 = 0.34

We now substitute to compute the test statistic:

Z = (0.46 − 0.22) / √( 0.34(0.66)(1/50 + 1/50) ) = 0.24/0.0947 = 2.53

  • Step 5. Conclusion.

We reject H0 because 2.53 > 1.960. We have statistically significant evidence at α = 0.05 to show that there is a difference in the proportions of patients on the new pain reliever reporting a meaningful reduction in pain as compared to patients on the standard pain reliever.

We now conduct the same test using the chi-square test of independence.  

H0: Treatment and outcome (meaningful reduction in pain) are independent

H1: H0 is false.         α = 0.05

The formula for the test statistic is:

χ² = Σ (O − E)² / E

For this test, df = (2−1)(2−1) = 1. At a 5% level of significance, the appropriate critical value is 3.84 and the decision rule is as follows: Reject H0 if χ² > 3.84. (Note that 1.96² = 3.84, where 1.96 was the critical value used in the Z test for proportions shown above.)

We now compute the expected frequencies using:

Expected Cell Frequency = (Row Total * Column Total)/N.

The computations can be organized in a two-way table. In each cell, the observed frequency is shown first, with the expected frequency in parentheses.

| | Reduction of 3+ Points | No Reduction of 3+ Points | Total |
|----|----|----|----|
| New Pain Reliever | 23 (17.0) | 27 (33.0) | 50 |
| Standard Pain Reliever | 11 (17.0) | 39 (33.0) | 50 |
| Total | 34 | 66 | 100 |

A condition for the appropriate use of the test statistic was that each expected frequency is at least 5. This is true for this sample (the smallest expected frequency is 17.0) and therefore it is appropriate to use the test statistic.

The test statistic is computed as follows:

χ² = (23 − 17.0)²/17.0 + (27 − 33.0)²/33.0 + (11 − 17.0)²/17.0 + (39 − 33.0)²/33.0 = 2.12 + 1.09 + 2.12 + 1.09 = 6.4

(Note that (2.53)² = 6.4, where 2.53 was the value of the Z statistic in the test for proportions shown above.)

We reject H0 because 6.4 > 3.84 and reach the same conclusion as with the Z test: the proportions of patients reporting a meaningful reduction in pain differ between treatments. With a dichotomous outcome and two independent groups, Z² = χ² once again.
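One caution when verifying this in R: chisq.test() applies the Yates continuity correction to 2 x 2 tables by default, which gives a smaller statistic than the hand computation; correct = FALSE reproduces the value above. A sketch:

```r
# Rows: new vs. standard pain reliever; columns: reduction of 3+ points yes/no
trial <- matrix(c(23, 27,
                  11, 39), nrow = 2, byrow = TRUE)

chisq.test(trial, correct = FALSE)   # X-squared = 6.4, df = 1, matches Z^2
chisq.test(trial)                    # default Yates correction gives approx 5.4
```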

Chi-Squared Tests in R

The video below by Mike Marin demonstrates how to perform chi-squared tests in the R programming language.

Answer to Problem on Pancreaticoduodenectomy and Surgical Apgar Scores

We have 3 independent comparison groups (Surgical Apgar Score) and a categorical outcome variable (morbidity/mortality). We can run a Chi-Squared test of independence.

H0: Apgar scores and patient outcome are independent of one another.

HA: Apgar scores and patient outcome are not independent.

χ² = 14.13 with df = (3−1)(3−1) = 4. The critical value for α = 0.05 with df = 4 is 9.49.

Since 14.13 is greater than 9.49, we reject H0.

There is an association between Apgar scores and patient outcome. The lowest Apgar score group (0 to 4) experienced the highest percentage of major morbidity or mortality (16 out of 57 = 28%) compared to the other Apgar score groups.
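The test is easy to verify in base R; the labels below are illustrative:

```r
sas <- matrix(c( 21, 20, 16,
                135, 71, 35,
                158, 62, 35),
              nrow = 3, byrow = TRUE,
              dimnames = list(c("0-4", "5-6", "7-10"),
                              c("None", "Minor", "Major/Mortality")))

chisq.test(sas)   # X-squared = 14.13, df = 4, p = 0.007
```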

Chi-Square (Χ²) Test & How To Calculate Formula Equation

Benjamin Frimodig


Chi-square (χ2) is used to test hypotheses about the distribution of observations into categories with no inherent ranking.

What Is a Chi-Square Statistic?

The Chi-square test (pronounced Kai) looks at the pattern of observations and will tell us if certain combinations of the categories occur more frequently than we would expect by chance, given the total number of times each category occurred.

It looks for an association between the variables. We cannot use a correlation coefficient to look for the patterns in this data because the categories often do not form a continuum.

There are three main types of Chi-square tests: the goodness-of-fit test, the test of independence, and the test for homogeneity. All three tests rely on the same formula to compute a test statistic.

These tests function by deciphering relationships between observed sets of data and theoretical or “expected” sets of data that align with the null hypothesis.

What is a Contingency Table?

Contingency tables (also known as two-way tables) are grids in which Chi-square data is organized and displayed. They provide a basic picture of the interrelation between two variables and can help find interactions between them.

In contingency tables, one variable and each of its categories are listed vertically, and the other variable and each of its categories are listed horizontally.

Additionally, including column and row totals, also known as “marginal frequencies,” will help facilitate the Chi-square testing process.

In order for the Chi-square test to be considered trustworthy, each cell of your expected contingency table must have a value of at least five.

Each Chi-square test will have one contingency table representing observed counts (see Fig. 1) and one contingency table representing expected counts (see Fig. 2).


Figure 1. Observed table (which contains the observed counts).

To obtain the expected frequencies for any cell in any cross-tabulation in which the two variables are assumed independent, multiply the row and column totals for that cell and divide the product by the total number of cases in the table.


Figure 2. Expected table (what we expect the two-way table to look like if the two categorical variables are independent).

To decide if our calculated value for χ2 is significant, we also need to work out the degrees of freedom for our contingency table using the following formula: df= (rows – 1) x (columns – 1).

Formula Calculation

χ² = Σ (O − E)² / E

Calculate the chi-square statistic (χ2) by completing the following steps:

  • Calculate the expected frequencies and the observed frequencies.
  • For each observed number in the table, subtract the corresponding expected number (O − E).
  • Square the difference (O − E)².
  • Divide the squares obtained for each cell in the table by the expected number for that cell: (O − E)² / E.
  • Sum all the values for (O − E)² / E. This is the chi-square statistic.
  • Calculate the degrees of freedom for the contingency table using the following formula: df = (rows − 1) × (columns − 1).

Once we have calculated the degrees of freedom (df) and the chi-squared value (χ2), we can use the χ2 table (often at the back of a statistics book) to check if our value for χ2 is higher than the critical value given in the table. If it is, then our result is significant at the level given.

Interpretation

The chi-square statistic tells you how much difference exists between the observed count in each table cell to the counts you would expect if there were no relationship at all in the population.

Small Chi-Square Statistic: If the chi-square statistic is small and the p-value is large (usually greater than 0.05), this often indicates that the observed frequencies in the sample are close to what would be expected under the null hypothesis.

The null hypothesis usually states no association between the variables being studied or that the observed distribution fits the expected distribution.

In theory, if the observed and expected values were equal (no difference), then the chi-square statistic would be zero — but this is unlikely to happen in real life.

Large Chi-Square Statistic: If the chi-square statistic is large and the p-value is small (usually less than 0.05), then the conclusion is often that the data does not fit the model well, i.e., the observed and expected values are significantly different. This often leads to the rejection of the null hypothesis.

How to Report

To report a chi-square output in an APA-style results section, always rely on the following template:

χ²(degrees of freedom, N = sample size) = chi-square statistic value, p = p value.

[SPSS output for the chi-square test of independence in the example below]

In the case of the above example, the results would be written as follows:

A chi-square test of independence showed that there was a significant association between gender and post-graduation education plans, χ2 (4, N = 101) = 54.50, p < .001.

APA Style Rules

  • Do not use a zero before a decimal when the statistic cannot be greater than 1 (proportion, correlation, level of statistical significance).
  • Report exact p values to two or three decimals (e.g., p = .006, p = .03).
  • However, report p values less than .001 as “ p < .001.”
  • Put a space before and after a mathematical operator (e.g., minus, plus, greater than, less than, equals sign).
  • Do not repeat statistics in both the text and a table or figure.

p-value Interpretation

You test whether a given χ² is statistically significant by testing it against a table of chi-square distributions, according to the number of degrees of freedom for your sample, which is the number of categories minus 1. The chi-square assumes that you have at least 5 observations per category.

If you are using SPSS, it will report an exact p-value.

For a chi-square test, a p-value that is less than or equal to the .05 significance level indicates that the observed values are significantly different from the expected values.

Thus, low p-values (p< .05) indicate a likely difference between the theoretical population and the collected sample. You can conclude that a relationship exists between the categorical variables.

Remember that p -values do not indicate the odds that the null hypothesis is true but rather provide the probability that one would obtain the sample distribution observed (or a more extreme distribution) if the null hypothesis was true.

A level of confidence necessary to accept the null hypothesis can never be reached. Therefore, conclusions are stated as either failing to reject the null hypothesis or rejecting it in favor of the alternative, depending on the calculated p-value.

The steps below show you how to analyze your data using a chi-square goodness-of-fit test in SPSS. If you have hypothesized equal expected proportions, Steps 1-3 are sufficient; to specify unequal expected counts, continue with Steps 4-6.

Step 1 : Analyze > Nonparametric Tests > Legacy Dialogs > Chi-square… on the top menu as shown below:

Step 2 : Move the variable indicating categories into the “Test Variable List:” box.

Step 3 : If you want to test the hypothesis that all categories are equally likely, click "OK." Otherwise, continue with the remaining steps.

Step 4 : Specify the expected count for each category by first clicking the “Values” button under “Expected Values.”

Step 5 : Then, in the box to the right of “Values,” enter the expected count for category one and click the “Add” button. Now enter the expected count for category two and click “Add.” Continue in this way until all expected counts have been entered.

Step 6 : Then click “OK.”

The steps below show you how to analyze your data using a chi-square test of independence in SPSS Statistics.

Step 1 : Open the Crosstabs dialog (Analyze > Descriptive Statistics > Crosstabs).

Step 2 : Select the variables you want to compare using the chi-square test. Click one variable in the left window and then click the arrow at the top to move the variable. Select the row variable and the column variable.

Step 3 : Click Statistics (a new pop-up window will appear). Check Chi-square, then click Continue.

Step 4 : (Optional) Check the box for Display clustered bar charts.

Step 5 : Click OK.

Goodness-of-Fit Test

The Chi-square goodness of fit test is used to compare a randomly collected sample containing a single, categorical variable to a larger population.

This test is most commonly used to compare a random sample to the population from which it was potentially collected.

The test begins with the creation of a null and alternative hypothesis. In this case, the hypotheses are as follows:

Null Hypothesis (Ho) : The observed frequencies are the same (except for chance variation) as the expected frequencies. The collected data is consistent with the population distribution.

Alternative Hypothesis (Ha) : The collected data is not consistent with the population distribution.

The next step is to create a contingency table that represents how the data would be distributed if the null hypothesis were exactly correct.

The sample’s overall deviation from this theoretical/expected data will allow us to draw a conclusion, with a more severe deviation resulting in smaller p-values.

Test for Independence

The Chi-square test for independence looks for an association between two categorical variables within the same population.

Unlike the goodness of fit test, the test for independence does not compare a single observed variable to a theoretical population but rather two variables within a sample set to one another.

The hypotheses for a Chi-square test of independence are as follows:

Null Hypothesis (Ho) : There is no association between the two categorical variables in the population of interest.

Alternative Hypothesis (Ha) : There is an association between the two categorical variables in the population of interest.

The next step is to create a contingency table of expected values that reflects how a data set that perfectly aligns the null hypothesis would appear.

The simplest way to do this is to calculate the marginal frequencies of each row and column; the expected frequency of each cell is equal to the product of the corresponding row and column marginal frequencies divided by the total sample size.

Test for Homogeneity

The Chi-square test for homogeneity is organized and executed in exactly the same way as the test for independence.

The main difference to remember between the two is that the test for independence looks for an association between two categorical variables within the same population, while the test for homogeneity determines if the distribution of a variable is the same in each of several populations (thus allocating population itself as the second categorical variable).

Null Hypothesis (Ho) : There is no difference in the distribution of a categorical variable for several populations or treatments.

Alternative Hypothesis (Ha) : There is a difference in the distribution of a categorical variable for several populations or treatments.

The difference between these two tests can be a bit tricky to determine, especially in the practical applications of a Chi-square test. A reliable rule of thumb is to determine how the data was collected.

If the data consists of only one random sample with the observations classified according to two categorical variables, it is a test for independence. If the data consists of more than one independent random sample, it is a test for homogeneity.

What is the chi-square test?

The Chi-square test is a non-parametric statistical test used to determine if there’s a significant association between two or more categorical variables in a sample.

It works by comparing the observed frequencies in each category of a cross-tabulation with the frequencies expected under the null hypothesis, which assumes there is no relationship between the variables.

This test is often used in fields like biology, marketing, sociology, and psychology for hypothesis testing.

What does chi-square tell you?

The Chi-square test informs whether there is a significant association between two categorical variables. If the calculated Chi-square value is above the critical value from the Chi-square distribution, it suggests a significant relationship between the variables, and the null hypothesis of no association is rejected.

How to calculate chi-square?

To calculate the Chi-square statistic, follow these steps:

1. Create a contingency table of observed frequencies for each category.

2. Calculate expected frequencies for each category under the null hypothesis.

3. Compute the Chi-square statistic using the formula: Χ² = Σ [ (O_i − E_i)² / E_i ], where O_i is the observed frequency and E_i is the expected frequency.

4. Compare the calculated statistic with the critical value from the Chi-square distribution to draw a conclusion.
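These four steps can also be carried out directly in base R without chisq.test(); a minimal sketch using a made-up 2 x 2 table of observed counts:

```r
# Step 1: a hypothetical table of observed counts (values invented for illustration)
O <- matrix(c(30, 20,
              10, 40), nrow = 2, byrow = TRUE)

# Step 2: expected counts under independence = (row total * column total) / N
E <- outer(rowSums(O), colSums(O)) / sum(O)

# Step 3: the chi-square statistic, summed over all cells
chi_sq <- sum((O - E)^2 / E)

# Step 4: compare to the chi-square distribution via the upper-tail p-value
df <- (nrow(O) - 1) * (ncol(O) - 1)
p  <- pchisq(chi_sq, df = df, lower.tail = FALSE)
c(chi_sq = chi_sq, df = df, p = p)
```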



How to Conduct a Chi-Square Test

Mr Edwards

In social science research, one of the common tasks researchers undertake is analyzing relationships between categorical variables. Understanding how variables like gender, ethnicity, or occupation are distributed across different categories can reveal significant insights about patterns of inequality, social behavior, or institutional biases. One powerful statistical tool used to analyze such relationships is the Chi-Square test. This test allows sociologists to explore whether the observed frequencies in a categorical dataset deviate from what would be expected under a given hypothesis. In this article, we will explore how to conduct a Chi-Square test step-by-step, breaking down the concepts and calculations in a way that is accessible to undergraduate sociology students.

Understanding the Chi-Square Test

The Chi-Square test is a statistical method designed to examine the association between two or more categorical variables. These variables represent data that can be categorized into distinct groups or categories. For instance, gender (male, female, other) and level of education (high school, college, graduate) are examples of categorical variables that may be of interest in sociological research. The test compares the observed frequencies of different categories against the expected frequencies, under the assumption that there is no association between the variables.

There are two primary types of Chi-Square tests: the Chi-Square test for independence and the Chi-Square test for goodness of fit. The test for independence examines whether two categorical variables are related, while the goodness-of-fit test determines if the observed distribution of a single categorical variable matches an expected distribution. In this article, we will focus on the Chi-Square test for independence, as it is more commonly used in sociology research.

When to Use a Chi-Square Test

Before diving into the details of how to conduct a Chi-Square test, it is essential to understand when it is appropriate to use this statistical tool. The Chi-Square test is best suited for scenarios where the data is categorical and the researcher is interested in testing the relationship between two variables. Some common sociological research questions that can be addressed using a Chi-Square test include:

  • Is there a relationship between gender and voting behavior?
  • Are educational attainment levels related to employment status?
  • Does racial or ethnic background correlate with access to healthcare?

To use the Chi-Square test effectively, the data must meet certain conditions:

  • Independence : The observations in each category must be independent of each other. This means that no individual or case should appear in more than one category.
  • Expected Frequencies : Each cell in the contingency table (which we will discuss shortly) should have an expected frequency of at least 5. If the expected frequencies are too small, the results of the test may be unreliable.
  • Sample Size : The test is more reliable with larger sample sizes. While there is no strict rule, having at least 30 observations is generally recommended for a Chi-Square test.

If these conditions are met, the Chi-Square test is an appropriate method for testing relationships between categorical variables.

Steps for Conducting a Chi-Square Test

1. Formulating Hypotheses

As with any statistical test, conducting a Chi-Square test begins with formulating two hypotheses: the null hypothesis (H₀) and the alternative hypothesis (H₁).

  • Null Hypothesis (H₀) : This hypothesis states that there is no relationship between the two categorical variables. In other words, the variables are independent of each other.
  • Alternative Hypothesis (H₁) : The alternative hypothesis suggests that there is a relationship between the variables, meaning that they are not independent.

For example, if you are studying the relationship between gender and voting behavior, your hypotheses might be:

  • H₀ : There is no relationship between gender and voting behavior.
  • H₁ : There is a relationship between gender and voting behavior.

2. Collecting and Organizing Data

Next, gather data on the variables you are interested in examining. This data should be categorical, with each observation falling into one category for each variable. Once you have collected the data, organize it into a contingency table, which displays the frequencies of observations for each combination of categories.

For instance, if you are analyzing the relationship between gender and voting behavior, your contingency table might look like this:

| | Voted | Did Not Vote | Total |
|----|----|----|----|
| Male | 45 | 15 | 60 |
| Female | 55 | 25 | 80 |
| Non-Binary | 10 | 5 | 15 |
| Total | 110 | 45 | 155 |

In this example, the table shows the observed frequencies of individuals who voted or did not vote, broken down by gender.

3. Calculating Expected Frequencies

The next step involves calculating the expected frequencies for each cell in the contingency table, assuming that the null hypothesis is true (i.e., there is no relationship between the variables). The expected frequency for each cell is calculated using the following formula:

Eij = (Row Total of Row i × Column Total of Column j) / Grand Total

where:

  • Eij is the expected frequency for cell (i,j),
  • Row Total of Row i is the total number of observations in the i-th row,
  • Column Total of Column j is the total number of observations in the j-th column,
  • Grand Total is the total number of observations in the entire dataset.

Let’s calculate the expected frequencies for the first cell (Male, Voted) using the table above. The row total for males is 60, the column total for those who voted is 110, and the grand total is 155.

E (Male, Voted) = (60 × 110) / 155 = 42.58

You would repeat this process for each cell in the contingency table to obtain the expected frequencies.

4. Computing the Chi-Square Statistic

Once you have the observed and expected frequencies, the next step is to calculate the Chi-Square statistic. This statistic measures the difference between the observed and expected frequencies for each cell in the contingency table. The formula for the Chi-Square statistic is:

χ² = Σ [ (Oij − Eij)² / Eij ]

where:

  • χ² is the Chi-Square statistic,
  • Oij is the observed frequency for cell (i,j),
  • Eij is the expected frequency for cell (i,j),
  • The sum is taken over all cells in the contingency table.

For each cell, you subtract the expected frequency from the observed frequency, square the result, and divide by the expected frequency. After calculating this value for all cells, you sum the results to obtain the overall Chi-Square statistic.

5. Determining Degrees of Freedom

Degrees of freedom (df) are a critical component in determining the significance of the Chi-Square statistic. In the case of a Chi-Square test for independence, the degrees of freedom are calculated using the formula:

df = (r – 1) × (c – 1)

  • r is the number of rows in the contingency table,
  • c is the number of columns in the contingency table.

In our example, there are 3 rows (Male, Female, Non-Binary) and 2 columns (Voted, Did Not Vote). Therefore, the degrees of freedom would be:

df = (3 – 1) × (2 – 1) = 2

6. Interpreting the Results

To determine whether the relationship between the variables is statistically significant, compare the calculated Chi-Square statistic to a critical value from the Chi-Square distribution table. The critical value depends on two factors: the degrees of freedom and the chosen significance level (often set at 0.05, or 5%).

If the calculated Chi-Square statistic is greater than the critical value, you reject the null hypothesis, indicating a significant relationship between the variables. If it is less than the critical value, you fail to reject the null hypothesis, meaning the data provide insufficient evidence of a relationship between the variables.
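Continuing the sketch, the critical value can be looked up with qchisq, and the same decision can be reached through the p-value:

```r
qchisq(0.95, df = 2)                        # critical value, about 5.99
pchisq(chi_sq, df = 2, lower.tail = FALSE)  # p-value, about 0.67
chi_sq > qchisq(0.95, df = 2)               # FALSE: fail to reject H0
```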

7. Reporting the Results

When reporting the results of a Chi-Square test in a research paper or article, it is important to provide a clear summary of the findings. Typically, this includes the following information:

  • The observed Chi-Square statistic,
  • The degrees of freedom,
  • The p-value (the probability of obtaining an association at least as strong as the one observed if the null hypothesis were true),
  • Whether the result is statistically significant (i.e., whether you reject the null hypothesis),
  • A brief interpretation of the findings in the context of the research question.

For example, you might report your results as follows:

“A Chi-Square test for independence was performed to examine the relationship between gender and voting behavior. The test revealed no significant association between the two variables, χ 2 (2, N = 155) = 0.80, p = .67, indicating that the data provide no evidence that voting behavior depends on gender.”
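As a check, R’s built-in chisq.test reproduces these numbers directly from the observed table (a sketch, not part of the original article):

```r
chisq.test(observed)
# X-squared = 0.80, df = 2, p-value = 0.67
# R also warns that the approximation may be inaccurate here, because one
# expected count (Non-Binary / Did Not Vote = 4.35) is below 5
```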

The Chi-Square test is an invaluable tool in sociology, enabling researchers to explore the relationships between categorical variables and to test hypotheses about social behavior and structures. By following the steps outlined in this article—formulating hypotheses, collecting data, calculating expected frequencies, computing the Chi-Square statistic, determining degrees of freedom, and interpreting the results—sociologists can rigorously test whether observed patterns in their data reflect significant associations or are merely the result of random chance.

Understanding and correctly applying the Chi-Square test helps sociologists draw meaningful conclusions about the social world, contributing to broader discussions about inequality, social behavior, and institutional practices. As a fundamental part of the sociologist’s toolkit, mastering the Chi-Square test is essential for undergraduate students who wish to engage critically with empirical research.


Mastering the Chi-Square Test: A Comprehensive Guide

The Chi-Square Test is a statistical method used to determine if there’s a significant association between two categorical variables in a sample data set. It checks the independence of these variables, making it a robust and flexible tool for data analysis.

Introduction to Chi-Square Test

The  Chi-Square Test  of Independence is an important tool in the statistician’s arsenal. Its primary function is determining whether a significant association exists between two categorical variables in a sample data set. Essentially, it’s a test of independence, gauging if variations in one variable can impact another.

This comprehensive guide gives you a deeper understanding of the Chi-Square Test, its mechanics, importance, and correct implementation.

  • The Chi-Square Test assesses the association between two categorical variables.
  • The Chi-Square Test requires the data to be a random sample.
  • The Chi-Square Test is designed for categorical or nominal variables.
  • Each observation must fall into exactly one category, so the categories are mutually exclusive and exhaustive.
  • The Chi-Square Test can’t establish causality, only an association between variables.


Case Study: Chi-Square Test in Real-World Scenario

Let’s delve into a real-world scenario to illustrate the application of the  Chi-Square Test . Picture this: you’re the lead data analyst for a burgeoning shoe company. The company has an array of products but wants to enhance its marketing strategy by understanding if there’s an association between gender (Male, Female) and product preference (Sneakers, Loafers).

To start, you collect data from a random sample of customers, using a survey to identify their gender and their preferred shoe type. This data then gets organized into a contingency table , with gender across the top and shoe type down the side.

Next, you apply the Chi-Square Test to this data. The null hypothesis (H0) is that gender and shoe preference are independent. In contrast, the alternative hypothesis (H1) proposes that these variables are associated. After calculating the expected frequencies and the Chi-Square statistic, you compare this statistic with the critical value from the Chi-Square distribution.

Suppose the Chi-Square statistic is higher than the critical value in our scenario, leading to the rejection of the null hypothesis. This result indicates a significant association between gender and shoe preference. With this insight, the shoe company has valuable information for targeted marketing campaigns.

For instance, if the data shows that females prefer sneakers over loafers, the company might emphasize its sneaker line in marketing materials directed toward women. Conversely, if men show a higher preference for loafers, the company can highlight these products in campaigns targeting men.

This case study exemplifies the power of the Chi-Square Test. It’s a simple and effective tool that can drive strategic decisions in various real-world contexts, from marketing to medical research.
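A hedged sketch of how the analysis might look in R; the counts below are invented purely for illustration, since the case study reports no data:

```r
# Hypothetical survey counts for the shoe-company scenario
shoes <- matrix(c(70, 30,
                  45, 55),
                nrow = 2, byrow = TRUE,
                dimnames = list(Gender = c("Female", "Male"),
                                Preference = c("Sneakers", "Loafers")))
chisq.test(shoes)                   # applies Yates' continuity correction (2x2)
chisq.test(shoes, correct = FALSE)  # the uncorrected statistic
```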

The Mathematics Behind Chi-Square Test

At the heart of the  Chi-Square Test  lies the calculation of the discrepancy between the observed data and the data expected under the assumption of variable independence. This discrepancy, termed the Chi-Square statistic, is calculated as the sum of squared differences between observed (O) and expected (E) frequencies, normalized by the expected frequencies in each category.

In mathematical terms, the Chi-Square statistic (χ²) can be represented as follows: χ² = Σ [ (Oᵢ – Eᵢ)² / Eᵢ ] , where the summation (Σ) is carried over all categories.

This formula quantifies the discrepancy between our observations and what we would expect if the null hypothesis of independence were true. We can decide on the variables’ independence by comparing the calculated Chi-Square statistic to a critical value from the Chi-Square distribution. Suppose the computed χ² is greater than the critical value. In that case, we reject the null hypothesis, indicating a significant association between the variables.

Step-by-Step Guide to Perform Chi-Square Test

To effectively execute a  Chi-Square Test , follow these methodical steps:

State the Hypotheses:  The null hypothesis (H0) posits no association between the variables — i.e., independent — while the alternative hypothesis (H1) posits an association between the variables.

Construct a Contingency Table:  Create a matrix to present your observations, with one variable defining the rows and the other defining the columns. Each table cell shows the frequency of observations corresponding to a particular combination of variable categories.

Calculate the Expected Values:  For each cell in the contingency table, calculate the expected frequency assuming that H0 is true. This can be calculated by multiplying the sum of the row and column for that cell and dividing by the total number of observations.

Compute the Chi-Square Statistic:  Apply the formula χ² = Σ [ (Oᵢ – Eᵢ)² / Eᵢ ] to compute the Chi-Square statistic.

Compare Your Test Statistic:  Evaluate your test statistic against a Chi-Square distribution to find the p-value, which will indicate the statistical significance of your test. If the p-value is less than your chosen significance level (usually 0.05), you reject H0.

Interpretation of the results should always be in the context of your research question and hypothesis. This includes considering practical significance — not just statistical significance — and ensuring your findings align with the broader theoretical understanding of the topic.


Assumptions, Limitations, and Misconceptions

The  Chi-Square Test , a vital tool in statistical analysis, comes with certain assumptions and distinct limitations. Firstly, it presumes that the data used are a  random sample  from a larger population and that the variables under investigation are nominal or categorical. Each observation must fall into one unique category or cell in the analysis, meaning observations are mutually  exclusive  and  exhaustive .

The Chi-Square Test has limitations when deployed with small sample sizes. The  expected frequency  of any cell in the contingency table should ideally be 5 or more. If it falls short, this can cause distortions in the test findings, potentially triggering a Type I or Type II error.

Misuse and misconceptions about this test often center on its application and interpretability. A common error is using it for continuous or ordinal data without appropriate  categorization , leading to misleading results. Also, a significant result from a Chi-Square Test indicates an association between variables, but it doesn’t infer  causality . This is a frequent misconception — interpreting the association as proof of causality — while the test doesn’t offer information about whether changes in one variable cause changes in another.

Moreover, a significant Chi-Square test alone is not enough to comprehensively understand the relationship between variables. To get a more nuanced interpretation, it’s crucial to accompany the test with a measure of  effect size , such as Cramér’s V, or the Phi coefficient for a 2×2 contingency table. These measures describe the strength of the association, adding another dimension to the interpretation of results. This matters because statistically significant results do not necessarily imply a practically significant effect. An effect size measure is especially critical with large sample sizes, where even minor deviations from independence can produce a significant Chi-Square test.
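As an illustration of the effect-size point, here is a small R helper implementing Cramér’s V from the formula V = sqrt(χ² / (n × (min(r, c) − 1))); the function name is our own, and it reuses the hypothetical shoes table from the case-study sketch above:

```r
# Cramer's V for an r x c contingency table
cramers_v <- function(tab) {
  chi2 <- suppressWarnings(chisq.test(tab, correct = FALSE)$statistic)
  k <- min(nrow(tab), ncol(tab)) - 1   # min(r, c) - 1
  as.numeric(sqrt(chi2 / (sum(tab) * k)))
}
cramers_v(shoes)  # for a 2x2 table this equals the absolute phi coefficient
```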

Conclusion and Further Reading

Mastering the  Chi-Square Test  is vital in any data analyst’s or statistician’s journey. Its wide range of applications and robustness make it a tool you’ll turn to repeatedly.

For further learning, statistical textbooks and online courses can provide more in-depth knowledge and practice. Don’t hesitate to delve deeper and keep exploring the fascinating world of data analysis.

  • Effect Size for Chi-Square Tests
  • Assumptions for the Chi-Square Test
  • Assumptions for Chi-Square Test (Story)
  • Chi Square Test – an overview (External Link)
  • Understanding the Null Hypothesis in Chi-Square
  • What is the Difference Between the T-Test vs. Chi-Square Test?
  • How to Report Chi-Square Test Results in APA Style: A Step-By-Step Guide

Frequently Asked Questions (FAQ)

What is the Chi-Square Test? It’s a statistical test used to determine if there’s a significant association between two categorical variables.

What kind of variables does it require? The test is suitable for categorical or nominal variables.

Can it establish causality? No, the test can only indicate an association, not a causal relationship.

What assumptions does it make? The test assumes that the data is a random sample and that observations are mutually exclusive and exhaustive.

What does the Chi-Square statistic measure? It measures the discrepancy between observed and expected data, calculated by χ² = Σ [ (Oᵢ – Eᵢ)² / Eᵢ ].

When is a result statistically significant? The result is generally considered statistically significant if the p-value is less than 0.05.

What happens if the test is misused? Misuse can lead to misleading results, making it crucial to use it with categorical data only.

How do small sample sizes affect the test? Small sample sizes can lead to wrong results, especially when expected cell frequencies are less than 5.

Why do low expected cell frequencies matter? Low expected cell frequencies can lead to Type I or Type II errors.

How should the results be interpreted? Results should be interpreted in context, considering the statistical significance and the broader understanding of the topic.


S.4 Chi-Square Tests

Chi-Square Test of Independence

Do you remember how to test the independence of two categorical variables? This test is performed by using a Chi-square test of independence.

Recall that we can summarize two categorical variables within a two-way table, also called an r × c contingency table, where r = number of rows, c = number of columns. Our question of interest is “Are the two variables independent?” This question is set up using the following hypothesis statements:

\(H_0\): The two variables are independent.

\(H_a\): The two variables are not independent.

The expected count for each cell is

 \[E=\frac{\text{row total}\times\text{column total}}{\text{sample size}}\]

and the test statistic is

 \[\chi^2=\sum \frac{(O-E)^2}{E}\]

We will compare the value of the test statistic to the critical value of \(\chi_{\alpha}^2\) with the degree of freedom = ( r - 1) ( c - 1), and reject the null hypothesis if \(\chi^2 \gt \chi_{\alpha}^2\).

Example S.4.1

Is gender independent of education level? A random sample of 395 people was surveyed and each person was asked to report the highest education level they obtained. The data that resulted from the survey are summarized in the following table:

          High School   Bachelors   Masters   Ph.D.   Total
Female         60           54         46       41     201
Male           40           44         53       57     194
Total         100           98         99       98     395

Question : Are gender and education level dependent at a 5% level of significance? In other words, given the data collected above, is there a relationship between the gender of an individual and the level of education that they have obtained?

Here's the table of expected counts:

          High School   Bachelors   Masters   Ph.D.   Total
Female       50.886       49.868     50.377   49.868    201
Male         49.114       48.132     48.623   48.132    194
Total       100            98         99       98       395

So, working this out, \(\chi^2= \dfrac{(60−50.886)^2}{50.886} + \cdots + \dfrac{(57 − 48.132)^2}{48.132} = 8.006\)

The critical value of \(\chi^2\) with 3 degrees of freedom is 7.815. Since 8.006 > 7.815, we reject the null hypothesis and conclude that the education level depends on gender at a 5% level of significance.
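The whole example can be reproduced with a few lines of R (a sketch; the object name is arbitrary):

```r
education <- matrix(c(60, 54, 46, 41,
                      40, 44, 53, 57),
                    nrow = 2, byrow = TRUE,
                    dimnames = list(Gender = c("Female", "Male"),
                                    Level  = c("HS", "BS", "MS", "PhD")))
chisq.test(education)   # X-squared = 8.006, df = 3, p-value = 0.046
qchisq(0.95, df = 3)    # critical value 7.815
```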


The Chi-Square Test

What is a Chi-square test?

A Chi-square test is a hypothesis testing method. Two common Chi-square tests involve checking if observed frequencies in one or more categories match expected frequencies.

Is a Chi-square test the same as a χ² test?

Yes, χ is the Greek symbol Chi.

What are my choices?

If you have a single categorical variable, you use a Chi-square goodness of fit test . If you have two categorical variables, you use a Chi-square test of independence . There are other Chi-square tests, but these two are the most common.

Types of Chi-square tests

You use a Chi-square test for hypothesis tests about whether your data is as expected. The basic idea behind the test is to compare the observed values in your data to the expected values that you would see if the null hypothesis is true.

There are two commonly used Chi-square tests: the Chi-square goodness of fit test and the Chi-square test of independence . Both tests involve variables that divide your data into categories. As a result, people can be confused about which test to use. The table below compares the two tests.

Visit the individual pages for each type of Chi-square test to see examples along with details on assumptions and calculations.

Table 1: Choosing a Chi-square test

Chi-square goodness of fit test

  • Number of variables: One
  • Purpose of test: Decide if one variable is likely to come from a given distribution or not
  • Example: Decide if bags of candy have the same number of pieces of each flavor or not
  • Hypotheses: H 0 : proportions of flavors of candy are the same; H a : proportions of flavors are not the same
  • Distribution used in test: Chi-Square
  • Degrees of freedom: Number of categories minus 1

Chi-square test of independence

  • Number of variables: Two
  • Purpose of test: Decide if two variables might be related or not
  • Example: Decide if movie goers' decision to buy snacks is related to the type of movie they plan to watch
  • Hypotheses: H 0 : proportion of people who buy snacks is independent of the movie type; H a : proportion of people who buy snacks is different for different types of movies
  • Distribution used in test: Chi-Square
  • Degrees of freedom: Number of categories for first variable minus 1, multiplied by number of categories for second variable minus 1

How to perform a Chi-square test

For both the Chi-square goodness of fit test and the Chi-square test of independence , you perform the same analysis steps, listed below. Visit the pages for each type of test to see these steps in action.

  • Define your null and alternative hypotheses before collecting your data.
  • Decide on the alpha value. This involves deciding the risk you are willing to take of drawing the wrong conclusion. For example, suppose you set α=0.05 when testing for independence. Here, you have decided on a 5% risk of concluding the two variables are independent when in reality they are not.
  • Check the data for errors.
  • Check the assumptions for the test. (Visit the pages for each test type for more detail on assumptions.)
  • Perform the test and draw your conclusion.

Both Chi-square tests in the table above involve calculating a test statistic. The basic idea behind the tests is that you compare the actual data values with what would be expected if the null hypothesis is true. The test statistic involves finding the squared difference between actual and expected data values, and dividing that difference by the expected data values. You do this for each data point and add up the values.

Then, you compare the test statistic to a theoretical value from the Chi-square distribution . The theoretical value depends on both the alpha value and the degrees of freedom for your data. Visit the pages for each test type for detailed examples.
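For concreteness, here is a hedged R sketch of both tests, using the candy and movie-snack examples from Table 1; every count below is invented for illustration:

```r
# Goodness of fit: are four candy flavors equally common in a bag of 100?
flavors <- c(cherry = 30, lemon = 22, grape = 26, orange = 22)
chisq.test(flavors, p = rep(1/4, 4))

# Independence: is buying snacks related to the type of movie?
snacks <- matrix(c(50, 75, 125,
                   90, 60,  50),
                 nrow = 2, byrow = TRUE,
                 dimnames = list(Snacks = c("Yes", "No"),
                                 Movie  = c("Action", "Comedy", "Family")))
chisq.test(snacks)
```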

Chi-Square Test of Independence

The Chi-Square test of independence is used to determine if there is a significant relationship between two nominal (categorical) variables.  The frequency of each category for one nominal variable is compared across the categories of the second nominal variable.  The data can be displayed in a contingency table where each row represents a category for one variable and each column represents a category for the other variable.  For example, say a researcher wants to examine the relationship between gender (male vs. female) and empathy (high vs. low).  The chi-square test of independence can be used to examine this relationship.  The null hypothesis for this test is that there is no relationship between gender and empathy.  The alternative hypothesis is that there is a relationship between gender and empathy (e.g. there are more high-empathy females than high-empathy males).

Calculate Chi Square Statistic by Hand

First we have to calculate the expected value of the two nominal variables.  We can calculate the expected value of the two nominal variables by using this formula:

E = (row total × column total) / N

where N = total number of observations.

After calculating the expected value, we will apply the following formula to calculate the value of the Chi-Square test of independence:

χ 2 = Σ [(O – E) 2 / E]

where O = observed frequency and E = expected frequency.

The degrees of freedom are calculated using the following formula:

df = (r – 1)(c – 1)

where r = number of rows and c = number of columns.


Null hypothesis: Assumes that there is no association between the two variables.

Alternative hypothesis: Assumes that there is an association between the two variables.

Hypothesis testing: Hypothesis testing for the chi-square test of independence proceeds as it does for other tests like ANOVA : a test statistic is computed and compared to a critical value.  The critical value for the chi-square statistic is determined by the level of significance (typically .05) and the degrees of freedom.  The degrees of freedom for the chi-square are calculated using the following formula: df = (r-1)(c-1), where r is the number of rows and c is the number of columns. If the observed chi-square test statistic is greater than the critical value, the null hypothesis can be rejected.


chi-squared test


chi-squared test, a hypothesis-testing method in which observed frequencies are compared with expected frequencies for experimental outcomes.

In hypothesis testing , data from a sample are used to draw conclusions about a population parameter or a population probability distribution. First, a tentative assumption is made about the parameter or distribution. This assumption is called the null hypothesis and is denoted by H 0 . An alternative hypothesis (denoted H a ), which is the opposite of what is stated in the null hypothesis, is then defined. The hypothesis-testing procedure involves using sample data to determine whether H 0 can be rejected. If H 0 is rejected, the statistical conclusion is that the alternative hypothesis H a is true.

The chi-squared test is such a hypothesis test. First, one selects a significance level α, the threshold below which sample results are deemed too unlikely to be consistent with the null hypothesis. From the sample one then computes a p -value, a measure of how likely the sample results are, assuming the null hypothesis is true; the smaller the p -value, the less likely the sample results. If the p -value is less than α, the null hypothesis can be rejected; otherwise, the null hypothesis cannot be rejected. The value of α is often chosen to be 0.05.

One then calculates the chi-squared value. The formula for the chi-squared test is χ 2 = Σ ( O i − E i ) 2 / E i , where χ 2 represents the chi-squared value, O i represents the observed value, E i represents the expected value (that is, the value expected from the null hypothesis), and the symbol Σ represents the summation of values for all i . One then looks up in a table the chi-squared value that corresponds to the chosen significance level α and the number of degrees of freedom of the data (that is, the number of categories of the data minus one). If that value from the table is less than the chi-squared value calculated from the data, one can reject the null hypothesis.

The two most common chi-squared tests are the one-variable goodness of fit test and the two-variable test of independence. The one-variable goodness of fit test determines if one variable value is likely or not likely to be within a given distribution. For example, suppose a study was being conducted to measure the volume of soda in cans being filled with soda at a bottling and distribution centre. A one-variable goodness of fit test might be used to determine the likelihood that a randomly selected can of soda has a volume within a fixed volume range, that is, the range of all acceptable volumes of soda in cans filled at the centre.

The two-variable test of independence determines whether two variables could be related. For example, a two-variable test of independence could be used to test whether there is a correlation between the types of books people choose to read and the season of the year when they make their choices.


How the Chi-Squared Test of Independence Works

By Jim Frost

Chi-squared tests of independence determine whether a relationship exists between two categorical variables . Do the values of one categorical variable depend on the value of the other categorical variable? If the two variables are independent, knowing the value of one variable provides no information about the value of the other variable.

I’ve previously written about Pearson’s chi-square test of independence using a fun Star Trek example . Are the uniform colors related to the chances of dying? You can test the notion that the infamous red shirts have a higher likelihood of dying. In that post, I focused on the purpose of the test, applied it to this example, and interpreted the results.

In this post, I’ll take a bit of a different approach. I’ll show you the nuts and bolts of how to calculate the expected values, chi-square value, and degrees of freedom. Then you’ll learn how to use the chi-squared distribution in conjunction with the degrees of freedom to calculate the p-value.

I’ve used the same approach to explain how:

  • t-Tests work .
  • F-tests work in one-way ANOVA .

Of course, you’ll usually just let your statistical software perform all calculations. However, understanding the underlying methodology helps you fully comprehend the analysis.

Chi-Squared Example Dataset

For the Star Trek example, uniform color and status are the two categorical variables. The contingency table below shows the combination of variable values, frequencies, and percentages.

             Blue     Gold     Red     Total
Dead            7        9       24       40
Alive         129       46      215      390
Total         136       55      239    N = 430
Fatality %  5.15%   16.36%   10.04%

If uniform color were unrelated to fatalities, we would expect roughly equal fatality rates across the colors. However, our fatality rates are not equal. Gold has the highest fatality rate at 16.36%, while Blue has the lowest at 5.15%. Red is in the middle at 10.04%. Does this inequality in our sample suggest that the fatality rates are different in the population? Does a relationship exist between uniform color and fatalities?

Thanks to random sampling error, our sample’s fatality rates don’t exactly equal the population’s rates. If the population rates are equal, we’d likely still see differences in our sample. So, the question becomes, after factoring in sampling error, are the fatality rates in our sample different enough to conclude that they’re different in the population? In other words, we want to be confident that the observed differences represent a relationship in the population rather than merely random fluctuations in the sample. That’s where Pearson’s chi-squared test for independence comes in!

Hypotheses for Our Test

The two hypotheses for the chi-squared test of independence are the following:

  • Null : The variables are independent. No relationship exists.
  • Alternative : A relationship between the variables exists.

Related posts : Hypothesis Testing Overview and Guide to Data Types

Calculating the Expected Frequencies for the Chi-squared Test of Independence

The chi-squared test of independence compares our sample data in the contingency table to the distribution of values we’d expect if the null hypothesis is correct. Let’s construct the contingency table we’d expect to see if the null hypothesis is true for our population.

For chi-squared tests, the term “expected frequencies” refers to the values we’d expect to see if the null hypothesis is true. To calculate the expected frequency for a specific combination of categorical variables (e.g., blue shirts who died), multiply the column total (Blue) by the row total (Dead), and divide by the sample size.

Row total X Column total / Sample Size = Expected value for one table cell

To calculate the expected frequency for the Dead/Blue cell in our dataset, do the following:

  • Find the row total for Dead (40)
  • Find the column total for Blue (136)
  • Multiply those two values and divide by the sample size (430)

40 * 136 / 430 = 12.65

If the null hypothesis is true, we’d expect to see 12.65 fatalities for wearers of the Blue uniforms in our sample. Of course, we can’t have a fraction of a death, but that doesn’t affect the results.
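A short R sketch of the expected-frequency calculation for the full table (object names are our own):

```r
observed <- matrix(c(  7,  9,  24,
                     129, 46, 215),
                   nrow = 2, byrow = TRUE,
                   dimnames = list(Status = c("Dead", "Alive"),
                                   Color  = c("Blue", "Gold", "Red")))
expected <- outer(rowSums(observed), colSums(observed)) / sum(observed)
round(expected, 2)  # Dead/Blue is 40 * 136 / 430 = 12.65, as above
```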

Contingency Table with the Expected Values

I’ll calculate the expected values for all six cells that represent the combinations of the three uniform colors and two statuses. I’ll also include the observed values in our sample. Expected values are in parentheses.

              Blue            Gold           Red             Total
Dead            7 (12.65)       9 (5.12)      24 (22.23)        40
Alive         129 (123.35)     46 (49.88)    215 (216.77)      390
Expected %   9.3%             9.3%           9.3%

In this table, notice how the column percentages for the expected dead are all 9.3%. This equality occurs when the null hypothesis is valid, which is the condition that the expected values represent.

Using this table, we can also compare the values we observe in our sample to the frequencies we’d expect if the null hypothesis that the variables are not related is correct.

For example, the observed frequency for Blue/Dead is less than the expected value (7 < 12.65). In our sample, deaths of those in blue uniforms occurred less frequently than we’d expect if the variables are independent. On the other hand, the observed frequency for Gold/Dead is greater than the expected value (9 > 5.12). Meanwhile, the observed frequency for Red/Dead approximately equals the expected value. This interpretation matches what we concluded by assessing the column percentages in the first contingency table.

Pearson’s chi-squared test works by mathematically comparing observed frequencies to the expected values and boiling all those differences down into one number. Let’s see how it does that!

Related post : Using Contingency Tables to Calculate Probabilities

Calculating the Chi-Squared Statistic

Most hypothesis tests calculate a test statistic. For example, t-tests use t-values and F-tests use F-values as their test statistics. These statistical tests compare your observed sample data to what you would expect if the null hypothesis is true. The calculations reduce your sample data down to one value that represents how different your data are from the null. Learn more about Test Statistics .

For chi-squared tests, the test statistic is, unsurprisingly, chi-squared, or χ 2 .

The chi-squared calculations involve a familiar concept in statistics—the sum of the squared differences between the observed and expected values. This concept is similar to how regression models assess goodness-of-fit using the sum of the squared differences.

Here’s the formula for chi-squared:

χ 2 = Σ [(O – E) 2 / E]

Let’s walk through it!

To calculate the chi-squared statistic, take the difference between a pair of observed (O) and expected values (E), square the difference, and divide that squared difference by the expected value. Repeat this process for all cells in your contingency table and sum those values. The resulting value is χ 2 . We’ll calculate it for our example data shortly!

Important Considerations about the Chi-Squared Statistic

Notice several important considerations about chi-squared values:

Zero represents the null hypothesis. If all your observed frequencies equal the expected frequencies exactly, the chi-squared value for each cell equals zero, and the overall chi-squared statistic equals zero. Zero indicates your sample data exactly match what you’d expect if the null hypothesis is correct.

Squaring the differences ensures both that cell values must be non-negative and that larger differences are weighted more than smaller differences. A cell can never subtract from the chi-squared value.

Larger values represent a greater difference between your sample data and the null hypothesis. Chi-squared tests are one-tailed tests rather than the more familiar two-tailed tests. The test determines whether the entire set of differences exceeds a significance threshold. If your χ 2 passes the limit, your results are statistically significant! You can reject the null hypothesis and conclude that the variables are dependent–a relationship exists.

Related post : One-tailed and Two-tailed Hypothesis Tests

Calculating Chi-Squared for our Example Data

Let’s calculate the chi-squared statistic for our example data! To do that, I’ll rearrange the contingency table, making it easier to illustrate how to calculate the sum of the squared differences.

Color   Status   Observed   Expected   (O – E) 2 / E
Blue    Dead          7       12.65        2.524
Gold    Dead          9        5.12        2.948
Red     Dead         24       22.23        0.141
Blue    Alive       129      123.35        0.259
Gold    Alive        46       49.88        0.302
Red     Alive       215      216.77        0.014
Total                                      6.19

The first two columns indicate the combination of categorical variable values. The next two are the observed and expected values that we calculated before. The last column is the squared difference divided by the expected value for each row. The bottom line sums those values.

Our chi-squared test statistic is 6.19. Ok, great. What does that mean? Larger values indicate a more substantial divergence between our observed data and the null hypothesis. However, the number by itself is not useful because we don’t know if it’s unusually large. We need to place it into a broader context to determine whether it is an extreme value.
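For reference, here is the same arithmetic in R, continuing the sketch started above; pchisq anticipates the p-value calculation discussed in the next sections:

```r
chi_sq <- sum((observed - expected)^2 / expected)
chi_sq                                      # about 6.19
pchisq(chi_sq, df = 2, lower.tail = FALSE)  # about 0.045
# chisq.test(observed) reports the same statistic and p-value
```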

Using the Chi-Squared Distribution to Test Hypotheses

One chi-squared test produces a single chi-squared value. However, imagine performing the following process.

First, assume the null hypothesis is valid for the population. At the population level, there is no relationship between the two categorical variables. Now, we’ll repeat our study many times by drawing many random samples from this population using the same design and sample size. Next, we perform the chi-squared test of independence on all the samples and plot the distribution of the chi-squared values. This distribution is known as a sampling distribution, which is a type of probability distribution.

If we follow this procedure, we create a graph that displays the distribution of chi-squared values for a population where the null hypothesis is true. We use sampling distributions to calculate probabilities for how unlikely our sample statistic is if the null hypothesis is correct. Chi-squared tests use the chi-square distribution.

Fortunately, we don’t need to collect many random samples to create this graph! Statisticians understand the properties of chi-squared distributions so we can estimate the sampling distribution using the details of our design.

Our goal is to determine whether our sample chi-squared value is so rare that it justifies rejecting the null hypothesis for the entire population. The chi-squared distribution provides the context for making that determination. We’ll calculate the probability of obtaining a chi-squared value that is at least as high as the value that our study found (6.19).

This probability has a name—the P-value!  A low probability indicates that our sample data are unlikely when the null hypothesis is true.

Alternatively, you can use a chi-square table to determine whether our study’s chi-square test statistic exceeds the critical value .

Related posts : Sampling Distributions , Understanding Probability Distributions and Interpreting P-values

Graphing the Chi-Squared Test Results for Our Example

For chi-squared tests, the degrees of freedom define the shape of the chi-squared distribution for a design. Chi-square tests use this distribution to calculate p-values. The graph below displays several chi-square distributions with differing degrees of freedom.

[Graph: chi-square distributions for several different degrees of freedom.]

For a table with r rows and c columns, the method for calculating degrees of freedom for a chi-square test is (r-1) (c-1). For our example, we have two rows and three columns: (2-1) * (3-1) = 2 df.

Read my post about degrees of freedom to learn about this concept along with a more intuitive way of understanding degrees of freedom in chi-squared tests of independence.

Below is the chi-squared distribution for our study’s design.

[Graph: the chi-squared distribution for this design (2 df), with the region at or above the study’s statistic shaded.]

The distribution curve displays the likelihood of chi-squared values for a population where there is no relationship between uniform color and status at the population level. I shaded the region that corresponds to chi-square values greater than or equal to our study’s value (6.19). When the null hypothesis is correct, chi-square values fall in this area approximately 4.5% of the time, which is the p-value (0.045). With a significance level of 0.05, our sample data are unusual enough to reject the null hypothesis.

The sample evidence suggests that a relationship between the variables exists in the population. While this test doesn’t indicate red shirts have a higher chance of dying, there is something else going on with red shirts. Read my other post about chi-squared to learn about that!

Related Reading

When you have smaller sample sizes, you might need to use Fisher’s exact test instead of the chi-square version. To learn more, read my post, Fisher’s Exact Test: Using and Interpreting .

Learn more about How to Find the P Value .

You can also read about the chi-square goodness of fit test , which assesses the distribution of outcomes for a categorical or discrete variable.

Pearson’s chi-squared test for independence doesn’t tell you the effect size. To understand the strength of the relationship, you’d need to use something like Cramér’s V, which is a measure of association like Pearson’s correlation —except for categorical variables. That’s the topic of a future post!

Comments

November 15, 2021 at 1:56 pm

Jim – I want to start by saying that I love your site. It has helped me out greatly during many occasions. In this particular example I am interested in understanding the logic around the math for the expected values. For example, can you explain how I should interpret scaling the total number dead by the total number blue?

From there I get that we divide by the total number of people to get the number of blue deaths expected within the group of 430 people. Is this a formula that is well known for contingency tables or did you apply that strictly for this scenario?

Hopefully this question made sense?

Either way, thanks for the contributing to the community!


November 16, 2021 at 11:48 am

I’m so glad to hear that my site has been helpful!

I’m not 100% sure what you’re asking, so I’m not sure if I’m answering your question. To start, the formulas are the standard ones for the chi-squared test of independence, which you use in conjunction with contingency tables. You’d use the same methods and formulas for other datasets.

The portion you’re asking about is how to calculate the expected number for blue deaths if there is no association between uniform color and deaths (i.e., the null hypothesis of the test is true). So, the interpretation of the value is: If there is no relationship between uniform color and deaths, we’d expect 12.6 fatalities among those wearing blue uniforms. The test as a whole compares these expected values (for all table cells) to the observed values to determine whether the data support rejecting the null hypothesis and concluding that there is a relationship between the variables.


April 22, 2021 at 7:38 am

I teach AP Stat and am planning on using your example. However, in checking conditions I would like to be able to give background on the origin of the data. I went to your link and found that this data was collected for the TV episodes. Are those the episodes just for the original series?

April 23, 2021 at 11:21 pm

That’s great you’re teaching an AP Stats class! 🙂

Yes, the data I use are from the original TV series that aired from 1966-69.


July 5, 2020 at 12:34 pm

Thank you for your gracious reply. I’m especially happy because it meant that I actually understood! You’ve done a great service with this blog; I plan to return regularly! Thank you.

July 5, 2020 at 5:43 pm

I was thinking exactly that after fixing the comment. It would make a perfect comprehension test. Read this article and find the two incorrect letters! You passed! 🙂

July 4, 2020 at 9:13 am

I very much appreciate your clear explanations. I’m a “50 something” trying to finish a PhD in Library Science and my brain needs the help!

One question, please?

You write above:

Larger values represent a greater difference between your sample data and the null hypothesis. Chi-squared tests are one-tailed tests rather than the more familiar two-tailed tests. The test determines whether the entire set of differences exceeds a significance threshold. If your χ2 passes the limit, your results are statistically significant! You can reject the null hypothesis and conclude that the variables are independent.

I thought that rejecting the null hypothesis allowed you to conclude the opposite. If the null hypothesis is

Null: The variables are independent. No relationship exists.

Then rejecting the Null hypothesis means rejecting that the variables are independent, not concluding that the variables are independent.

This is, please, an honest question (not being “that guy”; I’m not smart enough!).

Again, thank you for your work!! I’m going to check to see if you cover Kendall’s W, as it’s central to a paper I’m reading!

July 4, 2020 at 3:08 pm

First, I definitely welcome all questions! And, especially in this case because you caught a typo! You’re correct about what rejecting the null hypothesis means for this test. I’ve updated the text to say “and conclude that the variables are dependent.” I double-checked elsewhere through the article, and all the other text about the conclusions based on significance is correct. Just a brain malfunction on my part! I’m grateful you caught that as that little slip changes the entire meaning!

Alas, I don’t cover Kendall’s W–at least not yet. I plan to add that down the road.



April 26, 2020 at 8:54 pm

Thank you Jim. Great post and reply. I have a question which is an extension of Michael’s question.

In general, it seems like one could build any test statistic. Find the distribution of your statistic under the null (say using bootstrap), and that will give you a p-value for your dataset.

Are chi-squared, t, or F-statistics special in some way? Or do we continue to use them simply because people have used them historically?

April 27, 2020 at 12:31 am

Originally, hypothesis tests that used these distributions were easier to calculate. You could calculate the test statistic using a simple formula and then look it up in a table. Later, it got even easier when the computer could both calculate the test statistic and tell you its p-value. It’s really the ease of calculation that made them special along with the theories behind them.

Now, we have such powerful computers that they can easily construct very large sets of bootstrap samples. That would’ve been difficult earlier. So, a large part of the answer is that bootstrapping really wasn’t feasible earlier and so the use of the chi-squared, t, and F distributions became the norm. The historically accepted standards.

It’s possible that over time bootstrap methods will gain be used more. I haven’t done extensive research into how efficient they are compared to using the various distributions, but what I have done indicates they are at least roughly on par. If you haven’t, I’d suggest reading my post about bootstrapping for more information.

Thanks for asking the great question!



January 27, 2020 at 8:49 am

Great post, thanks for writing it. I am looking forward to the Cramer’s V post!

As a person just starting to dive into statistics I am curious why we so often square the differences to make calculations. It seems squaring a difference will put too much weight on large differences. For example, in the chi-square test what if we used the absolute value of observed and expected differences? Just something I have been wondering about.

January 28, 2020 at 11:43 pm

Hi Michael,

There’s several ways of looking at your question. In some cases, if you just want to know how far observations are from the mean for a dataset, you would be justified using the mean absolute deviation rather than the standard deviation, which incorporates squared deviations but then takes the square root.

However, in other cases, the squared deviations are built into the underlying analysis. Such as in linear regression where it penalizes larger errors which helps force them to be smaller. Otherwise, the regression line would not “consider” larger errors to be much worse than smaller errors. Here’s an article about it in the regression context .

Or, if you’re working with the normal distribution and using it to calculate probabilities or whatnot, that distribution has the mean and standard deviation as parameters. And the standard deviation incorporates squared differences. You could not work with the normal distribution using mean absolute deviations (MAD).

In a similar vein for chi-squared tests, you have to realize that the chi-squared distribution is based on squared differences. So, if you wanted to do a similar analysis but with the mean absolute deviation (MAD), you’d have to devise an entirely new test statistic and sampling distribution for it! You couldn’t just use the chi-squared distribution because that is specifically for these differences that use squaring. Same thing for F-tests which use ratios of variances, and variances are of course based on squared differences. Again, to use MAD for something like ANOVA, you’d need to come up with a new test statistic and sampling distribution!

But, the general reason is that squaring does weight large differences more heavily, and that fits in with the rationale that given a distribution of values, outlier values should be weighted more because they are relatively unlikely to occur, so when they do it’s noteworthy. It makes those large differences between the expected and the observed more “odd.” And, some analyses use an underlying sampling distribution that is based on a test statistic calculated using squared differences in some fashion.



Alternatives for chi-squared test for independence for tables more than 2 x 2

What are some alternatives to the chi-squared test for categorical variables with tables larger than 2 x 2 and cells with a count less than 5, if I don't want to merge classes?

  • hypothesis-testing
  • chi-squared-test
  • independence
  • contingency-tables
  • fishers-exact-test


  • COOLSerdash (Jun 28, 2015): The Chi-Square-test can also be used with larger tables than 2x2. Could you explain why the Chi-Square-test should not be appropriate for your problem? Additionally, could you state the problem you’re hoping to solve?
  • Israel (Jun 28, 2015): I have a 2 x 3 contingency table, and cells with a count less than 5.
  • COOLSerdash (Jun 28, 2015): Thanks, please edit your question and add this information as not everyone reads the comments. A usual rule of thumb regarding the Chi-Square-test is that its results can be inaccurate if the expected cell counts are lower than 5. Usually, a Fisher-Test is recommended in these cases. Barnard's test may also be an option.

2 Answers

There are some common misunderstandings here. The chi-squared test is perfectly fine to use with tables that are larger than $2\!\times\! 2$ . In order for the actual distribution of the chi-squared test statistic to approximate the chi-squared distribution, the traditional recommendation is that all cells have expected values $\ge 5$ . Two things must be noted here:

It does not matter what the observed cell counts are—they could well be $0$ with no problem—only the expected counts matter.

This traditional rule of thumb is now known to be too conservative. It can be fine to have $\le 20\%$ of the cells with expected counts $< 5$ as long as no expected counts are $<1$ . See:

  • Campbell Ian, 2007, Chi-squared and Fisher-Irwin tests of two-by-two tables with small sample recommendations, Statistics in Medicine, 26, 3661 - 3675

If your expected counts do not match this more accurate criterion, there are some alternative options available:

Your best bet is probably to simulate the sampling distribution of the test statistic, or to use a permutation test. (Note, however, that R's chisq.test(..., simulate.p.value=TRUE) is really a simulation of Fisher's exact test—see #2—so you'd have to code the simulation manually if you didn't want that.)

You could use an alternative test, such as Fisher's exact test. Although Fisher's exact test is often recommended in this situation, it is worth noting that it makes different assumptions and may not be appropriate. Namely, Fisher's exact test assumes the row and column counts were set in advance and only the arrangement of the row x column combinations can vary (see: Given the power of computers these days, is there ever a reason to do a chi-squared test rather than Fisher's exact test? ). If you are uncomfortable with this assumption, simulating the chi-squared will be a better option.
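
As a quick sketch of these options in R (the counts below are made up solely to produce small expected values):

# A hypothetical 2 x 3 table with small expected counts
tab <- matrix(c(8, 2, 1,
                9, 3, 2), nrow = 2, byrow = TRUE)

chisq.test(tab)                                      # asymptotic test; warns that the approximation may be poor
chisq.test(tab, simulate.p.value = TRUE, B = 10000)  # simulated p-value (margins fixed, as with Fisher)
fisher.test(tab)                                     # Fisher's exact test for an r x c table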


  • In the help manual of the function chisq.test() it says 'simulation is done by random sampling from the set of all contingency tables with given marginals [...] Note that this is not the usual sampling situation assumed for the chi-squared test but rather that for Fisher's exact test.' So if we set simulate.p.value=TRUE it also assumes fixed margins (just like Fisher's exact test). – retodomax, Mar 29, 2022 at 13:18
  • @retodomax, you're right about that. I'll update the answer. – gung - Reinstate Monica, Mar 29, 2022 at 15:20
  • "Fisher's exact test assumes the row and column counts were set in advance" ... This is not really true; it conditions on the observed margins. See stats.stackexchange.com/questions/441139/… – kjetil b halvorsen ♦, Mar 29, 2022 at 15:44
  • It's interesting he says that, @kjetilbhalvorsen, but lots of people in the published statistical literature discuss Fisher's exact test this way, e.g., Gelman (p 375, pdf). – gung - Reinstate Monica, Mar 29, 2022 at 18:50
  • @gung: I have seen Gelman say that before. As a Bayesian, maybe he does not believe in frequentist conditioning. The argument is that the marginals are approximately ancillary (there is a paper by Barnard showing they are not exactly ancillary). But there cannot be much information in the marginals alone about dependence or independence. So the logic is as with other frequentist conditional tests. – kjetil b halvorsen ♦, Mar 29, 2022 at 20:26

As an addition to the answer by @gung, let us look at another way: simulating the null distribution of the $\chi^2$ test statistic unconditionally, rather than using the conditional distribution (conditional on the two margins) as the Fisher exact test does.

As an example, I will use the $2\times 3$ table from the question "Fisher's exact test in 3x2 contingency table":

We must calculate (really estimate) the marginal probabilities from this table, and then use this to estimate the probabilities of the table, under independence. Then we use this in multinomial sampling, with the same sample size.

One problem which shows up is that some (in this example, more than 10%) of the simulated tables have a zero margin. That makes for a problem in calculating chi-square, since it means we are dividing by zero. My R functions then return NaN (Not a Number). For the illustrations below, I have decided simply to remove these values, although that is certainly debatable.
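
A minimal R sketch of this simulation (with placeholder counts, since the original table is not reproduced here):

obs <- matrix(c(1, 4, 20,
                7, 9, 4), nrow = 2, byrow = TRUE)    # hypothetical 2 x 3 counts
n <- sum(obs)
p <- outer(rowSums(obs), colSums(obs)) / n^2          # estimated cell probabilities under independence

chisq_stat <- function(m) {
  e <- outer(rowSums(m), colSums(m)) / sum(m)         # expected counts; a zero margin gives e = 0
  sum((m - e)^2 / e)                                  # returns NaN when a margin is zero (0/0)
}

sims <- replicate(10000, chisq_stat(matrix(rmultinom(1, n, as.vector(p)), nrow = 2)))
sims <- sims[!is.nan(sims)]                           # drop tables with null margins (debatable!)
mean(sims >= chisq_stat(obs))                         # simulated p-value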

I get the following simulated null distribution:

[Figure: simulated null distribution of the chi-square statistic, with the density of the usual approximate chi-square distribution with two degrees of freedom overlaid in red.]

We can use this to calculate a simulated p-value, which turns out to be 0.0523, compared to the usual p-value returned by R's chisq.test, which is 0.06293. The simulation option of that function gives a p-value of 0.04927.

But the problem of what to do with simulated tables that have null margins is interesting. It shows that simulating the null distribution of the chi-squared test statistic unconditionally is not unproblematic. The results above are not really unconditional; they are conditional on there being no null margins. Using the Fisher exact test pragmatically avoids the issue!




Chi-Square Test of Homogeneity

This lesson explains how to conduct a chi-square test of homogeneity . The test is applied to a single categorical variable from two or more different populations. It is used to determine whether frequency counts are distributed identically across different populations.

For example, in a survey of TV viewing preferences, we might ask respondents to identify their favorite program. We might ask the same question of two different populations, such as males and females. We could use a chi-square test for homogeneity to determine whether male viewing preferences differed significantly from female viewing preferences. The sample problem at the end of the lesson considers this example.

When to Use Chi-Square Test for Homogeneity

The test procedure described in this lesson is appropriate when the following conditions are met:

  • For each population, the sampling method is simple random sampling .
  • The variable under study is categorical .
  • If sample data are displayed in a contingency table (Populations x Category levels), the expected frequency count for each cell of the table is at least 5.

This approach consists of four steps: (1) state the hypotheses, (2) formulate an analysis plan, (3) analyze sample data, and (4) interpret results.

State the Hypotheses

Every hypothesis test requires the analyst to state a null hypothesis and an alternative hypothesis . The hypotheses are stated in such a way that they are mutually exclusive. That is, if one is true, the other must be false; and vice versa.

Suppose that data were sampled from r populations, and assume that the categorical variable had c levels. At any specified level of the categorical variable, the null hypothesis states that each population has the same proportion of observations. Thus,

H 0 : P 1,1 = P 2,1 = . . . = P r,1
H 0 : P 1,2 = P 2,2 = . . . = P r,2
. . .
H 0 : P 1,c = P 2,c = . . . = P r,c

where P i,j is the proportion of observations from population i that fall in level j of the categorical variable.

The alternative hypothesis (H a ) is that at least one of the null hypothesis statements is false.

Formulate an Analysis Plan

The analysis plan describes how to use sample data to decide whether to reject the null hypothesis. The plan should specify the following elements.

  • Significance level. Often, researchers choose significance levels equal to 0.01, 0.05, or 0.10; but any value between 0 and 1 can be used.
  • Test method. Use the chi-square test for homogeneity to determine whether observed sample frequencies differ significantly from expected frequencies specified in the null hypothesis. The chi-square test for homogeneity is described in the next section.

Analyze Sample Data

Using sample data from the contingency tables, find the degrees of freedom, expected frequency counts, test statistic, and the P-value associated with the test statistic. The analysis described in this section is illustrated in the sample problem at the end of this lesson.

  • Degrees of freedom. DF = (r - 1) * (c - 1)

  • Expected frequency counts. E r,c = (n r * n c ) / n

  • Test statistic. Χ 2 = Σ [ (O r,c - E r,c ) 2 / E r,c ]

  • P-value. The P-value is the probability of observing a sample statistic as extreme as the test statistic. Since the test statistic is a chi-square, use the Chi-Square Distribution Calculator to assess the probability associated with the test statistic. Use the degrees of freedom computed above.

Interpret Results

If the sample findings are unlikely, given the null hypothesis, the researcher rejects the null hypothesis. Typically, this involves comparing the P-value to the significance level , and rejecting the null hypothesis when the P-value is less than the significance level.

Test Your Understanding

In a study of the television viewing habits of children, a developmental psychologist selects a random sample of 300 first graders - 100 boys and 200 girls. Each child is asked which of the following TV programs they like best: The Lone Ranger, Sesame Street, or The Simpsons. Results are shown in the contingency table below.

         Viewing Preferences                          Total
         Lone Ranger   Sesame Street   The Simpsons
Boys          50             30              20        100
Girls         50             80              70        200
Total        100            110              90        300

Do the boys' preferences for these TV programs differ significantly from the girls' preferences? Use a 0.05 level of significance.

The solution to this problem takes four steps: (1) state the hypotheses, (2) formulate an analysis plan, (3) analyze sample data, and (4) interpret results. We work through those steps below:

State the hypotheses. The first step is to state the null hypothesis and an alternative hypothesis.

  • Null hypothesis: At each level of the categorical variable, the two populations have the same proportion of observations:

H 0 : P boys, Lone Ranger = P girls, Lone Ranger
H 0 : P boys, Sesame Street = P girls, Sesame Street
H 0 : P boys, The Simpsons = P girls, The Simpsons

  • Alternative hypothesis: At least one of the null hypothesis statements is false.
  • Formulate an analysis plan . For this analysis, the significance level is 0.05. Using sample data, we will conduct a chi-square test for homogeneity .

DF = (r - 1) * (c - 1) = (2 - 1) * (3 - 1) = 2

E r,c = (n r * n c ) / n
E 1,1 = (100 * 100) / 300 = 10000/300 = 33.3
E 1,2 = (100 * 110) / 300 = 11000/300 = 36.7
E 1,3 = (100 * 90) / 300 = 9000/300 = 30.0
E 2,1 = (200 * 100) / 300 = 20000/300 = 66.7
E 2,2 = (200 * 110) / 300 = 22000/300 = 73.3
E 2,3 = (200 * 90) / 300 = 18000/300 = 60.0

Χ 2 = Σ [ (O r,c - E r,c ) 2 / E r,c ]
Χ 2 = (50 - 33.3) 2 /33.3 + (30 - 36.7) 2 /36.7 + (20 - 30) 2 /30 + (50 - 66.7) 2 /66.7 + (80 - 73.3) 2 /73.3 + (70 - 60) 2 /60
Χ 2 = 8.38 + 1.22 + 3.33 + 4.18 + 0.61 + 1.67 = 19.39

where DF is the degrees of freedom, r is the number of populations, c is the number of levels of the categorical variable, n r is the number of observations from population r , n c is the number of observations from level c of the categorical variable, n is the number of observations in the sample, E r,c is the expected frequency count in population r for level c , and O r,c is the observed frequency count in population r for level c .

The P-value is the probability that a chi-square statistic having 2 degrees of freedom is more extreme than 19.39. We use the Chi-Square Distribution Calculator to find P(Χ 2 > 19.39) = 0.00006.

  • Interpret results . Since the P-value (0.00006) is less than the significance level (0.05), we reject the null hypothesis.

Note: If you use this approach on an exam, you may also want to mention why this approach is appropriate. Specifically, the approach is appropriate because the sampling method was simple random sampling, the variable under study was categorical, and the expected frequency count was at least 5 in each population at each level of the categorical variable.
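
The same test can be verified in R with the built-in chisq.test() function (a sketch assuming R is available). Because R carries full precision rather than the one-decimal rounding used above, its statistic is 19.32 instead of 19.39; the conclusion is the same.

tv <- matrix(c(50, 30, 20,
               50, 80, 70), nrow = 2, byrow = TRUE,
             dimnames = list(c("Boys", "Girls"),
                             c("Lone Ranger", "Sesame Street", "The Simpsons")))
chisq.test(tv)   # X-squared = 19.32, df = 2, p-value = 0.00006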


Chi-Square Goodness of Fit Test | Formula, Guide & Examples

Published on May 24, 2022 by Shaun Turney . Revised on June 22, 2023.

A chi-square (Χ 2 ) goodness of fit test is a type of Pearson’s chi-square test . You can use it to test whether the observed distribution of a categorical variable differs from your expectations.

Example: A dog food company wants to know which of three new flavors dogs prefer. You recruit a random sample of 75 dogs and offer each dog a choice between the three flavors by placing bowls in front of them. You expect that the flavors will be equally popular among the dogs, with about 25 dogs choosing each flavor.

The chi-square goodness of fit test tells you how well a statistical model fits a set of observations. It’s often used to analyze genetic crosses .

Table of contents

  • What is the chi-square goodness of fit test?
  • Chi-square goodness of fit test hypotheses
  • When to use the chi-square goodness of fit test
  • How to calculate the test statistic (formula)
  • How to perform the chi-square goodness of fit test
  • When to use a different test
  • Practice questions and examples
  • Frequently asked questions about the chi-square goodness of fit test

A chi-square (Χ 2 ) goodness of fit test is a goodness of fit test for a categorical variable . Goodness of fit is a measure of how well a statistical model fits a set of observations.

  • When goodness of fit is high , the values expected based on the model are close to the observed values.
  • When goodness of fit is low , the values expected based on the model are far from the observed values.

The statistical models that are analyzed by chi-square goodness of fit tests are distributions . They can be any distribution, from as simple as equal probability for all groups, to as complex as a probability distribution with many parameters.


The chi-square goodness of fit test is a hypothesis test . It allows you to draw conclusions about the distribution of a population based on a sample. Using the chi-square goodness of fit test, you can test whether the goodness of fit is “good enough” to conclude that the population follows the distribution.

With the chi-square goodness of fit test, you can ask questions such as: Was this sample drawn from a population that has…

  • Equal proportions of male and female turtles?
  • Equal proportions of red, blue, yellow, green, and purple jelly beans?
  • 90% right-handed and 10% left-handed people?
  • Offspring with an equal probability of inheriting all possible genotypic combinations (i.e., unlinked genes)?
  • A Poisson distribution of floods per year?
  • A normal distribution of bread prices?
Observed and expected frequencies of dogs' flavor choices

Flavor              Observed   Expected
Garlic Blast           22         25
Blueberry Delight      30         25
Minty Munch            23         25

To help visualize the differences between your observed and expected frequencies, you also create a bar graph:

[Bar graph comparing the observed and expected frequencies of the dogs' flavor choices]

The president of the dog food company looks at your graph and declares that they should eliminate the Garlic Blast and Minty Munch flavors to focus on Blueberry Delight. “Not so fast!” you tell him.

You explain that your observations were a bit different from what you expected, but the differences aren’t dramatic. They could be the result of a real flavor preference or they could be due to chance.


Like all hypothesis tests, a chi-square goodness of fit test evaluates two hypotheses: the null and alternative hypotheses. They’re two competing answers to the question “Was the sample drawn from a population that follows the specified distribution?”

  • Null hypothesis ( H 0 ): The population follows the specified distribution.
  • Alternative hypothesis ( H a ):   The population does not follow the specified distribution.

These are general hypotheses that apply to all chi-square goodness of fit tests. You should make your hypotheses more specific by describing the “specified distribution.” You can name the probability distribution (e.g., Poisson distribution) or give the expected proportions of each group.

  • Null hypothesis ( H 0 ): The dog population chooses the three flavors in equal proportions ( p 1 = p 2 = p 3 ).
  • Alternative hypothesis ( H a ): The dog population does not choose the three flavors in equal proportions.

The following conditions are necessary if you want to perform a chi-square goodness of fit test:

  • You want to test a hypothesis about the distribution of one categorical variable . If your variable is continuous , you can convert it to a categorical variable by separating the observations into intervals. This process is known as data binning.
  • The sample was randomly selected from the population .
  • There are a minimum of five observations expected in each group.
  • You want to test a hypothesis about the distribution of one categorical variable. The categorical variable is the dog food flavors.
  • You recruited a random sample of 75 dogs.
  • There were a minimum of five observations expected in each group. For all three dog food flavors, you expected 25 observations of dogs choosing the flavor.

The test statistic for the chi-square (Χ 2 ) goodness of fit test is Pearson’s chi-square:

Χ 2 = Σ (O − E) 2 / E

where Χ 2 is the chi-square test statistic, Σ is the summation operator (it means "take the sum of"), O is the observed frequency, and E is the expected frequency.

The larger the difference between the observations and the expectations ( O − E in the equation), the bigger the chi-square will be.

To use the formula, follow these five steps:

Step 1: Create a table

Create a table with the observed and expected frequencies in two columns.

Flavor              Observed   Expected
Garlic Blast           22         25
Blueberry Delight      30         25
Minty Munch            23         25

Step 2: Calculate O − E

Add a new column called "O − E". Subtract the expected frequencies from the observed frequencies.

Flavor              Observed   Expected   O − E
Garlic Blast           22         25      22 − 25 = −3
Blueberry Delight      30         25       5
Minty Munch            23         25      −2

Step 3: Calculate ( O − E ) 2

Add a new column called “( O −  E ) 2 ”. Square the values in the previous column.

Flavor              Observed   Expected   O − E   (O − E) 2
Garlic Blast           22         25       −3     (−3) 2 = 9
Blueberry Delight      30         25        5        25
Minty Munch            23         25       −2         4

Step 4: Calculate ( O − E ) 2 / E

Add a final column called “( O − E )² /  E “. Divide the previous column by the expected frequencies.

Flavor              Observed   Expected   O − E   (O − E) 2   (O − E) 2 / E
Garlic Blast           22         25       −3         9       9/25 = 0.36
Blueberry Delight      30         25        5        25           1
Minty Munch            23         25       −2         4         0.16

Step 5: Calculate Χ 2

Add up the values of the previous column. This is the chi-square test statistic (Χ 2 ).

Flavor              Observed   Expected   O − E   (O − E) 2   (O − E) 2 / E
Garlic Blast           22         25       −3         9          0.36
Blueberry Delight      30         25        5        25           1
Minty Munch            23         25       −2         4          0.16

Χ 2 = 0.36 + 1 + 0.16 = 1.52
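
The whole calculation can be reproduced in a few lines of R (a minimal sketch using the observed and expected counts from the table above):

observed <- c(22, 30, 23)
expected <- c(25, 25, 25)
sum((observed - expected)^2 / expected)   # 1.52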


The chi-square statistic is a measure of goodness of fit, but on its own it doesn’t tell you much. For example, is Χ 2 = 1.52 a low or high goodness of fit?

To interpret the chi-square goodness of fit, you need to compare it to something. That’s what a chi-square test is: comparing the chi-square value to the appropriate chi-square distribution to decide whether to reject the null hypothesis .

To perform a chi-square goodness of fit test, follow these five steps (the first two steps have already been completed for the dog food example):

Step 1: Calculate the expected frequencies

Sometimes, calculating the expected frequencies is the most difficult step. Think carefully about which expected values are most appropriate for your null hypothesis .

In general, you’ll need to multiply each group’s expected proportion by the total number of observations to get the expected frequencies.

Step 2: Calculate chi-square

Calculate the chi-square value from your observed and expected frequencies using the chi-square formula.

\begin{equation*}X^2 = \sum{\dfrac{(O-E)^2}{E}}\end{equation*}

Step 3: Find the critical chi-square value

Find the critical chi-square value in a chi-square critical value table or using statistical software. The critical value is calculated from a chi-square distribution. To find the critical chi-square value, you’ll need to know two things:

  • The degrees of freedom ( df ): For chi-square goodness of fit tests, the df is the number of groups minus one.
  • Significance level (α): By convention, the significance level is usually .05.

Step 4: Compare the chi-square value to the critical value

Compare the chi-square value to the critical value to determine which is larger.

Critical value = 5.99

Step 5: Decide whether to reject the null hypothesis

  • If the Χ 2 value is greater than the critical value, the data allow you to reject the null hypothesis and provide support for the alternative hypothesis.
  • If the Χ 2 value is less than the critical value, the data don't allow you to reject the null hypothesis and don't provide support for the alternative hypothesis.

Whether you use the chi-square goodness of fit test or a related test depends on what hypothesis you want to test and what type of variable you have.

When to use the chi-square test of independence

There’s another type of chi-square test, called the chi-square test of independence .

  • Use the chi-square goodness of fit test when you have one categorical variable and you want to test a hypothesis about its distribution .
  • Use the chi-square test of independence when you have two categorical variables and you want to test a hypothesis about their relationship .

When to use a different goodness of fit test

The Anderson–Darling and Kolmogorov–Smirnov goodness of fit tests are two other common goodness of fit tests for distributions.

  • Use the Anderson–Darling or the Kolmogorov–Smirnov goodness of fit test when you have a continuous variable (that you don’t want to bin).
  • Use the chi-square goodness of fit test when you have a categorical variable (or a continuous variable that you want to bin).

Do you want to test your knowledge about the chi-square goodness of fit test? Download our practice questions and examples with the buttons below.



You can use the CHISQ.TEST() function to perform a chi-square goodness of fit test in Excel. It takes two arguments, CHISQ.TEST(observed_range, expected_range), and returns the p value .

You can use the chisq.test() function to perform a chi-square goodness of fit test in R. Give the observed values in the “x” argument, give the expected values in the “p” argument, and set “rescale.p” to true. For example:

chisq.test(x = c(22,30,23), p = c(25,25,25), rescale.p = TRUE)

Chi-square goodness of fit tests are often used in genetics. One common application is to check if two genes are linked (i.e., if the assortment is independent). When genes are linked, the allele inherited for one gene affects the allele inherited for another gene.

Suppose that you want to know if the genes for pea texture (R = round, r = wrinkled) and color (Y = yellow, y = green) are linked. You perform a dihybrid cross between two heterozygous ( RY / ry ) pea plants. The hypotheses you’re testing with your experiment are:

  • Null hypothesis (H 0 ): The offspring have an equal probability of inheriting all possible genotypic combinations. This would suggest that the genes are unlinked.
  • Alternative hypothesis (H a ): The offspring do not have an equal probability of inheriting all possible genotypic combinations. This would suggest that the genes are linked.

You observe 100 peas:

  • 78 round and yellow peas
  • 6 round and green peas
  • 4 wrinkled and yellow peas
  • 12 wrinkled and green peas

To calculate the expected values, you can make a Punnett square. If the two genes are unlinked, the probability of each genotypic combination is equal.

RRYY RrYy RRYy RrYY
RrYy rryy Rryy rrYy
RRYy Rryy RRyy RrYy
RrYY rrYy RrYy rrYY

The expected phenotypic ratios are therefore 9 round and yellow: 3 round and green: 3 wrinkled and yellow: 1 wrinkled and green.

From this, you can calculate the expected phenotypic frequencies for 100 peas:

Phenotype             Observed   Expected
Round and yellow         78      100 * (9/16) = 56.25
Round and green           6      100 * (3/16) = 18.75
Wrinkled and yellow       4      100 * (3/16) = 18.75
Wrinkled and green       12      100 * (1/16) = 6.25

Phenotype             O     E       O − E    (O − E) 2   (O − E) 2 / E
Round and yellow      78   56.25    21.75     473.06        8.41
Round and green        6   18.75   −12.75     162.56        8.67
Wrinkled and yellow    4   18.75   −14.75     217.56       11.60
Wrinkled and green    12    6.25     5.75      33.06        5.29

Χ 2 = 8.41 + 8.67 + 11.60 + 5.29 = 33.97

Since there are four groups (round and yellow, round and green, wrinkled and yellow, wrinkled and green), there are three degrees of freedom .

For a test of significance at α = .05 and df = 3, the Χ 2 critical value is 7.82.

Χ 2 = 33.97

Critical value = 7.82

The Χ 2 value is greater than the critical value .

The Χ 2 value is greater than the critical value, so we reject the null hypothesis that the offspring have an equal probability of inheriting all possible genotypic combinations. There is a significant difference between the observed and expected genotypic frequencies ( p < .05).

The data supports the alternative hypothesis that the offspring do not have an equal probability of inheriting all possible genotypic combinations, which suggests that the genes are linked.
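
For reference, the same goodness of fit test can be run in one line of R, matching the hand calculation above:

chisq.test(x = c(78, 6, 4, 12), p = c(9, 3, 3, 1) / 16)   # X-squared = 33.97, df = 3, p-value = 2e-07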

The two main chi-square tests are the chi-square goodness of fit test and the chi-square test of independence .

A chi-square distribution is a continuous probability distribution . The shape of a chi-square distribution depends on its degrees of freedom , k . The mean of a chi-square distribution is equal to its degrees of freedom ( k ) and the variance is 2 k . The range is 0 to ∞.



Chi-Square Test


The world is constantly curious about the Chi-Square test's application in machine learning and how it makes a difference. Feature selection is a critical topic in machine learning: you will have multiple features in line and must choose the best ones to build the model. By examining the relationships between the features, the chi-square test aids in solving feature selection problems. This tutorial will teach you about the chi-square test types, how to perform these tests, their properties, their applications, and more. Let's start!

What Is a Chi-Square Test?

The Chi-Square test is a statistical procedure for determining the difference between observed and expected data. It can also be used to decide whether there is a correlation between the categorical variables in our data. It helps to determine whether a difference between two categorical variables is due to chance or to a relationship between them.

A chi-square test or comparable nonparametric test is required to test a hypothesis regarding the distribution of a categorical variable. Categorical variables, which indicate categories such as animals or countries, can be nominal or ordinal. They cannot have a normal distribution since they only have a few particular values.

Chi-Square Test Formula

Χ 2 c = Σ (O − E) 2 / E

c = Degrees of freedom

O = Observed Value

E = Expected Value

The degrees of freedom in a statistical calculation represent the number of values that are free to vary. The degrees of freedom can be calculated to ensure that chi-square tests are statistically valid. These tests are frequently used to compare observed data with the data that would be expected if a particular hypothesis were true.

The observed values are those you gather yourself.

The expected values are the anticipated frequencies, based on the null hypothesis. 

Fundamentals of Hypothesis Testing

Hypothesis testing is a technique for interpreting and drawing inferences about a population based on sample data. It aids in determining which sample data best support mutually exclusive population claims.

Null Hypothesis (H0) - The Null Hypothesis is the assumption that the event will not occur. A null hypothesis has no bearing on the study's outcome unless it is rejected.

H0 is the symbol for it, and it is pronounced H-naught.

Alternate Hypothesis (H1 or Ha) - The Alternate Hypothesis is the logical opposite of the null hypothesis. The acceptance of the alternative hypothesis follows the rejection of the null hypothesis. H1 is the symbol for it.


Types of Chi-Square Tests

There are two main types of Chi-Square tests:

  • Independence
  • Goodness-of-Fit

The Chi-Square Test of Independence is a derivable (also known as inferential) statistical test which examines whether two sets of variables are likely to be related to each other or not. This test is used when we have counts of values for two nominal or categorical variables, and it is considered a non-parametric test. A relatively large sample size and independence of observations are the required criteria for conducting this test.

In a movie theatre, suppose we made a list of movie genres. Let us consider this as the first variable. The second variable is whether or not the people who came to watch those genres of movies have bought snacks at the theatre. Here the null hypothesis is that the genre of the film and whether people bought snacks or not are unrelated. If this is true, the movie genres don't impact snack sales.

Goodness-Of-Fit

In statistical hypothesis testing, the Chi-Square Goodness-of-Fit test determines whether a variable is likely to come from a given distribution or not. We must have a set of data values and an idea of the distribution of this data. We can use this test when we have value counts for categorical variables. This test demonstrates a way of deciding if the data values have a "good enough" fit to our idea, or whether the data are a representative sample of the entire population.

Suppose we have bags of balls with five different colours in each bag. The given condition is that each bag should contain an equal number of balls of each colour. The idea we would like to test here is that each bag contains the five colours in equal proportions.
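
A quick R sketch of this test (the counts are hypothetical; only the equal-proportions null hypothesis comes from the example):

balls <- c(24, 22, 17, 19, 18)        # hypothetical counts of the five colours in one bag of 100 balls
chisq.test(balls, p = rep(1/5, 5))    # tests H0: all five colours are equally likely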

Chi-Square Test Examples

1. Chi-Square Test for Independence

Example: A researcher wants to determine if there is an association between gender (male/female) and preference for a new product (like/dislike). The test can assess whether preferences are independent of gender.

2. Chi-Square Test for Goodness of Fit

Example: A dice manufacturer wants to test if a six-sided die is fair. They roll the die 60 times and expect each face to appear 10 times. The test checks if the observed frequencies match the expected frequencies.

3. Chi-Square Test for Homogeneity

Example: A fast-food chain wants to see if the preference for a particular menu item is consistent across different cities. The test can compare the distribution of preferences in multiple cities to see if they are homogeneous.

4. Chi-Square Test for a Contingency Table

Example: A study investigates whether smoking status (smoker/non-smoker) is related to the presence of lung disease (yes/no). The test can evaluate the relationship between smoking and lung disease in the sample.

5. Chi-Square Test for Population Proportions

Example: A political analyst wants to see if voter preference (candidate A vs. candidate B) is the same across different age groups. The test can determine if the proportions of preferences differ significantly between age groups.

How to Perform a Chi-Square Test?

Let's say you want to know if gender has anything to do with political party preference. You poll 440 voters in a simple random sample to find out which political party they prefer. The results of the survey are shown in the table below:

[Table: survey results showing political party preference by gender]

To see if gender is linked to political party preference, perform a Chi-Square test of independence using the steps below.

Step 1: Define the Hypothesis

H0: There is no link between gender and political party preference.

H1: There is a link between gender and political party preference.

Step 2: Calculate the Expected Values

Now you will calculate the expected frequency.

Expected Value E = (Row Total * Column Total) / Grand Total

For example, the expected value for Male Republicans is: 

[Worked calculation: expected value for Male Republicans]

Similarly, you can calculate the expected value for each of the cells.

[Table: expected values for each cell]

Step 3: Calculate (O-E)2 / E for Each Cell in the Table

Now you will calculate the (O - E)2 / E for each cell in the table.

[Table: (O - E) 2 / E values for each cell]

Step 4: Calculate the Test Statistic X2

X2 is the sum of all the values in the last table:

X2 = 0.743 + 2.05 + 2.33 + 3.33 + 0.384 + 1 = 9.84

Before you can conclude, you must first determine the critical statistic, which requires determining our degrees of freedom. The degrees of freedom in this case are equal to the table's number of columns minus one multiplied by the table's number of rows minus one, or (r-1) (c-1). We have (3-1)(2-1) = 2.

Finally, you compare the obtained statistic to the critical statistic found in the chi-square table. For an alpha level of 0.05 and two degrees of freedom, the critical statistic is 5.991, which is less than our obtained statistic of 9.84. Because the obtained statistic exceeds the critical statistic, you can reject the null hypothesis.

This means you have sufficient evidence to say that there is an association between gender and political party preference.
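
In practice you would hand the full table to R's chisq.test() and let it do steps 2 through 4 in one call. The counts in this sketch are hypothetical placeholders (not the survey's actual numbers, which are in the table image above), so its output will not match the 9.84 computed by hand:

votes <- matrix(c(100, 70, 30,          # hypothetical counts, NOT the survey's actual data
                  140, 60, 40),
                nrow = 2, byrow = TRUE,
                dimnames = list(c("Male", "Female"),
                                c("Republican", "Democrat", "Independent")))
sum(votes)         # 440 voters in total
chisq.test(votes)  # reports X-squared, df = 2, and the p-value in one call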


What Are Categorical Variables?

Categorical variables belong to a subset of variables that can be divided into discrete categories. Names or labels are the most common categories. These variables are also known as qualitative variables because they depict the variable's quality or characteristics.

Categorical variables can be divided into two categories:

1. Nominal Variable: A nominal variable's categories have no natural ordering. Example: Gender, Blood groups

2. Ordinal Variable: A variable that allows the categories to be sorted is an ordinal variable. An example is customer satisfaction (Excellent, Very Good, Good, Average, Bad, and so on).

Chi-Square Practice Problems

1. Voting Patterns

A researcher wants to know if voting preferences (party A, party B, or party C) and gender (male, female) are related. Apply a chi-square test to the following set of data:

  • Male: Party A - 30, Party B - 20, Party C - 50
  • Female: Party A - 40, Party B - 30, Party C - 30

To determine if gender influences voting preferences, run a chi-square test of independence.

2. State of Health

In a sample population, a medical study examines the association between smoking status (smoker, non-smoker) and the occurrence of lung disease (yes, no). The information is as follows:

  • Smoker: Yes - 90, No - 60
  • Non-smoker: Yes - 30, No - 120 

To find out if smoking status is related to the incidence of lung disease, do a chi-square test.

3. Consumer Preferences

Customers are surveyed by a company to determine whether their age group (under 20, 20-40, over 40) and their preferred product category (food, apparel, or electronics) are related. The information gathered is:

  • Under 20: Electronic - 50, Clothing - 30, Food - 20
  • 20-40: Electronic - 60, Clothing - 70, Food - 50
  • Over 40: Electronic - 30, Clothing - 40, Food - 80

Use a chi-square test to investigate the connection between product preference and age group.

4. Academic Performance

An educational researcher looks at the relationship between students' success on standardized tests (pass, fail) and whether or not they participate in after-school programs. The information is as follows:

  • Yes: Pass - 80, Fail - 20
  • No: Pass - 50, Fail - 50

Use a chi-square test to determine if involvement in after-school programs and test scores are connected.

5. Genetic Inheritance

A geneticist investigates how a particular trait is inherited in plants and seeks to ascertain whether the expression of a trait (trait present, trait absent) and the existence of a genetic marker (marker present, marker absent) are significantly correlated. The information gathered is:

  • Marker Present: Trait Present - 70, Trait Absent - 30
  • Marker Absent: Trait Present - 40, Trait Absent - 60

Do a chi-square test to determine if there is a correlation between the trait's expression and the genetic marker.

How to Solve Chi-Square Problems?

1. State the Hypotheses

  • Null hypothesis (H0): There is no association between the variables
  • Alternative hypothesis (H1): There is an association between the variables.

2. Calculate the Expected Frequencies

  • Use the formula: E = (Row Total * Column Total) / Grand Total

3. Compute the Chi-Square Statistic

  • Use the formula: χ2 = Σ (O − E)2 / E, where O is the observed frequency and E is the expected frequency.

4. Determine the Degrees of Freedom (df)

  • Use the formula: df = (number of rows − 1) * (number of columns − 1)

5. Find the Critical Value and Compare

  • Use the chi-square distribution table to find the critical value for the given df and significance level (usually 0.05).
  • Compare the chi-square statistic to the critical value to decide whether to reject the null hypothesis.

These practice problems help you understand how chi-square analysis tests hypotheses and explores relationships between categorical variables in various fields.
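
The five steps translate directly into a short R function (a sketch; solve_chisq is a name invented here, and the example call uses the voting-patterns data from practice problem 1):

# Steps 2-5 for a generic r x c table of observed counts
solve_chisq <- function(observed, alpha = 0.05) {
  expected <- outer(rowSums(observed), colSums(observed)) / sum(observed)  # step 2
  stat     <- sum((observed - expected)^2 / expected)                      # step 3
  df       <- (nrow(observed) - 1) * (ncol(observed) - 1)                  # step 4
  critical <- qchisq(1 - alpha, df)                                        # step 5
  list(statistic = stat, df = df, critical = critical,
       reject_H0 = stat > critical)
}

# Practice problem 1: gender vs voting preference
solve_chisq(matrix(c(30, 20, 50,
                     40, 30, 30), nrow = 2, byrow = TRUE))
# statistic = 8.43, df = 2, critical = 5.99 -> reject H0 at the 0.05 level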


When to Use a Chi-Square Test?

A Chi-Square Test is used to examine whether the observed results are in line with the expected values. When the data to be analysed come from a random sample, and the variable in question is categorical, the Chi-Square test is the most appropriate choice. A categorical variable consists of selections such as breeds of dogs, types of cars, genres of movies, educational attainment, male vs. female, etc. Survey responses and questionnaires are the primary sources of these types of data, and the Chi-square test is most commonly used for analysing them. This type of analysis is helpful for researchers studying survey response data; the research can range from customer and marketing research to political science and economics.

Chi-Square Distribution 

Chi-square distributions (X2) are a type of continuous probability distribution. They're commonly utilized in hypothesis testing, such as the chi-square goodness of fit and independence tests. The parameter k, which represents the degrees of freedom, determines the shape of a chi-square distribution.

Very few real-world observations follow a chi-square distribution. Chi-square distributions aim to test hypotheses, not to describe real-world distributions. In contrast, other commonly used distributions, such as normal and Poisson distributions, may explain important things like birth weights or illness cases per year.

Chi-square distributions are useful for hypothesis testing because of their close relationship to the standard normal distribution, on which many essential statistical tests rely.

In statistical analysis, the Chi-Square distribution is used in many hypothesis tests and is determined by the parameter k, the degrees of freedom. It belongs to the family of continuous probability distributions. The sum of the squares of k independent standard normal random variables follows a Chi-Squared distribution with k degrees of freedom. Pearson's Chi-Square Test formula is -

Χ 2 = Σ (O − E) 2 / E

Where X^2 is the Chi-Square test symbol

Σ is the summation of observations

O is the observed results

E is the expected results 

The shape of the distribution graph changes with the increase in the value of k, i.e., the degree of freedom. 

When k is 1 or 2, the Chi-square distribution curve is shaped like a backwards ‘J’. It means there is a high chance that X^2 becomes close to zero. 


When k is greater than 2, the shape of the distribution curve looks like a hump, with a low probability that X^2 is very near 0 or very far from 0. The distribution stretches much further on the right-hand side than on the left; the most probable (modal) value of X^2 is (k − 2).

When k is greater than about ninety, the Chi-square distribution is closely approximated by a normal distribution.

What is the P-Value in a Chi-Square Test?

The P-Value in a Chi-Square test is a statistical measure that helps to assess the importance of your test results.

Here P denotes the probability; hence for the calculation of p-values, the Chi-Square test comes into the picture. The different p-values indicate different types of hypothesis interpretations. 

  • P <= 0.05 (the null hypothesis is rejected)
  • P > 0.05 (the null hypothesis is not rejected)

The concepts of probability and statistics are entangled with the Chi-Square Test. Probability is the estimation of something that is most likely to happen; simply put, it is the possibility of an event or outcome in the sample. Probability can understandably represent bulky or complicated data. And statistics involves collecting, organising, analysing, interpreting and presenting data.

Finding P-Value

When you run all of the Chi-square tests, you'll get a test statistic called X2. You have two options for determining whether this test statistic is statistically significant at some alpha level:

  • Compare the test statistic X2 to a critical value from the Chi-square distribution table.
  • Compare the p-value of the test statistic X2 to a chosen alpha level.

Test statistics are calculated by taking into account the sampling distribution of the test statistic under the null hypothesis, the sample data, and the approach which is chosen for performing the test. 

The p-value will be as mentioned in the following cases.

  • An upper-tailed test: p-value = P(TS >= ts | H0 is true) = 1 − cdf(ts)
  • A lower-tailed test: p-value = P(TS <= ts | H0 is true) = cdf(ts)
  • A two-sided test (assuming the distribution of TS under H0 is symmetric about 0): p-value = 2 * P(TS >= |ts| | H0 is true) = 2 * (1 − cdf(|ts|))

where P denotes probability, TS is the test statistic, ts is the observed value of the test statistic computed from your sample, and cdf() is the cumulative distribution function of the distribution of TS under H0.
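
Chi-square tests are upper-tailed, so in R the p-value for the worked example above (X2 = 9.84, df = 2) is the upper-tail probability:

pchisq(9.84, df = 2, lower.tail = FALSE)   # p-value = 0.0073, well below 0.05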

Tools and Software for Chi-Square Analysis

Here are some commonly used tools and software for performing Chi-Square analysis:

1. SPSS (Statistical Package for the Social Sciences) is a widely used software for statistical analysis, including Chi-Square tests. It provides an easy-to-use interface for performing Chi-Square tests for independence, goodness of fit, and other statistical analyses.

2. R is a powerful open-source programming language and software environment for statistical computing. The chisq.test() function in R allows for easy conducting of Chi-Square tests.

3. The SAS suite is used for advanced analytics, including Chi-Square tests. It is often used in research and business environments for complex data analysis.

4. Microsoft Excel offers a Chi-Square test function (CHISQ.TEST) for users who prefer working within spreadsheets. It’s a good option for basic Chi-Square analysis with smaller datasets.

5. Python (with libraries like SciPy or Pandas) offers robust tools for statistical analysis. The scipy.stats.chisquare() function can be used to perform Chi-Square tests.

Properties of Chi-Square Test 

  • The variance is double the number of degrees of freedom.
  • The mean is equal to the number of degrees of freedom.
  • As the degrees of freedom increase, the Chi-Square distribution curve approaches a normal distribution.

Limitations of Chi-Square Test

There are two limitations to using the chi-square test that you should be aware of. 

  • The chi-square test, for starters, is extremely sensitive to sample size. Even insignificant relationships can appear statistically significant when a large enough sample is used. Keep in mind that "statistically significant" does not always imply "meaningful" when using the chi-square test.
  • Be mindful that the chi-square can only determine whether two variables are related. It does not necessarily follow that one variable has a causal relationship with the other. It would require a more detailed analysis to establish causality.


Advanced Chi-Square Test Techniques

1. Chi-Square Test with Yates' Correction (Continuity Correction)

This technique is used in 2x2 contingency tables to reduce the Chi-Square value and correct for the overestimation of statistical significance when sample sizes are small. The correction is achieved by subtracting 0.5 from the absolute difference between each observed and expected frequency.
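
In R, chisq.test() applies Yates' correction to 2 x 2 tables by default; a sketch with hypothetical counts shows the corrected and uncorrected statistics side by side:

tab22 <- matrix(c(12, 5,
                   7, 15), nrow = 2, byrow = TRUE)   # hypothetical 2 x 2 counts

chisq.test(tab22)                  # with Yates' continuity correction (R's default for 2 x 2)
chisq.test(tab22, correct = FALSE) # without the correction; gives a larger statistic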

2. Mantel-Haenszel Chi-Square Test

This technique is used to assess the association between two variables while controlling for one or more confounding variables. It’s particularly useful in stratified analyses where the goal is to examine the relationship between variables across different strata (e.g., age groups, geographic locations).

3. Chi-Square Test for Trend (Cochran-Armitage Test)

This test is used when the categorical variable is ordinal, and you want to assess whether there is a linear trend in the proportions across the ordered groups. It’s commonly used in epidemiology to analyze trends in disease rates over time or across different exposure levels.

4. Monte Carlo Simulation for Chi-Square Test

When the sample size is very small or when expected frequencies are too low, the Chi-Square distribution may not provide accurate p-values. In such cases, Monte Carlo simulation can be used to generate an empirical distribution of the test statistic, providing a more accurate significance level.
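
R exposes this through the simulate.p.value argument of chisq.test(); the counts below are hypothetical and chosen so that the expected frequencies are well below 5 (note that R simulates with the observed margins held fixed):

small <- matrix(c(3, 1, 2,
                  4, 2, 1), nrow = 2, byrow = TRUE)    # hypothetical sparse table
chisq.test(small, simulate.p.value = TRUE, B = 10000)  # empirical p-value from 10,000 simulated tables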

5. Bayesian Chi-Square Test

In Bayesian statistics, the Chi-Square test can be adapted to incorporate prior knowledge or beliefs about the data. This approach is useful when existing information should influence the analysis, leading to potentially more accurate conclusions.

In this tutorial, you explored the concept of the Chi-square distribution and how to find the related values. You also took a look at how the critical value and the chi-square value are related to each other.

If you want to gain more insight, get a work-ready understanding of statistical concepts, and learn how to use them to get into a career in Data Analytics, our Post Graduate Program in Data Analytics in partnership with Purdue University should be your next stop. A comprehensive program with training from top practitioners and in collaboration with IBM will be all you need to kickstart your career in the field. Get started today!

1. What is the chi-square test used for? 

The chi-square test is a statistical method used to determine if there is a significant association between two categorical variables. It helps researchers understand whether the observed distribution of data differs from the expected distribution, allowing them to assess whether any relationship exists between the variables being studied.

2. What is the chi-square test and its types? 

The chi-square test is a statistical test used to analyze categorical data and assess the independence or association between variables. There are two main types of chi-square tests:

a) Chi-square test of independence: This test determines whether there is a significant association between two categorical variables.

b) Chi-square goodness-of-fit test: This test compares the observed data to the expected data to assess how well the observed data fit the expected distribution.

3. What is the difference between t-test and chi-square? 

The t-test and the chi-square test are two different statistical tests used for various data types. The t-test compares the means of two groups and is suitable for continuous numerical data. On the other hand, the chi-square test examines the association between two categorical variables and is applicable to discrete categorical data.

4. What alternatives exist to the Chi-Square Test?

Alternatives include Fisher's Exact Test for small sample sizes, the G-test for large datasets, and logistic regression for modelling categorical outcomes.

5. What is the null hypothesis for Chi-Square?

The null hypothesis states no association between the categorical variables, meaning their distributions are independent.

6. How do I handle small sample sizes in a Chi-Square Test?

Use Fisher's Exact Test or apply Yates' continuity correction in 2x2 tables for small sample sizes to reduce the risk of inaccurate results.

7. What is the appropriate way to analyze Chi-Square Test results?

Compare the calculated Chi-Square statistic with the critical value from the Chi-Square distribution table; if the statistic is larger, reject the null hypothesis.

8. What is the advantage of the Chi-Square Test?

The Chi-Square test is simple to calculate and applies to categorical data, making it versatile for analyzing relationships in contingency tables.


About the Author

Avijeet Biswal

Avijeet is a Senior Research Analyst at Simplilearn. Passionate about Data Analytics, Machine Learning, and Deep Learning, Avijeet is also interested in politics, cricket, and football.


COMMENTS

  1. Chi-Square (Χ²) Tests

    Alternative hypothesis (H A): The bird species visit the bird feeder in different proportions from the average over the past five years. Chi-square test of independence. You can use a chi-square test of independence when you have two categorical variables. It allows you to test whether the two variables are related to each other.

  2. Chi-Square Test of Independence: Definition, Formula, and Example

    A Chi-Square test of independence uses the following null and alternative hypotheses: H0: (null hypothesis) The two variables are independent. H1: (alternative hypothesis) The two variables are not independent. (i.e. they are associated) We use the following formula to calculate the Chi-Square test statistic X2: X2 = Σ (O-E)2 / E.

  3. Chi-Square Test of Independence

    A chi-square (Χ 2) test of independence is a nonparametric hypothesis test. You can use it to test whether two categorical variables are related to each other. Example: Chi-square test of independence. Imagine a city wants to encourage more of its residents to recycle their household waste.

  4. Chi-Square Test of Independence and an Example

    The Chi-square test of independence determines whether there is a statistically significant relationship between categorical variables. It is a hypothesis test that answers the question—do the values of one categorical variable depend on the value of other categorical variables? This test is also known as the chi-square test of association.

  5. Hypothesis Testing

    The alternative or research hypothesis is that there is a difference in the distribution of responses to the outcome variable among the comparison groups ... The chi-square test of independence can also be used with a dichotomous outcome and the results are mathematically equivalent. In the prior module, we considered the following example. ...

  6. 8.1

    To conduct this test we compute a Chi-Square test statistic where we compare each cell's observed count to its respective expected count. In a summary table, we have r × c = r c cells. Let O 1, O 2, …, O r c denote the observed counts for each cell and E 1, E 2, …, E r c denote the respective expected counts for each cell.

  7. What Is Chi Square Test & How To Calculate Formula Equation

    The Chi-square test is a non-parametric statistical test used to determine if there's a significant association between two or more categorical variables in a sample. It works by comparing the observed frequencies in each category of a cross-tabulation with the frequencies expected under the null hypothesis, which assumes there is no ...

  8. 11.3

    The chi-square (χ²) test of independence is used to test for a relationship between two categorical variables. Recall that if two categorical variables are independent, then P(A) = P(A | B). ... Alternative hypothesis: seat location and cheating are related in the population. To perform a chi-square test of independence in ...

  9. How to Conduct a Chi-Square Test

    The chi-square test is a statistical method designed to examine the association between two or more categorical variables. ... the null hypothesis (H0) and the alternative hypothesis (H1). Null hypothesis (H0): this hypothesis states that there is no relationship between the two categorical variables. In other words ...

  10. Chi-Square Test: A Comprehensive Guide

    The Chi-Square Test is a statistical method used to determine if there's a significant association between two categorical variables in a sample data set. It checks the independence of these variables, making it a robust and flexible tool for data analysis. ... In contrast, the alternative hypothesis (H1) proposes that these variables are ...

  11. S.4 Chi-Square Tests

    The two categorical variables are dependent. Chi-square test statistic: χ² = Σ (O − E)² / E, where O represents the observed frequency and E is the expected frequency under the null hypothesis, computed as E = (row total × column total) / sample size. We will compare the value of the test statistic to the critical value of χ²α with the ... (The independence sketch after this list applies this expected-count rule.)

  12. The Chi-Square Test

    The basic idea behind the test is to compare the observed values in your data to the expected values that you would see if the null hypothesis is true. There are two commonly used Chi-square tests: the Chi-square goodness of fit test and the Chi-square test of independence. Both tests involve variables that divide your data into categories.

  13. The Chi-Square Test (PDF)

    The alternative hypothesis for a chi-square test is always two-sided. (It is technically multi-sided because the differences may occur in both directions in each cell of the table.) Alternative hypothesis (Ha): there is a significant association between students' educational level and

  14. Chi-Square Test of Independence

    Alternative hypothesis: assumes that there is an association between the two variables. Hypothesis testing for the chi-square test of independence proceeds as it does for other tests like ANOVA: a test statistic is computed and compared to a critical value. The critical value for the chi-square statistic is determined by the ...

  15. 11.1: Chi-Square Tests for Independence

    Step 2. The distribution is chi-square. Step 3. To compute the value of the test statistic we must first compute the expected number for each of the six core cells (the ones whose entries are boldface): 1st row and 1st column; 1st row and 2nd column; 1st row and 3rd column; 2nd row and 1st column; ...

  16. Chi-squared test

    The formula for the chi-squared test is χ² = Σ (Oi − Ei)² / Ei, where χ² represents the chi-squared value, Oi represents the observed value, Ei represents the expected value (that is, the value expected under the null hypothesis), and the symbol Σ represents the summation of values over all i. One then looks up in a table the chi-squared ...

  17. How the Chi-Squared Test of Independence Works

    I shaded the region that corresponds to chi-square values greater than or equal to our study's value (6.17). When the null hypothesis is correct, chi-square values fall in this area approximately 4.6% of the time, which is the p-value (0.046). With a significance level of 0.05, our sample data are unusual enough to reject the null hypothesis. (A one-line sketch reproducing this p-value appears after this list.)

  18. Using the Chi-Square Statistic to Test Hypotheses About Categorical Data (video)

    Watch a video that explains how to use the chi-square statistic to test hypotheses about categorical data with an example.

  19. Hypothesis Testing

    There are some common misunderstandings here. The chi-squared test is perfectly fine to use with tables that are larger than 2×2. In order for the actual distribution of the chi-squared test statistic to approximate the chi-squared distribution, the traditional recommendation is that all cells have expected values ≥ 5. Two things must be noted here:

  20. Chi-Square Test vs. t-Test: What's the Difference?

    Chi-square test for independence: allows you to test whether or not there is a statistically significant association between two categorical variables. When you reject the null hypothesis of a chi-square test for independence, it means there is a significant association between the two variables. t-test for a difference in means: allows you ...

  21. Chi-Square Test of Homogeneity

    The alternative hypothesis (Ha) is that at least one of the null hypothesis statements is false. ... sample frequencies differ significantly from expected frequencies specified in the null hypothesis. The chi-square test for homogeneity is described in the next section. Analyze sample data: using sample data from the contingency tables, find ...

  22. Chi-Square Goodness of Fit Test

    Example: Chi-square goodness of fit test conditions. You can use a chi-square goodness of fit test to analyze the dog food data because all three conditions have been met: You want to test a hypothesis about the distribution of one categorical variable. The categorical variable is the dog food flavors. You recruited a random sample of 75 dogs.

  23. 11.9: Test of Homogeneity

    Step 1: State the hypotheses. In the test of homogeneity, the null hypothesis says that the distribution of a categorical response variable is the same in each population. In this example, the categorical response variable is steroid use (yes or no). The populations are the three NCAA divisions. H0: The proportion of athletes using steroids is ...

  24. What is a Chi-Square Test

    The chi-square test is widely used in machine learning. Feature selection is a critical topic there: you will have multiple candidate features and must choose the best ones to build the model. By examining the relationship between the features and the outcome, the chi-square test aids in solving feature selection problems.

  25. 9.2: Chi-square contingency tables

    Introduction. We just completed a discussion about goodness of fit tests, inferences on categorical traits for which a theoretical distribution or expectation is available. We offered Mendelian ratios as an example in which theory provides clear-cut expectations for the distribution of phenotypes in the F2 offspring generation. This is an extrinsic model: theory external to the study guides ...

  26. Five products are available for purchase, A-E. 250

    Question: Five products are available for purchase, A-E. 250 consumers are sampled and 60 prefer A, 50 prefer B, 55 prefer C, 45 prefer D, and 40 prefer E. (a) Write down an appropriate null and alternative hypothesis that examines whether each product is equally preferred. (b) Construct a chi-square test statistic. (c) Write down the correct critical values to test your hypothesis. (d) ... (A worked sketch of this problem appears immediately below.)
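To make the goodness-of-fit formula quoted in entries 2, 16, and 22 concrete, here is a worked sketch of entry 26's problem. Python with numpy and scipy is my choice of tooling here, not something any of the sources prescribe.

```python
# Chi-square goodness-of-fit test for entry 26's problem:
# 250 consumers, five products (A-E), observed preferences 60/50/55/45/40.
import numpy as np
from scipy import stats

observed = np.array([60, 50, 55, 45, 40])  # counts preferring A-E
n = observed.sum()                          # 250 consumers
k = observed.size                           # 5 products

# (a) H0: each product is equally preferred (p_i = 1/5 for every i);
#     Ha: at least one proportion differs from 1/5.
expected = np.full(k, n / k)                # 50 expected per product

# (b) the chi-square statistic: sum over categories of (O - E)^2 / E
chi_sq = ((observed - expected) ** 2 / expected).sum()

# (c) critical value at alpha = 0.05 with df = k - 1 = 4
df = k - 1
critical = stats.chi2.ppf(0.95, df)

# (d) decision: reject H0 only if the statistic exceeds the critical value
p_value = stats.chi2.sf(chi_sq, df)
print(f"chi-square = {chi_sq:.2f}, df = {df}")            # 5.00, df = 4
print(f"critical value (alpha = 0.05) = {critical:.3f}")  # about 9.488
print(f"p-value = {p_value:.3f}")                         # about 0.287

# scipy's built-in routine returns the same statistic and p-value
stat, p = stats.chisquare(observed, f_exp=expected)
```

Every expected count here is 50, comfortably above the minimum of 5 required for the large-sample test. Since the statistic (5.00) does not exceed the critical value (about 9.488) and p ≈ 0.29 > 0.05, we fail to reject H0: the sample is consistent with equal preference across the five products.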
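Entries 6 and 11 give the expected-count rule for a test of independence, E = (row total × column total) / sample size. The sketch below applies that rule to a small 2×3 table; the counts are hypothetical, invented purely for illustration.

```python
# Chi-square test of independence on a hypothetical 2x3 table.
# E_ij = (row i total * column j total) / n for each of the r*c cells.
import numpy as np
from scipy import stats

observed = np.array([[30, 50, 20],   # hypothetical counts, group 1
                     [20, 45, 35]])  # hypothetical counts, group 2

row_totals = observed.sum(axis=1, keepdims=True)  # shape (2, 1)
col_totals = observed.sum(axis=0, keepdims=True)  # shape (1, 3)
n = observed.sum()

expected = row_totals * col_totals / n  # broadcasting fills all r*c cells

chi_sq = ((observed - expected) ** 2 / expected).sum()
df = (observed.shape[0] - 1) * (observed.shape[1] - 1)  # (r-1)(c-1) = 2
p_value = stats.chi2.sf(chi_sq, df)

# scipy does the same computation in one call; correction=False skips
# the Yates continuity correction (relevant only to 2x2 tables anyway),
# so the result matches the hand formula exactly.
chi2_stat, p, dof, exp = stats.chi2_contingency(observed, correction=False)
assert np.isclose(chi2_stat, chi_sq) and dof == df
```

The degrees of freedom come from (r − 1)(c − 1), matching the r × c cell layout described in entry 6.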
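Entry 17 quotes a statistic of 6.17 and a p-value of 0.046 but does not state the degrees of freedom; df = 2 is an inference on my part, chosen because it reproduces the quoted p-value. The general recipe, as that entry describes, is that the p-value is the upper-tail area of the chi-square distribution at the observed statistic.

```python
# Upper-tail (survival function) p-value for an observed chi-square
# statistic. The value 6.17 is quoted in entry 17; df = 2 is an
# assumption that reproduces the quoted p-value of 0.046.
from scipy import stats

p = stats.chi2.sf(6.17, df=2)
print(round(p, 3))  # 0.046
```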