8.8 Hypothesis Tests for a Population Proportion
Learning objectives.
- Conduct and interpret hypothesis tests for a population proportion.
Some notes about conducting a hypothesis test:
- The null hypothesis [latex]H_0[/latex] is always an “equal to.” The null hypothesis is the original claim about the population parameter.
- The alternative hypothesis [latex]H_a[/latex] is a “less than,” “greater than,” or “not equal to.” The form of the alternative hypothesis depends on the context of the question.
- If the alternative hypothesis is a “less than”, then the test is left-tail. The p -value is the area in the left-tail of the distribution.
- If the alternative hypothesis is a “greater than”, then the test is right-tail. The p -value is the area in the right-tail of the distribution.
- If the alternative hypothesis is a “not equal to”, then the test is two-tail. The p -value is the sum of the area in the two-tails of the distribution. Each tail represents exactly half of the p -value.
- Think about the meaning of the p -value. A data analyst (and anyone else) should have more confidence that they made the correct decision to reject the null hypothesis with a smaller p -value (for example, 0.001 as opposed to 0.04) even if using a significance level of 0.05. Similarly, for a large p -value such as 0.4, as opposed to a p -value of 0.056 (a significance level of 0.05 is less than either number), a data analyst should have more confidence that they made the correct decision in not rejecting the null hypothesis. This makes the data analyst use judgment rather than mindlessly applying rules.
- The significance level must be identified before collecting the sample data and conducting the test. Generally, the significance level will be included in the question. If no significance level is given, a common standard is to use a significance level of 5%.
Suppose the hypotheses for a hypothesis test are:
[latex]\begin{eqnarray*} H_0: & & p=20 \% \\ H_a: & & p \gt 20\% \end{eqnarray*}[/latex]
Because the alternative hypothesis is a [latex]\gt[/latex], this is a right-tail test. The p -value is the area in the right-tail of the distribution.
[latex]\begin{eqnarray*} H_0: & & p=50 \% \\ H_a: & & p \neq 50\% \end{eqnarray*}[/latex]
Because the alternative hypothesis is a [latex]\neq[/latex], this is a two-tail test. The p -value is the sum of the areas in the two tails of the distribution. Each tail contains exactly half of the p -value.
[latex]\begin{eqnarray*} H_0: & & p=10\% \\ H_a: & & p \lt 10\% \end{eqnarray*}[/latex]
Because the alternative hypothesis is a [latex]\lt[/latex], this is a left-tail test. The p -value is the area in the left-tail of the distribution.
Steps to Conduct a Hypothesis Test for a Population Proportion
- Write down the null and alternative hypotheses in terms of the population proportion [latex]p[/latex]. Include appropriate units with the values of the proportion.
- Use the form of the alternative hypothesis to determine if the test is left-tailed, right-tailed, or two-tailed.
- Collect the sample information for the test and identify the significance level.
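- Find the p -value (the area in the corresponding tail) for the test using the appropriate distribution (normal or binomial):
- If [latex]n \times p \geq 5[/latex] and [latex]n \times (1-p) \geq 5[/latex], use the normal distribution with [latex]\displaystyle{z=\frac{\hat{p}-p}{\sqrt{\frac{p \times (1-p)}{n}}}}[/latex].
- If at least one of [latex]n \times p \lt 5[/latex] or [latex]n \times (1-p) \lt 5[/latex], use a binomial distribution.
- Compare the p -value to the significance level and state the outcome of the test:
- If p -value [latex]\leq \alpha[/latex], reject [latex]H_0[/latex] in favour of [latex]H_a[/latex]. The results of the sample data are significant. There is sufficient evidence to conclude that the null hypothesis [latex]H_0[/latex] is an incorrect belief and that the alternative hypothesis [latex]H_a[/latex] is most likely correct.
- If p -value [latex]\gt \alpha[/latex], do not reject [latex]H_0[/latex]. The results of the sample data are not significant. There is not sufficient evidence to conclude that the alternative hypothesis [latex]H_a[/latex] may be correct.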
- Write down a concluding sentence specific to the context of the question.
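The steps above can be sketched in Python. This is a minimal illustration using only the standard library, not the Excel workflow this section describes. The function name `proportion_p_value` and its `tail` argument are hypothetical names chosen for this sketch. Because the section gives no binomial procedure for a two-tail test, the sketch raises an error in that case rather than guessing one.

```python
from math import comb, sqrt
from statistics import NormalDist


def proportion_p_value(successes, n, p0, tail):
    """p-value for H0: p = p0, where tail is "left", "right", or "two"."""
    if tail not in ("left", "right", "two"):
        raise ValueError("tail must be 'left', 'right', or 'two'")

    if n * p0 >= 5 and n * (1 - p0) >= 5:
        # Normal case: sample proportions are centred at p0 with
        # standard error sqrt(p0 * (1 - p0) / n).
        p_hat = successes / n
        dist = NormalDist(mu=p0, sigma=sqrt(p0 * (1 - p0) / n))
        left = dist.cdf(p_hat)
        right = 1 - left
        if tail == "left":
            return left
        if tail == "right":
            return right
        # Two-tail: each tail holds exactly half of the p-value.
        return 2 * min(left, right)

    # Binomial case: work with the number of successes directly.
    def at_most(k):
        return sum(comb(n, i) * p0**i * (1 - p0) ** (n - i) for i in range(k + 1))

    if tail == "left":
        return at_most(successes)          # P(X <= x)
    if tail == "right":
        return 1 - at_most(successes - 1)  # P(X >= x) by the complement rule
    raise NotImplementedError("two-tail binomial test is not covered in this section")


# 174 of 200 adults (87%) against a claim of 92%, left-tail test.
print(round(proportion_p_value(174, 200, 0.92, "left"), 4))
# 8 of 80 people against a claim of 4%, right-tail test.
print(round(proportion_p_value(8, 80, 0.04, "right"), 4))
```

The returned p -value is then compared to the significance level chosen before the data were collected.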
USING EXCEL TO CALCULATE THE P -VALUE FOR A HYPOTHESIS TEST ON A POPULATION PROPORTION
The p -value for a hypothesis test on a population proportion is the area in the tail(s) of the distribution of the sample proportion. If both [latex]n \times p \geq 5[/latex] and [latex]n \times (1-p) \geq 5[/latex], use the normal distribution to find the p -value. If at least one of [latex]n \times p \lt 5[/latex] or [latex]n \times (1-p) \lt 5[/latex], use the binomial distribution to find the p -value.
If both [latex]n \times p \geq 5[/latex] and [latex]n \times (1-p) \geq 5[/latex]:
- For x , enter the value for [latex]\hat{p}[/latex].
- For [latex]\mu[/latex] , enter the mean of the sample proportions [latex]p[/latex]. Note: Because the test is run assuming the null hypothesis is true, the value for [latex]p[/latex] is the claim from the null hypothesis.
- For [latex]\sigma[/latex] , enter the standard error of the proportions [latex]\displaystyle{\sqrt{\frac{p \times (1-p)}{n}}}[/latex].
- For the logic operator , enter true . Note: Because we are calculating the area under the curve, we always enter true for the logic operator.
- Use the appropriate technique with the norm.dist function to find the area in the left-tail or the area in the right-tail.
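For readers working outside Excel, the fields listed above map onto Python's standard-library `statistics.NormalDist`. This is only a rough analogue, and the helper names `left_tail_area` and `right_tail_area` are hypothetical names introduced for this sketch.

```python
from statistics import NormalDist


def left_tail_area(x, mu, sigma):
    """Area to the left of x, the analogue of norm.dist(x, mu, sigma, true)."""
    return NormalDist(mu, sigma).cdf(x)


def right_tail_area(x, mu, sigma):
    """Area to the right of x, the analogue of 1-norm.dist(x, mu, sigma, true)."""
    return 1 - NormalDist(mu, sigma).cdf(x)


# A sample proportion equal to the claimed proportion splits the curve in half.
print(left_tail_area(0.5, 0.5, 0.05), right_tail_area(0.5, 0.5, 0.05))
```

In a hypothesis test, `x` is the sample proportion, `mu` is the proportion claimed in the null hypothesis, and `sigma` is the standard error of the proportions.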
If at least one of [latex]n \times p \lt 5[/latex] or [latex]n \times (1-p) \lt 5[/latex]:
- The p -value is found using the binomial distribution.
- For a left-tail test, the p -value is the probability of at most x successes, found using binom.dist :
- For x , enter the number of successes.
- For n , enter the sample size.
- For p , enter the value of the population proportion [latex]p[/latex] from the null hypothesis.
- For the logic operator , enter true . Note: Because we are calculating an at most probability, the logic operator is always true.
- For a right-tail test, the p -value is the probability of at least x successes, found using 1-binom.dist with [latex]x-1[/latex] entered in place of x :
- For n , enter the sample size.
- For p , enter the value of the population proportion [latex]p[/latex] in the null hypothesis.
- For the logic operator , enter true . Note: Because we are calculating an at least probability, the logic operator is always true.
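The "at most" and "at least" binomial probabilities described above can be sketched in Python with `math.comb`. The function names `binom_at_most` and `binom_at_least` are hypothetical names for this illustration, and the sketch is an analogue of the cumulative binom.dist calculation rather than a reproduction of Excel itself.

```python
from math import comb


def binom_at_most(x, n, p):
    """P(X <= x) for X ~ Binomial(n, p): the cumulative ("true") calculation."""
    return sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(x + 1))


def binom_at_least(x, n, p):
    """P(X >= x), via the complement rule: 1 - P(X <= x - 1)."""
    return 1 - binom_at_most(x - 1, n, p)


# Two fair coin flips: at most one head, and at least one head.
print(binom_at_most(1, 2, 0.5), binom_at_least(1, 2, 0.5))
```

The shift from x to x - 1 in the "at least" case is what keeps the outcome x itself inside the tail being measured.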
Marketers believe that 92% of adults own a cell phone. A cell phone manufacturer believes that number is actually lower. In a sample of 200 adults, 87% own a cell phone. At the 1% significance level, determine if the proportion of adults that own a cell phone is lower than the marketers’ claim.
Hypotheses:
[latex]\begin{eqnarray*} H_0: & & p=92\% \mbox{ of adults own a cell phone} \\ H_a: & & p \lt 92\% \mbox{ of adults own a cell phone} \end{eqnarray*}[/latex]
From the question, we have [latex]n=200[/latex], [latex]\hat{p}=0.87[/latex], and [latex]\alpha=0.01[/latex].
To determine the distribution, we check [latex]n \times p[/latex] and [latex]n \times (1-p)[/latex]. For the value of [latex]p[/latex], we use the claim from the null hypothesis ([latex]p=0.92[/latex]).
[latex]\begin{eqnarray*} n \times p & = & 200 \times 0.92=184 \geq 5 \\ n \times (1-p) & = & 200 \times (1-0.92)=16 \geq 5\end{eqnarray*}[/latex]
Because both [latex]n \times p \geq 5[/latex] and [latex]n \times (1-p) \geq 5[/latex] we use a normal distribution to calculate the p -value. Because the alternative hypothesis is a [latex]\lt[/latex], the p -value is the area in the left tail of the distribution.
So the p -value[latex]=0.0046[/latex].
Conclusion:
Because p -value[latex]=0.0046 \lt 0.01=\alpha[/latex], we reject the null hypothesis in favour of the alternative hypothesis. At the 1% significance level there is enough evidence to suggest that the proportion of adults who own a cell phone is lower than 92%.
- The null hypothesis [latex]p=92\%[/latex] is the claim that 92% of adults own a cell phone.
- The alternative hypothesis [latex]p \lt 92\%[/latex] is the claim that less than 92% of adults own a cell phone.
- The function is norm.dist because we are finding the area in the left tail of a normal distribution.
- Field 1 is the value of [latex]\hat{p}[/latex].
- Field 2 is the value of [latex]p[/latex] from the null hypothesis. Remember, we run the test assuming the null hypothesis is true, so that means we assume [latex]p=0.92[/latex].
- Field 3 is the standard deviation for the sample proportions [latex]\displaystyle{\sqrt{\frac{p \times (1-p)}{n}}}[/latex].
- The p -value of 0.0046 tells us that under the assumption that 92% of adults own a cell phone (the null hypothesis), there is only a 0.46% chance that the proportion of adults who own a cell phone in a sample of 200 is 87% or less. This is a small probability, and so is unlikely to happen assuming the null hypothesis is true. This suggests that the assumption that the null hypothesis is true is most likely incorrect, and so the conclusion of the test is to reject the null hypothesis in favour of the alternative hypothesis. In other words, the proportion of adults who own a cell phone is most likely less than 92%.
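As a cross-check on the cell phone example, the same left-tail area can be computed in Python. This is a sketch with the standard library, assuming the normal model justified above; the variable names are chosen here for illustration.

```python
from math import sqrt
from statistics import NormalDist

n, p_hat, p0, alpha = 200, 0.87, 0.92, 0.01

# Standard error of the sample proportions, assuming the null hypothesis is true.
standard_error = sqrt(p0 * (1 - p0) / n)

# Left-tail test: area to the left of the sample proportion.
p_value = NormalDist(p0, standard_error).cdf(p_hat)

print(f"p-value = {p_value:.4f}")
print("reject H0" if p_value <= alpha else "do not reject H0")
```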
A consumer group claims that the proportion of households that have at least three cell phones is 30%. A cell phone company has reason to believe that the proportion of households with at least three cell phones is much higher. Before they start a big advertising campaign based on the proportion of households that have at least three cell phones, they want to test their claim. Their marketing people survey 150 households with the result that 54 of the households have at least three cell phones. At the 1% significance level, determine if the proportion of households that have at least three cell phones is more than 30%.
[latex]\begin{eqnarray*} H_0: & & p=30\% \mbox{ of households have at least 3 cell phones} \\ H_a: & & p \gt 30\% \mbox{ of households have at least 3 cell phones} \end{eqnarray*}[/latex]
From the question, we have [latex]n=150[/latex], [latex]\displaystyle{\hat{p}=\frac{54}{150}=0.36}[/latex], and [latex]\alpha=0.01[/latex].
To determine the distribution, we check [latex]n \times p[/latex] and [latex]n \times (1-p)[/latex]. For the value of [latex]p[/latex], we use the claim from the null hypothesis ([latex]p=0.3[/latex]).
[latex]\begin{eqnarray*} n \times p & = & 150 \times 0.3=45 \geq 5 \\ n \times (1-p) & = & 150 \times (1-0.3)=105 \geq 5\end{eqnarray*}[/latex]
Because both [latex]n \times p \geq 5[/latex] and [latex]n \times (1-p) \geq 5[/latex] we use a normal distribution to calculate the p -value. Because the alternative hypothesis is a [latex]\gt[/latex], the p -value is the area in the right tail of the distribution.
So the p -value[latex]=0.0544[/latex].
Because p -value[latex]=0.0544 \gt 0.01=\alpha[/latex], we do not reject the null hypothesis. At the 1% significance level there is not enough evidence to suggest that the proportion of households with at least three cell phones is more than 30%.
- The null hypothesis [latex]p=30\%[/latex] is the claim that 30% of households have at least three cell phones.
- The alternative hypothesis [latex]p \gt 30\%[/latex] is the claim that more than 30% of households have at least three cell phones.
- The function is 1-norm.dist because we are finding the area in the right tail of a normal distribution.
- Field 2 is the value of [latex]p[/latex] from the null hypothesis. Remember, we run the test assuming the null hypothesis is true, so that means we assume [latex]p=0.3[/latex].
- The p -value of 0.0544 tells us that under the assumption that 30% of households have at least three cell phones (the null hypothesis), there is a 5.44% chance that the proportion of households with at least three cell phones in a sample of 150 is 36% or more. Compared to the 1% significance level, this is a large probability, and so is likely to happen assuming the null hypothesis is true. This suggests that the assumption that the null hypothesis is true is most likely correct, and so the conclusion of the test is to not reject the null hypothesis. In other words, the claim that 30% of households have at least three cell phones is most likely correct.
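The household example can also be checked through the z -score form of the test. The sketch below standardizes the sample proportion and reads the right-tail area from the standard normal distribution, which gives the same area as the 1-norm.dist calculation. Variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

n, x, p0, alpha = 150, 54, 0.30, 0.01
p_hat = x / n

# z-score of the sample proportion under the null hypothesis.
z = (p_hat - p0) / sqrt(p0 * (1 - p0) / n)

# Right-tail test: area to the right of z under the standard normal curve.
p_value = 1 - NormalDist().cdf(z)

print(f"z = {z:.2f}, p-value = {p_value:.4f}")
print("reject H0" if p_value <= alpha else "do not reject H0")
```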
A teacher believes that 70% of students in the class will want to go on a field trip to the local zoo. The students in the class believe the proportion is much higher and ask the teacher to verify her claim. The teacher samples 50 students and 39 reply that they would want to go to the zoo. At the 5% significance level, determine if the proportion of students who want to go on the field trip is higher than 70%.
[latex]\begin{eqnarray*} H_0: & & p = 70\% \mbox{ of students want to go on the field trip} \\ H_a: & & p \gt 70\% \mbox{ of students want to go on the field trip} \end{eqnarray*}[/latex]
From the question, we have [latex]n=50[/latex], [latex]\displaystyle{\hat{p}=\frac{39}{50}=0.78}[/latex], and [latex]\alpha=0.05[/latex].
[latex]\begin{eqnarray*} n \times p & = & 50 \times 0.7=35 \geq 5 \\ n \times (1-p) & = & 50 \times (1-0.7)=15 \geq 5\end{eqnarray*}[/latex]
Because both [latex]n \times p \geq 5[/latex] and [latex]n \times (1-p) \geq 5[/latex] we use a normal distribution to calculate the p -value. Because the alternative hypothesis is a [latex]\gt[/latex], the p -value is the area in the right tail of the distribution.
So the p -value[latex]=0.1085[/latex].
Because p -value[latex]=0.1085 \gt 0.05=\alpha[/latex], we do not reject the null hypothesis. At the 5% significance level there is not enough evidence to suggest that the proportion of students who want to go on the field trip is higher than 70%.
- The null hypothesis [latex]p=70\%[/latex] is the claim that 70% of the students want to go on the field trip.
- The alternative hypothesis [latex]p \gt 70\%[/latex] is the claim that more than 70% of students want to go on the field trip.
- The p -value of 0.1085 tells us that under the assumption that 70% of students want to go on the field trip (the null hypothesis), there is a 10.85% chance that the proportion of students who want to go on the field trip in a sample of 50 students is 78% or more. Compared to the 5% significance level, this is a large probability, and so is likely to happen assuming the null hypothesis is true. This suggests that the assumption that the null hypothesis is true is most likely correct, and so the conclusion of the test is to not reject the null hypothesis. In other words, the teacher’s claim that 70% of students want to go on the field trip is most likely correct.
Joan believes that 50% of first-time brides in the United States are younger than their grooms. She performs a hypothesis test to determine if the percentage is the same or different from 50%. Joan samples 100 first-time brides and 56 reply that they are younger than their grooms. Use a 5% significance level.
[latex]\begin{eqnarray*} H_0: & & p=50\% \mbox{ of first-time brides are younger than the groom} \\ H_a: & & p \neq 50\% \mbox{ of first-time brides are younger than the groom} \end{eqnarray*}[/latex]
From the question, we have [latex]n=100[/latex], [latex]\displaystyle{\hat{p}=\frac{56}{100}=0.56}[/latex], and [latex]\alpha=0.05[/latex].
To determine the distribution, we check [latex]n \times p[/latex] and [latex]n \times (1-p)[/latex]. For the value of [latex]p[/latex], we use the claim from the null hypothesis ([latex]p=0.5[/latex]).
[latex]\begin{eqnarray*} n \times p & = & 100 \times 0.5=50 \geq 5 \\ n \times (1-p) & = & 100 \times (1-0.5)=50 \geq 5\end{eqnarray*}[/latex]
Because both [latex]n \times p \geq 5[/latex] and [latex]n \times (1-p) \geq 5[/latex] we use a normal distribution to calculate the p -value. Because the alternative hypothesis is a [latex]\neq[/latex], the p -value is the sum of area in the tails of the distribution.
Because there is only one sample, we only have information relating to one of the two tails, either the left or the right. We need to know which tail the sample relates to because that determines how we calculate the area of that tail using the normal distribution. In this case, the sample proportion [latex]\hat{p}=0.56[/latex] is greater than the value of the population proportion in the null hypothesis [latex]p=0.5[/latex] ([latex]\hat{p}=0.56>0.5=p[/latex]), so the sample information relates to the right tail of the normal distribution. This means that we calculate the area in the right tail using 1-norm.dist . However, this is a two-tailed test, so the area in the right tail is only one half of the p -value. The area in the left tail equals the area in the right tail, and the p -value is the sum of these two areas.
So the area in the right tail is 0.1151 and [latex]\frac{1}{2}[/latex]( p -value)[latex]=0.1151[/latex]. This is also the area in the left tail, so
p -value[latex]=0.1151+0.1151=0.2302[/latex]
Because p -value[latex]=0.2302 \gt 0.05=\alpha[/latex], we do not reject the null hypothesis. At the 5% significance level there is not enough evidence to suggest that the proportion of first-time brides that are younger than the groom is different from 50%.
- The null hypothesis [latex]p=50\%[/latex] is the claim that the proportion of first-time brides that are younger than the groom is 50%.
- The alternative hypothesis [latex]p \neq 50\%[/latex] is the claim that the proportion of first-time brides that are younger than the groom is different from 50%.
- If the sample proportion is less than the claimed proportion ([latex]\hat{p} \lt p[/latex]), we use norm.dist([latex]\hat{p}[/latex],[latex]p[/latex],[latex]\mbox{sqrt}(p*(1-p)/n)[/latex],true) to find the area in the left tail. The area in the right tail equals the area in the left tail, so we can find the p -value by adding the output from this function to itself.
- If the sample proportion is greater than the claimed proportion ([latex]\hat{p} \gt p[/latex]), as in this example, we use 1-norm.dist([latex]\hat{p}[/latex],[latex]p[/latex],[latex]\mbox{sqrt}(p*(1-p)/n)[/latex],true) to find the area in the right tail. The area in the left tail equals the area in the right tail, so we can find the p -value by adding the output from this function to itself.
- The p -value of 0.2302 is a large probability compared to the 5% significance level, and so is likely to happen assuming the null hypothesis is true. This suggests that the assumption that the null hypothesis is true is most likely correct, and so the conclusion of the test is to not reject the null hypothesis. In other words, the claim that the proportion of first-time brides who are younger than the groom is 50% is most likely correct.
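The two-tail calculation can be sketched in Python as follows, using the standard library and illustrative variable names. The sketch picks the tail that the sample falls in and doubles its area. Doubling the unrounded tail area gives a value near 0.2301, while the text's 0.2302 comes from doubling the rounded area 0.1151; the difference is rounding only.

```python
from math import sqrt
from statistics import NormalDist

n, x, p0, alpha = 100, 56, 0.5, 0.05
p_hat = x / n
dist = NormalDist(p0, sqrt(p0 * (1 - p0) / n))

# The sample falls in the right tail when p_hat > p0, otherwise in the left tail.
if p_hat > p0:
    one_tail = 1 - dist.cdf(p_hat)
else:
    one_tail = dist.cdf(p_hat)

# The two tails have equal area, so the p-value is twice one tail.
p_value = 2 * one_tail

print(f"one tail = {one_tail:.4f}, p-value = {p_value:.4f}")
print("reject H0" if p_value <= alpha else "do not reject H0")
```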
Watch this video: Hypothesis Testing for Proportions: z -test by ExcelIsFun [7:27]
An online retailer believes that 93% of the visitors to its website will make a purchase. A researcher in the marketing department thinks the actual percent is lower than claimed. The researcher examines a sample of 50 visits to the website and finds that 45 of the visits resulted in a purchase. At the 1% significance level, determine if the proportion of visits to the website that result in a purchase is lower than claimed.
[latex]\begin{eqnarray*} H_0: & & p=93\% \mbox{ of visitors make a purchase} \\ H_a: & & p \lt 93\% \mbox{ of visitors make a purchase} \end{eqnarray*}[/latex]
From the question, we have [latex]n=50[/latex], [latex]x=45[/latex], and [latex]\alpha=0.01[/latex].
To determine the distribution, we check [latex]n \times p[/latex] and [latex]n \times (1-p)[/latex]. For the value of [latex]p[/latex], we use the claim from the null hypothesis ([latex]p=0.93[/latex]).
[latex]\begin{eqnarray*} n \times p & = & 50 \times 0.93=46.5 \geq 5 \\ n \times (1-p) & = & 50 \times (1-0.93)=3.5 \lt 5\end{eqnarray*}[/latex]
Because [latex]n \times (1-p) \lt 5[/latex] we use a binomial distribution to calculate the p -value. Because the alternative hypothesis is a [latex]\lt[/latex], the p -value is the probability of getting at most 45 successes in 50 trials.
So the p -value[latex]=0.2710[/latex].
Because p -value[latex]=0.2710 \gt 0.01=\alpha[/latex], we do not reject the null hypothesis. At the 1% significance level there is not enough evidence to suggest that the proportion of visitors who make a purchase is lower than 93%.
- The null hypothesis [latex]p=93\%[/latex] is the claim that 93% of visitors to the website make a purchase.
- The alternative hypothesis [latex]p \lt 93\%[/latex] is the claim that less than 93% of visitors to the website make a purchase.
- The function is binom.dist because we are finding the probability of at most 45 successes.
- Field 1 is the number of successes [latex]x[/latex].
- Field 2 is the sample size [latex]n[/latex].
- Field 3 is the probability of success [latex]p[/latex]. This is the claim about the population proportion made in the null hypothesis, so that means we assume [latex]p=0.93[/latex].
- The p -value of 0.2710 tells us that under the assumption that 93% of visitors make a purchase (the null hypothesis), there is a 27.10% chance that the number of visitors in a sample of 50 who make a purchase is 45 or less. This is a large probability compared to the significance level, and so is likely to happen assuming the null hypothesis is true. This suggests that the assumption that the null hypothesis is true is most likely correct, and so the conclusion of the test is to not reject the null hypothesis. In other words, the proportion of visitors to the website who make a purchase is most likely 93%.
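The binomial p -value for the online retailer example can be reproduced with a direct sum of binomial probabilities. This is a sketch with `math.comb` and illustrative variable names, not the Excel calculation itself.

```python
from math import comb

n, x, p0, alpha = 50, 45, 0.93, 0.01

# Left-tail test: probability of at most x successes in n trials when p = p0.
p_value = sum(comb(n, k) * p0**k * (1 - p0) ** (n - k) for k in range(x + 1))

print(f"p-value = {p_value:.4f}")
print("reject H0" if p_value <= alpha else "do not reject H0")
```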
A drug company claims that only 4% of people who take their new drug experience any side effects from the drug. A researcher believes that the percent is higher than the drug company’s claim. The researcher takes a sample of 80 people who take the drug and finds that 10% of the people in the sample experience side effects from the drug. At the 5% significance level, determine if the proportion of people who experience side effects from taking the drug is higher than claimed.
[latex]\begin{eqnarray*} H_0: & & p=4\% \mbox{ of people experience side effects} \\ H_a: & & p \gt 4\% \mbox{ of people experience side effects} \end{eqnarray*}[/latex]
From the question, we have [latex]n=80[/latex], [latex]\hat{p}=0.1[/latex], and [latex]\alpha=0.05[/latex].
To determine the distribution, we check [latex]n \times p[/latex] and [latex]n \times (1-p)[/latex]. For the value of [latex]p[/latex], we use the claim from the null hypothesis ([latex]p=0.04[/latex]).
[latex]\begin{eqnarray*} n \times p & = & 80 \times 0.04=3.2 \lt 5\end{eqnarray*}[/latex]
Because [latex]n \times p \lt 5[/latex] we use a binomial distribution to calculate the p -value. Because the alternative hypothesis is a [latex]\gt[/latex], the p -value is the probability of getting at least 8 successes in 80 trials. (Note: In the sample of size 80, 10% have the characteristic of interest, so this means that [latex]80 \times 0.1=8[/latex] people in the sample have the characteristic of interest.)
So the p -value[latex]=0.0147[/latex].
Because p -value[latex]=0.0147 \lt 0.05=\alpha[/latex], we reject the null hypothesis in favour of the alternative hypothesis. At the 5% significance level there is enough evidence to suggest that the proportion of people who experience side effects from taking the drug is higher than 4%.
- The null hypothesis [latex]p=4\%[/latex] is the claim that 4% of the people experience side effects from taking the drug.
- The alternative hypothesis [latex]p \gt 4\%[/latex] is the claim that more than 4% of the people experience side effects from taking the drug.
- The function is 1-binom.dist because we are finding the probability of at least 8 successes.
- Field 1 is [latex]x-1[/latex] where [latex]x[/latex] is the number of successes. In this case, we are using the complement rule to change the probability of at least 8 successes into 1 minus the probability of at most 7 successes.
- Field 3 is the probability of success [latex]p[/latex]. This is the claim about the population proportion made in the null hypothesis, so that means we assume [latex]p=0.04[/latex].
- The p -value of 0.0147 tells us that under the assumption that 4% of people experience side effects (the null hypothesis), there is a 1.47% chance that the number of people in a sample of 80 who experience side effects is 8 or more. This is a small probability compared to the significance level, and so is unlikely to happen assuming the null hypothesis is true. This suggests that the assumption that the null hypothesis is true is most likely incorrect, and so the conclusion of the test is to reject the null hypothesis in favour of the alternative hypothesis. In other words, the proportion of people who experience side effects is most likely greater than 4%.
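The complement-rule calculation for the drug example can be sketched in Python. The code converts the sample percentage into a count of successes, sums the probabilities of 0 through 7 successes, and subtracts from 1. Variable names are illustrative.

```python
from math import comb

n, p0, alpha = 80, 0.04, 0.05
x = round(n * 0.10)  # 10% of the 80 people sampled, that is 8 successes

# P(X <= x - 1): probability of at most 7 successes when p = p0.
at_most_x_minus_1 = sum(
    comb(n, k) * p0**k * (1 - p0) ** (n - k) for k in range(x)
)

# Right-tail test: P(X >= x) = 1 - P(X <= x - 1).
p_value = 1 - at_most_x_minus_1

print(f"p-value = {p_value:.4f}")
print("reject H0" if p_value <= alpha else "do not reject H0")
```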
Concept Review
The hypothesis test for a population proportion is a well-established process:
- Find the p -value (the area in the corresponding tail) for the test using the appropriate distribution (normal or binomial).
- Compare the p -value to the significance level and state the outcome of the test.
Attribution
“9.6 Hypothesis Testing of a Single Mean and Single Proportion” in Introductory Statistics by OpenStax is licensed under a Creative Commons Attribution 4.0 International License.
Introduction to Statistics Copyright © 2022 by Valerie Watts is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License , except where otherwise noted.
Statistics and Probability Module: Solving Problems Involving Test of Hypothesis on Population Proportion
This Self-Learning Module (SLM) is prepared so that you, our dear learners, can continue your studies and learn while at home. Activities, questions, directions, exercises, and discussions are carefully stated for you to understand each lesson.
Each SLM is composed of different parts. Each part shall guide you step-by-step as you discover and understand the lesson prepared for you.
Pre-tests are provided to measure your prior knowledge on lessons in each SLM. This will tell you if you need to proceed on completing this module or if you need to ask your facilitator or your teacher’s assistance for better understanding of the lesson. At the end of each module, you need to answer the post-test to self-check your learning. Answer keys are provided for each activity and test. We trust that you will be honest in using these.
Please use this module with care. Do not put unnecessary marks on any part of this SLM. Use a separate sheet of paper in answering the exercises and tests. And read the instructions carefully before performing each task.
In real life, whenever we are confronted with problems, our decision-making skill is being tested. Before we decide, certain considerations and an analysis of the given conditions must be made. Someone can be an expert problem solver if s/he is able to apply the learned concepts in a particular situation. Although problem solving has steps, someone may have his/her own way or techniques of solving a problem.
Meanwhile, in statistical analysis, there are steps that need to be followed in solving problems involving test of hypothesis on population proportion. The objective is for us to make a correct decision about the null hypothesis, that is, to determine whether we can confidently say that the change in our data is real, definite, and not attributable to chance.
After going through this module, you are expected to:
1. enumerate the steps in solving problems involving test of hypothesis on population proportion; and
2. solve problems involving test of hypothesis on the population proportion.
Chapter 8: Inference for One Proportion
Hypothesis Test for a Population Proportion (3 of 3)
Learning Objectives
- Conduct a hypothesis test for a population proportion. State a conclusion in context.
- Interpret the P-value as a conditional probability in the context of a hypothesis test about a population proportion.
- Distinguish statistical significance from practical importance.
- From a description of a study, evaluate whether the conclusion of a hypothesis test is reasonable.
More about the P-Value
The P-value is a probability that describes the likelihood of the data if the null hypothesis is true. More specifically, the P-value is the probability that sample results are as extreme as or more extreme than the data if the null hypothesis is true. The phrase “as extreme as or more extreme than” means farther from the center of the sampling distribution in the direction of the alternative hypothesis.
More generally, we view the P-value as a description of the strength of the evidence against the null hypothesis and in support of the alternative hypothesis. But the P-value is a probability about sample results, not about the null or alternative hypothesis.
One More Note about P-Values and the Significance Level
You may wonder why 5% is often selected as the significance level in hypothesis testing and why 1% is also a commonly used level. It is largely due to just convenience and tradition. When Ronald Fisher (one of the founders of modern statistics) published one of his tables, he used a mathematically convenient scale that included 5% and 1%. Later, these same 5% and 1% levels were used by other people, in part just because Fisher was so highly esteemed. But mostly, these are arbitrary levels.
The idea of selecting some sort of relatively small cutoff was historically important in the development of statistics. But it’s important to remember that there is really a continuous range of increasing confidence toward the alternative hypothesis, not a single all-or-nothing value. There isn’t much meaningful difference, for instance, between the P-values 0.049 and 0.051, and it would be foolish to declare one case definitely a “real” effect and the other case definitely a “random” effect. In either case, the study results are roughly 5% likely by chance if there’s no actual effect.
Whether such a P-value is sufficient for us to reject a particular null hypothesis ultimately depends on the risk of making the wrong decision and the extent to which the hypothesized effect might contradict our prior experience or previous studies.
Sample Size and Hypothesis Testing
Consider our earlier example about teenagers and Internet access. According to the Kaiser Family Foundation, 84% of U.S. children ages 8 to 18 had Internet access at home as of August 2009. Researchers wonder if this number has changed since then. The hypotheses we tested were:
- [latex]H_0: p = 0.84[/latex]
- [latex]H_a: p \neq 0.84[/latex]
The original sample consisted of 500 children, and 86% of them had Internet access at home. The P-value was about 0.22, which was not strong enough evidence to reject the null hypothesis. There was not enough evidence to show that the proportion of all U.S. children ages 8 to 18 who have Internet access at home has changed from 84%.
Suppose we sampled 2,000 children and the sample proportion was still 86%. Our test statistic would be Z ≈ 2.44, and our P-value would be about 0.015. The larger sample size would allow us to reject the null hypothesis even though the sample proportion was the same.
Why does this happen? Larger samples vary less, so a sample proportion of 0.86 is more unusual with larger samples than with smaller samples if the population proportion is really 0.84. This means that if the alternative hypothesis is true, a larger sample size will make it more likely that we reject the null. Therefore, we generally prefer a larger sample as we have seen previously.
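The effect of sample size can be seen by repeating the calculation for several values of n while holding the sample proportion at 0.86. The sketch below uses the standard library; the function name `two_tail_test` is hypothetical, and the intermediate size of 1,000 is an extra illustration that does not come from the text.

```python
from math import sqrt
from statistics import NormalDist


def two_tail_test(p_hat, p0, n):
    """Return (standard error, z, two-tailed P-value) for H0: p = p0."""
    standard_error = sqrt(p0 * (1 - p0) / n)
    z = (p_hat - p0) / standard_error
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return standard_error, z, p_value


for n in (500, 1000, 2000):
    se, z, p_value = two_tail_test(0.86, 0.84, n)
    print(f"n = {n:4d}  standard error = {se:.4f}  z = {z:.2f}  P-value = {p_value:.3f}")
```

The standard error shrinks as n grows, so the same 2-point gap between 0.86 and 0.84 corresponds to a larger z -score and a smaller P-value.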
Drawing Conclusions from Hypothesis Tests
It is tempting to get involved in the details of a hypothesis test without thinking about how the data was collected. Whether we are calculating a confidence interval or performing a hypothesis test, the results are meaningless without a properly designed study. Consider the following exercises about how data collection can affect the results of a study.
Let’s summarize.
In this section, we looked at the four steps of a hypothesis test as they relate to a claim about a population proportion.
Step 1: Determine the hypotheses.
- The hypotheses are claims about the population proportion, p .
- The null hypothesis is a hypothesis that the proportion equals a specific value, [latex]p_0[/latex].
- The alternative hypothesis is the competing claim that the parameter is less than, greater than, or not equal to [latex]p_0[/latex].
Step 2: Collect the data.
Since the hypothesis test is based on probability, random selection or assignment is essential in data production. Additionally, we need to check whether the sample proportion can be modeled by a normal distribution: np ≥ 10 and n(1 − p) ≥ 10, using the value of p from the null hypothesis.
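As a quick illustration of the sample-size check in Step 2, here is a small sketch. It assumes the condition is evaluated at the hypothesized proportion from the null hypothesis. The function name `normal_approximation_ok` and the sample size of 50 in the second call are hypothetical and chosen only for this example.

```python
def normal_approximation_ok(n, p0, threshold=10):
    """Check n*p0 >= threshold and n*(1 - p0) >= threshold."""
    expected_successes = n * p0
    expected_failures = n * (1 - p0)
    return expected_successes >= threshold and expected_failures >= threshold


# Internet access example: 500 * 0.84 = 420 and 500 * 0.16 = 80.
print(normal_approximation_ok(500, 0.84))  # True

# A hypothetical smaller sample: 50 * 0.16 = 8, which is below 10.
print(normal_approximation_ok(50, 0.84))   # False
```

This check covers only the sample-size condition. It cannot tell us whether the sample was randomly selected, which has to be judged from how the study was designed.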
Step 3: Assess the evidence.
- Determine the test statistic, which is the z-score for the sample proportion. The formula is: [latex]Z=\frac{\hat{p}-p_0}{\sqrt{\frac{p_0(1-p_0)}{n}}}[/latex]
- Use the test statistic, together with the alternative hypothesis, to determine the P-value. You can use a standard normal table (or Z-table) or technology (such as the simulations on the second page of this topic) to find the P-value.
- If the alternative hypothesis is greater than, the P-value is the area to the right of the test statistic. If the alternative hypothesis is less than, the P-value is the area to the left of the test statistic. If the alternative hypothesis is not equal to, the P-value is equal to double the tail area beyond the test statistic.
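The three P-value rules in Step 3 can be sketched in a few lines of Python. This is one possible way to write them, using the standard library's `statistics.NormalDist` for the standard normal curve. The function name `p_value_from_z` and the labels `"greater"`, `"less"`, and `"not equal"` are our own conventions for this example.

```python
from statistics import NormalDist


def p_value_from_z(z, alternative):
    """Return the P-value for a z test statistic and a given form of Ha."""
    std_normal = NormalDist()  # mean 0, standard deviation 1
    if alternative == "greater":
        # Area to the right of the test statistic.
        return 1 - std_normal.cdf(z)
    if alternative == "less":
        # Area to the left of the test statistic.
        return std_normal.cdf(z)
    if alternative == "not equal":
        # Double the tail area beyond the test statistic.
        return 2 * (1 - std_normal.cdf(abs(z)))
    raise ValueError("alternative must be 'greater', 'less', or 'not equal'")


z = 2.44  # test statistic from the n = 2,000 Internet access example
for alternative in ("greater", "less", "not equal"):
    print(f"{alternative}: P-value = {p_value_from_z(z, alternative):.4f}")
```

For a positive test statistic such as 2.44, the right-tail P-value is small, the left-tail P-value is close to 1, and the two-tailed P-value is exactly twice the right-tail area.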
Step 4: Give the conclusion.
- A small P-value says the data is unlikely to occur if the null is true. If the P-value is less than or equal to the significance level, we reject the null hypothesis and accept the alternative hypothesis instead.
- If the P-value is greater than the significance level, we say we “fail to reject” the null hypothesis. We never say that we “accept” the null hypothesis. We just say that we don’t have enough evidence to reject it. This is equivalent to saying we don’t have enough evidence to support the alternative hypothesis.
- We write the conclusion in the context of the research question. Our conclusion is usually a statement about the alternative hypothesis (we accept [latex]H_a[/latex] or fail to accept [latex]H_a[/latex]) and should include the P-value.
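The decision rule in Step 4 amounts to a single comparison. The sketch below is a hypothetical helper (the name `conclude` and the wording of its messages are ours) that applies the rule to the two P-values from the Internet access example, assuming a significance level of 0.05.

```python
def conclude(p_value, alpha=0.05):
    """Return the Step 4 decision for a given P-value and significance level."""
    if p_value <= alpha:
        return f"P-value = {p_value:.3f} <= {alpha}: reject H0 in favor of Ha."
    return f"P-value = {p_value:.3f} > {alpha}: fail to reject H0."


print(conclude(0.22))   # n = 500 sample
print(conclude(0.015))  # n = 2,000 sample
```

The function only makes the statistical decision. A complete conclusion still needs to be restated in the context of the research question, for example by saying whether the data give evidence that the proportion of children with Internet access has changed from 84%.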
Other Hypothesis Testing Notes
Remember that the P-value is the probability of seeing a sample proportion at least as extreme as the one observed from the data if the null hypothesis is true. The probability is about the random sample, not about the null or alternative hypothesis.
A larger sample size makes it more likely that we will reject the null hypothesis if the alternative is true. Another way of thinking about this is that increasing the sample size will decrease the likelihood of a type II error. Recall that a type II error is failing to reject the null hypothesis when the alternative is true.
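To see how sample size affects the chance of a type II error, we can approximate the probability of rejecting the null hypothesis under a specific alternative. The sketch below assumes, purely for illustration, that the true proportion is 0.86, the null value is 0.84, the test is two-tailed with a significance level of 0.05, and the normal model applies. The function name `approximate_power` is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def approximate_power(p_true, p0, n, alpha=0.05):
    """Approximate P(reject H0) for a two-tailed z-test when the true proportion is p_true."""
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    se_null = sqrt(p0 * (1 - p0) / n)
    # Sample proportions outside [lower, upper] lead to rejecting H0.
    lower = p0 - z_crit * se_null
    upper = p0 + z_crit * se_null
    # Sampling distribution of the sample proportion under the assumed true value.
    sampling_dist = NormalDist(mu=p_true, sigma=sqrt(p_true * (1 - p_true) / n))
    return sampling_dist.cdf(lower) + (1 - sampling_dist.cdf(upper))


for n in (500, 2000):
    power = approximate_power(p_true=0.86, p0=0.84, n=n)
    print(f"n = {n}: P(reject H0) = {power:.2f}, P(type II error) = {1 - power:.2f}")
```

Under these assumptions, the probability of rejecting the null hypothesis rises substantially when the sample size goes from 500 to 2,000, so the probability of a type II error falls by the same amount.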
Increasing the sample size can have the unintended effect of making the test sensitive to differences so small they don’t matter. A statistically significant difference is one large enough that it is unlikely to be due to sampling variability alone. Even a difference so small that it is not important can be statistically significant if the sample size is big enough.
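A short calculation makes this concrete. In the sketch below, the numbers are hypothetical: we suppose the sample proportion is 0.841 when the null value is 0.84, a difference of one tenth of a percentage point, and compare a sample of 1,000 with a sample of 1,000,000. The helper name `two_tailed_p_value` is ours.

```python
from math import sqrt
from statistics import NormalDist


def two_tailed_p_value(p_hat, p0, n):
    """Two-tailed P-value for H0: p = p0 based on the z-score of p_hat."""
    z = (p_hat - p0) / sqrt(p0 * (1 - p0) / n)
    return 2 * (1 - NormalDist().cdf(abs(z)))


# Same tiny difference (0.841 vs. 0.84), very different sample sizes.
for n in (1_000, 1_000_000):
    print(f"n = {n}: P-value = {two_tailed_p_value(0.841, 0.84, n):.4f}")
```

With 1,000 observations the P-value is large and the difference is nowhere near significant. With 1,000,000 observations the P-value falls below 0.01, even though a shift from 84% to 84.1% may have no practical importance.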
Finally, remember the phrase “garbage in, garbage out.” If the data collection methods are poor, then the results of a hypothesis test are meaningless. No statistical methods can create useful information if our data comes from convenience or voluntary response samples. Additionally, the results of a hypothesis test apply only to the population from which the sample was chosen.
Candela Citations
- Concepts in Statistics. Provided by: Open Learning Initiative. Located at: http://oli.cmu.edu. License: CC BY: Attribution