Sampling Bias: Types, Examples & How to Avoid It
By Julia Simkus, BA (Hons) Psychology, Princeton University. Edited by Saul McLeod, PhD, and Olivia Guy-Evans, MSc.
Sampling bias occurs when a sample does not accurately represent the population being studied. This can happen when there are systematic errors in the sampling process, leading to over-representation or under-representation of certain groups within the sample.
Sampling bias produces a biased sample of a population in which not all individuals were equally likely to have been selected, so the sample does not accurately represent the entire group.
Sampling bias compromises the external validity of findings by failing to accurately represent the population, restricting the generalization of results only to groups that share characteristics with the sample.
In medical fields, sampling bias is known as ascertainment bias, where one category of participants is over-represented in the sample.
Sampling bias is problematic because it leaves important segments of the population out of the research data, threatening external validity. The results of research conducted with a biased sample are misleading because they exclude valuable data.

This limits the generalizability of your findings, because results from biased samples can only be generalized to populations that share characteristics with the sample. The results therefore cannot be taken to represent the views or experiences of the wider population.
When there is sampling bias in your study, differences between the samples from a population and the entire population they represent are not due to chance but rather due to this bias.
Correcting or reducing sampling bias is important during the research because the population will not be accurately represented if the sample bias is not addressed.
It is important to note that sampling bias occurs during data collection and refers to the method of sampling, not the sample itself. Additionally, sampling bias often happens without the researcher’s knowledge.
Imagine you want to study the prevalence of depression amongst undergraduate students at your university. You send out an email to the undergraduate student body asking for volunteers to participate in your study.
This method will lead to sampling bias because only the people who are open to talking about their depression will sign up to participate.
This is an example of voluntary response bias because only those individuals who are willing to talk about their experiences with depression will agree to take part in a study, making the participants a non-representative sample.
Undercoverage Bias
Undercoverage bias occurs when some population members are inadequately represented in the sample.
For example, administering a survey online will exclude groups with limited internet access, such as the elderly and those in lower-income households.
Voluntary Response Bias / Self-Selection Bias
Self-selection bias is a type of bias that occurs when participants can choose whether or not to participate in the project.
Bias arises because people with specific characteristics might be more likely to agree to participate in a study than others, making the participants a non-representative sample.
For example, people with strong opinions or substantial knowledge about a specific topic may be more willing to spend time answering a survey than those without.
Survivorship Bias
Survivorship bias refers to when researchers focus on individuals, groups, or observations that have passed some sort of selection process while ignoring those who did not.
In other words, only “surviving” subjects are selected. For example, in finance, failed companies tend to be excluded from performance studies because they no longer exist.
This causes the results to skew higher because only companies that were successful enough to survive are included.
Non-Response Bias
Non-response bias is a type of bias that arises when people who refuse to participate or drop out of a study systematically differ from those who take part.
For example, if conducting a study on the prevalence of depression in a community, your results may be an underestimation if those with depression are less likely to participate than those without depression.
Recall Bias
Recall bias occurs when some members of your sample cannot remember important details accurately. As a result, they might provide incomplete or incorrect information that can distort your research findings.
This type of bias tends to affect retrospective surveys that rely on self-reported data.
Exclusion Bias
This bias results from intentionally excluding a particular group from the sample. Exclusion bias is closely related to non-response bias.
Observer Bias
Observer bias refers to the tendency of observers not to see what is there, but instead to see what they expect or want to see.
This bias can result in an overestimation or underestimation of what is true and accurate, which compromises the validity of your research findings.
For example, researchers might unintentionally influence participants during interviews by focusing on specific statistics that tend to support the hypothesis instead of those that do not.
A common cause of sampling bias lies in the study’s design or the data collection process, as researchers may favor or disfavor collecting data from certain individuals or under certain conditions.
Sampling bias also tends to arise when researchers adopt sampling strategies based on judgment or convenience.
This type of bias can occur in both probability and non-probability sampling.
In probability sampling, every member of the population has an equal chance of being selected (e.g., using a random number generator to select a random sample from a population). While probability sampling tends to reduce the risk of sampling bias, it typically does not eliminate it completely.
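For illustration, here is a minimal Python sketch of drawing a simple random sample from a sampling frame; the frame and sample size are hypothetical.

```python
import random

# Hypothetical sampling frame: an ID for every member of the target population.
sampling_frame = [f"student_{i:05d}" for i in range(1, 20001)]  # 20,000 students

random.seed(42)  # fixed seed so the draw is reproducible

# Simple random sample: every unit in the frame has an equal chance of selection.
sample = random.sample(sampling_frame, k=500)

print(len(sample), sample[:3])
```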
Extracting random samples typically requires a sampling frame, or a list of units of the whole population from which the sample is drawn. However, using a sampling frame does not necessarily prevent sampling bias. If your sampling frame does not match the population, this can result in a biased sample.
This can happen when a researcher fails to correctly determine the target population or uses outdated and incomplete information, thus excluding sections of the target population.
Or, even when the sampling frame is selected properly, sampling bias can arise from non-responsive sampling units (i.e., if certain classes of subjects are more likely to refuse to participate).
Mismatches between the sampling frame and the target population, as well as non-responses, can result in a biased sample.
In non-probability sampling, samples are selected based on non-random criteria, such as with convenience sampling where participants are selected based on accessibility or availability.
These sampling techniques often result in biased samples because some population members are more or less likely to be included than others.
How to Avoid Sampling Bias
- Use random or stratified sampling → Stratified random sampling will help ensure you get a representative research sample and reduce the interference of irrelevant variables in your systematic investigation.
- Avoid convenience sampling → Rather than collecting data from only easily accessible or available participants, you should gather data from the different subgroups that make up your population of interest.
- Clearly define a target population and a sampling frame → Matching the sampling frame to the target population as much as possible will reduce the risk of sampling bias.
- Follow up on non-responders → When people drop out or fail to respond to your survey, do not ignore them, but rather follow up to determine why they are unresponsive and see if you can garner a response. Additionally, you should keep close tabs on your research participants, and follow up with them frequently to reduce attrition.
- Oversampling → Oversampling can be used to avoid sampling bias in cases where members of the defined population are underrepresented.
- Aim for a large research sample → The larger your sample population, the more likely you are to represent all subgroups from your population of interest.
- Set up quotas for each identified demographic → If you think participant gender, age, ethnicity, or some other demographic characteristic is a potential source of bias within your study, quotas will allow you to evenly sample people from different demographic groups within the study (a brief code sketch of this idea follows this list).
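As a rough sketch of how demographic quotas can work in practice (the age groups and quota sizes below are invented for illustration), recruitment for a group simply stops once its quota is filled:

```python
from collections import defaultdict

# Hypothetical quotas: the number of participants to accept per age group.
quotas = {"18-29": 50, "30-49": 50, "50+": 50}
accepted = defaultdict(list)

def try_enroll(participant_id, age_group):
    """Accept a volunteer only if their demographic quota is not yet full."""
    if len(accepted[age_group]) < quotas.get(age_group, 0):
        accepted[age_group].append(participant_id)
        return True
    return False  # quota already filled, so this group cannot become over-represented

# As volunteers arrive, enrollment stays balanced across the three groups.
print(try_enroll("p001", "18-29"))  # True while the 18-29 quota still has space
```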
What is the difference between sampling bias and sampling error?
Sampling error is the difference between a sample statistic and the true population value that arises by chance, and it occurs even when a sample is drawn at random. Sampling bias, by contrast, is a systematic error in how the sample is selected; a biased sample adds a systematic distortion on top of ordinary sampling error, and unlike sampling error it does not shrink as the sample gets larger.
What is the difference between sampling bias and response bias?
Sampling bias occurs when some members of a population are systematically more likely to be selected in a sample than others and thus the sample does not accurately represent the entire group.
Response bias is a general term that refers to a wide range of conditions or factors that can lead participants to respond inaccurately or falsely to questions.
For example, there could be something about how the actual survey questionnaire is constructed that encourages a certain type of answer, leading to measurement error.
Which type of sampling is most at risk for sampling bias?
Non-probability sampling, specifically convenience sampling, is most at risk for sampling bias because with this type of sampling, some members of the population are more likely to be included than others.
Does sampling bias affect reliability?
Yes, sampling bias distorts the research findings and leads to unreliable outcomes. It also is a threat to external validity because the results from a biased sample may not generalize to the population.
Why is it important to avoid sampling bias in research?
It is important to avoid sampling bias in research because otherwise, the population of interest will not be accurately represented. If the sample bias is not addressed then, your research loses its credibility.
Is probability sampling biased?
While probability sampling can significantly reduce sampling bias by giving every member of the population an equal chance of being included in the research, this method can still result in a biased sample if your sampling frame does not match the population of interest.
Can sampling error be calculated?
Yes. Sampling error (often reported as the margin of error) is calculated by dividing the standard deviation of the population by the square root of the sample size, then multiplying the result by the critical value (z-score) associated with the chosen confidence level.

Here’s the formula for calculating sampling error:

Sampling error = z-score for the confidence level × (standard deviation of population / √sample size)
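As a worked illustration of this formula (the population standard deviation, sample size, and 95% confidence level below are assumed values, not figures from any study):

```python
import math

z = 1.96        # critical value (z-score) for a 95% confidence level
sigma = 15.0    # assumed population standard deviation
n = 400         # assumed sample size

sampling_error = z * (sigma / math.sqrt(n))
print(round(sampling_error, 2))  # 1.47
```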
Further Reading
Hamill, R., Wilson, T. D., & Nisbett, R. E. (1980). Insensitivity to sample bias: Generalizing from atypical cases. Journal of Personality and Social Psychology, 39(4), 578.
Nielsen, M., Haun, D., Kärtner, J., & Legare, C. H. (2017). The persistent sampling bias in developmental psychology: A call to action. Journal of Experimental Child Psychology, 162, 31-38.
16 Selection Bias Examples
By Chris Drew, PhD
Selection bias occurs when the sample being studied is not representative of the population from which the sample was drawn, leading to skewed or misleading results (Walliman, 2021).
In these situations, the sample under study deviates from a fair, random, and equitable selection process. This influences the outcomes and interpretations of a research study.
A common situation where selection bias affects results is in electoral polling. If the sample that the pollster interviews skews older than the general population, or has a disproportionate amount of men or women compared to the general population, then the data will be wrong. As a result, we might get a shock on election day!
Selection Bias Examples
1. Sampling Bias
Sampling bias occurs when a researcher selects sampling methods that aren’t representative of the entire population, thereby introducing bias in the representation (Atkinson et al., 2021).
A common example is convenience sampling, where individuals are chosen based on their proximity or accessibility, rather than considering the characteristics of the larger population.
In behavioral science, for example, reliance on undergraduate students as subjects limits the application of many theories to different age groups (Liamputtong, 2020). This is not a representation of the wider population, hence introducing sampling bias.
This form of bias affects the generalizability and external validity of the results (Busetto et al., 2020). Therefore it is crucial to balance representativeness and accessibility while designing the sample strategy (Walliman, 2021).
2. Self-selection Bias
Self-selection bias arises when participants are given the choice to participate in a study and the ones who opt in or out are systematically different from the others (Suter, 2011).
The study findings may not accurately represent the entire population, as those who self-selected may have specific characteristics or behaviors influencing the research outcome (Creswell, 2013). For instance, individuals who agree to be part of a weight loss study might already be motivated to lose weight.
This bias skews the resulting data, which then lack inferential power for the broader population (Bryman, 2015). Hence, while presenting an opportunity for participant autonomy, self-selection bias calls for cautious interpretation of data.
3. Exclusion Bias
Exclusion bias refers to the systematic exclusion of certain individuals from the sample.
It could be due to the specific criteria defined in the study design or the involuntary or intentional exclusion of groups resulting from the recruitment strategy (Walliman, 2021). For example, a study on work productivity excluding night-shift workers will have an exclusion bias.
This form of bias threatens the internal validity of the study as it implies a differential selection of subjects into study groups (Atkinson et al., 2021). Thus, researchers should ensure their selection criteria do not create an undue bias.
4. Berkson’s bias
Berkson’s Bias, named after American statistician Joseph Berkson, is a form of selection bias seen commonly in medical research.
This bias takes place when the selection of subjects into a study is related to their exposure and outcome (Barker et al., 2016). For example, if a study is conducted in a hospital setting, it would more likely attract people who are ill and seeking treatment than healthy individuals, leading to an overrepresentation of one group.
Because this bias undermines the generalizability of the results, ensuring a diverse sample becomes integral (Liamputtong, 2020).
5. Undercoverage Bias
Undercoverage bias happens when some groups of the population are inadequately represented in the sample (Creswell, 2013).
Similar to exclusion bias, it can emerge because the researchers did not reach out to certain groups, or those groups could not respond due to barriers (Bryman, 2015). An example would be a telephone survey that only includes landline numbers, thereby excluding a huge sector of the population, mainly the tech-savvy younger generation.
Acknowledging and factoring in such underrepresentation ensures a more accurate result (Suter, 2011).
6. Cherry Picking
Cherry-picking is a type of selection bias, which involves selectively presenting, emphasizing, or excluding data that support a particular conclusion while neglecting significant data that may contradict it (Bryman, 2015).
It can lead to inaccurate or misleading findings because the research results have been skewed deliberately. An example could be climate change deniers who selectively focus on particular periods to argue that global warming isn’t happening or isn’t serious.
Researchers must be explicit about their selection process and should refrain from selectively highlighting or suppressing data (Walliman, 2021).
7. Survivorship Bias
Survivorship bias is a bias that can occur when the focus is solely on the subjects that “survived” or succeeded, dismissing those that failed or dropped out (Atkinson et al., 2021).
This can clearly skew the results as important factors contributing to failure or dropout might be overlooked. An example is in entrepreneurship where stories of successful founders are commonly told while ignoring the much larger number of entrepreneurs who failed.
To avoid this bias, researchers need to consider the whole spectrum of outcomes (Busetto, Wick, & Gumbinger, 2020).
Read More: Survivorship Bias Examples
8. Time Interval Bias
Time Interval Bias arises when the time between measurements or observations is inconsistent, thereby inflating or reducing the observed effects or associations in a study (Atkinson et al., 2021).
The choice of time intervals is vital, and different intervals can lead to different results. For example, tracking a group of patients’ recovery weekly might lead to a different outcome than if the analysis was done monthly.
While varying the intervals might capture more detail, it may also lead to an overestimation of some results (Suter, 2011).
Researchers need to carefully consider the most accurate and reasonable intervals to mitigate such bias (Bryman, 2015). Therefore, understanding the implication of time intervals is critical for truthful representation and valid interpretation of data (Walliman, 2021).
9. Attrition Bias
Attrition Bias, also known as dropout bias, comes into play when participants exit a study before its completion, leading to a skewness in the final results (Bryman, 2015).
This departure can be associated with certain characteristics or responses to the study, thus altering the distribution of variables within the remaining sample (Walliman, 2021). An example would be participants dropping out of a drug efficacy study due to intense side effects.
If many of these participants belonged to the group that received the new drug, the remaining participants would likely show results biased in favor of the new drug’s efficacy (Atkinson et al., 2021).
To control for attrition bias, strategies such as bolstering participant engagement and using intention-to-treat analysis should be considered (Suter, 2011). Therefore, attention to withdrawal reasons and early identification of potential dropout factors are critical aspects of research design and execution (Creswell, 2013).
10. Non-response Bias
Non-response bias arises when the characteristics of those who choose to participate in a study differ significantly from those who do not respond.
For instance, in a survey about personal health habits, individuals with poor health habits may be less likely to respond. Hence, the data would underestimate the prevalence of poor health habits in the population (Walliman, 2021).
To mitigate this bias, researchers can adopt strategies such as contacting non-responders repeatedly or offering incentives to improve response rates (Barker, Pistrang, & Elliott, 2016).
11. Volunteer Bias
Volunteer bias transpires when individuals who volunteer for a research study are fundamentally different from the ones who decline to participate (Atkinson et al., 2021).
Their eagerness to participate is often reflective of strong opinions or experiences related to the research topic. It creates a skewed representation as the volunteers may be more educated, affluent, or health-conscious than the broader population (Creswell, 2013). For instance, in a study regarding alcohol consumption patterns, non-drinkers or moderate drinkers may be less inclined to respond.
Therefore, caution must be exercised when making inferences from volunteer-based data, as volunteers’ experiences may not be representative of the wider population (Suter, 2011). Purposive sampling strategies can be employed to ensure a more balanced representation (Bryman, 2015).
12. Healthy User Effect
The healthy user effect, or a health-conscious bias, arises when participants who voluntarily engage in health behavior or treatment studies are generally healthier, more educated, and compliant than the average populace (Walliman, 2021).
This participation can cause an overestimation of the benefits of the health behavior or treatment being studied (Atkinson et al., 2021). A classic example would be a study of the impacts of a healthy diet, where individuals already conscious about their food choices, are more likely to participate (Bryman, 2015).
Such selective participation skews the outcomes towards favorable results (Liamputtong, 2020). So it’s paramount that researchers control for health consciousness in their analysis to ensure the effects being studied are indeed due to the intervention and not related to healthier behaviors (Barker et al., 2016).
13. Exposure Bias
Exposure bias operates when there are inconsistencies or errors in measuring an individual’s exposure to a certain factor or condition in a research study (Suter, 2011).
This might occur when a study measures participants’ sun exposure levels without considering their sunscreen usage, leading to an overestimation of sun exposure and its effects (Bryman, 2015).
Such flawed measurement can consequently undermine the validity and reliability of the research findings (Walliman, 2021). As a result, it’s crucial to consider and control for confounding variables that might affect exposure levels (Barker et al., 2016).
Importantly, employing consistent and objective methods of measurement helps to minimize exposure bias (Liamputtong, 2020).
14. Location Bias
Location bias is a sample distortion that emerges when the setting for data collection influences the research results, making them unrepresentative of the wider population (Atkinson et al., 2021).
If a study on physical fitness is conducted solely in a gym, the results would most likely present a fitness level higher than the general population (Suter, 2011). This location-specific data might falsely represent the overall fitness levels because a gym environment already attracts more physically active people (Creswell, 2013).
To avoid this bias, researchers should aim to diversify the settings for data collection, ensuring they are reflective of various environments where the target population might be found (Liamputtong, 2020). Therefore, an understanding of the potential influence of the study location is crucial to reduce location bias (Bryman, 2015).
15. Referral Bias
Referral bias appears in studies when the sampled population has been specifically referred from another source, creating potential unrepresentativeness (Barker et al., 2016).
This type of bias is common in healthcare research, whereby patients referred for specialized care are investigated (Walliman, 2021). Forwarding these patients into a study could misrepresent the condition’s severity as they are already pre-selected based on their need for specialized care (Creswell, 2013).
Consequently, the outcomes of such studies could overestimate disease severity or the effectiveness of specialized treatment (Atkinson et al., 2021). Thus, understanding and considering referral patterns and their implications is a crucial step in mitigating referral bias in research (Suter, 2011).
16. Pre-screening of Subjects
Pre-screening of subjects happens when researchers follow a vetting process to determine whether potential participants are suitable for a study (Walliman, 2021).
This process could inadvertently exclude certain individuals or groups, leading to a biased, non-representative sample (Atkinson et al., 2021).
An example of pre-screening bias is when a study on heart diseases excludes individuals with a history of hypertension. As a result, it could potentially understate the severity of heart conditions as it does not account for such overlapping conditions (Bryman, 2015).
Thus, careful balancing must be undertaken during pre-screening to ensure the sample reflects the wider research context whilst adhering to study-specific needs (Creswell, 2013). Importantly, the implications of pre-screening should be acknowledged in any resulting data interpretations (Liamputtong, 2020).
What’s Wrong with Selection Bias?
Selection bias can and does skew results. This is an overarching issue in both qualitative and quantitative research, as biases may emerge from the chosen selection methods, either intentionally or unintentionally (Busetto, Wick, & Gumbinger, 2020).
Diverse factors such as geography, socioeconomic status, or personal preferences can influence participant choice and thereby introduce bias.
Selection bias ultimately reduces both external and internal validity:
- External validity is compromised as the biased sample is not representative of the larger population, making it hard to generalize the findings (see: threats to external validity).
- Internal validity is compromised because the bias introduces additional variables, making it challenging to confirm whether the observed effect is due to the experiment itself or the bias (see: threats to internal validity).
Overall, selection bias contravenes scientific research principles because it potentially leads to inaccurate findings and breaks the trust between the researcher and the public or scientific community.
Combatting Selection Bias: Specialized Methodologies
Addressing selection bias is vital for maintaining the integrity of research outcomes. By combining careful planning, methodological rigor, statistical expertise, and transparency , significant strides can be made in reducing this type of bias (Walliman, 2021).
Specifically, here are four techniques:
1. Stratified Sampling
Stratified sampling is a method in which the larger population is first divided into distinct, non-overlapping subgroups or “strata” based on specific characteristics or variables (Atkinson et al., 2021).
These could be attributes like age range, geographic location, or socio-economic groups. The next step is to randomly select samples from each stratum.
The benefit of the stratified sampling technique is that it yields a sample that is more representative of the diversity in the population (Bryman, 2015). So, instead of treating the population as homogeneous, it respects the heterogeneity and reduces the risk of under-representation.
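A minimal pandas sketch of proportionate stratified random sampling (the population frame, stratum labels, and 10% sampling fraction are hypothetical):

```python
import pandas as pd

# Hypothetical population frame with an age-group stratum for each person.
population = pd.DataFrame({
    "person_id": range(1, 1001),
    "age_group": ["18-29", "30-49", "50+"] * 333 + ["18-29"],
})

# Draw 10% from within each stratum so every subgroup appears in the sample
# in proportion to its share of the population.
stratified_sample = (
    population
    .groupby("age_group", group_keys=False)
    .sample(frac=0.10, random_state=42)
)

print(stratified_sample["age_group"].value_counts())
```

Because the sampling fraction is applied within each stratum, no subgroup can be left out or swamped by the others.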
2. Randomization
Randomization is the process of assigning individuals to groups randomly within a study. It ensures that each participant has an equal chance of being assigned to any group, thereby minimizing the risk of selection bias (Creswell, 2013).
Importantly, it also helps to distribute the features of participants evenly across groups. As the distribution is random, differences in outcome can be more confidently attributed to differing interventions rather than underlying differences in the groups.
Its key strength is that it supports causal inferences by balancing both known and unknown confounds (Suter, 2011).
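A simple sketch of random assignment (the participant IDs and the two-group design are made up for illustration):

```python
import random

participants = [f"P{i:03d}" for i in range(1, 101)]  # 100 hypothetical participants
random.seed(7)
random.shuffle(participants)

# Split the shuffled list in half: each person had an equal chance
# of ending up in either group, so group differences are not systematic.
treatment_group = participants[:50]
control_group = participants[50:]

print(len(treatment_group), len(control_group))
```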
3. Propensity Score Matching
Propensity score matching (PSM) is a statistical method that attempts to estimate the effect of an intervention, treatment, or policy by accounting for the covariates that predict receiving the treatment (Busetto et al., 2020).
Essentially, it matches individuals in the treated group with individuals in the control group with similar “propensity scores” or predicted probabilities of receiving the treatment.
By balancing the observed characteristics between treated and control groups in this manner, PSM helps to mimic a randomized controlled trial and minimize selection bias in non-experimental studies (Barker et al., 2016).
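A compressed sketch of the propensity-score-matching idea using scikit-learn on simulated data (the covariates, sample size, and nearest-neighbor matching rule are illustrative assumptions; a real analysis would also check covariate balance after matching):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# Simulated observational data: two covariates, a treatment flag, and an outcome.
X = rng.normal(size=(500, 2))                        # covariates (e.g., age, severity)
treated = X[:, 0] + rng.normal(size=500) > 0         # treatment depends on covariates
y = 2.0 * treated + X[:, 1] + rng.normal(size=500)   # true treatment effect = 2.0

# Step 1: estimate propensity scores, P(treated | covariates).
propensity = LogisticRegression().fit(X, treated).predict_proba(X)[:, 1]

# Step 2: match each treated unit to the untreated unit with the closest score.
treated_idx = np.where(treated)[0]
control_idx = np.where(~treated)[0]
matched = [control_idx[np.argmin(np.abs(propensity[control_idx] - propensity[i]))]
           for i in treated_idx]

# Step 3: compare outcomes across the matched pairs.
effect = (y[treated_idx] - y[np.array(matched)]).mean()
print(f"Estimated effect of treatment on the treated: {effect:.2f}")
```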
4. Instrumental Variable Methods
Instrumental variable (IV) methods are used in situations where random assignment is not feasible, and there’s potential for uncontrolled confounding (Atkinson et al., 2021).
An instrument is a variable that affects the treatment status but does not independently affect the outcome, except through its effect on treatment status (Walliman, 2021).
The goal of IV methods is to remove bias in the estimated treatment effects by isolating the variability in treatment that is not due to confounding (Bryman, 2015). It’s a powerful tool addressing selection bias in observational studies, but finding a valid instrument can be challenging (Liamputtong, 2020).
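A bare-bones two-stage least squares (2SLS) sketch on simulated data, assuming a valid instrument z that shifts treatment but affects the outcome only through treatment:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 2000

u = rng.normal(size=n)                  # unobserved confounder
z = rng.normal(size=n)                  # instrument: moves treatment, not the outcome
treatment = 0.8 * z + u + rng.normal(size=n)
outcome = 1.5 * treatment + 2.0 * u + rng.normal(size=n)   # true effect = 1.5

# Stage 1: regress treatment on the instrument, keep the fitted values.
Z = np.column_stack([np.ones(n), z])
t_hat = Z @ np.linalg.lstsq(Z, treatment, rcond=None)[0]

# Stage 2: regress the outcome on the fitted (confounder-free) treatment.
T_hat = np.column_stack([np.ones(n), t_hat])
iv_estimate = np.linalg.lstsq(T_hat, outcome, rcond=None)[0][1]

# Naive OLS for comparison is biased upward by the shared confounder u.
T = np.column_stack([np.ones(n), treatment])
ols_estimate = np.linalg.lstsq(T, outcome, rcond=None)[0][1]

print(f"IV estimate ~ {iv_estimate:.2f}, naive OLS ~ {ols_estimate:.2f}")
```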
Overcoming selection bias requires meticulous planning, proper sample selection, and unbiased data analysis (Bryman, 2015). Responsible research and commitment to ethical guidelines will significantly reduce cases of selection bias.
To conclude, selection bias, as emphasized by Liamputtong (2020), is one of the significant forms of bias in research. Its influence can markedly distort research outcomes, and considerable efforts must be made to identify, control, and mitigate its impact on research findings.
Atkinson, P., Delamont, S., Cernat, A., Sakshaug, J., & Williams, R. A. (2021). SAGE research methods foundations. London: Sage Publications.
Barker, C., Pistrang, N., & Elliott, R. (2016). Research methods in clinical psychology: An introduction for students and practitioners. London: John Wiley & Sons.
Bryman, A. (2015). The SAGE handbook of qualitative research. London: Sage Publications.
Busetto, L., Wick, W., & Gumbinger, C. (2020). How to use and assess qualitative research methods. Neurological Research and Practice, 2, 1-10.
Creswell, J. W. (2013). Research design: Qualitative, quantitative and mixed methods approaches. New York: Sage Publications.
Liamputtong, P. (2020). Qualitative research methods. New York: Sage.
Suter, W. N. (2011). Introduction to educational research: A critical thinking approach. London: SAGE Publications.
Walliman, N. (2021). Research methods: The basics. Los Angeles: Routledge.
Study Bias
Aleksandar Popovic; Martin R. Huecker
StatPearls [Internet]. Treasure Island (FL): StatPearls Publishing. Last update: June 20, 2023.
Definition/Introduction
Bias is colloquially defined as any tendency that limits impartial consideration of a question or issue. In academic research, bias refers to a type of systematic error that can distort measurements and/or affect investigations and their results. [1] It is important to distinguish a systematic error, such as bias, from that of random error. Random error occurs due to the natural fluctuation in the accuracy of any measurement device, the innate differences between humans (both investigators and subjects), and by pure chance. Random errors can occur at any point and are more difficult to control. [2] Systematic errors, referred to as bias from here on, occur at one or multiple points during the research process, including the study design, data collection, statistical analysis, interpretation of results, and publication process. [3]
However, interpreting the presence of bias involves understanding that it is not a dichotomous variable, where the results can either be “present” or “not present.” Rather, it must be understood that bias is always present to some degree due to inherent limitations in research, its design, implementation, and ethical considerations. [4] Therefore, it is instead crucial to evaluate how much bias is present in a study and how the researchers attempted to minimize any sources of bias. [5] When evaluating for bias, it is important to note there are many types with several proposed classification schemes. However, it is easiest to view bias based on the various stages of research studies; the planning and design stage (before), data collection and analysis (during), and interpretation of results and journal submission (after).
Issues of Concern
The planning stage of any study can have bias present in both study design and recruitment of subjects. Ideally, the design of a study should include a well-defined outcome, population of interest, and collection methods before implementation and data collection. The outcome, for example, response rates to a new medication, should be precisely agreed upon. Investigators may focus on changes in laboratory parameters (such as a new statin reducing LDL and total cholesterol levels) or on long-term morbidity and mortality (does the new statin reduce cardiovascular-related deaths?). Similarly, the investigators’ own pre-existing notions or personal beliefs can influence the question being asked and the study’s methodology. [6]
For example, an investigator who works for a pharmaceutical company may address a question or collect data most likely to produce a significant finding supporting the use of the investigational medication. Thus, if possible, the question(s) being asked and the collection methods employed should be agreed upon by multiple team members in an interprofessional setting to reduce potential bias. Ethics committees also play a valuable role here.
Relatedly, the team members designing a study must define their population of interest, also referred to as the study population. Bias occurs if the study population does not closely represent a target population due to errors in study design or implementation, termed selection bias. Sampling bias is one form of selection bias and typically occurs if subjects were selected in a non-random way. It can also occur if the study requires subjects to be placed into cohorts and if those cohorts are significantly different in some way. This can lead to erroneous conclusions and significant findings. Randomization of subject selection and cohort assignment is a technique used in study design intended to reduce sampling bias. [7] [8]
However, bias can occur if subject selection occurred through limited means, such as recruiting subjects through phone landlines, thereby excluding anyone who does not own a landline. Similarly, this can occur if subjects are recruited only through email or a website. This can result in confounding, or the introduction of a third variable that influences both the independent and dependent variables. [9]
For example, if a study recruited subjects from two primary care clinics to compare diabetes screening and treatment rates but did not account for potentially different socioeconomic characteristics of the two clinics, there may be significant differences between groups not due to clinical practice but rather cohort composition.
A subtype of selection bias, admission bias (also referred to as Berkson bias), occurs when the selected study population is derived from patients within hospitals or certain specialty clinics. This group is then compared to a non-hospitalized group. This predisposes to bias as hospitalized patient populations are more likely to be ill and not represent the general population. Furthermore, there are typically other confounding variables or covariates that may skew relationships between the intended dependent and independent variables. [10]
For example, in one study that evaluated the effect of cigarette smoking and its association with bladder cancer, researchers decided to use a hospital-based case-control study design. Normally, there is a strong and well-established relationship between years of cigarette use and the likelihood of developing bladder cancer. In fact, part of screening guidelines for bladder cancer considers the total years that an individual has smoked during patient risk stratification and subsequent evaluation and follow-up. However, in one study, researchers noted no significant relationship between smoking and bladder cancer. Upon re-evaluating, they noted their cases and controls both had significant smoking histories, thereby blurring any relationships. [11]
Admission bias can be reduced by selecting appropriate controls and being cognizant of the potential introduction of this bias in any hospital-based study. If this is not possible to do, researchers must be transparent about this in their work and may try to use different methods of statistical analysis to account for any confounding variables. In an almost opposite fashion, another source of potential error is a phenomenon termed the healthy worker effect. The healthy worker effect refers to the overall improved health and decreased mortality and morbidity rates of those employed relative to the unemployed. This occurs for various reasons, including access to better health care, improved socioeconomic status, the beneficial effects of work itself, and those who are critically ill or disabled are less likely to find employment. [12] [13]
Two other important forms of selection bias are lead-time bias and length time bias. Lead-time bias occurs in the context of disease diagnosis. In general, it occurs when new diagnostic testing allows detection of a disease in an early stage, causing a false appearance of longer lifespan or improved outcomes. [14] An example of this is noted in individuals with schizophrenia with varying durations of untreated psychosis. Those with shorter durations of psychosis relative to longer durations typically had better psychosocial functioning after admission to and treatment within a hospital. However, upon further analysis, it was found that it was not the duration of psychosis that affected psychosocial functioning. Rather, the duration of psychosis was indicative of the stage of the person’s disease, and those individuals with shorter durations of psychosis were in an earlier stage of their disease. [15]
Length time bias is similar to lead-time bias; however, it refers to the overestimation of an individual’s survival time due to a large number of cases that are asymptomatic and slowly progressing with a smaller number of cases that are rapidly progressive and symptomatic. An example can be noted in patients with hepatocellular carcinoma (HCC). Those who have HCC found via asymptomatic screening typically had a tumor doubling time of 100 days. In contrast, those individuals who had HCC uncovered due to symptomatic presentation had a tumor doubling time of 42 days on average. However, overall outcomes were the same amongst these two groups. [16]
The effects of both lead-time and length time bias must be taken into account by investigators. For lead-time bias, investigators can instead look at changes in the overall mortality rate due to disease. One method involves creating a modified survival curve that accounts for possible lead-time bias under the new diagnostic or screening protocols. [17] This involves estimating the lead time and subtracting it from the observed survival time. Unfortunately, the consequences of length time bias are difficult to mitigate, but investigators can minimize their effects by keeping individuals in their original groups based on screening protocols (intention-to-screen), regardless of whether an individual required earlier diagnostic workup due to symptoms.
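In its simplest form, the lead-time adjustment described above is a subtraction of the estimated lead time from each observed survival time; the numbers below are invented purely for illustration:

```python
# Observed survival times (years) for patients diagnosed via the new screening test.
observed_survival = [6.1, 4.8, 7.3, 5.5, 6.9]

# Estimated lead time (years): how much earlier screening detects the disease.
estimated_lead_time = 1.5

# Adjusted survival removes the head start created by earlier detection,
# so screened and unscreened groups can be compared more fairly.
adjusted_survival = [t - estimated_lead_time for t in observed_survival]

print(adjusted_survival)
```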
Channeling and procedure bias are other forms of selection bias that can be encountered and addressed during the planning stage of a study. Channeling bias is a type of selection bias noted in observational studies. It occurs most frequently when patient characteristics, such as age or severity of illness, affect cohort assignment. This can occur, for example, in surgical studies where different interventions carry different levels of risk. Surgical procedures may be more likely to be carried out on patients with lower levels of periprocedural risk who would likely tolerate the event, whereas non-surgical interventions may be reserved for patients with higher levels of risk who would not be suitable for a lengthy procedure under general anesthesia. [18] As a result, channeling bias results in an imbalance of covariates between cohorts. This is particularly important when the surgical and non-surgical interventions have significant differences in outcome, making it difficult to ascertain if the difference is due to different interventions or covariate imbalance. Channeling bias can be accounted for through the use of propensity score analysis. [19]
Propensity scores are the probability of receiving one intervention over another based on an individual's observed covariates. These scores are obtained through a variety of different methods and then accounted for in the analysis stage via statistical methods, such as logistic regression. In addition to channeling bias, procedure bias (administration bias) is a similar form of selection bias, where two cohorts receive different levels of treatment or are administered similar treatments or interviews in different formats. An example of the former would be two cohorts of patients with ACL injuries. One cohort received strictly supervised physical therapy 3 times per week, and the other cohort was taught the exercises but instructed to do them at home on their own. An example of the latter would be administering a questionnaire regarding eating disorder symptoms. One group was asked in-person in an interview format, and the other group was allowed to take the questionnaire at home in an anonymous format. [20]
Either form of procedure bias can lead to significant differences observed between groups that might not exist where they are treated the same. Therefore, both procedure and channeling bias must be considered before data collection, particularly in observational or retrospective studies, to reduce or eliminate erroneous conclusions that are derived from the study design itself and not from treatment protocols.
Bias in Data Collection & Analysis
There are also a variety of forms of bias present during data collection and analysis. One type is observer bias, which refers to any systematic difference between true and recorded values due to variation in the individual observer. This form of bias is particularly notable in studies that require investigators to record measurements or exposures, particularly if there is an element of subjectiveness present, such as evaluating the extent or color of a rash. [21] However, this has even been noted in the measurement of subjects’ blood pressures when using sphygmomanometers, where investigators may round up or down depending on their preconceived notions about the subject. Observer bias is more likely when the observer is aware of the subject’s treatment status or assignment cohort. This is related to confirmation bias, which refers to a tendency to search for or interpret information to support a pre-existing belief. [22]
In one prominent example, physicians were asked to estimate blood loss and amniotic fluid volume in pregnant patients currently in labor. By providing additional information in the form of blood pressures (hypotensive or normotensive) to the physicians, they were more likely to overestimate blood loss and underestimate amniotic fluid volume when told the patient was hypotensive. [23] Similar findings are noted in fields such as medicine, health sciences, and social sciences, illustrating the strong and misdirecting influence of confirmation bias on the results found in certain studies. [22] [24]
Investigators and data collectors need to be trained to collect data in a uniform, empirical fashion and be conscious of their own beliefs to minimize measurement variability. There should be standardization of data collection to reduce inter-observer variance. This may include training all investigators or analysts to follow a standardized protocol, use standardized devices or measurement tools, or use validated questionnaires. [21] [25]
Furthermore, the decision of whether to blind the investigators and analysts should also be made. If implemented, blinding of the investigators can reduce observer bias, which refers to the differential assessment of an outcome when subjective criteria are being assessed. Confirmation bias within investigators and data collectors can be minimized if they are informed of its potential interfering role. Furthermore, overconfidence in either the overall study’s results or the collection of accurate data from subjects can be a strong source of confirmation bias. Challenging overconfidence and encouraging multiple viewpoints is another mechanism by which to counter this within investigators. Lastly, potential funding sources or other conflicts of interest can influence confirmation and observer bias and must be considered when evaluating for these potential sources of systematic error. [26] [27]

However, subjects themselves may change their behavior, consciously or unconsciously, in response to their awareness of being observed or being assigned to a treatment group, a phenomenon termed the Hawthorne effect. [28] The Hawthorne effect can be minimized, although not eliminated, by reducing or hiding the observation of the subject if possible.

A similar phenomenon is noted with self-selection bias, which occurs when individuals sort themselves into groups or choose to enroll in studies based on pre-existing factors. For example, a study evaluating the effectiveness of a popular weight loss program that allows participants to self-enroll may have significant differences between groups. In circumstances such as this, it is more probable that individuals who experienced greater success (measured in terms of weight lost) are likely to enroll. Meanwhile, those who did not lose weight and/or gained weight would likely not enroll. Similar issues plague other studies that rely on subject self-enrollment. [20] [29]
Self-selection bias is often found in tandem with response bias, which refers to subjects inaccurately answering questions due to various influences. [30] This can be due to question wording, the social desirability of a certain answer, the sensitivity of a question, the order of questions, and even the survey format, such as in-person, via telephone, or online. [22] [31] [32] [33] [34] There are methods of reducing the impact of all these factors, such as the use of anonymity in surveys, the use of specialized questioning techniques to reduce the impact of wording, and even the use of nominative techniques where individuals are asked about the behavior of close friends for certain types of questions. [35]

Non-response bias refers to significant differences between individuals who respond and those who do not respond to a survey or questionnaire. It is not to be confused with the opposite of response bias. It is particularly problematic because errors can arise when estimating population characteristics due to the missing data from non-responders. It is often noted in health surveys regarding alcohol, tobacco, or drug use, though it has been seen in many other topics targeted by surveys. [36] [37] Furthermore, particularly in surveys designed to evaluate satisfaction after an intervention or treatment, individuals are much more likely to respond if they felt highly satisfied relative to the average individual. While highly dissatisfied individuals were also more likely to respond relative to average, they were less likely to respond relative to highly satisfied individuals, thus potentially skewing results toward respondents with positive viewpoints. This can be noted in product reviews or restaurant evaluations.
Several preventative steps can be taken during study design or data collection to mitigate the effects of non-response bias. Ideally, surveys should be as short and accessible as possible, and potential participants should be involved in question design. Additionally, incentives can be provided for participation if necessary. Lastly, if necessary, surveys can be made mandatory as opposed to voluntary. For example, this could occur if school-age children were initially sent a survey via mail to their homes to complete voluntarily, but this was later changed to a survey required to be completed and handed in at school on an anonymous basis. [38] [39]
Similar to the Hawthorne effect and self-selection bias, recall bias is another potential source of systematic error stemming from the subjects of a particular study. Recall bias is any error due to differences in an individual’s recollections and what truly transpired. Recall bias is particularly prevalent in retrospective studies that use questionnaires, surveys, and/or interviews. [40]
For example, in a retrospective study evaluating the prevalence of cigarette smoking in individuals diagnosed with lung cancer vs. those without, those with lung cancer may be more likely to overestimate their use of tobacco meanwhile those without may underestimate their use. Fortunately, the impact of recall bias can be minimized by decreasing the time interval between an outcome (lung cancer) and exposure (tobacco use). The rationale for this is that individuals are more likely to be accurate when the time period assessed is of shorter duration. Other methods that can be used would be to corroborate the individual’s subjective assessments with medical records or other objective measures whenever possible. [41]
Lastly, in addition to the data collectors and the subjects, bias and subsequent systematic error can be introduced through data analysis, especially if conducted in a manner that gives preference to certain conclusions. There can be blatant data fabrication where non-existing data is reported. However, researchers are more likely to perform multiple tests with pair-wise comparisons, termed “p-hacking.” [42] This typically involves analysis of subgroups or multiple endpoints to obtain statistically significant findings, even if these findings were unrelated to the original hypothesis. P-hacking also occurs when investigators perform data analysis partway through data collection to determine whether it is worth continuing. [43] It also occurs when covariates are excluded, when outliers are included or dropped without mention, or when treatment groups are split, combined, or otherwise modified in ways not specified in the original research design. [44] [45]
Ideally, researchers should list all variables explored and all associated findings. If any observations (outliers) are eliminated, this should be reported, along with an explanation of why they were eliminated and how their elimination affected the data.
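A small simulation illustrates why such unplanned multiple comparisons are problematic: even when no true effect exists, testing enough subgroups will usually turn up at least one “significant” result by chance (the subgroup count and alpha level below are arbitrary):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)
n_subgroups, alpha = 20, 0.05
false_positives = 0

# Both groups are drawn from the SAME distribution, so any "effect" is pure chance.
for _ in range(n_subgroups):
    a = rng.normal(size=30)
    b = rng.normal(size=30)
    _, p = stats.ttest_ind(a, b)
    if p < alpha:
        false_positives += 1

# With 20 independent tests at alpha = 0.05, the chance of at least one
# spurious "significant" finding is roughly 1 - 0.95**20, or about 64%.
print(f"Spurious significant results: {false_positives} of {n_subgroups}")
```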
Bias in Data Interpretation and Publication
The final stages of any study, the interpretation of data and publication of results, are also susceptible to various types of bias. During data interpretation and the subsequent discussion, researchers must ensure that the proper statistical tests were used and that they were used correctly. Results discussed should be statistically significant, and researchers should avoid building conclusions around results that merely “approach significance.” [46] Bias can also be introduced at this stage if researchers discuss differences that are statistically but not clinically significant, if conclusions about causality are made when the study was purely observational, or if data are extrapolated beyond the range found within the study. [3]
A major form of bias found during the publication stage is appropriately named publication bias. This refers to the submission of either statistically or clinically significant results, excluding other findings. [47] Journals and publishers themselves have been found to favor studies with significant values. However, researchers themselves may, in turn, use methods of data analysis or interpretation (mentioned above) to uncover significant results. Outcome reporting bias is similar, which refers to the submission of statistically significant results only, excluding non-significant ones. These two biases have been found to affect the results of systematic analyses and even affect the clinical management of patients. [48] However, publication and outcome reporting bias can be prevented in certain cases. Any prospective trials are typically required to be registered before study commencement, meaning that all results, whether significant or not, will be visible. Furthermore, electronic registration and archiving of findings can also help reduce publication bias. [49]
Clinical Significance
Understanding basic aspects of study bias and related concepts will aid clinicians in practicing and improving evidence-based medicine. Study bias can be a major factor that detracts from the external validity of a study or the generalizability of findings to other populations or settings. [50] Clinicians who possess a strong understanding of the various biases that can plague studies will be better able to determine the external validity and, therefore, clinical applicability of a study's findings. [51] [52]
The replicability of a study with similar findings is a strong factor in determining its external validity and generalizability to the clinical setting. Whenever possible, clinicians should arm themselves with knowledge from multiple studies or systematic reviews on a topic, as opposed to relying on a single study. [53] Systematic reviews apply strategies that limit bias through the systematic assembly, appraisal, and synthesis of the relevant studies on a topic. [54]
With a critical, investigational point of view, a willingness to evaluate contrary sources, and the use of systematic reviews, clinicians can better identify sources of bias. In doing so, they can better reduce its impact in their decision-making process and thereby implement a strong form of evidence-based medicine.
Nursing, Allied Health, and Interprofessional Team Interventions
There are numerous sources of bias within the research process, ranging from the design and planning stage, data collection and analysis, interpretation of results, and the publication process. Bias in one or multiple points of this process can skew results and even lead to incorrect conclusions. This, in turn, can cause harmful medical decisions, affecting patients, their families, and the overall healthcare team. Outside of medicine, significant bias can result in erroneous conclusions in academic research, leading to future fruitless studies in the same field. [55]
When combined with the knowledge that most studies are never replicated or verified, this can lead to a deleterious cycle of biased, unverified research leading to more research. This can harm the investigators and institutions partaking in such research and discredit entire fields, even if other investigators had significant work and took extreme care to limit and explain sources of bias.
All research needs to be carried out and reported transparently and honestly. In recent years, important steps have been taken, such as increased awareness of the biases present in the research process and of the manipulation of statistics to generate significant results, as well as the implementation of a clinical trial registry system. However, all stakeholders in the research process, from investigators to data collectors, to the institutions they are a part of, and the journals that review and publish findings, must take extreme care to identify and limit sources of bias and report them transparently.
All interprofessional healthcare team members, including physicians, physician assistants, nurses, pharmacists, and therapists, need to understand the variety of biases present throughout the research process. Such knowledge will separate stronger studies from weaker ones, determine the clinical and real-world applicability of results, and optimize patient care through the appropriate use of data-driven research results considering potential biases. Failure to understand various biases and how they can skew research results can lead to suboptimal and potentially deleterious decision-making and negatively impact both patient and system outcomes.
Disclosure: Aleksandar Popovic declares no relevant financial relationships with ineligible companies.
Disclosure: Martin Huecker declares no relevant financial relationships with ineligible companies.
This book is distributed under the terms of the Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International (CC BY-NC-ND 4.0) ( http://creativecommons.org/licenses/by-nc-nd/4.0/ ), which permits others to distribute the work, provided that the article is not altered or used commercially. You are not required to obtain permission to distribute this article, provided that you credit the author and journal.
Cite this page: Popovic A, Huecker MR. Study Bias. [Updated 2023 Jun 20]. In: StatPearls [Internet]. Treasure Island (FL): StatPearls Publishing; 2024 Jan-.
Sample Selection Bias: Definition, Examples, and How To Avoid
What Is Sample Selection Bias?
Sample selection bias is a type of bias caused by choosing non-random data for statistical analysis. The bias exists due to a flaw in the sample selection process, where a subset of the data is systematically excluded due to a particular attribute. The exclusion of the subset can influence the statistical significance of the test, and it can bias the estimates of parameters of the statistical model.
Key Takeaways
- Sample selection bias in a research study occurs when non-random data is selected for statistical analysis.
- Due to a flaw in the sample selection process, a subset of the data is excluded from the study, thereby impacting or negating the statistical significance of the test.
- There are several types of sample selection bias, including pre-screening bias, self-selection bias, exclusion bias, and observer bias.
- Survivorship bias can lead to false conclusions because it focuses only on those elements, people, or things that have made it past a certain point in the selection process, ignoring those that did not.
- One way to correct sample selection bias is to assign weights to misrepresented subgroups in order to statistically correct the bias.
Understanding Sample Selection Bias
Survivorship bias is a common type of sample selection bias. This type of bias ignores those subjects that did not make it past a certain point in the selection process and only focuses on the subjects that "survived." This can lead to false conclusions.
For example, when backtesting an investment strategy on a large group of stocks, it may be convenient to look for securities that have data for the entire sample period. If we were going to test the strategy against 15 years' worth of stock data, we might be inclined to look for stocks that have complete information for the entire 15-year period.
However, excluding stocks that stopped trading or left the market during that window introduces a bias into the data sample. Because we only include stocks that lasted the full 15 years, our final results would be flawed: the surviving stocks are, by definition, those that performed well enough to stay on the market.
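To make the effect concrete, here is a minimal Python sketch; the tickers, return figures, and listing periods are invented purely for illustration. It compares the average annualized return over the full universe with the average computed only over the stocks that survived the whole window:

```python
# Hypothetical annualized returns (%) for a small stock universe.
# "years_listed" is how long each stock kept trading in the 15-year window.
universe = [
    {"ticker": "AAA", "annual_return": 9.0,   "years_listed": 15},
    {"ticker": "BBB", "annual_return": 7.5,   "years_listed": 15},
    {"ticker": "CCC", "annual_return": -12.0, "years_listed": 4},   # delisted
    {"ticker": "DDD", "annual_return": -6.0,  "years_listed": 9},   # delisted
    {"ticker": "EEE", "annual_return": 11.0,  "years_listed": 15},
]

def average_return(stocks):
    return sum(s["annual_return"] for s in stocks) / len(stocks)

survivors = [s for s in universe if s["years_listed"] == 15]

print(f"All stocks:     {average_return(universe):+.1f}% per year")
print(f"Survivors only: {average_return(survivors):+.1f}% per year")
# The survivor-only average overstates performance because the stocks that
# failed, and dragged returns down, were silently dropped from the sample.
```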
Types of Sample Selection Bias
In addition to survivorship bias, there are several other types of sample selection bias.
Advertising or Pre-Screening Bias
This occurs when the way participants are pre-screened in a study introduces bias. For example, the language researchers use to advertise for participants can itself introduce bias into the study simply by discouraging or encouraging certain groups of people from volunteering to participate.
Self-Selection Bias
Self-selection bias—also known as volunteer response bias—occurs when the study organizers allow participants to self-select or volunteer to participate. The study organizers relinquish control over who participates to those who decide to volunteer. This may lead people with specific characteristics or opinions to volunteer for a study and thus skew the results.
Exclusion and Undercoverage Bias
Exclusion bias occurs when specific members of a population are excluded from participating in a study. Undercoverage bias occurs when study organizers create a study that does not adequately represent some members of the population.
Example of Sample Selection Bias
Hedge fund performance indexes are a well-known example of survivorship bias. Because hedge funds that don’t survive stop reporting their performance to index aggregators, the resulting indices are naturally tilted toward the funds and strategies that remain, hence “survive.” This can be an issue with popular mutual fund reporting services as well. Analysts can adjust to take account of these biases but may introduce new biases in the process.
Observer bias happens when researchers project their own beliefs or expectations onto the participants of a study, thereby skewing the results. This sometimes occurs in conjunction with cherry-picking, which is when researchers focus primarily on statistics that support their hypothesis.
Special Considerations
Researchers and study organizers have the responsibility to ensure the results of their studies are accurate, relevant, and do not incorporate any type of bias that could lead to flawed conclusions. One way to do this is to structure the study based on a method that supports a random sample selection process.
While in theory, this may seem simple enough, the reality is that the researcher will need to be vigilant in their efforts to prevent sample selection bias. Additionally, the study organizer may be faced with restrictions beyond their control that make it challenging to realize a random sample. For example, there may be a lack of participants or inadequate funding for the project.
To make sure the sample being studied is random, the researcher should identify the various subgroups within the population. They should then analyze the sample to determine if these subgroups are adequately represented in the study.
In some cases, the researcher may find that certain subgroups are either overrepresented or underrepresented in their study. At this point, the researcher can implement bias correction methods. One method is to assign weights to the misrepresented subgroups in order to statistically correct the bias. This weighted average takes into account the proportional relevance of each subgroup and can lead to results that more accurately reflect the study population's actual demographics.
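As a rough illustration of that weighting idea, the following Python sketch uses made-up numbers to reweight an over-represented and an under-represented subgroup back to their known population shares before computing the overall estimate:

```python
# Hypothetical example: estimating average satisfaction when women are
# under-represented in the sample relative to the population.
population_share = {"men": 0.50, "women": 0.50}   # known, e.g. from census data
sample_share     = {"men": 0.70, "women": 0.30}   # composition actually collected
sample_mean      = {"men": 6.0,  "women": 8.0}    # subgroup means in the sample

# Unweighted estimate simply mirrors the biased sample composition.
unweighted = sum(sample_share[g] * sample_mean[g] for g in sample_mean)

# Post-stratification weight per group: population share / sample share.
weights = {g: population_share[g] / sample_share[g] for g in sample_share}

# Weighted estimate rebalances each subgroup to its true population share.
weighted = sum(sample_share[g] * weights[g] * sample_mean[g] for g in sample_mean)

print(f"Unweighted estimate: {unweighted:.2f}")  # pulled toward the men's mean
print(f"Weighted estimate:   {weighted:.2f}")    # reflects the 50/50 population
```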
What is Selection Bias – Types & Examples
Published by Owen Ingram on July 31st, 2023; revised on October 5, 2023
Selection bias is a common phenomenon that affects the validity and generalisability of research findings. This bias creeps into research when the selection of participants is not representative of the entire population.
Let’s look at the selection bias definition in detail.
What is Selection Bias?
Experts define selection bias as follows:
"Selection bias refers to a systematic error or distortion in the process of selecting participants or samples for a study or analysis, resulting in a non-representative or biased sample."
It occurs when certain individuals or groups are more likely to be included or excluded from the sample, leading to inaccurate or misleading conclusions.
Selection bias can occur in various fields, including research, surveys, data analysis, and decision-making processes.
Selection Bias Example
A selection bias example is a study on the effectiveness of a new medication for a particular health condition that recruits participants only from a single clinic or hospital. If the participants from this clinic or hospital have access to better healthcare facilities and resources compared to the general population, it can introduce selection bias.
The results of the study may overestimate the effectiveness of the medication because the participants selected are not representative of the broader population with the health condition.
In this scenario, individuals seeking treatment at the specific clinic or hospital may differ systematically from other patients, for instance in their access to care or in the severity and complexity of their cases, potentially leading to better observed outcomes than among individuals who receive treatment elsewhere or do not seek treatment at all. The study’s findings would therefore not accurately reflect the real-world effectiveness of the medication for the entire population affected by the health condition.
What are the Types of Selection Bias?
There are several types of selection bias that can occur in research and data analysis:
Self-Selection Bias
This bias occurs when individuals self-select to be part of a study or sample. It can lead to a non-random sample that may not represent the broader population accurately. For example, in surveys, individuals who feel strongly about a topic are more likely to participate, resulting in a biased sample.
Non-Response Bias
Non-response bias occurs when individuals selected to participate in a study or survey do not respond or choose not to participate. If those who do not respond differ systematically from those who do, the results may be biased. For instance, if a survey on income is only completed by individuals with higher incomes, it can lead to an overestimation of average income levels.
Volunteer Bias
Volunteer bias occurs when individuals voluntarily participate in a study or research. This can lead to a non-representative sample, as those who volunteer may possess certain characteristics or motivations that differ from the general population. For example, in clinical trials, volunteers may be more motivated or have better health than the average population.
Berkson’s Bias
Berkson’s bias is common in hospital-based studies. It arises when the study population is selected from a specific group, such as hospital patients, which may have a higher prevalence of certain conditions compared to the general population. This can result in an underestimation or overestimation of the association between variables.
Healthy User Bias
This bias occurs when a study population includes individuals who are more health-conscious or have healthier behaviours than the general population. This can lead to an overestimation of the benefits of certain interventions or treatments.
Overmatching Bias
Overmatching bias occurs when controls are selected based on characteristics that are influenced by exposure or outcome. This can result in an artificially strengthened association between the exposure and outcome of interest.
Diagnostic Access Bias
Diagnostic access bias occurs when the probability of being diagnosed with a condition depends on exposure status. This bias can distort the relationship between exposure and outcome if one group has better access to diagnostic tests than the other.
What is Selection Bias in Research?
Selection bias in research refers to the systematic error or distortion that occurs when the selection of participants or subjects for a study is not random or representative of the target population. It occurs when certain individuals or groups are more likely to be included or excluded from the study, leading to a biased sample.
Selection bias can arise at various stages of research, including participant recruitment, sampling, and data collection. It can impact the internal validity and generalisability of research findings, as the sample may not accurately represent the larger population of interest.
Selection bias can occur due to various factors, such as non-random sampling methods, self-selection by participants, differential response rates, or exclusion criteria that inadvertently exclude certain groups. These factors can introduce biases that influence the characteristics and outcomes observed in the study population.
What are the Examples of Selection Bias?
Examples of selection bias in daily life can include:
Online Product Reviews
When browsing online reviews, people tend to leave reviews for products they either strongly like or strongly dislike, leading to a biased representation of overall customer satisfaction.
Social Media Feeds
Social media algorithms often personalise content based on users’ past preferences and interactions, resulting in a biased selection of information that may reinforce existing beliefs and limit exposure to diverse perspectives.
Political Surveys
Surveys conducted by political organisations or campaigns may target specific demographics or party supporters, leading to a biased sample that may not accurately represent the views of the entire population.
Restaurant Ratings
People are more likely to leave reviews for restaurants when they have an exceptionally positive or negative experience, which can skew overall ratings and fail to capture the opinions of those who had average or neutral experiences.
Job Application Processes
Hiring managers may unintentionally exhibit selection bias by favouring candidates who come from certain schools or have similar backgrounds, overlooking potential talent from other sources.
Media Coverage
Media outlets often focus on sensational or controversial stories, resulting in a biased selection of news stories that may not accurately reflect the full range of events happening in the world.
Sampling Bias in Surveys
Surveys conducted in specific locations or targeting certain demographics may not capture the opinions and experiences of the broader population, leading to biased results.
It is important to recognise these examples of selection bias and be mindful of their potential impact on our understanding of the world. Seeking diverse sources of information and actively considering alternative perspectives can help mitigate the effects of selection bias in daily life.
How to Avoid Selection Bias?
To avoid selection bias in research or data analysis, consider the following strategies:
Random Sampling
Use random sampling techniques to ensure that every individual or unit in the population has an equal chance of being selected for the study. This helps to create a representative sample and minimises selection bias.
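As a minimal sketch of this, assuming a hypothetical sampling frame stored simply as a list of IDs, simple random sampling can be done with Python's standard library so that every member of the frame has the same chance of selection:

```python
import random

# Hypothetical sampling frame: IDs of all 4,000 members of the target population.
sampling_frame = list(range(1, 4001))

# Fixing the seed lets the draw be documented and reproduced in the study report.
rng = random.Random(2024)

# Draw a simple random sample of 400 without replacement; every ID has the
# same probability (400/4000 = 10%) of being chosen.
sample_ids = rng.sample(sampling_frame, k=400)

print(len(sample_ids), "participants selected, e.g.", sorted(sample_ids)[:5])
```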
Define Inclusion Criteria Carefully
Clearly define the criteria for selecting participants or subjects based on the research objectives. This helps to ensure that the selection process is based on relevant characteristics rather than personal biases or preferences.
Increase Response Rates
Take measures to increase response rates in surveys or studies to minimise non-response bias. Follow up with non-responders, offer incentives for participation, and ensure clear and concise communication about the importance and benefits of participation.
Use Stratified Sampling
If there are specific subgroups within the population that are of interest, employ stratified sampling to ensure adequate representation of each subgroup. This helps to prevent the under-representation or over-representation of particular groups.
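A rough sketch of this idea, assuming the sampling frame is a hypothetical list of (id, subgroup) pairs, draws the same fraction from every subgroup so that none is over- or under-represented:

```python
import random
from collections import defaultdict

# Hypothetical frame of (participant_id, age band) pairs.
frame = [(i, "18-34" if i % 3 == 0 else "35-64" if i % 3 == 1 else "65+")
         for i in range(1, 1001)]

rng = random.Random(7)
by_stratum = defaultdict(list)
for pid, stratum in frame:
    by_stratum[stratum].append(pid)

# Proportional allocation: sample 10% within every stratum.
sample = {stratum: rng.sample(ids, k=len(ids) // 10)
          for stratum, ids in by_stratum.items()}

for stratum, ids in sorted(sample.items()):
    print(f"{stratum}: {len(ids)} of {len(by_stratum[stratum])} selected")
```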
Avoid Self-Selection
Minimise self-selection bias by actively recruiting participants rather than relying solely on voluntary participation. Reach out to potential participants through various channels, ensuring diversity in recruitment methods.
Consider Using Blinding
In certain studies, blinding the researchers to certain participant characteristics or group assignments can help minimise bias in participant selection and data analysis.
Validate Data Against External Sources
Validate the collected data against external sources or existing datasets to assess the representativeness of the sample and identify any potential biases.
Transparency in Reporting
Clearly describe the sampling methods, inclusion criteria, and any limitations related to participant selection in the research report. This transparency helps readers and reviewers evaluate the potential impact of selection bias on the study’s findings.
Frequently Asked Questions
What is selection bias?
Selection bias refers to a systematic error or distortion in research or data analysis that occurs when the selection of participants or samples is non-random or unrepresentative, leading to biased results.
What are the types of selection bias?
There are several types of selection bias, including self-selection bias, non-response bias, volunteer bias, Berkson’s bias, healthy user bias, overmatching bias, and diagnostic access bias.
How does selection bias impact research findings?
Selection bias can lead to skewed or inaccurate research findings by introducing a non-representative sample that does not accurately reflect the broader population of interest. It can undermine the internal validity and generalisability of study results.
What are examples of selection bias?
Examples of selection bias can be observed in various contexts, such as online product reviews, social media feeds, political surveys, restaurant ratings, job application processes, media coverage, and sampling bias in surveys.
Sampling: how to select participants in my research study? *
Jeovany Martínez-Mesa, David Alejandro González-Chica, Rodrigo Pereira Duquia, Renan Rangel Bonamigo, João Luiz Bastos
Mailing address: Jeovany Martínez-Mesa, Faculdade Meridional - IMED, Escola de Medicina, R. Senador Pinheiro, 304, 99070-220 - Passo Fundo - RS, Brazil. Email: [email protected]
Conflict of Interest: None.
Received 2015 Oct 15; Accepted 2015 Nov 2; Issue date 2016 May-Jun.
This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License which permits unrestricted non-commercial use, distribution, and reproduction in any medium provided the original work is properly cited.
In this paper, the basic elements related to the selection of participants for health research are discussed. Sample representativeness, the sample frame, types of sampling, as well as the impact that non-respondents may have on the results of a study, are described. The whole discussion is supported by practical examples to facilitate the reader's understanding.
To introduce readers to issues related to sampling.
Keywords: Dermatology, Epidemiology and biostatistics, Epidemiologic studies, Sample size, Sampling studies
INTRODUCTION
The essential topics related to the selection of participants for health research are: 1) whether to work with samples or to include the whole reference population in the study (census); 2) the sample basis; 3) the sampling process; and 4) the potential effects nonrespondents might have on study results. We will address each of these aspects with theoretical and practical examples for better understanding in the sections that follow.
TO SAMPLE OR NOT TO SAMPLE
In a previous paper, we discussed the necessary parameters on which to estimate the sample size. 1 We define sample as a finite part or subset of participants drawn from the target population. In turn, the target population corresponds to the entire set of subjects whose characteristics are of interest to the research team. Based on results obtained from a sample, researchers may draw their conclusions about the target population with a certain level of confidence, following a process called statistical inference. When the sample contains fewer individuals than the minimum necessary, but the representativeness is preserved, statistical inference may be compromised in terms of precision (prevalence studies) and/or statistical power to detect the associations of interest. 1 On the other hand, samples without representativeness may not be a reliable source to draw conclusions about the reference population (i.e., statistical inference is not deemed possible), even if the sample size reaches the required number of participants. Lack of representativeness can occur as a result of flawed selection procedures (sampling bias) or when the probability of refusal/non-participation in the study is related to the object of research (nonresponse bias). 1 , 2
Although most studies are performed using samples, whether or not they represent any target population, census-based estimates should be preferred whenever possible. 3 , 4 For instance, if all cases of melanoma are available on a national or regional database, and information on the potential risk factors are also available, it would be preferable to conduct a census instead of investigating a sample.
However, there are several theoretical and practical reasons that prevent us from carrying out census-based surveys, including:
Ethical issues: it is unethical to include a greater number of individuals than that effectively required;
Budgetary limitations: the high costs of a census survey often limits its use as a strategy to select participants for a study;
Logistics: censuses often impose great challenges in terms of required staff, equipment, etc. to conduct the study;
Time restrictions: the amount of time needed to plan and conduct a census-based survey may be excessive; and,
Unknown target population size: if the study objective is to investigate the presence of premalignant skin lesions in illicit drug users, lack of information on all existing users makes it impossible to conduct a census-based study.
All these reasons explain why samples are more frequently used. However, researchers must be aware that sample results can be affected by the random error (or sampling error). 3 To exemplify this concept, we will consider a research study aiming to estimate the prevalence of premalignant skin lesions (outcome) among individuals >18 years residing in a specific city (target population). The city has a total population of 4,000 adults, but the investigator decided to collect data on a representative sample of 400 participants, detecting an 8% prevalence of premalignant skin lesions. A week later, the researcher selects another sample of 400 participants from the same target population to confirm the results, but this time observes a 12% prevalence of premalignant skin lesions. Based on these findings, is it possible to assume that the prevalence of lesions increased from the first to the second week? The answer is probably not. Each time we select a new sample, it is very likely to obtain a different result. These fluctuations are attributed to the "random error." They occur because individuals composing different samples are not the same, even though they were selected from the same target population. Therefore, the parameters of interest may vary randomly from one sample to another. Despite this fluctuation, if it were possible to obtain 100 different samples of the same population, approximately 95 of them would provide prevalence estimates very close to the real estimate in the target population - the value that we would observe if we investigated all the 4,000 adults residing in the city. Thus, during the sample size estimation the investigator must specify in advance the highest or maximum acceptable random error value in the study. Most population-based studies use a random error ranging from 2 to 5 percentage points. Nevertheless, the researcher should be aware that the smaller the random error considered in the study, the larger the required sample size. 1
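A small simulation makes this fluctuation tangible. The sketch below uses hypothetical numbers, assuming a true prevalence of 10% among the 4,000 adults, and repeatedly draws samples of 400 to show how the estimated prevalence varies from draw to draw even though the population never changes:

```python
import random

random.seed(42)

# Hypothetical target population: 4,000 adults, 10% with premalignant lesions.
population = [1] * 400 + [0] * 3600

def sample_prevalence(n=400):
    """Draw a simple random sample of n adults and return the observed prevalence."""
    sample = random.sample(population, n)
    return sum(sample) / n

estimates = [sample_prevalence() for _ in range(100)]

print("First two samples:", f"{estimates[0]:.1%}", f"{estimates[1]:.1%}")
print("Range across 100 samples:", f"{min(estimates):.1%} to {max(estimates):.1%}")
# Most of the 100 estimates fall within a few percentage points of the true
# 10% prevalence; that scatter is the random (sampling) error.
```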
SAMPLE FRAME
The sample frame is the group of individuals that can be selected from the target population given the sampling process used in the study. For example, to identify cases of cutaneous melanoma the researcher may consider using the national cancer registry system or the anatomopathological records of skin biopsies as the sample frame. Given that the sample may represent only a portion of the target population, the researcher needs to examine carefully whether the selected sample frame fits the study objectives or hypotheses, and especially whether there are strategies to overcome the sample frame limitations (see Chart 1 for examples and possible limitations).
Chart 1. Examples of sample frames and potential limitations as regards representativeness.
Sampling can be defined as the process through which individuals or sampling units are selected from the sample frame. The sampling strategy needs to be specified in advance, given that the sampling method may affect the sample size estimation. 1 , 5 Without a rigorous sampling plan the estimates derived from the study may be biased (selection bias). 3
TYPES OF SAMPLING
In figure 1 , we depict a summary of the main sampling types. There are two major sampling types: probabilistic and nonprobabilistic.
Figure 1. Sampling types used in scientific studies.
NONPROBABILISTIC SAMPLING
In the context of nonprobabilistic sampling, the likelihood of selecting some individuals from the target population is null. This type of sampling does not render a representative sample; therefore, the observed results are usually not generalizable to the target population. Still, unrepresentative samples may be useful for some specific research objectives, and may help answer particular research questions, as well as contribute to the generation of new hypotheses. 4 The different types of nonprobabilistic sampling are detailed below.
Convenience sampling: the participants are consecutively selected in order of appearance according to their convenient accessibility (also known as consecutive sampling). The sampling process comes to an end when the planned number of participants (sample saturation) and/or the time limit (time saturation) is reached. Randomized clinical trials are usually based on convenience sampling. After sampling, participants are usually randomly allocated to the intervention or control group (randomization). 3 Although randomization is a probabilistic process to obtain two comparable groups (treatment and control), the samples used in these studies are generally not representative of the target population.
Purposive sampling: this is used when a diverse sample is necessary or the opinion of experts in a particular field is the topic of interest. This technique was used in the study by Roubille et al, in which recommendations for the treatment of comorbidities in patients with rheumatoid arthritis, psoriasis, and psoriatic arthritis were made based on the opinion of a group of experts. 6
Quota sampling: according to this sampling technique, the population is first classified by characteristics such as gender, age, etc. Subsequently, sampling units are selected to complete each quota. For example, in the study by Larkin et al., the combination of vemurafenib and cobimetinib versus placebo was tested in patients with locally-advanced melanoma, stage IIIC or IV, with BRAF mutation. 7 The study recruited 495 patients from 135 health centers located in several countries. In this type of study, each center has a "quota" of patients.
"Snowball" sampling : in this case, the researcher selects an initial group of individuals. Then, these participants indicate other potential members with similar characteristics to take part in the study. This is frequently used in studies investigating special populations, for example, those including illicit drugs users, as was the case of the study by Gonçalves et al, which assessed 27 users of cocaine and crack in combination with marijuana. 8
PROBABILISTIC SAMPLING
In the context of probabilistic sampling, all units of the target population have a nonzero probability of taking part in the study. If all participants are equally likely to be selected in the study, equiprobabilistic sampling is being used, and the probability of being selected by the research team may be expressed by the formula P = 1/N, where P equals the probability of taking part in the study and N corresponds to the size of the target population. The main types of probabilistic sampling are described below.
Simple random sampling: in this case, we have a full list of sample units or participants (sample basis), and we randomly select individuals using a table of random numbers. An example is the study by Pimenta et al, in which the authors obtained a listing from the Health Department of all elderly enrolled in the Family Health Strategy and, by simple random sampling, selected a sample of 449 participants. 9
Systematic random sampling: in this case, participants are selected at fixed intervals from a previously defined, ordered list of participants. For example, in the study by Kelbore et al., children assisted at the Pediatric Dermatology Service were selected to evaluate factors associated with atopic dermatitis, always taking every second child in order of consultation. 10
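A minimal sketch of systematic sampling, with a hypothetical ordered frame and interval, selects every k-th unit after a random start within the first interval:

```python
import random

# Hypothetical ordered sampling frame, e.g. children by order of consultation.
frame = [f"child-{i:03d}" for i in range(1, 201)]

desired_n = 50
k = len(frame) // desired_n             # sampling interval: every 4th child here
start = random.Random(3).randrange(k)   # random starting point in the first interval

systematic_sample = frame[start::k]
print(f"Interval k={k}, start={start}, {len(systematic_sample)} children sampled")
print(systematic_sample[:5])
```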
Stratified sampling: in this type of sampling, the target population is first divided into separate strata. Then, samples are selected within each stratum, either through simple or systematic sampling. The total number of individuals to be selected in each stratum can be fixed or proportional to the size of each stratum. Each individual may be equally likely to be selected to participate in the study. However, the fixed method usually involves the use of sampling weights in the statistical analysis (inverse of the probability of selection or 1/P). An example is the study conducted in South Australia to investigate factors associated with vitamin D deficiency in preschool children. Using the national census as the sample frame, households were randomly selected in each stratum and all children in the age group of interest identified in the selected houses were investigated. 11
Cluster sampling: in this type of probabilistic sampling, groups such as health facilities, schools, etc., are sampled. In the above-mentioned study, the selection of households is an example of cluster sampling. 11
Complex or multi-stage sampling: this probabilistic sampling method combines different strategies in the selection of the sample units. An example is the study by Duquia et al. assessing the prevalence of sunscreen use and its associated factors in adults. The sampling process included two stages. 12 Using the 2000 Brazilian demographic census as the sampling frame, all 404 census tracts from Pelotas (Southern Brazil) were listed in ascending order of family income. A sample of 120 tracts was systematically selected (first-stage sampling units). In the second stage, 12 households in each of these census tracts (second-stage sampling units) were systematically drawn. All adult residents in these households were included in the study (third-stage sampling units). All these stages have to be considered in the statistical analysis to provide correct estimates.
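As a rough sketch of the two-stage design just described, the snippet below mirrors the tract and household counts from the example but simulates the frames themselves, so the identifiers are entirely hypothetical:

```python
import random

rng = random.Random(11)

# Stage 1 frame: 404 census tracts, listed in ascending order of family income.
tracts = [f"tract-{i:03d}" for i in range(1, 405)]

# Systematically select 120 tracts (primary sampling units).
k = len(tracts) / 120
start = rng.random() * k
selected_tracts = [tracts[int(start + j * k)] for j in range(120)]

# Stage 2: within each selected tract, systematically draw 12 households from a
# simulated household list for that tract.
sample = {}
for tract in selected_tracts:
    households = [f"{tract}/hh-{h:02d}" for h in range(1, rng.randint(40, 80) + 1)]
    step = len(households) / 12
    s = rng.random() * step
    sample[tract] = [households[int(s + j * step)] for j in range(12)]

total = sum(len(v) for v in sample.values())
print(f"{len(selected_tracts)} tracts and {total} households selected in total")
```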
NONRESPONDENTS
Frequently, sample sizes are increased by 10% to compensate for potential nonresponses (refusals/losses). 1 Let us imagine that in a study to assess the prevalence of premalignant skin lesions there is a higher percentage of nonrespondents among men (10%) than among women (1%). If the highest percentage of nonresponse occurs because these men are not at home during the scheduled visits, and these participants are more likely to be exposed to the sun, the number of skin lesions will be underestimated. For this reason, it is strongly recommended to collect and describe some basic characteristics of nonrespondents (sex, age, etc.) so they can be compared to the respondents to evaluate whether the results may have been affected by this systematic error.
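A toy simulation of the situation just described, with entirely hypothetical exposure, lesion, and nonresponse rates, shows how differential nonresponse pulls the estimate below the true prevalence:

```python
import random

rng = random.Random(5)

def make_person(sex):
    # Hypothetical rates: men are more often sun-exposed, and lesions are more
    # common among the exposed.
    exposed = rng.random() < (0.6 if sex == "M" else 0.3)
    lesion = rng.random() < (0.20 if exposed else 0.05)
    # Sun-exposed men (outdoors during the day) are the hardest to find at home.
    nonresponse_p = 0.25 if (sex == "M" and exposed) else 0.02
    return {"lesion": lesion, "responds": rng.random() > nonresponse_p}

population = ([make_person("M") for _ in range(2000)] +
              [make_person("F") for _ in range(2000)])

true_prev = sum(p["lesion"] for p in population) / len(population)
respondents = [p for p in population if p["responds"]]
observed_prev = sum(p["lesion"] for p in respondents) / len(respondents)

print(f"Prevalence in the whole population: {true_prev:.1%}")
print(f"Prevalence among respondents:       {observed_prev:.1%}")
# The respondent-only figure is pulled downward because sun-exposed men, who
# carry a disproportionate share of the lesions, are the ones most often missed.
```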
Often, in study protocols, refusal to participate or to sign the informed consent is listed as an "exclusion criterion". However, this is not correct, as these individuals are eligible for the study and need to be reported as "nonrespondents".
SAMPLING METHOD ACCORDING TO THE TYPE OF STUDY
In general, clinical trials aim to obtain a homogeneous sample which is not necessarily representative of any target population. Clinical trials often recruit those participants who are most likely to benefit from the intervention. 3 Thus, the stricter criteria for inclusion and exclusion of subjects in clinical trials often make it difficult to locate participants: after verification of the eligibility criteria, just one out of ten possible candidates will enter the study. Therefore, clinical trials usually show limitations in generalizing the results to the entire population of patients with the disease; the findings apply only to those with characteristics similar to the sample included in the study. These peculiarities of clinical trials justify the need for multicenter and/or global studies to accelerate the recruitment rate and to reach, in a shorter time, the number of patients required for the study. 13
In observational studies, in turn, building a solid sampling plan is important because of the great heterogeneity usually observed in the target population; this heterogeneity has to be reflected in the sample. A cross-sectional population-based study aiming to assess disease estimates or identify risk factors often uses complex probabilistic sampling, because the sample representativeness is crucial. However, in a case-control study, we face the challenge of selecting two different samples for the same study. One sample is formed by the cases, which are identified based on the diagnosis of the disease of interest. The other consists of controls, which need to be representative of the population that originated the cases. Improper selection of control individuals may introduce selection bias into the results. Thus, the concern with representativeness in this type of study is established based on the relationship between cases and controls (comparability).
In cohort studies, individuals are recruited based on their exposure status (exposed and unexposed subjects), and they are followed over time to evaluate the occurrence of the outcome of interest. At baseline, the sample can be selected from a representative sample (population-based cohort studies) or a non-representative sample. However, in the successive follow-ups of the cohort, study participants must remain a representative sample of those included at baseline. 14 , 15 In this type of study, losses over time may cause follow-up bias.
Researchers need to decide during the planning stage of the study if they will work with the entire target population or a sample. Working with a sample involves different steps, including sample size estimation, identification of the sample frame, and selection of the sampling method to be adopted.
Financial Support: None.
Study performed at Faculdade Meridional - Escola de Medicina (IMED) - Passo Fundo (RS), Brazil.
REFERENCES
- 1. Martínez-Mesa J, González-Chica DA, Bastos JL, Bonamigo RR, Duquia RP. Sample size: how many participants do I need in my research? An Bras Dermatol. 2014;89:609–615. doi: 10.1590/abd1806-4841.20143705.
- 2. Röhrig B, du Prel JB, Wachtlin D, Kwiecien R, Blettner M. Sample size calculation in clinical trials: part 13 of a series on evaluation of scientific publications. Dtsch Arztebl Int. 2010;107:552–556. doi: 10.3238/arztebl.2010.0552.
- 3. Suresh K, Thomas SV, Suresh G. Design, data analysis and sampling techniques for clinical research. Ann Indian Acad Neurol. 2011;14:287–290. doi: 10.4103/0972-2327.91951.
- 4. Rothman KJ, Gallacher JE, Hatch EE. Why representativeness should be avoided. Int J Epidemiol. 2013;42:1012–1014. doi: 10.1093/ije/dys223.
- 5. Krause M, Lutz W, Boehnke JR. The role of sampling in clinical trial design. Psychother Res. 2011;21:243–251. doi: 10.1080/10503307.2010.549520.
- 6. Roubille C, Richer V, Starnino T, McCourt C, McFarlane A, Fleming P, et al. Evidence-based recommendations for the management of comorbidities in rheumatoid arthritis, psoriasis, and psoriatic arthritis: expert opinion of the Canadian Dermatology-Rheumatology Comorbidity Initiative. J Rheumatol. 2015;42:1767–1780. doi: 10.3899/jrheum.141112.
- 7. Larkin J, Ascierto PA, Dréno B, Atkinson V, Liszkay G, Maio M, et al. Combined vemurafenib and cobimetinib in BRAF-mutated melanoma. N Engl J Med. 2014;371:1867–1876. doi: 10.1056/NEJMoa1408868.
- 8. Goncalves JR, Nappo SA. Factors that lead to the use of crack cocaine in combination with marijuana in Brazil: a qualitative study. BMC Public Health. 2015;15:706. doi: 10.1186/s12889-015-2063-0.
- 9. Pimenta FB, Pinho L, Silveira MF, Botelho AC. Factors associated with chronic diseases among the elderly receiving treatment under the Family Health Strategy. Cien Saude Colet. 2015;20:2489–2498. doi: 10.1590/1413-81232015208.11742014.
- 10. Kelbore AG, Alemu W, Shumye A, Getachew S. Magnitude and associated factors of atopic dermatitis among children in Ayder referral hospital, Mekelle, Ethiopia. BMC Dermatol. 2015;15:15. doi: 10.1186/s12895-015-0034-x.
- 11. Zhou SJ, Skeaff M, Makrides M, Gibson R. Vitamin D status and its predictors among pre-school children in Adelaide. J Paediatr Child Health. 2015;51:614–619. doi: 10.1111/jpc.12770.
- 12. Duquia RP, Menezes AM, Almeida HL Jr, Reichert FF, Santos Ida S, Haack RL, et al. Prevalence of sun exposure and its associated factors in southern Brazil: a population-based study. An Bras Dermatol. 2013;88:554–561. doi: 10.1590/abd1806-4841.20132122.
- 13. Barrios CH, Werutsky G, Martinez-Mesa J. The global conduct of cancer clinical trials: challenges and opportunities. Am Soc Clin Oncol Educ Book. 2015:e132–e139. doi: 10.14694/EdBook_AM.2015.35.e132.
- 14. Victora CG, Barros FC. Cohort profile: the 1982 Pelotas (Brazil) birth cohort study. Int J Epidemiol. 2006;35:237–242. doi: 10.1093/ije/dyi290.
- 15. Boing AC, Peres KG, Boing AF, Hallal PC, Silva NN, Peres MA. EpiFloripa Health Survey: the methodological and operational aspects behind the scenes. Rev Bras Epidemiol. 2014;17:147–162. doi: 10.1590/1415-790x201400010012eng.