Research Questions & Hypotheses

Generally, in quantitative studies, reviewers expect hypotheses rather than research questions. However, research questions and hypotheses serve different purposes and can be beneficial when used together.

Research Questions

Research questions clarify the research’s aim (Farrugia et al., 2010).

  • Research often begins with an interest in a topic, but a deep understanding of the subject is crucial to formulate an appropriate research question.
  • Descriptive: “What factors most influence the academic achievement of senior high school students?”
  • Comparative: “What is the performance difference between teaching methods A and B?”
  • Relationship-based: “What is the relationship between self-efficacy and academic achievement?”
  • Increasing knowledge about a subject can be achieved through systematic literature reviews, in-depth interviews with patients (and proxies), focus groups, and consultations with field experts.
  • Some funding bodies, like the Canadian Institutes of Health Research, recommend conducting a systematic review or a pilot study before seeking grants for full trials.
  • The presence of multiple research questions in a study can complicate the design, statistical analysis, and feasibility.
  • It’s advisable to focus on a single primary research question for the study.
  • The primary question, clearly stated at the end of a grant proposal’s introduction, usually specifies the study population, intervention, and other relevant factors.
  • The FINER criteria underscore aspects that can enhance the chances of a successful research project: specifying the population of interest, aligning with scientific and public interest, ensuring clinical relevance and contribution to the field, and complying with ethical and national research standards. FINER stands for:
Feasible
Interesting
Novel
Ethical
Relevant
  • The PICOT approach is crucial in developing the study’s framework and protocol, influencing inclusion and exclusion criteria and identifying patient groups for inclusion (see the sketch after this list). PICOT stands for:
Population (patients)
Intervention (for intervention studies only)
Comparison group
Outcome of interest
Time
  • Defining the specific population, intervention, comparator, and outcome helps in selecting the right outcome measurement tool.
  • The more precise the population definition and stricter the inclusion and exclusion criteria, the more significant the impact on the interpretation, applicability, and generalizability of the research findings.
  • A restricted study population enhances internal validity but may limit the study’s external validity and generalizability to clinical practice.
  • A broadly defined study population may better reflect clinical practice but could increase bias and reduce internal validity.
  • An inadequately formulated research question can negatively impact study design, potentially leading to ineffective outcomes and affecting publication prospects.
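
To make the PICOT elements concrete, here is a minimal sketch in Python (the dataclass name and the example values are hypothetical illustrations, not taken from any cited study) of how a research question can be decomposed into PICOT components:

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class PICOTQuestion:
    """Decomposes a clinical research question into its PICOT elements."""
    population: str              # P: the patients/population of interest
    intervention: Optional[str]  # I: the intervention (None for non-intervention studies)
    comparison: str              # C: the comparison group
    outcome: str                 # O: the outcome of interest
    time: str                    # T: the time frame for follow-up

# Hypothetical example: framing a question about adolescent physical activity
question = PICOTQuestion(
    population="adolescents aged 13-17 in public high schools",
    intervention="healthy-lifestyle education focused on physical activity",
    comparison="usual health curriculum",
    outcome="weekly minutes of moderate-to-vigorous physical activity",
    time="6 months",
)
```

Writing the question out this way makes any missing element obvious before the protocol is drafted.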

Checklist: Good research questions for social science projects (Panke, 2018)

Research Hypotheses

Research hypotheses present the researcher’s predictions as specific, testable statements.

  • These statements define the research problem or issue and indicate the direction of the researcher’s predictions.
  • Formulating the research question and hypothesis from existing data (e.g., a database) can lead to multiple statistical comparisons and potentially spurious findings due to chance.
  • The research or clinical hypothesis, derived from the research question, shapes the study’s key elements: sampling strategy, intervention, comparison, and outcome variables.
  • Hypotheses can express a single outcome or multiple outcomes.
  • After statistical testing, the null hypothesis is either rejected or not rejected based on whether the study’s findings are statistically significant.
  • Hypothesis testing helps determine if observed findings are due to true differences and not chance.
  • Hypotheses can be 1-sided (specific direction of difference) or 2-sided (presence of a difference without specifying direction).
  • 2-sided hypotheses are generally preferred unless there’s a strong justification for a 1-sided hypothesis (see the sketch after this list).
  • A solid research hypothesis, informed by a good research question, influences the research design and paves the way for defining clear research objectives.
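
As a minimal illustration of the 1-sided/2-sided distinction, the sketch below (Python with NumPy and SciPy; the data are synthetic and the effect size is invented for illustration) runs the same group comparison under both alternatives:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)
control = rng.normal(loc=70, scale=10, size=40)    # e.g., scores under method B
treatment = rng.normal(loc=74, scale=10, size=40)  # e.g., scores under method A

# 2-sided: H1 states that the means differ, in either direction.
t_two, p_two = stats.ttest_ind(treatment, control, alternative="two-sided")

# 1-sided: H1 states that the treatment mean is greater than the control mean.
t_one, p_one = stats.ttest_ind(treatment, control, alternative="greater")

print(f"2-sided p = {p_two:.4f}")  # tests 'a difference exists'
print(f"1-sided p = {p_one:.4f}")  # smaller here, but needs prior justification
```

Note how the 1-sided test yields a smaller p-value for the same data, which is exactly why it requires strong prior justification.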

Types of Research Hypothesis

  • In a Y-centered research design, the focus is on the dependent variable (DV), which is specified in the research question. Theories are then used to identify independent variables (IV) and explain their causal relationship with the DV.
  • Example: “An increase in teacher-led instructional time (IV) is likely to improve student reading comprehension scores (DV), because extensive guided practice under expert supervision enhances learning retention and skill mastery.”
  • Hypothesis Explanation: The dependent variable (student reading comprehension scores) is the focus, and the hypothesis explores how changes in the independent variable (teacher-led instructional time) affect it.
  • In X-centered research designs, the independent variable is specified in the research question. Theories are used to determine potential dependent variables and the causal mechanisms at play.
  • Example: “Implementing technology-based learning tools (IV) is likely to enhance student engagement in the classroom (DV), because interactive and multimedia content increases student interest and participation.”
  • Hypothesis Explanation: The independent variable (technology-based learning tools) is the focus, with the hypothesis exploring its impact on a potential dependent variable (student engagement).
  • Probabilistic hypotheses suggest that changes in the independent variable are likely to lead to changes in the dependent variable in a predictable manner, but not with absolute certainty.
  • Example: “The more teachers engage in professional development programs (IV), the more their teaching effectiveness (DV) is likely to improve, because continuous training updates pedagogical skills and knowledge.”
  • Hypothesis Explanation: This hypothesis implies a probable relationship between the extent of professional development (IV) and teaching effectiveness (DV).
  • Deterministic hypotheses state that a specific change in the independent variable will lead to a specific change in the dependent variable, implying a more direct and certain relationship.
  • Example: “If the school curriculum changes from traditional lecture-based methods to project-based learning (IV), then student collaboration skills (DV) are expected to improve because project-based learning inherently requires teamwork and peer interaction.”
  • Hypothesis Explanation: This hypothesis presumes a direct and definite outcome (improvement in collaboration skills) resulting from a specific change in the teaching method.
  • Example: “Students who identify as visual learners will score higher on tests that are presented in a visually rich format compared to tests presented in a text-only format.”
  • Explanation: This hypothesis aims to describe the potential difference in test scores between visual learners taking visually rich tests and text-only tests, without implying a direct cause-and-effect relationship.
  • Example: “Teaching method A will improve student performance more than method B.”
  • Explanation: This hypothesis compares the effectiveness of two different teaching methods, suggesting that one will lead to better student performance than the other. It implies a direct comparison but does not necessarily establish a causal mechanism.
  • Example: “Students with higher self-efficacy will show higher levels of academic achievement.”
  • Explanation: This hypothesis predicts a relationship between the variable of self-efficacy and academic achievement. Unlike a causal hypothesis, it does not necessarily suggest that one variable causes changes in the other, but rather that they are related in some way.

Tips for developing research questions and hypotheses for research studies

  • Perform a systematic literature review (if one has not been done) to increase knowledge and familiarity with the topic and to assist with research development.
  • Learn about current trends and technological advances on the topic.
  • Seek careful input from experts, mentors, colleagues, and collaborators; this will aid in refining the research question and guiding the research study.
  • Use the FINER criteria in the development of the research question.
  • Ensure that the research question follows PICOT format.
  • Develop a research hypothesis from the research question.
  • Ensure that the research question and objectives are answerable, feasible, and clinically relevant.

If your research hypotheses are derived from your research questions, particularly when multiple hypotheses address a single question, it’s recommended to use both research questions and hypotheses. However, if this isn’t the case, using hypotheses over research questions is advised. It’s important to note these are general guidelines, not strict rules. If you opt not to use hypotheses, consult with your supervisor for the best approach.

Farrugia, P., Petrisor, B. A., Farrokhyar, F., & Bhandari, M. (2010). Practical tips for surgical research: Research questions, hypotheses and objectives. Canadian Journal of Surgery, 53(4), 278–281.

Hulley, S. B., Cummings, S. R., Browner, W. S., Grady, D., & Newman, T. B. (2007). Designing clinical research (3rd ed.). Lippincott Williams & Wilkins.

Panke, D. (2018). Research design & method selection: Making good choices in the social sciences. Sage.

Key Concepts in Quantitative Research

In this module, we are going to explore the nuances of quantitative research, including the main types of quantitative research, a closer look at variables (including confounding and extraneous variables), and causation.

Content includes:

  • Flaws, “Proof”, and Rigor
  • The Steps of Quantitative Methodology
  • Major Classes of Quantitative Research
  • Experimental versus Non-Experimental Research
  • Types of Experimental Research
  • Types of Non-Experimental Research
  • Research Variables
  • Confounding/Extraneous Variables
  • Causation versus correlation/association

Objectives:

  • Discuss the flaws, proof, and rigor in research.
  • Describe the differences between independent variables and dependent variables.
  • Describe the steps in quantitative research methodology.
  • Describe experimental, quasi-experimental, and non-experimental research studies.
  • Describe confounding and extraneous variables.
  • Differentiate cause-and-effect (causality) versus association/correlation.

Flaws, Proof, and Rigor in Research

One of the biggest hurdles that students and seasoned researchers alike struggle to grasp is that research cannot “prove” or “disprove” anything. Research can only support a hypothesis with reasonable, statistically significant evidence.

Indeed. You’ve heard it incorrectly your entire life. You will hear professors, scientists, radio ads, podcasts, and even researchers comment something to the effect of, “It has been proven that…” or “Research proves that…” or “Finally! There is proof that…”

We have been duped. Consider the “prove” word a very bad word in this course. The forbidden “P” word. Do not say it, write it, allude to it, or repeat it. And, for the love of avocados and all things fluffy, do not include the “P” word on your EBP poster. You will be deducted some major points.

We can only conclude with reasonable certainty through statistical analyses that there is a high probability that something did not happen by chance but instead happened due to the intervention that the researcher tested. Got that? We will come back to that concept but for now know that it is called “statistical significance”.
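
To make “did not happen by chance” concrete, here is a minimal sketch (Python with NumPy; the scores are made up for illustration) of a permutation test, which directly estimates how often a difference at least as large as the observed one would arise by chance alone:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical outcome scores for a control and an intervention group
control = np.array([68, 72, 70, 65, 74, 69, 71, 66])
intervention = np.array([75, 78, 72, 80, 76, 74, 79, 73])
observed_diff = intervention.mean() - control.mean()

# Reshuffle group labels many times to see what chance alone produces
pooled = np.concatenate([control, intervention])
n_permutations = 10_000
count_extreme = 0
for _ in range(n_permutations):
    rng.shuffle(pooled)  # random relabeling of 'control' vs 'intervention'
    diff = pooled[len(control):].mean() - pooled[:len(control)].mean()
    if diff >= observed_diff:
        count_extreme += 1

p_value = count_extreme / n_permutations
print(f"observed difference = {observed_diff:.2f}, p ≈ {p_value:.4f}")
# A small p-value means a difference this large rarely happens by chance alone.
```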

All research has flaws. We might not know what those flaws are, but we will be learning about confounding and extraneous variables later on in this module to help explain how flaws can happen.

Remember this: Sometimes, the researcher might not even know that a flaw occurred. No research project is perfect. There is no 100% awesome. This is a major reason why it is so important to be able to replicate a research project and obtain similar results. The more we can replicate research with the same exact methodology and protocols, the more certainty we have in the results, and we can start accounting for flaws that may have sneaked in.

Finally, not all research is equal. Some research is done very sloppily, and other research has a very high standard of rigor. How do we know which is which when reading an article? Well, within this module, we will start learning about some things to look for in a published research article to help determine rigor. We do not want lazy research to determine our actions as nurses, right? We want the strongest, most reliable, most valid, most rigorous research evidence possible so that we can take those results and embed them into patient care. Who wants shoddy evidence determining the actions we take with your grandmother’s heart surgery?

Independent Variables and Dependent Variables

As we have already seen, there are measures called “variables” in research. This will be a bit of a review, but it is important to bring up again, as it is a hallmark of quantitative research. In quantitative studies, the concepts being measured are called variables (AKA: something that varies). Variables are something that can change – either by manipulation or from something causing a change. In the article snapshots that we have looked at, researchers are trying to find causes for phenomena. Does a nursing intervention cause an improvement in patient outcomes? Does the cholesterol medication cause a decrease in cholesterol level? Does smoking cause cancer?

The presumed cause is called the independent variable. The presumed effect is called the dependent variable. The dependent variable is “dependent” on something causing it to change. The dependent variable is the outcome that a researcher is trying to understand, explain, or predict.

Think back to our PICO questions. You can think of the intervention (I) as the independent variable and the outcome (O) as the dependent variable.

The independent variable is either manipulated by the researcher or varies on its own as a source of influence, whereas the dependent variable is never manipulated.

Variables do not always measure cause-and-effect. They can also measure a direction of influence.

Here is an example of that: If we compared levels of depression among men and women diagnosed with pancreatic cancer and found men to be more depressed, we cannot conclude that depression was caused by gender. However, we can note that the direction of influence clearly runs from gender to depression. It makes no sense to suggest the depression influenced their gender.

In the above example, what is the independent variable (IV) and what is the dependent variable (DV)? If you guessed gender as the IV and depression as the DV, you are correct! It is important to note that in this case the researcher did not manipulate the IV; it varies on its own (male or female).

Researchers do not always have just one IV. In some cases, more than one IV may be measured. Take, for instance, a study that wants to measure the factors that influence one’s study habits. Independent variables of gender, sleep habits, and hours of work may be considered. Likewise, multiple DVs can be measured. For example, perhaps we want to measure weight and abdominal girth on a plant-based diet (IV).
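
As a minimal sketch of how multiple IVs can be related to a single DV statistically (Python with pandas and statsmodels; the data are simulated and the variable names are illustrative only), a multiple regression estimates each IV’s influence at once:

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(1)
n = 200

# Simulated data: study hours (DV) influenced by sleep and work hours (IVs)
df = pd.DataFrame({
    "sleep_hours": rng.normal(7, 1, n),
    "work_hours": rng.normal(20, 5, n),
})
df["study_hours"] = (10 + 0.8 * df["sleep_hours"]
                     - 0.2 * df["work_hours"] + rng.normal(0, 2, n))

# Regress the DV on both IVs at once
model = smf.ols("study_hours ~ sleep_hours + work_hours", data=df).fit()
print(model.params)  # estimated influence of each IV on the DV
```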

Now, some studies do not have an intervention. We will come back to that when we talk about non-experimental research.

The point of variables is so that researchers have a very specific measurement that they seek to study.

Let’s look at a couple of examples:

Case One: An analysis of emotional intelligence in nursing leaders—focuses on the meaning of emotional intelligence specific to nurses—defines emotional intelligence, the consequences, and antecedents. A literature review is used to find information about the meaning, consequences, and antecedents of emotional intelligence.

Independent Variable(s) (Intervention/Treatment): None – there is no intervention.

Dependent Variable(s) (Effect/Results): The definition of emotional intelligence; the antecedents of emotional intelligence.

Case Two: In this study, nurses use a hand hygiene protocol for their own hands and patient hands to examine if the hand hygiene protocol will decrease hospital-acquired infections in the Intensive Care Unit.

Independent Variable(s) (Intervention/Treatment): Hand hygiene for nurses and patients; nurse in-service training on hand hygiene for nurses and patients.

Dependent Variable(s) (Effect/Results): Hospital-acquired infection rates in the ICU.

Now you try! Identify the IVs and DVs:

Case Three: A nurse wants to know if extra education about healthy lifestyles, with a focus on increasing physical activity among adolescents, will increase their physical activity levels and impact their heart rates and blood pressures over a 6-month time period. Data are collected before and after the intervention at multiple intervals. A control group and an intervention group are used, with randomized assignment to groups. (True experimental design with intervention group, control group, and randomization.)

Case Four: Playing classical music for college students was examined to study if it impacts their grades—music was played for college students in the study and their post-music grades were compared to their pre-music grades.

Case Five: A nurse researcher studies the lived experiences of registered nurses in their first year of nursing practice through one-on-one interviews. The nurse researcher records all the data and then has it transcribed to analyze the themes that emerge from the 28 nurses interviewed.

IV and DV Case Studies (Leibold, 2020)

Case Three:   Independent variable: Healthy Lifestyle education with a focus on physical activity; Dependent variable: Physical activity rate before and after education intervention, Heart rate before and after education intervention, Blood pressures before and after education intervention.

Case Four:   Independent variable: Playing classical music; Dependent variable:  Grade point averages post classical music, compared to pre-classical music.

Case Five: Independent variable: No independent variable as there is no intervention.  Dependent variable: The themes that emerge from the qualitative data.

The Steps in Quantitative Research Methodology

Now, as we learned in the last module, quantitative research is completely objective; there is no subjectivity to it. Why is this? Because the purpose of quantitative research is to make an inference about the results in order to generalize those results to the population.

In quantitative studies, there is a very systematic approach that moves from the beginning point of the study (writing a research question) to the end point (obtaining an answer). This is a very linear and purposeful flow across the study, and all quantitative research should follow the same sequence.

  • Identifying a problem and formulating a research question. Quantitative research begins with a theory – as in, “something is wrong and we want to fix it or improve it”. Think back to when we discussed research problems and formulating a research question. Here we are! That is the first step in formulating a quantitative research plan.
  • Formulating a hypothesis. This step is key. Researchers need to know exactly what they are testing so that testing the hypothesis can be achieved through specific statistical analyses.
  • A thorough literature review. At this step, researchers strive to understand what is already known about a topic and what evidence already exists.
  • Identifying a framework. When an appropriate framework is identified, the findings of a study may have broader significance and utility (Polit & Beck, 2021).
  • Choosing a study design. The research design determines exactly how the researcher will obtain the answers to the research question(s). The entire design needs to be structured and controlled, with the overarching goal of minimizing bias and errors. The design determines what data will be collected and how, how often data will be collected, and what types of comparisons will be made. You can think of the study design as the architectural backbone of the entire study.
  • Sampling. The researcher needs to determine a subset of the population that is to be studied. We will come back to the sampling concept in the next module. However, the goal of sampling is to choose a subset that adequately reflects the population of interest.
  • Instruments to be used to collect data (with reliability and validity as a priority). Researchers must find a way to measure the research variables (intervention and outcome) accurately. The task of measuring is complex and challenging, as data need to be collected reliably (measuring consistently each time) and validly (measuring what they are supposed to measure). Reliability and validity are both about how well a method measures something. The next module will cover this in detail.
  • Obtaining approval for ethical/legal human rights procedures. As we will learn in an upcoming module, there need to be methods in place to safeguard human rights.
  • Data collection. The fun part! Finally, after everything has been organized and planned, the researcher(s) begin to collect data. The pre-established plan (methodology) determines when data collection begins, how to accomplish it, how data collection staff will be trained, and how data will be recorded.
  • Data analysis. Here come the statistical analyses. The next module will dive into this.
  • Discussion. After all the analyses are complete, the researcher then needs to interpret the results and examine the implications. Researchers attempt to explain the findings in light of the theoretical framework, prior evidence, theory, clinical experience, and any limitations in the study now that it has been completed. Often, the researcher discusses not just the statistical significance but also the clinical significance, as it is common to have one without the other.
  • Summary/references. Part of the final steps of any research project is to disseminate (AKA: share) the findings. This may be in a published article, conference, poster session, etc. The point of this step is to communicate to others the information found through the study. All references are collected so that the researchers can give credit to others.
  • Budget and funding. As a last mention in the overall steps, budget and funding for research is a consideration. Research can be expensive. Often, researchers can obtain a grant or other funding to help offset the costs.

Experimental, Quasi-Experimental, and Non-Experimental Studies

To start this section, please watch this wonderful video by Jenny Barrow, MSN, RN, CNE, that explains experimental versus nonexperimental research.

(Jenny Barrow, 2019)

Now that you have that overview, continue reading this module.

Experimental Research: In experimental research, the researcher seeks to draw a conclusion between an independent variable and a dependent variable. This design attempts to establish cause-and-effect relationships among the variables. You could think of experimental research as experimenting with “something” to see if it caused “something else”.

A true experiment is called a Randomized Controlled Trial (or RCT). An RCT sits at the top of the hierarchy of quantitative experimental research; it is the gold standard of scientific research. An RCT, a true experimental design, must have 3 features:

  • An intervention: The experimenter does something to the participants by manipulating the independent variable.
  • Control: Some participants in the study receive either the standard care or no intervention at all. This is also called the counterfactual – meaning, it shows what would happen if no intervention were introduced.
  • Randomization: Randomization happens when the researcher makes sure that it is completely random who receives the intervention and who receives the control. The purpose is to make the groups equal regarding all other factors except receipt of the intervention.

Note: There is a lot of confusion among students (and even some researchers!) when they refer to “random assignment” versus “random sampling”. Random assignment is a signature of a true experiment. This means that if participants are not truly randomly assigned to intervention groups, then it is not a true experiment. We will talk more about random sampling in the next module; a minimal sketch of the distinction follows.
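
Here is that sketch (Python with NumPy; the population and group sizes are arbitrary): random sampling decides who enters the study, while random assignment decides which group each enrolled participant lands in:

```python
import numpy as np

rng = np.random.default_rng(7)
population = [f"person_{i}" for i in range(1000)]

# Random sampling: draw a subset of the population into the study
sample = rng.choice(population, size=40, replace=False)

# Random assignment: split the enrolled sample into intervention and control
shuffled = rng.permutation(sample)
intervention_group = shuffled[:20]
control_group = shuffled[20:]
```

A study can have one without the other; it is random assignment that makes a true experiment.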

One very common method for RCTs is called a pretest-posttest design. This is when the researcher measures the outcome before and after the intervention. For example, if the researcher had an IV (intervention/treatment) of a pain medication, the DV (pain) would be measured before the intervention is given and after it is given. The control group may just receive a placebo. This design permits the researcher to see if the change in pain was caused by the pain medication, because only some people received it (Polit & Beck, 2021).
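
One simple way to analyze the within-group pre/post change in such a design is a paired test, sketched below (Python with NumPy and SciPy; the pain scores are simulated, and a full RCT analysis would also compare against the placebo group):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)

# Simulated 0-10 pain scores for one group, before and after the medication
pain_before = rng.normal(7, 1, 30)
pain_after = pain_before - rng.normal(1.5, 0.8, 30)  # simulated improvement

# Paired test: each participant serves as their own pre/post comparison
t_stat, p_value = stats.ttest_rel(pain_before, pain_after)
print(f"t = {t_stat:.2f}, p = {p_value:.4f}")
```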

Another experimental design is called a crossover design . This type of design involves exposing participants to more than one treatment. For example, subject 1 first receives treatment A, then treatment B, then treatment C. Subject 2 might first receive treatment B, then treatment A, and then treatment C. In this type of study, the three conditions for an experiment are met: Intervention, randomization, and control – with the subjects serving as their own control group.

Control group conditions can be established in 4 ways:

  • No intervention is used; the control group gets no treatment at all
  • “Usual care” or standard of care or normal procedures are used
  • An alternative intervention is used (e.g., auditory versus visual stimulation)
  • A placebo or pseudo-intervention, presumed to have no therapeutic value, is used

Quasi-Experimental Research: Quasi-experiments involve an experiment, just like true experimental research. However, they lack randomization, and some even lack a control group. Therefore, there is implementation and testing of an intervention, but there is an absence of randomization.

For example, perhaps we wanted to measure the effect of yoga for nursing students. The IV (intervention of yoga) is being offered to all nursing students, and therefore randomization is not possible. For comparison, we could measure quality-of-life data on nursing students at a different university. Data are collected from both groups at baseline and then again after the yoga classes. Note that in quasi-experiments, the phrase “comparison group” is sometimes used instead of “control group” for the group against which outcome measures are compared.

Sometimes there is no comparison group either. This would be called a one-group pretest-posttest design.

Non-Experimental Research: Sometimes, cause-probing research questions cannot be answered with an experimental or quasi-experimental design because the IV cannot be manipulated. For example, if we want to measure what impact prerequisite grades have on student success in nursing programs, we obviously cannot manipulate the prerequisite grades. In another example, if we wanted to investigate how low birth weight impacts developmental progression in children, we cannot manipulate the birth weight. Often, you will see the word “observational” in lieu of “non-experimental” research. This does not mean the researcher is just standing and watching people; instead, it refers to the method of observing data that has already been established, without manipulation.

There are various types of non-experimental research:

Correlational research: A correlational research design investigates relationships between two variables (or more) without the researcher controlling or manipulating any of them. In the example of prerequisites and nursing program success, that is a correlational design. Consider, hypothetically, a researcher studying a correlation between cancer and marriage. In this study, there are two variables: disease and marriage. Let us say marriage has a negative association with cancer. This means that married people are less likely to develop cancer.
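
A minimal sketch (Python with SciPy; the GPA figures are invented) of quantifying such a relationship with a correlation coefficient:

```python
from scipy import stats

# Hypothetical data: prerequisite GPA and nursing-program GPA for 8 students
prereq_gpa = [2.8, 3.0, 3.2, 3.4, 3.5, 3.7, 3.8, 4.0]
program_gpa = [2.6, 3.1, 3.0, 3.3, 3.6, 3.5, 3.9, 3.8]

r, p_value = stats.pearsonr(prereq_gpa, program_gpa)
print(f"r = {r:.2f}, p = {p_value:.4f}")  # size and direction of the association
```

A positive r close to 1 indicates a strong positive association; by itself it says nothing about cause and effect.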

Cohort design (also called a prospective design): In a cohort study, the participants do not have the outcome of interest to begin with. They are selected based on their exposure status and are then followed over time to evaluate the occurrence of the outcome of interest. Cohorts may be divided into exposure categories once baseline measurements of a defined population are made. For example, the Framingham Cardiovascular Disease (CVD) Study used baseline measurements to divide the population into categories of CVD risk factors. Another example: comparing the test scores of a group of people who underwent extensive tutoring and a special curriculum with those of a group who did not receive any extra help. The groups could be studied for years to assess whether their scores improve over time and at what rate.

Retrospective design: In retrospective studies, the outcome of interest has already occurred (or not occurred – e.g., in controls) in each individual by the time s/he is enrolled, and the data are collected either from records or by asking participants to recall exposures. There is no follow-up of participants. For example, a researcher might examine the medical histories of 1000 elderly women to identify the causes of health problems.

Case-control design: A study that compares two groups of people: those with the disease or condition under study (cases) and a very similar group of people who do not have the condition (controls). For example, investigators conducted a case-control study to determine if there is an association between colon cancer and a high-fat diet. Cases were all confirmed colon cancer cases in North Carolina in 2010. Controls were a sample of North Carolina residents without colon cancer.

Descriptive research: A descriptive research design aims to obtain information to systematically describe a phenomenon, situation, or population. More specifically, it helps answer the what, when, where, and how questions regarding the research problem, rather than the why. For example, the researcher might wish to discover the percentage of motorists who tailgate – the prevalence of a certain behavior.

There are two other designs to mention, which are both on a time continuum basis.

Cross-sectional design: All data are collected at a single point in time. Retrospective studies are usually cross-sectional. The IV usually concerns events or behaviors occurring in the past. One cross-sectional study example in medicine is a data collection of smoking habits and lung cancer incidence in a given population. A cross-sectional study like this cannot solely determine that smoking habits cause lung cancer, but it can suggest a relationship that merits further investigation. Cross-sectional studies serve many purposes: the design is most relevant when assessing the prevalence of disease, attitudes, and knowledge among patients and health personnel, in validation studies comparing, for example, different measurement instruments, and in reliability studies.

Longitudinal design: Data are collected two or more times over an extended period. Longitudinal designs are better at showing patterns of change and at clarifying whether a cause occurred before an effect (outcome). A challenge in longitudinal studies is attrition, the loss of participants over time. In a longitudinal study, subjects are followed over time with continuous or repeated monitoring of risk factors or health outcomes, or both. Such investigations vary enormously in their size and complexity. At one extreme, a large population may be studied over decades. An example of a longitudinal design is a multiyear comparative study of the same children in an urban and a suburban school to record their cognitive development in depth.

Confounding and Extraneous Variables

Confounding variables are a type of extraneous variable that interferes with or influences the relationship between the independent and dependent variables. In research that investigates a potential cause-and-effect relationship, a confounding variable is an unmeasured third variable that influences both the supposed cause and the supposed effect.

It’s important to consider potential confounding variables and account for them in research designs to ensure results are valid. You can imagine that if something sneaks in to influence the measured variables, it can really muck up the study!

Here is an example:

You collect data on sunburns and ice cream consumption. You find that higher ice cream consumption is associated with a higher probability of sunburn. Does that mean ice cream consumption causes sunburn?

Here, the confounding variable is temperature: hot temperatures cause people to both eat more ice cream and spend more time outdoors under the sun, resulting in more sunburns.
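
The sketch below (Python with NumPy; the data are entirely simulated, with coefficients chosen for illustration) shows how a confounder manufactures a spurious association: ice cream and sunburn correlate strongly, yet the association vanishes once temperature is accounted for:

```python
import numpy as np

rng = np.random.default_rng(5)
n = 500

temperature = rng.normal(25, 7, n)                   # the confounder
ice_cream = 0.5 * temperature + rng.normal(0, 2, n)  # driven by temperature
sunburn = 0.3 * temperature + rng.normal(0, 2, n)    # also driven by temperature

# The raw correlation looks substantial even though neither causes the other
print(np.corrcoef(ice_cream, sunburn)[0, 1])

# Removing temperature's contribution (possible here because we simulated it)
# leaves residuals that are essentially uncorrelated
resid_ice = ice_cream - 0.5 * temperature
resid_sun = sunburn - 0.3 * temperature
print(np.corrcoef(resid_ice, resid_sun)[0, 1])       # ≈ 0
```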

To ensure the internal validity of research, researchers must account for confounding variables. If they fail to do so, the results may not reflect the actual relationship between the variables they are interested in.

For instance, they may find a cause-and-effect relationship that does not actually exist, because the effect they measure is caused by the confounding variable (and not by the independent variable).

Here is another example:

The researcher finds that babies born to mothers who smoked during their pregnancies weigh significantly less than those born to non-smoking mothers. However, if the researcher does not account for the fact that smokers are more likely to engage in other unhealthy behaviors, such as drinking or eating less healthy foods, then he/she might overestimate the relationship between smoking and low birth weight.

Extraneous variables are any variables that the researcher is not investigating that can potentially affect the outcomes of the research study. If left uncontrolled, extraneous variables can lead to inaccurate conclusions about the relationship between IVs and DVs.

Extraneous variables can threaten the internal validity of a study by providing alternative explanations for the results. In an experiment, the researcher manipulates an independent variable to study its effects on a dependent variable.

In a study on mental performance, the researcher tests whether wearing a white lab coat, the independent variable (IV), improves scientific reasoning, the dependent variable (DV).

Students from a university are recruited to participate in the study. The researcher manipulates the independent variable by splitting participants into two groups:

  • Participants in the experimental group are asked to wear a lab coat during the study.
  • Participants in the control group are asked to wear a casual coat during the study.

All participants are given a scientific knowledge quiz, and their scores are compared between groups.

When extraneous variables are uncontrolled, it’s hard to determine the exact effects of the independent variable on the dependent variable, because the effects of extraneous variables may mask them.

Uncontrolled extraneous variables can also make it seem as though there is a true effect of the independent variable in an experiment when there’s actually none.

In the above experiment example, these extraneous variables can affect the science knowledge scores:

  • Participant’s major (e.g., STEM or humanities)
  • Participant’s interest in science
  • Demographic variables such as gender or educational background
  • Time of day of testing
  • Experiment environment or setting

If these variables systematically differ between the groups, you can’t be sure whether your results come from your independent variable manipulation or from the extraneous variables.

In summary, an extraneous variable is anything that could influence the dependent variable. A confounding variable influences the dependent variable, and also correlates with or causally affects the independent variable.

Cause-and-Effect (Causality) Versus Association/Correlation  

A very important concept to understand is cause-and-effect, also known as causality, versus correlation. Let’s look at these two concepts in very simplified statements. Causation means that one thing caused another thing to happen. Correlation means there is some association between the two things we are measuring.

It would be nice if it were as simple as that. These two concepts can indeed be confused by many. Let’s dive deeper.

Two or more variables are considered to be related, or associated, in a statistical context if their values change together: as the value of one variable increases or decreases, the value of the other variable changes as well (in the same or the opposite direction).

For example, for the two variables of “hours worked” and “income earned”, there is a relationship between the two if the increase in hours is associated with an increase in income earned.

However, correlation is a statistical measure that describes the size and direction of a relationship between two or more variables. A correlation does not automatically mean that the change in one variable caused the change in value in the other variable.

Theoretically, the difference between the two types of relationships is easy to identify — an action or occurrence can cause another (e.g. smoking causes an increase in the risk of developing lung cancer), or it can correlate with another (e.g. smoking is correlated with alcoholism, but it does not cause alcoholism). In practice, however, it remains difficult to clearly establish cause and effect, compared with establishing correlation.

Simplified in this image, we can say that hot and sunny weather causes an increase in ice cream consumption. Similarly, we can surmise that hot and sunny weather increases the incidence of sunburns. However, we cannot say that ice cream caused a sunburn (or that a sunburn increases consumption of ice cream); the association is driven by the common cause, not by one thing causing the other. In this example, it is pretty easy to anecdotally distinguish correlation from causation. However, in research, we have statistical tests that help researchers differentiate via specialized analyses.

[Image: hot and sunny weather points to both ice cream consumption and sunburn (causation); the link between ice cream and sunburn is mere correlation.]

Here is a great Khan Academy video of about 5 minutes that shows a worked example of correlation versus causation with regard to sledding accidents and frostbite cases:

https://www.khanacademy.org/test-prep/praxis-math/praxis-math-lessons/gtp–praxis-math–lessons–statistics-and-probability/v/gtp–praxis-math–video–correlation-and-causation

References & Attribution

“Light bulb doodle” by rawpixel, licensed CC0.

“Magnifying glass” by rawpixel, licensed CC0.

“Orange flame” by rawpixel, licensed CC0.

Jenny Barrow. (2019). Experimental versus nonexperimental research. https://www.youtube.com/watch?v=FJo8xyXHAlE

Leibold, N. (2020). Research variables. Measures and Concepts Commonly Encountered in EBP. Creative Commons License: BY NC

Polit, D. & Beck, C. (2021). Lippincott CoursePoint Enhanced for Polit’s Essentials of Nursing Research (10th ed.). Wolters Kluwer Health.

Evidence-Based Practice & Research Methodologies Copyright © by Tracy Fawns is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License , except where otherwise noted.

How to Write a Strong Hypothesis | Steps & Examples

Published on May 6, 2022 by Shona McCombes. Revised on November 20, 2023.

A hypothesis is a statement that can be tested by scientific research. If you want to test a relationship between two or more variables, you need to write hypotheses before you start your experiment or data collection .

Example: Hypothesis

Daily apple consumption leads to fewer doctor’s visits.

A hypothesis states your predictions about what your research will find. It is a tentative answer to your research question that has not yet been tested. For some research projects, you might have to write several hypotheses that address different aspects of your research question.

A hypothesis is not just a guess – it should be based on existing theories and knowledge. It also has to be testable, which means you can support or refute it through scientific research methods (such as experiments, observations and statistical analysis of data).

Variables in hypotheses

Hypotheses propose a relationship between two or more types of variables.

  • An independent variable is something the researcher changes or controls.
  • A dependent variable is something the researcher observes and measures.

If there are any control variables, extraneous variables, or confounding variables, be sure to jot those down as you go to minimize the chances that research bias will affect your results.

For example, in the hypothesis “daily exposure to the sun leads to increased levels of happiness”, the independent variable is exposure to the sun – the assumed cause. The dependent variable is the level of happiness – the assumed effect.

Step 1. Ask a question

Writing a hypothesis begins with a research question that you want to answer. The question should be focused, specific, and researchable within the constraints of your project.

Step 2. Do some preliminary research

Your initial answer to the question should be based on what is already known about the topic. Look for theories and previous studies to help you form educated assumptions about what your research will find.

At this stage, you might construct a conceptual framework to ensure that you’re embarking on a relevant topic. This can also help you identify which variables you will study and what you think the relationships are between them. Sometimes, you’ll have to operationalize more complex constructs.

Step 3. Formulate your hypothesis

Now you should have some idea of what you expect to find. Write your initial answer to the question in a clear, concise sentence.

Step 4. Refine your hypothesis

You need to make sure your hypothesis is specific and testable. There are various ways of phrasing a hypothesis, but all the terms you use should have clear definitions, and the hypothesis should contain:

  • The relevant variables
  • The specific group being studied
  • The predicted outcome of the experiment or analysis

Step 5. Phrase your hypothesis in three ways

To identify the variables, you can write a simple prediction in if…then form. The first part of the sentence states the independent variable and the second part states the dependent variable.

In academic research, hypotheses are more commonly phrased in terms of correlations or effects, where you directly state the predicted relationship between variables.

If you are comparing two groups, the hypothesis can state what difference you expect to find between them.
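
For instance, reusing the article’s apple example, the same prediction can be phrased in all three forms (the wording below is illustrative, not quoted from the article):

  • If…then: If over-60s eat an apple daily, then their frequency of doctor’s visits will decrease.
  • Correlation/effect: Increasing apple consumption in over-60s will result in a decreasing frequency of doctor’s visits.
  • Group comparison: Over-60s who eat an apple daily will have fewer doctor’s visits than over-60s who do not.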

Step 6. Write a null hypothesis

If your research involves statistical hypothesis testing, you will also have to write a null hypothesis. The null hypothesis is the default position that there is no association between the variables. The null hypothesis is written as H0, while the alternative hypothesis is H1 or Ha.

  • H0: The number of lectures attended by first-year students has no effect on their final exam scores.
  • H1: The number of lectures attended by first-year students has a positive effect on their final exam scores.
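
As a minimal sketch of testing this pair of hypotheses (Python with NumPy and SciPy; the attendance and score data are simulated), a regression slope test takes H0 to be “slope = 0” (no effect):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(11)

lectures_attended = rng.integers(0, 25, size=100)
exam_score = 50 + 1.2 * lectures_attended + rng.normal(0, 8, size=100)

# linregress tests H0: slope = 0 (no effect) with a two-sided p-value
result = stats.linregress(lectures_attended, exam_score)
print(f"slope = {result.slope:.2f}, p = {result.pvalue:.2e}")
# A small p-value leads us to reject H0 in favour of H1 (an effect exists).
```
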
Research question: What are the health benefits of eating an apple a day?
Hypothesis: Increasing apple consumption in over-60s will result in decreasing frequency of doctor’s visits.
Null hypothesis: Increasing apple consumption in over-60s will have no effect on frequency of doctor’s visits.

Research question: Which airlines have the most delays?
Hypothesis: Low-cost airlines are more likely to have delays than premium airlines.
Null hypothesis: Low-cost and premium airlines are equally likely to have delays.

Research question: Can flexible work arrangements improve job satisfaction?
Hypothesis: Employees who have flexible working hours will report greater job satisfaction than employees who work fixed hours.
Null hypothesis: There is no relationship between working hour flexibility and job satisfaction.

Research question: How effective is high school sex education at reducing teen pregnancies?
Hypothesis: Teenagers who received sex education lessons throughout high school will have lower rates of unplanned pregnancy than teenagers who did not receive any sex education.
Null hypothesis: High school sex education has no effect on teen pregnancy rates.

Research question: What effect does daily use of social media have on the attention span of under-16s?
Hypothesis: There is a negative correlation between time spent on social media and attention span in under-16s.
Null hypothesis: There is no relationship between social media use and attention span in under-16s.

Null and alternative hypotheses are used in statistical hypothesis testing . The null hypothesis of a test always predicts no effect or no relationship between variables, while the alternative hypothesis states your research prediction of an effect or relationship.

Hypothesis testing is a formal procedure for investigating our ideas about the world using statistics. It is used by scientists to test specific predictions, called hypotheses , by calculating how likely it is that a pattern or relationship between variables could have arisen by chance.

Cite this Scribbr article

McCombes, S. (2023, November 20). How to Write a Strong Hypothesis | Steps & Examples. Scribbr. Retrieved June 13, 2024, from https://www.scribbr.com/methodology/hypothesis/

Hypothesis Requirements

Hypotheses are a crucial part of the scientific thinking process, and most professional scientific endeavors are hypothesis-driven. That is, they seek to address a specific, measurable, and answerable question. A well-constructed hypothesis has several characteristics: it is clear, testable, falsifiable, and serves as the basis for constructing a clear set of experiments that will allow the student to discuss why it can be accepted or rejected based on the experiments. We believe that it is important for students who publish with JEI to practice rigorous scientific thinking through generating and testing hypotheses.

This means that manuscripts that merely introduce an invention, no matter how impressive it is, are not appropriate for JEI. Here are some common examples of unacceptable “hypotheses” relating to engineering projects:

  • I hypothesize that my invention will work
  • I hypothesize that I can build this invention

If your hypothesis boils down to one of the above hypotheses, your research is engineering-based. If your manuscript is related to engineering, please read our Guidelines for Engineering-Based Projects.

Additionally, review articles, where a review of the existing literature on a topic is presented, are not eligible for publication in JEI at this time.

This video goes over the general hypothesis requirements as they relate to research eligible for publication at JEI. It was created by one of our previous authors and current student advisory board members, Akshya Mahadevan!

When you assess whether your manuscript has a clear, well-constructed hypothesis, please ask whether it meets the following five criteria:

1. It IS NOT discovery or descriptive research

Some research is not hypothesis-driven. Terms used to describe non-hypothesis-driven research are ‘descriptive research,’ in which information is collected without a particular question in mind, and ‘discovery science,’ where large volumes of experimental data are analyzed with the goal of finding new patterns or correlations. These new observations can lead to hypothesis formation and other scientific methodologies. Some examples of discovery or descriptive research include an invention, explaining an engineered design like a program or an algorithm, mining large datasets for potential targets, or even characterizing a new species. However, if you have a pre-existing hypothesis and use large datasets to test it, this is acceptable for submission to JEI.

Another way to assess whether your research is hypothesis-driven is by analyzing the experimental setup. What variables in the experiment are independent, and which are dependent? Do the results of the dependent variable answer the scientific question? Are there positive and negative control groups?

2. It IS original

While your hypothesis does not have to be completely novel within the larger field of your research topic, it cannot be obvious to you, given the background information or experimental setup. You must have developed the hypothesis and designed experiments to test it yourself. This means that the experiments cannot be prescribed – an assigned project from an AP biology course, for example.

3. It IS NOT too general/global

Example 1: “Disease X results from the expression of virulence genes.” Instead, the hypothesis should focus on the expression of a particular gene or a set of genes.

Example 2: “Quantifying X will provide significant increases in income for industry.” This is essentially untestable in an experimental setup and is really a potential outcome, not a hypothesis.

4. It IS NOT too complex

Hypothesis statements that contain words like “and” or “or” are ‘compound hypotheses’. This makes testing difficult, because one part may be true while the other is not. When your hypothesis has multiple parts, make sure that your experiments directly test the entire hypothesis. Possible further implications that you cannot test should be discussed in the Discussion.

5. It DOES NOT misdirect to the researcher

The hypothesis should not address your capabilities. “Discovering the mechanism behind X will enable us to better detect the pathogen.” This example tests the ability of the researchers to take information and use it; this is a result of successful hypothesis-driven research, not a testable hypothesis. Instead, the hypothesis should focus on the experimental system. If it is difficult to state the hypothesis without misdirecting to the researcher, the focus of the research may be discovery science or invention-based, and should be edited to incorporate a properly formulated hypothesis.

Please contact the JEI Editorial Staff at [email protected] if you have any questions regarding the hypothesis of your research.

Hypothesis-driven science in large-scale studies: the case of GWAS

James Read (ORCID: orcid.org/0000-0003-2226-0340) & Sumana Sharma (ORCID: orcid.org/0000-0003-0598-2181)

Open access | Published: 19 September 2021 | Volume 36, article number 46 (2021)

It is now well-appreciated by philosophers that contemporary large-scale ‘-omics’ studies in biology stand in non-trivial relationships to more orthodox hypothesis-driven approaches. These relationships have been clarified by Ratti (2015); however, there remains much more to be said regarding how an important field of genomics cited in that work—‘genome-wide association studies’ (GWAS)—fits into this framework. In the present article, we propose a revision to Ratti’s framework more suited to studies such as GWAS. In the process of doing so, we introduce to the philosophical literature novel exploratory experiments in (phospho)proteomics, and demonstrate how these experiments interplay with the above considerations.

Introduction

The fields of molecular biology and genetics were transformed upon completion in 2001 of the Human Genome Project (Lander et al. 2001). This provided for the first time near-complete information on the genetic makeup of human beings, and marked the advent of what has become known as the ‘post-genomics’ era, defined by the availability of large-scale data sets derived from ‘genome-scale’ approaches. In turn, this has led to a shift in biological methodology, from carefully constructed hypothesis-driven research, to unbiased data-driven approaches, sometimes called ‘-omics’ studies. These studies have attracted philosophical interest in recent years: see e.g. Burian (2007); O’Malley et al. (2010); Ratti (2015); for more general philosophical discussions of large-scale data-driven approaches in contemporary post-genomics biology, see e.g. Leonelli (2016); Richardson and Stevens (2015).

Recall that -omics studies fall into three main categories: ‘genomics’, ‘transcriptomics’, and ‘proteomics’. The salient features of these three categories are as follows (we make no claim that these features exhaust any of the three categories; they are, however, the features which are relevant to the present article). Genomics is the study of the complete set of genes (composed of DNA) inside a cell. Cellular processes lead to genetic information being transcribed (copied) into molecules known as RNA. ‘Messenger RNA’ (mRNA) carries information corresponding to the genetic sequence of a gene. Transcriptomics is the study of the complete set of RNA transcripts that are produced by the genome. Finally, the information encoded in mRNA is used by cellular machinery called ribosomes to construct proteins; proteomics is the systematic study of these proteins within a cell. Proteins are the ultimate workhorses of the cell; proteomics studies aim to characterise cellular functions mediated by protein networks, in which nodes represent proteins and edges represent physical/functional interactions between them. For further background on genomics, transcriptomics, and proteomics, see Hasin et al. (2017).

Large-scale -omics studies are often described as being ‘hypothesis-free’. To take one example from genomics: advances in genome-editing techniques mean that it is now possible to generate ‘loss-of-function’ mutants in the laboratory. Such mutations are inactivating in the sense that they lead to the loss of the function of a gene within a cell. In the last few years, CRISPR-Cas9 technology has emerged, which makes it possible to create targeted loss-of-function mutants for any of the nearly 20,000 genes in the human genome (Doudna and Charpentier 2014). This allows researchers to ‘screen’ for a gene the loss of which leads to the phenotype of interest, thereby identifying the function of that gene. The methodological idea behind such screening approaches is that one does not require any background hypothesis as to which gene could be involved in a particular biological process, or associated with a particular phenotype: hence the widespread declaration that such approaches are ‘hypothesis-free’ (Shalem et al. 2015). As Burian writes, “Genomics, proteomics, and related “omics” disciplines represent a break with the ideal of hypothesis-driven science” (Burian 2007, p. 289).

With Ratti ( 2015 ); Franklin ( 2005 ), and others, we find the terminology of ‘hypothesis-free’ to be misleading—for, in fact, such large-scale studies exhibit a Janus-faced dependence on mechanistic hypotheses of a quite standard sort. Ratti characterises such studies, and their connections with more orthodox mechanistic hypothesis-driven science, as involving three steps:

1. The generation of a preliminary set of hypotheses from an established set of premises;

2. The prioritization of some hypotheses and discarding of others by means of other premises and new evidence;

3. The search for more stringent evidence for prioritized hypotheses. (Ratti 2015, p. 201)

In step (1), scientific hypothesising plays a role, insofar as it is used to delimit the domain of inquiry of the study. For example, a loss-of-function screen to identify the receptor for a pathogen would hypothesise that there exists a non-redundant mechanism for the interaction of the pathogen with the cells, and that the loss of this cellular factor/mechanism would lead to diminution of interaction of the pathogen with the cell surface. For the purpose of the test, such hypotheses are regarded as indubitable: they delimit the range of acceptable empirical enquiry. But there is also a forward-looking dependence of these approaches on scientific hypothesising: the results of such studies can be used to generate more specific mechanistic hypotheses, certain of which are prioritised in step (2) (based on certain additional assumptions—e.g., that there is a single cellular factor/mechanism responsible for pathogen-cell interaction in the above example), and which can then be validated in downstream analysis in step (3). For example, identification of candidate viral receptors using genome-wide loss-of-function screens can be used to generate specific hypotheses regarding the identity of the associated receptor, which can then be subject to empirical test.

Although broadly speaking we concur with Ratti on these matters (in addition to concurring with other philosophers who have written on this topic, e.g. Franklin (2005); Burian (2007)), and find his work to deliver significant advances in our conceptual understanding of such large-scale studies, his citing of ‘genome-wide association studies’ (GWAS) as a means of illustrating the above points (see Ratti 2015, p. 201) invites further consideration. GWAS aims to identify causal associations between genetic variations and diseases/traits; however, it encounters serious difficulties in identifying concrete hypotheses to prioritise, as per Ratti’s (2). Different solutions to this issue (and to the related issue of GWAS ‘missing heritability’) manifest in different approaches to this prioritisation: something which deserves to be made explicit in the context of Ratti’s framework. Specifically, Ratti focuses implicitly on a ‘core gene’ approach to GWAS (cf. Boyle et al. (2017)), according to which a small number of ‘single nucleotide polymorphisms’ (this terminology will be explained in the body of this paper) are primarily responsible for the trait in question. (Note that this does not imply that only a small number of genes are associated with the relevant phenotype; rather, it assumes that some genes are more central to the manifestation of the phenotype than the majority.) There are, however, other approaches to GWAS which do not presuppose this core gene model; as explained in Wray et al. (2018) (albeit without direct reference to Ratti’s work), such approaches would lead to the prioritisation of different hypotheses in Ratti’s (2). Footnote 1

The first goal of the present paper is to expand on these matters in full detail, and to revise Ratti’s framework in order to incorporate the above points: in so doing, we gain a clearer understanding of how GWAS approaches relate to more traditional, mechanistic, hypothesis-driven science. But there is also a second goal of this paper: to explore for the first time (to our knowledge) in the philosophical literature what it would take for the above-mentioned alternative approaches (often relying on network models)—particularly those which appeal to the field of (phospho)proteomics—to succeed. Although we make no claim that such (phospho)proteomics approaches are per se superior to other strategies for hypothesis prioritisation, they are nevertheless in our view worthy of philosophical attention unto themselves, for they constitute (we contend) a novel form of exploratory experimentation (cf. Burian ( 2007 ); Franklin ( 2005 ); Steinle ( 1997 )) featuring both iterativity (cf. Elliott ( 2012 ); O’Malley et al. ( 2010 )) and appeal to deep learning (cf. Bechtel ( 2019 ); Ratti ( 2020 )).

Bringing all this together, the plan for the paper is as follows. In Sect. “GWAS studies and prioritisation”, we recall the details of GWAS, and witness how different approaches to the so-called missing heritability and coherence problems lead to the prioritisation of different hypotheses in Ratti’s (2). In Sect. “Proteomics and iterative methodology”, we turn our attention to network approaches—specifically to those informed by (phospho)proteomics—and study these through the lens of the literature on exploratory experimentation, before returning to our considerations of GWAS and addressing the question of how such network-based approaches inform the question of hypothesis prioritisation in that context. We close with some discussion of future work to be done in the philosophy both of GWAS and of big-data biology at large.

GWAS studies and prioritisation

Background on GWAS

Many applications of the framework presented in the introduction—perform genome-wide screens based on a general hypothesis (for example, ‘a gene/process is responsible for a disease’), and on the basis of the results obtained construct a more refined hypothesis for further testing—have been highly successful in biomedical research. However, there are cases in which the application of this approach has not been so straightforward. This is best illustrated by the field of genomics that studies common diseases such as inflammatory bowel disease (IBD), coronary artery disease, insomnia, and depression. These diseases are often complex in nature: they are thought to be controlled not by a single mutation, but rather to be influenced by multiple loci in the genome, and even by environmental effects.

In recent decades, researchers have developed a method to characterise genotype-phenotype associations in these diseases: ‘genome-wide association studies’ (GWAS). To understand this method, it is important to understand single nucleotide polymorphisms (SNPs). SNPs are variations in a single DNA building block, called a ‘nucleotide’, and they constitute the most common type of genetic variation among individuals. There are around 4–5 million SNPs in a person’s genome. Most SNPs have no effect on human health, but there are some cases in which these variations lead to increased chances of disease. GWAS was based originally upon a ‘common disease, common variant’ hypothesis, which states that common diseases can be attributed to common genetic variants (present in more than 1–5% of the population). By scanning the genomes of many different people, GWAS sought to identify the relationships between common genetic variations and common traits. GWAS studies remain very popular in the field of human genetics, and have been successful in identifying a number of novel variant-trait associations (for example, in diseases such as those mentioned above). For a clear introduction to GWAS from the biology literature, see Tam et al. (2019); for existing philosophical works on GWAS, with further details on such studies complementary to those presented in this paper, see e.g. Bourrat (2020); Bourrat and Lu (2017).
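To make the method concrete, the statistical core of a GWAS can be sketched in a few lines: an independent association test at each SNP between cases and controls, with a stringent genome-wide significance threshold correcting for the roughly one million tests performed. The sketch below is illustrative only (the data layout, names, and the simple allele-count chi-square test are our assumptions, not a description of any particular study’s pipeline, which would typically use regression models with covariates):

```python
# Illustrative sketch of the statistical core of a GWAS (hypothetical data
# layout; real pipelines use regression models with covariates).
import numpy as np
from scipy.stats import chi2_contingency

GENOME_WIDE_ALPHA = 5e-8  # conventional threshold correcting for ~1M tests

def test_snp(case_alleles, control_alleles):
    """Chi-square test on a 2x2 table of allele counts (ref/alt x case/control)."""
    table = np.array([
        [np.sum(case_alleles == 0), np.sum(case_alleles == 1)],
        [np.sum(control_alleles == 0), np.sum(control_alleles == 1)],
    ])
    _, p_value, _, _ = chi2_contingency(table)
    return p_value

def gwas_scan(case_genotypes, control_genotypes):
    """Return indices of SNPs passing genome-wide significance.

    Both arguments are arrays of shape (n_individuals, n_snps) holding
    0 (reference) / 1 (alternative) allele codes.
    """
    return [
        snp for snp in range(case_genotypes.shape[1])
        if test_snp(case_genotypes[:, snp], control_genotypes[:, snp])
        < GENOME_WIDE_ALPHA
    ]
```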

GWAS’ discontents

GWAS is, however, not without its critics. A clear conclusion from multiple GWAS studies is that even statistically highly significant hits identified in such studies are able to account for only a small fraction of the heritability of the trait/disease in question. (Recall that ‘heritability’ is the measure of the proportion of the phenotypic variance in a population that can be attributed to genetic differences—see Downes and Matthews (2020) and references therein for further details.) Moreover, GWAS studies often implicate large numbers of genes. To put this into perspective: three GWAS studies performed for height in 2008 identified 27, 12, and 20 associated genomic regions, which accounted for merely 3.7%, 2.0%, and 2.9% of the population variation in height, respectively (Lettre et al. 2008; Weedon et al. 2008; Gudbjartsson et al. 2008). This was in sharp contrast with estimates from previous genetic epidemiology studies, based upon twin studies, Footnote 2 which put the heritability of height at around 80% (Yang et al. 2010). In the early days of GWAS, this apparent discrepancy came to be known as the missing heritability problem. For recent philosophical discussion of this problem, see Bourrat (2020); Bourrat and Lu (2017); Bourrat et al. (2017); Bourrat (2019); Downes and Matthews (2020); Matthews and Turkheimer (2019).
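The gap just described can be stated compactly. In the standard additive model of quantitative genetics (our notation, introduced for illustration; it is not used in the paper), narrow-sense heritability is the ratio of additive genetic variance to total phenotypic variance, and the fraction captured by GWAS sums the variance contributed by each significant SNP:

```latex
% Narrow-sense heritability, and the fraction captured by GWAS hits under a
% standard additive model (notation ours, introduced for illustration):
\[
  h^2 \;=\; \frac{V_A}{V_P},
  \qquad
  h^2_{\mathrm{GWAS}} \;=\; \frac{1}{V_P} \sum_{i \in \mathrm{hits}} 2\, p_i (1 - p_i)\, \beta_i^2,
\]
% where $p_i$ is the allele frequency and $\beta_i$ the additive effect of
% SNP $i$. With the height figures above, the shortfall is roughly
\[
  h^2_{\mathrm{twin}} - h^2_{\mathrm{GWAS}} \;\approx\; 0.80 - 0.03,
\]
% which is the `missing heritability'.
```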

Geneticists have since proposed a number of solutions to the missing heritability problem. The three most commonly-discussed such solutions are classified by Gibson (2012) as follows:

1. Complex diseases are polygenic, and many loci with small effects account for the phenotypic variance.

2. Common diseases are caused by rare genetic variants, each of which has a large effect size.

3. Most common diseases result from interactions between many factors, such as gene-gene interaction effects and effects from environmental factors.

(We take the proposals for solving the missing heritability problem presented in Bourrat (2020); Bourrat and Lu (2017); Bourrat et al. (2017); Bourrat (2019), which invoke factors from the epigenome, to fall into category (3); we discuss these proposals further in Sect. “GWAS reprise”.) From multiple GWAS studies on common diseases there is now overwhelming evidence that common diseases are polygenic, as large numbers of genes are often implicated for a given disease. On this framework, however, it is estimated that it would take 90,000–100,000 SNPs to explain 80% of the population variation in height. In light of this, Goldstein (2009) raised the concern that “[i]n pointing at ‘everything’, the danger is that GWAS could point at ‘nothing’”.

It is understandable that one might find it unpalatable that no single gene or process can be associated with a particular disease. But the situation here is not as straightforward as the above remarks might suggest. Indeed, Boyle et al. (2017) propose the following refinement of this idea:

Intuitively, one might expect disease-causing variants to cluster into key pathways that drive disease etiology. But for complex traits, association signals tend to be spread across most of the genome—including near many genes without an obvious connection to disease. We propose that gene regulatory networks are sufficiently interconnected such that all genes expressed in disease-relevant cells are liable to affect the functions of core disease-related genes and that most heritability can be explained by effects on genes outside core pathways. We refer to this hypothesis as an ‘omnigenic’ model.

Boyle et al. (2017) propose that, within the large number of genes implicated in GWAS, there are a few ‘core’ genes that play a direct role in disease biology; the large number of other genes identified are ‘peripheral’, and have no direct relevance to the specific disease but play a role in general regulatory cellular networks. By introducing their ‘omnigenic’ model, Boyle et al. (2017) acknowledge the empirical evidence that GWAS on complex diseases does in fact implicate large numbers of genes; they thereby seem to draw a distinction between complex diseases and classical Mendelian disorders, in which a small number of highly deleterious variants drive the disease. However, their suggestion of the existence of a small number of ‘core’ genes backtracks on this, and paints complex diseases with the same brushstrokes as classical Mendelian disorders. A number of authors have welcomed the suggestion that genes implicated in complex diseases play a role in regulatory networks, but have found the dichotomy between core and peripheral genes to be an ill-motivated attempt to fit complex disease into what we intuitively think should be the framework of a disease (‘a small number of genes should be responsible for a given disease’). For example, Wray et al. (2018) write:

It seems to us to be a strong assumption that only a few genes have a core role in a common disease. Given the extent of biological robustness, we cannot exclude an etiology of many core genes, which in turn may become indistinguishable from a model of no core genes.

We concur with this verdict. One possible reconstruction of the reasoning behind Boyle et al.’s (2017) endorsement of the ‘core’ versus ‘peripheral’ gene distinction is that it offers a solution to the missing heritability problem. These authors advocate using experimental methods that are able to identify rare variants with high effect sizes (solution (2) to the missing heritability problem as presented above), as this is where they suspect the ‘core’ genes can be identified. However, there is at present no evidence that the ‘core gene’ hypothesis need invariably be true for complex diseases (cf. Wray et al. (2018)), so one might be inclined to reject the original hypothesis that all diseases must fit the mould of ‘a small number of genes causes complex diseases’. In so doing, one would thereby need to embrace the claim that at least some complex diseases are polygenic, and that putative ‘core’ genes are, in fact, no more important than putative ‘peripheral’ genes in this context.

This, however, still leaves us with the original issue that Boyle et al. ( 2017 ) were trying to address: how is it that genes which look disconnected are, in fact, together implicated in a given disease? In addressing this question, we again concur with Wray et al. ( 2018 ), who write:

To assume that a limited number of core genes are key to our understanding of common disease may underestimate the true biological complexity, which is better represented by systems genetics and network approaches.

That is to say, understanding gene functions and the interplay between the different genes is key to answering why many genes are involved in complex diseases. This is not a straightforward task and a full characterisation of the roles that genes play in biological systems remains a distant prospect.

One approach to addressing this issue is to identify relationships between genes in a cell by way of a systems biology approach, the underlying premises of which are that cells are complex systems and that genetic units in cells rarely operate in isolation. Hence, on this view, understanding how genes relate to one another in a given context is key to establishing the true role of variants identified from GWAS hits. There are a number of approaches described in the field of systems biology for identifying gene-gene relationships. One widely-implemented approach is to construct ‘regulatory networks’ relating these genes. A regulatory network is a set of genes, or parts of genes, that interact with each other to control a specific cell function. With recent advances in high-throughput transcriptomics, it is now possible to generate complex regulatory networks of how genes interact with each other in biological processes, and to define the roles of genes in a context-dependent manner based on mRNA expression in a cell. As the majority of GWAS hits lie in non-coding regions of the genome, which are often involved in regulating gene expression, networks based on mRNA expression are a powerful means of interpreting the functional role of variants identified by GWAS.
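As a toy illustration of how expression data can suggest gene-gene relationships, one simple (and deliberately crude) construction connects genes whose mRNA expression profiles correlate strongly across samples. Real regulatory-network inference is considerably more sophisticated; the function below, with its hypothetical inputs and threshold, is only meant to make the idea tangible:

```python
# Crude co-expression network: connect genes whose expression profiles are
# strongly correlated across samples. Inputs and threshold are hypothetical.
import numpy as np

def coexpression_network(expression, gene_names, threshold=0.8):
    """expression: (n_samples, n_genes) matrix of mRNA levels.

    Returns weighted edges (gene_i, gene_j, correlation) for strongly
    co-expressed gene pairs.
    """
    corr = np.corrcoef(expression, rowvar=False)  # (n_genes, n_genes)
    n = len(gene_names)
    return [
        (gene_names[i], gene_names[j], corr[i, j])
        for i in range(n)
        for j in range(i + 1, n)
        if abs(corr[i, j]) >= threshold
    ]
```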

Another approach to the functional validation of GWAS hits—currently substantially less common—proceeds by constructing networks generated from expression of proteins/phosphoproteins in a cell (more details of these approaches will be provided in the following section). Such approaches would in principle depict completely the underlying state of the cell. Combined with gene expression data, protein expression networks and signalling networks from proteomics would make transparent the functional role of the variants identified in GWAS studies in a given context—that is, they would provide a mechanistic account of disease pathogenesis without recourse to a neo-Mendelian ‘core gene’ model. Genes which prima facie appear disconnected and irrelevant to disease biology may be revealed by these approaches to be relevant after all. To illustrate, consider a complex disease such as IBD: it is thought that both (i) a disturbed interaction between the gut and the intestinal microbiota, and (ii) an over-reaction of the immune system, are required for this disease phenotype to manifest. Thus, it is likely that a number of genetic pathways will be important—pathways which need not prima facie be connected, but which may ultimately be discovered to be related in some deeper way. These proteomics-informed network approaches would thereby afford one resolution to what has been dubbed by Reimers et al. ( 2019 ) and Craver et al. ( 2020 ) the ‘coherence problem’ of GWAS: to explain how it is that all genes implicated in these studies are related to one another mechanistically. Footnote 3 Clearly, these approaches could be brought to bear in order to vindicate responses (1) or (3) to the missing heritability problem, presented above. Footnote 4

To close this subsection, it is worth reflecting on how the ‘core gene’ hypothesis might intersect with network-based approaches. If a core gene exists, then a network analysis should (at least in principle) be able to identify it; in this sense, a ‘core gene’ hypothesis can be compatible with a network approach. As already mentioned above, however, there is no evidence that such core genes invariably exist: a network analysis could (in principle) identify many ‘central hubs’, rather than just one—an outcome not obviously compatible with the ‘core gene’ hypothesis. (For more on this latter possibility, cf. the very recent work of Barrio-Hernandez et al. ( 2021 ), discussed further below.)

Ratti’s framework for large-scale studies

Suppose that one follows (our reconstruction of) Boyle et al. ( 2017 ), in embracing option (2) presented above as a solution to the GWAS missing heritability problem. One will thereby, in Ratti’s second step in his three-step programme characterising these data-driven approaches to biology, prioritise hypotheses according to which a few rare genes are responsible for the disease in question. This, indeed, is what Ratti ( 2015 ) suggests in §2.2 of his article. However, one might question whether this prioritisation is warranted, in light of the lack of direct empirical evidence for this neo-Mendelian hypothesis (as already discussed). Wray et al. ( 2018 ), for example, write that

... [t]o bias experimental design towards a hypothesis based upon a critical assumption that only a few genes play key roles in complex disease would be putting all eggs in one basket.

If one concurs with Wray et al. ( 2018 ) on this matter (as, indeed, we do), then one may prioritise different hypotheses in the second step of Ratti’s programme—in particular, one may prioritise specific hypotheses associated with ‘polygenic’ models which would constitute approach (1) and/or approach (3) to the missing heritability problem.

This latter point should be expanded. Even if one does embrace a ‘polygenic’ approach to the missing heritability problem (i.e., approach (1) and/or approach (3)), and applies e.g. networks (whether transcriptomics-based, or (phospho)proteomics-informed, or otherwise—nothing hinges on this for our purposes here) in order to model the genetic factors responsible for disease pathogenesis, ultimately one must prioritise specific hypotheses for laboratory test. For example, Schwartzentruber et al. ( 2021 ) implement in parallel a range of network models within the framework of a polygenic approach in order to prioritise genes such as TSPAN14 and ADAM10 in studies on Alzheimer’s disease (we discuss further the methodology of Schwartzentruber et al. ( 2021 ) in §3.3 ). Note, however, that these specific hypotheses might be selected for a range of reasons—e.g., our prior knowledge of the entities involved, or ease of testability, or even financial considerations—and that making such prioritisations emphatically does not imply that one is making implicit appeal to a ‘core gene’ model. This point is corroborated further by the fact that the above two genes are not the most statistically significant hits in the studies undertaken by Schwartzentruber et al. ( 2021 ), as one might expect from those working within the ‘core gene’ framework.

Returning to Ratti’s framework: we take our noting this plurality of options vis-à-vis hypothesis prioritisation to constitute a friendly modification to this framework appropriate to contexts such as that of GWAS. But of course, if one were to leave things here, questions would remain—for it would remain unclear which polygenic model of disease pathogenesis is to be preferred, and how such models are generated. Given this, it is now incumbent upon us to consider in more detail how such approaches set about achieving these tasks in practice: due both to their potential to offer underlying mechanistic models of the cell, as well as due to the novel iterative methodology for hypothesis generation involved, we focus largely in the remainder upon (phospho)proteomics-based approaches.

Proteomics and iterative methodology

Proteomics promises to afford the ultimate fundamental mechanistic account of cellular processes; data from proteomics would, therefore, illuminate the underlying relationships between the variants identified in GWAS studies. In this section, we explore in greater detail how such proteomics approaches proceed; they constitute a novel form of ‘exploratory experimentation’ (in the terminology of Burian (2007); Steinle (1997)), worthy in its own right of exposure in the philosophical literature. Footnote 5 In proteomics, further complications for hypothesis generation and testing arise, for data is sparse, and experiments are often prohibitively expensive to perform. Given these constraints, how is progress to be made? It is to this question that we now turn; the structure of the section is as follows. In Sect. “Proteomics: a data-deprived field”, we present relevant background regarding proteomics. Then, in Sect. “Methodological iteration”, we argue that the development of this field can be understood on the model of a novel form of iterative methodology (cf. Chang 2004; O’Malley et al. 2010). We return to the relevance of these approaches for GWAS in Sect. “GWAS reprise”.

Proteomics: a data-deprived field

The ultimate aim of -omics studies is to understand the cell qua biological system. Transcriptomics is now sufficiently well-advanced to accommodate large-scale systematic studies to the point of being used to validate variants identified from GWAS. Footnote 6 By contrast, proteomics—the study of proteins in a cell—remains significantly under-studied. Technologies allowing for the systematic study of proteins are not as advanced as those for studying genes and transcripts; this is mainly because no method currently exists for directly amplifying proteins (i.e., increasing the amount of a desired protein in a controlled laboratory context): a methodology which has been key for genomics and transcriptomics. Proteins are very diverse in the cell: a single gene/transcript gives rise to multiple proteins. Proteins themselves can be modified in the cell after being created, thus further increasing the complexity of proteomics studies. Unlike genomics and transcriptomics, in which it is now common to perform systematic genome-wide or transcriptome-wide approaches, studies of proteins are therefore usually taken piecemeal.

Proteomics research tends to focus on families of proteins that are involved in a particular known biological process. Among the important families of proteins are kinases and phosphatases, which are molecules responsible for signal transmission in the cell. These proteins are able to modify other proteins by adding or removing a phosphate group (respectively). This modification changes the shape (‘conformation’) of the protein, rendering it active or inactive, Footnote 7 depending on the context. By examining the phosphorylation state of the proteins inside a cell, it is possible to infer the signalling state of that cell. The field of phosphoproteomics aims to characterise all phospho-modified proteins within a cell. This is thought to be one of the most powerful and fundamental ways of inferring the signalling processes within a cell; the approach could add a substantial new layer to our understanding of both basic and disease biology. That said, a recent estimate suggests that current approaches have identified kinases for less than 5% of the phosphoproteome. What is even more staggering is that almost 90% of the phosphorylation modifications that have been identified have been attributed to only 20% of kinases. The other 80% of the kinases are completely dark: their functions remain unknown. For many such kinases, we do not even know where in the cell they are located. (See Needham et al. (2019) for a review of the current state of play in phosphoproteomics.)

In such a field, systematic studies to quantify the entire phosphoproteome in a cell, together with an ability to assign a kinase to every phosphorylated component, would be the ultimate aim. But phosphoproteomics studies are currently extremely expensive, and there are technological limitations in mapping the global phosphoproteome—not least the sparsity of data, which often results from limitations in the technical setup of laboratory measurements and experiments. For example: the same sample measured in the same machine on two different occasions will give readings for different phosphoproteins. Some statistical methods can be used to overcome these limitations, but these require making assumptions regarding the underlying biology, which defeats the point of an unbiased study.

In spite of these difficulties, it has been shown that if one combines multiple large-scale phosphoproteomics data sets (each admittedly incomplete), it is possible to predict kinase-kinase regulatory relationships in a cell using data-driven phosphoprotein signalling networks obtained via supervised machine learning approaches (a recent study from Invergo et al. (2020) showcases one such approach; we will use this as a running example in what follows). Footnote 8 First, a training set of data is used to teach a machine a classification algorithm. Once the classification algorithm is learnt, the machine is set to the task of applying it to unlabelled data: in our case, the goal is to identify further, as-yet unknown, regulatory protein relationships or non-relationships. (On machine learning and network analysis of biological systems, see also Bechtel (2019) and Ratti (2020).)

Before assessing such phosphoproteomics machine learning algorithms as that of Invergo et al. (2020), there are two further complications with the current state of play in proteomics which need to be mentioned. First: it is much easier to curate positive lists of interactions than negative lists. (This is essentially a case of its being easier to confirm existentially quantified statements than universally quantified statements: for how can we ever truly ascertain that any two given proteins never interact?) Thus, at present, negative lists obtained from laboratory experiments are underpopulated. Invergo et al. (2020) attempt to circumvent this issue in the following way: they assume that regulatory relationships are rare, so that if one were to randomly sample protein associations, one could create reliably large artificial negative sets; indeed, they generate artificial negative sets in exactly this way. (Clearly, this means that these approaches again cannot be understood as being ‘hypothesis-free’: cf. Sect. “Introduction”.)
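The negative-set strategy and the supervised classification step just described can be sketched as follows. This is a minimal illustration under our own assumptions (a generic random-forest classifier and a placeholder featurisation of kinase pairs), not a description of Invergo et al.’s actual pipeline:

```python
# Minimal sketch of the artificial-negative-set strategy plus a supervised
# classifier over kinase pairs. Model choice and featurisation are
# hypothetical, not Invergo et al.'s actual pipeline.
import random
import numpy as np
from sklearn.ensemble import RandomForestClassifier

def make_artificial_negatives(kinases, positive_pairs, n_negatives, seed=0):
    """Sample random kinase pairs as presumed non-relationships.

    Rests on the assumption flagged in the text: regulatory relationships
    are rare, so a random pair is almost surely a true negative.
    """
    rng = random.Random(seed)
    positives = set(positive_pairs)
    negatives = set()
    while len(negatives) < n_negatives:
        pair = (rng.choice(kinases), rng.choice(kinases))
        if pair[0] != pair[1] and pair not in positives:
            negatives.add(pair)
    return list(negatives)

def train_regulatory_classifier(featurise, positive_pairs, negative_pairs):
    """Fit a binary classifier on labelled kinase pairs.

    featurise: maps a (kinase, kinase) pair to a feature vector, e.g.
    co-phosphorylation statistics across data sets (placeholder here).
    """
    pairs = positive_pairs + negative_pairs
    X = np.array([featurise(p) for p in pairs])
    y = np.array([1] * len(positive_pairs) + [0] * len(negative_pairs))
    model = RandomForestClassifier(n_estimators=200, random_state=0)
    model.fit(X, y)
    return model  # model.predict_proba ranks unlabelled pairs for follow-up
```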

The second problem with the current state of play in proteomics is this: when a given interaction occurs is a function of multifarious factors, most notably cell context. This context-dependence means that an entry in a negative set in one context might, in fact, be an entry in a positive set in another. To illustrate: in the case of regulatory relationships between two kinases, it is known that such relationships can be prone to dysregulation in diseases such as cancer. Hence, a well-annotated positive set relationship can very well be dysregulated in a cancer context, so that this relationship no longer exists, effectively putting it into a negative set. The problem is that many data-driven approaches rely on data that are generated in simple reductionist systems such as cancer cell lines—so that the results obtained might not carry across to the target physiological context. (Cancer cell lines can grow infinitely, and thus are ideal for experiments.) The approach taken by Invergo et al. ( 2020 ) utilises data from breast cancer cell lines; hence, the relationships they predict could be specific to a dysregulated system. In response to this second problem, we suggest replying on behalf of Invergo et al. ( 2020 ) that most regulatory relationships fundamental to the functioning of the cell should hold true in most contexts. At present, however, given the data-deprived nature of proteomics, there is little direct evidence for this hypothesis. (Again, the appeal to any such hypothesis would mean that such proteomics approaches cannot be ‘hypothesis-free’.)

Thus, the fact that Invergo et al. ( 2020 ) utilise data from breast cancer cell lines raises the possibility that their machine learning algorithms might be trained on data unsuited to other contexts, leading to concerns regarding error propagation. This general concern regarding the context-specificity (or lack thereof) of input data sets is, however, recognised by authors in the field—for example, Barrio-Hernandez et al. ( 2021 ) note that “improvements in mapping coverage and computational or experimental approaches to derive tissue or cell type specific networks could have a large impact on future effectiveness of network expansion” (Barrio-Hernandez et al. 2021 , p. 14).

Methodological iteration

In spite of these problems, Invergo et al. ( 2020 ) argue that the results obtained from their approach afford a useful means of bootstrapping further progress in phosphoproteomics. As they put it:

Although we do not suggest that these predictions can replace established methods for confirming regulatory relationships, they can nevertheless be used to reduce the vast space of possible relationships under consideration in order to form credible hypotheses and to prioritize experiments, particularly for understudied kinases. (Invergo et al. 2020 , p. 393)

One way to take this point is the following. Ideally, in order to construct positive and negative sets, one would test in the laboratory each individual protein association. Practically, however, this would be an unrealistic undertaking, as we have already seen. What can be done instead is this:

1. Generate a global phosphoproteomics data set, albeit one that is incomplete and sparse (e.g., that presented in Wilkes et al. (2015)), based upon laboratory experiments.

2. Train, using this data set and input background hypotheses of the kind discussed above, a machine learning algorithm (such as that presented in Invergo et al. (2020)) to identify candidate interactions in the unknown space of protein-protein interactions. Footnote 9

3. Use these results to guide further laboratory experimentation, leading to the development of more complete data sets.

4. Train one’s machine learning algorithms on these new data sets, to improve performance; in turn, repeat the above process. (A schematic rendering of this loop is given in the sketch after this list.)
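The following self-contained toy makes the four-step loop concrete. The ‘experiments’ are simulated by revealing labels from a hidden ground truth, and every name and parameter is a placeholder of ours rather than part of any real pipeline:

```python
# Toy rendering of the four-step iterative loop: train, predict, "test" the
# most promising candidates, fold results back in, retrain. Simulated data.
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)
N_PAIRS, N_FEATURES = 1000, 8
features = rng.normal(size=(N_PAIRS, N_FEATURES))          # kinase-pair features
truth = (features[:, 0] + features[:, 1] > 1).astype(int)  # hidden ground truth
labelled = list(rng.choice(N_PAIRS, size=50, replace=False))  # step 1: sparse data

for _ in range(4):
    # Step 2: train a classifier on the currently labelled pairs.
    model = RandomForestClassifier(n_estimators=100, random_state=0)
    model.fit(features[labelled], truth[labelled])
    # Step 3: "experimentally test" the most confident unlabelled predictions.
    known = set(labelled)
    unlabelled = [i for i in range(N_PAIRS) if i not in known]
    scores = model.predict_proba(features[unlabelled])[:, 1]
    chosen = [unlabelled[j] for j in np.argsort(scores)[-25:]]
    # Step 4: fold the new "experimental" labels back in and repeat.
    labelled.extend(chosen)
```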

Clearly, a process of reflective equilibrium is at play here (cf. Daniels (2016)). As is well-known, Chang (2004) has proposed an iterative conception of scientific methodology, according to which the accrual of scientific hypotheses is not a linear matter; rather, initial data may lead to the construction of a theoretical edifice which leads one to develop new experiments to revise one’s data, at which point the process iterates. This fits well with the above-described procedures deployed in phosphoproteomics; it also accords with previous recognition of the role of iterative procedures in large-scale biological studies—see e.g. O’Malley et al. (2010) and Elliott (2012).

Let us delve into this a little deeper. As Chang notes,

There are two modes of progress enabled by iteration: enrichment , in which the initially affirmed system is not negated but refined, resulting in the enhancement of some of its epistemic virtues; and self-correction , in which the initially affirmed system is actually altered in its content as a result of inquiry based on itself. (Chang 2004 , p. 228)

Certainly and uncontroversially, enrichment occurs in the above four-step process in phosphoproteomics: the new data yield a refinement of our previous hypotheses in the field. In addition, however, it is plausible to understand the above iterative methodology as involving self-correction: for example, it might be that the machine learning algorithm of Invergo et al. (2020) identifies a false positive, yet nevertheless makes sufficiently focused novel predictions with respect to other candidate interactions to drive new experimentation, leading to a new data set on which the algorithm can be trained, such that, ultimately, the refined algorithm does not make a false positive prediction for that particular interaction. This is entirely possible in the above iterative programme; thus, we maintain that both modes of Changian iterative methodology are at play in this approach.

There is another distinction which is also relevant here: that drawn by Elliott ( 2012 ) between ‘epistemic iteration’—“a process by which scientific knowledge claims are progressively altered and refined via self-correction or enrichment”—and ‘methodological iteration’—“a process by which scientists move repetitively back and forth between different modes of research practice” (Elliott 2012 , p. 378). It should be transparent from our above discussion that epistemic iteration is involved in these proteomics approaches. Equally, though, it should be clear that methodological iteration is involved, for the approach alternates between machine learning and more traditional laboratory experimentation. That machine learning can play a role in an iterative methodology does not seem to have been noted previously in the philosophical literature—for example, it is not identified by Elliott ( 2012 ) as a potential element of a methodologically iterative approach; on the other hand, although the role of machine learning in network modelling and large-scale studies is acknowledged by Bechtel ( 2019 ) and Ratti ( 2020 ) (the latter of whom also discusses—albeit without explicitly using this terminology—the role of machine learning in epistemic iteration: see (Ratti 2020 , p. 89)), there is no mention of its role in an iterative methodology such as that described above.

GWAS reprise

Given the foregoing, we hope it is reasonable to state that the approaches to proteomics of e.g. Invergo et al. (2020) constitute novel forms of exploratory experimentation, worthy of study in their own right. Let us, however, return now to the matter of polygenic approaches to GWAS hits. In principle, the results of the methodologies of e.g. Invergo et al. (2020) could further vindicate these approaches, by providing mechanistic models of which genes interact in a disease context, and of when and why they do so. In turn, they have the capacity to allow biologists to prioritise specific hypotheses in Ratti’s step (2), without falling back upon the assumption that only a few genes are directly involved in complex disease biology.

Note that there is a complex interplay between this iterative methodology and the ‘eliminative induction’ of stages (1) and (2) of Ratti’s analysis (see Sect. “Introduction”; for earlier sources on eliminative induction, see Earman (1992); Kitcher (1993); Norton (1995)). We take this to consist in the following. First, a methodology such as that of Invergo et al. (2020) is used to generate a particular network-based model of the factors which are taken to underlie a particular phenotype. This model is used to prioritise (à la eliminative induction) particular hypotheses, as per stage (2) of Ratti’s framework; these are then subjected to specific tests, as per stage (3) of Ratti’s framework. The data obtained from this more traditional experimentation are then used to construct more sophisticated network models within the framework of Invergo et al. (2020); these in turn lead to the (eliminative-inductive) prioritisation of further specific hypotheses amenable to specific test. As already discussed above, this is a clear example of the ‘methodological iteration’ of Elliott (2012).

It bears stressing that (phospho)proteomics network-based approaches may, ultimately, constitute only one piece of the solution to the broader puzzle that is GWAS hypothesis prioritisation. In very recent work, Schwartzentruber et al. ( 2021 ) have brought to bear upon this problem consideration of, inter alia , epigenomic factors alongside network-based analyses. There are two salient points to be made on this work. First: although Bourrat et al. ( 2017 ) are correct that epigenomic studies and background may have a role to play in addressing the missing heritability problem (cf. Bourrat ( 2019 , 2020 ); Bourrat and Lu ( 2017 )), a view in contemporary large-scale biological studies—evident in papers such as Schwartzentruber et al. ( 2021 )—is that these considerations can be supplemented with yet other resources, such as network-based studies; we concur with this verdict. Second: in order to construct these networks, Schwartzentruber et al. ( 2021 ) rely on established protein-protein interaction databases such as STRING, IntAct and BioGRID (Schwartzentruber et al. 2021 , p. 397). While effective in their own right, networks developed from such databases have the disadvantage that they represent signalling in an ‘average’ cell, and are therefore unsuitable for studying dynamic context- and cell-type-specific signalling responses (cf. Sharma and Petsalaki ( 2019 )). In this regard, it would (at least in principle) be preferable to utilise regulatory and context-specific networks developed using methods described in work such as that of Invergo et al. ( 2020 ) in future approaches to GWAS hypothesis prioritisation. That being said, in practice this may not yet be fruitful, as at present contemporary large-scale biology is only at the early stages of the iterative processes discussed above; moreover, the training data sets used by such methods remain at this stage not completely context-specific (recall that Invergo et al. ( 2020 ) utilise a breast cancer training set)—meaning that the potential of such work to yield detailed, context-specific network-based models is yet to be realised in full.

With all of the above in hand, we close this subsection by considering more precisely the question of how the machine learning algorithms of Invergo et al. ( 2020 ) bear upon the missing heritability problem. Having developed regulatory protein-protein interaction networks on the basis of such algorithms, one can take (following here for the sake of concreteness the lead of Barrio-Hernandez et al. ( 2021 )) the connection with hypothesis prioritisation in GWAS (and, in turn, the missing heritability problem) to proceed via the following steps (also summarised visually in Fig.  1 ):

1. Select a protein-protein interaction network. Usually this is a pre-existing curated network, such as those defined in the STRING database (discussed above); instead of such curated networks, however, one may use networks developed on the machine learning models of e.g. Invergo et al. (2020).

2. Within those networks, identify the nodes (i.e., proteins) which correspond to hits from a particular GWAS (i.e., the proteins associated with the genes identified in the GWAS). Footnote 10

3. Use network propagation methods (see e.g. Cowen et al. (2017) for a review of such methods), potentially alongside other factors (as discussed in e.g. Schwartzentruber et al. (2021)), in order to identify known modules (i.e., separated substructures within a network) associated with the disease in question. (A minimal sketch of this propagation step follows the list.)

4. Target elements of those modules, regardless of whether or not they were hits in the original GWAS. (This latter approach—of targeting beyond the original GWAS hits—is novel to the very recent work of Barrio-Hernandez et al. (2021).)
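For concreteness, here is a minimal sketch of the propagation step (3): a random walk with restart over a protein-protein interaction graph, with the GWAS-hit proteins as seeds. The graph, gene names, and parameter values are hypothetical of ours; real analyses use curated networks and tuned diffusion schemes (cf. Cowen et al. (2017)):

```python
# Minimal network propagation by random walk with restart: diffuse scores
# from GWAS "seed" proteins over a PPI graph. Toy graph; names hypothetical.
import networkx as nx
import numpy as np

def propagate(graph, seeds, restart=0.5, n_iter=100):
    """Return a node -> score dict; high-scoring non-seeds suggest modules."""
    nodes = list(graph.nodes)
    index = {n: i for i, n in enumerate(nodes)}
    A = nx.to_numpy_array(graph, nodelist=nodes)
    W = A / A.sum(axis=0, keepdims=True)  # column-stochastic walk matrix
    p0 = np.zeros(len(nodes))
    for s in seeds:
        p0[index[s]] = 1.0 / len(seeds)
    p = p0.copy()
    for _ in range(n_iter):
        p = (1 - restart) * (W @ p) + restart * p0
    return dict(zip(nodes, p))

# Toy usage: two GWAS hits jointly implicate their shared neighbour GENE_C.
G = nx.Graph([("GENE_A", "GENE_C"), ("GENE_B", "GENE_C"), ("GENE_C", "GENE_D")])
scores = propagate(G, seeds=["GENE_A", "GENE_B"])
```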

Fig. 1: The application of networks to GWAS hit prioritisation. In (1), GWAS hits are converted to candidate gene lists. In (2), one selects a cellular network: this could be a gene regulatory network, or a protein-protein interaction network (e.g. from STRING), or a protein-protein regulatory network (possibly constructed via the machine learning methodologies of Invergo et al. (2020)). In (3), genes associated with the GWAS loci are mapped to the chosen network. In (4), network propagation methods (e.g. diffusion techniques) are applied in order to identify potential disease-related genes not picked up by the GWAS. In (5), the results of these network analyses are used to identify significant genetic modules to be targeted experimentally in investigations into disease pathogenesis. Note, following Wray et al. (2018) and Barrio-Hernandez et al. (2021), that this particular means of bridging the gap between cellular networks and investigations into the results of GWAS hits does not presuppose a ‘core gene’ hypothesis.

On (2) and (3): Boyle et al. (2017) may or may not be correct that many genes are implicated (either in the original screen, or after the network analysis has been undertaken)—recall from Sect. “GWAS’ discontents” their ‘omnigenic’ model. However, on the basis of the work of Barrio-Hernandez et al. (2021), one might argue that this is not the most important question—rather, the important question is this: which gene modules provide insights into the disease mechanism? One can ask this question without subscribing to a ‘core gene’ model; thus, we take the work of Barrio-Hernandez et al. (2021) to be consistent with the above-discussed points raised by Wray et al. (2018).

Conclusions

This paper has had two goals. The first has been to propose revisions to the framework of Ratti (2015) for the study of the role of hypothesis-driven research in large-scale contemporary biological studies, in light of studies such as GWAS and the associated missing heritability problem. In this regard, we have seen that different hypotheses may be prioritised, depending upon whether one adopts a ‘core gene’ model (as Ratti (2015) assumes, and as is also advocated in Boyle et al. (2017)), or whether one adopts a polygenic model (as endorsed by Wray et al. (2018); cf. Barrio-Hernandez et al. (2021)). The second goal of this paper has been to consider how these hypotheses would be developed on polygenic approaches via (phospho)proteomics—which itself constitutes a novel form of exploratory experimentation, featuring as it does both iterativity and deep learning—and to consider what it would take for these network-based proteomics approaches to succeed. A broader upshot of this paper has been the exposure for the first time to the philosophical literature of proteomics: given its potential to provide mechanistic models associated with disease phenotypes, the significance of this field cannot be overstated.

The issues discussed in this paper raise important questions regarding how researchers prioritise not just first-order hypotheses as per Ratti’s (2), but also the background assumptions which allow one to make such adjudications to begin with. To be concrete: in the case of GWAS, should one prioritise the assumption that rare variants of large effect in a small number of genes drive complex diseases, or rather invest in developing systems-based approaches and in improving under-studied fields, such as (phospho)proteomics, which may or may not ultimately shed light on the question of why complex diseases have thus far manifested empirically as polygenic? These choices lead to different first-order prioritisations in Ratti’s second step, and thereby have great potential to steer the course of large-scale studies in future years. Given limited resources in the field, it is, in our view, worth pausing to reflect on whether said resources are appropriately allocated between these options, and to strive to avoid any status quo bias in favour of currently-popular assumptions. Footnote 11

Notes

1. In fairness to Ratti: in other articles, e.g. López-Rubio and Ratti (2021), he does not make assumptions tantamount to a ‘core gene’ hypothesis; in this sense, our criticism falls most squarely on assumptions made in Ratti (2015).

2. Twin studies are powerful approaches to studying the genetics of complex traits. In simple terms, twin studies compare the phenotypic similarity of identical (monozygotic) twins to that of non-identical (dizygotic) twins. As monozygotic twins are genetically identical and non-identical twins are on average ‘half identical’, observing greater similarity of identical over non-identical twins can be used as evidence to estimate the contribution of genetic variation to trait manifestation. For further discussion of twin studies in the philosophical literature, see e.g. Matthews and Turkheimer (2019); Downes and Matthews (2020).

3. There are many further questions to be addressed here in connection with the literature on mechanisms and mechanistic explanation. For example, are these network approaches best understood as revealing specific mechanisms, or rather as revealing mechanism schemas (to use the terminology of (Craver and Darden 2013, ch. 3))? Although interesting and worthy of pursuit, for simplicity we set such questions aside in this paper, and simply speak of certain contemporary biology approaches as revealing ‘underlying mechanisms’. In this regard, we follow the lead of Ratti (2015).

4. To be completely clear: we do not claim that these (phospho)proteomics-based network approaches are superior to regulatory network approaches, given the current state of technology in the field. On the contrary—as we explain in Sect. “Proteomics and iterative methodology”—the former of these fields is very much nascent, and has yet to yield significant predictive or explanatory fruit. Nevertheless—again as we explain in Sect. “Proteomics and iterative methodology”—in our view these approaches are worthy of exposure in the philosophical literature in their own right, for (a) they offer one of the most promising means (in principle, if not yet in practice) of providing a mechanistic account of disease pathogenesis, and (b) the particular way in which hypotheses are developed and prioritised on these approaches is conceptually rich.

5. Recall: “Experiments count as exploratory when the concepts or categories in terms of which results should be understood are not obvious, the experimental methods and instruments for answering the questions are uncertain, or it is necessary first to establish relevant factual correlations in order to characterize the phenomena of a domain and the regularities that require (perhaps causal) explanation” (Burian 2013). Cf. e.g. Franklin (2005); Steinle (1997). All of the -omics approaches discussed in this paper were identified in Burian (2007) as cases of exploratory experimentation; the details of contemporary proteomics approaches have, however, not been presented in the philosophical literature up to this point (at least to our knowledge).

6. In this paper, we do not go into the details of specific transcriptomics studies. One interesting approach worthy of mention, however, is ‘single-cell RNA sequencing’ (SC-RNA), which allows biologists to assay the full transcriptome of hundreds of cells in an unbiased manner (see e.g. Hwang et al. (2018) for a recent review). The advantage of SC-RNA over older methods lies in its ability to identify the transcriptomes of heterocellular and poorly-classified tissue populations and disease-associated cell states.

7. As the addition or removal of phosphate groups regulates the activity of a protein, such relationships between a kinase and its target (also called a ‘substrate’) are referred to as ‘regulatory relationships’. Kinases themselves can also be phosphorylated by other kinases, so there exist also kinase-kinase regulatory relationships in a cell.

8. Supervised machine learning involves training a machine on a given data set (for example, a collection of cat photos versus dog photos), before assigning the machine the task of classifying entries in some new data set. By contrast, in unsupervised learning, the machine is instructed to find its own patterns in a given data set. For some recent philosophical considerations regarding machine learning, see Sullivan (2019).

9. One can also test the results of the machine binary classification algorithm on other data sets: this Invergo et al. (2020) did with reference to the data presented in Hijazi et al. (2020). The design of the algorithmic system and algorithm used by Invergo et al. (2020) is described with admirable clarity at (Invergo et al. 2020, pp. e5ff.), to which the reader is referred for further details.

10. Note that identification of candidate genes from the loci which constitute GWAS hits is non-trivial. The recently-described ‘locus-to-gene’ (L2G) approach is a machine learning tool which can be used to prioritise likely causal genes at each locus, given genetic and functional genomics features (see Mountjoy et al. (2020)).

11. Cf. Samuelson and Zeckhauser (1988). For related discussion of funding decisions in the context of -omics studies, see Burian (2007).

References

Barrio-Hernandez I, Schwartzentruber J, Shrivastava A, del Toro N, Zhang Q, Bradley G, Hermjakob H, Orchard S, Dunham I, Anderson CA, Porras P, Beltrao P (2021) ‘Network expansion of genetic associations defines a pleiotropy map of human cell biology’, bioRxiv . https://www.biorxiv.org/content/early/2021/07/19/2021.07.19.452924

Bechtel W (2019) Hierarchy and levels: analysing networks to study mechanisms in molecular biology. Philos Trans R Soc B 375:20190320


Bourrat P (2019) Evolutionary transitions in heritability and individuality. Theory Biosci 138:305–323


Bourrat P (2020) Causation and single nucleotide polymorphism heritability. Philos Sci 87:1073–1083

Bourrat P, Lu Q (2017) Dissolving the missing heritability problem. Philos Sci 84:1055–1067

Bourrat P, Lu Q, Jablonka E (2017) Why the missing heritability might not be in the DNA. BioEssays 39:1700067

Boyle E, Li Y, Pritchard J (2017) An expanded view of complex traits: from polygenic to omnigenic. Cell 169:1177–1186

Burian R (2013) Exploratory experimentation. In: Dubitzky W, Wolkenhauer O, Cho K-H, Yokota H (eds) Encyclopedia of systems biology. Springer, Berlin

Burian RM (2007) On MicroRNA and the need for exploratory experimentation in post-genomic molecular biology. Hist Philos Life Sci 29(3):285–311. http://www.jstor.org/stable/23334263

Chang H (2004) Inventing temperature: measurement and scientific progress. Oxford University Press, Oxford


Cowen L, Ideker T, Raphael BJ, Sharan R (2017) Network propagation: a universal amplifier of genetic associations. Nat Rev Genet 18(9):551–562. https://doi.org/10.1038/nrg.2017.38

Craver CF, Darden L (2013) In search of mechanisms. University of Chicago Press, Chicago

Craver CF, Dozmorov M, Reimers M, Kendler KS (2020) Gloomy prospects and roller coasters: finding coherence in genome-wide association studies. Philos Sci 87(5):1084–1095

Daniels N (2016) Reflective equilibrium. In: Zalta EN (ed) The Stanford encyclopedia of philosophy. Metaphysics Research Lab, Stanford University

Doudna JA, Charpentier E (2014) The new frontier of genome engineering with CRISPR-Cas9. Science 346(6213):1258096

Downes SM, Matthews L (2020) Heritability. In: Zalta EN (ed) The Stanford encyclopedia of philosophy. Metaphysics Research Lab, Stanford University

Earman J (1992) Bayes or bust? A critical examination of Bayesian confirmation theory. MIT Press, Cambridge

Elliott KC (2012) Epistemic and methodological iteration in scientific research. Stud Hist Philos Sci Part A 43(2):376–382

Franklin L (2005) Exploratory experiments. Philos Sci 72(5):888–899. https://www.jstor.org/stable/10.1086/508117

Gibson G (2012) Rare and common variants: twenty arguments. Nat Rev Genet 13(2):135–145

Goldstein D (2009) Common genetic variation and human traits. N Engl J Med 360:1696–1698

Gudbjartsson DF, Walters GB, Thorleifsson G, Stefansson H, Halldorsson BV, et al. (2008) Many sequence variants affecting diversity of adult human height. Nat Genet 40(5):609–615. https://doi.org/10.1038/ng.122

Hasin Y, Seldin M, Lusis A (2017) Multi-omics approaches to disease. Genome Biol 18(1):83

Hijazi M, Smith R, Rajeeve V, Bessant C, Cutillas PR (2020) Reconstructing kinase network topologies from phosphoproteomics data reveals cancer-associated rewiring. Nat Biotechnol 38(4):493–502

Hwang B, Lee JH, Bang D (2018) Single-cell RNA sequencing technologies and bioinformatics pipelines. Exp Mol Med 50(8):96. https://doi.org/10.1038/s12276-018-0071-8

Invergo BM, Petursson B, Akhtar N, Bradley D, Giudice G, Hijazi M, Cutillas P, Petsalaki E, Beltrao P (2020) Prediction of signed protein kinase regulatory circuits. Cell Syst 10(5):384-396.e9

Kitcher PS (1993) The advancement of science. Oxford University Press, Oxford

Lander ES, Linton LM, Birren B, Nusbaum C, Zody MC, et al. (International Human Genome Sequencing Consortium) (2001) Initial sequencing and analysis of the human genome. Nature 409(6822):860–921

Leonelli S (2016) Data-centric biology: a philosophical study. University of Chicago Press, Chicago

Lettre G, Jackson A. U., Gieger C., Schumacher F. R., Berndt S. I., Sanna S., Eyheramendy S., Voight B. F., Butler J. L., Guiducci C., Illig T., Hackett R., Heid I. M., Jacobs K. B., Lyssenko V., Uda M., Boehnke M., Chanock S. J., Groop L. C., Hu F. B., Isomaa B., Kraft P., Peltonen L., Salomaa V., Schlessinger D., Hunter D. J., Hayes R. B., Abecasis G. R., Wichmann H.-E., Mohlke K. L., Hirschhorn J. N., Initiative T. D. G., FUSION, KORA, The Prostate, LC, Trial OCS, Study TNH, SardiNIA (2008) Identification of ten loci associated with height highlights new biological pathways in human growth. Nat Genet 40(5):584–591. https://doi.org/10.1038/ng.125

López-Rubio E, Ratti E (2021) Data science and molecular biology: prediction and mechanistic explanation. Synthese 198(4):3131–3156. https://doi.org/10.1007/s11229-019-02271-0

Matthews LJ, Turkheimer E (2019) Across the great divide: pluralism and the hunt for missing heritability. Synthese. https://doi.org/10.1007/s11229-019-02205-w

Mountjoy E, Schmidt EM, Carmona M, Peat G, Miranda A, Fumis L, Hayhurst J, Buniello A, Schwartzentruber J, Karim MA, Wright D, Hercules A, Papa E, Fauman E, Barrett JC, Todd JA, Ochoa D, Dunham I, Ghoussaini M (2020) Open targets genetics: an open approach to systematically prioritize causal variants and genes at all published human GWAS trait-associated loci, bioRxiv . https://www.biorxiv.org/content/early/2020/09/21/2020.09.16.299271

Needham E, Parker B, Burykin T, James D, Humphreys S (2019) Illuminating the dark phosphoproteome, Sci Signal. Vol. 12

Norton J (1995) Eliminative induction as a method of discovery: how Einstein discovered general relativity. In: Leplin J (ed) The creation of ideas in physics. Kluwer, Alphen aan den Rijn, pp 29–69

Chapter   Google Scholar  

O‘Malley M, Elliott K, Burian R (2010) From genetic to genomic regulation: iterativity in microRNA research. Stud Hist Philos Sci Part C Stud Hist Philos Biol Biomed Sci 41(4):407–417

Ratti E (2015) Big Data biology: between eliminative inferences and exploratory experiments. Philos Sci 82:198–218

Ratti E (2020) What kind of novelties can machine learning possibly generate? The case of genomics. Stud Hist Philos Sci Part A. 83:86–96. https://www.sciencedirect.com/science/article/pii/S0039368119302924

Reimers M, Craver C, Dozmorov M, Bacanu S-A, Kendler K (2019) The coherence problem: finding meaning in GWAS complexity. Behav Genet 49:187–195

Richardson S, Stevens H (2015) Postgenomics: perspectives on biology after the genome. Duke University Press, Durham

Samuelson W, Zeckhauser R (1988) Status quo bias in decision making. J Risk Uncertain 1(1):7–59. https://doi.org/10.1007/BF00055564

Schwartzentruber J, Cooper S, Liu JZ, Barrio-Hernandez I, Bello E, Kumasaka N, Young AMH, Franklin RJM, Johnson T, Estrada K, Gaffney DJ, Beltrao P, Bassett A (2021) Genome-wide meta-analysis, fine-mapping and integrative prioritization implicate new Alzheimer’s disease risk genes. Nat Genet 53(3):392–402. https://doi.org/10.1038/s41588-020-00776-w

Shalem O, Sanjana NE, Zhang F (2015) High-throughput functional genomics using CRISPR-Cas9. Nat Rev Genet 16(5):299–311

Sharma S, Petsalaki E (2019) Large-scale datasets uncovering cell signalling networks in cancer: context matters. Curr Opin Genet Dev. 54:118–124 Cancer Genomics. https://www.sciencedirect.com/science/article/pii/S0959437X18301278

Steinle F (1997) Entering new fields: exploratory uses of experimentation. Philos Sci. 64:S65–S74. http://www.jstor.org/stable/188390

Sullivan E (2019) Understanding from machine learning models. British J Philos Sci

Tam V, Patel N, Turcotte M, Bossé Y, Paré G, Meyre D (2019) Benefits and limitations of genome-wide association studies. Nat Rev Genet 20(8):467–484. https://doi.org/10.1038/s41576-019-0127-1

Weedon MN, Lango H, Lindgren CM, Wallace C, Evans, DM, Mangino M, Freathy RM, Perry J. RB, Stevens S, Hall AS, Samani NJ, Shields B, Prokopenko I, Farrall M, Dominiczak A, Johnson T, Bergmann S, Beckmann, JS, Vollenweider, P, Waterworth DM, Mooser V, Palmer CNA Morris AD Ouwehand WH, Zhao JH, Li S, Loos R JF, Barroso I, Deloukas P, Sandhu MS, Wheeler E, Soranzo N, Inouye M, Wareham NJ, Caulfield M, Munroe PB, Hattersley AT, McCarthy MI, Frayling TM, Initiative, DG, Consortium TWTCC, Consortium, CG (2008) Genome-wide association analysis identifies 20 loci that influence adult height. Nat Genet 40(5):575–583. https://doi.org/10.1038/ng.121

Wilkes EH, Terfve C, Gribben JG, Saez-Rodriguez J, Cutillas PR (2015) Empirical inference of circuitry and plasticity in a kinase signaling network. Proc Natl Acad Sci U S A 112(25):7719–7724

Wray N, Wijmenga C, Sullivan P, Yang J, Visscher P (2018) Common disease is more complex than implied by the core gene omnigenic model. Cell 173:1573–1580

Yang J, Benyamin B, McEvoy BP, Gordon S, Henders AK, Nyholt DR, Madden PA, Heath AC, Martin NG, Montgomery GW, Goddard ME, Visscher PM (2010) Common SNPs explain a large proportion of the heritability for human height. Nat Genet 42(7):565–569. https://doi.org/10.1038/ng.608

Download references

Acknowledgements

We are grateful to Simon Davis, Katie de Lange, and the anonymous reviewers (one of whom turned out to be Pierrick Bourrat) for helpful discussions and feedback. S.S. is supported by a Sir Henry Wellcome Postdoctoral Fellowship at the University of Oxford.

Author information

Authors and affiliations.

Faculty of Philosophy, University of Oxford, Oxford, UK

Weatherall Institute for Molecular Medicine, University of Oxford, Oxford, UK

Sumana Sharma

You can also search for this author in PubMed   Google Scholar

Corresponding author

Correspondence to James Read .

Ethics declarations

Conflict of interest, additional information, publisher's note.

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/ .

Reprints and permissions

About this article

Read, J., Sharma, S. Hypothesis-driven science in large-scale studies: the case of GWAS. Biol Philos 36 , 46 (2021). https://doi.org/10.1007/s10539-021-09823-0

Download citation

Received : 24 May 2021

Accepted : 08 September 2021

Published : 19 September 2021

DOI : https://doi.org/10.1007/s10539-021-09823-0

Share this article

Anyone you share the following link with will be able to read this content:

Sorry, a shareable link is not currently available for this article.

Provided by the Springer Nature SharedIt content-sharing initiative

  • Systems biology
  • Hypothesis-driven science
  • Machine learning
  • Find a journal
  • Publish with us
  • Track your research

Constructing Hypotheses in Quantitative Research

Hypotheses are the testable statements linked to your research question. They bridge the gap between the general question you intend to investigate (i.e., the research question) and concise statements of how you expect your variables to be connected. For example, if we were studying the influence of mentoring relationships on first-generation students’ intention to remain at their university, we might have the following research question:

“Does the presence of a mentoring relationship influence first-generation students’ intentions to remain at their university?”


Although this statement clearly articulates the construct and specific variables we intend to study, we still have not identified exactly what we are testing. We use the hypotheses to make this clear. Specifically, we create null and alternate hypotheses to indicate exactly what we intend to test. In general, the null hypothesis states that there is no observable difference or relationship, and the alternate hypothesis states that there is an observable difference or relationship. In the example above, our hypotheses would be as follows:

Null hypothesis: The presence of a mentoring relationship does not influence first-generation students’ intention to remain at their university.

Alternate hypothesis: The presence of a mentoring relationship influences first-generation students’ intention to remain at their university.

Hypotheses may be worded with or without a direction. As written above, the hypotheses do not have a direction. To give them direction, we would consult previous literature to determine how a mentoring relationship is likely to influence intention to remain in school. If the research indicates that the presence of a mentoring relationship should increase students’ connections to the university and their willingness to remain, our alternate hypothesis would state:

“The presence of a mentoring relationship increases first-generation students’ intention to remain at their university.”

If the research indicates that the presence of a mentoring relationship minimizes students’ desire to make additional connections to the university and in turn decreases their willingness to remain, our alternate hypothesis would state:

“The presence of a mentoring relationship decreases first-generation students’ intention to remain at their university.”

Once you conduct your statistical analysis, you will determine whether the null hypothesis should be rejected in favor of the alternate hypothesis.
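As an illustration, here is a minimal sketch in Python of how this comparison might be tested with a two-sample t-test, assuming "intention to remain" is measured on a continuous survey scale. All numbers are invented; a real analysis would load actual survey data and check the test's assumptions first.

```python
# Minimal sketch: two-sample t-test for the mentoring example.
# All scores are invented; a real study would load survey data and
# check the test's assumptions (normality, equal variances) first.
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)
mentored = rng.normal(loc=5.4, scale=1.0, size=80)      # hypothetical scores, 1-7 scale
not_mentored = rng.normal(loc=4.9, scale=1.0, size=80)  # hypothetical scores, 1-7 scale

alpha = 0.05  # significance level, set before the analysis
t_stat, p_value = stats.ttest_ind(mentored, not_mentored)  # two-tailed by default

if p_value < alpha:
    print(f"p = {p_value:.4f} < {alpha}: reject the null hypothesis")
else:
    print(f"p = {p_value:.4f} >= {alpha}: fail to reject the null hypothesis")
```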

What is a Research Hypothesis: How to Write it, Types, and Examples

Any research begins with a research question and a research hypothesis. A research question alone may not suffice to design the experiment(s) needed to answer it. A hypothesis is central to the scientific method. But what is a hypothesis? A hypothesis is a testable statement that proposes a possible explanation for a phenomenon, and it may include a prediction. Next, you may ask what is a research hypothesis? Simply put, a research hypothesis is a prediction or educated guess about the relationship between the variables that you want to investigate.

It is important to be thorough when developing your research hypothesis. Shortcomings in the framing of a hypothesis can affect the study design and the results. A better understanding of the research hypothesis definition and characteristics of a good hypothesis will make it easier for you to develop your own hypothesis for your research. Let’s dive in to know more about the types of research hypothesis , how to write a research hypothesis , and some research hypothesis examples .  


What is a hypothesis ?  

A hypothesis is based on the existing body of knowledge in a study area. Framed before the data are collected, a hypothesis states the tentative relationship between independent and dependent variables, along with a prediction of the outcome.  

What is a research hypothesis ?  

Young researchers starting out their journey are usually brimming with questions like “ What is a hypothesis ?” “ What is a research hypothesis ?” “How can I write a good research hypothesis ?”   

A research hypothesis is a statement that proposes a possible explanation for an observable phenomenon or pattern. It guides the direction of a study and predicts the outcome of the investigation. A research hypothesis is testable, i.e., it can be supported or disproven through experimentation or observation.     


Characteristics of a good hypothesis  

Here are the characteristics of a good hypothesis :  

  • Clearly formulated and free of language errors and ambiguity  
  • Concise and not unnecessarily verbose  
  • Has clearly defined variables  
  • Testable and stated in a way that allows for it to be disproven  
  • Can be tested using a research design that is feasible, ethical, and practical   
  • Specific and relevant to the research problem  
  • Rooted in a thorough literature search  
  • Can generate new knowledge or understanding.  

How to create an effective research hypothesis  

A study begins with the formulation of a research question. A researcher then performs background research. This background information forms the basis for building a good research hypothesis . The researcher then performs experiments, collects, and analyzes the data, interprets the findings, and ultimately, determines if the findings support or negate the original hypothesis.  

Let’s look at each step for creating an effective, testable, and good research hypothesis :  

  • Identify a research problem or question: Start by identifying a specific research problem.   
  • Review the literature: Conduct an in-depth review of the existing literature related to the research problem to grasp the current knowledge and gaps in the field.   
  • Formulate a clear and testable hypothesis : Based on the research question, use existing knowledge to form a clear and testable hypothesis . The hypothesis should state a predicted relationship between two or more variables that can be measured and manipulated. Improve the original draft till it is clear and meaningful.  
  • State the null hypothesis: The null hypothesis is a statement that there is no relationship between the variables you are studying.   
  • Define the population and sample: Clearly define the population you are studying and the sample you will be using for your research.  
  • Select appropriate methods for testing the hypothesis: Select appropriate research methods, such as experiments, surveys, or observational studies, which will allow you to test your research hypothesis .  

Remember that creating a research hypothesis is an iterative process, i.e., you might have to revise it based on the data you collect. You may need to test and reject several hypotheses before answering the research problem.  

How to write a research hypothesis  

When you start writing a research hypothesis , you use an “if–then” statement format, which states the predicted relationship between two or more variables. Clearly identify the independent variables (the variables being changed) and the dependent variables (the variables being measured), as well as the population you are studying. Review and revise your hypothesis as needed.  

An example of a research hypothesis in this format is as follows:  

“If [athletes] take [cold water showers daily], then their [endurance] increases.”

Population: athletes  

Independent variable: daily cold water showers  

Dependent variable: endurance  

You may have understood the characteristics of a good hypothesis . But note that a research hypothesis is not always confirmed; a researcher should be prepared to accept or reject the hypothesis based on the study findings.  


Research hypothesis checklist  

Following from above, here is a 10-point checklist for a good research hypothesis :  

  • Testable: A research hypothesis should be able to be tested via experimentation or observation.  
  • Specific: A research hypothesis should clearly state the relationship between the variables being studied.  
  • Based on prior research: A research hypothesis should be based on existing knowledge and previous research in the field.  
  • Falsifiable: A research hypothesis should be able to be disproven through testing.  
  • Clear and concise: A research hypothesis should be stated in a clear and concise manner.  
  • Logical: A research hypothesis should be logical and consistent with current understanding of the subject.  
  • Relevant: A research hypothesis should be relevant to the research question and objectives.  
  • Feasible: A research hypothesis should be feasible to test within the scope of the study.  
  • Reflects the population: A research hypothesis should consider the population or sample being studied.  
  • Uncomplicated: A good research hypothesis is written in a way that is easy for the target audience to understand.  

By following this research hypothesis checklist , you will be able to create a research hypothesis that is strong, well-constructed, and more likely to yield meaningful results.  


Types of research hypothesis  

Different types of research hypothesis are used in scientific research:  

1. Null hypothesis:

A null hypothesis states that there is no change in the dependent variable due to changes in the independent variable, i.e., any observed differences are due to chance and are not significant. It is denoted as H0 and is stated as the opposite of what the alternative hypothesis states.

Example: “ The newly identified virus is not zoonotic .”  

2. Alternative hypothesis:

This states that there is a significant difference or relationship between the variables being studied. It is denoted as H1 or Ha; the null hypothesis is retained or rejected in favor of the alternative depending on the evidence.

Example: “ The newly identified virus is zoonotic .”  

3. Directional hypothesis :

This specifies the direction of the relationship or difference between variables; therefore, it tends to use terms like increase, decrease, positive, negative, more, or less.   

Example: “ The inclusion of intervention X decreases infant mortality compared to the original treatment .”   

4. Non-directional hypothesis:

A non-directional hypothesis states that a relationship or difference exists between variables but does not predict its direction, nature, or magnitude. It may be used when there is no underlying theory or when findings contradict previous research.

Example: “Cats and dogs differ in the amount of affection they express.”

5. Simple hypothesis :

A simple hypothesis predicts the relationship between a single independent variable and a single dependent variable.

Example: “ Applying sunscreen every day slows skin aging .”  

6. Complex hypothesis:

A complex hypothesis states the relationship or difference between two or more independent and dependent variables.   

Example: “ Applying sunscreen every day slows skin aging, reduces sun burn, and reduces the chances of skin cancer .” (Here, the three dependent variables are slowing skin aging, reducing sun burn, and reducing the chances of skin cancer.)  

7. Associative hypothesis:  

An associative hypothesis states that changes in one variable are associated with changes in another; it defines an interdependency between the variables without asserting cause and effect.

Example: “ There is a positive association between physical activity levels and overall health .”  

8. Causal hypothesis:

A causal hypothesis proposes a cause-and-effect interaction between variables.  

Example: “ Long-term alcohol use causes liver damage .”  

Note that some of the types of research hypothesis mentioned above might overlap. The types of hypothesis chosen will depend on the research question and the objective of the study.  


Research hypothesis examples  

Here are some good research hypothesis examples :  

“The use of a specific type of therapy will lead to a reduction in symptoms of depression in individuals with a history of major depressive disorder.”  

“Providing educational interventions on healthy eating habits will result in weight loss in overweight individuals.”  

“Plants that are exposed to certain types of music will grow taller than those that are not exposed to music.”  

“The use of the plant growth regulator X will lead to an increase in the number of flowers produced by plants.”  

Characteristics that make a research hypothesis weak are unclear variables, unoriginality, being too general or too vague, and being untestable. A weak hypothesis leads to weak research and improper methods.   

Some bad research hypothesis examples (and the reasons why they are “bad”) are as follows:  

“This study will show that treatment X is better than any other treatment . ” (This statement is not testable, too broad, and does not consider other treatments that may be effective.)  

“This study will prove that this type of therapy is effective for all mental disorders . ” (This statement is too broad and not testable as mental disorders are complex and different disorders may respond differently to different types of therapy.)  

“Plants can communicate with each other through telepathy . ” (This statement is not testable and lacks a scientific basis.)  

Importance of testable hypothesis  

If a research hypothesis is not testable, the results will not prove or disprove anything meaningful. The conclusions will be vague at best. A testable hypothesis helps a researcher focus on the study outcome and understand the implication of the question and the different variables involved. A testable hypothesis helps a researcher make precise predictions based on prior research.  

To be considered testable, there must be a way to prove that the hypothesis is true or false; further, the results of the hypothesis must be reproducible.  


Frequently Asked Questions (FAQs) on research hypothesis  

1. What is the difference between research question and research hypothesis ?  

A research question defines the problem and helps outline the study objective(s). It is an open-ended statement that is exploratory or probing in nature. Therefore, it does not make predictions or assumptions. It helps a researcher identify what information to collect. A research hypothesis , however, is a specific, testable prediction about the relationship between variables. Accordingly, it guides the study design and data analysis approach.

2. When to reject null hypothesis ?

A null hypothesis should be rejected when the evidence from a statistical test shows that it is unlikely to be true. This happens when the p-value obtained from the test is less than the defined significance level (e.g., 0.05). Rejecting the null hypothesis does not necessarily mean that the alternative hypothesis is true; it simply means that the evidence found is not compatible with the null hypothesis.
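For illustration, here is a minimal sketch of this decision rule in Python, using a one-sample t-test against a hypothesized population mean; the sample values and the null value of 100 are invented.

```python
# Minimal sketch of the decision rule: reject H0 when p < alpha.
# The sample and the null value (100) are invented for illustration.
import numpy as np
from scipy import stats

sample = np.array([102.1, 99.8, 105.3, 101.7, 98.9, 104.2, 103.5, 100.6])
t_stat, p_value = stats.ttest_1samp(sample, popmean=100)

alpha = 0.05  # significance level, defined before the test
print(f"t = {t_stat:.2f}, p = {p_value:.4f}")
print("Reject H0" if p_value < alpha else "Fail to reject H0")
```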

3. How can I be sure my hypothesis is testable?  

A testable hypothesis should be specific and measurable, and it should state a clear relationship between variables that can be tested with data. To ensure that your hypothesis is testable, consider the following:  

  • Clearly define the key variables in your hypothesis. You should be able to measure and manipulate these variables in a way that allows you to test the hypothesis.  
  • The hypothesis should predict a specific outcome or relationship between variables that can be measured or quantified.   
  • You should be able to collect the necessary data within the constraints of your study.  
  • It should be possible for other researchers to replicate your study, using the same methods and variables.   
  • Your hypothesis should be testable by using appropriate statistical analysis techniques, so you can draw conclusions, and make inferences about the population from the sample data.  
  • The hypothesis should be able to be disproven or rejected through the collection of data.  

4. How do I revise my research hypothesis if my data does not support it?  

If your data does not support your research hypothesis , you will need to revise it or develop a new one. You should examine your data carefully and identify any patterns or anomalies, re-examine your research question, and/or revisit your theory to look for any alternative explanations for your results. Based on your review of the data, literature, and theories, modify your research hypothesis to better align it with the results you obtained. Use your revised hypothesis to guide your research design and data collection. It is important to remain objective throughout the process.  

5. I am performing exploratory research. Do I need to formulate a research hypothesis?  

As opposed to “confirmatory” research, where a researcher has some idea about the relationship between the variables under investigation, exploratory research (or hypothesis-generating research) looks into a completely new topic about which limited information is available. Therefore, the researcher will not have any prior hypotheses. In such cases, the researcher may develop a post-hoc hypothesis, i.e., a research hypothesis that is generated after the results of the exploratory analysis are known.

6. How is a research hypothesis different from a research question?

A research question is an inquiry about a specific topic or phenomenon, typically expressed as a question. It seeks to explore and understand a particular aspect of the research subject. In contrast, a research hypothesis is a specific statement or prediction that suggests an expected relationship between variables. It is formulated based on existing knowledge or theories and guides the research design and data analysis.

7. Can a research hypothesis change during the research process?

Yes, research hypotheses can change during the research process. As researchers collect and analyze data, new insights and information may emerge that require modification or refinement of the initial hypotheses. This can be due to unexpected findings, limitations in the original hypotheses, or the need to explore additional dimensions of the research topic. Flexibility is crucial in research, allowing for adaptation and adjustment of hypotheses to align with the evolving understanding of the subject matter.

8. How many hypotheses should be included in a research study?

The number of research hypotheses in a research study varies depending on the nature and scope of the research. It is not necessary to have multiple hypotheses in every study. Some studies may have only one primary hypothesis, while others may have several related hypotheses. The number of hypotheses should be determined based on the research objectives, research questions, and the complexity of the research topic. It is important to ensure that the hypotheses are focused, testable, and directly related to the research aims.

9. Can research hypotheses be used in qualitative research?

Yes, research hypotheses can be used in qualitative research, although they are more commonly associated with quantitative research. In qualitative research, hypotheses may be formulated as tentative or exploratory statements that guide the investigation. Instead of testing hypotheses through statistical analysis, qualitative researchers may use the hypotheses to guide data collection and analysis, seeking to uncover patterns, themes, or relationships within the qualitative data. The emphasis in qualitative research is often on generating insights and understanding rather than confirming or rejecting specific research hypotheses through statistical testing.


Quantitative data collection and analysis

Testing Hypotheses

A hypothesis is a statement that we are trying to prove or disprove. It is used to express the relationship between variables and whether this relationship is significant. It is specific and offers a prediction on the results of your research question.

Your research question will lead you to develop a hypothesis; this is why your research question needs to be specific and clear.

The hypothesis will then guide you to the most appropriate techniques you should use to answer the question. Hypotheses reflect the literature and theories on which you are basing them. They need to be testable (i.e. measurable and practical).

Null hypothesis (H0) is the proposition that there will not be a relationship between the variables you are looking at (i.e. any differences are due to chance). It always refers to the population. (Usually we don't believe this to be true.)

e.g. There is no difference in instances of illegal drug use by teenagers who are members of a gang and those who are not.

Alternative hypothesis (HA or H1): this is sometimes called the research hypothesis or experimental hypothesis. It is the proposition that there will be a relationship. It is a statement of inequality between the variables you are interested in. It always refers to the sample. It is usually a declaration rather than a question and is clear, to the point and specific.

e.g. The instances of illegal drug use of teenagers who are members of a gang are different from the instances of illegal drug use of teenagers who are not gang members.

A non-directional research hypothesis reflects an expected difference between groups but does not specify the direction of this difference (see two-tailed test).

A directional research hypothesis reflects an expected difference between groups and does specify the direction of this difference (see one-tailed test).

e.g. The instances of illegal drug use by teenagers who are members of a gang will be higher than the instances of illegal drug use of teenagers who are not gang members.

The process of testing is then to ascertain which hypothesis to believe.

It is usually easier to show that something is untrue than that it is true, so examining the null hypothesis is the usual starting point.

The process of examining the null hypothesis in light of evidence from the sample is called significance testing. It is a way of establishing a range of values within which we can judge whether the null hypothesis is plausible.

The debate over hypothesis testing

There has been discussion over whether the scientific method employed in traditional hypothesis testing is appropriate.  

See below for some articles that discuss this:

  • Gill, J. (1999) 'The insignificance of null hypothesis testing',  Politics Research Quarterly , 52(3), pp. 647-674 .
  • Wainer, H. and Robinson, D.H. (2003) 'Shaping up the practice of null hypothesis significance testing',  Educational Researcher, 32(7), pp.22-30 .
  • Ferguson, C.J. and Heene, M. (2012) 'A vast graveyard of undead theories: publication bias and psychological science's aversion to the null', Perspectives on Psychological Science, 7(6), pp. 555-561.

Taken from: Salkind, N.J. (2017)  Statistics for people who (think they) hate statistics. 6th edn. London: SAGE pp. 144-145.

  • Null hypothesis - a simple introduction (SPSS)

A significance level defines the point at which your sample evidence contradicts the null hypothesis strongly enough for you to reject it. It is the probability of rejecting the null hypothesis when it is actually true.

e.g. a significance level of 0.05 indicates that there is a 5% (or 1 in 20) risk of deciding that there is an effect when in fact there is none.

The lower the significance level you set, the stronger the evidence from the sample has to be in order to reject the null hypothesis.

N.B.  - it is important that you set the significance level before you carry out your study and analysis.

Using Confidence Intervals

It is possible to test the significance of your null hypothesis using a confidence interval (see under the samples and population tab).

If the interval does not contain the value predicted by the null hypothesis, we can reject the null hypothesis and accept the alternative hypothesis.
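A minimal sketch of this confidence-interval check in Python follows; the sample values and the null-hypothesis value are invented for illustration.

```python
# Minimal sketch: build a 95% confidence interval around the sample mean,
# then check whether the null-hypothesis value lies inside it.
# All values are invented for illustration.
import numpy as np
from scipy import stats

sample = np.array([4.8, 5.1, 5.6, 4.9, 5.3, 5.0, 5.4, 5.2])
null_value = 5.0  # value of the mean predicted by the null hypothesis

mean = sample.mean()
sem = stats.sem(sample)  # standard error of the mean
ci_low, ci_high = stats.t.interval(0.95, df=len(sample) - 1, loc=mean, scale=sem)

print(f"95% CI: ({ci_low:.2f}, {ci_high:.2f})")
if ci_low <= null_value <= ci_high:
    print("Null value lies inside the interval: fail to reject H0")
else:
    print("Null value lies outside the interval: reject H0")
```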

The test statistic

Computing a test statistic is another commonly used approach; a minimal code sketch follows the steps below:

  • Write down your null and alternative hypothesis
  • Find the sample statistic (e.g.the mean of your sample)
  • Calculate the test statistic Z score (see under Measures of spread or dispersion and Statistical tests - parametric). In this case the sample mean is compared to the population mean (assumed from the null hypothesis) and the standard error (see under Samples and population) is used rather than the standard deviation.
  • Compare the test statistic with the critical values (e.g. plus or minus 1.96 for 5% significance)
  • Draw a conclusion about the hypotheses - does the calculated z value lie in this critical range, i.e. above 1.96 or below -1.96? If it does, we can reject the null hypothesis. This would indicate that the results are significant (an effect has been detected): if there were no difference in the population, then the result you have observed would be highly unlikely, so you can reject the null hypothesis.
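The steps above condense into a few lines of Python; every number here (population mean, standard deviation, sample mean, sample size) is invented for illustration.

```python
# Minimal sketch of a z-test: H0: mu = 50 vs HA: mu != 50 (numbers invented).
import math

pop_mean = 50.0     # population mean assumed under the null hypothesis
pop_sd = 8.0        # known population standard deviation
sample_mean = 52.4  # statistic calculated from the sample
n = 64              # sample size

standard_error = pop_sd / math.sqrt(n)         # sd of the sampling distribution
z = (sample_mean - pop_mean) / standard_error  # the test statistic

critical = 1.96  # two-tailed critical value at the 5% significance level
print(f"z = {z:.2f}")
print("Reject H0" if abs(z) > critical else "Fail to reject H0")
```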


Type I error - this is the chance of wrongly rejecting the null hypothesis even though it is actually true, e.g. by using a 5% p level you would expect the null hypothesis to be rejected about 5% of the time when it is in fact true. You could set a more stringent p level, such as 1% (or 1 in 100), to be more certain of not committing a Type I error; this, however, makes another type of error (Type II) more likely.

Type II error - this is where there is a real effect, but the p value you obtain is non-significant, so you fail to detect the effect.
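The meaning of a Type I error can be seen directly by simulation: if the null hypothesis is true and we test at the 5% level, we should wrongly reject it in roughly 5% of repeated experiments. A minimal sketch, with invented simulation settings, follows.

```python
# Minimal simulation: when H0 is true, a test at alpha = 0.05 should
# produce a false (Type I) rejection in about 5% of experiments.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
alpha = 0.05
n_experiments = 10_000

false_rejections = 0
for _ in range(n_experiments):
    # Both groups come from the SAME distribution, so H0 is true by construction
    a = rng.normal(0.0, 1.0, size=30)
    b = rng.normal(0.0, 1.0, size=30)
    _, p = stats.ttest_ind(a, b)
    if p < alpha:
        false_rejections += 1

print(f"Observed Type I error rate: {false_rejections / n_experiments:.3f}")  # ~0.05
```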

  • Statistical significance - what does it really mean?
  • Statistical tables

One-tailed tests - where we know in which direction (e.g. larger or smaller) the difference between sample and population will be. It is a directional hypothesis.

Two-tailed tests - where we are looking at whether there is a difference between sample and population. This difference could be larger or smaller. This is a non-directional hypothesis.

If the difference is in the direction you have predicted (i.e. a one-tailed test) it is easier to get a significant result, though there are arguments against using a one-tailed test (Wright and London, 2009, pp. 98-99)*.

*Wright, D. B. & London, K. (2009)  First (and second) steps in statistics . 2nd edn. London: SAGE.

N.B. - think of the ‘tails’ as the regions at the far ends of a normal distribution. For a two-tailed test with a significance level of 0.05, probability 0.025 sits at one end of the distribution and the other 0.025 at the other end. It is the values in these ‘critical’ extreme regions that allow us to reject the null hypothesis and claim that there has been an effect.
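To see the one-tailed versus two-tailed distinction in practice, here is a minimal sketch reusing the gang-membership example from above; the scores are invented, and scipy's `alternative` argument selects the tail.

```python
# Minimal sketch: two-tailed vs one-tailed p-values on the same (invented) data.
import numpy as np
from scipy import stats

rng = np.random.default_rng(7)
gang = rng.normal(2.6, 1.0, size=40)     # hypothetical drug-use scores
no_gang = rng.normal(2.1, 1.0, size=40)  # hypothetical drug-use scores

_, p_two = stats.ttest_ind(gang, no_gang, alternative="two-sided")
_, p_one = stats.ttest_ind(gang, no_gang, alternative="greater")  # predicted: gang > no_gang

# The one-tailed p-value is half the two-tailed one when the effect is in
# the predicted direction - which is why a one-tailed test is "easier" to pass.
print(f"two-tailed p = {p_two:.4f}, one-tailed p = {p_one:.4f}")
```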

Degrees of freedom (df) is a rather difficult mathematical concept, but it is needed to calculate the significance of certain statistical tests, such as the t-test, ANOVA and the chi-squared test.

It is broadly defined as the number of "observations" (pieces of information) in the data that are free to vary when estimating statistical parameters. (Taken from Minitab Blog ).

The higher the degrees of freedom, the more powerful and precise your estimates of the population parameter will be.

Typically, for a 1-sample t-test it is the number of values in your sample minus 1.

For chi-squared tests on a table of rows and columns the rule is:

(number of rows minus 1) times (number of columns minus 1) - as the sketch below illustrates.
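As an illustration, the sketch below computes the degrees of freedom for a 1-sample t-test by hand and lets scipy report them for a chi-squared test on an invented 2 x 3 contingency table.

```python
# Minimal sketch: degrees of freedom for two common tests (data invented).
import numpy as np
from scipy import stats

sample = np.array([12.0, 11.5, 13.2, 12.8, 11.9])
df_t = len(sample) - 1  # 1-sample t-test: n - 1 = 4

table = np.array([[20, 15, 10],
                  [25, 30, 20]])  # 2 rows x 3 columns of observed counts
chi2, p, df_chi, expected = stats.chi2_contingency(table)

print(f"t-test df = {df_t}, chi-squared df = {df_chi}")  # 4 and (2-1)*(3-1) = 2
```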

An accessible example to illustrate the principle of degrees of freedom uses chocolates:

  • You have seven chocolates in a box, each being a different type, e.g. truffle, coffee cream, caramel cluster, fudge, strawberry dream, hazelnut whirl, toffee. 
  • You are being good and intend to eat only one chocolate each day of the week.
  • On the first day, you can choose to eat any one of the 7 chocolate types  - you have a choice from all 7.
  • On the second day, you can choose from the 6 remaining chocolates, on day 3 you can choose from 5 chocolates, and so on.
  • On the sixth day you have a choice of the remaining 2 chocolates you haven't eaten that week.
  • However, on the seventh day you haven't really got any choice of chocolate - it has to be the one left in your box.
  • You had 7 - 1 = 6 days of “chocolate” freedom, in which the chocolate you ate could vary!


Hypothesis-driven quantitative fluorescence microscopy – the importance of reverse-thinking in experimental design


Competing interests

The authors declare no competing or financial interests.


Eric C. Wait, Michael A. Reiche, Teng-Leong Chew; Hypothesis-driven quantitative fluorescence microscopy – the importance of reverse-thinking in experimental design. J Cell Sci 1 November 2020; 133 (21): jcs250027. doi: https://doi.org/10.1242/jcs.250027


One of the challenges in modern fluorescence microscopy is to reconcile the conventional utilization of microscopes as exploratory instruments with their emerging and rapidly expanding role as quantitative tools. The contribution of microscopy to observational biology will remain enormous owing to the improvements in acquisition speed, imaging depth, resolution and biocompatibility of modern imaging instruments. However, the use of fluorescence microscopy to facilitate the quantitative measurements necessary to challenge hypotheses is a relatively recent concept, made possible by advanced optics, functional imaging probes and rapidly increasing computational power. We argue here that to fully leverage the rapidly evolving application of microscopes in hypothesis-driven biology, we not only need to ensure that images are acquired quantitatively but must also re-evaluate how microscopy-based experiments are designed. In this Opinion, we present a reverse logic that guides the design of quantitative fluorescence microscopy experiments. This unique approach starts from identifying the results that would quantitatively inform the hypothesis and maps the process backward to microscope selection. This ensures that the quantitative aspects of testing the hypothesis remain the central focus of the entire experimental design.

Advancements in optical engineering, labeling technologies, and computational capacity have turned fluorescence microscopy into an indispensable tool in the life sciences. Its unique capacity to probe biological questions across a large range of biological length scales has made it a popular tool in cell biology, neurobiology and developmental biology, as well as many other fields of research. Modern microscopy can reveal valuable information on molecular ultrastructure, dynamic biological processes and biological functions. Yet, the appeal of seemingly limitless promises, the myriad of technical details and the rapid development of computational capabilities have also created confusion for many seeking the right combination of imaging tools. As has been previously pointed out by Jonkman and colleagues (Jonkman et al., 2020), biologists can spend considerable time and resources acquiring huge amounts of data without proper planning, only to realize later that the data cannot appropriately address a particular biological question. This usually occurs when the design of a microscopy experiment is not guided by a suitable hypothesis, the experimenter gets side-tracked by new observations or the experiment starts without a design at all. The method proposed here aims to assist the gathering of appropriate data that directly address a quantitative hypothesis. The intent is to give the reader a better understanding of the process and the potential issues that arise in quantitative experiments.

The importance of fluorescence microscopy lies in its ability to serve both as an exploratory and a quantitative tool. In other words, microscopy has a combined capacity that enables a biologist to both formulate hypotheses based on observation and to perform quantitative measurements to test those hypotheses. For example, one might easily observe the localization of a target protein within a mitochondrial compartment. However, it takes a shift in mindset to design an appropriate experiment capable of quantifying this localization change in response to an oxidative stress. Quantitative measurements, however, can only produce results that directly address a proposed hypothesis when the experiment is designed appropriately. In fact, even an accurate, quantitative set of data that has been generated with the best practices will not necessarily yield biologically meaningful results. An image acquired with a digital detector is inherently a data map – an array of values. While any digital image can be quantified, these measurements are only biologically meaningful when they are pertinent to the hypothesis. Take for example a study that investigates the rates of filopodia extension during cell migration. Data revealing the super-resolved, 3D actin filaments are not sufficient for determining the rate of filopodia extension. However, an experiment that captures the change in location of the filopodial tip will provide the necessary data. In other words, when testing a quantitative hypothesis, informative data are quantitative, but not all quantitative data are informative.

Reliable and informative results require high-quality image data and relevant analyses. Fortunately, there is no shortage of excellent reviews in the literature that offer step-by-step guidance to perform microscopy experiments, from image acquisition to quantitative image analysis ( Berg et al., 2019 ; Jonkman et al., 2014 ; McQuin et al., 2018 ; North, 2006 ; Rueden et al., 2017 ; Swedlow, 2013 ; Van Den Berge et al., 2019 ; Waters, 2009 ; Weigert et al., 2018 ). The task now lies in ensuring that data acquisition and analyses can be translated into biologically meaningful information, capable of challenging a hypothesis. We argue that this must be achieved through rational experimental design.

Designing a hypothesis-driven experiment is a vital step in the overall experimental scheme, but it is often over-simplified and represented by a single step. The conventional workflow of an imaging experiment, as astutely observed by Lee and Kitaoka (2018) , is adapted in Fig. 1 A. In this generalized diagram, the execution of the experiment begins with sample preparation after experimental design. The images are acquired, and the data will then be processed and analyzed – usually followed by several iterations of optimization – before the final results are presented. What is important to note is that experimental design is appropriately singled out as the key first step ( Fig. 1 A). Yet, in stark contrast to the wealth of technical guides, there is a paucity of discussion in the literature on the logic of rational experimental design and how it can be harnessed to successfully perform a hypothesis-driven, quantitative experiment. This is an unfortunate omission, partly due to the difficulty in summarizing a logical scheme that is sufficiently general to be applicable to most biological questions. In this Opinion article, we aim to fill this important gap and focus on rational, hypothesis-driven experimental design. This guide is aimed toward biologists interested in learning how to design quantitative experiments that are geared toward testing their hypotheses. It embodies our experience in steering imaging projects from hypotheses to quantitative, informative results at the Advanced Imaging Center at HHMI Janelia Research Campus ( Chew et al., 2017 ). We include in Box 1 a case study of how we have successfully steered the development of such a quantitative microscopy project.

Fig. 1. Conducting and designing quantitative fluorescence microscopy experiments. (A) Typical workflow in microscopy experiment. This workflow is forward-facing, progressing from the formulation of a hypothesis to the eventual presentation of the data as results. Adapted with permission of American Society for Cell Biology from Lee and Kitaoka (2018); permission conveyed through Copyright Clearance Center, Inc. (B) A focused view of the experimental planning phase. We propose that experimental design would be more efficient and effective by adopting a reverse-facing workflow. Here, the hypothesis should determine what the necessary results should be. From there, the experimenter can plan backward from the required data to the point where the experiment can be executed. The processes outlined in A and B are iterative, and the experimenter should re-evaluate whether the best decision has been made at each step. (C) A flow diagram to determine whether the experimental output generated from the microscope will lead to informative results. Answering the questions outlined here will identify the corresponding step in the design that needs re-evaluation. Reaching the ‘Informative results’ box would indicate that the data acquired were most likely collected in a manner that would directly test the hypothesis. Alternatively, the bulleted lists provide insight into which step in the design process requires re-evaluation to be improved in subsequent design iterations.

Box 1. Case study

This case study partially summarizes one of the quantitative experiments performed by McArthur and colleagues ( McArthur et al., 2018 ). Preliminary observations indicated that the mitochondrial network of cells deficient in induced myeloid leukemia cell differentiation protein (MCL-1), a Bcl-2 family member, broke down during apoptosis (A in the box figure), followed by the presence of mitochondrial DNA (mtDNA) in the cytoplasm (B in the box figure). This observation led to the conceptualization of the working model – ‘during apoptosis, mitochondrial morphology changes prior to the release of mtDNA into the cytoplasm’.

To properly plan a quantitative experiment to test this model, we used our reverse-logic to steer the following experimental design:

1. A more-defined hypothesis was formulated – ‘during apoptosis, the mitochondrial sphericity increases prior to an increase in the number of externalized mtDNA’. Note how the initial descriptive semantics have been translated into quantitative semantics that will guide subsequent measurements.

2. Two sets of informative results were essential to test this hypothesis: (i) mitochondrial sphericity, and (ii) mtDNA externalization, both measured as a function of time.

3. To achieve these informative results, the required data must include time-lapsed, volumetric images of labeled mitochondria and mtDNA.

4. To produce these data, the following experimental imaging parameters had to be met:

• high-speed volumetric imaging to accurately track 3D mitochondrial network reorganization

• high signal-to-noise ratio and resolution in order to accurately measure the 3D structures of the mitochondria

• near-isotropic resolution to precisely characterize the sphericity of mitochondria and mtDNA extrusion

• two-color acquisition to provide information on both the mitochondria and the mtDNA.

5. While both lattice lightsheet microscopy (LLSM) and 3D structured illumination microscopy (SIM) met these benchmarks, it was also important to meet the biological requirements. Pilot studies established that two-channel volumes of 50 slices each, acquired approximately every 10 s for a total of 50 min, would be necessary to capture and follow this rare process in its entirety. Phototoxicity could affect the mitochondrial biology, introducing artifacts. To mitigate phototoxicity, the gentle illumination of LLSM established it as the clear choice. To further reduce light exposure, brighter fluorescent labels, such as mNeonGreen (Shaner et al., 2013) and HaloTag™ (Promega, USA) with Janelia Fluor® 646 (Grimm et al., 2017) (instead of EGFP and mCherry), were used. Note that the experimental design process was iterative and benefited from pilot studies used to identify the necessary imaging parameters, suitable fluorophores, and the optimal microscope.

C to E in the box figure illustrate the successful completion of this quantitative experiment. The LLSM micrograph (C) shows mtDNA extrusion from mitochondria. These images were used to create 3D segmentations (D) and were quantified. The mitochondrial sphericity and mtDNA externalization were measured over time, and plotted in E. This graph shows that an increase in mitochondrial sphericity (thin red line) preceded the onset of mtDNA extrusion (thin green line) – providing the informative result that ultimately supported the hypothesis.

The box figure shows morphological changes of mitochondria and mitochondrial DNA release during apoptosis; images were previously published in McArthur et al. (2018) and are reused here with permission. Scale bars: 5 µm.
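To make one of the "informative results" in this case study concrete, the sketch below shows how mitochondrial sphericity could in principle be computed from a 3D segmentation with scikit-image. This is a minimal illustration under stated assumptions (a binary mask per object and known voxel spacing), not the analysis pipeline actually used by McArthur et al. (2018).

```python
# Minimal sketch: sphericity of a segmented 3D object.
# Sphericity = pi^(1/3) * (6V)^(2/3) / A, which equals 1 for a perfect sphere.
# Assumes a binary 3D mask per mitochondrion and known voxel spacing;
# illustrative only, not the authors' actual pipeline.
import numpy as np
from skimage import measure

def sphericity(mask, spacing=(1.0, 1.0, 1.0)):
    volume = mask.sum() * np.prod(spacing)        # voxel count x voxel volume
    verts, faces, _, _ = measure.marching_cubes(  # triangulated object surface
        mask.astype(float), level=0.5, spacing=spacing)
    area = measure.mesh_surface_area(verts, faces)
    return (np.pi ** (1 / 3)) * (6 * volume) ** (2 / 3) / area

# Sanity check on a synthetic digital ball - the result should be close to 1:
zz, yy, xx = np.mgrid[-20:21, -20:21, -20:21]
ball = (zz ** 2 + yy ** 2 + xx ** 2) <= 15 ** 2
print(f"sphericity of a digital ball: {sphericity(ball):.2f}")
```

Applying such a measurement per object, per time point, together with counts of externalized mtDNA puncta, would yield exactly the two time-resolved quantities that the hypothesis calls for.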

The success of a microscopy-based quantitative experiment hinges on the appreciation and understanding of (i) how the underlying biological query and defined hypothesis directs the experimental design, and (ii) how experimental design and instrument choice are related to the way in which image data will eventually be analyzed. For this reason, we outline a logic that exemplifies these themes ( Fig. 1 B). We propose, in this Opinion article, that the very first step of experimental design, following the formulation of a hypothesis, is to determine the informative results that can quantitatively test that hypothesis. In other words, informative results are the ultimate goal of the designed experiment. Therefore, an experiment that has been developed to specifically generate data pertinent to the biological query will produce informative results. As such, the production of the required data will necessitate that a certain set of experimental parameters be met, which would in turn prescribe the features of the instrument needed to make such measurements. Overall, such a systematic workflow ensures that the hypothesis remains central to the experiment and that the experiment yields information capable of challenging the hypothesis. This will help chart the roadmap of how microscopy-based experiments should be designed for quantitative analyses. We will not replicate the many superbly written reviews and guides in the literature here, but rather aim to help readers better utilize these guides, as we embark on our journey of experimental design.

The capacity of modern optical microscopy to support both visual exploration and content-rich measurement has made it a versatile biological research technique. Unfortunately, it is also one that is commonly misunderstood. Biologists are keen observers, exceptional in recognizing patterns, finding anomalies and identifying new phenotypes. In fact, when it comes to studying structures and processes, visualization by itself is often sufficient to prompt biologists to formulate working models of the observed systems, and these working models provide abstract representations of the observation. The descriptive semantics used in these working models have served as powerful tools in life sciences and enable biologists to organize and communicate information about the complexity of the living systems ( Courtot et al., 2011 ). Indeed, specific follow-up questions can often already be framed by experienced biologists as soon as the initial images appear on their monitor; and this is the inception point of many biological queries. This is the essence of observation-driven, empirical inferences – ‘I know it when I see it’, and this is where the power of microscopy has historically been leveraged. Observational biology will continue to play an important role, and it is certainly true that not all biological hypotheses must be quantitatively tested. However, there is no denying that with the advent of modern experimental methods, hypotheses in general have become, and are increasingly expected to be, formulated in more quantitative terms. Addressing these increasingly focused hypotheses is where the quantitative capacity of microscopy has the most impact and is the core of this Opinion article.

If one were to accept the idea that ‘seeing is believing’ with microscopy as an exploratory instrument, then surely one must also accept the notion that ‘measuring is knowing’ when using microscopy as an analytical technique. The challenge here is to reconcile observation and quantification using the same instrument. Quantitative measurement is intrinsically analysis-rich and semantics-agnostic ( Shasha, 2003 ). However, this is where the disparity between observation and quantification often arises. It is common to see proposed microscopy studies with phrases such as ‘to analyze the spatial-temporal dynamics of an organelle’. There is unfortunately no specific analytical metric for the ‘dynamics’ of an organelle or any other biological structure. Dynamics is an ambiguous term that is often used to encapsulate several different metrics that together describe a particular observation. To transform vague biological queries such as this into quantifiable goals for microscopic analysis, we need to consider how intuitive biological semantics can be reformulated. With this in mind, we will begin by exploring how hypotheses shape the rationale of microscopy-based experiments.

Testable hypothesis

The cornerstone of the classical scientific method is to determine whether evidence supports or negates a postulated idea. Hypotheses, at the experimental level, must therefore be negatable by observation or measurement (Popper, 2005). A clearly stated, testable hypothesis will guide every step of an experiment and provide invaluable checkpoints. More importantly, a negatable hypothesis imparts the restraint needed to avoid being side-tracked from the initial question. This disciplined approach does not preclude future exploration of other observations, but it serves to balance both the exploratory and the analytical priorities of an experiment (Fig. 1C). This is why a hypothesis such as 'condition X will increase the rate of mitochondrial fission' has stronger semantic specificity than 'condition X will affect the spatial-temporal dynamics of mitochondria'. The latter hypothesis cannot be tested because the experimental variables (i.e. fission events) that would either support or negate it are not defined.

Interestingly, such cautionary advice is rarely needed for biochemical and molecular biology assays. These assays are inherently quantitative and do not usually serve as observational tools, and biologists learn these techniques extensively during their training. As a result, biologists formulate testable hypotheses and perform quantitative analyses with ease using assays such as immunoblots, PCRs, ELISAs or enzyme kinetic assays. What differentiates these assays from microscopy is that they are explicitly linked to well-defined sets of outputs. For example, an immunoblot yields specific information on molecular mass and abundance. In contrast, a plethora of information can be derived from microscopy data, including molecular abundance, spatial location, movement behavior, morphological changes, structural features, molecular association, enzymatic activity, and so on. Microscopy is therefore not a single assay; it is a collection of assays that vary depending on how the experiment is designed. Without a defined boundary, the scope of an experiment can quickly become too ambitious and unnecessarily complex. This underscores the importance of identifying, early in the design process, the appropriate experimental output that addresses the hypothesis.

The complexity of microscopy, relative to biochemical and molecular biology assays, is further compounded by variability in the nature of the sample. Whereas molecular biology assays use defined inputs, such as nucleic acids or proteins, microscopy can accommodate a wide variety of complex samples (from purified molecules to a multitude of model organisms at various stages of development, for example) that in turn change the requirements and implementation of the experiment. It is thus no surprise that the experimental scheme and the sample choice often have to be considered in parallel because of their interdependencies (Galas et al., 2018). Sample compatibility is a complex issue that encompasses both the specimen and the fluorescent labels. Likewise, the labeling strategy and sample viability are critically important factors for the success of an experiment, and these topics have been extensively discussed in the literature (Albrecht and Oliver, 2018; Dean and Palmer, 2014; Frigault et al., 2009; Heppert et al., 2016; Icha et al., 2017; Kiepas et al., 2020; Lambert, 2019; Schneider and Hackenberger, 2017; Specht et al., 2017; Thorn, 2017). Overall, the compatibility of a sample is determined by all aspects of the experiment and demands careful consideration. As a result, the hypothesis and the associated experiment will be heavily influenced by what can realistically be achieved given the nature of the sample. Once the hypothesis has been appropriately defined, rather than proceeding directly to the microscope, the most critical step is to evaluate what it means to challenge the hypothesis.

Informative results

Not all results can adequately test a hypothesis. It is important to differentiate between a ‘desired outcome’ and an ‘informative result’. The desired outcome would naturally be for the evidence to support the hypothesis. Continuing with the example of mitochondrial fission stated above, the informative result in this case would be the number of mitochondrial fission events as a function of time, both in the presence and absence of condition X. This is in contrast to the ‘desired outcome’ of finding an increased rate of mitochondrial fission given condition X. In addition, to be informative, the required data should encompass appropriate controls and sufficient replicates to support statistical analyses. The informative result is not designed to affirm one's intuition; it is required to support or negate the hypothesis.
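To make this concrete, here is a minimal sketch (in Python, with entirely hypothetical event counts and acquisition times) of how such an informative result might be assembled: per-cell fission rates under both conditions, compared with a statistical test so that the data can support or negate the hypothesis rather than merely affirm the desired outcome.

```python
# A minimal sketch of an "informative result" for the mitochondrial fission
# example: fission events counted per cell over a fixed imaging window, with
# and without condition X. All numbers are hypothetical.
import numpy as np
from scipy import stats

control_counts = np.array([4, 6, 5, 7, 5, 6, 4, 5])      # events/cell, no condition X
treated_counts = np.array([9, 8, 11, 10, 7, 12, 9, 10])  # events/cell, with condition X

# Convert counts to rates (events per minute) for a 20-minute acquisition.
acquisition_min = 20.0
control_rate = control_counts / acquisition_min
treated_rate = treated_counts / acquisition_min

# A two-sided Mann-Whitney U test makes no normality assumption; its outcome
# can support or negate the hypothesis regardless of what we hoped to see.
u_stat, p_value = stats.mannwhitneyu(treated_rate, control_rate,
                                     alternative="two-sided")
print(f"median rate (control): {np.median(control_rate):.3f} events/min")
print(f"median rate (treated): {np.median(treated_rate):.3f} events/min")
print(f"Mann-Whitney U = {u_stat:.1f}, p = {p_value:.4f}")
```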

Required data

As depicted in Fig. 1B, experimental design involves a reverse-thinking workflow that begins with the informative results and concludes with the choice of microscope. This reverse flow provides the necessary logic for designing a quantitative experiment. The essence of efficient experimental design is to home in on the appropriate assay from the multitude of possibilities offered by fluorescence microscopy. It is therefore imperative that the experimenter identifies what the necessary data are, as this will ultimately define the appropriate assay. This underscores the importance of thinking in reverse, as the necessary data can only be defined by the informative results. Although results and data are sometimes used interchangeably elsewhere, they are distinctly different in this context. Results refer to the final analytical metrics compiled from a set of related experiments. In contrast, a set of data generated by the microscope is, by itself, insufficient to speak to the validity of a hypothesis.

The transition from data to results requires certain translational steps. A good example of such a translation is the process of connecting the coordinates of a moving object, be it a cell or a particle, between time points into a defined track. Without further analysis, the tracked data of a moving object are only minimally informative; they merely indicate that the object has moved. If one were to hypothesize that the object changes its migratory behavior under certain conditions, then one would need to consider which measurements could describe that behavior. These informative measurements, when performed on the data, are referred to as the analytical metrics. In this example of characterizing migration patterns, the analytical metrics may include directionality, velocity and motion persistence (Aaron et al., 2019). Informative results are produced when these analytical metrics are applied to the appropriate data.
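As a sketch of these translational steps, the short Python example below turns a hypothetical 2D track (coordinates per frame) into three analytical metrics of the kind cited above: mean speed, a directionality ratio and a simple persistence measure. The coordinates, units and frame interval are invented for illustration.

```python
# A minimal sketch of turning tracked coordinates (data) into analytical
# metrics (results) for one 2D track. All values are hypothetical.
import numpy as np

track = np.array([[0.0, 0.0], [1.2, 0.3], [2.1, 1.1],
                  [2.9, 2.2], [4.0, 2.8]])   # (x, y) per frame, micrometers
dt = 30.0                                    # seconds between frames

steps = np.diff(track, axis=0)               # displacement per frame
step_lengths = np.linalg.norm(steps, axis=1)

path_length = step_lengths.sum()             # total distance travelled
net_displacement = np.linalg.norm(track[-1] - track[0])

mean_speed = path_length / (dt * len(steps))          # um/s
directionality = net_displacement / path_length       # 1.0 = straight line

# Persistence as the mean cosine of the angle between successive steps.
unit_steps = steps / step_lengths[:, None]
persistence = np.mean(np.sum(unit_steps[:-1] * unit_steps[1:], axis=1))

print(f"mean speed: {mean_speed:.4f} um/s")
print(f"directionality ratio: {directionality:.2f}")
print(f"persistence (mean cos theta): {persistence:.2f}")
```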

Adhering to our reverse-design approach, the next factor an experimenter must consider is the type of analytical metrics that will lead to the informative results. Table 1 shows how common biological objectives dictate the relevant analytical metrics, which in turn prescribe the necessary experimental tools. Analytical metrics are a form of semantics; what sets them apart from the semantics used in working models is that they are quantitative and specific rather than descriptive. What should be clear from Table 1 is that careful consideration is required to choose the appropriate analytical metrics. In fact, as reflected in the mitochondrial fission example, the analytical metric (mitochondrial fission rate) should be central to the hypothesis, so that the hypothesis can be tested. An additional example where the choice of the correct analytical metric affects the results is in colocalization studies. One must first determine whether measuring the degree of overlap (co-occurrence) of the two signals is more appropriate than measuring the extent of their correlation; this decision dictates the analytical metric that should be used (Aaron et al., 2018). Likewise, if a certain treatment is postulated to increase the dissemination of cancer cells from a cell cluster, it is important, from a mechanistic standpoint, to properly frame the testable hypothesis. This can be accomplished by avoiding vague descriptions such as 'dissemination' and instead framing the descriptor in quantitative terms, such as the velocity, directionality and persistence of the cellular movement (Aaron et al., 2019). This is how descriptive semantics are translated into quantitative semantics, thereby enabling the underlying biology to be measured.
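The distinction between correlation and co-occurrence can be made concrete with a small sketch. Here, both metrics are computed on synthetic two-channel images; the images, thresholds and the Manders-style formulation are illustrative assumptions, not a prescription for real data.

```python
# A minimal sketch of the two distinct colocalization metrics: intensity
# correlation (Pearson) versus co-occurrence (a Manders-style overlap
# fraction). Images are synthetic stand-ins.
import numpy as np

rng = np.random.default_rng(0)
red = rng.random((64, 64))                      # stand-in for channel 1
green = 0.6 * red + 0.4 * rng.random((64, 64))  # partially correlated channel 2

# Correlation: do intensities co-vary pixel by pixel?
pearson_r = np.corrcoef(red.ravel(), green.ravel())[0, 1]

# Co-occurrence: what fraction of channel-1 signal lies where channel 2
# is also present? (Manders-style coefficient with fixed thresholds.)
red_mask = red > 0.5
green_mask = green > 0.5
m1 = red[red_mask & green_mask].sum() / red[red_mask].sum()

print(f"Pearson correlation: {pearson_r:.2f}")
print(f"Manders-style co-occurrence M1: {m1:.2f}")
```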

Table 1. Selecting analytical metrics based on biological questions

Interestingly, and perhaps ironically, many of the analytical metrics listed in Table 1, such as velocity, directionality or curvature, collectively describe 'spatial-temporal dynamics'. Yet, owing to the various limitations of individual microscope designs, it is impossible to capture them all in one experiment (see the section on microscope selection below). Moreover, it is often counter-productive to acquire more data than one needs, as this complicates data analysis and compounds the problem of data storage (Andreev and Koo, 2020). Added complexity can lead to the experimenter being side-tracked from the original goal and makes data interpretation more difficult. Fig. 1C shows how iterative evaluation of the experimental output ensures that these readouts stay pertinent to the hypothesis while allowing room for observational biology to take place. Parsimonious selection of analytical metrics will focus the scope of the experiment, generating data that can test the hypothesis. However, well-considered selection of analytical metrics only fulfills half of the data requirement. One also needs to consider the validity of the data; in other words, how to ensure that the data set is accurate and reproducible.

Accuracy and reproducibility together describe the rigor of an experiment. Although the two are highly related, accurate data are not necessarily reproducible, and reproducibility does not ensure accuracy (Payne-Tobin Jost and Waters, 2019). Too often, the accuracy and reproducibility of microscopy data are only an afterthought, which can jeopardize an entire experiment. There are two places in which rigor can be compromised: during data generation and in the experimental design. Great care should be taken to ensure unbiased sampling, appropriate use of standards and controls, uniform instrument performance and consistent data-processing pipelines. Preserving accuracy and reproducibility during image acquisition has been extensively covered elsewhere (Jonkman, 2020; McQuin et al., 2018; Payne-Tobin Jost and Waters, 2019) and is beyond the scope of our discussion; nevertheless, that advice is extremely important and should be followed closely.

However, identifying the appropriate constraints for a rigorous experimental design can be equally challenging. How experimental controls and baselines are chosen can alter both the data and the results, and therefore cannot be taken lightly, because poor choices can skew data interpretation. In stark contrast to physics, in which absolute numbers for various universal constants can be mathematically derived, biology is a comparative science. In biology, it is the change in experimental readouts in response to a modification of the experimental variables that matters. As previously mentioned, modern microscopes will always generate quantifiable data because a digital image is intrinsically a data map. However, not all quantifiable digital images are meaningful. An absolute number derived from a colocalization experiment between two proteins (for example, a calculated Pearson's correlation coefficient of 0.75) is quantitative, but meaningless as a stand-alone piece of data. It has to be compared to controls to become biologically informative – has the Pearson's coefficient changed in response to a variation in the experimental condition? The importance of establishing an experimental baseline for comparison cannot be overstated. Owing to our inherent tendency to look for the desired outcome, experimental bias creeps in when a rigorous baseline is absent. Validation of an experimental pipeline, achieved through the effective use of controls and standards, ensures that the measurements accurately represent the biological truth (Payne-Tobin Jost and Waters, 2019). Although this sounds cliché, we find that comparative baselines are often forgotten. By articulating the necessary controls for a given hypothesis, the underlying nature of the experiment becomes more apparent; this, in turn, can be used to refine the hypothesis and home in on what the biologist seeks to test. Stringent controls make for better experiments.
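As a toy illustration of why a stand-alone coefficient needs a baseline, the sketch below compares a measured Pearson's coefficient against a null distribution obtained by scrambling one channel, a simple randomization control. The images are synthetic stand-ins, and the scramble is only one of several possible baselines.

```python
# A minimal sketch of a baseline for a colocalization coefficient: compare
# the measured Pearson value against a scrambled-channel null distribution.
import numpy as np

rng = np.random.default_rng(1)
ch1 = rng.random((64, 64))
ch2 = 0.5 * ch1 + 0.5 * rng.random((64, 64))

def pearson(a, b):
    return np.corrcoef(a.ravel(), b.ravel())[0, 1]

measured = pearson(ch1, ch2)

# Null distribution: scramble channel 2 to break any true spatial relationship.
null = np.array([pearson(ch1, rng.permutation(ch2.ravel()))
                 for _ in range(500)])

print(f"measured r = {measured:.2f}")
print(f"scrambled baseline: {null.mean():.3f} +/- {null.std():.3f}")
print(f"fraction of scrambles >= measured: {(null >= measured).mean():.3f}")
```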

When an experiment is driven by a hypothesis, the hypothesis itself defines the requirements of the experiment. These, in turn, define the parameters that circumscribe the rest of the microscopy assay. The key parameters in any microscopy experiment include one or more of the following: (i) lateral and axial spatial resolution, (ii) temporal resolution, (iii) tolerance to phototoxicity and photobleaching, (iv) field of view, (v) imaging depth, (vi) multiplexing capacity to acquire a combination of colors, and (vii) spectroscopic imaging capabilities. In a perfect world, a single microscope would excel at all of these parameters. In reality, no such microscope exists, as every microscope design requires trade-offs (Combs, 2010; Lemon and McDole, 2020; Scherf and Huisken, 2015; Schermelleh et al., 2010). Occasionally, the trade-off comes at an exorbitant price, as is especially the case with super-resolution microscopy. To gain the extra resolution, these modalities either sacrifice the capacity to image live phenomena altogether or incur unacceptable doses of illumination light that rapidly induce phototoxicity (Schermelleh et al., 2019). Thus, the trade-offs of an otherwise suitable microscope may render it incapable of producing the required data.

To avoid such situations, it is best to understand what needs to be captured by the microscope before selecting an instrument. This can be achieved by replacing ambiguous, descriptive semantics (e.g. 'membrane 3D dynamics') with the semantics of analytical metrics (e.g. 'filopodial angular deflection', 'membrane surface curvature') (see Table 1). By identifying the necessary metrics, the required imaging parameters can be prioritized. For example, the analytical metrics required to sufficiently measure the 3D membrane ruffles of a cell (Fritz-Laylin et al., 2017) include angular deflection, surface curvature, volumetric changes and the turnover rate of these membranous structures. These metrics mandate the following imaging parameters: (i) high volumetric imaging speed (multiple volumes per minute); (ii) improved axial resolution producing near or true isotropic resolution in all three axes, so that the ruffling structures can be resolved and segmented accurately; (iii) gentle illumination to minimize phototoxicity; (iv) live-cell-compatible imaging; and (v) labeling of the cell membrane capable of withstanding the high number of image acquisitions. Box 1 also provides a case study of how analytical metrics influence microscope choice. Specific analytical metrics do not preclude the experimenter from observing (and even exploring) the biology; instead, they help winnow the imaging parameters down to the bare essentials. Together, quantitative metrics and experimental parameters will guide the user to the optimal microscope(s).

Microscope selection

The task of microscope selection can be bewildering to novices, and at times confusing even to experienced microscopists. Biologists often face multiple hurdles in identifying suitable microscopes for an experiment through no fault of their own. These include (i) lack of access to the desired instrument, (ii) ill-informed demands from reviewers to use the latest technology in the name of innovation, (iii) over-promising of instrument capabilities by manufacturers, (iv) under-reporting of instrument limitations, and (v) insufficient or erroneous reporting of published results that renders experimental conditions irreproducible. Table 2 summarizes the features of various commonly used microscope modalities, as well as their relative advantages and shortcomings in our experience. Biologists have access to a wide range of modalities beyond standard widefield epifluorescence microscopes: total internal reflection fluorescence microscopy (Mattheyses et al., 2010), lightsheet microscopy (Chatterjee et al., 2018; Chen et al., 2014; Power and Huisken, 2017), confocal microscopy (Claxton et al., 2011; Conchello and Lichtman, 2005; Jonkman et al., 2020; Oreopoulos et al., 2014), two-photon excitation fluorescence microscopy (Benninger and Piston, 2013; So et al., 2000) and image scanning microscopy (Gregor and Enderlein, 2019), as well as super-resolution techniques (Demmerle et al., 2017; Sahl et al., 2017; Schermelleh et al., 2019; Sydor et al., 2015; Vicidomini et al., 2018). What should be immediately obvious from their comparison is that there is no 'winner' or 'loser' (Table 2). No microscope scores equally well or poorly across all parameters, reinforcing the notion that every microscope compromises on some parameters in order to excel at others. As a result, the process of microscope selection is rarely linear, and many instruments have overlapping capabilities that obscure the choice, requiring more than one instrument to be considered at a time. By defining the required parameters beforehand, they can be used to filter the selection down to the most appropriate instrument(s), as exemplified in the case study presented in Box 1. Ultimately, the justification for an instrument rests solely on its ability to provide the necessary analytical metrics and data informative of the biology.
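In practice, this filtering step can be as simple as the toy sketch below, in which candidate modalities are kept only if they meet every required parameter. The capability scores are illustrative placeholders, not values taken from Table 2.

```python
# A toy sketch of parameter-driven microscope selection: filter candidate
# modalities by required imaging parameters (scores 1-3 are placeholders).
candidates = {
    "widefield":      {"speed": 3, "axial_res": 1, "gentleness": 3, "depth": 1},
    "point confocal": {"speed": 1, "axial_res": 2, "gentleness": 1, "depth": 2},
    "lightsheet":     {"speed": 3, "axial_res": 2, "gentleness": 3, "depth": 2},
    "SIM":            {"speed": 2, "axial_res": 3, "gentleness": 2, "depth": 1},
}

# Requirements derived from the analytical metrics (e.g. 3D ruffle dynamics):
# fast volumetric imaging, decent axial resolution, gentle illumination.
required = {"speed": 3, "axial_res": 2, "gentleness": 3}

suitable = [name for name, caps in candidates.items()
            if all(caps[key] >= minimum for key, minimum in required.items())]
print("candidate instruments:", suitable)   # -> ['lightsheet']
```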

Table 2. Performance comparison of various microscope modalities

It is impractical to expect biologists to understand the myriad technical nuances of these rapidly evolving technologies. Moreover, most advanced imaging systems are concentrated in shared microscopy facilities managed by experienced microscopists. This makes it all the more important for biologists to communicate, precisely and concisely, the desired analytical metrics and the corresponding parameters required for a successful experiment. It is sometimes difficult to appreciate that the latest imaging technology is not always the most appropriate. A super-resolution microscope or an advanced lightsheet microscope may not be more suitable than a widefield epifluorescence microscope for a particular experiment. A microscope can only enhance certain parameters, and it is only beneficial if the enhanced parameters are utilized wisely. Even though structured illumination microscopy (SIM) offers improved resolution (see Table 2), it does not enhance the data of a cell-tracking study over what can be achieved with a standard widefield epifluorescence microscope. It is also important to note that sometimes no single existing imaging technology can produce the required data, necessitating the use of multiple instruments, or even modification of the testable hypothesis. Conversely, the availability of a new technology can make previously unfeasible analytical metrics possible, opening the door to different biological queries.

The microscopy literature has no shortage of excellent reviews on the technical aspects of various imaging modalities, as well as tutorials on how to generate quantitative and reproducible data. However, topical discussion of best practices and optics does not necessarily engender a coherent framework for how these sets of information can be integrated to facilitate a hypothesis-driven, quantitative experimental design. Here, we present not only a roadmap for how to use these guides in the literature, but we also break with convention and argue that microscopy-based quantitative experiments should be designed in reverse, starting with determining the informative results needed to challenge a hypothesis.

Despite the promises of the latest technologies, no microscope is perfect; a feature gained in a technique usually comes at the cost of other key parameters. The essence of experimental design is never about the inclusion of every parameter the experimenter wants; rather, it is about the careful exclusion of unnecessary parameters. This allows accurate measurements to be performed and ensures that the parameters relevant to the information the experimenter needs are maintained. This is the core concept of our approach. The essential parameters must be determined by what is required to test a hypothesis, and these parameters will, in turn, naturally shape the rest of the experimental pipeline (Fig. 1A). A hypothesis-driven experimental design must be just that – driven by the hypothesis. It should be guided by the biological question at hand, not by the lure of the latest technologies. Fortunately, this process is an iterative feedback loop. Key questions left unanswered for lack of technology inspire the development of novel microscopes; new technologies then reciprocally inform biology so that new hypotheses can be formulated. This cyclical process, however, does not negate the fact that experiments should be framed within the confines of existing technologies.

This Opinion article does not, by any means, diminish the exploratory power of microscopes or the well-honed acumen of biologists to observe and deduce. On the contrary, most hypotheses are synthesized following keen observation. Our scope here is the process of quantitatively testing a hypothesis. We have not addressed how the power of modern microscopy has been harnessed for big-data scientific exploration. Such experiments are usually hypothesis-free; instead, machine-learning algorithms are employed to search for patterns beyond what human perception can efficiently discern (Chessel and Carazo Salas, 2019; Piccinini et al., 2017).

Quantitative microscopy experiments are not easy to design, as they require knowledge at the confluence of optics, imaging probes, data analysis and how biological samples interact with the microscope. It is therefore of paramount importance for biologists to seek and heed the advice of expert microscopists and data scientists, especially those in core facilities, who are experienced in the application of microscopy. The conventional practice of generating a large amount of data first and treating data analysis as a secondary consideration should be avoided. Microscopy-based experiments demand careful planning and continued, iterative evaluation before the optimal approaches can be implemented. This message is echoed in every review and guide cited here because it is important and, unfortunately, because it is commonly overlooked. The perils of ignoring it cannot be overstated.

We thank Dr Christopher Obara as well as the members of the Advanced Imaging Center for their thoughtful discussion and insightful contributions.

The Advanced Imaging Center at Janelia Research Campus is generously supported by the Howard Hughes Medical Institute and the Gordon and Betty Moore Foundation.


3.4 Sampling Techniques in Quantitative Research

Target population

The target population comprises the people the researcher is interested in studying and to whom the findings will be generalized. 40 For example, if researchers are interested in vaccine-preventable diseases in children aged five years and younger in Australia, the target population is all children aged 0–5 years residing in Australia. The actual population is a subset of the target population from which the sample is drawn, e.g. children aged 0–5 years living in the capital cities in Australia. The sample is the people chosen for the study from the actual population (Figure 3.9). The sampling process involves the selection of these people and is distinct from the sample itself. 40 In quantitative research, the sample must accurately reflect the target population, be free from selection bias, and be large enough to validate or reject the study hypothesis with statistical confidence and minimise random error. 2

[Figure 3.9: The target population, the actual population drawn from it, and the study sample]

Sampling techniques

Sampling in quantitative research is a critical component that involves selecting a representative subset of individuals or cases from a larger population, and it often employs sampling techniques based on probability theory. 41 The goal of sampling is to obtain a sample that is large enough and representative of the target population. Examples of probability sampling techniques include simple random sampling, stratified random sampling, systematic random sampling and cluster sampling (illustrated in the sketch below). 2 The key feature of probability techniques is that they involve randomization. Probability sampling has two main characteristics: all individuals of a population are (theoretically) accessible to the researcher, and each person in the population has an equal chance of being chosen for the study sample. 41 While quantitative research often uses sampling techniques based on probability theory, some non-probability techniques may occasionally be utilised in healthcare research. 42 Non-probability sampling methods, which include purposive, convenience, theoretical and snowball sampling, are commonly used in qualitative research and are discussed in detail in chapter 4.
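The sketch below illustrates the four probability techniques on a hypothetical sampling frame of 1,000 individuals; the frame, strata and cluster sizes are invented for demonstration.

```python
# A minimal sketch of four probability sampling techniques over a
# hypothetical sampling frame of 1,000 individuals.
import random

random.seed(42)
frame = list(range(1000))      # identifiers for the actual population
n = 50                         # desired sample size

# Simple random sampling: every individual has an equal chance of selection.
simple = random.sample(frame, n)

# Systematic random sampling: a random start, then every k-th individual.
k = len(frame) // n
start = random.randrange(k)
systematic = frame[start::k][:n]

# Stratified random sampling: sample within strata in proportion to their
# size (two hypothetical strata: 70% urban, 30% rural residence).
urban, rural = frame[:700], frame[700:]
n_urban = round(n * len(urban) / len(frame))
stratified = random.sample(urban, n_urban) + random.sample(rural, n - n_urban)

# Cluster sampling: randomly select whole clusters (e.g. 5 of 50 clinics)
# and include every individual in the chosen clusters.
clusters = [frame[i:i + 20] for i in range(0, len(frame), 20)]
cluster_sample = [person for cluster in random.sample(clusters, 5)
                  for person in cluster]

print(len(simple), len(systematic), len(stratified), len(cluster_sample))
# -> 50 50 50 100
```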

Sample size calculation

In order to enable comparisons with an established level of statistical confidence, quantitative research needs an acceptable sample size. 2 The sample size is the most crucial factor for reliability (reproducibility) in quantitative research. It is important for a study to be adequately powered – power being the likelihood of identifying a difference if one exists in reality. 2 Studies with small samples are more likely to be underpowered, and their results are more prone to random error. 2 The formula for sample size calculation varies with the study design and the research hypothesis. 2 There are numerous formulae for sample size calculations, but such details are beyond the scope of this book; for further reading, please consult the biostatistics textbook by Hirsch (2021). 43 However, we will introduce a simple formula for calculating the sample size for cross-sectional studies with prevalence as the outcome. 2

n = z² × p × (1 − p) / d²

z = the z-score corresponding to the desired level of statistical confidence; z = 1.96 corresponds to 95% confidence and z = 1.645 corresponds to 90% confidence

p = Expected prevalence (of health condition of interest)

d = the intended precision; d = 0.1 means that the estimate falls within ±10 percentage points of the true prevalence at the chosen level of confidence (e.g. for a prevalence of 40% (0.4) and d = 0.1, the estimate will fall between 30% and 50% (0.3 to 0.5)).

Example: A district medical officer seeks to estimate the proportion of children in the district receiving appropriate childhood vaccinations. Assuming a simple random sample of the community is to be selected, how many children must be studied if the resulting estimate is to fall within 10 percentage points of the true proportion with 95% confidence? It is expected that approximately 50% of the children receive vaccinations.


z = 1.96 (95% confidence)

d = 10% = 10/100 = 0.1 (estimate to fall within 10 percentage points)

p = 50% = 50/100 = 0.5

Now we can enter the values into the formula:

n = (1.96² × 0.5 × 0.5) / 0.1² = 96.04

Given that a sample cannot include a fraction of a person, it is important to round up to the nearest whole number; the required sample size is therefore 97 children.
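For convenience, the calculation can be wrapped in a few lines of Python; this sketch simply encodes the formula above and reproduces the worked example.

```python
import math

def sample_size(z: float, p: float, d: float) -> int:
    """Cross-sectional sample size: n = z^2 * p * (1 - p) / d^2, rounded up."""
    return math.ceil(z ** 2 * p * (1 - p) / d ** 2)

print(sample_size(z=1.96, p=0.5, d=0.1))   # -> 97 (from n = 96.04)
```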



Conceptions of Good Science in Our Data-Rich World

Kevin C. Elliott, Kendra S. Cheruvelil, Georgina M. Montgomery and Patricia A. Soranno

Kevin C. Elliott (kce@msu.edu) is an associate professor in Lyman Briggs College, the Department of Fisheries and Wildlife, and the Department of Philosophy; Kendra S. Cheruvelil is an associate professor in Lyman Briggs College and the Department of Fisheries and Wildlife; Georgina M. Montgomery is an associate professor in Lyman Briggs College and the Department of History; and Patricia A. Soranno is a professor in the Department of Fisheries and Wildlife at Michigan State University, in East Lansing. All authors contributed equally to the conceptualization of the paper and the supporting research. KCE organized the collaboration and initiated the writing process. All authors contributed text, reviewed manuscript drafts, and approved the final version.

Scientists have been debating for centuries the nature of proper scientific methods. Currently, criticisms being thrown at data-intensive science are reinvigorating these debates. However, many of these criticisms represent long-standing conflicts over the role of hypothesis testing in science and not just a dispute about the amount of data used. Here, we show that an iterative account of scientific methods developed by historians and philosophers of science can help make sense of data-intensive scientific practices and suggest more effective ways to evaluate this research. We use case studies of Darwin's research on evolution by natural selection and modern-day research on macrosystems ecology to illustrate this account of scientific methods and the innovative approaches to scientific evaluation that it encourages. We point out recent changes in the spheres of science funding, publishing, and education that reflect this richer account of scientific practice, and we propose additional reforms.

Scientists have been debating for centuries the nature of proper scientific methods, especially the role of hypothesis testing in scientific practice (Laudan 1981 ). These debates are being reinvigorated as many fields of science, including high-energy physics, astronomy, public health, climate science, environmental science, and genomics, are increasingly using data-intensive approaches (Bell et al. 2009 , Baraniuk 2011 , Winsberg 2010 , King 2011 , Porter et al. 2012, Mattman 2013 , Khoury and Ioannidis 2014 , Katzav and Parker 2015 ). Data-intensive science has been described as research in which the capture, curation, and analysis of (usually) large volumes of data are central to the scientific question; it has also been defined as research that uses data sets so large or complex that they are hard to process and analyze using traditional approaches and methods (Hey et al. 2009 , Critchlow and van Dam 2013).

Although the term data intensive is relatively new, historians of science point out that scientists have been capturing, curating, and analyzing large volumes of data for centuries in ways that have challenged existing techniques (Muller-Wille and Charmantier 2012 ). For example, the disciplines of natural history and taxonomy provide important historical examples of data-intensive research; as Strasser (2012) put it, “Renaissance naturalists were no less inundated with new information than our contemporaries” (p. 85). However, contemporary data-intensive science is also characterized by new computational methods and technologies for creating, storing, processing, and analyzing data and also by the use of interdisciplinary teams for designing and implementing research to address complex societal challenges (Strasser 2012, Leonelli 2014 ). Consequently, in some areas of science (e.g., astronomy), there can be particularly sharp distinctions between historical and current data-intensive approaches, whereas in other areas of science (e.g., natural history), there are fewer differences (Evans and Rzhetsky 2010 , Haufe et al. 2010 , Pietsch 2016 ).

Contemporary examples of data-intensive science include collecting evidence for the existence of the Higgs boson, sequencing the human genome, developing computer models of climate change and carbon sequestration, and identifying relationships between social networks and human behaviors. Despite these high-profile examples and the increasing availability of large data sets for many science disciplines, there are concerns that contemporary data-intensive research is bad for science or that it will lead to poor methodology and unsubstantiated inferences. For example, data-intensive research has been criticized for being atheoretical, being nothing more than a “fishing expedition,” having a high probability of leading to nonsense results or spurious correlations, being reliant on scientists who do not have adequate expertise in data analysis, and yielding data biased by the mode of collection (Boyd and Crawford 2012 , Fan et al. 2014, Lazer et al. 2014 ).

Such concerns actually reflect deeper and more widespread debates about the centrality of hypothesis-driven research that have challenged the scientific community for centuries. Most contemporary scientific disciplines share a commitment to a hypothesis-driven methodology (see Peters R 1991, Weinberg 2010 , Keating and Cambrosio 2012 , Fudge 2014 ). Definitions for hypotheses vary across disciplines (ranging from specific to general and quantitative to qualitative; Donovan et al. 2015 ), but we define hypothesis-driven methodology in terms of the linear process canonized in many textbooks and represented in figure 1.

Figure 1. Linear account employed in many descriptions of the scientific method.

Although this linear scientific process continues to be held up as an exemplar in many textbooks and grant proposal guidelines (Harwood 2004 , O'Malley et al. 2009 , Haufe 2013 ), recent commentaries from scientists and historians and philosophers of science have argued that historical and contemporary scientific practices incorporate a much more complex, iterative mixture of different methods (e.g., Kell and Oliver 2004 , Glass and Hall 2008 , Gannon 2009 , O'Malley et al. 2010 , Forber 2011 , Elliott 2012 , Glass 2014 , Peters DPC et al. 2014, Pietsch 2016 ). These scholars argue that focusing primarily on a linear, hypothesis-driven account of science impoverishes the scientific enterprise by encouraging scientists to focus on narrowly defined questions that can be posed as testable hypotheses. For example, hypothesis-driven approaches are particularly helpful for choosing between alternative mechanisms that could explain an observed phenomenon (e.g., through a controlled experiment), but they are much less helpful for mapping out new areas of inquiry (e.g., the sequence of the human genome), identifying important relationships among many different variables, or studying complex systems. According to those who accept an iterative account of scientific methods, attempting to draw a sharp distinction between hypothesis-driven and data-intensive science is misleading; these modes of research are not in fact orthogonal and often intertwine in actual scientific practice (e.g., O'Malley et al. 2009 , Elliott 2012 , Peters DPC et al. 2014).

Unfortunately, the historical and philosophical literature on iterative scientific methods has not been well integrated into recent accounts of data-intensive research, nor have the implications for evaluating research quality been fully explored. We address both of these gaps by showing how data-intensive research can be conceptualized more effectively using iterative accounts of scientific methods and by showing how these accounts encourage innovative approaches for evaluation. We argue that the key to assessing the appropriateness of data-intensive research—and, indeed, any scientific practice—is to evaluate how it is situated within broader research practices. Scientific practices should be evaluated on the basis of the significance of the knowledge gap that they address and the alignment between the nature of the gap and the approach or combination of approaches used to address it. In order to better reflect scientific practices and to accommodate all scientific approaches, including data-intensive ones, we point out recent changes and propose additional reforms in the spheres of funding, publishing, and education.

Debates over scientific methods

Contemporary debates over data-intensive methods are merely the latest episode in a long-standing conflict over the proper roles of hypotheses in scientific research. In the seventeenth century, figures such as Robert Boyle and Robert Hooke espoused the use of hypotheses, whereas Francis Bacon and Isaac Newton argued that investigators could easily be led astray if they proposed bold conjectures rather than working inductively from the available evidence (Laudan 1981, Glass 2014). These examples illustrate the long history during which hypothesis-driven science has waxed and waned in popularity (figure 2; Laudan 1981). Most scientists did not favor the use of hypotheses during the eighteenth century, but this perspective changed dramatically over the next 100 years (Laudan 1981). By the late nineteenth century, largely descriptive disciplines such as natural history were beginning to be dismissed as a form of "stamp collecting" (Johnson 2007). Popper's (1963) emphasis on the hypothetico-deductive (H-D) method proved hugely influential during the twentieth century, and most textbooks continue to focus on hypothesis testing as the core of the scientific method (see figure 1; Harwood 2004). Although some scientists, publishers, and funders have remained loyal to a Popper-informed account of the scientific method that privileges hypothesis-driven research, many today are questioning this focus and mirroring the methodological debates embodied in previous time periods (Hilborn and Mangel 1997, Kell and Oliver 2004, Glass and Hall 2008, Peters DPC et al. 2014).

Figure 2. A depiction of the waxing and waning of hypothesis-driven approaches.

In particular, despite the huge potential for new data-intensive methodologies to generate knowledge (King 2011 ), the advent of these techniques has raised questions about the appropriate relationships between hypothesis-driven and observationally driven modes of investigation (Kell and Oliver 2004 , Beard and Kushmerick 2009 ). Again, historians of science have shown that this debate is not a new one and that scientists have struggled for centuries with storing, analyzing, and standardizing large quantities of data (Muller-Wille and Charmantier 2012 ). Nevertheless, contemporary data-intensive science raises additional issues because of its extensive use of statistical and computer science methodologies and interdisciplinary teams (Strasser 2012), thereby adding further dimensions to debates about appropriate scientific methods.

A richer account of scientific practice

Many concerns about data-intensive research can be addressed by defining scientific practice more broadly (figure 3), as has been argued in recent historical and philosophical studies of scientific methods. Taking this view, the fundamental goal of science is to address gaps or challenges facing our current state of knowledge. Hypothesis testing is one approach for filling these knowledge gaps, but science proceeds in other ways as well (Chang 2004, Franklin 2005, O'Malley et al. 2009, Elliott 2012, O'Malley and Soyer 2012). Scientists attempt to answer research questions with observations, field studies, or integrated databases (Leonelli 2014); they engage in exploratory inquiry or modeling exercises to detect patterns in available data (Steinle 1997, Burian 2007, Elliott 2007, Winsberg 2010, Katzav and Parker 2015); or they create new tools, techniques, and methods (Baird 2004, O'Malley et al. 2010), all of which in turn enable them to test hypotheses, answer questions, or gather additional data more effectively.

Figure 3. A representation of scientific practice as an iterative process, with many approaches and links (as depicted by two-way arrows). The evaluation or assessment of scientific practices is based on the importance of the knowledge generated, the importance of the gap or challenge addressed, and the alignment of the approaches and methods used to conduct the science.

This multiplicity of different research approaches is not new, but it has become even more prominent in contemporary data-intensive research. Historically, it was often most efficient for scientists to work from hypotheses that guided their inquiry in the most promising directions. But with the advent of high-throughput technologies and data-mining techniques that make data less expensive to generate and analyze, other approaches that are more inductive also play a fruitful role in scientific research (Franklin 2005, Servick 2015 ). Broad hypotheses or background assumptions may still provide guidance about what sorts of questions or exploratory inquiries are likely to be most fruitful, but these are not the sorts of specific hypotheses envisioned by most hypothesis-driven accounts of scientific method (Franklin 2005, Leonelli 2012 , Ratti 2015 ). Because it is difficult (often impossible) for an individual scientist to become an expert in all of these contemporary approaches and methods, good science also incorporates the most appropriate disciplines and collaborators, thus making the development of effective—and often interdisciplinary—scientific teams more essential than in the past, and the resulting research reflects a combination of methods originating from multiple disciplines (Cheruvelil et al. 2014, NRC 2015).

An important feature of the scientific methods illustrated in figure 3 is that they are often employed in an iterative fashion in order to address complex research challenges (Chang 2004, O'Malley et al. 2010, Elliott 2012, Leonelli 2012). Although some contemporary data-intensive research focuses primarily on the repeated use of inductive methods and machine-learning algorithms (Evans and Rzhetsky 2010, Lazer et al. 2014, Pietsch 2016), much of it involves a combination of different approaches. O'Malley and colleagues (2010) argued that not only data-intensive research but scientific practice as a whole should be characterized as an iterative interplay among at least four modes of research: hypothesis-driven, question-driven, exploratory, and tool- and method-oriented. As inquiry proceeds, some initial questions are refined, whereas others are revised or give rise to new lines of research. In an effort to address these questions, new equipment and techniques are often developed and tested, frequently generating new questions and altering old ones. In the course of investigating questions and developing new techniques, exploratory approaches are often central (O'Malley et al. 2010). These exploratory efforts, which can include experimentation, data mining, and simulation modeling, often involve the systematic variation of experimental parameters or the analysis of datasets in search of important regularities and patterns (Elliott 2007, Winsberg 2010). In many cases, this web of activities generates the tightly constrained contexts in which specific hypotheses can be fruitfully tested, but such testing may be just one component of a much broader scientific context. In fact, the methodological iteration between different approaches results in a process of epistemic iteration by which our knowledge is gradually altered and improved (Elliott 2012), as depicted by the two-way arrows in figure 3 that highlight the links among knowledge, motivation, and the multiple approaches employed by scientists.

One of the primary lessons of the iterative model of scientific methods is that contemporary research, and especially data-intensive research, incorporates a wide variety of approaches, which gain their significance primarily from their roles in broader research programs and lines of inquiry. Therefore, evaluating the quality of this work requires much more than confirming that it incorporates a well-formulated hypothesis (Kell and Oliver 2004, Beard and Kushmerick 2009). Instead, it should be evaluated on the basis of the alignment between the nature of the knowledge gap or challenge addressed and the combination of approaches or methods used to address that gap. Research should be evaluated favorably if it incorporates approaches and methods that are well suited for addressing an important gap in current knowledge, even if they do not focus solely or primarily on hypothesis testing (figure 3).

An iterative model of scientific practice alleviates many common concerns about data-intensive research. The potential for generating spurious correlations becomes less serious when data-generated patterns are identified and evaluated as part of larger research projects that incorporate broader research questions, hypotheses, or objectives and when appropriate techniques and inferences are used to deal with spurious correlations (Hand 1998). These projects are also frequently embedded within conceptual frameworks or theories that facilitate the investigation of underlying causal mechanisms. Some proponents of data-intensive science argue that it can largely replace hypothesis testing, focusing on generating correlations rather than seeking causal understanding (Prensky 2009, Steadman 2013). In contrast, we contend that data-intensive science will typically be most fruitful when it is part of broader inquiries that guide the collection and interpretation of data and that provide additional investigations of the correlations that are generated (Leonelli 2012, Kitchin 2014). Finally, the worry that individual researchers do not have the skill sets to perform data-intensive work can be alleviated by the development of interdisciplinary research teams that can accomplish the iterative tasks required for many contemporary scientific research projects. Admittedly, data-intensive methods can still be used inappropriately, such as when data are collected without standard approaches or quality metadata or when data are simply mined for correlative relationships without attention to spurious correlations (Hand 1998). However, we argue that this reflects improper technique or a poorly designed research program, which can occur in any form of scientific practice; it is not a problem inherent in data-intensive methods themselves.
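
To make the spurious-correlation worry concrete, the short sketch below is a minimal illustration of our own (it is not drawn from any of the studies cited above and assumes Python with NumPy and SciPy). Mining 1,000 pure-noise variables for associations with an outcome produces dozens of nominally "significant" correlations by chance alone; a standard multiple-testing correction such as the Benjamini-Hochberg false-discovery-rate procedure eliminates nearly all of them.

    # Illustration only: why unconstrained data mining yields spurious
    # "discoveries," and how false-discovery-rate (FDR) control helps.
    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(0)
    n_samples, n_features = 100, 1000

    # Pure noise: no feature is truly related to the outcome.
    X = rng.normal(size=(n_samples, n_features))
    y = rng.normal(size=n_samples)

    # Correlate every feature with the outcome and collect p-values.
    pvals = np.array([stats.pearsonr(X[:, j], y)[1] for j in range(n_features)])
    print("naive 'discoveries' at p < 0.05:", int(np.sum(pvals < 0.05)))  # roughly 50

    # Benjamini-Hochberg step-up procedure at FDR level q = 0.05.
    q = 0.05
    sorted_p = np.sort(pvals)
    thresholds = q * np.arange(1, n_features + 1) / n_features
    passing = np.nonzero(sorted_p <= thresholds)[0]
    n_discoveries = passing[-1] + 1 if passing.size else 0
    print("discoveries after FDR control:", n_discoveries)  # typically 0

Embedding such checks within a broader research program, as argued above, is precisely what keeps exploratory mining from being mistaken for confirmation.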

Examples of iterative data-intensive research practices

The interplay between multiple research approaches can be observed across many scientific subdisciplines and time periods. To illustrate, we present two examples drawn from the natural sciences. The first, the study of evolution by natural selection (figure 4a), highlights the historical nature of these debates concerning scientific methods: even though contemporary data-intensive approaches have unique characteristics, historical research also incorporated iterative and data-intensive components. The second, the study of macrosystems ecology (figure 4b), highlights how methods from contemporary data-intensive ecology are being used to better understand broad-scale ecological research questions and environmental problems. It also illustrates how contemporary data-intensive research incorporates greater use of computational approaches and interdisciplinary teams than did historical data-intensive research.

Figure 4. Two examples of iterative scientific efforts using multiple approaches.

The historical study of evolution by natural selection

Darwin's development of the theory of natural selection provides a classic example of research that incorporates multiple approaches. Despite the efforts of some commentators to reconstruct Darwin's research as primarily hypothesis-driven (Ayala 2009), he spent more than two decades performing exploratory work in an effort to identify the patterns that he later explained in The Origin of Species. Driven by curiosity and a naturalist's love of nature, as well as by a structured observational agenda learned from scholars such as Humboldt, Cuvier, and Lyell, Darwin made observations during his famous voyage aboard the Beagle that generated questions guiding his inductive data collection over subsequent decades. During that time, he drew upon a wide range of methods and sources (Hodge 1983), including data produced by fellow members of the traditional scientific elite as well as by countless women and other so-called amateurs practicing science outside of the scientific societies and journals of the nineteenth century. In the Origin, for instance, Darwin cites animal breeders as an important source of data, and in Expression of Emotions, mothers provided observations of their own children to supplement those Darwin made of his own family (Harvey 2009, Montgomery 2012).

Darwin's use of natural history methods led Frank Gannon to write a tongue-in-cheek editorial pointing out that, under today's funding structure, Darwin's work would be dismissed as “an open-ended ‘fishing expedition’” (Gannon 2009). However, Darwin also engaged in experiments that showed how his theory of evolution could explain the details of sexual form in plant species (Bellon 2013). His combination of methods and compilation of data from a variety of sources proved extremely fruitful, and works such as Origin (1859), The Variation of Animals and Plants under Domestication (1868), The Descent of Man (1871), and Expression of Emotions (1872) all embody a blend of what are now often held up as distinct approaches: inductive and deductive methods, observation and experiment.

Even in his own time, Darwin had to consciously navigate scientific norms when considering how to present his multimodal research. For example, following nineteenth-century philosophers of science such as William Whewell and John F. W. Herschel, Darwin organized the Origin to conform to the scientific values of the day—namely, demonstrating the strength of a theory by the breadth of facts it explained (Ruse 1975). Arguing from analogy, as Whewell recommended, Darwin began by recognizing an uncontested phenomenon—that artificial selection quickly produced drastic structural changes in the domestic breeding of animals such as pigeons—and used this accepted truth to compel the reader to accept his inference that natural selection accounted for species change.

Darwin's use of both inductive and deductive methods also followed Whewell's methodological recommendations. In contrast with more recent accounts of hypothesis-driven science, Whewell insisted that scientists should move through a very gradual inductive process to arrive at successively more general causal laws (Snyder 1999). Only after performing this inductive process did he think that scientists could legitimately move on to test the resulting hypotheses. Thus, Whewell himself encouraged the use of a combination of research modes, and this is reflected in Darwin's works. Philosophers of science have since debated the extent to which Darwin was influenced by different methodologists (including Francis Bacon and John Stuart Mill, as well as Whewell and Herschel) and precisely when Darwin switched from an inductive to a deductive approach during the 20-plus years of gestation of the Origin (Ruse 1975, Hodge 1991). Regardless of the exact year when this switch occurred, it is clear that scientists today—like Darwin—often move back and forth between the best aspects of inductive and deductive logic when formulating and testing a theory. Similarly—and again like Darwin—scientists often blend laboratory and field work, observation and experiment, and data from multiple sources rather than conforming to the artificially distinct modes of scientific practice sometimes held up as “traditional” to a particular field of science, despite the long history of a multimodal reality.

The contemporary study of macrosystems ecology

A contemporary example of data-intensive research that involves multiple and iterative approaches comes from the emerging subdiscipline of macrosystems ecology (Heffernan et al. 2014). Most traditional ecological research studies organisms and their environments at relatively small scales—such as individual species, communities, or ecosystems—using methods such as lab or field experiments, modeling, field surveys, or long-term studies (Carmel et al. 2013). However, environmental changes such as the spread of invasive species, climate change, and land-use intensification are occurring globally, result from relationships and interactions between human and natural systems, and may have widespread but complicated effects. For example, across regions and continents (at scales of hundreds of kilometers), there are differences in the direction and magnitude of environmental changes, in the underlying geophysical and ecological contexts, and in social structures. These differences mean that results from fine-scaled studies in some regions are unlikely to apply to other regions and that ecological systems must also be studied at larger scales, from regions to continents. Macrosystems ecology fills this gap by explicitly studying fine-scaled ecological patterns and processes nested within regions and continents, employing a variety of methods to do so.

Such a multiscaled understanding of ecological systems cannot be achieved through an individual hypothesis test or field experiment, nor can it be achieved using only one approach (Heffernan et al. 2014, Levy et al. 2014). For example, to understand the complex relationships among tree growth, human disturbance, and regional and global climate, scientists need to study forests as a whole, using multiple methods within a region rather than working at the scale of individual trees or stands (Chapin et al. 2008). One approach that ecologists have used to study ecological systems at regional scales is to quantitatively delineate ecological regions that represent a measured combination of the geophysical features thought to influence fine-scaled ecological processes (Cheruvelil et al. 2013). However, existing ecological regions are limited because they were created for a variety of purposes, from different underlying geophysical and human data, and with a variety of methods.

For example, lake water quality is related to both climate and land use, so scientists expect it to respond strongly to changes in either. That response, however, is likely to vary among regions and continents. In fact, Cheruvelil and colleagues (2008, 2013) observed that lake water chemistry varied regionally but that the variation depended on how the boundaries of “regions” were defined. They therefore set the overarching goal of developing new ways to define regional boundaries based on the geophysical features likely to be important for predicting regional water quality and its response to climate and land-use change (figure 4b). Meeting this goal required the iterative use of multiple research methodologies, data collected by various individuals and groups, and contributions from multiple disciplines.

An interdisciplinary team was created (sensu Cheruvelil et al. 2014) that included ecologists, computer scientists, and experts in geospatial analysis and ecoinformatics to build a large, multiscaled database integrating multiple lake data sources (including field surveys of water quality conducted by state agency scientists, citizen scientists, and university researchers) with geospatial data quantified at the national scale (Soranno et al. 2015). The team used three data-intensive approaches to meet its goal of developing new ecological regions for water quality (figure 4b): First, it developed and tested a clustering algorithm to define regional boundaries (Yuan et al. 2015); second, it used an exploratory data-mining analysis to determine which geophysical features were correlated with the regional boundaries and might lend insight into the underlying mechanisms driving regional variation in lake water quality (Cheruvelil, Lyman Briggs College and Department of Fisheries and Wildlife, Michigan State University, East Lansing, personal communication, 9 November 2015); and third, it used statistical models to quantify how well the regional boundaries captured variation in lake water quality for thousands of lakes in approximately 100 regions (Cheruvelil, personal communication, 9 November 2015). The resulting ecological regions were created from a variety of geophysical features related to lake water quality, many of which are expected to be strongly affected by changes in climate and land use. Employing multiple scientific practices, rather than solely a hypothesis-driven approach, improved the team's ability to use the regional scale for understanding, explaining, and predicting ecological phenomena across spatial scales.
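
The sketch below is a hypothetical, greatly simplified illustration of that general workflow; it is not the team's actual algorithm (Yuan et al. 2015), and the variable names and simulated data are invented for illustration (Python with NumPy and scikit-learn assumed). It clusters lakes into candidate regions on their geophysical features, then measures how much of the variation in a water-quality variable the regions capture.

    # Hypothetical illustration of regionalization: cluster lakes on
    # geophysical features, then ask how much water-quality variation
    # the resulting regions capture (an ANOVA-style variance ratio).
    import numpy as np
    from sklearn.cluster import KMeans

    rng = np.random.default_rng(42)
    n_lakes = 2000

    # Stand-ins for geophysical predictors (e.g., climate, land use, geology).
    geophysical = rng.normal(size=(n_lakes, 5))

    # Simulated water quality, loosely driven by two of the predictors.
    water_quality = (geophysical[:, 0] + 0.5 * geophysical[:, 1]
                     + rng.normal(scale=1.0, size=n_lakes))

    # Step 1: delineate candidate regions by clustering geophysical features.
    regions = KMeans(n_clusters=20, n_init=10, random_state=0).fit_predict(geophysical)

    # Step 2: between-region sum of squares over total sum of squares.
    grand_mean = water_quality.mean()
    ss_total = np.sum((water_quality - grand_mean) ** 2)
    ss_between = 0.0
    for r in np.unique(regions):
        group = water_quality[regions == r]
        ss_between += len(group) * (group.mean() - grand_mean) ** 2

    print(f"fraction of water-quality variance captured by regions: {ss_between / ss_total:.2f}")

In the actual study, the clustering operated on curated national-scale geospatial data and the evaluation used statistical models across thousands of lakes, but the underlying logic of aligning the regionalization method with the variance-explanation question is the same.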

Lessons learned from examples of iterative data-intensive research

Together, these two examples illustrate the major points that we have made in this article. First, they show that although scientists have been working with challenging quantities of data for centuries, contemporary data-intensive science incorporates additional features. For example, whereas Darwin received data from numerous sources, he worked primarily on his own (with input from colleagues) to analyze the data. In contrast, the environmental scientists in the second example worked with computer scientists and experts in ecoinformatics in order to make optimal use of contemporary computational tools for integrating, creating, and analyzing data.

Second, these examples illustrate the power of moving iteratively among multiple research methods. What made both of these research efforts successful is not the fact that they used a particular approach but rather that the approaches they chose were well designed for addressing important knowledge gaps. In Darwin's case, his research was important because he was addressing one of the most fundamental issues in biology—namely, the processes by which species have changed over time. Similarly, the scientists in our second example were addressing the important societal issue of the response of water quality to environmental changes at macroscales. Encouraging scientists to emulate the iterative approaches embodied in these two examples requires the development of richer conceptions of scientific practice.

Recommendations for promoting good science in our data-rich world

A number of reforms should be made to promote not only iterative data-intensive science but also the scientific enterprise more broadly (table 1). First, funding agencies (and reviewers) should evaluate the quality of proposed research not on a uniform requirement that it state a specific hypothesis but on the importance of the knowledge gaps it identifies and the appropriateness of the methods proposed for addressing those gaps (O'Malley et al. 2009). For example, some recent funding initiatives emphasize grand challenges (e.g., the human genome project, brain research, personalized medicine, smart cities) that do not lend themselves to solely hypothesis-based approaches. Therefore, rather than expecting researchers to shoehorn proposals into a misleading, linear research format, reviewers should be open to proposals that describe a more realistic, iterative research trajectory. This reform will require developing grant guidelines and review mechanisms that encourage mixed modes of scientific practice, such as those recently used by the US National Institutes of Health to fund investigators rather than individual projects (table 1).

Table 1. Recommendations for promoting iterative data-intensive science.

Funding
  • Current norm: Proposals are expected to have an organizing hypothesis. Proposed reform: Proposals should be expected to show alignment between knowledge gaps and approaches. Recent exemplar: Several institutes of the NIH have introduced long-term funding opportunities that allow investigators to pursue more creative, innovative research projects.
  • Current norm: Proposals are expected to describe a linear, non-iterative approach. Proposed reform: Proposals should be expected to describe the appropriate iterative use of multiple approaches. Recent exemplar: The Biotechnology and Biological Sciences Research Council of the UK describes multiple methods that are integrated into the systems-biology research it funds.

Publishing
  • Current norm: Articles are expected to be structured to embody a hypothesis-testing approach. Proposed reform: Articles should be structured to convey the alignment between the identified knowledge gaps and the approaches used. Recent exemplar: A new journal, Limnology and Oceanography Letters, requires an explicit statement by the authors of the knowledge gaps filled by the study.
  • Current norm: The components of iterative research (e.g., exploratory analysis, data, methods, code) are difficult to publish on their own. Proposed reform: Articles focused on any aspect of iterative research should be publishable based on their contribution to knowledge, data, or methods development. Recent exemplar: The recent advent of outlets for a broad range of research products, such as data journals (e.g., Earth System Science Data, Scientific Data, GigaScience, Biodiversity Data Journal), online code repositories (e.g., GitHub, BitBucket), and online data repositories (e.g., FigShare, Dryad, TreeBASE).

Education (K–12, undergraduate, and graduate)
  • Current norm: Students are taught mainly about hypothesis testing. Proposed reform: Students should be taught multiple scientific methods and how to choose approaches that best align with knowledge gaps. Recent exemplar: Reformed teaching approaches, such as authentic science labs (e.g., Luckie et al. 2004, Harwood 2004) and teaching with case studies (e.g., White et al. 2013).
  • Current norm: Students are taught linear, non-iterative scientific methods. Proposed reform: Students should be taught an iterative account of scientific methods. Recent exemplar: Dissemination of nonlinear accounts of scientific methods.

Second, rather than expecting articles to be structured to embody a linear hypothesis-testing approach, journal editors and reviewers should be open to publications organized around the full range of methods used to address knowledge gaps. Allowing journal articles and other research products to take a greater variety of forms will help alleviate the discrepancies that a number of authors have identified between the structure of scientific articles and the actual practice of research (e.g., Medawar 1996, Schickore 2008). Some journals and online repositories are providing guidelines and mechanisms for scientists to disseminate data and computer code, and the scientific community as a whole is discussing ways to give scientists credit for a variety of research products, which will help advance a broader view of scientific practices (e.g., Goring et al. 2014; see also table 1).

Third, whereas K–12 through graduate science education currently emphasizes a linear, hypothesis-driven approach to science, it should be reformed to incorporate more complex models of the scientific method. For example, students should be taught that hypothesis testing is just one important component of a much broader landscape of scientific activities that need to be combined in creative and interdisciplinary ways to move science forward (Harwood 2004). Including the history, philosophy, and sociology of science in science curricula; teaching science in interdisciplinary ways; and using reformed teaching methods in science courses (e.g., inquiry-based labs, case studies) can introduce students to the multiple methods scientists have historically used—and continue to use—to address significant knowledge gaps (table 1).

Conclusions

The recognition that data-intensive research methods—and indeed, research practices in all areas of science—need to be evaluated as part of broader research programs does much to alleviate common concerns about these and other non-hypothesis-driven methods. Although data-intensive and exploratory efforts to identify patterns in large datasets have the potential to generate spurious results, all methods have their potential problems when used poorly; when used properly, such data-intensive approaches can play a very fruitful role in broader research programs that also test hypothesized processes and mechanisms. The iterative research methods that we have described in this article allow researchers to address more complex questions than they could with hypothesis testing alone. To make these efforts successful, changes are needed in the norms for research funding, publication, and education. In all these areas, more emphasis should be placed on aligning research methods with the knowledge gaps that need to be addressed rather than focusing primarily on hypothesis testing. In addition, scientific practice should be more explicitly recognized as an iterative path through multiple approaches rather than as a linear process of moving through pre-defined steps. Of course, this does not mean that “anything goes”; rather, it facilitates more careful thought about how to fund, publish, and teach the right combinations of methods that will enable the scientific community to tackle the big issues confronting society today.

Acknowledgments

Funding for this work was provided by the Science + Society @ State program at Michigan State University to all authors; the US National Science Foundation's Macrosystems Biology Program (no. EF-1065786) to PAS and KSC; and the USDA National Institute of Food and Agriculture, Hatch Project no. 176820 to PAS.

References cited

  • Baird D. Thing Knowledge: A Philosophy of Scientific Instruments. University of California Press; 2004.
  • Baraniuk RG. More is less: Signal processing and the data deluge. Science. 2011; 331: 717–719.
  • Beard D, Kushmerick M. Strong inference for systems biology. PLOS Computational Biology. 2009; 5 (art. e1000459).
  • Bell G, Hey T, Szalay A. Beyond the data deluge. Science. 2009; 323: 1297–1298.
  • Bellon R. Darwin's evolutionary botany. In: Ruse M, editor. The Cambridge Encyclopedia of Darwin and Evolutionary Thought. Cambridge University Press; 2013. pp. 131–138.
  • Boyd D, Crawford K. Critical questions for big data: Provocations for a cultural, technological, and scholarly phenomenon. Information, Communication, and Society. 2012; 15: 662–679.
  • Burian R. On microRNA and the need for exploratory experimentation in post-genomic molecular biology. History and Philosophy of the Life Sciences. 2007; 29: 283–310.
  • Chang H. Inventing Temperature: Measurement and Scientific Progress. Oxford University Press; 2004.
  • Donovan S, O'Rourke M, Looney C. Your hypothesis or mine? Terminological and conceptual variation across disciplines. SAGE Open. 2015; 5: 1–13. doi:10.1177/2158244015586237.
  • Elliott K. Varieties of exploratory experimentation in nanotoxicology. History and Philosophy of the Life Sciences. 2007; 29: 311–334.
  • ———. Epistemic and methodological iteration in scientific research. Studies in History and Philosophy of Science. 2012; 43: 376–382.
  • Evans J, Rzhetsky A. Machine science. Science. 2010; 329: 399–400.
  • Fan J, Han F, Liu H. Challenges of big data analysis. National Science Review. 2014; 1: 293–314.
  • Forber P. Reconceiving eliminative inference. Philosophy of Science. 2011; 78: 185–208.
  • Franklin L. Exploratory experiments. Philosophy of Science. 2005; 72: 888–899.
  • Fudge D. Fifty years of J. R. Platt's strong inference. Journal of Experimental Biology. 2014; 217: 1202–1204.
  • Gannon F. A letter to Darwin. EMBO Reports. 2009; 10: 1.
  • Glass DJ. NIH grants: Focus on questions, not hypotheses. Nature. 2014; 507: 306.
  • Glass DJ, Hall N. A brief history of the hypothesis. Cell. 2008; 134: 378–381.
  • Goring S, et al. Improving the culture of interdisciplinary collaboration in ecology by expanding the measures of success. Frontiers in Ecology and the Environment. 2014; 12: 39–47.
  • Hand DJ. Data mining: Statistics and more? American Statistician. 1998; 52: 112–118.
  • Harwood W. A new model for inquiry: Is the scientific method dead? Journal of College Science Teaching. 2004; 33: 29–33.
  • Haufe C. Why do funding agencies favor hypothesis testing? Studies in History and Philosophy of Science. 2013; 44: 363–374.
  • Haufe C, Burian R, Elliott K, O'Malley M. Machine science: What's missing. Science. 2010; 330: 317–318.
  • Hey T, Tansley S, Tolle K, editors. The Fourth Paradigm: Data-Intensive Scientific Discovery. Microsoft Research; 2009.
  • Hilborn R, Mangel M. The Ecological Detective: Confronting Models with Data. Princeton University Press; 1997.
  • Hodge M. Darwin, Whewell, and natural selection. Biology and Philosophy. 1991; 6: 457–460.
  • Johnson K. Natural history as stamp collecting: A brief history. Archives of Natural History. 2007; 34: 244–258.
  • Katzav J, Parker S. The future of climate modeling. Climatic Change. 2015; 132: 475–487.
  • Keating P, Cambrosio A. Too many numbers: Microarrays in clinical cancer research. Studies in History and Philosophy of Biology and Biomedical Sciences. 2012; 43: 37–51.
  • Kell D, Oliver S. Here is the evidence, now what is the hypothesis? The complementary roles of inductive and hypothesis-driven science in the post-genomic era. BioEssays. 2004; 26: 99–105.
  • Khoury M, Ioannidis J. Big data meets public health. Science. 2014; 346: 1054–1055.
  • King G. Ensuring the data-rich future of the social sciences. Science. 2011; 331: 719–721.
  • Kitchin R. The Data Revolution: Big Data, Open Data, Data Infrastructures, and Their Consequences. Sage; 2014.
  • Laudan L. Science and Hypothesis: Historical Essays on Scientific Methodology. Reidel; 1981.
  • Lazer D, Kennedy R, King G, Vespignani A. The parable of Google Flu: Traps in big data analysis. Science. 2014; 343: 1203–1205.
  • Leonelli S. Introduction: Making sense of data-driven research in the biological and biomedical sciences. Studies in History and Philosophy of Biology and Biomedical Sciences. 2012; 43: 1–3.
  • ———. What difference does quantity make? On the epistemology of Big Data in biology. Big Data and Society. 2014; 1: 1–11. doi:10.1177/2053951714534395.
  • Leonelli S, Ankeny R. Re-thinking organisms: The impact of databases on model organism biology. Studies in History and Philosophy of Biology and Biomedical Sciences. 2012; 43: 29–36.
  • Levy O, et al. Approaches to advance scientific understanding of macrosystems ecology. Frontiers in Ecology and the Environment. 2014; 12: 15–23.
  • Luckie DB, Maleszewski JJ, Loznak SD, Krha M. Infusion of collaborative inquiry throughout a biology curriculum increases student learning: A four-year study of “Teams and Streams.” Advances in Physiology Education. 2004; 28: 199–209.
  • Mattmann CA. A vision for data science. Nature. 2013; 493: 473–475.
  • Medawar P. Is the scientific paper a fraud? In: The Strange Case of the Spotted Mice and Other Classic Essays on Science. Oxford University Press; 1996. pp. 33–39.
  • Montgomery GM. Gender and evolution. In: Ruse M, editor. The Cambridge Encyclopedia of Darwin and Evolutionary Thought. Cambridge University Press; 2012. pp. 443–450.
  • Müller-Wille S, Charmantier I. Natural history and information overload: The case of Linnaeus. Studies in History and Philosophy of Biology and Biomedical Sciences. 2012; 43: 4–15.
  • [NRC] National Research Council. Enhancing the Effectiveness of Team Science. National Academies Press; 2015.
  • O'Malley M, Soyer O. The roles of integration in molecular systems biology. Studies in History and Philosophy of Biology and Biomedical Sciences. 2012; 43: 58–68.
  • O'Malley M, Elliott KC, Haufe C, Burian R. Philosophies of funding. Cell. 2009; 138: 611–615.
  • O'Malley M, Elliott KC, Burian R. From genetic to genomic regulation: Iterative methods in miRNA research. Studies in History and Philosophy of Biology and Biomedical Sciences. 2010; 41: 407–417.
  • Peters R. A Critique for Ecology. Cambridge University Press; 1991.
  • Peters DPC, Havstad KM, Cushing J, Tweedie C, Fuentes O, Villanueva-Rosales N. Harnessing the power of big data: Infusing the scientific method with machine learning to transform ecology. Ecosphere. 2014; 5 (art. 67).
  • Pietsch W. The causal nature of modeling with big data. Philosophy and Technology. 2016; 29: 137–171.
  • Popper K. Conjectures and Refutations. Routledge and Kegan Paul; 1963.
  • Porter JH, Hanson PC, Lin C-C. Staying afloat in the sensor data deluge. Trends in Ecology and Evolution. 2012; 27: 121–129.
  • Prensky M. H. sapiens digital: From digital immigrants and digital natives to digital wisdom. Innovate. 2009; 5 (art. 1).
  • Ratti E. Big data biology: Between eliminative inferences and exploratory experiments. Philosophy of Science. 2015; 82: 198–218.
  • Ruse M. Examination of the influence of the philosophical ideas of John F. W. Herschel and William Whewell in the development of Charles Darwin's theory of evolution. Studies in History and Philosophy of Science. 1975; 6: 159–181.
  • Schickore J. Doing science, writing science. Philosophy of Science. 2008; 75: 323–343.
  • Servick K. Proposed study would closely track 10,000 New Yorkers. Science. 2015; 350: 493–494.
  • Snyder L. Renovating the Novum Organum: Bacon, Whewell, and induction. Studies in History and Philosophy of Science. 1999; 30: 531–557.
  • Soranno PA, Cheruvelil KS, Elliott KC, Montgomery GM. It's good to share: Why environmental scientists' ethics are out of date. BioScience. 2015a; 65: 69–73.
  • Soranno PA, et al. Building a multi-scaled geospatial temporal ecology database from disparate data sources: Fostering open science and data reuse. GigaScience. 2015b; 4 (art. 28). doi:10.1186/s13742-015-0067-4.
  • Steadman I. Big data and the death of the theorist. Wired. 2013. (22 August 2016; www.wired.co.uk/news/archive/2013-01/25/big-data-end-of-theory)
  • Steinle F. Entering new fields: Exploratory uses of experimentation. Philosophy of Science. 1997; 64: S65–S74.
  • Weinberg R. Point: Hypotheses first. Nature. 2010; 464: 678.
  • White PJT, Heidemann MK, Smith JJ. A new integrative approach to evolution education. BioScience. 2013; 63: 586–594.
  • Winsberg E. Science in the Age of Computer Simulations. University of Chicago Press; 2010.

