
Guide to Experimental Design | Overview, 5 steps & Examples

Published on December 3, 2019 by Rebecca Bevans. Revised on June 21, 2023.

Experiments are used to study causal relationships. You manipulate one or more independent variables and measure their effect on one or more dependent variables.

Experimental design creates a set of procedures to systematically test a hypothesis. A good experimental design requires a strong understanding of the system you are studying.

There are five key steps in designing an experiment:

  • Consider your variables and how they are related
  • Write a specific, testable hypothesis
  • Design experimental treatments to manipulate your independent variable
  • Assign subjects to groups, either between-subjects or within-subjects
  • Plan how you will measure your dependent variable

For valid conclusions, you also need to select a representative sample and control any extraneous variables that might influence your results. If random assignment of participants to control and treatment groups is impossible, unethical, or highly difficult, consider an observational study instead. Careful design of this kind minimizes several types of research bias, particularly sampling bias, survivorship bias, and attrition bias as time passes.

Table of contents

  • Step 1: Define your variables
  • Step 2: Write your hypothesis
  • Step 3: Design your experimental treatments
  • Step 4: Assign your subjects to treatment groups
  • Step 5: Measure your dependent variable
  • Other interesting articles
  • Frequently asked questions about experiments

Step 1: Define your variables

You should begin with a specific research question. We will work with two research question examples, one from health sciences and one from ecology.

To translate your research question into an experimental hypothesis, you need to define the main variables and make predictions about how they are related.

Start by simply listing the independent and dependent variables.

  • Phone use and sleep: the independent variable is minutes of phone use before sleep; the dependent variable is hours of sleep per night.
  • Temperature and soil respiration: the independent variable is air temperature just above the soil surface; the dependent variable is CO2 respired from soil.

Then you need to think about possible extraneous and confounding variables and consider how you might control them in your experiment.

  • Phone use and sleep: natural variation in sleep patterns among individuals is an extraneous variable. To control it, measure the average difference between sleep with phone use and sleep without phone use for each person, rather than the average amount of sleep per treatment group.
  • Temperature and soil respiration: soil moisture is an extraneous variable, since moisture also affects respiration and can decrease with increasing temperature. To control it, monitor soil moisture and add water to make sure that soil moisture is consistent across all treatment plots.
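The phone use control strategy above, comparing each person with themselves, is easy to make concrete. Here is a minimal sketch in Python, assuming pandas and hypothetical column names and values:

```python
import pandas as pd

# Hypothetical within-subjects data: one row per participant per condition.
df = pd.DataFrame({
    "participant": [1, 1, 2, 2, 3, 3],
    "condition":   ["phone", "no_phone"] * 3,
    "hours_sleep": [6.5, 7.8, 7.0, 7.2, 5.9, 7.5],
})

# Pivot so each participant has one row with both conditions side by side.
wide = df.pivot(index="participant", columns="condition", values="hours_sleep")

# The per-participant difference removes stable individual differences in
# sleep need, which a comparison of raw group means would not.
wide["difference"] = wide["no_phone"] - wide["phone"]
print(wide["difference"].mean())
```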

Finally, you can put these variables together into a diagram. Use arrows to show the possible relationships between variables and include signs to show the expected direction of the relationships.

Diagram of the relationship between variables in a sleep experiment

Here we predict that increasing temperature will increase soil respiration and decrease soil moisture, while decreasing soil moisture will lead to decreased soil respiration.


Step 2: Write your hypothesis

Now that you have a strong conceptual understanding of the system you are studying, you should be able to write a specific, testable hypothesis that addresses your research question.

  • Phone use and sleep: the null hypothesis (H0) is that phone use before sleep does not correlate with the amount of sleep a person gets; the alternative hypothesis (Ha) is that increasing phone use before sleep leads to a decrease in sleep.
  • Temperature and soil respiration: the null hypothesis (H0) is that air temperature does not correlate with soil respiration; the alternative hypothesis (Ha) is that increased air temperature leads to increased soil respiration.

The next steps will describe how to design a controlled experiment. In a controlled experiment, you must be able to:

  • Systematically and precisely manipulate the independent variable(s).
  • Precisely measure the dependent variable(s).
  • Control any potential confounding variables.

If your study system doesn’t match these criteria, there are other types of research you can use to answer your research question.

Step 3: Design your experimental treatments

How you manipulate the independent variable can affect the experiment’s external validity – that is, the extent to which the results can be generalized and applied to the broader world.

First, you may need to decide how widely to vary your independent variable. In the temperature experiment, for example, you could warm the soil:

  • just slightly above the natural range for your study region.
  • over a wider range of temperatures to mimic future warming.
  • over an extreme range that is beyond any possible natural variation.

Second, you may need to choose how finely to vary your independent variable. Sometimes this choice is made for you by your experimental system, but often you will need to decide, and this will affect how much you can infer from your results. In the phone use experiment, for example, you could treat phone use as either of the following (see the sketch after this list):

  • a categorical variable: either as binary (yes/no) or as levels of a factor (no phone use, low phone use, high phone use).
  • a continuous variable (minutes of phone use measured every night).
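The coding choice carries straight through to the analysis. A minimal sketch, assuming statsmodels and made-up data, of fitting the same predictor both ways:

```python
import pandas as pd
import statsmodels.formula.api as smf

# Hypothetical data: phone use recorded both as a factor and in minutes.
df = pd.DataFrame({
    "phone_level":   ["none", "low", "high", "none", "low", "high"] * 10,
    "phone_minutes": [0, 15, 60, 0, 20, 75] * 10,
    "hours_sleep":   [7.9, 7.4, 6.3, 8.1, 7.2, 6.0] * 10,
})

# Categorical coding: one coefficient per level, relative to a baseline level.
categorical_fit = smf.ols("hours_sleep ~ C(phone_level)", data=df).fit()

# Continuous coding: a single slope per minute of phone use.
continuous_fit = smf.ols("hours_sleep ~ phone_minutes", data=df).fit()

print(categorical_fit.params)
print(continuous_fit.params)
```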

Step 4: Assign your subjects to treatment groups

How you apply your experimental treatments to your test subjects is crucial for obtaining valid and reliable results.

First, you need to consider the study size: how many individuals will be included in the experiment? In general, the more subjects you include, the greater your experiment’s statistical power, which determines how much confidence you can have in your results.
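To make the link between study size and power concrete, here is a minimal power-analysis sketch using statsmodels; the medium effect size (Cohen's d = 0.5) is an assumption chosen purely for illustration:

```python
from statsmodels.stats.power import TTestIndPower

# Solve for the sample size per group needed to detect a medium
# standardized effect (Cohen's d = 0.5) with 80% power at alpha = 0.05.
analysis = TTestIndPower()
n_per_group = analysis.solve_power(effect_size=0.5, power=0.8, alpha=0.05)
print(round(n_per_group))  # roughly 64 subjects per group
```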

Then you need to randomly assign your subjects to treatment groups. Each group receives a different level of the treatment (e.g. no phone use, low phone use, high phone use).
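A minimal sketch of this random assignment step in Python; the subject names and treatment levels are placeholders following the phone use example:

```python
import random

subjects = [f"subject_{i}" for i in range(1, 31)]
levels = ["no_phone", "low_phone", "high_phone"]

# Shuffle the subjects, then deal them out evenly across treatment levels.
random.shuffle(subjects)
groups = {level: subjects[i::len(levels)] for i, level in enumerate(levels)}

for level, members in groups.items():
    print(level, members)
```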

You should also include a control group, which receives no treatment. The control group tells us what would have happened to your test subjects without any experimental intervention.

When assigning your subjects to groups, there are two main choices you need to make:

  • A completely randomized design vs a randomized block design.
  • A between-subjects design vs a within-subjects design.

Randomization

An experiment can be completely randomized or randomized within blocks (aka strata):

  • In a completely randomized design, every subject is assigned to a treatment group at random.
  • In a randomized block design (aka stratified random design), subjects are first grouped according to a characteristic they share, and then randomly assigned to treatments within those groups.
For our two examples:

  • Phone use and sleep: in a completely randomized design, subjects are all randomly assigned a level of phone use using a random number generator; in a randomized block design, subjects are first grouped by age, and then phone use treatments are randomly assigned within these groups.
  • Temperature and soil respiration: in a completely randomized design, warming treatments are assigned to soil plots at random by using a number generator to generate map coordinates within the study area; in a randomized block design, soils are first grouped by average rainfall, and then treatment plots are randomly assigned within these groups.
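A sketch of the randomized block idea for the phone use example, assuming age group is the blocking characteristic and using only the standard library:

```python
import random
from collections import defaultdict

# Hypothetical subjects tagged with the blocking characteristic (age group).
subjects = [("subject_1", "18-25"), ("subject_2", "26-40"), ("subject_3", "18-25"),
            ("subject_4", "41-60"), ("subject_5", "26-40"), ("subject_6", "41-60"),
            ("subject_7", "18-25"), ("subject_8", "26-40"), ("subject_9", "41-60")]
levels = ["no_phone", "low_phone", "high_phone"]

# First group subjects into blocks by the shared characteristic...
blocks = defaultdict(list)
for name, age_group in subjects:
    blocks[age_group].append(name)

# ...then randomize treatment assignment separately within each block,
# so every block contributes to every treatment level.
assignment = {}
for age_group, members in blocks.items():
    random.shuffle(members)
    for i, name in enumerate(members):
        assignment[name] = levels[i % len(levels)]

print(assignment)
```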

Sometimes randomization isn’t practical or ethical, so researchers create partially random or even non-random designs. An experimental design where treatments aren’t randomly assigned is called a quasi-experimental design.

Between-subjects vs. within-subjects

In a between-subjects design (also known as an independent measures design or classic ANOVA design), individuals receive only one of the possible levels of an experimental treatment.

In medical or social research, you might also use matched pairs within your between-subjects design to make sure that each treatment group contains the same variety of test subjects in the same proportions.

In a within-subjects design (also known as a repeated measures design), every individual receives each of the experimental treatments consecutively, and their responses to each treatment are measured.

Within-subjects or repeated measures can also refer to an experimental design where an effect emerges over time, and individual responses are measured over time in order to measure this effect as it emerges.

Counterbalancing (randomizing or reversing the order of treatments among subjects) is often used in within-subjects designs to ensure that the order of treatment application doesn’t influence the results of the experiment.
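Counterbalancing can be automated. A minimal sketch that cycles subjects through every possible treatment order, with treatment names following the phone use example:

```python
from itertools import cycle, permutations

treatments = ["no_phone", "low_phone", "high_phone"]
subjects = [f"subject_{i}" for i in range(1, 13)]

# Cycle through every possible treatment order so each order is used
# equally often, preventing order effects from favouring one treatment.
orders = cycle(permutations(treatments))
schedule = {subject: next(orders) for subject in subjects}

for subject, order in schedule.items():
    print(subject, "->", " then ".join(order))
```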

  • Phone use and sleep: in a between-subjects design, subjects are randomly assigned a level of phone use (none, low, or high) and follow that level of phone use throughout the experiment; in a within-subjects design, subjects are assigned consecutively to zero, low, and high levels of phone use, and the order in which they follow these treatments is randomized.
  • Temperature and soil respiration: in a between-subjects design, warming treatments are assigned to soil plots at random and the soils are kept at this temperature throughout the experiment; in a within-subjects design, every plot receives each warming treatment (1, 3, 5, 8, and 10°C above ambient temperature) consecutively over the course of the experiment, and the order in which they receive these treatments is randomized.

Step 5: Measure your dependent variable

Finally, you need to decide how you’ll collect data on your dependent variable outcomes. You should aim for reliable and valid measurements that minimize research bias or error.

Some variables, like temperature, can be objectively measured with scientific instruments. Others may need to be operationalized to turn them into measurable observations.

For example, to measure hours of sleep you could:

  • Ask participants to record what time they go to sleep and get up each day.
  • Ask participants to wear a sleep tracker.

How precisely you measure your dependent variable also affects the kinds of statistical analysis you can use on your data.

Experiments are always context-dependent, and a good experimental design will take into account all of the unique considerations of your study system to produce information that is both valid and relevant to your research question.

Other interesting articles

If you want to know more about statistics, methodology, or research bias, make sure to check out some of our other articles with explanations and examples.

  • Student’s t-distribution
  • Normal distribution
  • Null and Alternative Hypotheses
  • Chi square tests
  • Confidence interval
  • Cluster sampling
  • Stratified sampling
  • Data cleansing
  • Reproducibility vs Replicability
  • Peer review
  • Likert scale

Research bias

  • Implicit bias
  • Framing effect
  • Cognitive bias
  • Placebo effect
  • Hawthorne effect
  • Hindsight bias
  • Affect heuristic

Frequently asked questions about experiments

Experimental design means planning a set of procedures to investigate a relationship between variables. To design a controlled experiment, you need:

  • A testable hypothesis
  • At least one independent variable that can be precisely manipulated
  • At least one dependent variable that can be precisely measured

When designing the experiment, you decide:

  • How you will manipulate the variable(s)
  • How you will control for any potential confounding variables
  • How many subjects or samples will be included in the study
  • How subjects will be assigned to treatment levels

Experimental design is essential to the internal and external validity of your experiment.

The key difference between observational studies and experimental designs is that a well-done observational study does not influence the responses of participants, while experiments do have some sort of treatment condition applied to at least some participants by random assignment.

A confounding variable, also called a confounder or confounding factor, is a third variable in a study examining a potential cause-and-effect relationship.

A confounding variable is related to both the supposed cause and the supposed effect of the study. It can be difficult to separate the true effect of the independent variable from the effect of the confounding variable.

In your research design, it’s important to identify potential confounding variables and plan how you will reduce their impact.

In a between-subjects design, every participant experiences only one condition, and researchers assess group differences between participants in various conditions.

In a within-subjects design, each participant experiences all conditions, and researchers test the same participants repeatedly for differences between conditions.

The word “between” means that you’re comparing different conditions between groups, while the word “within” means you’re comparing different conditions within the same group.

An experimental group, also known as a treatment group, receives the treatment whose effect researchers wish to study, whereas a control group does not. They should be identical in all other ways.



Study/experimental/research design: much more than statistics

Affiliation: Brigham Young University, Provo, UT 84602, USA. [email protected]

  • PMID: 20064054
  • PMCID: PMC2808761
  • DOI: 10.4085/1062-6050-45.1.98

Context: The purpose of study, experimental, or research design in scientific manuscripts has changed significantly over the years. It has evolved from an explanation of the design of the experiment (ie, data gathering or acquisition) to an explanation of the statistical analysis. This practice makes "Methods" sections hard to read and understand.

Objective: To clarify the difference between study design and statistical analysis, to show the advantages of a properly written study design on article comprehension, and to encourage authors to correctly describe study designs.

Description: The role of study design is explored from the introduction of the concept by Fisher through modern-day scientists and the AMA Manual of Style. At one time, when experiments were simpler, the study design and statistical design were identical or very similar. With the complex research that is common today, which often includes manipulating variables to create new variables and the multiple (and different) analyses of a single data set, data collection is very different than statistical design. Thus, both a study design and a statistical design are necessary.

Advantages: Scientific manuscripts will be much easier to read and comprehend. A proper experimental design serves as a road map to the study methods, helping readers to understand more clearly how the data were obtained and, therefore, assisting them in properly analyzing the results.

References

  • Knight KL, Ingersoll CD. Structure of a scholarly manuscript: 66 tips for what goes where. J Athl Train 1996;31(3):201-206.
  • Iverson C, Christiansen S, Flanagin A, et al. AMA Manual of Style: A Guide for Authors and Editors. 10th ed. New York, NY: Oxford University Press; 2007.
  • Altman DG. Practical Statistics for Medical Research. New York, NY: Chapman & Hall; 1991:4-5.
  • Thomas JR, Nelson JK, Silverman SJ. Research Methods in Physical Activity. 5th ed. Champaign, IL: Human Kinetics; 2005.
  • Leedy PD. Practical Research, Planning and Design. 3rd ed. New York, NY: Macmillan Publishing; 1985:96-99.

Beauty sleep: experimental study on the perceived health and attractiveness of sleep deprived people

  • John Axelsson, researcher 1 2
  • Tina Sundelin, research assistant and MSc student 2
  • Michael Ingre, statistician and PhD student 3
  • Eus J W Van Someren, researcher 4
  • Andreas Olsson, researcher 2
  • Mats Lekander, researcher 1 3
  • 1 Osher Center for Integrative Medicine, Department of Clinical Neuroscience, Karolinska Institutet, 17177 Stockholm, Sweden
  • 2 Division for Psychology, Department of Clinical Neuroscience, Karolinska Institutet
  • 3 Stress Research Institute, Stockholm University, Stockholm
  • 4 Netherlands Institute for Neuroscience, an Institute of the Royal Netherlands Academy of Arts and Sciences, and VU Medical Center, Amsterdam, Netherlands
  • Correspondence to: J Axelsson john.axelsson@ki.se
  • Accepted 22 October 2010

Objective To investigate whether sleep deprived people are perceived as less healthy, less attractive, and more tired than after a normal night’s sleep.

Design Experimental study.

Setting Sleep laboratory in Stockholm, Sweden.

Participants 23 healthy, sleep deprived adults (age 18-31) who were photographed and 65 untrained observers (age 18-61) who rated the photographs.

Intervention Participants were photographed after a normal night’s sleep (eight hours) and after sleep deprivation (31 hours of wakefulness after a night of reduced sleep). The photographs were presented in a randomised order and rated by untrained observers.

Main outcome measure Difference in observer ratings of perceived health, attractiveness, and tiredness between sleep deprived and well rested participants using a visual analogue scale (100 mm).

Results Sleep deprived people were rated as less healthy (visual analogue scale scores, mean 63 (SE 2) v 68 (SE 2), P<0.001), more tired (53 (SE 3) v 44 (SE 3), P<0.001), and less attractive (38 (SE 2) v 40 (SE 2), P<0.001) than after a normal night’s sleep. The decrease in rated health was associated with ratings of increased tiredness and decreased attractiveness.

Conclusion Our findings show that sleep deprived people appear less healthy, less attractive, and more tired compared with when they are well rested. This suggests that humans are sensitive to sleep related facial cues, with potential implications for social and clinical judgments and behaviour. Studies are warranted for understanding how these effects may affect clinical decision making and can add knowledge with direct implications in a medical context.

Introduction

The recognition [of the case] depends in great measure on the accurate and rapid appreciation of small points in which the diseased differs from the healthy state Joseph Bell (1837-1911)

Good clinical judgment is an important skill in medical practice. This is well illustrated in the quote by Joseph Bell, 1 who demonstrated impressive observational and deductive skills. Bell was one of Sir Arthur Conan Doyle’s teachers and served as a model for the fictitious detective Sherlock Holmes. 2 Generally, human judgment involves complex processes, whereby ingrained, often less consciously deliberated responses from perceptual cues are mixed with semantic calculations to affect decision making. 3 Thus all social interactions, including diagnosis in clinical practice, are influenced by reflexive as well as reflective processes in human cognition and communication.

Sleep is an essential homeostatic process with well established effects on an individual’s physiological, cognitive, and behavioural functionality 4 5 6 7 and long term health, 8 but with only anecdotal support of a role in social perception, such as that underlying judgments of attractiveness and health. As illustrated by the common expression “beauty sleep,” an individual’s sleep history may play an integral part in the perception and judgments of his or her attractiveness and health. To date, the concept of beauty sleep has lacked scientific support, but the biological importance of sleep may have favoured a sensitivity to perceive sleep related cues in others. It seems warranted to explore such sensitivity, as sleep disorders and disturbed sleep are increasingly common in today’s 24 hour society and often coexist with some of the most common health problems, such as hypertension 9 10 and inflammatory conditions. 11

To describe the relation between sleep deprivation and perceived health and attractiveness we asked untrained observers to rate the faces of people who had been photographed after a normal night’s sleep and after a night of sleep deprivation. We chose facial photographs as the human face is the primary source of information in social communication. 12 A perceiver’s response to facial cues, signalling the bearer’s emotional state, intentions, and potential mate value, serves to guide actions in social contexts and may ultimately promote survival. 13 14 15 We hypothesised that untrained observers would perceive sleep deprived people as more tired, less healthy, and less attractive compared with after a normal night’s sleep.

Methods

Using an experimental design we photographed the faces of 23 adults (mean age 23, range 18-31 years, 11 women) between 14.00 and 15.00 under two conditions in a balanced design: after a normal night’s sleep (at least eight hours of sleep between 23.00-07.00 and seven hours of wakefulness) and after sleep deprivation (sleep 02.00-07.00 and 31 hours of wakefulness). We advertised for participants at four universities in the Stockholm area. Twenty of 44 potentially eligible people were excluded. Reasons for exclusion were reported sleep disturbances, abnormal sleep requirements (for example, sleep need out of the 7-9 hour range), health problems, or availability on study days (the main reason). We also excluded smokers and those who had consumed alcohol within two days of the protocol. One woman failed to participate in both conditions. Overall, we enrolled 12 women and 12 men.

The participants slept in their own homes. Sleep times were confirmed with sleep diaries and text messages. The sleep diaries (Karolinska sleep diary) included information on sleep latency, quality, duration, and sleepiness. Participants sent a text message to the research assistant by mobile phone (SMS) at bedtime and when they got up on the night before sleep deprivation. They had been instructed not to nap. During the normal sleep condition the participants’ mean duration of sleep, estimated from sleep diaries, was 8.45 (SE 0.20) hours. The sleep deprivation condition started with a restriction of sleep to five hours in bed; the participants sent text messages (SMS) when they went to sleep and when they woke up. The mean duration of sleep during this night, estimated from sleep diaries and text messages, was 5.06 (SE 0.04) hours. For the following night of total sleep deprivation, the participants were monitored in the sleep laboratory at all times. Thus, for the sleep deprivation condition, participants came to the laboratory at 22.00 (after 15 hours of wakefulness) to be monitored, and stayed awake for a further 16 hours. We therefore did not observe the participants during the first 15 hours of wakefulness, when they had had a slightly restricted sleep, but had good control over the last 16 hours of wakefulness when sleepiness increased in magnitude. For the sleep condition, participants came to the laboratory at 12.00 (after five hours of wakefulness). They were kept indoors two hours before being photographed to avoid the effects of exposure to sunlight and the weather. We had a series of five or six photographs (resolution 3872×2592 pixels) taken in a well lit room, with a constant white balance (×900l; colour temperature 4200 K, Nikon D80; Nikon, Tokyo). The white balance was differently set during the two days of the study and affected seven photographs (four taken during sleep deprivation and three during a normal night’s sleep). Removing these participants from the analyses did not affect the results. The distance from camera to head was fixed, as was the focal length, within 14 mm (between 44 and 58 mm). To ensure a fixed surface area of each face on the photograph, the focal length was adapted to the head size of each participant.

For the photo shoot, participants wore no makeup, had their hair loose (combed backwards if long), underwent similar cleaning or shaving procedures for both conditions, and were instructed to “sit with a straight back and look straight into the camera with a neutral, relaxed facial expression.” Although the photographer was not blinded to the sleep conditions, she followed a highly standardised procedure during each photo shoot, including minimal interaction with the participants. A blinded rater chose the most typical photograph from each series of photographs. This process resulted in 46 photographs; two (one from each sleep condition) of each of the 23 participants. This part of the study took place between June and September 2007.

In October 2007 the photographs were presented at a fixed interval of six seconds in a randomised order to 65 observers (mainly students at the Karolinska Institute, mean age 30 (range 18-61) years, 40 women), who were unaware of the conditions of the study. They rated the faces for attractiveness (very unattractive to very attractive), health (very sick to very healthy), and tiredness (not at all tired to very tired) on a 100 mm visual analogue scale. After every 23 photographs a brief intermission was allowed, including a working memory task lasting 23 seconds to prevent the faces being memorised. To ensure that the observers were not primed to tiredness when rating health and attractiveness they rated the photographs for attractiveness and health in the first two sessions and tiredness in the last. To avoid the influence of possible order effects we presented the photographs in a balanced order between conditions for each session.

Statistical analyses

Data were analysed using multilevel mixed effects linear regression, with two crossed independent random effects accounting for random variation between observers and participants using the xtmixed procedure in Stata 9.2. We present the effect of condition as a percentage of change from the baseline condition as the reference using the absolute value in millimetres (rated on the visual analogue scale). No data were missing in the analyses.
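The models were fitted in Stata 9.2 with crossed random effects. As a rough Python analogue (an approximation, not the authors' code), statsmodels' MixedLM can emulate crossed observer and participant effects through variance components; the data below are synthetic stand-ins shaped like the paper's 65 × 46 rating design:

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)

# Synthetic stand-in for the rating data: 65 observers x 23 participants
# x 2 conditions = 2990 rows; all values here are made up.
rows = [(o, p, c) for o in range(65) for p in range(23) for c in (0, 1)]
df = pd.DataFrame(rows, columns=["observer", "participant", "deprived"])
df["rating"] = 65 - 5 * df["deprived"] + rng.normal(0, 8, len(df))

# Crossed random effects for observers and participants, approximated in
# statsmodels by variance components inside one all-encompassing group.
df["all"] = 1
model = smf.mixedlm(
    "rating ~ deprived", data=df, groups="all",
    vc_formula={"observer": "0 + C(observer)",
                "participant": "0 + C(participant)"},
)
print(model.fit().summary())
```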

Results

Sixty five observers rated each of the 46 photographs for attractiveness, health, and tiredness: 138 ratings by each observer and 2990 ratings for each of the three factors rated. When sleep deprived, people were rated as less healthy (visual analogue scale scores, mean 63 (SE 2) v 68 (SE 2)), more tired (53 (SE 3) v 44 (SE 3)), and less attractive (38 (SE 2) v 40 (SE 2); P<0.001 for all) than after a normal night’s sleep (table 1). Compared with the normal sleep condition, perceptions of health and attractiveness in the sleep deprived condition decreased on average by 6% and 4% and tiredness increased by 19%.

Table 1  Multilevel mixed effects regression on effect of how sleep deprived people are perceived with respect to attractiveness, health, and tiredness

A 10 mm increase in tiredness was associated with a −3.0 mm change in health, a 10 mm increase in health increased attractiveness by 2.4 mm, and a 10 mm increase in tiredness reduced attractiveness by 1.2 mm (table 2). These findings were also presented as correlations, suggesting that perceived attractiveness is positively associated with perceived health (r=0.42, fig 1) and negatively associated with perceived tiredness (r=−0.28, fig 1). In addition, the average decrease (for each face) in attractiveness as a result of deprived sleep was associated with changes in tiredness (r=−0.53, n=23, P=0.03) and in health (r=0.50, n=23, P=0.01). Moreover, a strong negative association was found between the respective perceptions of tiredness and health (r=−0.54, fig 1). Figure 2 shows an example of observer rated faces.

Table 2  Associations between health, tiredness, and attractiveness

Fig 1  Relations between health, tiredness, and attractiveness of 46 photographs (two each of 23 participants) rated by 65 observers on 100 mm visual analogue scales, with variation between observers removed using empirical Bayes’ estimates


Fig 2  Participant after a normal night’s sleep (left) and after sleep deprivation (right). Faces were presented in a counterbalanced order

To evaluate the mediation effects of sleep loss on attractiveness and health, tiredness was added to the models presented in table 1 following recommendations. 16 The effect of sleep loss was significantly mediated by tiredness on both health (P<0.001) and attractiveness (P<0.001). When tiredness was added to the model (table 1) with an estimated coefficient of −2.9 (SE 0.1; P<0.001) the independent effect of sleep loss on health decreased from −4.2 to −1.8 (SE 0.5; P<0.001). The effect of sleep loss on attractiveness decreased from −1.6 (table 1) to −0.62 (SE 0.4; P=0.133), with tiredness estimated at −1.1 (SE 0.1; P<0.001). The same approach applied to the model of attractiveness and health (table 2), with a decrease in the association from 2.4 to 2.1 (SE 0.1; P<0.001) with tiredness estimated at −0.56 (SE 0.1; P<0.001).
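The mediation logic used here, comparing the treatment coefficient before and after adding the mediator, can be sketched with ordinary regressions. This is a simplification of the paper's multilevel models, and the data and coefficients below are synthetic, chosen only to illustrate the method:

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(1)

# Made-up per-photo ratings; deprived is coded 0 = rested, 1 = sleep deprived.
n = 200
deprived = rng.integers(0, 2, n)
tiredness = 44 + 9 * deprived + rng.normal(0, 6, n)  # condition raises tiredness
health = 77 - 0.3 * tiredness + rng.normal(0, 3, n)  # tiredness lowers health
df = pd.DataFrame({"deprived": deprived, "tiredness": tiredness, "health": health})

# Total effect of sleep condition on rated health.
total = smf.ols("health ~ deprived", data=df).fit()

# Direct effect after adding the proposed mediator: if the deprived
# coefficient shrinks toward zero, tiredness mediates the effect.
direct = smf.ols("health ~ deprived + tiredness", data=df).fit()

print("total effect: ", total.params["deprived"])
print("direct effect:", direct.params["deprived"])
```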

Discussion

Sleep deprived people are perceived as less attractive, less healthy, and more tired compared with when they are well rested. Apparent tiredness was strongly related to looking less healthy and less attractive, which was also supported by the mediation analyses, indicating that a large part of the observed effects on appearing healthy and attractive were mediated by looking tired. The fact that untrained observers detected the effects of sleep loss in others not only provides evidence for a perceptual ability not previously subjected to experimental control, but also supports the notion that sleep history gives rise to socially relevant signals that provide information about the bearer. The adaptiveness of an ability to detect sleep related facial cues resonates well with other research, showing that small deviations from the average sleep duration in the long term are associated with an increased risk of health problems and with a decreased longevity. 8 17 Indeed, even a few hours of sleep deprivation inflict an array of physiological changes, including neural, endocrinological, immunological, and cellular functioning, that if sustained are relevant for long term health. 7 18 19 20 Here, we show that such physiological changes are paralleled by detectable facial changes.

These results are related to photographs taken in an artificial setting and presented to the observers for only six seconds. It is likely that the effects reported here would be larger in real life person to person situations, when overt behaviour and interactions add further information. Blink interval and blink duration are known to be indicators of sleepiness, 21 and trained observers are able to evaluate reliably the drowsiness of drivers by watching their videotaped faces. 22 In addition, a few of the people were perceived as healthier, less tired, and more attractive during the sleep deprived condition. It remains to be evaluated in follow-up research whether this is due to random error (noise) in judgments, or associated with specific characteristics of observers or the sleep deprived people they judge. Nevertheless, we believe that the present findings can be generalised to a wide variety of settings, but further studies will have to investigate the impact on clinical studies and other social situations.

Importantly, our findings suggest a prominent role of sleep history in several domains of interpersonal perception and judgment, in which sleep history has previously not been considered of importance, such as in clinical judgment. In addition, because attractiveness motivates sexual behaviour, collaboration, and superior treatment, 13 sleep loss may have consequences in other social contexts. For example, it has been proposed that facial cues perceived as attractive are signals of good health and that this recognition has been selected evolutionarily to guide choice of mate and successful transmission of genes. 13 The fact that good sleep supports a healthy look and poor sleep the reverse may be of particular relevance in the medical setting, where health estimates are an essential part. It is possible that people with sleep disturbances, clinical or otherwise, would be judged as more unhealthy, whereas those who have had an unusually good night’s sleep may be perceived as rather healthy. Compared with the sleep deprivation used in the present investigation, further studies are needed to investigate the effects of less drastic acute reductions of sleep as well as long term clinical effects.

Conclusions

People are capable of detecting sleep loss related facial cues, and these cues modify judgments of another’s health and attractiveness. These conclusions agree well with existing models describing a link between sleep and good health, 18 23 as well as a link between attractiveness and health. 13 Future studies should focus on the relevance of these facial cues in clinical settings. These could investigate whether clinicians are better than the average population at detecting sleep or health related facial cues, and whether patients with a clinical diagnosis exhibit more tiredness and are less healthy looking than healthy people. Perhaps the more successful doctors are those who pick up on these details and act accordingly.

Taken together, our results provide important insights into judgments about health and attractiveness that are reminiscent of the anecdotal wisdom harboured in Bell’s words, and in the colloquial notion of “beauty sleep.”

What is already known on this topic

Short or disturbed sleep and fatigue constitute major risk factors for health and safety

Complaints of short or disturbed sleep are common among patients seeking healthcare

The human face is the main source of information for social signalling

What this study adds

The facial cues of sleep deprived people are sufficient for others to judge them as more tired, less healthy, and less attractive, lending the first scientific support to the concept of “beauty sleep”

By affecting doctors’ general perception of health, the sleep history of a patient may affect clinical decisions and diagnostic precision

Cite this as: BMJ 2010;341:c6614

We thank B Karshikoff for support with data acquisition and M Ingvar for comments on an earlier draft of the manuscript, both without compensation and working at the Department for Clinical Neuroscience, Karolinska Institutet, Sweden.

Contributors: JA designed the data collection, supervised and monitored data collection, wrote the statistical analysis plan, carried out the statistical analyses, obtained funding, drafted and revised the manuscript, and is guarantor. TS designed and carried out the data collection, cleaned the data, drafted, revised the manuscript, and had final approval of the manuscript. JA and TS contributed equally to the work. MI wrote the statistical analysis plan, carried out the statistical analyses, drafted the manuscript, and critically revised the manuscript. EJWVS provided statistical advice, advised on data handling, and critically revised the manuscript. AO provided advice on the methods and critically revised the manuscript. ML provided administrative support, drafted the manuscript, and critically revised the manuscript. All authors approved the final version of the manuscript.

Funding: This study was funded by the Swedish Society for Medical Research, Rut and Arvid Wolff’s Memory Fund, and the Osher Center for Integrative Medicine.

Competing interests: All authors have completed the Unified Competing Interest form at www.icmje.org/coi_disclosure.pdf (available on request from the corresponding author) and declare: no support from any company for the submitted work; no financial relationships with any companies that might have an interest in the submitted work in the previous 3 years; no other relationships or activities that could appear to have influenced the submitted work.

Ethical approval: This study was approved by the Karolinska Institutet’s ethical committee. Participants were compensated for their participation.

Participant consent: Participants’ consent obtained.

Data sharing: Statistical code and dataset of ratings are available from the corresponding author at john.axelsson@ki.se.

This is an open-access article distributed under the terms of the Creative Commons Attribution Non-commercial License, which permits use, distribution, and reproduction in any medium, provided the original work is properly cited, the use is non commercial and is otherwise in compliance with the license. See: http://creativecommons.org/licenses/by-nc/2.0/ and http://creativecommons.org/licenses/by-nc/2.0/legalcode .

References

1. Deten A, Volz HC, Clamors S, Leiblein S, Briest W, Marx G, et al. Hematopoietic stem cells do not repair the infarcted mouse heart. Cardiovasc Res 2005;65:52-63.
2. Doyle AC. The case-book of Sherlock Holmes: selected stories. Wordsworth, 1993.
3. Lieberman MD, Gaunt R, Gilbert DT, Trope Y. Reflection and reflexion: a social cognitive neuroscience approach to attributional inference. Adv Exp Soc Psychol 2002;34:199-249.
4. Drummond SPA, Brown GG, Gillin JC, Stricker JL, Wong EC, Buxton RB. Altered brain response to verbal learning following sleep deprivation. Nature 2000;403:655-7.
5. Harrison Y, Horne JA. The impact of sleep deprivation on decision making: a review. J Exp Psychol Appl 2000;6:236-49.
6. Huber R, Ghilardi MF, Massimini M, Tononi G. Local sleep and learning. Nature 2004;430:78-81.
7. Spiegel K, Leproult R, Van Cauter E. Impact of sleep debt on metabolic and endocrine function. Lancet 1999;354:1435-9.
8. Kripke DF, Garfinkel L, Wingard DL, Klauber MR, Marler MR. Mortality associated with sleep duration and insomnia. Arch Gen Psychiatry 2002;59:131-6.
9. Olson LG, Ambrogetti A. Waking up to sleep disorders. Br J Hosp Med (Lond) 2006;67:118-20.
10. Rajaratnam SM, Arendt J. Health in a 24-h society. Lancet 2001;358:999-1005.
11. Ranjbaran Z, Keefer L, Stepanski E, Farhadi A, Keshavarzian A. The relevance of sleep abnormalities to chronic inflammatory conditions. Inflamm Res 2007;56:51-7.
12. Haxby JV, Hoffman EA, Gobbini MI. The distributed human neural system for face perception. Trends Cogn Sci 2000;4:223-33.
13. Rhodes G. The evolutionary psychology of facial beauty. Annu Rev Psychol 2006;57:199-226.
14. Todorov A, Mandisodza AN, Goren A, Hall CC. Inferences of competence from faces predict election outcomes. Science 2005;308:1623-6.
15. Willis J, Todorov A. First impressions: making up your mind after a 100-ms exposure to a face. Psychol Sci 2006;17:592-8.
16. Krull JL, MacKinnon DP. Multilevel modeling of individual and group level mediated effects. Multivariate Behav Res 2001;36:249-77.
17. Ayas NT, White DP, Manson JE, Stampfer MJ, Speizer FE, Malhotra A, et al. A prospective study of sleep duration and coronary heart disease in women. Arch Intern Med 2003;163:205-9.
18. Bryant PA, Trinder J, Curtis N. Sick and tired: does sleep have a vital role in the immune system? Nat Rev Immunol 2004;4:457-67.
19. Cirelli C. Cellular consequences of sleep deprivation in the brain. Sleep Med Rev 2006;10:307-21.
20. Irwin MR, Wang M, Campomayor CO, Collado-Hidalgo A, Cole S. Sleep deprivation and activation of morning levels of cellular and genomic markers of inflammation. Arch Intern Med 2006;166:1756-62.
21. Schleicher R, Galley N, Briest S, Galley L. Blinks and saccades as indicators of fatigue in sleepiness warnings: looking tired? Ergonomics 2008;51:982-1010.
22. Wierwille WW, Ellsworth LA. Evaluation of driver drowsiness by trained raters. Accid Anal Prev 1994;26:571-81.
23. Horne J. Why we sleep—the functions of sleep in humans and other mammals. Oxford University Press, 1988.


Experimental Design: Definition and Types

By Jim Frost

What is Experimental Design?

An experimental design is a detailed plan for collecting and using data to identify causal relationships. Through careful planning, the design of experiments allows your data collection efforts to have a reasonable chance of detecting effects and testing hypotheses that answer your research questions.

An experiment is a data collection procedure that occurs in controlled conditions to identify and understand causal relationships between variables. Researchers can use many potential designs. The ultimate choice depends on their research question, resources, goals, and constraints. In some fields of study, researchers refer to experimental design as the design of experiments (DOE). Both terms are synonymous.


Ultimately, the design of experiments helps ensure that your procedures and data will evaluate your research question effectively. Without an experimental design, you might waste your efforts in a process that, for many potential reasons, can’t answer your research question. In short, it helps you trust your results.

Learn more about Independent and Dependent Variables .

Design of Experiments: Goals & Settings

Experiments occur in many settings, ranging from psychology, the social sciences, and medicine to physics, engineering, and the industrial and service sectors. Typically, experimental goals are to discover a previously unknown effect, confirm a known effect, or test a hypothesis.

Effects represent causal relationships between variables. For example, in a medical experiment, does the new medicine cause an improvement in health outcomes? If so, the medicine has a causal effect on the outcome.

An experimental design’s focus depends on the subject area and can include the following goals:

  • Understanding the relationships between variables.
  • Identifying the variables that have the largest impact on the outcomes.
  • Finding the input variable settings that produce an optimal result.

For example, psychologists have conducted experiments to understand how conformity affects decision-making. Sociologists have performed experiments to determine whether ethnicity affects the public reaction to staged bike thefts. These experiments map out the causal relationships between variables, and their primary goal is to understand the role of various factors.

Conversely, in a manufacturing environment, the researchers might use an experimental design to find the factors that most effectively improve their product’s strength, identify the optimal manufacturing settings, and do all that while accounting for various constraints. In short, a manufacturer’s goal is often to use experiments to improve their products cost-effectively.

In a medical experiment, the goal might be to quantify the medicine’s effect and find the optimum dosage.

Developing an Experimental Design

Developing an experimental design involves planning that maximizes the potential to collect data that is both trustworthy and able to detect causal relationships. Specifically, these studies aim to see effects when they exist in the population the researchers are studying, preferentially favor causal effects, isolate each factor’s true effect from potential confounders, and produce conclusions that you can generalize to the real world.

To accomplish these goals, experimental designs carefully manage data validity and reliability , and internal and external experimental validity. When your experiment is valid and reliable, you can expect your procedures and data to produce trustworthy results.

An excellent experimental design involves the following:

  • Lots of preplanning.
  • Developing experimental treatments.
  • Determining how to assign subjects to treatment groups.

The remainder of this article focuses on how experimental designs incorporate these essential items to accomplish their research goals.

Learn more about Data Reliability vs. Validity and Internal and External Experimental Validity .

Preplanning, Defining, and Operationalizing for Design of Experiments

A literature review is crucial for the design of experiments.

This phase of the design of experiments helps you identify critical variables, know how to measure them while ensuring reliability and validity, and understand the relationships between them. The review can also help you find ways to reduce sources of variability, which increases your ability to detect treatment effects. Notably, the literature review allows you to learn how similar studies designed their experiments and the challenges they faced.

Operationalizing a study involves taking your research question, using the background information you gathered, and formulating an actionable plan.

This process should produce a specific and testable hypothesis using data that you can reasonably collect given the resources available to the experiment. For example, for the bone density study described below:

  • Null hypothesis: The jumping exercise intervention does not affect bone density.
  • Alternative hypothesis: The jumping exercise intervention affects bone density.
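Once the hypothesis is this specific, the eventual statistical test follows almost mechanically. A minimal sketch with scipy and placeholder data (not the study's actual measurements):

```python
from scipy import stats

# Placeholder bone density changes (g/cm^2) for control and jumping groups.
control = [0.001, -0.002, 0.000, 0.003, -0.001, 0.002]
jumping = [0.008, 0.005, 0.009, 0.004, 0.007, 0.006]

# Two-sample t-test of the null hypothesis that the jumping
# intervention does not affect bone density.
t_stat, p_value = stats.ttest_ind(jumping, control)
print(t_stat, p_value)
```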

To learn more about this early phase, read Five Steps for Conducting Scientific Studies with Statistical Analyses .

Formulating Treatments in Experimental Designs

In an experimental design, treatments are variables that the researchers control. They are the primary independent variables of interest. Researchers administer the treatment to the subjects or items in the experiment and want to know whether it causes changes in the outcome.

As the name implies, a treatment can be medical in nature, such as a new medicine or vaccine. But it’s a general term that applies to other things such as training programs, manufacturing settings, teaching methods, and types of fertilizers. I helped run an experiment where the treatment was a jumping exercise intervention that we hoped would increase bone density. All these treatment examples are things that potentially influence a measurable outcome.

Even when you know your treatment generally, you must carefully consider the amount. How large of a dose? If you’re comparing three different temperatures in a manufacturing process, how far apart are they? For my bone mineral density study, we had to determine how frequently the exercise sessions would occur and how long each lasted.

How you define the treatments in the design of experiments can affect your findings and the generalizability of your results.

Assigning Subjects to Experimental Groups

A crucial decision for all experimental designs is determining how researchers assign subjects to the experimental conditions—the treatment and control groups. The control group is often, but not always, the lack of a treatment. It serves as a basis for comparison by showing outcomes for subjects who don’t receive a treatment. Learn more about Control Groups .

How your experimental design assigns subjects to the groups affects how confident you can be that the findings represent true causal effects rather than mere correlation caused by confounders. Indeed, the assignment method influences how you control for confounding variables. This is the difference between correlation and causation .

Imagine a study finds that vitamin consumption correlates with better health outcomes. As a researcher, you want to be able to say that vitamin consumption causes the improvements. However, with the wrong experimental design, you might only be able to say there is an association. A confounder, and not the vitamins, might actually cause the health benefits.

Let’s explore some of the ways to assign subjects in design of experiments.

Completely Randomized Designs

A completely randomized experimental design randomly assigns all subjects to the treatment and control groups. You simply take each participant and use a random process to determine their group assignment. You can flip coins, roll a die, or use a computer. Randomized experiments must be prospective studies because they need to be able to control group assignment.

Random assignment in the design of experiments helps ensure that the groups are roughly equivalent at the beginning of the study. This equivalence at the start increases your confidence that any differences you see at the end were caused by the treatments. The randomization tends to equalize confounders between the experimental groups and, thereby, cancels out their effects, leaving only the treatment effects.

For example, in a vitamin study, the researchers can randomly assign participants to either the control or vitamin group. Because the groups are approximately equal when the experiment starts, if the health outcomes are different at the end of the study, the researchers can be confident that the vitamins caused those improvements.

Statisticians consider randomized experimental designs to be the best for identifying causal relationships.

If you can’t randomly assign subjects but want to draw causal conclusions about an intervention, consider using a quasi-experimental design.

Learn more about Randomized Controlled Trials and Random Assignment in Experiments .

Randomized Block Designs

Nuisance factors are variables that can affect the outcome, but they are not the researcher’s primary interest. Unfortunately, they can hide or distort the treatment results. When experimenters know about specific nuisance factors, they can use a randomized block design to minimize their impact.

This experimental design takes subjects with a shared “nuisance” characteristic and groups them into blocks. The participants in each block are then randomly assigned to the experimental groups. This process allows the experiment to control for known nuisance factors.

Blocking in the design of experiments reduces the impact of nuisance factors on experimental error. The analysis assesses the effects of the treatment within each block, which removes the variability between blocks. The result is that blocked experimental designs can reduce the impact of nuisance variables, increasing the ability to detect treatment effects accurately.

Suppose you’re testing various teaching methods. Because grade level likely affects educational outcomes, you might use grade level as a blocking factor. To use a randomized block design for this scenario, divide the participants by grade level and then randomly assign the members of each grade level to the experimental groups.
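A sketch of that grade-level scenario using pandas; the roster and method names are hypothetical:

```python
import pandas as pd

# Hypothetical roster of students with the blocking factor (grade level).
roster = pd.DataFrame({
    "student": [f"s{i}" for i in range(12)],
    "grade":   [3, 3, 3, 3, 4, 4, 4, 4, 5, 5, 5, 5],
})
methods = ["method_A", "method_B"]

# Within each grade-level block, shuffle the students and alternate
# assignment so every block contributes equally to both teaching methods.
def assign_block(block: pd.DataFrame) -> pd.DataFrame:
    shuffled = block.sample(frac=1).copy()  # random order within the block
    shuffled["method"] = [methods[i % len(methods)] for i in range(len(shuffled))]
    return shuffled

assigned = roster.groupby("grade", group_keys=False).apply(assign_block)
print(assigned.sort_values(["grade", "method"]))
```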

A standard guideline for an experimental design is to “Block what you can, randomize what you cannot.” Use blocking for a few primary nuisance factors. Then use random assignment to distribute the unblocked nuisance factors equally between the experimental conditions.

You can also use covariates to control nuisance factors. Learn about Covariates: Definition and Uses .

Observational Studies

In some experimental designs, randomly assigning subjects to the experimental conditions is impossible or unethical. The researchers simply can’t assign participants to the experimental groups. However, they can observe them in their natural groupings, measure the essential variables, and look for correlations. These observational studies are also known as quasi-experimental designs. Retrospective studies must be observational in nature because they look back at past events.

Imagine you’re studying the effects of depression on an activity. Clearly, you can’t randomly assign participants to the depression and control groups. But you can observe participants with and without depression and see how their task performance differs.

Observational studies let you perform research when you can’t control the treatment. However, quasi-experimental designs increase the problem of confounding variables. For this design of experiments, correlation does not necessarily imply causation. While special procedures can help control confounders in an observational study, you’re ultimately less confident that the results represent causal findings.

Learn more about Observational Studies.

For a good comparison, learn about the differences and tradeoffs between Observational Studies and Randomized Experiments.

Between-Subjects vs. Within-Subjects Experimental Designs

When you think of the design of experiments, you probably picture a treatment and control group. Researchers assign participants to only one of these groups, so each group contains entirely different subjects than the other groups. Analysts compare the groups at the end of the experiment. Statisticians refer to this method as a between-subjects, or independent measures, experimental design.

In a between-subjects design, you can have more than one treatment group, but each subject is exposed to only one condition: the control condition or one of the treatments.

A potential downside to this approach is that differences between groups at the beginning can affect the results at the end. As you’ve read earlier, random assignment can reduce those differences, but it is imperfect. There will always be some variability between the groups.

In a within-subjects experimental design, also known as repeated measures, subjects experience all treatment conditions and are measured for each. Each subject acts as their own control, which reduces variability and increases the statistical power to detect effects.

In this experimental design, you minimize pre-existing differences between the experimental conditions because they all contain the same subjects. However, the order of treatments can affect the results. Beware of practice and fatigue effects. Learn more about Repeated Measures Designs.

Between-subjects design | Within-subjects design
Each subject is assigned to one experimental condition | Each subject participates in all experimental conditions
Requires more subjects | Requires fewer subjects
Pre-existing differences between subjects in the groups can affect the results | Uses the same subjects in all conditions
No treatment-order effects | Order of treatments can affect results

Design of Experiments Examples

For example, a bone density study has three experimental groups—a control group, a stretching exercise group, and a jumping exercise group.

In a between-subjects experimental design, scientists randomly assign each participant to one of the three groups.

In a within-subjects design, all subjects experience the three conditions sequentially while the researchers measure bone density repeatedly. Researchers can vary the order of treatments across participants to help reduce order effects.
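
One simple way to vary treatment order, sketched in Python under the assumption that every participant can complete all three conditions; the counterbalancing here just cycles through all six possible orders so each appears equally often:

```python
import itertools
import random

conditions = ["control", "stretching", "jumping"]
orders = list(itertools.permutations(conditions))  # all 6 possible orders

rng = random.Random(7)
participants = [f"P{i:02d}" for i in range(1, 13)]
rng.shuffle(participants)  # decouple order from recruitment sequence

# Cycle through the orders so each one is used equally often.
schedule = {p: orders[i % len(orders)] for i, p in enumerate(participants)}
for p in sorted(schedule):
    print(p, "->", ", ".join(schedule[p]))
```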

Matched Pairs Experimental Design

A matched pairs experimental design is a between-subjects study that uses pairs of similar subjects. Researchers use this approach to reduce pre-existing differences between experimental groups. It’s yet another design of experiments method for reducing sources of variability.

Researchers identify variables likely to affect the outcome, such as demographics. For each subject they enroll with a given set of characteristics, they try to locate another participant with similar attributes to create a matched pair. Scientists randomly assign one member of each pair to the treatment group and the other to the control group.

On the plus side, this process creates two similar groups, and it doesn’t create treatment-order effects. While matched pairs do not produce the perfectly matched groups of a within-subjects design (which uses the same subjects in all conditions), the approach aims to reduce variability between groups relative to a standard between-subjects study.

On the downside, finding matched pairs is very time-consuming. Additionally, if one member of a matched pair drops out, the other subject must leave the study too.
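
To make the matching process concrete, here is a rough Python sketch that pairs subjects on a single covariate (age) by sorting and pairing neighbors, then randomizes within each pair. Real studies typically match on several variables at once, and the names and ages below are invented:

```python
import random

def matched_pairs(covariate, seed=3):
    """Pair adjacent subjects after sorting on one covariate, then randomly
    assign one member of each pair to treatment. Assumes an even count."""
    rng = random.Random(seed)
    ordered = sorted(covariate, key=covariate.get)
    assignment = {}
    for a, b in zip(ordered[::2], ordered[1::2]):
        treated = rng.choice((a, b))
        assignment[a] = "treatment" if a == treated else "control"
        assignment[b] = "treatment" if b == treated else "control"
    return assignment

age = {"S1": 21, "S2": 64, "S3": 23, "S4": 59, "S5": 35, "S6": 33}
print(matched_pairs(age))
```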

Learn more about Matched Pairs Design: Uses & Examples.

Another consideration is whether you’ll use a cross-sectional design (one point in time) or a longitudinal study to track changes over time.

A case study is a research method that often serves as a precursor to a more rigorous experimental design by identifying research questions, variables, and hypotheses to test. Learn more about What is a Case Study? Definition & Examples.

In conclusion, the design of experiments is extremely sensitive to subject area concerns and the time and resources available to the researchers. Developing a suitable experimental design requires balancing a multitude of considerations. A successful design is necessary to obtain trustworthy answers to your research question and to have a reasonable chance of detecting treatment effects when they exist.


Experimental and Quasi-Experimental Designs in Implementation Research

Christopher J. Miller

a VA Boston Healthcare System, Center for Healthcare Organization and Implementation Research (CHOIR), United States Department of Veterans Affairs, Boston, MA, USA

b Department of Psychiatry, Harvard Medical School, Boston, MA, USA

Shawna N. Smith

c Department of Psychiatry, University of Michigan Medical School, Ann Arbor, MI, USA

d Survey Research Center, Institute for Social Research, University of Michigan, Ann Arbor, MI, USA

Marianne Pugatch

Implementation science is focused on maximizing the adoption, appropriate use, and sustainability of effective clinical practices in real world clinical settings. Many implementation science questions can be feasibly answered by fully experimental designs, typically in the form of randomized controlled trials (RCTs). Implementation-focused RCTs, however, usually differ from traditional efficacy- or effectiveness-oriented RCTs on key parameters. Other implementation science questions are more suited to quasi-experimental designs, which are intended to estimate the effect of an intervention in the absence of randomization. These designs include pre-post designs with a non-equivalent control group, interrupted time series (ITS), and stepped wedges, the last of which require all participants to receive the intervention, but in a staggered fashion. In this article we review the use of experimental designs in implementation science, including recent methodological advances for implementation studies. We also review the use of quasi-experimental designs in implementation science, and discuss the strengths and weaknesses of these approaches. This article is therefore meant to be a practical guide for researchers who are interested in selecting the most appropriate study design to answer relevant implementation science questions, and thereby increase the rate at which effective clinical practices are adopted, spread, and sustained.

1. Background

The first documented clinical trial was conducted in 1747 by James Lind, a Royal Navy physician, who tested the hypothesis that citrus fruit could cure scurvy. Since then, based on foundational work by Fisher and others (1935), the randomized controlled trial (RCT) has emerged as the gold standard for testing the efficacy of a treatment versus a control condition for individual patients. Randomization of patients is seen as crucial to reducing the impact of measured or unmeasured confounding variables, in turn allowing researchers to draw conclusions regarding causality in clinical trials.

As described elsewhere in this special issue, implementation science is ultimately focused on maximizing the adoption, appropriate use, and sustainability of effective clinical practices in real world clinical settings. As such, some implementation science questions may be addressed by experimental designs. For our purposes here, we use the term “experimental” to refer to designs that feature two essential ingredients: first, manipulation of an independent variable; and second, random assignment of subjects. This corresponds to the definition of randomized experiments originally championed by Fisher (1925). From this perspective, experimental designs usually take the form of RCTs—but implementation-oriented RCTs typically differ in important ways from traditional efficacy- or effectiveness-oriented RCTs. Other implementation science questions require different methodologies entirely: specifically, several forms of quasi-experimental designs may be used for implementation research in situations where an RCT would be inappropriate. These designs are intended to estimate the effect of an intervention despite a lack of randomization. Quasi-experimental designs include pre-post designs with a nonequivalent control group, interrupted time series (ITS), and stepped wedge designs. Stepped wedges are studies in which all participants receive the intervention, but in a staggered fashion. It is important to note that quasi-experimental designs are not unique to implementation science. As we will discuss below, however, each of them has strengths that make them particularly useful in certain implementation science contexts.

Our goal for this manuscript is two-fold. First, we will summarize the use of experimental designs in implementation science. This will include discussion of ways that implementation-focused RCTs may differ from efficacy- or effectiveness-oriented RCTs. Second, we will summarize the use of quasi-experimental designs in implementation research. This will include discussion of the strengths and weaknesses of these types of approaches in answering implementation research questions. For both experimental and quasi-experimental designs, we will discuss a recent implementation study as an illustrative example of one approach.

2. Experimental Designs in Implementation Science

RCTs in implementation science share the same basic structure as efficacy- or effectiveness-oriented RCTs, but typically feature important distinctions. In this section we will start by reviewing key factors that separate implementation RCTs from more traditional efficacy- or effectiveness-oriented RCTs. We will then discuss optimization trials, which are a type of experimental design that is especially useful for certain implementation science questions. We will then briefly turn our attention to single subject experimental designs (SSEDs) and on-off-on (ABA) designs.

The first common difference that sets apart implementation RCTs from more traditional clinical trials is the primary research question they aim to address. For most implementation trials, the primary research question is not the extent to which a particular treatment or evidence-based practice is more effective than a comparison condition, but instead the extent to which a given implementation strategy is more effective than a comparison condition. For more detail on this pivotal issue, see Drs. Bauer and Kirchner in this special issue.

Second, as a corollary of this point, implementation RCTs typically feature different outcome measures than efficacy or effectiveness RCTs, with an emphasis on the extent to which a health intervention was successfully implemented rather than an evaluation of the health effects of that intervention ( Proctor et al., 2011 ). For example, typical implementation outcomes might include the number of patients who receive the intervention, or the number of providers who administer the intervention as intended. A variety of evaluation-oriented implementation frameworks may guide the choices of such measures (e.g. RE-AIM; Gaglio et al., 2013 ; Glasgow et al., 1999 ). Hybrid implementation-effectiveness studies attend to both effectiveness and implementation outcomes ( Curran et al., 2012 ); these designs are also covered in more detail elsewhere in this issue (Landes, this issue).

Third, given their focus, implementation RCTs are frequently cluster-randomized (i.e. with sites or clinics as the unit of randomization, and patients nested within those sites or clinics). For example, consider a hypothetical RCT that aims to evaluate the implementation of a training program for cognitive behavioral therapy (CBT) in community clinics. Randomizing at the patient level for such a trial would be inappropriate due to the risk of contamination, as providers trained in CBT might reasonably be expected to incorporate CBT principles into their treatment even with patients assigned to the control condition. Randomizing at the provider level would also risk contamination, as providers trained in CBT might discuss this treatment approach with their colleagues. Thus, many implementation trials are cluster randomized at the site or clinic level. While such clustering minimizes the risk of contamination, it can unfortunately create commensurate problems with confounding, especially for trials with very few sites to randomize. Stratification may be used to at least partially address confounding issues in cluster-randomized and more traditional trials alike, by ensuring that intervention and control groups are broadly similar on certain key variables. Furthermore, such allocation schemes typically require analytic models that account for this clustering and the resulting correlations among error structures (e.g., generalized estimating equations [GEE] or mixed-effects models; Schildcrout et al., 2018).
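
As an illustration of the last point, the Python snippet below simulates a small cluster-randomized dataset and fits a mixed-effects model with a random intercept per site using statsmodels. The effect sizes, sample sizes, and variable names are arbitrary; this is a sketch of the modeling idea, not any particular study's analysis:

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)
n_sites, n_patients = 20, 30
site = np.repeat(np.arange(n_sites), n_patients)
# Randomize treatment at the site level, then expand to patients.
treated = np.repeat(rng.integers(0, 2, n_sites), n_patients)
site_effect = np.repeat(rng.normal(0, 1, n_sites), n_patients)  # clustering
outcome = 0.5 * treated + site_effect + rng.normal(0, 1, site.size)
df = pd.DataFrame({"site": site, "treated": treated, "outcome": outcome})

# Random intercept per site accounts for within-site correlation.
fit = smf.mixedlm("outcome ~ treated", df, groups=df["site"]).fit()
print(fit.summary())
```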

2.1. Optimization trials

Key research questions in implementation science often involve determining which implementation strategies to provide, to whom, and when, to achieve optimal implementation success. As such, trials designed to evaluate comparative effectiveness, or to optimize provision of different types or intensities of implementation strategies, may be more appealing than traditional effectiveness trials. The methods described in this section are not unique to implementation science, but their application in the context of implementation trials may be particularly useful for informing implementation strategies.

While two-arm RCTs can be used to evaluate comparative effectiveness, trials focused on optimizing implementation support may use alternative experimental designs ( Collins et al., 2005 ; Collins et al., 2007 ). For example, in certain clinical contexts, multi-component “bundles” of implementation strategies may be warranted (e.g. a bundle consisting of clinician training, technical assistance, and audit/feedback to encourage clinicians to use a new evidence-based practice). In these situations, implementation researchers might consider using factorial or fractional-factorial designs. In the context of implementation science, these designs randomize participants (e.g. sites or providers) to different combinations of implementation strategies, and can be used to evaluate the effectiveness of each strategy individually to inform an optimal combination (e.g. Coulton et al., 2009 ; Pellegrini et al., 2014 ; Wyrick, et al., 2014 ). Such designs can be particularly useful in informing multi-component implementation strategies that are not redundant or overly burdensome ( Collins et al., 2014a ; Collins et al., 2009 ; Collins et al., 2007 ).
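
For instance, a full factorial on the hypothetical three-strategy bundle above has 2^3 = 8 cells. A short sketch, with invented strategy and site names, of assigning sites evenly and at random across those cells:

```python
import itertools
import random

strategies = ["training", "technical_assistance", "audit_feedback"]
# Full 2^3 factorial: every on/off combination of the three strategies.
cells = list(itertools.product([0, 1], repeat=len(strategies)))

rng = random.Random(11)
sites = [f"site_{i:02d}" for i in range(1, 17)]
rng.shuffle(sites)  # random order, then deal sites across the 8 cells evenly
design = {s: dict(zip(strategies, cells[i % len(cells)])) for i, s in enumerate(sites)}
for s in sorted(design):
    print(s, design[s])
```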

Researchers interested in optimizing sequences of implementation strategies that adapt to ongoing needs over time may be interested in a variant of factorial designs known as the sequential, multiple-assignment randomized trial (SMART; Almirall et al., 2012 ; Collins et al., 2014b ; Kilbourne et al., 2014b ; Lei et al., 2012 ; Nahum-Shani et al., 2012 ; NeCamp et al., 2017 ). SMARTs are multistage randomized trials in which some or all participants are randomized more than once, often based on ongoing information (e.g., treatment response). In implementation research, SMARTs can inform optimal sequences of implementation strategies to maximize downstream clinical outcomes. Thus, such designs are well-suited to answering questions about what implementation strategies should be used, in what order, to achieve the best outcomes in a given context.
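
The re-randomization logic of a SMART can be sketched in a few lines of Python. This toy two-stage design, with made-up strategy names and a stand-in response check, is only meant to show the structure:

```python
import random

rng = random.Random(5)

def smart_assignment(site, responded_at_6mo):
    """Toy two-stage SMART: every site starts on a low-intensity strategy;
    non-responders at 6 months are re-randomized to one of two augmentations."""
    sequence = ["low-intensity support"]  # stage 1, given to all sites
    if not responded_at_6mo(site):
        sequence.append(rng.choice(["augmentation A", "augmentation B"]))
    return sequence

# Stand-in response check; a real trial would apply a benchmark to site data.
print(smart_assignment("clinic_01", lambda site: False))
```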

One example of an implementation SMART is the Adaptive Implementation of Effective Program Trial (ADEPT; Kilbourne et al., 2014a ). ADEPT was a clustered SMART ( NeCamp et al., 2017 ) designed to inform an adaptive sequence of implementation strategies for implementing an evidence-based collaborative chronic care model, Life Goals ( Kilbourne et al., 2014c ; Kilbourne et al., 2012a ), into community-based practices. Life Goals, the clinical intervention being implemented, has proven effective at improving physical and mental health outcomes for patients with unipolar and bipolar depression by encouraging providers to instruct patients in self-management, and improving clinical information systems and care management across physical and mental health providers ( Bauer et al., 2006 ; Kilbourne et al., 2012a ; Kilbourne et al., 2008 ; Simon et al., 2006 ). However, in spite of its established clinical effectiveness, community-based clinics experienced a number of barriers in trying to implement the Life Goals model, and there were questions about how best to efficiently and effectively augment implementation strategies for clinics that struggled with implementation.

The ADEPT study was thus designed to determine the best sequence of implementation strategies to offer sites interested in implementing Life Goals. The ADEPT study involved use of three different implementation strategies. First, all sites received implementation support based on Replicating Effective Programs (REP), which offered an implementation manual, brief training, and low-level technical support (Kilbourne et al., 2007; Kilbourne et al., 2012b; Neumann and Sogolow, 2000). REP implementation support had been previously found to be low-cost and readily scalable, but also insufficient for uptake for many community-based settings (Kilbourne et al., 2015). For sites that failed to implement Life Goals under REP, two additional implementation strategies were considered as augmentations to REP: External Facilitation (EF; Kilbourne et al., 2014b; Stetler et al., 2006), consisting of phone-based mentoring in strategic skills from a study team member; and Internal Facilitation (IF; Kirchner et al., 2014), which supported protected time for a site employee to address barriers to program adoption.

The ADEPT study was designed to evaluate the best way to augment support for these sites that were not able to implement Life Goals under REP, specifically querying whether it was better to augment REP with EF only or the more intensive EF/IF, and whether augmentations should be provided all at once, or staged. Intervention assignments are mapped in Figure 1. Seventy-nine community-based clinics across Michigan and Colorado were provided with initial implementation support under REP. After six months, implementation of the clinical intervention, Life Goals, was evaluated at all sites. Sites that had failed to reach an adequate level of delivery (defined as those sites enrolling fewer than ten patients in Life Goals, or those at which fewer than 50% of enrolled patients had received at least three Life Goals sessions) were considered non-responsive to REP and randomized to receive additional support through either EF or combined EF/IF. After six further months, Life Goals implementation at these sites was again evaluated. Sites surpassing the implementation response benchmark had their EF or EF/IF support discontinued. EF/IF sites that remained non-responsive continued to receive EF/IF for an additional six months. EF sites that remained non-responsive were randomized a second time to either continue with EF or further augment with IF. This design thus allowed for comparison of three different adaptive sequences of implementation support for sites that were initially non-responsive under REP:

Figure 1. SMART design from ADEPT trial.

  • Provide EF for six months; continue EF for a further six months for sites that remain non-responsive; discontinue EF for sites that are responsive;
  • Provide EF/IF for six months; continue EF/IF for a further six months for sites that remain non-responsive; discontinue EF/IF for sites that are responsive; and
  • Provide EF for six months; step up to EF/IF for a further six months for sites that remain non-responsive; discontinue EF for sites that are responsive.

While analyses of this study are still ongoing, including the comparison of these three adaptive sequences of implementation strategies, results have shown that patients at sites that were randomized to receive EF as the initial augmentation to REP saw more improvement in clinical outcomes (SF-12 mental health quality of life and PHQ-9 depression scores) after 12 months than patients at sites that were randomized to receive the more intensive EF/IF augmentation.

2.2. Single Subject Experimental Designs and On-Off-On (ABA) Designs

We also note that there are a variety of Single Subject Experimental Designs (SSEDs; Byiers et al., 2012 ), including withdrawal designs and alternating treatment designs, that can be used in testing evidence-based practices. Similarly, an implementation strategy may be used to encourage the use of a specific treatment at a particular site, followed by that strategy’s withdrawal and subsequent reinstatement, with data collection throughout the process (on-off-on or ABA design). A weakness of these approaches in the context of implementation science, however, is that they usually require reversibility of the intervention (i.e. that the withdrawal of implementation support truly allows the healthcare system to revert to its pre-implementation state). When this is not the case—for example, if a hypothetical study is focused on training to encourage use of an evidence-based psychotherapy—then these designs may be less useful.

3. Quasi-Experimental Designs in Implementation Science

In some implementation science contexts, policy-makers or administrators may not be willing to have a subset of participating patients or sites randomized to a control condition, especially for high-profile or high-urgency clinical issues. Quasi-experimental designs allow implementation scientists to conduct rigorous studies in these contexts, albeit with certain limitations. We briefly review the characteristics of these designs here; other recent review articles are available for the interested reader (e.g. Handley et al., 2018 ).

3.1. Pre-Post with Non-Equivalent Control Group

The pre-post with non-equivalent control group uses a control group in the absence of randomization. Ideally, the control group is chosen to be as similar to the intervention group as possible (e.g. by matching on factors such as clinic type, patient population, geographic region, etc.). Theoretically, both groups are exposed to the same trends in the environment, making it plausible to determine whether the intervention had an effect. Measurement of both treatment and control conditions classically occurs pre- and post-intervention, with differential improvement between the groups attributed to the intervention. This design is popular due to its practicality, especially if data collection points can be kept to a minimum. It may be especially useful for capitalizing on naturally occurring experiments such as may occur in the context of certain policy initiatives or rollouts—specifically, rollouts in which it is plausible that a control group can be identified. For example, Kirchner and colleagues (2014) used this type of design to evaluate the integration of mental health services into primary care clinics at seven US Department of Veterans Affairs (VA) medical centers and seven matched controls.

One overarching drawback of this design is that it is especially vulnerable to threats to internal validity ( Shadish, 2002 ), because pre-existing differences between the treatment and control group could erroneously be attributed to the intervention. While unmeasured differences between treatment and control groups are always a possibility in healthcare research, such differences are especially likely to occur in the context of these designs due to the lack of randomization. Similarly, this design is particularly sensitive to secular trends that may differentially affect the treatment and control groups ( Cousins et al., 2014 ; Pape et al., 2013 ), as well as regression to the mean confounding study results ( Morton and Torgerson, 2003 ). For example, if a study site is selected for the experimental condition precisely because it is underperforming in some way, then regression to the mean would suggest that the site will show improvement regardless of any intervention; in the context of a pre-post with non-equivalent control group study, however, this improvement would erroneously be attributed to the intervention itself (Type I error).
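
A quick simulation makes the regression-to-the-mean trap visible: select the "worst" sites on a noisy year-one measurement and they improve in year two with no intervention at all. All numbers below are arbitrary:

```python
import numpy as np

rng = np.random.default_rng(42)
true_quality = rng.normal(0, 1, 1000)            # each site's stable quality
year1 = true_quality + rng.normal(0, 1, 1000)    # noisy year-one measurement
year2 = true_quality + rng.normal(0, 1, 1000)    # noisy year-two measurement

worst = year1 < np.quantile(year1, 0.10)         # "select the underperformers"
print("year 1 mean, selected sites:", round(float(year1[worst].mean()), 2))
print("year 2 mean, selected sites:", round(float(year2[worst].mean()), 2))
# Year two looks better with no intervention at all: regression to the mean.
```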

There are, however, various ways that implementation scientists can mitigate these weaknesses. First, as mentioned briefly above, it is important to select a control group that is as similar as possible to the intervention site(s), which can include matching at both the health care network and clinic level (e.g. Kirchner et al., 2014 ). Second, propensity score weighting (e.g. Morgan, 2018 ) can statistically mitigate internal validity concerns, although this approach may be of limited utility when comparing secular trends between different study cohorts ( Dimick and Ryan, 2014 ). More broadly, qualitative methods (e.g. periodic interviews with staff at intervention and control sites) can help uncover key contextual factors that may be affecting study results above and beyond the intervention itself.
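
A bare-bones sketch of propensity score weighting with scikit-learn, on simulated data where treatment uptake depends on measured covariates. Real analyses involve overlap diagnostics and weight trimming that are omitted here:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(1)
n = 500
x = rng.normal(size=(n, 2))                  # measured covariates
p_treat = 1 / (1 + np.exp(-(x[:, 0] - 0.5 * x[:, 1])))
treated = rng.binomial(1, p_treat)           # non-random treatment uptake

ps = LogisticRegression().fit(x, treated).predict_proba(x)[:, 1]
# Inverse-probability weights rebalance measured covariates across groups.
w = np.where(treated == 1, 1.0 / ps, 1.0 / (1.0 - ps))
for g in (1, 0):
    m = np.average(x[treated == g], axis=0, weights=w[treated == g])
    print("group", g, "weighted covariate means:", m.round(2))
```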

3.2. Interrupted Time Series

Interrupted time series (ITS; Shadish, 2002 ; Taljaard et al., 2014 ; Wagner et al., 2002 ) designs represent one of the most robust categories of quasi-experimental designs. Rather than relying on a non-equivalent control group, ITS designs rely on repeated data collections from intervention sites to determine whether a particular intervention is associated with improvement on a given metric relative to the pre-intervention secular trend. They are particularly useful in cases where a comparable control group cannot be identified—for example, following widespread implementation of policy mandates, quality improvement initiatives, or dissemination campaigns ( Eccles et al., 2003 ). In ITS designs, data are collected at multiple time points both before and after an intervention (e.g., policy change, implementation effort), and analyses explore whether the intervention was associated with the outcome beyond any pre-existing secular trend. More formally, ITS evaluations focus on identifying whether there is discontinuity in the trend (change in slope or level) after the intervention relative to before the intervention, using segmented regression to model pre- and post-intervention trends ( Gebski et al., 2012 ; Penfold and Zhang, 2013 ; Taljaard et al., 2014 ; Wagner et al., 2002 ). A number of recent implementation studies have used ITS designs, including an evaluation of implementation of a comprehensive smoke-free policy in a large UK mental health organization to reduce physical assaults ( Robson et al., 2017 ); the impact of a national policy limiting alcohol availability on suicide mortality in Slovenia ( Pridemore and Snowden, 2009 ); and the effect of delivery of a tailored intervention for primary care providers to increase psychological referrals for women with mild to moderate postnatal depression ( Hanbury et al., 2013 ).
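
The segmented regression model itself is compact. The Python sketch below simulates 24 monthly observations with an intervention at month 12 and fits level-change and slope-change terms with statsmodels; all values are simulated for illustration:

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)
t = np.arange(24)                      # 12 months pre, 12 months post
post = (t >= 12).astype(int)
t_since = np.clip(t - 12, 0, None)     # time elapsed since the intervention
# Simulated metric: baseline trend, a level jump, and a slope change at month 12.
y = 50 + 0.3 * t + 4 * post + 0.5 * t_since + rng.normal(0, 1, t.size)

df = pd.DataFrame({"y": y, "t": t, "post": post, "t_since": t_since})
fit = smf.ols("y ~ t + post + t_since", df).fit()
print(fit.params.round(2))  # 'post' = level change; 't_since' = slope change
```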

ITS designs are appealing in implementation work for several reasons. Relative to uncontrolled pre-post analyses, ITS analyses reduce the chances that intervention effects are confounded by secular trends ( Bernal et al., 2017 ; Eccles et al., 2003 ). Time-varying confounders, such as seasonality, can also be adjusted for, provided adequate data ( Bernal et al., 2017 ). Indeed, recent work has confirmed that ITS designs can yield effect estimates similar to those derived from cluster-randomized RCTs ( Fretheim et al., 2013 ; Fretheim et al., 2015 ). Relative to an RCT, ITS designs can also allow for a more comprehensive assessment of the longitudinal effects of an intervention (positive or negative), as effects can be traced over all included time points ( Bernal et al., 2017 ; Penfold and Zhang, 2013 ).

ITS designs also present a number of challenges. First, the segmented regression approach requires clear delineation between pre- and post-intervention periods; interventions with indeterminate implementation periods are likely not good candidates for ITS. While ITS designs that include multiple ‘interruptions’ (e.g. introductions of new treatment components) are possible, they will require collection of enough time points between interruptions to ensure that each intervention’s effects can be ascertained individually ( Bernal et al., 2017 ). Second, collecting data from sufficient time points across all sites of interest, especially for the pre-intervention period, can be challenging ( Eccles et al., 2003 ): a common recommendation is at least eight time points both pre- and post-intervention ( Penfold and Zhang, 2013 ). This may be onerous, particularly if the data are not routinely collected by the health system(s) under study. Third, ITS cannot protect against confounding effects from other interventions that begin contemporaneously and may impact similar outcomes ( Eccles et al., 2003 ).

3.3. Stepped Wedge Designs

Stepped wedge trials are another type of quasi-experimental design. In a stepped wedge, all participants receive the intervention, but are assigned to the timing of the intervention in a staggered fashion ( Betran et al., 2018 ; Brown and Lilford, 2006 ; Hussey and Hughes, 2007 ), typically at the site or cluster level. Stepped wedge designs have their analytic roots in balanced incomplete block designs, in which all pairs of treatments occur an equal number of times within each block ( Hanani, 1961 ). Traditionally, all sites in stepped wedge trials have outcome measures assessed at all time points, thus allowing sites that receive the intervention later in the trial to essentially serve as controls for early intervention sites. A recent special issue of the journal Trials includes more detail on these designs ( Davey et al., 2015 ), which may be ideal for situations in which it is important for all participating patients or sites to receive the intervention during the trial. Stepped wedge trials may also be useful when resources are scarce enough that intervening at all sites at once (or even half of the sites as in a standard treatment-versus-control RCT) would not be feasible. If desired, the administration of the intervention to sites in waves allows for lessons learned in early sites to be applied to later sites (via formative evaluation; see Elwy et al., this issue).
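
The staggered layout is easy to picture as a site-by-period matrix of 0s (control) and 1s (intervention). A small Python sketch that randomly orders hypothetical sites into waves and prints each site's schedule:

```python
import random

def stepped_wedge(sites, n_waves=3, seed=9):
    """Randomly order sites into waves; 1 = receiving the intervention."""
    rng = random.Random(seed)
    order = list(sites)
    rng.shuffle(order)
    waves = [order[i::n_waves] for i in range(n_waves)]
    n_periods = n_waves + 1  # one all-control baseline period
    return {s: [1 if period > w else 0 for period in range(n_periods)]
            for w, wave in enumerate(waves) for s in wave}

for site, row in sorted(stepped_wedge([f"S{i}" for i in range(1, 10)]).items()):
    print(site, row)
```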

The Behavioral Health Interdisciplinary Program (BHIP) Enhancement Project is a recent example of a stepped-wedge implementation trial ( Bauer et al., 2016 ; Bauer et al., 2019 ). This study involved using blended facilitation (including internal and external facilitators; Kirchner et al., 2014 ) to implement care practices consistent with the collaborative chronic care model (CCM; Bodenheimer et al., 2002a , b ; Wagner et al., 1996 ) in nine outpatient mental health teams in VA medical centers. Figure 2 illustrates the implementation and stepdown periods for that trial, with black dots representing primary data collection points.

Figure 2. BHIP Enhancement Project stepped wedge (adapted from Bauer et al., 2019).

The BHIP Enhancement Project was conducted as a stepped wedge for several reasons. First, the stepped wedge design allowed the trial to reach nine sites despite limited implementation resources (i.e. intervening at all nine sites simultaneously would not have been feasible given study funding). Second, the stepped wedge design aided in recruitment and retention, as all participating sites were certain to receive implementation support during the trial: at worst, sites that were randomized to later-phase implementation had to endure waiting periods totaling about eight months before implementation began. This was seen as a major strength of the design by its operational partner, the VA Office of Mental Health and Suicide Prevention. To keep sites engaged during the waiting period, the BHIP Enhancement Project offered a guiding workbook and monthly technical support conference calls.

Three additional features of the BHIP Enhancement Project deserve special attention. First, data collection for late-implementing sites did not begin until immediately before the onset of implementation support (see Figure 2 ). While this reduced statistical power, it also significantly reduced data collection burden on the study team. Second, onset of implementation support was staggered such that wave 2 began at the end of month 4 rather than month 6. This had two benefits: first, this compressed the overall amount of time required for implementation during the trial. Second, it meant that the study team only had to collect data from one site at a time, with data collection periods coming every 2–4 months. More traditional stepped wedge approaches typically have data collection across sites temporally aligned (e.g. Betran et al., 2018 ). Third, the BHIP Enhancement Project used a balancing algorithm ( Lew et al., 2019 ) to assign sites to waves, retaining some of the benefits of randomization while ensuring balance on key site characteristics (e.g. size, geographic region).
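
Lew et al. (2019) describe a specific balancing method; as a generic stand-in, the sketch below repeatedly shuffles sites into waves and keeps the assignment that best balances one site characteristic (size). This conveys the idea without reproducing their algorithm, and the site sizes are invented:

```python
import random
import statistics

def balanced_waves(site_sizes, n_waves=3, n_tries=5000, seed=13):
    """Search random wave assignments; keep the one whose waves have the
    most similar mean site size (a stand-in for balancing on several
    characteristics at once)."""
    rng = random.Random(seed)
    sites = list(site_sizes)
    best, best_score = None, float("inf")
    for _ in range(n_tries):
        rng.shuffle(sites)
        waves = [sites[i::n_waves] for i in range(n_waves)]
        means = [statistics.mean(site_sizes[s] for s in w) for w in waves]
        score = max(means) - min(means)
        if score < best_score:
            best, best_score = [w[:] for w in waves], score
    return best

sizes = {f"S{i}": n for i, n in enumerate([12, 40, 25, 33, 8, 27, 19, 45, 22], 1)}
print(balanced_waves(sizes))
```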

Despite their utility, stepped wedges have some important limitations. First, because they feature delayed implementation at some sites, stepped wedges typically take longer than similarly-sized parallel group RCTs. This increases the chances that secular trends, policy changes, or other external forces impact study results. Second, as with RCTs, imbalanced site assignment can confound results. This may occur deliberately in some cases—for example, if sites that develop their implementation plans first are assigned to earlier waves. Even if sites are randomized, however, early and late wave sites may still differ on important characteristics such as size, rurality, and case mix. The resulting confounding between site assignment and time can threaten the internal validity of the study—although, as above, balancing algorithms can reduce this risk. Third, the use of formative evaluation (Elwy, this issue), while useful for maximizing the utility of implementation efforts in a stepped wedge, can mean that late-wave sites receive different implementation strategies than early-wave sites. Similarly, formative evaluation may inform midstream adaptations to the clinical innovation being implemented. In either case, these changes may again threaten internal validity. Overall, then, stepped wedges represent useful tools for evaluating the impact of health interventions that (as with all designs) are subject to certain weaknesses and limitations.

4. Conclusions and Future Directions

Implementation science is focused on maximizing the extent to which effective healthcare practices are adopted, used, and sustained by clinicians, hospitals, and systems. Answering questions in these domains frequently requires different research methods than those employed in traditional efficacy- or effectiveness-oriented randomized clinical trials (RCTs). Implementation-oriented RCTs typically feature cluster or site-level randomization, and emphasize implementation outcomes (e.g. the number of patients receiving the new treatment as intended) rather than traditional clinical outcomes. Hybrid implementation-effectiveness designs incorporate both types of outcomes; more details on these approaches can be found elsewhere in this special issue (Landes, this issue). Other methodological innovations, such as factorial designs or sequential, multiple-assignment randomized trials (SMARTs), can address questions about multi-component or adaptive interventions, still under the umbrella of experimental designs. These types of trials may be especially important for demystifying the “black box” of implementation—that is, determining what components of an implementation strategy are most strongly associated with implementation success. In contrast, pre-post designs with non-equivalent control groups, interrupted time series (ITS), and stepped wedge designs are all examples of quasi-experimental designs that may serve implementation researchers when experimental designs would be inappropriate. A major theme cutting across each of these designs is that there are relative strengths and weaknesses associated with any study design decision. Determining what design to use ultimately will need to be informed by the primary research question to be answered, while simultaneously balancing the need for internal validity, external validity, feasibility, and ethics.

New innovations in study design are constantly being developed and refined. Several such innovations are covered in other articles within this special issue (e.g. Kim et al., this issue). One future direction relevant to the study designs presented in this article is the potential for adaptive trial designs, which allow information gleaned during the trial to inform the adaptation of components like treatment allocation, sample size, or study recruitment in the later phases of the same trial ( Pallmann et al., 2018 ). These designs are becoming increasingly popular in clinical treatment ( Bhatt and Mehta, 2016 ) but could also hold promise for implementation scientists, especially as interest grows in rapid-cycle testing of implementation strategies or efforts. Adaptive designs could potentially be incorporated into both SMART designs and stepped wedge studies, as well as traditional RCTs to further advance implementation science ( Cheung et al., 2015 ). Ideally, these and other innovations will provide researchers with increasingly robust and useful methodologies for answering timely implementation science questions.

  • Many implementation science questions can be addressed by fully experimental designs (e.g. randomized controlled trials [RCTs]).
  • Implementation trials differ in important ways, however, from more traditional efficacy- or effectiveness-oriented RCTs.
  • Adaptive designs represent a recent innovation to determine optimal implementation strategies within a fully experimental framework.
  • Quasi-experimental designs can be used to answer implementation science questions in the absence of randomization.
  • The choice of study designs in implementation science requires careful consideration of scientific, pragmatic, and ethical issues.

Acknowledgments

This work was supported by Department of Veterans Affairs grants QUE 15–289 (PI: Bauer) and CIN 13403 and National Institutes of Health grant RO1 MH 099898 (PI: Kilbourne).


  • Almirall D, Compton SN, Gunlicks-Stoessel M, Duan N, Murphy SA, 2012. Designing a pilot sequential multiple assignment randomized trial for developing an adaptive treatment strategy. Stat Med 31(17), 1887–1902.
  • Bauer MS, McBride L, Williford WO, Glick H, Kinosian B, Altshuler L, Beresford T, Kilbourne AM, Sajatovic M, Cooperative Studies Program 430 Study Team, 2006. Collaborative care for bipolar disorder: Part II. Impact on clinical outcome, function, and costs. Psychiatr Serv 57(7), 937–945.
  • Bauer MS, Miller C, Kim B, Lew R, Weaver K, Coldwell C, Henderson K, Holmes S, Seibert MN, Stolzmann K, Elwy AR, Kirchner J, 2016. Partnering with health system operations leadership to develop a controlled implementation trial. Implement Sci 11, 22.
  • Bauer MS, Miller CJ, Kim B, Lew R, Stolzmann K, Sullivan J, Riendeau R, Pitcock J, Williamson A, Connolly S, Elwy AR, Weaver K, 2019. Effectiveness of Implementing a Collaborative Chronic Care Model for Clinician Teams on Patient Outcomes and Health Status in Mental Health: A Randomized Clinical Trial. JAMA Netw Open 2(3), e190230.
  • Bernal JL, Cummins S, Gasparrini A, 2017. Interrupted time series regression for the evaluation of public health interventions: a tutorial. Int J Epidemiol 46(1), 348–355.
  • Betran AP, Bergel E, Griffin S, Melo A, Nguyen MH, Carbonell A, Mondlane S, Merialdi M, Temmerman M, Gulmezoglu AM, 2018. Provision of medical supply kits to improve quality of antenatal care in Mozambique: a stepped-wedge cluster randomised trial. Lancet Glob Health 6(1), e57–e65.
  • Bhatt DL, Mehta C, 2016. Adaptive Designs for Clinical Trials. N Engl J Med 375(1), 65–74.
  • Bodenheimer T, Wagner EH, Grumbach K, 2002a. Improving primary care for patients with chronic illness. JAMA 288(14), 1775–1779.
  • Bodenheimer T, Wagner EH, Grumbach K, 2002b. Improving primary care for patients with chronic illness: the chronic care model, Part 2. JAMA 288(15), 1909–1914.
  • Brown CA, Lilford RJ, 2006. The stepped wedge trial design: a systematic review. BMC Med Res Methodol 6(1), 54.
  • Byiers BJ, Reichle J, Symons FJ, 2012. Single-subject experimental design for evidence-based practice. Am J Speech Lang Pathol 21(4), 397–414.
  • Cheung YK, Chakraborty B, Davidson KW, 2015. Sequential multiple assignment randomized trial (SMART) with adaptive randomization for quality improvement in depression treatment program. Biometrics 71(2), 450–459.
  • Collins LM, Dziak JJ, Kugler KC, Trail JB, 2014a. Factorial experiments: efficient tools for evaluation of intervention components. Am J Prev Med 47(4), 498–504.
  • Collins LM, Dziak JJ, Li R, 2009. Design of experiments with multiple independent variables: a resource management perspective on complete and reduced factorial designs. Psychol Methods 14(3), 202–224.
  • Collins LM, Murphy SA, Bierman KL, 2004. A conceptual framework for adaptive preventive interventions. Prev Sci 5(3), 185–196.
  • Collins LM, Murphy SA, Nair VN, Strecher VJ, 2005. A strategy for optimizing and evaluating behavioral interventions. Ann Behav Med 30(1), 65–73.
  • Collins LM, Murphy SA, Strecher V, 2007. The multiphase optimization strategy (MOST) and the sequential multiple assignment randomized trial (SMART): new methods for more potent eHealth interventions. Am J Prev Med 32(5 Suppl), S112–118.
  • Collins LM, Nahum-Shani I, Almirall D, 2014b. Optimization of behavioral dynamic treatment regimens based on the sequential, multiple assignment, randomized trial (SMART). Clin Trials 11(4), 426–434.
  • Coulton S, Perryman K, Bland M, Cassidy P, Crawford M, Deluca P, Drummond C, Gilvarry E, Godfrey C, Heather N, Kaner E, Myles J, Newbury-Birch D, Oyefeso A, Parrott S, Phillips T, Shenker D, Shepherd J, 2009. Screening and brief interventions for hazardous alcohol use in accident and emergency departments: a randomised controlled trial protocol. BMC Health Serv Res 9, 114.
  • Cousins K, Connor JL, Kypri K, 2014. Effects of the Campus Watch intervention on alcohol consumption and related harm in a university population. Drug Alcohol Depend 143, 120–126.
  • Curran GM, Bauer M, Mittman B, Pyne JM, Stetler C, 2012. Effectiveness-implementation hybrid designs: combining elements of clinical effectiveness and implementation research to enhance public health impact. Med Care 50(3), 217–226.
  • Davey C, Hargreaves J, Thompson JA, Copas AJ, Beard E, Lewis JJ, Fielding KL, 2015. Analysis and reporting of stepped wedge randomised controlled trials: synthesis and critical appraisal of published studies, 2010 to 2014. Trials 16(1), 358.
  • Dimick JB, Ryan AM, 2014. Methods for evaluating changes in health care policy: the difference-in-differences approach. JAMA 312(22), 2401–2402.
  • Eccles M, Grimshaw J, Campbell M, Ramsay C, 2003. Research designs for studies evaluating the effectiveness of change and improvement strategies. Qual Saf Health Care 12(1), 47–52.
  • Fisher RA, 1925. Theory of statistical estimation. Mathematical Proceedings of the Cambridge Philosophical Society 22(5), 700–725.
  • Fisher RA, 1935. The Design of Experiments. Oliver and Boyd, Edinburgh.
  • Fretheim A, Soumerai SB, Zhang F, Oxman AD, Ross-Degnan D, 2013. Interrupted time-series analysis yielded an effect estimate concordant with the cluster-randomized controlled trial result. J Clin Epidemiol 66(8), 883–887.
  • Fretheim A, Zhang F, Ross-Degnan D, Oxman AD, Cheyne H, Foy R, Goodacre S, Herrin J, Kerse N, McKinlay RJ, Wright A, Soumerai SB, 2015. A reanalysis of cluster randomized trials showed interrupted time-series studies were valuable in health system evaluation. J Clin Epidemiol 68(3), 324–333.
  • Gaglio B, Shoup JA, Glasgow RE, 2013. The RE-AIM framework: a systematic review of use over time. Am J Public Health 103(6), e38–46.
  • Gebski V, Ellingson K, Edwards J, Jernigan J, Kleinbaum D, 2012. Modelling interrupted time series to evaluate prevention and control of infection in healthcare. Epidemiol Infect 140(12), 2131–2141.
  • Glasgow RE, Vogt TM, Boles SM, 1999. Evaluating the public health impact of health promotion interventions: the RE-AIM framework. Am J Public Health 89(9), 1322–1327.
  • Hanani H, 1961. The existence and construction of balanced incomplete block designs. The Annals of Mathematical Statistics 32(2), 361–386.
  • Hanbury A, Farley K, Thompson C, Wilson PM, Chambers D, Holmes H, 2013. Immediate versus sustained effects: interrupted time series analysis of a tailored intervention. Implement Sci 8, 130.
  • Handley MA, Lyles CR, McCulloch C, Cattamanchi A, 2018. Selecting and Improving Quasi-Experimental Designs in Effectiveness and Implementation Research. Annu Rev Public Health 39, 5–25.
  • Hussey MA, Hughes JP, 2007. Design and analysis of stepped wedge cluster randomized trials. Contemp Clin Trials 28(2), 182–191.
  • Kilbourne AM, Almirall D, Eisenberg D, Waxmonsky J, Goodrich DE, Fortney JC, Kirchner JE, Solberg LI, Main D, Bauer MS, Kyle J, Murphy SA, Nord KM, Thomas MR, 2014a. Protocol: Adaptive Implementation of Effective Programs Trial (ADEPT): cluster randomized SMART trial comparing a standard versus enhanced implementation strategy to improve outcomes of a mood disorders program. Implement Sci 9, 132.
  • Kilbourne AM, Almirall D, Goodrich DE, Lai Z, Abraham KM, Nord KM, Bowersox NW, 2014b. Enhancing outreach for persons with serious mental illness: 12-month results from a cluster randomized trial of an adaptive implementation strategy. Implement Sci 9, 163.
  • Kilbourne AM, Bramlet M, Barbaresso MM, Nord KM, Goodrich DE, Lai Z, Post EP, Almirall D, Verchinina L, Duffy SA, Bauer MS, 2014c. SMI life goals: description of a randomized trial of a collaborative care model to improve outcomes for persons with serious mental illness. Contemp Clin Trials 39(1), 74–85.
  • Kilbourne AM, Goodrich DE, Lai Z, Clogston J, Waxmonsky J, Bauer MS, 2012a. Life Goals Collaborative Care for patients with bipolar disorder and cardiovascular disease risk. Psychiatr Serv 63(12), 1234–1238.
  • Kilbourne AM, Goodrich DE, Nord KM, Van Poppelen C, Kyle J, Bauer MS, Waxmonsky JA, Lai Z, Kim HM, Eisenberg D, Thomas MR, 2015. Long-Term Clinical Outcomes from a Randomized Controlled Trial of Two Implementation Strategies to Promote Collaborative Care Attendance in Community Practices. Adm Policy Ment Health 42(5), 642–653.
  • Kilbourne AM, Neumann MS, Pincus HA, Bauer MS, Stall R, 2007. Implementing evidence-based interventions in health care: application of the replicating effective programs framework. Implement Sci 2, 42.
  • Kilbourne AM, Neumann MS, Waxmonsky J, Bauer MS, Kim HM, Pincus HA, Thomas M, 2012b. Public-academic partnerships: evidence-based implementation: the role of sustained community-based practice and research partnerships. Psychiatr Serv 63(3), 205–207.
  • Kilbourne AM, Post EP, Nossek A, Drill L, Cooley S, Bauer MS, 2008. Improving medical and psychiatric outcomes among individuals with bipolar disorder: a randomized controlled trial. Psychiatr Serv 59(7), 760–768.
  • Kirchner JE, Ritchie MJ, Pitcock JA, Parker LE, Curran GM, Fortney JC, 2014. Outcomes of a partnered facilitation strategy to implement primary care-mental health. J Gen Intern Med 29(Suppl 4), 904–912.
  • Lei H, Nahum-Shani I, Lynch K, Oslin D, Murphy SA, 2012. A “SMART” design for building individualized treatment sequences. Annu Rev Clin Psychol 8, 21–48.
  • Lew RA, Miller CJ, Kim B, Wu H, Stolzmann K, Bauer MS, 2019. A robust method to reduce imbalance for site-level randomized controlled implementation trial designs. Implement Sci 14, 46.
  • Morgan CJ, 2018. Reducing bias using propensity score matching. J Nucl Cardiol 25(2), 404–406.
  • Morton V, Torgerson DJ, 2003. Effect of regression to the mean on decision making in health care. BMJ 326(7398), 1083–1084.
  • Nahum-Shani I, Qian M, Almirall D, Pelham WE, Gnagy B, Fabiano GA, Waxmonsky JG, Yu J, Murphy SA, 2012. Experimental design and primary data analysis methods for comparing adaptive interventions. Psychol Methods 17(4), 457–477.
  • NeCamp T, Kilbourne A, Almirall D, 2017. Comparing cluster-level dynamic treatment regimens using sequential, multiple assignment, randomized trials: Regression estimation and sample size considerations. Stat Methods Med Res 26(4), 1572–1589.
  • Neumann MS, Sogolow ED, 2000. Replicating effective programs: HIV/AIDS prevention technology transfer. AIDS Educ Prev 12(5 Suppl), 35–48.
  • Pallmann P, Bedding AW, Choodari-Oskooei B, Dimairo M, Flight L, Hampson LV, Holmes J, Mander AP, Odondi L, Sydes MR, Villar SS, Wason JMS, Weir CJ, Wheeler GM, Yap C, Jaki T, 2018. Adaptive designs in clinical trials: why use them, and how to run and report them. BMC Med 16(1), 29.
  • Pape UJ, Millett C, Lee JT, Car J, Majeed A, 2013. Disentangling secular trends and policy impacts in health studies: use of interrupted time series analysis. J R Soc Med 106(4), 124–129.
  • Pellegrini CA, Hoffman SA, Collins LM, Spring B, 2014. Optimization of remotely delivered intensive lifestyle treatment for obesity using the Multiphase Optimization Strategy: Opt-IN study protocol. Contemp Clin Trials 38(2), 251–259.
  • Penfold RB, Zhang F, 2013. Use of Interrupted Time Series Analysis in Evaluating Health Care Quality Improvements. Acad Pediatr 13(6 Suppl), S38–S44.
  • Pridemore WA, Snowden AJ, 2009. Reduction in suicide mortality following a new national alcohol policy in Slovenia: an interrupted time-series analysis. Am J Public Health 99(5), 915–920.
  • Proctor E, Silmere H, Raghavan R, Hovmand P, Aarons G, Bunger A, Griffey R, Hensley M, 2011. Outcomes for implementation research: conceptual distinctions, measurement challenges, and research agenda. Adm Policy Ment Health 38(2), 65–76.
  • Robson D, Spaducci G, McNeill A, Stewart D, Craig TJK, Yates M, Szatkowski L, 2017. Effect of implementation of a smoke-free policy on physical violence in a psychiatric inpatient setting: an interrupted time series analysis. Lancet Psychiatry 4(7), 540–546.
  • Schildcrout JS, Schisterman EF, Mercaldo ND, Rathouz PJ, Heagerty PJ, 2018. Extending the Case-Control Design to Longitudinal Data: Stratified Sampling Based on Repeated Binary Outcomes. Epidemiology 29(1), 67–75.
  • Shadish WR, Cook TD, Campbell DT, 2002. Experimental and Quasi-Experimental Designs for Generalized Causal Inference. Houghton Mifflin Company, Boston, MA.
  • Simon GE, Ludman EJ, Bauer MS, Unutzer J, Operskalski B, 2006. Long-term effectiveness and cost of a systematic care program for bipolar disorder. Arch Gen Psychiatry 63(5), 500–508.
  • Stetler CB, Legro MW, Rycroft-Malone J, Bowman C, Curran G, Guihan M, Hagedorn H, Pineros S, Wallace CM, 2006. Role of “external facilitation” in implementation of research findings: a qualitative evaluation of facilitation experiences in the Veterans Health Administration. Implement Sci 1, 23.
  • Taljaard M, McKenzie JE, Ramsay CR, Grimshaw JM, 2014. The use of segmented regression in analysing interrupted time series studies: an example in pre-hospital ambulance care. Implement Sci 9, 77.
  • Wagner AK, Soumerai SB, Zhang F, Ross-Degnan D, 2002. Segmented regression analysis of interrupted time series studies in medication use research. J Clin Pharm Ther 27(4), 299–309.
  • Wagner EH, Austin BT, Von Korff M, 1996. Organizing care for patients with chronic illness. Milbank Q 74(4), 511–544.
  • Wyrick DL, Rulison KL, Fearnow-Kenney M, Milroy JJ, Collins LM, 2014. Moving beyond the treatment package approach to developing behavioral interventions: addressing questions that arose during an application of the Multiphase Optimization Strategy (MOST). Transl Behav Med 4(3), 252–259.

Enago Academy

Experimental Research Design — 6 mistakes you should never make!


From their school days onward, students perform scientific experiments whose results illustrate and test the laws and theorems of science. These experiments rest on the foundation of a sound experimental research design.

An experimental research design helps researchers execute their research objectives with more clarity and transparency.

In this article, we will not only discuss the key aspects of experimental research designs but also the issues to avoid and problems to resolve while designing your research study.


What Is Experimental Research Design?

Experimental research design is a framework of protocols and procedures for conducting experimental research scientifically, using two sets of variables: the first set is held constant and serves as the baseline against which changes in the second set are measured. Experimental methods are the classic example of quantitative research.

Experimental research helps a researcher gather the necessary data for making better research decisions and determining the facts of a research study.

When Can a Researcher Conduct Experimental Research?

A researcher can conduct experimental research in the following situations —

  • When time is an important factor in establishing a relationship between cause and effect.
  • When the behavior linking cause and effect is invariable (never-changing).
  • Finally, when the researcher wishes to understand the importance of the cause-and-effect relationship.

Importance of Experimental Research Design

To publish significant results, choosing a quality research design is the foundation on which the study is built. An effective research design helps establish sound decision-making procedures, structures the research for easier data analysis, and addresses the main research question. It is therefore essential to devote undivided attention and time to creating an experimental research design before beginning the practical experiment.

By creating a research design, a researcher also gains time to organize the research, set relevant boundaries for the study, and increase the reliability of the results. Through these efforts, one can also avoid inconclusive results. If any part of the research design is flawed, the flaw will be reflected in the quality of the results.

Types of Experimental Research Designs

Based on the methods used to collect data in experimental studies, experimental research designs fall into three primary types:

1. Pre-experimental Research Design

A pre-experimental design is used when one or more groups are observed after a presumed cause has been applied, without a randomized control group. It helps researchers decide whether the groups under observation warrant further investigation (a minimal analysis sketch for the pretest-posttest case follows the list below).

Pre-experimental research is of three types —

  • One-shot Case Study Research Design
  • One-group Pretest-posttest Research Design
  • Static-group Comparison
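For the one-group pretest-posttest design, the usual analysis is a paired comparison of scores before and after the intervention. Below is a minimal sketch in Python using SciPy; the sample size, scores, and effect are invented for illustration, not data from any real study.

```python
import numpy as np
from scipy import stats

# One-group pretest-posttest design: the SAME subjects are measured
# before and after the intervention (no control group).
rng = np.random.default_rng(42)
pretest = rng.normal(60, 10, size=30)            # hypothetical baseline scores
posttest = pretest + rng.normal(5, 8, size=30)   # assume some post-treatment shift

# Paired t-test: did scores change after the intervention?
t_stat, p_value = stats.ttest_rel(posttest, pretest)
print(f"t = {t_stat:.2f}, p = {p_value:.4f}")

# Caveat: with no control group, a significant change cannot be
# attributed to the treatment alone (maturation or history effects
# could also explain it), which is why this design is "pre-experimental".
```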

2. True Experimental Research Design

A true experimental research design relies on statistical analysis to support or refute a researcher’s hypothesis. It is one of the most rigorous forms of research because it provides direct scientific evidence, and of all the types of experimental design, only a true experiment can establish a cause-and-effect relationship. A true experiment must satisfy these three conditions —

  • There is a control group that is not subjected to change and an experimental group that experiences the changed variables
  • There is a variable that can be manipulated by the researcher
  • Subjects are randomly assigned to the control and experimental groups

This type of experimental research is commonly observed in the physical sciences.
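To make the third condition concrete, here is a minimal random-assignment sketch in Python; the subject IDs and group sizes are hypothetical.

```python
import numpy as np

# Hypothetical roster of 20 subject IDs.
rng = np.random.default_rng(7)
subjects = np.arange(1, 21)

# Random assignment: shuffle the roster, then split it evenly
# into a control group and a treatment group.
shuffled = rng.permutation(subjects)
control, treatment = shuffled[:10], shuffled[10:]
print("Control:  ", sorted(control.tolist()))
print("Treatment:", sorted(treatment.tolist()))
```

Because assignment depends only on the random shuffle, any pre-existing differences among subjects are, on average, balanced across the two groups.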

3. Quasi-experimental Research Design

The prefix “quasi” means “resembling.” A quasi-experimental design resembles a true experimental design; the difference lies in how the groups are assigned. An independent variable is still manipulated, but participants are not randomly assigned to groups. This design is used in field settings where random assignment is impractical, unethical, or impossible.

The classification of the research subjects, conditions, or groups determines the type of research design to be used.
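When groups are pre-existing rather than randomized, quasi-experimental analyses try to account for baseline differences between them. One common approach is difference-in-differences, sketched below with made-up group means; this is only one of several quasi-experimental analysis strategies, not a universal recipe.

```python
# Difference-in-differences with hypothetical mean outcomes.
# "Treated" and "comparison" sites are pre-existing groups, not randomized.
pre_treated, post_treated = 50.0, 58.0        # before/after at the treated site
pre_comparison, post_comparison = 49.0, 52.0  # before/after at the comparison site

# Subtracting the comparison site's change removes shared time trends.
effect = (post_treated - pre_treated) - (post_comparison - pre_comparison)
print(f"Estimated treatment effect: {effect:.1f}")  # 5.0 with these numbers
```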


Advantages of Experimental Research

Experimental research allows you to test your idea in a controlled environment before scaling it up (for example, to clinical trials). Moreover, it is often the best method for testing a theory, for the following reasons:

  • Researchers have firm control over the variables, which yields clearer results.
  • The method is not tied to any particular subject area; researchers in any field can apply it.
  • The results are specific.
  • After the initial analysis, findings from the same dataset can be repurposed for similar research questions.
  • Researchers can identify the cause and effect in a hypothesis and analyze this relationship in greater depth.
  • Experimental research makes an ideal starting point: the collected data can serve as a foundation for further studies and new research ideas.

6 Mistakes to Avoid While Designing Your Research

There is no order to this list, and any one of these issues can seriously compromise the quality of your research. You could refer to the list as a checklist of what to avoid while designing your research.

1. Invalid Theoretical Framework

Researchers often fail to check whether their hypothesis is logically testable. If your research design does not rest on explicit assumptions or postulates, it is fundamentally flawed and you need to rework your theoretical framework.

2. Inadequate Literature Study

Without a comprehensive literature review, it is difficult to identify and fill the knowledge and information gaps. Furthermore, you need to state clearly how your research will contribute to the field, whether by adding value to the pertinent literature or by challenging previous findings and assumptions.

3. Insufficient or Incorrect Statistical Analysis

Statistical results are among the most trusted forms of scientific evidence, and the ultimate goal of a research experiment is to obtain valid, reliable evidence. Insufficient or incorrect statistical analysis therefore undermines the quality of any quantitative research.
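One concrete safeguard is a sample-size (power) check before data collection. The sketch below uses the standard normal approximation for a two-sample comparison; the effect size, alpha, and power values are illustrative assumptions, not recommendations.

```python
from scipy import stats

# Rough per-group sample size for a two-sample comparison
# (normal approximation; illustrative inputs only).
effect_size = 0.5    # standardized difference (Cohen's d) we hope to detect
alpha, power = 0.05, 0.80

z_alpha = stats.norm.ppf(1 - alpha / 2)   # two-sided critical value
z_beta = stats.norm.ppf(power)
n_per_group = 2 * ((z_alpha + z_beta) / effect_size) ** 2
print(f"~{n_per_group:.0f} subjects per group")   # ~63 for these inputs
```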

4. Undefined Research Problem

This is one of the most basic aspects of research design. The research problem statement must be clear; to achieve that, you must set a framework for developing research questions that address the core problem.

5. Research Limitations

Every study has limitations. You should anticipate them and incorporate them into your conclusion as well as the basic research design. Include a statement in your manuscript about any perceived limitations and how you accounted for them while designing the experiment and drawing the conclusion.

6. Ethical Implications

Ethics is among the most important yet least discussed aspects of research design. Your design must minimize any risk to your participants while still addressing the research problem or question at hand. If you cannot uphold ethical norms alongside your research study, your research objectives and validity can be questioned.

Experimental Research Design Example

In an experimental design, a researcher gathers plant samples and then randomly assigns half the samples to photosynthesize in sunlight and the other half to be kept in a dark box without sunlight, while controlling all the other variables (nutrients, water, soil, etc.).

By comparing their outcomes in biochemical tests, the researcher can confirm that the changes in the plants were due to the sunlight and not the other variables.
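As a minimal sketch of that comparison, the Python snippet below simulates biochemical readings for the two plant groups and runs a two-sample t-test; the outcome variable, means, and sample sizes are invented for illustration.

```python
import numpy as np
from scipy import stats

# Hypothetical biochemical readings (e.g., chlorophyll content)
# for the randomly assigned sunlight and dark groups.
rng = np.random.default_rng(0)
sunlight = rng.normal(12.0, 1.5, size=15)
dark = rng.normal(8.0, 1.5, size=15)

# With other variables controlled and assignment randomized,
# a two-sample t-test assesses whether sunlight drove the difference.
t_stat, p_value = stats.ttest_ind(sunlight, dark)
print(f"t = {t_stat:.2f}, p = {p_value:.4g}")
```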

Experimental research is often the final stage of the research process and is considered to provide conclusive, specific results. But it is not suited to every research question: it demands substantial resources, time, and money, and it is difficult to conduct without a prior foundation of exploratory research. Even so, it is widely used in research institutes and commercial industries because, within the scientific approach, it yields the most conclusive results.

Have you worked on research designs? How was your experience creating an experimental design? What difficulties did you face? Do write to us or comment below and share your insights on experimental research designs!

Frequently Asked Questions

Why is randomization important in experimental research?

Randomization is important in experimental research because it guards against bias: it spreads pre-existing differences between subjects evenly across groups, so that an observed effect on the group of interest can be attributed to the treatment rather than to how the groups were composed.

Why is experimental research design important?

Experimental research design lays the foundation of a study and structures the research so that it supports a quality decision-making process.

What are the types of experimental research design?

There are three types of experimental research design: pre-experimental, true experimental, and quasi-experimental.

What is the difference between a true experimental and a quasi-experimental design?

The differences between a true experimental and a quasi-experimental design are: 1. In quasi-experimental research, groups are assigned non-randomly, whereas in a true experimental design assignment is random. 2. A true experiment always has a control group, whereas a control group may not always be present in quasi-experimental research.

What is the difference between experimental and descriptive research?

Experimental research establishes a cause-and-effect relationship by testing a theory or hypothesis with experimental and control groups. In contrast, descriptive research describes a study or topic by defining its variables and answering questions about them.



bioRxiv

Optimising experimental designs for model selection of ion channel drug binding mechanisms

Frankie Patten-Elliott, Chon Lok Lei, Simon P Preston, Richard D Wilkinson, Gary R Mirams

For correspondence: [email protected]

The rapid delayed rectifier current carried by the human Ether-à-go-go-Related Gene (hERG) channel is susceptible to drug-induced reduction which can lead to an increased risk of cardiac arrhythmia. Establishing the mechanism by which a specific drug compound binds to hERG can help to reduce uncertainty when quantifying pro-arrhythmic risk. In this study, we introduce a methodology for optimising experimental voltage protocols to produce data that enable different proposed models for the drug-binding mechanism to be distinguished. We demonstrate the performance of this methodology via a synthetic data study. If the underlying model of hERG current is known exactly, then the optimised protocols generated show noticeable improvements in our ability to select the true model when compared to a simple protocol used in previous studies. However, if the model is not known exactly, and we assume a discrepancy between the data-generating hERG model and the hERG model used in fitting the models, then the optimised protocols become less effective in determining the 'true' binding dynamics. While the introduced methodology shows promise, we must be careful to ensure that, if applied in a real data study, we have a well-calibrated model of hERG current gating.
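The model-selection problem the abstract describes can be illustrated, in a highly simplified form, by fitting two competing candidate models to synthetic data and comparing an information criterion. The toy sketch below is not the authors' voltage-protocol optimisation (see their repository linked below for that); the models, parameters, and noise level are all invented for illustration.

```python
import numpy as np
from scipy.optimize import curve_fit

def model_a(t, k):        # candidate A: single-rate saturation to 1
    return 1 - np.exp(-k * t)

def model_b(t, k, c):     # candidate B: adds a free saturation level c
    return c * (1 - np.exp(-k * t))

# Synthetic data generated from candidate B plus measurement noise.
rng = np.random.default_rng(1)
t = np.linspace(0, 5, 60)
data = model_b(t, 1.2, 0.8) + rng.normal(0, 0.02, t.size)

def aic(model, p0):
    """AIC under Gaussian errors: n*ln(RSS/n) + 2*(number of parameters)."""
    popt, _ = curve_fit(model, t, data, p0=p0)
    rss = np.sum((data - model(t, *popt)) ** 2)
    return t.size * np.log(rss / t.size) + 2 * len(popt)

print("AIC, model A:", round(aic(model_a, [1.0]), 1))
print("AIC, model B:", round(aic(model_b, [1.0, 1.0]), 1))  # lower AIC preferred
```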

Competing Interest Statement

The authors have declared no competing interest.

https://github.com/CardiacModelling/binding_model_OED


Subject Area

  • Pharmacology and Toxicology

Published: 19 August 2024

A safety guide for transgenic Cre drivers in metabolism

Carla Horvath (ORCID: 0000-0003-2660-4710), Christian Wolfrum (ORCID: 0000-0002-3862-6805) & Pawel Pelczar (ORCID: 0000-0003-0189-6868)

Nature Metabolism (2024)

  • Genetic engineering

Despite the high utility and widespread use of Cre driver lines, lack of Cre specificity, Cre-induced toxicity or poor experimental design can affect experimental results and conclusions. Such pitfalls must be considered before embarking on any Cre-based studies in metabolic research.



Author information

Authors and affiliations

Institute of Food, Nutrition and Health, ETH Zurich, Schwerzenbach, Switzerland

Carla Horvath & Christian Wolfrum

Center for Transgenic Models, University of Basel, Basel, Switzerland

Pawel Pelczar


Contributions

C.H., C.W. and P.P. conceptualized, wrote, edited and revised the manuscript. C.H. and P.P. designed the workflow with advice from C.W.

Corresponding authors

Correspondence to Carla Horvath , Christian Wolfrum or Pawel Pelczar .

Ethics declarations

Competing interests.

The authors declare no competing interests.


About this article

Cite this article.

Horvath, C., Wolfrum, C. & Pelczar, P. A safety guide for transgenic Cre drivers in metabolism. Nat Metab (2024). https://doi.org/10.1038/s42255-024-01087-8

Published: 19 August 2024

DOI: https://doi.org/10.1038/s42255-024-01087-8


Different types of textual cues in educational animations: Effect on science learning outcomes, cognitive load, and self-efficacy among elementary students

  • Published: 21 August 2024


Lei Du (ORCID: 0000-0002-2578-5452), Xiaoyu Tang & Jingying Wang

Educational animation, recognized for its potential accessibility and engaging qualities, has become increasingly prevalent in classroom instruction. However, not all educational animations exhibit high quality or significantly enhance learning outcomes. This study addresses the need for optimizing educational animation design to enhance student learning outcomes and experiences, employing the construction-integration model. We developed three types of educational animations: subtitled textual cue (STC), keyword textual cue (KTC), and structured textual cue (CTC). Through a quasi-experimental research design, 257 fifth-grade students were assigned to three groups, each exposed to one type of textual cue. The results indicate that CTC leads to superior achievement, knowledge retention, higher self-efficacy, and the lowest cognitive load. In comparison, KTC demonstrates moderate results, while STC yields the poorest outcomes. Furthermore, there is a significant negative correlation between achievement and cognitive load, and a significant positive correlation between achievement and self-efficacy. Additionally, there is a significant positive correlation between the "faded effect" of knowledge retention and self-efficacy. These findings highlight the superior learning outcomes and experiences associated with CTC. Based on these findings, recommendations are provided for future educational animation design and instructional practices.
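As a hedged illustration of the kind of three-group comparison the abstract describes (not the study's actual analysis or data), a one-way ANOVA across the three cue conditions could look like this in Python; the group sizes merely sum to the reported 257 students, and all scores are invented.

```python
import numpy as np
from scipy import stats

# Hypothetical achievement scores for the three textual-cue groups
# (85 + 86 + 86 = 257 students, matching the reported total; values invented).
rng = np.random.default_rng(3)
stc = rng.normal(70, 10, size=85)   # subtitled textual cue
ktc = rng.normal(74, 10, size=86)   # keyword textual cue
ctc = rng.normal(79, 10, size=86)   # structured textual cue

# One-way ANOVA: do mean scores differ across the three conditions?
f_stat, p_value = stats.f_oneway(stc, ktc, ctc)
print(f"F = {f_stat:.2f}, p = {p_value:.4g}")
```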


Data availability

The datasets used and analyzed in the current study are available on reasonable request.


Funding

This work was supported by “Big Data Education Evaluation Research”, the priority topic of the 14th Five-Year Plan for Education Science of Beijing, China in 2022 (No. CDEA22008).

Author information

Authors and affiliations

Faculty of Education, Beijing Normal University, Beijing, P. R. China

Lei Du, Xiaoyu Tang & Jingying Wang


Corresponding author

Correspondence to Jingying Wang .

Ethics declarations

Ethical approval

All procedures performed in studies involving human participants followed the ethical standards of the institutional and national research committees. Participation in this study was voluntary. All participants were informed about the study and signed a free informed consent.

Conflict of Interest

There is no potential conflict of interest in this study.

Competing interests

The authors declare that they have no competing interests.


About this article

Du, L., Tang, X. & Wang, J. Different types of textual cues in educational animations: Effect on science learning outcomes, cognitive load, and self-efficacy among elementary students. Educ Inf Technol (2024). https://doi.org/10.1007/s10639-024-12929-z

Received: 03 December 2023

Accepted: 25 July 2024

Published: 21 August 2024

DOI: https://doi.org/10.1007/s10639-024-12929-z


  • Educational animation
  • Self-efficacy
  • Cognitive load
  • Learning outcomes
  • Multimedia learning
