
Confounding Variables | Definition, Examples & Controls

Published on May 29, 2020 by Lauren Thomas. Revised on June 22, 2023.

In research that investigates a potential cause-and-effect relationship, a confounding variable is an unmeasured third variable that influences both the supposed cause and the supposed effect.

It’s important to consider potential confounding variables and account for them in your research design to ensure your results are valid. Left unchecked, confounding variables can introduce many research biases to your work, causing you to misinterpret your results.

Table of contents

  • What is a confounding variable?
  • Why confounding variables matter
  • How to reduce the impact of confounding variables
  • Other interesting articles
  • Frequently asked questions about confounding variables

What is a confounding variable?

Confounding variables (a.k.a. confounders or confounding factors) are a type of extraneous variable related to a study’s independent and dependent variables. A variable must meet two conditions to be a confounder:

  • It must be correlated with the independent variable. This may be a causal relationship, but it does not have to be.
  • It must be causally related to the dependent variable.


Why confounding variables matter

To ensure the internal validity of your research, you must account for confounding variables. If you fail to do so, your results may not reflect the actual relationship between the variables that you are interested in, biasing your results.

For instance, you may find a cause-and-effect relationship that does not actually exist, because the effect you measure is caused by the confounding variable (and not by your independent variable). This can lead to omitted variable bias or placebo effects, among other biases.

Even if you correctly identify a cause-and-effect relationship, confounding variables can result in over- or underestimating the impact of your independent variable on your dependent variable.

How to reduce the impact of confounding variables

There are several methods of accounting for confounding variables. You can use the following methods when studying any type of subject—humans, animals, plants, chemicals, etc. Each method has its own advantages and disadvantages.

Restriction

In this method, you restrict your treatment group by only including subjects with the same values of potential confounding factors.

Since these values do not differ among the subjects of your study, they cannot correlate with your independent variable and thus cannot confound the cause-and-effect relationship you are studying.

  • Advantage: relatively easy to implement
  • Disadvantage: restricts your sample a great deal
  • Disadvantage: you might fail to consider other potential confounders

Matching

In this method, you select a comparison group that matches the treatment group. Each member of the comparison group should have a counterpart in the treatment group with the same values of potential confounders, but different independent variable values.

This allows you to eliminate the possibility that differences in confounding variables cause the variation in outcomes between the treatment and comparison group. If you have accounted for any potential confounders, you can thus conclude that the difference in the independent variable must be the cause of the variation in the dependent variable.

  • Advantage: allows you to include more subjects than restriction
  • Disadvantage: can prove difficult to implement, since you need pairs of subjects that match on every potential confounding variable
  • Disadvantage: other variables that you cannot match on might also be confounding variables
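
As a rough sketch of the pairing logic (not a prescribed procedure; the column names and values below are invented for illustration), one-to-one exact matching might look like this in Python:

```python
# A rough sketch of one-to-one exact matching on potential confounders.
# All column names and values here are invented for illustration.
import pandas as pd

df = pd.DataFrame({
    "treated": [1, 1, 0, 0, 0],
    "age":     [30, 45, 30, 45, 52],
    "sex":     ["F", "M", "F", "M", "M"],
    "outcome": [5.1, 6.0, 4.2, 5.5, 5.9],
})

treated = df[df["treated"] == 1]
controls = df[df["treated"] == 0]

pairs, used = [], set()
for _, t in treated.iterrows():
    # candidates: not-yet-matched controls with identical confounder values
    match = controls[(controls["age"] == t["age"])
                     & (controls["sex"] == t["sex"])
                     & (~controls.index.isin(used))]
    if not match.empty:
        c = match.iloc[0]
        used.add(c.name)
        pairs.append(t["outcome"] - c["outcome"])

print(sum(pairs) / len(pairs))  # average treated-minus-matched-control difference
```

Real studies usually match on more variables, or use nearest-neighbor matching when exact matches are scarce, but the idea is the same: every comparison happens within a pair that shares the same confounder values.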

Statistical control

If you have already collected the data, you can include the possible confounders as control variables in your regression models; in this way, you will control for the impact of the confounding variable.

Any effect that the potential confounding variable has on the dependent variable will show up in the results of the regression and allow you to separate the impact of the independent variable.

  • Advantage: easy to implement
  • Advantage: can be performed after data collection
  • Disadvantage: you can only control for variables that you observe directly; other confounding variables you have not accounted for might remain
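
As an illustrative sketch of statistical control (the variable names and simulated data below are invented), adding the confounder to the model formula is all it takes in Python with statsmodels:

```python
# Illustrative sketch: adding a measured confounder as a control variable.
# The variable names and simulated data are invented for this example.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)
n = 500
confounder = rng.normal(size=n)
treatment = 0.7 * confounder + rng.normal(size=n)   # correlated with the confounder
outcome = 2.0 * treatment + 1.5 * confounder + rng.normal(size=n)
df = pd.DataFrame({"treatment": treatment,
                   "confounder": confounder,
                   "outcome": outcome})

naive = smf.ols("outcome ~ treatment", data=df).fit()
adjusted = smf.ols("outcome ~ treatment + confounder", data=df).fit()

print(naive.params["treatment"])     # biased upward by the confounder
print(adjusted.params["treatment"])  # close to the true effect of 2.0
```

The coefficient in the adjusted model is the treatment effect after the confounder’s influence has been separated out.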

Randomization

Another way to minimize the impact of confounding variables is to randomize the values of your independent variable. For instance, if some of your participants are assigned to a treatment group while others are in a control group , you can randomly assign participants to each group.

Randomization ensures that with a sufficiently large sample, all potential confounding variables—even those you cannot directly observe in your study—will have the same average value between different groups. Since these variables do not differ by group assignment, they cannot correlate with your independent variable and thus cannot confound your study.

Since this method allows you to account for all potential confounding variables, which is nearly impossible to do otherwise, it is often considered to be the best way to reduce the impact of confounding variables.

  • Advantage: allows you to account for all possible confounding variables, including ones that you may not observe directly
  • Advantage: considered the best method for minimizing the impact of confounding variables
  • Disadvantage: most difficult to carry out
  • Disadvantage: must be implemented prior to beginning data collection
  • Disadvantage: you must ensure that only those in the treatment (and not control) group receive the treatment
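
A minimal sketch of random assignment (the participant IDs are hypothetical):

```python
# Minimal sketch of random assignment; participant IDs are hypothetical.
import random

random.seed(42)                      # only so the example is reproducible
participants = list(range(1, 101))   # 100 subject IDs
random.shuffle(participants)

treatment_group = participants[:50]
control_group = participants[50:]
```

With a sufficiently large sample, this shuffle balances every potential confounder, measured or not, across the groups on average.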

Other interesting articles

If you want to know more about statistics, methodology, or research bias, make sure to check out some of our other articles with explanations and examples.

  • Normal distribution
  • Degrees of freedom
  • Null hypothesis
  • Discourse analysis
  • Control groups
  • Mixed methods research
  • Non-probability sampling
  • Quantitative research
  • Ecological validity

Research bias

  • Rosenthal effect
  • Implicit bias
  • Cognitive bias
  • Selection bias
  • Negativity bias
  • Status quo bias

Frequently asked questions about confounding variables

A confounding variable, also called a confounder or confounding factor, is a third variable in a study examining a potential cause-and-effect relationship.

A confounding variable is related to both the supposed cause and the supposed effect of the study. It can be difficult to separate the true effect of the independent variable from the effect of the confounding variable.

In your research design, it’s important to identify potential confounding variables and plan how you will reduce their impact.

A confounding variable is closely related to both the independent and dependent variables in a study. An independent variable represents the supposed cause, while the dependent variable is the supposed effect. A confounding variable is a third variable that influences both the independent and dependent variables.

Failing to account for confounding variables can cause you to wrongly estimate the relationship between your independent and dependent variables.

An extraneous variable is any variable that you’re not investigating that can potentially affect the dependent variable of your research study.

A confounding variable is a type of extraneous variable that not only affects the dependent variable, but is also related to the independent variable.

To ensure the internal validity of your research, you must consider the impact of confounding variables. If you fail to account for them, you might over- or underestimate the causal relationship between your independent and dependent variables, or even find a causal relationship where none exists.

There are several methods you can use to decrease the impact of confounding variables on your research: restriction, matching, statistical control, and randomization.

In restriction, you restrict your sample by only including certain subjects that have the same values of potential confounding variables.

In matching, you match each of the subjects in your treatment group with a counterpart in the comparison group. The matched subjects have the same values on any potential confounding variables, and only differ in the independent variable.

In statistical control, you include potential confounders as variables in your regression.

In randomization, you randomly assign the treatment (or independent variable) in your study to a sufficiently large number of subjects, which allows you to control for all potential confounding variables.

Cite this Scribbr article

If you want to cite this source, you can copy and paste the citation below.

Thomas, L. (2023, June 22). Confounding Variables | Definition, Examples & Controls. Scribbr. Retrieved August 5, 2024, from https://www.scribbr.com/methodology/confounding-variables/


Confounding Variables in Psychology: Definition & Examples

By Julia Simkus, BA (Hons) Psychology, Princeton University; edited by Saul Mcleod, PhD, and Olivia Guy-Evans, MSc.

A confounding variable is an unmeasured third variable that influences, or “confounds,” the relationship between an independent and a dependent variable, creating the appearance of a spurious correlation.

Confounding Variables in Research

Due to the presence of confounding variables in research, we should never assume that a correlation between two variables implies causation.

When an extraneous variable has not been properly controlled and interferes with the dependent variable (i.e., results), it is called a confounding variable.


For example, suppose there is an association between an independent variable (IV) and a dependent variable (DV), but that association exists because both variables are affected by a third variable (C). The association between the IV and DV is then spurious.

Variable C would be considered the confounding variable in this example. We would say that the IV and DV are confounded by C whenever C causally influences both the IV and the DV.

In order to accurately estimate the effect of the IV on the DV, the researcher must reduce the effects of C.

If you identify a causal relationship between the independent variable and the dependent variable, that relationship might not actually exist because it could be affected by the presence of a confounding variable.

Even if the cause-and-effect relationship does exist, an uncontrolled confounding variable can still cause you to overestimate or underestimate the impact of the independent variable on the dependent variable.

Reducing Confounding Variables

It is important to identify all possible confounding variables and consider their impact on your research design in order to ensure the internal validity of your results.

Here are some techniques to reduce the effects of these confounding variables:
  • Random allocation: Randomization will help eliminate the impact of confounding variables. You can randomly assign half of your subjects to a treatment group and the other half to a control group. This ensures that confounders have the same effect on both groups, so they cannot correlate with your independent variable.
  • Control variables: This involves restricting the treatment group to only include subjects with the same values of potential confounding factors. For example, you can restrict your subject pool by age, sex, demographics, level of education, or weight to ensure that these variables are the same among all subjects and thus cannot confound the cause-and-effect relationship at hand.
  • Within-subjects design: In a within-subjects design, all participants participate in every condition.
  • Case-control studies: Case-control studies assign confounders to both groups (the experimental group and the control group) equally.

Suppose we wanted to measure the effects of caloric intake (IV) on weight (DV). We would have to try to ensure that confounding variables did not affect the results. These variables could include the following:

  • Metabolic rate: If you have a faster metabolism, you tend to burn calories more quickly.
  • Age: Age can affect weight gain differently, as younger individuals tend to burn calories more quickly than older individuals.
  • Physical activity: Those who exercise or are more active will burn more calories and could weigh less, even if they consume more.
  • Height: Taller individuals tend to need to consume more calories in order to gain weight.
  • Sex: Men and women have different caloric needs to maintain a certain weight.

Frequently asked questions

1. What is a confounding variable in psychology?

A confounding variable in psychology is an extraneous factor that interferes with the relationship between an experiment’s independent and dependent variables. It’s not the variable of interest but can influence the outcome, leading to inaccurate conclusions about the relationship being studied.

For instance, if studying the impact of studying time on test scores, a confounding variable might be a student’s inherent aptitude or previous knowledge.

2. What is the difference between an extraneous variable and a confounding variable?

A confounding variable is a type of extraneous variable. Confounding variables affect both the independent and dependent variables. They influence the dependent variable directly and either correlate with or causally affect the independent variable.

An extraneous variable is any variable that you are not investigating that can influence the dependent variable.

3. What is Confounding Bias?

Confounding bias is a bias that is the result of having confounding variables in your study design. If the observed association overestimates the effect of the independent variable on the dependent variable, this is known as a positive confounding bias.

If the observed association underestimates the effect of the independent variable on the dependent variable, this is known as a negative confounding bias.



Statistics By Jim


Confounding Variable: Definition & Examples

By Jim Frost

Confounding Variable Definition

In studies examining possible causal links, a confounding variable is an unaccounted-for factor that impacts both the potential cause and the potential effect and can distort the results. Recognizing and addressing these variables in your experimental design is crucial for producing valid findings. Statisticians also refer to confounding variables that cause bias as confounders, omitted variables, and lurking variables.

[Diagram: how confounding works]

A confounding variable systematically influences both an independent and dependent variable in a manner that changes the apparent relationship between them. Failing to account for a confounding variable can bias your results, leading to erroneous interpretations. This bias can produce the following problems:

  • Overestimate the strength of an effect.
  • Underestimate the strength of an effect.
  • Change the direction of an effect.
  • Mask an effect that actually exists.
  • Create Spurious Correlations .

Additionally, confounding variables reduce an experiment’s internal validity, thereby reducing its ability to make causal inferences about treatment effects. You don’t want any of these problems!

In this post, you’ll learn about confounding variables, the problems they cause, and how to minimize their effects. I’ll provide plenty of examples along the way!

What is a Confounding Variable?

Confounding variables bias the results when researchers don’t account for them. How can variables you don’t measure affect the results for variables that you record? At first glance, this problem might not make sense.

Confounding variables influence both the independent and dependent variable, distorting the observed relationship between them. To be a confounding variable, the following two conditions must exist:

  • It must correlate with the dependent variable.
  • It must correlate with at least one independent variable in the experiment.

The diagram below illustrates these two conditions. There must be non-zero correlations (r) on all three sides of the triangle. X1 is the independent variable of interest while Y is the dependent variable. X2 is the confounding variable.

[Diagram: the conditions for confounding variables to produce bias]

The correlation structure can cause confounding variables to bias the results that appear in your statistical output. In short, the amount of bias depends on the strength of these correlations. Strong correlations produce greater bias. If the relationships are weak, the bias might not be severe. If any of the correlations are zero, the extraneous variable won’t produce bias even if the researchers don’t control for it.

Leaving a confounding variable out of a regression model can produce omitted variable bias.

Confounding Variable Examples

Exercise and Weight Loss

In a study examining the relationship between regular exercise and weight loss, diet is a confounding variable. People who exercise are likely to have other healthy habits that affect weight loss, such as diet. Without controlling for dietary habits, it’s unclear whether weight loss is due to exercise, changes in diet, or both.

Education and Income Level

When researching the correlation between the level of education and income, geographic location can be a confounding variable. Different regions may have varying economic opportunities, influencing income levels irrespective of education. Without controlling for location, you can’t be sure if education or location is driving income.

Exercise and Bone Density

I used to work in a biomechanics lab. For a bone density study, we measured various characteristics including the subjects’ activity levels, their weights, and bone densities among many others. Bone growth theories suggest that a positive correlation between activity level and bone density likely exists. Higher activity should produce greater bone density.

Early in the study, I wanted to validate our initial data quickly by using simple regression analysis to assess the relationship between activity and bone density. There should be a positive relationship. To my great surprise, there was no relationship at all!

Long story short, a confounding variable was hiding a significant positive correlation between activity and bone density. The offending variable was the subjects’ weight, which correlates with both the independent variable (activity) and the dependent variable (bone density), thus allowing it to bias the results.

After including weight in the regression model, the results indicated that both activity and weight are statistically significant and positively correlate with bone density. Accounting for the confounding variable revealed the true relationship!

The diagram below shows the signs of the correlations between the variables. In the next section, I’ll explain how the confounder (Weight) hid the true relationship.

[Diagram of the bone density model]

Related post: Identifying Independent and Dependent Variables

How the Confounder Hid the Relationship

The diagram for the Activity and Bone Density study indicates the conditions exist for the confounding variable (Weight) to bias the results because all three sides of the triangle have non-zero correlations. Let’s find out how leaving the confounding variable of weight out of the model masked the relationship between activity and bone density.

The correlation structure produces two opposing effects of activity. More active subjects get a bone density boost directly. However, they also tend to weigh less, which reduces bone density.

When I fit a regression model with only activity, the model had to attribute both opposing effects to activity alone. Hence, the zero correlation. However, when I fit the model with both activity and weight, it could assign the opposing effects to each variable separately.

Now imagine if we didn’t have the weight data. We wouldn’t have discovered the positive correlation between activity and bone density. Hence, the example shows the importance of controlling for confounding variables. Which leads to the next section!
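
The numbers below are invented, but they reproduce the structure of this story: activity raises bone density directly, more active subjects weigh less, and lower weight reduces bone density. A small simulation with statsmodels shows the simple model reporting roughly no effect while the full model recovers both positive coefficients:

```python
# Invented numbers that mimic the structure of the bone density study:
# activity has a direct +1 effect on density, but it also lowers weight,
# and weight has its own +1 effect, so the two paths cancel out.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(1)
n = 1000
activity = rng.normal(size=n)
weight = -1.0 * activity + rng.normal(size=n)    # more active -> lower weight
density = activity + weight + rng.normal(size=n)

df = pd.DataFrame({"activity": activity, "weight": weight, "density": density})

simple = smf.ols("density ~ activity", data=df).fit()
full = smf.ols("density ~ activity + weight", data=df).fit()

print(simple.params["activity"])            # near 0: the effect is masked
print(full.params[["activity", "weight"]])  # both near +1
```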

Reducing the Effect of Confounding Variables

As you saw above, accounting for the influence of confounding variables is essential to ensure your findings’ validity. Here are four methods to reduce their effects.

Restriction

Restriction involves limiting the study population to a specific group or criteria to eliminate confounding variables.

For example, in a study on the effects of caffeine on heart rate, researchers might restrict participants to non-smokers. This restriction eliminates smoking as a confounder that can influence heart rate.
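
As a quick sketch of what that restriction looks like once the data are in hand (the DataFrame and column names here are hypothetical):

```python
# Sketch of restriction with pandas; the DataFrame and columns are hypothetical.
import pandas as pd

df = pd.DataFrame({
    "smoker":      [0, 1, 0, 0, 1],
    "caffeine_mg": [80, 120, 0, 200, 95],
    "heart_rate":  [68, 75, 62, 81, 77],
})

non_smokers = df[df["smoker"] == 0]   # analyze only the restricted subgroup
print(non_smokers)
```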

Matching

This process involves pairing subjects by matching characteristics pertinent to the study. Then, researchers randomly assign one individual from each pair to the control group and the other to the experimental group. This randomness helps eliminate bias, ensuring a balanced and fair comparison between groups. The process controls confounding variables by equalizing them between groups. The goal is to create groups that are as similar as possible except for the experimental treatment.

For example, in a study examining the impact of a new education method on student performance, researchers match students on age, socioeconomic status, and baseline academic performance to control these potential confounders.

Learn more about Matched Pairs Design: Use & Examples.

Random Assignment

Randomly assigning subjects to the control and treatment groups helps ensure that the groups are statistically similar, minimizing the influence of confounding variables.

For example, in clinical trials for a new medication, participants are randomly assigned to either the treatment or control group. This random assignment helps evenly distribute variables such as age, gender, and health status across both groups.

Learn more about Random Assignment in Experiments.

Statistical Control

Statistical control involves using analytical techniques to adjust for the effect of confounding variables in the analysis phase. Researchers can use methods like regression analysis to control potential confounders.

For example, I showed you how I controlled for weight as a confounding variable in the bone density study. Including weight in the regression model revealed the genuine relationship between activity and bone density.

Learn more about controlling confounders by using regression analysis.

By incorporating these strategies into research design and analysis, researchers can significantly reduce the impact of confounding variables, leading to more accurate results.

If you aren’t careful, the hidden hazards of a confounding variable can completely flip the results of your experiment!



Reader Interactions


January 15, 2024 at 10:02 am

To address this potential problem, I collect all the possible variables and create a correlation matrix to identify all the correlations, their direction, and their statistical significance, before running the regression.


January 15, 2024 at 2:54 pm

That’s a great practice for understanding the underlying correlation structure of your data. Definitely a good thing to do along with graphing the scatterplots for all those pairs because they’re good at displaying curved relationships that might not register with Pearson’s correlation.

It’s been a while since I worked on the bone density study, but I’m sure I created that correlation & scatterplot matrix to get the lay of the land.

A couple of caveats:

Those correlations are pairwise relationships, equivalent to one predictor for a response (but without the directionality). So, those correlations can be affected by a confounding variable just like a simple regression model. Going back to the example in my post, if I did a pairwise correlation between all variables, including activity and bone density, that would’ve still been essentially zero–affected by the weight confounder in the same way as the regression model. At least with a correlation matrix, you’d be able to piece together that weight was a confounder likely affecting the other correlation.

And a confounder can exist outside your dataset. You might not have even measured a confounder, so it won’t be in your correlation matrix, but it can still impact your results. Hence, it’s always good to consider variables that you didn’t record as well.

I’m guessing you know all that, I’m more spelling it out for other readers.

And if I remember correctly, your background is more with randomized experiments. The random assignment process should break any correlation between a confounder and the outcome, making it essentially zero. Consequently, randomized experiments tend to prevent confounding variables from affecting the results.
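
A minimal sketch of that correlation-and-scatterplot check (with made-up data standing in for a real dataset):

```python
# Hypothetical data; pandas and matplotlib assumed.
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

rng = np.random.default_rng(7)
df = pd.DataFrame(rng.normal(size=(100, 3)), columns=["x1", "x2", "y"])

print(df.corr())                  # pairwise correlations and their directions
pd.plotting.scatter_matrix(df)    # scatterplots catch curved relationships
plt.show()
```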


July 17, 2023 at 11:11 am

Hi Jim, In multivariate regression, I have always removed variables that aren’t significant. However, recently a reviewer said that this approach is unjustified. Is there a consensus about this? A reference article? Thanks, Ray

July 17, 2023 at 4:52 pm

Hi Raymond,

I don’t have an article handy to refer you to. But based on what happens to models when you retain and exclude variables, I recommend the following approach.

Deciding whether to eliminate an insignificant independent variable from a regression model requires a thorough understanding of the theoretical implications related to that variable. If there’s strong theoretical justification for its inclusion, it might be advisable to keep it within the model, despite its insignificance.

Maintaining an insignificant variable in the model does not typically degrade its overall performance. On the contrary, removing a theoretically justified but insignificant variable can lead to biased outcomes for the remaining independent variables, a situation known as omitted variable bias. Therefore, it can be beneficial to retain an insignificant variable within the model.

It’s vital to consider two major aspects when making this decision. Firstly, whether there’s strong theoretical support for retaining the insignificant variable, and secondly, whether excluding it has a significant impact on the coefficient estimates of the remaining variables. In short, if you remove an insignificant variable and the other coefficients change, you need to assess the situation.

If there are no theoretical reasons to retain an insignificant variable and removing it doesn’t appear to bias the result, then you probably should remove it because it might increase the precision of your model somewhat.

Consequently, I advise “considering” the removal of insignificant independent variables from the model, instead of asserting that you “should” remove them, as this decision depends on the aforementioned factors and is not a hard-and-fast rule. Of course, when you do the write-up, explain your reasoning for including insignificant variables along with everything else.


January 16, 2023 at 5:31 pm

Thank you very much! That helped a lot.

January 15, 2023 at 9:12 am

Thank you for the interesting post. I would like to ask a question because I think that I am very much stuck in a discipline mismatch. I come from economics, but I am now working in the social sciences field.

You describe the conditions for confounding bias: 1) there is a correlation between x1 and x2 (the OVB), 2) x1 associates with y, 3) x2 associates with y. I interpret 1) as meaning that sometimes x1 may determine x2, or the contrary.

However, I recently read a social statistics paper in which they define confounding bias differently. 2) and 3) still hold, but 1) says that x2 –> x1, not the contrary. So, the direction of the relationship cannot go the other way around. Otherwise, that would be mediation…

I am a bit confused and think that this could be due to the different disciplines but I would be interested in knowing what you think.

Thank you. Best, Vero

January 16, 2023 at 12:56 am

Hi Veronica,

Some of your notation looks garbled in the comment, but I think I get the gist of your question. Unfortunately, the comments section doesn’t handle formatting well!

So, X1 and X2 are explanatory variables while Y is the outcome. The two X variables correlate with each other and the Y variable. In this scenario, yes, if you exclude X2, it will cause some degree of omitted variable bias. It is a confounding variable. The degree of bias depends on the collective strength of all three correlations.

Now, as for the question of the direction of the relationship between X1 and X2, that doesn’t matter statistically. As long as the correlation is there, the potential for confounding bias exists. This is true whether the relationship between X1 and X2 is causal in either direction or totally non-causal. It just depends on the set of correlations existing.

I think you’re correct in that this is a difference between disciplines.

The social sciences define a mediator variable as explaining the process by which two variables are related, which gets to your point about the direction of a causal relationship. When X1 –> X2, I’d say that the social sciences would call that a mediator variable AND that X2 is still a confounder that will cause bias if it is omitted from the model. Both things are true.

I hope that helps!


October 10, 2022 at 11:07 am

Thanks in advance for your awesome content.

Regarding this question brought by Lucy, I want to ask the following: If introducing variables reduces the bias (because the model controls for it), why don’t we just insert all variables at once to see the real impact of each variable?

Let’s say I have a dataset of 150 observations and I want to study the impact of 20 variables (dummies and continuous). Is it advantageous to introduce everything at once and see which variables are significant? I got the idea that introducing variables is always positive because it forces the model to show the real effects (of course, I am talking about well-grounded variables), but are there any caveats to doing so? Is it possible that some variables may in fact “hide” the significance of others because they will overshadow the other regressors? Usually it is said that, if the significance changes when introducing a variable, it was due to confounding. My question now is: is it possible that confounding was not the case and, in fact, the significance is just being hidden due to the presence of a much stronger predictor?

October 10, 2022 at 8:10 pm

In some ways, you’re correct. Generally speaking, it is better to include too many variables than too few. However, there is a cost for including more variables than necessary, particularly when they’re not significant. Adding more variables than needed increases the model’s variance, which reduces statistical power and the precision of the estimates. Ideally, you want a balance of all the necessary variables, no more, and no less. I write about this tradeoff in my post about selecting the best model. That should answer a lot of your questions.

I think the approach of starting with a model with all possible variables has merit. You can always start removing the ones that are not significant. Just do that by removing one at a time, starting with the least significant. Watch for any abrupt changes in coefficient signs and p-values as you remove each one.

As for caveats, there are rules of thumb as to how many independent variables you can include in a model based on how many observations you have. If you include too many, you can run into overfitting, which can produce whacky results. Read my post about overfitting models for information about that. So, in some cases, you just won’t be able to add all the potential variables at once, but that depends on the number of variables versus the number of observations. The overfitting post describes that.

And, to answer your last question, overfitting is another case where adding variables can change the significance that’s not due to confounding.


January 20, 2022 at 8:10 am

Thanks for the clear explanation, it was really helpful! I do have a question regarding this sentence: “The important takeaway here is that leaving out a confounding variable not only reduces the goodness-of-fit (larger residuals), but it can also bias the coefficient estimates.”

Is it always the case that leaving out a confounding variable leads to a lesser fit? I was thinking about the case of positive bias: say variables x and y are both negatively correlated with the dependent variable, but x and y are positively correlated with each other. If a high value for x is caused by a high value of y, both variables ‘convey the information’ of variable y. So adding variable x to a model wouldn’t add any additional information, and thus wouldn’t improve the fit of the model.

Am I making a mistake in my reasoning somewhere? Or does leaving out a confounding variable not lead to a worse fit in this case?

Thanks again for the article! Sterre

January 20, 2022 at 2:20 pm

Think about it this way. In general, adding an IV always causes R-squared to increase to some degree–even when it’s only a chance correlation. That still applies when you add a confounding variable. However, with a confounding variable, you know it’s an appropriate variable to add.

Yes, the correlation with the IV in the model might capture some of the confounder’s explanatory power, but you can also be sure that adding it will cause the model to fit better. And, again, it’s an entirely appropriate variable to include because of its relationship with the DV (i.e., you’re not adding it just to artificially inflate R-squared/goodness-of-fit). Additionally, unless there’s a perfect correlation between the included IV and the confounder, the included IV can’t contain all the confounder’s information. But, if there was a perfect correlation, you wouldn’t be able to add both anyway.

There are cases where you might not want to include the confounder. If you’re mainly interested in making predictions and don’t need to understand the role of each IV, you might not need to include the confounder if your model makes sufficiently precise predictions. That’s particularly true if the confounder is difficult/expensive to measure.

Alternatively, if there is a very high, but not perfect, correlation between the included IV and the confounder, adding the confounder might introduce too much multicollinearity, which causes its own problems. So, you might be willing to take the tradeoff of exchanging multicollinearity issues for omitted variable bias. However, that’s a very specific weighing of pros and cons given the relative degree of severity of both problems for your specific model. So, there’s no general advice for which way to go. It’s also important to note that there are other types of regression analysis (Ridge and LASSO) that can effectively handle multicollinearity, although at the cost of introducing a slight bias. Another possibility to balance!

But, to your main question, yes, if you add the confounder, you can expect the model fit to improve to some degree. It may or may not be an improvement that’s important in a practical sense. Even if the fit isn’t notably better, it’s often worthwhile adding the confounder to address the bias.


May 2, 2021 at 4:23 pm

Jim, this was a great article, but I do not understand the table. I am sure it is easy, and I am missing something basic. What does it mean to be included and omitted (negative correlation, etc.) in the 2-by-2 table? I cannot wrap my head around the titles and the corresponding scenarios. Thanks, John

May 3, 2021 at 9:39 pm

When I refer to “included” and “omitted,” I’m talking about whether the variable in question is an independent variable IN the model (included) or a potential independent variable that is NOT in the model (omitted). After all, we’re talking about omitted variable bias, which is the bias caused by leaving an important variable out of the model.

The table allows you to determine the direction the coefficient estimate is being biased if you can determine the direction of the correlation between several variables.

In the example, I’m looking at a model where Activity (the included IV) predicts the bone density of the individual (the DV). The omitted confounder is weight. So, now we just need to assess the relationships between those variables to determine the direction of the bias. I explain the process of using the table with this example in the paragraph below the table, so I won’t retype it here. But, if you don’t understand something I write there, PLEASE let me know and I’ll help clarify it!

In the example, Activity = Included, Weight = Omitted, and Dependent = Bone Density. I use the signs from the triangle diagram that appears a ways before the table, which lists these three variables, to determine the column and row to use.

Again, I’m not sure which part is tripping you up!


April 27, 2021 at 2:23 am

Thank you, Jim! The two groups are both people with illness, differing only because their illnesses occur at different ages. The first illness group is younger, around 30; the other is older, around 45. Overlap of ages between these groups is very minimal. By control group, I meant a third group of healthy people without illness, with ages uniformly distributed across the range represented in the two patient groups, and thus the group factor having three levels now. I was thinking this might reduce the previous problem of directly comparing the young and old patient groups, where adding age as a covariate can cause a collinearity problem.

April 28, 2021 at 10:42 pm

Ah, ok. I didn’t realize that both groups had an illness. Usually a control group won’t have a condition.

I really wouldn’t worry about the type of multicollinearity you’re referring to. You’d want to include those two groups and age plus the interaction term, which you could remove if it’s not significant. If the two groups were completely distinct in age and had a decent gap between them, there would be other model estimation problems to worry about, but that doesn’t seem to be the case. If age is a factor in this study area, you definitely don’t want to exclude it. Including it allows you to control for it. Otherwise, if you leave it out, the age effect will get rolled into the groups and, thereby, bias your results. Including age is particularly important in your case because you know the groups are unbalanced in age. You don’t want the model to attribute the difference in outcomes to the illness condition when it’s actually age that is unbalanced between those two conditions. I’d go so far as to say that your model urgently needs you to include age!

That said, I would collect a true control group that has healthy people and ideally a broad range of ages that covers both groups. That will give you several benefits. Right now, you won’t know how your illness groups compare to a healthy group. You’ll only know how they compare to each other. Having that third group will allow you to compare each illness group to the healthy group. I’m assuming that’s useful information. Plus, having a full range of ages will allow the model to produce a better estimate of the age effect.

April 26, 2021 at 6:51 am

Hi Jim, Thanks a lot for your intuitive explanations!!

I want to study the effect of two groups of patients (X1) on y (a test performance score) in a GLM framework. Age (X2) and Education (X3) are potential confounders on y.

However, it’s not possible to match these two groups for age, as the illnesses occur in different age groups: one group is younger than the other. Hence the mean ages are significantly different between these groups.

I’m afraid adding age as a covariate could potentially cause a multicollinearity problem, as age is significantly different between groups, and make the estimation of the group effect (β1) erroneous, although it might improve the model. Is recruiting a control group with an age distribution comparable to the pooled patient groups, hence with a mean age midway between the two patient groups, a good idea to improve the statistical power of the study? In this case my group factor X1 will have three levels. Can this reduce the multicollinearity problem to an extent, as the ages of patients in the two patient groups are approximately represented in the control group also? Should I add an interaction term of Age*Group in the GLM to account for the age difference between groups? Thank you in advance. -Mohan

April 26, 2021 at 11:13 pm

I’d at least try including age to see what happens. If there’s any overlap in age between the two groups, I think you’ll be ok. Even if there is no overlap, age is obviously a crucial variable. My guess would be that it’s doing more harm by excluding it from the model when it’s clearly important.

I’m a bit confused by what you’re suggesting for the control group. Isn’t one of your groups those individuals with the condition and the other without it?

It does sound possible that there would be an interaction effect in this case. I’d definitely try fitting and see what the results are! That interaction term would show whether the relationship between age and test score is different between the groups.


April 26, 2021 at 12:44 am

In the paragraph below the table, both weight and activity are referred to as included variables.

April 26, 2021 at 12:50 am

Hi Joshua, yes, you’re correct! A big thanks! I’ve corrected the text. In that example, activity is the included variable, weight is the omitted variable, and bone density is the dependent variable.


April 24, 2021 at 1:06 pm

Hi, Jim. Great article. However, is that a typo in the direction of omitted variable bias table? For the rows, it makes more sense to me if they were “correlation between dependent and omitted variables” instead of “between dependent and included variables”.

April 25, 2021 at 11:21 pm

No, that’s not a typo!


April 22, 2021 at 9:53 am

Please let me know if this summary makes sense. Again, thanks for the great posts!

Scenario 1: There are 10 IVs. They are modeled using OLS. We get the regression coefficients.

Scenario 2: One of the IVs is removed. It is not a confounder. The only impact is on the residuals (they increase). The coefficients obtained in Scenario 1 remain intact. Is that correct?

Scenario 3: The IV that was removed in Scenario 2 is placed back into the mix. This time, another IV is removed. Now this one’s a confounder. OLS modeling is re-run. There are 3 results.

1) The residuals increase — because it is correlated with the dependent variable. 2) The coefficient of the other IV, to which this removed confounder is correlated, changes. 3) The coefficients of the other IVs remain intact.

Are these 3 scenarios an accurate summary, Jim? A reply would be much appreciated!

Again, do keep up the good work.

April 25, 2021 at 11:26 pm

Yes, that all sounds right on! 🙂

April 22, 2021 at 8:37 am

Great post, Jim !

Probably a basic question, but would appreciate your answer on this, since we have encountered this in practical scenarios. Thanks in advance.

What if we know of a variable that should be included on the IV side, but we don’t have data for it? We know (from domain expertise) that it is correlated with the dependent variable, but it is not correlated with any of the IVs… In other words, it is not a confounding variable in the strictest sense of the term (since it is not correlated with any of the IVs).

How do we account for such variables?

Here again, would the solution be to use proxy variables? In other words, can we consider proxy variables to be a workaround for not just confounders, but also non-confounders of the above type?

Thanks again !

April 23, 2021 at 11:20 pm

I discuss several methods in this article. The one I’d recommend if at all possible is identifying a proxy variable that stands in for the important variable that you don’t have. It sounds like in your case it’s not a confounder. So, it’s probably not biasing your other coefficients. However, your model is missing important information. You might be able to improve the precision using a proxy variable.


March 19, 2021 at 10:45 am

Hi Jim, that article is helping me a lot during my research project, thank you for that! However, there is one question for which I couldn’t find a satisfactory answer on the internet, so I hope that maybe you can shed some light on this: In my panel regression, my main independent variable is “Policy Uncertainty”, which captures uncertainty related to the possible impact of future government policies. It is based on an index that has a mean of 100. My dependent variable is whether a firm has received funding in quarter t (Yes = 1, No = 0); thus I want to estimate the impact of policy uncertainty on the likelihood of receiving external funding. In my baseline regression, the coefficient on policy uncertainty is insignificant, suggesting that policy uncertainty has no impact. When I now add a proxy for uncertainty related to financial markets (e.g., implied stock market volatility), policy uncertainty becomes significant at the 1% level and the market uncertainty proxy is statistically significant at the 1% level too! The correlation between both is rather low, 0.2. Furthermore, both have opposite signs (policy uncertainty is positively associated with the likelihood of receiving funding); additionally, the magnitude of the coefficients is comparable.

Now I am wondering what this tells me… Did the variable on policy uncertainty previously capture the effect of market uncertainty before including the latter in the regression? Would be great if you could help 🙂

March 19, 2021 at 2:56 pm

Thanks for writing with the interesting questions!

First, I’ll assume you’re using binary logistic regression because you have a binary dependent variable. For logistic regression, you don’t interpret the coefficients the same way as you do for, say, least squares regression. Typically, you’ll assess the odds ratio to understand an IV’s relationship to the binary DV.

On to your example. It’s entirely possible that leaving out market uncertainty was causing omitted variable bias in the policy uncertainty coefficient. That might be what is happening. But the positive sign of one and the negative sign of the other could be cancelling each other out when you only include the one. That is what happens in the example I use in this post. However, for that type of bias/confounding, you’d expect there to be a correlation between the two IVs, and you say it is low.

Another possibility is the fact that, for each variable in a model, the significance refers to the Adj SS for the variable, which factors in all the other variables before entering the variable in question. So, policy uncertainty in the model with market volatility is significant after accounting for the variance that the other variables explain, including market volatility. For the model without market volatility, policy uncertainty is not significant in that different pool of remaining variability. Given the low correlation (0.2) between those two IVs, I’d lean towards this explanation. If there were a stronger correlation between policy and market uncertainty, I’d lean towards omitted variable bias.

Also be sure that your model doesn’t have any other types of problems, such as overfitting or patterns in the residual plots. Those can cause weird things to happen with the coefficients.

It can be unnerving when the significance of one variable depends entirely on the presence of another variable. It makes choosing the correct model difficult! I’d let theory be your guide. I write about that towards the end of my post about selecting the correct regression model. That’s written in the context of least squares regression, but the same ideas about theory and other research apply here.

You should definitely investigate this mystery further!


February 11, 2021 at 12:31 am

Thank you for this blog. I have a question: If two independent variables are correlated, can we not convert one into the other and substitute that in the model? For example, if Y = X1 + X2, and X2 = -0.5X1, then Y = 0.5X1. However, I don’t see that as a suggestion in the blog. The blog mentions that activity is related to weight, but then somehow both are finally included in the model, rather than replacing one with the other in the model. Will this not help with multicollinearity, too? I am sure I am missing something here that you can see, but I am unable to find it out. Can you please help?

Regards, Kushal Jain

February 11, 2021 at 4:45 pm

Why would you want to convert one to another? Typically, you want to understand the relationship between each independent variable and the dependent variable. In the model I talk about, I’d want to know the relationship between both activity and weight with bone density. Converting activity to weight does not help with that.

And, I’m not understanding what you mean by “then somehow both are finally included in the model.” You just include both variables in the model the normal way.

There’s no benefit to converting the variables as you describe and there are reasons not to do that!


November 25, 2020 at 2:22 pm

Hi Jim, I have been trying to figure out covariates for a study we are doing for some time. My colleague believes that if two covariates have a high correlation (>20%), then one should be removed from the model. I’m assuming this is true unless both are correlated with the dependent variable, per your discussion above? Also, what do you think about selecting covariates by using the 10% change method? Any thoughts would be helpful. We’ve had a heck of a time selecting covariates for this study. Thanks, Erin

November 27, 2020 at 2:06 am

It’s usually OK to have covariates that have a correlation greater than 20%. The exact value depends on the number of covariates and the strength of their correlations, but 20% is low and almost never a problem. When covariates are correlated, it’s known as multicollinearity. And there’s a special measure, known as the VIF, that determines whether you have an excessive amount of correlation amongst your covariates. I have a post that discusses multicollinearity and how to detect and correct it.

I have not used the 10% change method myself. However, I would suggest using that method only as one point of information. I’d really place more emphasis on theory and understanding the subject area. However, observing how much a covariate changes can provide useful information about whether bias is a problem or not. In general, if you’re uncertain, I’d err on the side of unnecessarily including a covariate rather than leaving it out. There are usually fewer problems associated with having an additional variable than omitting one. However, keep an eye on the VIFs as you do that. And having a number of unnecessary variables could lead to problems if taken to an extreme or if you have a really small sample size.

I wrote a post about model selection, in which I give some practical tips. Overall, I suggest using a mix of theory, subject area knowledge, and statistical approaches. I’d suggest reading that. It’s not specifically about controlling for confounders, but the same principles apply. Also, I’d highly recommend reading about what researchers performing similar studies have done if that’s at all possible. They might have already addressed that issue!


November 5, 2020 at 6:29 am

Hi Jim, I’m not sure whether my problem fits under this category or not, so apologies if not. I am looking at whether an inflammatory biomarker (independent variable) correlates with a measure of cognitive function (dependent variable). It does if it’s just a simple linear regression; however, the biomarker (independent variable) is affected by age, sex, and whether you’re a smoker or not. Correcting for these 3 covariables in the model shows that actually there is no correlation between the biomarker and cognitive function. I assume this was the correct thing to do but wanted to make sure, seeing as a) none of the 3 covariables correlate with/predict my dependent variable, and b) as age correlates highly with the biomarker, does this not introduce collinearity? Thanks! Charlotte

November 6, 2020 at 9:46 pm

Hi Charlotte,

Yes, it sounds like you did the right thing. Including the other variables in the model allows the model to control for them.

The collinearity (aka multicollinearity, or correlation between independent variables) between age and the biomarker is a potential concern. However, a little correlation, or a moderate amount of correlation, is fine. What you really need to do is to assess the VIFs for your independent variables. I discuss VIFs and multicollinearity in my post about multicollinearity. So, your next step should be to determine whether you have problematic levels of multicollinearity.

One symptom of multicollinearity is a lack of statistical significance, which your model is experiencing. So, it would be good to check.

Actually, I’m noticing that at least several of your independent variables are binary. Smoker. Gender. Is the biomarker also binary? Present or not present? If so, that doesn’t change the rationale for including the other variables in the model, but it does mean VIFs won’t detect the multicollinearity.


October 28, 2020 at 9:33 pm

Thanks for the clarification, Jim. Best regards.

October 24, 2020 at 11:30 pm

I think the section on “Predicting the Direction of Omitted Variable Bias” has a typo on the first column, first two rows. It should state:

*Omitted* and Dependent: Negative Correlation

*Omitted* and Dependent: Positive Correlation

This makes it consistent with the required two conditions for Omitted Variable Bias to occurs:

The *omitted* variable must correlate with the dependent variable. The omitted variable must correlate with at least one independent variable that is in the regression model.

October 25, 2020 at 12:24 am

Hi Humberto,

Thanks for the close reading of my article! The table is correct as it is, but you are also correct. Let’s see why!

There are the following two requirements for omitted variable bias to exist: *The omitted variable must correlate with an IV in the model. *That IV must correlate with the DV.

The table accurately depicts both those conditions. The columns indicate the relationship between the IV (included) and omitted variable. The rows indicate the nature of the relationship between the IV and DV.

If both those conditions are true, you can then infer that there is a correlation between the omitted variable and the dependent variable and the nature of the correlation, as you indicate. I could include that in the table, but it is redundant information.

We're thinking along the same lines and portraying the same overall picture. Alas, I'd need a three-dimensional matrix to portray all three conditions! Fortunately, using the two conditions that I show in the table, we can still determine the direction of bias. And you could use those two relationships to determine the relationship between the omitted variable and the dependent variable if you wanted. However, that information doesn't change our understanding of the direction of bias because it's redundant with information already in the table.

Thanks for the great comment and it’s always beneficial thinking through these things using a different perspective!


August 14, 2020 at 3:00 am

Thank you for the intuitive explanation, Jim! I would like to ask a query. Suppose I have two groups: one with recently diagnosed lung disease and another with chronic lung disease, where I would like to do an independent t-test for the amount of lung damage. It happens that the two groups also significantly differ in their mean age. The group with recently diagnosed disease has a lower mean age than the group with chronic disease. Also, theory says age can cause some lung damage as a normal course too. So if I include age as a covariate in the model, won't it regress out the effect on the DV and give an underestimated effect, as the IV (age) significantly correlates with the DV (lung damage)? How do we address this confounding effect of correlation between only the IV and DV? Should it be by having a control group without lung disease? If so, can one control group help? Or should there be 2 control groups with age-matching to the two study groups? Thank you in advance.

August 15, 2020 at 3:46 pm

Hi Vineeth,

First, yes, if you know age is a factor, you should include it as a covariate in the model. It won’t “regress out” the true effect between the two groups. I would think of it a little differently.

You have two groups and you suspect that something caused those two groups to have differing amounts of lung damage. You also know that age plays a role. And those groups have different ages. So, if you look only at the groups without factoring in age, the effect of age is still present but the model is incorrectly attributing it to the groups. In your case, it will make the effect look larger.

When you include age, yes, it will reduce the effect size between the groups, but it'll reveal the correct effect by accounting for age. So, yes, in your case it'll make the group difference look smaller, but don't think of it as "regressing out" the effect; instead, it is removing the bias from the other results. In other words, you're improving the quality of your results.

When you look at your model results for, say, the grouping variable, the model is already controlling for the age variable. So, you're left with what you need: the effect of the grouping variable on the DV after accounting for the other variables in the model, such as age.
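
Here's a hedged simulation of this situation (all names and effect sizes invented) showing how the crude group difference absorbs the age effect while the adjusted estimate recovers the true one:

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(1)
n = 150
chronic = np.repeat([0, 1], n)                  # 0 = recent, 1 = chronic
age = np.where(chronic == 1,
               rng.normal(65, 8, 2 * n),         # chronic group is older
               rng.normal(50, 8, 2 * n))
damage = 2.0 * chronic + 0.3 * age + rng.normal(0, 3, 2 * n)
df = pd.DataFrame({"damage": damage, "chronic": chronic, "age": age})

crude = smf.ols("damage ~ chronic", data=df).fit().params["chronic"]
adjusted = smf.ols("damage ~ chronic + age", data=df).fit().params["chronic"]
print(f"crude: {crude:.2f}, adjusted: {adjusted:.2f}")
# The crude estimate absorbs the age effect and overstates the true group
# difference of 2.0; the adjusted estimate lands close to it.
```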

A control group for any experiment is always a good idea if you can manage one. However, it's not always possible. I write about these experimental design issues (randomized experiments, observational studies, how to design a good experiment, etc.), among other topics, in my Introduction to Statistics ebook, which you might consider. It's also just now available in print on Amazon!

' src=

August 12, 2020 at 7:04 am

I was wondering whether it's correct to check the correlation between the independent variables and the error term in order to check for endogeneity. If we assume that there is endogeneity, then the estimated errors aren't correct, and so the correlation between the independent variables and those errors doesn't say much. Am I missing something here?

best regards,


July 15, 2020 at 1:57 pm

I wanted to look at the effects of confounders on my study, but I'm not sure what analysis(es) to use for dichotomous covariates. I have one categorical IV with two levels, two continuous DVs, and then the two dichotomous confounding variables. It was hard to find information for categorical covariates online. Thanks in advance, Jim!


May 8, 2020 at 10:04 am

Thank you for your nice blog. I still have a question. Let's say I want to determine the effect of one independent variable on a dependent variable with a linear regression analysis. I have selected a number of potential variables for this relationship based on the literature, such as age, gender, health status, and education level. How can I check (with statistical analyses) whether these are indeed confounders? I would like to know which of them I should control for in my linear regression analysis. Can I create a correlation matrix beforehand to see whether a potential confounder is correlated with both my independent and dependent variables? And what threshold for the correlation coefficient should be used here? Is it every correlation coefficient except zero (for instance, 0.004)? Are there scientific articles/books that endorse this threshold? Or is it maybe better to use a "change-in-estimate" criterion to see whether my regression coefficient changes by a particular amount after adding the potential confounder to the linear regression model? What would be the threshold here?

I hope my question is clear. Thanks in advance!


April 29, 2020 at 2:47 am

Thanks for a wonderful website! I love your example with bone density, which does not appear to be correlated with physical activity if looked at alone, and needs weight added as an explanatory variable to make both of them appear significantly correlated with bone density. I would love to use this example in my class, as I think it is very important to understand that there are situations where a single-parameter model can lead you badly astray (here, into thinking activity is not correlated with bone density). Of course, I could make up some numbers for my students, but it would be even nicer if I could give them your real data. Could you by any chance make a file of real measurements of bone density, physical activity, and weight available? I would be very grateful, and I suppose a lot of other teachers/students would be too!

Best regards, Martin

April 30, 2020 at 5:06 pm

When I wrote this post, I wanted to share the data. Unfortunately, it seems like I no longer have it. If I uncover it, I’ll add it to the post.


February 8, 2020 at 1:45 pm

The work you have done is amazing, and I've learned so much through this website. I am at beginner level in SPSS and would be grateful if you could answer my question. I have found that a medical treatment results in worse quality of life. But I know from crosstabs that people taking this treatment present more severe disease (a continuous variable) that also correlates with quality of life. How can I test whether it is the treatment or the severity that worsens quality of life?

February 8, 2020 at 3:16 pm

Hi Evangelia,

Thanks so much for your kind words, I really appreciate them! And, I’m glad my website has been helpful!

That’s a great question and a valid concern to have. Fortunately, in a regression model, the solution is very simple. Just include both the treatment and severity of the disease in the model as independent variables. Doing that allows the model to hold disease severity constant (i.e., controls for it) while it estimates the effect of the treatment.

Conversely, if you did not include severity of the disease in the model, and it correlates with both the treatment and quality of life, it is uncontrolled and will be a confounding variable. In other words, if you don’t include severity of disease, the estimate for the relationship between treatment and quality of life will be biased.

We can use the table in this post for estimating the direction of bias. Based on what you wrote, I’ll assume that the treatment condition and severity have a positive correlation. Those taking the treatment present a more severe disease. And, that the treatment condition has a negative correlation with quality of life. Those on the treatment have a lower quality of life for the reasons you indicated. That puts us in the top-right quadrant of the table, which indicates that if you do not include severity of disease as an IV, the treatment effect will be underestimated.

Again, simply including disease severity in your model will reduce the bias!
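
A made-up simulation of this scenario shows the direction of the bias; every number here is invented for illustration:

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(2)
n = 500
treated = rng.integers(0, 2, n)
severity = 1.5 * treated + rng.normal(0, 1, n)   # treated patients are sicker
qol = 1.0 * treated - 2.0 * severity + rng.normal(0, 1, n)
df = pd.DataFrame({"qol": qol, "treated": treated, "severity": severity})

omitting = smf.ols("qol ~ treated", data=df).fit().params["treated"]
controlling = smf.ols("qol ~ treated + severity", data=df).fit().params["treated"]
print(f"omitting severity: {omitting:.2f}, controlling for it: {controlling:.2f}")
# Omitting severity drags the estimate toward 1.0 - 2.0*1.5 = -2.0, making a
# genuinely helpful treatment (true effect +1.0) look harmful.
```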


December 7, 2019 at 7:32 pm

Just a question about what you said about power. Will adding more independent variables to a regression model cause a loss of power (at a fixed sample size)? Or does it depend on the type of independent variable added: confounder vs. non-confounder?


November 1, 2019 at 8:54 pm

You mention, “Suppose you have a regression model with two significant independent variables, X1 and X2. These independent variables correlate with each other and the dependent variable.” How is it possible for two random variables (in this case the two factors) to correlate with each other if they are independent? If two random variables are independent, then the covariance is zero, and therefore the correlation is zero.

Corr(X1, X2) = Cov(X1, X2) / (sqrt(Var(X1)) * sqrt(Var(X2)))

Cov(X1, X2) = E[X1*X2] - E[X1]*E[X2]

If X1 and X2 are independent, then E[X1*X2] = E[X1]*E[X2], and therefore the covariance is zero.

November 4, 2019 at 9:07 am

Ah, there’s a bit of confusion here. The explanatory variables in a regression model are often referred to as independent variables, as well as predictors, x-variables, inputs, etc. I was using “independent variable” as the name. You’re correct, if they were independent in the sense that you describe them, there would be no correlation. Ideally, there would be no correlation between them in a regression model. However, they can, in fact, be correlated. If that correlation is too strong, it will cause problems with the model.

“Independent variable” in the regression context refers to the predictors and describes their ideal state. In practice, they’ll often have some degree of correlation.

I hope this helps!


April 8, 2019 at 12:33 pm

Ah! Enlightenment!

I had taken your statement about the correlation of the independent variable with the residuals to be a statement about the computed value of the correlation between them, that is, that cor(X1, resid) was nonzero. I believe that (in a model with a constant term) this is impossible.

But I think I get now that you were using the term more loosely, referring to a (nonlinear) pattern appearing between the values of X1 and the corresponding residuals, in the same way as you would see a parabolic pattern in a scatterplot of residuals versus X if you tried to make a linear fit of quadratic data. The linear correlation between X and the residuals would still compute out, numerically, to zero, so X1 and the residuals would technically be uncorrelated, but they would not be statistically independent. If the residuals show a nonlinear pattern when plotted against X, look for a lurker.

The Albany example was very helpful. Thanks so much for digging it up!

April 8, 2019 at 8:38 am

Hi, Jim! Thanks very much for your speedy reply!

I appreciate the clarity that you aim for in your writing, and I’m sorry if I wasn’t clear in my post. Let me try again, being a bit more precise, hopefully without getting too technical.

My problem is that I think that the very process used in finding the OLS coefficients (minimizing the sum of squared residuals) results in a regression equation that satisfies two properties. First, the sum (or mean) of the resulting residuals is zero. Second, for any regressor Xi, Xi is orthogonal to the vector of residuals, which in turn forces the covariance of the residuals with any regressor to be zero. Certainly, the true error terms need not sum to zero, nor need they be uncorrelated with a regressor... but if I understand correctly, these properties of the _residuals_ are an automatic consequence of fitting OLS to a data set, regardless of whether the actual error terms are correlated with the regressor or not.

I’ve found a number of sources that seem to say this–one online example is on page two here: https://www.stat.berkeley.edu/~aditya/resources/LectureSIX.pdf . I’ll be happy to provide others on request.

I’ve also generated a number of my own data sets with correlated regressors X1 and X2 and Y values generated by a X1 + b X2 + (error), where a and b are constants and (error) is a normally distributed error term of fixed variance, independently chosen for each point in the data set. In each case, leaving X2 out of the model still left me with zero correlation between X1 and the residuals, although there was a correlation between X1 and the true error terms, of course.

If I have it wrong, I’d love to see a data set that demonstrates what you’re talking about. If you don’t have time to find one (which I certainly understand), I’d be quite happy with any reference you might point me to that talks about this kind of correlation between residuals and one of the regressors in OLS, in any context.

Thanks again for your help, and for making regression more comprehensible to so many people.

Scott Stevens

April 8, 2019 at 10:59 am

Unfortunately, the analysis doesn't fix all possible problems with the residuals. It is possible to specify models where the residuals exhibit various problems. You mention that residuals will sum to zero. However, if you specify a model without a constant, the residuals won't necessarily sum to zero; read about that here. If you have a time series model, it's possible to have autocorrelation in the residuals if you leave out important variables. If you specify a model that doesn't adequately model curvature in the data, you'll see patterns in the residuals.

In a similar vein, if you leave out an important variable that is correlated both with the DV and another IV in the model, you can have residuals that correlate with an IV. The standard practice is to graph the residuals by the independent variable to look for that relationship because it might have a curved shape which indicates a relationship but not necessarily a linear one that correlation would detect.

As for references, any regression textbook should cover this assumption. Again, it’ll refer to error, but the key is to remember that residuals are the proxy for error.

Here’s a reference from the University of Albany about Omitted Variable Bias that goes into it in more detail from the standpoint of residuals and includes an example of graphing the residuals by the omitted variable.
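A simulated sketch of that idea: fit a model that deliberately omits a correlated variable, then plot the residuals against the omitted variable to reveal the problem. The data here are invented:

```python
import numpy as np
import matplotlib.pyplot as plt
import statsmodels.api as sm

rng = np.random.default_rng(3)
n = 300
x1 = rng.normal(size=n)
x2 = 0.8 * x1 + rng.normal(scale=0.6, size=n)   # correlated with x1
y = 2 * x1 + 3 * x2 + rng.normal(size=n)

fit = sm.OLS(y, sm.add_constant(x1)).fit()      # x2 deliberately omitted
plt.scatter(x2, fit.resid, s=10)                # clear upward trend appears
plt.axhline(0, color="gray")
plt.xlabel("omitted variable x2")
plt.ylabel("residual")
plt.show()
```

Consistent with Scott's point below, plotting these same residuals against the included x1 shows zero linear correlation; the signal shows up against the omitted variable, or as a nonlinear pattern against included ones.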

April 7, 2019 at 11:17 am

Hi, Jim. I very much enjoy how you make regression more accessible, and I like to use your approaches with my own students. I'm confused, though, by the matter brought up by SFDude.

I certainly see how the _error_ term in a regression model will be correlated with an independent variable when a confounding variable is omitted, but it seems to me that the normal equations that define the regression coefficients assure that an independent variable in the model will always be uncorrelated with the _residuals_ of that model, regardless of whether an omitted confounding variable exists or not. Certainly, "X1 correlates with X2, and X2 correlates with the residuals. Ergo, variable X1 correlates with the residuals" would not hold for any three variables X1, X2, and R. For example, if A and B are independent, then "A correlates with A + B, A + B correlates with B. Ergo, A correlates with B" is a false statement.

If I’m missing something here, I’d very much appreciate a data set that demonstrates the kind of correlation between an independent variable and the residuals of the model that it seems you’re talking about.

Thanks! Scott Stevens

April 7, 2019 at 6:28 pm

Thanks for writing. And, I’m glad to hear that you find my website helpful!

The key thing to remember is that while the OLS assumptions refer to the error, we can’t directly observe the true error. So, we use the residuals as estimates of the error. If the error is correlated with an omitted variable, we’d expect the residuals to be correlated as well in approximately the same manner. Omitted variable bias is a real condition, and that description is simply getting deep into the nuts and bolts of how it works. But, it’s the accepted explanation. You can read it in textbooks. While the assumptions refer to error, we can only assess the residuals instead. They’re the best we’ve got!

When you say A and B are “independent”, if you mean they are not correlated, I’d agree that removing a truly uncorrelated variable from the model does not cause this type of bias. I mention that in this post. This bias only occurs when independent variables are correlated with each other to some degree, and with the dependent variable, and you exclude one of the IVs.

I guess I'm not exactly sure which part is causing the difficulty? The regression equations can't ensure that the residuals are uncorrelated with the IVs if the model is specified in such a way that it causes them to be correlated. It's just like in time series regression models: you have to be on the lookout for autocorrelation (correlated residuals) because the model doesn't account for time-order effects. Incorrectly specified models can and do cause problems with the residuals, including residuals that are correlated with other variables and with themselves.

I’ll have to see if I can find a dataset with this condition.


March 10, 2019 at 10:41 am

Hi Jim, I am involved in a study which involves looking into a number of clinical parameters, like platelet count and haemoglobin, for patients who underwent an emergency change of a mechanical circulatory support device due to thrombosis or clotting of the device. The purpose is to see whether there is a trend in these parameters in the time frame of 3 days before and 3 days after the change, and to establish whether these parameters could be used as predictors of the event. My concern is that there is no control group for this study. But I don't see the need for looking into a trend in a group which never had the event itself. Will not having a control group be considered a weakness of this study? Also, what would be the best statistical test for this? I was thinking of the generalized linear model. I would really appreciate your guidance here. Thank you.


February 20, 2019 at 8:49 am

I'm looking at a published paper that develops clinical prediction rules by using logistic regression in order to help primary care doctors decide who to refer to breast clinics for further investigation. The dependent variable is simply whether breast cancer is found to be present or not. The independent variables include 11 symptoms and age in (mostly) ten-year increments (six separate age bands). The age bands were decided before the logistic regression was carried out. The paper goes on to use the data to create a scoring system based on symptoms and age. If this scoring system were to be used, then above a certain score a woman would be referred, and below a certain score she would not be.

The total sample size is 6590 women referred to a breast clinic, of whom 320 were found to have breast cancer. The sample itself is very skewed. In younger women, breast cancer is rare, and so in some categories the numbers are very low. For instance, in the 18-29 age band there are 62 women referred, of whom 8 have breast cancer, and in the 30-39 age band there are 755 women referred, of whom only one has breast cancer. So my first question is: if there are fewer individuals in particular categories than symptoms, can the paper still use logistic regression to predict who to refer to a breast clinic based on a scoring system that includes both age and symptoms? My second question is: if there is meant to be at least 10 individuals per variable in logistic regression, are the numbers of women with breast cancer in these age groups too small for logistic regression to apply?

When I look at the total number of women in the sample (6590) and then the total number of symptoms (8616) there is a discrepancy. This means that some women have had more than one symptom recorded. (Or from the symptoms’ point of view, some women have been recorded more than once). So my third question is: does this mean that some of the independent variables are not actually independent of each other? (There is around a 30%-32% discrepancy in all categories. How significant is this?)

There are lots of other problems with the paper (the fact that the authors only look at referred women rather than all the symptomatic women that a primary care doctor sees is a case in point), but I'd like to know whether the statistics are flawed too. If there are any other questions I need to ask about the data, please do let me know.

With very best wishes,

Ms Susan Mitchell

February 20, 2019 at 11:23 pm

Offhand, I don't see anything that screams to me that there is a definite problem. I'd have to read the study to be really sure. Here are some thoughts.

I'm not in the medical field, but I've heard talks by people in that field, and it sounds like this is a fairly common use for binary logistic regression. The analyst creates a model where you indicate which characteristics, risk factors, etc., apply to an individual. Then, the model predicts the probability of an outcome for them. I've seen similar models for surgical success, death, etc. The idea is that it's fairly easy to use because someone can just enter the characteristics of the patient and the model spits out a probability. For any model of this type, you'd really have to check the residuals and see all the output to determine how well the model fits the data. But there's nothing inherently wrong with this approach.

I don’t see a problem with the sample size (6590) and the number of IVs (12). That’s actually a very good ratio of observations per IV.

It's OK that there are fewer individuals in some categories. It's better if you have a fairly equal number, but it's not a show stopper. Categories with fewer observations will have less precise estimates, which can reduce the precision of the model. You'd have to see how well the model fits the data to really know how well it works out. But, yes, if you have an extremely low number of individuals with a particular symptom, you won't get as precise an estimate for that symptom's effect. You might see a wider CI for its odds ratio. But it's hard to say without seeing all of that output and how the numbers break down by symptom. And it's possible that they selected characteristics that apply to a sufficient number of women. Again, I wouldn't be able to say. It's an issue to consider for sure.

As for the number of symptoms versus the number of women, it's OK that a woman can have more than one symptom. Each symptom is in its own column and will be coded with a 1 or 0. A row corresponds to one woman, and she'll have a 1 for each characteristic she has and 0s for the ones she does not have. It's possible these symptoms are correlated. These are categorical variables, so you couldn't use Pearson's correlation. You'd need to use something like the chi-square test of independence. And some correlation is okay; only very high correlation would be problematic. Again, I can't say whether that's a problem in this study or not because it depends on the degree of correlation. It might be, but it's not necessarily a problem. You'd hope that the study strategically included a good set of IVs that aren't overly correlated.
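
For instance, here's a minimal sketch of a chi-square test of independence between two binary symptom columns; the 0/1 arrays are hypothetical, not from the paper:

```python
import numpy as np
import pandas as pd
from scipy.stats import chi2_contingency

rng = np.random.default_rng(4)
symptom_a = rng.integers(0, 2, 500)
symptom_b = ((symptom_a + rng.integers(0, 2, 500)) > 0).astype(int)  # related to a

table = pd.crosstab(symptom_a, symptom_b)        # 2x2 contingency table
chi2, p, dof, expected = chi2_contingency(table)
print(f"chi2 = {chi2:.1f}, p = {p:.4f}")         # small p suggests association
```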

Regarding the referred women vs. symptomatic women, that comes down to the population being sampled and how generalizable the results are. Not being familiar with the field, I don't have a good sense of how that affects generalizability, but yes, that would be a concern to consider.

So, I don’t see anything that shouts to me that it’s a definite problem. But, as with any regression model, it would come down to the usual assessments of how well the model fits the data. You mention issues that could be concerns, but again, it depends on the specifics.

Sorry I couldn’t provide more detailed thoughts but evaluating these things requires real specific information. But, the general approach for this study seems sound to me.


February 17, 2019 at 3:48 pm

I have a question: how well can we evaluate whether a regression equation fits the data by examining the R-squared statistic, and test the statistical significance of the whole regression equation using the F-test?

February 18, 2019 at 4:56 pm

I have two blog posts that will be perfect for you!

  • Interpreting R-squared
  • Interpreting the F-test of Overall Significance

If you have questions about either one, please post it in the comments section of the corresponding post. But, I think those posts will go a long way in answering your questions!


January 18, 2019 at 7:00 pm

Mr. Frost, I know I need to run a regression model; however, I'm still unsure which one. I'm examining the effects of alcohol use on teenagers with 4 confounders.

January 19, 2019 at 6:47 pm

Hi Dahlia, to make that decision, I'd need to know what types of variables they all are (continuous, categorical, binary, etc.). However, if your dependent variable (the effect of alcohol) is continuous, then OLS linear regression is a great place to start!

Best of luck with your analysis!


January 5, 2019 at 2:39 am

Thank you very much Jim,

Very helpful. I think my problem is really the number of observations (25). Yes, I have read that post also, and I always keep the theory in mind when analyzing the IVs.

My main objective is to show the relationship between X2 and Y, which is also supported by the literature; however, if I do not control for X1, I will never be sure whether the effect I have found is due to X2 or X1, because X1 and X2 are correlated.

I think correlation alone would be OK, since my number of observations is limited; using regression also limits the number of IVs I can include in the model, which may force me to leave some other IVs out, which is also bad.

Thank you again

Best regards!

January 4, 2019 at 9:40 am

Thank you for this very good post.

However, I have a question. What should I do if the IVs X1 and X2 are correlated (say, 0.75) and both are correlated with Y (the DV) at 0.60? When I include X1 and X2 in the same model, X2 is not statistically significant, but when entered separately, each becomes statistically significant. On the other hand, the model with only X1 has higher explanatory power than the model with only X2.

Note: In the individual models, both meet the OLS assumptions, but together, X2 becomes statistically insignificant (using stepwise regression, X2 is removed from the model). What does this mean? In addition, I know from the literature that X2 affects Y, but I am testing X1, and X1 is showing a better fit than X2.

Thank you in advance, I hope you understand my question!

January 4, 2019 at 3:15 pm

Yes, I understand completely! This situation isn't too unusual. The underlying problem is that because the two IVs are correlated, they're supplying a similar type of predictive information. There isn't enough unique predictive information for both of them to be statistically significant. If you had a larger sample size, it's possible that both would be significant. Also, keep in mind that correlation is a pairwise measure and doesn't account for other variables. When you include both IVs in the model, the relationship between each IV and the DV is estimated after accounting for the other variables in the model. That's why you can see a pairwise correlation but not a relationship in a regression model.

I know you've read a number of my posts, but I'm not sure if you've read the one about model specification. In that post, a key point I make is not to use statistical measures alone to determine which IVs to leave in the model. If theory suggests that X2 should be included, you have a very strong case for including it even if it's not significant when X1 is in the model; just be sure to include that discussion in your write-up.

Conversely, just because X2 seems to provide a better fit statistically and is significant with or without X1 doesn't mean you must include it in the model. Those are strong signs that you should consider including a variable. However, as always, use theory as a guide and document the rationale for the decisions you make.

For your case, you might consider including both IVs in the model. If they're both supplying similar information and X2 is justified by theory, chances are that X1 is as well. Again, document your rationale. If you include both, check the VIFs to be sure that you don't have problematic levels of multicollinearity. If those are the only two IVs in your model, that won't be problematic given the correlations you describe. But it could be problematic if you have more IVs in the model that are also correlated with X1 and X2.

Another thing to look at is whether the coefficients for X1 and X2 vary greatly depending on whether you have one or both of the IVs in the model. If they don't change much, that's nice and simple. However, if they do change quite a bit, then you need to determine which coefficient values are likely to be closer to the correct values, because that corresponds to the choice about which IVs to include! I'm sounding like a broken record, but if this is a factor, document your rationale and decisions.
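
A quick sketch of that comparison, on simulated data with X1 and X2 correlated at roughly 0.75 to mirror the question (all numbers invented):

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(5)
x1 = rng.normal(size=200)
x2 = 0.75 * x1 + rng.normal(scale=0.66, size=200)   # r with x1 ~ 0.75
y = 1.0 * x1 + 1.0 * x2 + rng.normal(size=200)
df = pd.DataFrame({"y": y, "x1": x1, "x2": x2})

for formula in ("y ~ x1", "y ~ x2", "y ~ x1 + x2"):
    fit = smf.ols(formula, data=df).fit()
    print(formula, "->", fit.params.drop("Intercept").round(2).to_dict())
# If the coefficients swing a lot between specifications, the choice of
# which IVs to include matters and deserves a documented rationale.
```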

I hope that helps! Best of luck with your analysis!


November 28, 2018 at 11:30 pm

Another great post! Thank you for truly making statistics intuitive. I learned a lot of this material back in school, but am only now understanding them more conceptually thanks to you. Super useful for my work in analytics. Please keep it up!

November 29, 2018 at 8:54 am

Thanks, Patrick! It’s great to hear that it was helpful!


November 12, 2018 at 12:54 pm

I think there may be a typo here – “These are important variables that the statistical model does include and, therefore, cannot control.” Shouldn’t it be “does not include”, if I understand correctly?

November 12, 2018 at 1:19 pm

Thanks, Jayant! Good eagle eyes! That is indeed a typo. I will fix it. Thanks for pointing it out!


November 3, 2018 at 12:07 pm

Mr. Jim, thank you for making me understand econometrics. I thought that an omitted variable is excluded from the model, and that's why the coefficients are under/overestimated. Somewhere in this article you mentioned that they are still included in the model but not controlled for. I find that very confusing; would you be able to clarify? Thanks a lot.

November 3, 2018 at 2:26 pm

You’re definitely correct. Omitted variable bias occurs when you exclude a variable from the model. If I gave the impression that it’s included, please let me know where in the text because I want to clarify that! Thanks!

By excluding the variable, the model does not control for it, which biases the results. When you include a previously excluded variable, the model can now control for it and the bias goes away. Maybe I wrote that in a confusing way?

Thanks! I always strive to make my posts as clear as possible, so I’ll think about how to explain this better.

September 28, 2018 at 4:31 pm

In addition to mean squared error and adjusted R-squared, I use Cp, IC, HQC, and SBIC to decide the number of independent variables in multiple regression.

September 28, 2018 at 4:39 pm

I think there are a variety of good measures. I'd also add predicted R-squared, as long as you use them in conjunction with subject-area expertise. As I mention in this post, the entire set of estimated relationships must make theoretical sense. If they don't, the statistical measures are not important.
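
For readers who want to see such criteria side by side, here's a small sketch using statsmodels on made-up data, where x3 is pure noise:

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(6)
n = 100
df = pd.DataFrame(rng.normal(size=(n, 3)), columns=["x1", "x2", "x3"])
df["y"] = 2 * df["x1"] + df["x2"] + rng.normal(size=n)   # x3 is irrelevant

for formula in ("y ~ x1", "y ~ x1 + x2", "y ~ x1 + x2 + x3"):
    fit = smf.ols(formula, data=df).fit()
    print(f"{formula}: adj R2={fit.rsquared_adj:.3f} "
          f"AIC={fit.aic:.1f} BIC={fit.bic:.1f}")
# AIC/BIC should favor the middle model; adding the noise variable x3
# buys almost no fit while paying a complexity penalty.
```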

September 28, 2018 at 4:13 pm

I have to read the article you named. Having said that, caution is needed when regression models describe systems or processes that are not in statistical control. Also, some processes have physical bounds that a regression model does not capture, so calculated predicted values may have no physical meaning. Further, models built from narrow ranges of independent variables may not be applicable outside those ranges.

September 28, 2018 at 4:19 pm

Hi Stan, those are all great points, and true. They all illustrate how you need to use your subject-area knowledge in conjunction with statistical analyses.

I talk about the issue of not going outside the range of the data, amongst other issues, in my post about Using Regression to Make Predictions.

I also agree about statistical control, which I think is underappreciated outside of the quality improvement arena. I've written about this in a post about using control charts with hypothesis tests.

September 28, 2018 at 2:30 pm

Valid confidence/prediction intervals are important if the regression model represents a process that is being characterized. When the prediction intervals are too wide, the model's validity and utility are in question.

September 28, 2018 at 2:49 pm

You're definitely correct! If the model doesn't fit the data, your predictions are worthless. There's one minor caveat I'd add to your comment.

The prediction intervals can be too wide to be useful, yet the model might still be valid. They're really two separate assessments: a valid model and the degree of precision. I write about this in several posts, including Understanding Precision in Prediction.

September 26, 2018 at 9:13 am

Jim, does centering any independent explanatory variable require centering them all? Should the dependent variable be centered along with the explanatory variables? I always make a normal probability plot of the deleted residuals as one test of the prediction capability of the fitted model. It is remarkable how good models give good normal probability plots. I also use the Shapiro-Wilk test to assess the deleted residuals for normality. Stan Alekman

September 26, 2018 at 9:46 am

Yes, you should center all of the continuous independent variables if your goal is to reduce multicollinearity and/or to be able to interpret the intercept. I’ve never seen a reason to center the dependent variable.

It’s funny that you mention that about normally distributed residuals! I, too, have been impressed with how frequently that occurs even with fairly simple models. I’ve recently written a post about OLS assumptions and I mention how normal residuals are sort of optional. They only need to be normally distributed if you want to perform hypothesis tests and have valid confidence/prediction intervals. Most analysts want at least the hypothesis tests!
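A minimal sketch of the centering step Stan asked about, with made-up columns; centering matters most when you build interaction or polynomial terms from these predictors:

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(7)
df = pd.DataFrame({"age": rng.normal(50, 10, 200),
                   "weight": rng.normal(80, 12, 200)})

centered = df - df.mean()                        # subtract each column's mean
df["age_x_weight"] = centered["age"] * centered["weight"]
print(df.corr().round(2))   # the centered product is far less collinear
                            # with age and weight than a raw product would be
```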


September 25, 2018 at 2:32 am

Hey Jim, your blogs are really helpful for me to learn data science. Here is the question in my assignment:

You have built a classification model with 90% accuracy, but your client is not happy because the false positive rate is very high. What will you do? Can we do something about it with precision or recall?

That's the whole question; nothing is given in the background, though they should have provided more!
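
One plausible line of attack (a sketch only, on synthetic data with scikit-learn, not the assignment's official answer): look at precision and recall, then raise the decision threshold so the model makes fewer positive calls, which directly cuts false positives at the cost of recall:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import precision_score, recall_score

X, y = make_classification(n_samples=2000, weights=[0.9], random_state=0)
model = LogisticRegression(max_iter=1000).fit(X, y)
probs = model.predict_proba(X)[:, 1]

for threshold in (0.5, 0.7, 0.9):
    pred = (probs >= threshold).astype(int)
    print(f"t={threshold}: precision={precision_score(y, pred):.2f} "
          f"recall={recall_score(y, pred):.2f}")
# Higher thresholds raise precision (fewer false positives) while recall
# drops; where to stop depends on the client's costs for each error type.
```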


September 25, 2018 at 1:20 am

Thank you, Jim. Really interesting.

September 25, 2018 at 1:26 am

Hi Brahim, you’re very welcome! I’m glad it was interesting!


September 24, 2018 at 10:30 pm

Hey Jim, you are awesome.

September 24, 2018 at 11:04 pm

Aw, MG, thanks so much!! 🙂


September 24, 2018 at 10:59 am

Thanks for another great article, Jim!

Q: Could you expand with a specific plot example to explain this statement more clearly: “We know that for omitted variable bias to exist, an independent variable must correlate with the residuals. Consequently, we can plot the residuals by the variables in our model. If we see a relationship in the plot, rather than random scatter, it both tells us that there is a problem and points us towards the solution. We know which independent variable correlates with the confounding variable.”

Thanks! SFdude

September 24, 2018 at 11:48 am

Hi, thanks!

I'll try to find a good example plot to include soon. Basically, you're looking for any non-random pattern. For example, the residuals might tend to increase or decrease as the value of the independent variable increases. That relationship can follow a straight line or display curvature, depending on the nature of the relationship.
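
In the meantime, here's a simulated example of the curved case: a straight line fit to quadratic data leaves an unmistakable parabolic residual pattern:

```python
import numpy as np
import matplotlib.pyplot as plt
import statsmodels.api as sm

rng = np.random.default_rng(8)
x = rng.uniform(-3, 3, 200)
y = x**2 + rng.normal(scale=0.5, size=200)

fit = sm.OLS(y, sm.add_constant(x)).fit()   # linear fit to curved data
plt.scatter(x, fit.resid, s=10)             # parabolic pattern in residuals
plt.axhline(0, color="gray")
plt.xlabel("x")
plt.ylabel("residual")
plt.show()
```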


September 24, 2018 at 1:37 am

It's been a long time since I heard from you, Jim. Missed your stats!

September 24, 2018 at 9:53 am

Hi Saketh, thanks, you’re too kind! I try to post here every two weeks at least. Occasionally, weekly!


A beginner’s guide to confounding

Posted on 1st October 2018 by Eveliina Ilola


Confounding means the distortion of the association between the independent and dependent variables because a third variable is independently associated with both.

A causal relationship between two variables is often described as the way in which the independent variable affects the dependent variable. The independent variable can take different values independently, and the dependent variable varies according to the value of the independent variable.

So, let’s say you want to find out how alcohol consumption affects mortality…

You decide to compare the mortality rates between two groups – one consisting of heavy users of alcohol, one consisting of teetotallers. In this case alcohol consumption would be your independent variable and mortality would be your dependent variable.


If you find that people who consume more alcohol are more likely to die, it might seem intuitive to conclude that alcohol use increases the risk of death. In reality, however, the situation might be more complex. It is possible that alcohol use is not the only mortality-affecting factor that differs between the two groups.

People who consume less alcohol might be more likely to eat a healthier diet or less likely to smoke, for example. Eating a healthy diet or smoking might in turn affect mortality. These other influencing factors are called confounding variables . If you ignore them and assume that any differences in mortality must be caused by a difference in alcohol consumption, you could end up with results that don’t reflect reality all that well. You might find associations where in reality there are none, or fail to find associations where they do in fact exist.


How to minimise the effects of confounding during study design

If you are investigating the effects of an intervention, you can randomly assign people to an intervention and control group. The aim of randomization is to evenly distribute the known and the unknown confounders between the two groups. The groups might still differ in potential confounders by chance but randomization minimises these differences.

In other types of studies you can address confounding through restriction or matching. Restriction means only studying people who are similar in terms of a confounding variable – for example, if you think age is a confounding variable you might only choose to study people older than 65. (This would obviously limit the applicability of your results to other groups). Matching means pairing people in the two groups based on potential confounders.

How to minimise the effects of confounding during statistical analysis

After completing the study you can minimise the effects of confounding using statistical methods.

If there is only a small number of potential confounders, you can use stratification . In stratification, you produce smaller groups in which the confounding variables don't vary and then examine the relationship between the independent and dependent variables in each group. In the example we used before, you might divide the sample into groups of smokers and non-smokers and examine the relationship between alcohol use and mortality within each, as the sketch below illustrates.
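
Here is a small simulated example of stratification in Python; all numbers are invented. Mortality depends only on smoking, yet heavy drinking looks harmful until the analysis is split by smoking status:

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(9)
n = 5000
smoker = rng.integers(0, 2, n)
heavy = ((0.4 * smoker + rng.random(n)) > 0.7).astype(int)   # drinking tied to smoking
died = (rng.random(n) < (0.05 + 0.10 * smoker)).astype(int)  # only smoking matters
df = pd.DataFrame({"smoker": smoker, "heavy": heavy, "died": died})

print(df.groupby("heavy")["died"].mean().round(3))              # crude: looks harmful
print(df.groupby(["smoker", "heavy"])["died"].mean().round(3))  # strata: no effect
```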

If there is a larger number of potential confounders, you can use multivariate analysis , for example logistic or linear regression .
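
Continuing with the simulated data from the stratification sketch above, a hedged illustration of the regression route: adding smoking as a covariate in a logistic regression shrinks the spurious alcohol effect toward zero.

```python
# Regression adjustment on the same simulated DataFrame `df` as above.
import statsmodels.formula.api as smf

crude = smf.logit("died ~ heavy", data=df).fit(disp=0)
adjusted = smf.logit("died ~ heavy + smoker", data=df).fit(disp=0)
print(f"crude coefficient for heavy drinking:    {crude.params['heavy']:.2f}")
print(f"adjusted coefficient for heavy drinking: {adjusted.params['heavy']:.2f}")
# The crude log-odds coefficient is clearly positive; adjusting for smoking
# pulls it close to zero, matching the stratified analysis above.
```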

Conclusions

The association between two variables might be modified by a third variable, and this can lead to distorted results. Even after taking this into account in study design and data analysis, your data could still be distorted by confounding (for example, there might be other confounding factors you don't know of), but the first steps in reducing its effects are being aware of its potential to distort your results and planning accordingly.




An overview of confounding. Part 1: the concept and how to address it

Affiliation: Department of Epidemiology, Rollins School of Public Health, Emory University, Atlanta, GA, USA.

  • PMID: 29341103
  • DOI: 10.1111/aogs.13295

Confounding is an important source of bias, but it is often misunderstood. We consider how confounding occurs and how to address confounding using examples. Study results are confounded when the effect of the exposure on the outcome mixes with the effects of other risk and protective factors for the outcome. This problem arises when these factors are present to different degrees among the exposed and unexposed study participants, but not all differences between the groups result in confounding. Thinking about an ideal study where all of the population of interest is exposed in one universe and is unexposed in a parallel universe helps to distinguish confounders from other differences. In an actual study, an observed unexposed population is chosen to stand in for the unobserved parallel universe. Differences between this substitute population and the parallel universe result in confounding. Confounding by identified factors can be addressed analytically and through study design, but only randomization has the potential to address confounding by unmeasured factors. Nevertheless, a given randomized study may still be confounded. Confounded study results can lead to incorrect conclusions about the effect of the exposure of interest on the outcome.

Keywords: Bias; causality; confounding factors (epidemiology); data analysis; epidemiologic methods; epidemiologic research design.

© 2018 Nordic Federation of Societies of Obstetrics and Gynecology.



25 Confounding Variable Examples

Chris Drew (PhD)

Dr. Chris Drew is the founder of the Helpful Professor. He holds a PhD in education and has published over 20 articles in scholarly journals. He is the former editor of the Journal of Learning Development in Higher Education.



Confounding variables are variables that ‘confound’ (meaning to confuse) the data in a study. In scholarly terms, we say that they are extraneous variables that correlate (positively or negatively) with both the dependent variable and the independent variable (Scharrer & Ramasubramanian, 2021).

These variables present a challenge in research as they can obscure the potential relationships between the variables under examination, leading to spurious correlations and the famous third variable problem.

Accurately isolating and controlling confounding variables is thus crucial in maximizing the validity of an experiment or study, primarily when trying to determine cause-effect relationships between variables (Knapp, 2017; Nestor & Schutt, 2018).


Confounding Variables Examples

1. IQ and Reading Ability A study could find a positive correlation between children’s IQ and reading ability. However, the socioeconomic status of the families could be a confounding variable, as children from wealthier families could have more access to books and educational resources.

2. Coffee Intake and Heart Disease A research finding suggests a positive correlation between coffee intake and heart disease. But the variable ‘exercise’ could confound the situation, as those who drink a lot of coffee might also do less exercise.

3. Medication and Recovery Time A study posits a link between a specific medication and faster recovery time from a disease. However, the overall health of the patient, which can significantly affect recovery, serves as a confounding variable.

4. Unemployment and Mental Health There seems to be a relationship between unemployment and poor mental health. However, the confounding variable can be the quality of the support network, as unemployed individuals with robust emotional support might have better mental health.

5. Exercise and Stress Levels A study might show a negative correlation between exercise levels and stress. But, sleep patterns could act as a confounder, as individuals who exercise more might also have better sleep, which in turn could lower stress levels.

6. Height and Self-esteem A study claims a positive correlation between height and self-esteem. In this case, attractiveness can confound the result, as sometimes taller people might be judged by society as more attractive, leading to higher self-esteem.

7. Class Attendance and Grades Research indicates that students who attend classes regularly have better grades. However, a student’s intrinsic motivation to learn could be a confounding variable, as these students might not only attend class but also study more outside of class.

8. Age and Job Satisfaction A study might suggest that older employees are more satisfied with their jobs. In this scenario, job position could be a confounder, as older employees might occupy higher, more gratifying positions in the company.

9. Light Exposure and Depression Researching seasonal depression might show a connection between reduced light exposure in winter and increased depression rates. However, physical activity (which tends to decrease in winter) could confound these results.

10. Parent’s Education and Children’s Success at School A study states that children of highly educated parents perform better at school. However, a confounding variable might be the parents’ income, which could allow for a range of educational resources.

11. Physical Exercise and Academic Performance A positive correlation may be found between daily physical exercise and academic performance. However, time management skills can be a potential confounder as students with good time management skills might be more likely to fit regular exercise into their schedule and also keep up with their academic work efficiently.

12. Daily Screen Time and Obesity Research suggests a link between extensive daily screen time and obesity. But the confounding variable could be the lack of physical activity, which is often associated with both increased screen time and obesity.

13. Breakfast Consumption and Academic Performance It might be suggested that students who eat breakfast regularly perform better academically. However, the confounding factor could be the overall nutritional status of the students, as those who eat breakfast regularly may also follow healthier eating habits that boost their academic performance.

14. Population Density and Disease Transmission A study may show higher disease transmission rates in densely populated areas. Still, public health infrastructure could be a confounding variable, as densely populated areas with poor health facilities might witness even higher transmission rates.

15. Age and Skin Cancer A study might suggest that older individuals are at a higher risk of skin cancer. However, exposure to sunlight, a major factor contributing to skin cancer, may confound the relationship, with individuals exposed to more sunlight over time having a greater risk.

16. Working Hours and Job Satisfaction A hypothetical study indicates that employees working longer hours report lower job satisfaction levels. However, the job’s intrinsic interest could be a confounder, as someone who finds their job genuinely interesting might report higher satisfaction levels despite working long hours.

17. Sugar Consumption and Tooth Decay Sugar intake is linked to tooth decay rates. However, dental hygiene practice is a typical confounding variable: individuals who consume a lot of sugar but maintain good oral hygiene might show lower tooth decay rates.

18. Farm Exposure and Respiratory Illness A study observes a relationship between farm exposure and reduced respiratory illnesses. Yet, a healthier overall lifestyle associated with living in rural areas might confound these results.

19. Outdoor Activities and Mental Health Research might suggest a link between participating in outdoor activities and improved mental health. However, pre-existing physical health could be a confounding variable, as those enjoying good physical health could be more likely to participate in frequent outdoor activities, thereby resulting in improved mental health.

20. Pet Ownership and Happiness A study shows that pet owners report higher levels of happiness. However, family dynamics can serve as a confounding variable, as the presence of pets might be linked to a more active and happier family life.

21. Vitamin D Levels and Depression Research indicates a correlation between low vitamin D levels and depression. However, sunlight exposure might act as a confounding variable, as it affects both vitamin D levels and mood.

22. Employee Training and Organizational Performance A positive relationship might be found between the level of employee training and organizational performance. Still, the organization’s leadership quality could confound these results, being significant in both successful employee training implementation and high organizational performance.

23. Social Media Use and Loneliness There appears to be a positive correlation between high social media use and feelings of loneliness. However, personal temperament can be a confounding variable, as individuals with certain temperaments may spend more time on social media and feel more isolated.

24. Respiratory Illnesses and Air Pollution Studies indicate that areas with higher air pollution have more respiratory illnesses. However, the time spent outdoors could be a confounding variable, as those spending more time outside in polluted areas have a higher exposure to pollutants.

25. Maternal Age and Birth Complications Advanced maternal age is linked to increased risk of birth complications. Yet, health conditions such as hypertension, more common in older women, could confound these results.

Types of Confounding Variables

The scope of confounding variables spans order effects, participant variability, the social desirability effect, the Hawthorne effect, demand characteristics, and evaluation apprehension, among other types (Parker & Berman, 2016).

  • Order Effects refer to the impact on a participant's performance or behavior brought on by the order in which the experimental tasks are presented (Riegelman, 2020). The learning or performance of a task could influence the performance or understanding of subsequent tasks; for example, an experiment with multiple language assessments, German followed by French, could yield different results if the tests were taken in the reverse order.
  • Participant Variability tackles the inconsistencies stemming from unique characteristics or behaviors of individual participants, which could inadvertently impact the results. Physical fitness levels among participants in an exercise study could greatly influence the results.
  • Social Desirability Effect comes into play when participants modify their responses to be more socially acceptable, often leading to bias in self-reporting studies. For instance, in a study measuring dietary habits, participants might overreport healthy food consumption and underreport unhealthy food choices to align with what they perceive as socially desirable.
  • Hawthorne Effect constitutes a type of observer effect where individuals modify their behavior in response to being observed during a study (Nestor & Schutt, 2018; Riegelman, 2020). In a job efficiency study, employees may work harder just because they know they’re being observed.
  • Demand Characteristics include cues that might inadvertently inform participants of the experiment's purpose or anticipated results, resulting in biased outcomes (Lock et al., 2020). If participants in a product-testing study deduce which product is being promoted, it might alter their responses.
  • Evaluation Apprehension could affect the findings of a study when participants’ anxiety about being evaluated leads them to alter their behavior (Boniface, 2019; Knapp, 2017). This is common in performance studies where participants know their results will be judged or compared.

Confounding variables can complicate and potentially distort the results of experiments and studies. Yet, by accurately recognizing and controlling for these confounding variables, researchers can ensure more valid findings and more precise observations about the relationships between variables. Understanding the nature and impact of confounding variables and the inherent challenges in isolating them is crucial for anyone engaged in rigorous research.

Boniface, D. R. (2019). Experiment Design and Statistical Methods for Behavioural and Social Research. CRC Press.

Knapp, H. (2017). Intermediate Statistics Using SPSS. SAGE Publications.

Lock, R. H., Lock, P. F., Morgan, K. L., Lock, E. F., & Lock, D. F. (2020). Statistics: Unlocking the Power of Data (3rd ed.). Wiley.

Nestor, P. G., & Schutt, R. K. (2018). Research Methods in Psychology: Investigating Human Behavior. SAGE Publications.

Parker, R. A., & Berman, N. G. (2016). Planning Clinical Research. Cambridge University Press.

Riegelman, R. K. (2020). Studying a Study and Testing a Test (7th ed.). Wolters Kluwer Health.

Scharrer, E., & Ramasubramanian, S. (2021). Quantitative Research Methods in Communication: The Power of Numbers for Social Justice. Taylor & Francis.


3.5 - Bias, Confounding and Effect Modification

Consider the figure below. If the true value is the center of the target, the measured responses in the first instance may be considered reliable and precise, having negligible random error, but all the responses missed the true value by a wide margin: a biased estimate has been obtained. In contrast, the target on the right has more random error in the measurements; however, the results are valid, lacking systematic error, and the average response is exactly in the center of the target. The middle target depicts our goal: observations that are both reliable (small random error) and valid (without systematic error).

Accuracy for a Sample Size of 5

Bias, Confounding and Effect Modification in Epidemiology

When examining the relationship between an explanatory factor and an outcome, we are interested in identifying factors that may modify the explanatory factor's effect on the outcome (effect modifiers). We must also be aware of potential bias or confounding in a study because these can cause a reported association (or lack thereof) to be misleading. Bias and confounding are related to measurement and study design. Let's define these terms:

If the method used to select subjects or collect data results in an incorrect association,

THINK >> Bias!

If an observed association is not correct because a different (lurking) variable is associated with both the potential risk factor and the outcome, but it is not a causal factor itself,

THINK >> Confounding!

If an effect is real but the magnitude of the effect is different for different groups of individuals (e.g., males vs. females or blacks vs. whites),

THINK >> Effect modification!

Bias Resulting from Study Design

Bias limits validity (the ability to measure the truth within the study design) and generalizability (the ability to confidently apply the results to a larger population) of study results. Bias is rarely eliminated during analysis. There are two major types of bias:

Selection bias: systematic error in the selection or retention of participants

  • Suppose you are selecting cases of rotator cuff tears (a shoulder injury). Many older people have experienced this injury to some degree, but have never been treated for it. Persons who are treated by a physician are far more likely to be diagnosed (and identified as cases) than persons who are not treated by a physician. If a study only recruits cases among patients receiving medical care, there will be selection bias.
  • Some investigators may identify cases predicated upon previous exposure. Suppose a new outbreak is related to a particular exposure, for example, a particular pain reliever. If a press release encourages people taking this pain reliever to report to a clinic to be checked to determine if they are a case and these people then become the cases for the study, a bias has been created in sample selection. Only those taking the medication were assessed for the problem. Ascertaining a case based upon previous exposure creates a bias that cannot be removed once the sample is selected.
  • Exposure may affect the selection of controls – e.g, hospitalized patients are more likely to have been smokers than the general population. If controls are selected among hospitalized patients, the relationship between an outcome and smoking may be underestimated because of the increased prevalence of smoking in the control population.
  • In a cohort study, people who share similar characteristics may be lost to follow-up. For example, people who are mobile are more likely to change their residence and be lost to follow-up. If the length of residence is related to the exposure then our sample is biased toward subjects with less exposure.
  • In a cross-sectional study, the sample may be non-representative of the general population, leading to bias. For example, suppose the study population includes multiple racial groups, but members of one race participate less frequently in this type of study. A bias results.

Information bias (misclassification bias): Systematic error due to inaccurate measurement or classification of disease, exposure, or other variables.

  • Instrumentation - an inaccurately calibrated instrument creating a systematic error
  • Misdiagnosis - if a diagnostic test is consistently inaccurate, then information bias would occur
  • Recall bias - if individuals can't remember exposures accurately, then information bias would occur
  • Missing data - if certain individuals consistently have missing data, then information bias would occur
  • Socially desirable response - if study participants consistently give the answer that the investigator wants to hear, then information bias would occur

Misclassification can be differential or non-differential. Non-differential misclassification occurs at similar rates in the groups being compared and generally biases results toward the null. Differential misclassification occurs when we are more likely to misclassify cases than controls. For example, if you interview cases in person for a long period of time, extracting exact information, while the controls are interviewed over the phone for a shorter period of time using standard questions, this can lead to differential misclassification of exposure status between cases and controls.

Either type of misclassification can produce misleading results.

Confounding and Confounders

Confounding : A situation in which a measure of association or relationship between exposure and outcome is distorted by the presence of another variable. Positive confounding (when the observed association is biased away from the null) and negative confounding (when the observed association is biased toward the null) both occur.

Confounder: an extraneous variable that wholly or partially accounts for the observed effect of a risk factor on disease status. The presence of a confounder can lead to inaccurate results.

A confounder meets all three conditions listed below:

  • It is associated with the putative risk factor (exposure).
  • It is associated with the outcome; that is, it is a risk factor for the disease in its own right.
  • It is not in the causal pathway between exposure and disease.

The first two of these conditions can be tested with data. The third is more biological and conceptual.

Confounding masks the true effect of a risk factor on a disease or outcome due to the presence of another variable. We identify potential confounders from:

  • Prior experience with data
  • Three criteria for confounders

Example 3-6: Confounding

We survey patients as part of a cross-sectional study, asking whether they have coronary heart disease and whether they are diabetic. We generate a 2 × 2 table (below):

Crude Diabetes-CHD Association (cell counts)

Diabetes      CHD = 0    CHD = 1    Total
0 (absent)    2249       91         2340
1 (present)   190        26         216
Total         2439       117        2556

\(P_{0}=91 / 2340=3.89 \%\)

\(P_{1}=26 / 216=12.04 \%\)

Prevalence Ratio:
\(PR=P_{1} / P_{0}=12.04 / 3.89=3.10\)

Odds ratio \(=(2249 \times 26) /(91 \times 190)=3.38\)

'0' indicates those who do not have coronary heart disease, '1' is for those with coronary heart disease; similarly for diabetes, '0' is the absence, and '1' the presence of diabetes.

The prevalence of coronary heart disease among people without diabetes is 91 divided by 2340, or 3.89%: that is, 3.89% of people without diabetes have coronary heart disease. Similarly, the prevalence among those with diabetes is 26/216, or 12.04%. Our prevalence ratio, asking whether diabetes is a risk factor for coronary heart disease, is 12.04 / 3.89 = 3.10. The prevalence of coronary heart disease in people with diabetes is 3.1 times as great as it is in people without diabetes.

We can also use the 2 x 2 table to calculate an odds ratio as shown above:

(2249 × 26) / (91 × 190) = 3.38

The odds of having diabetes among those with coronary heart disease is 3.38 times as high as the odds of having diabetes among those who do not have coronary heart disease.
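These calculations are easy to reproduce. A minimal Python sketch, using only the counts from the 2 × 2 table above:

```python
# Counts from the crude diabetes-CHD table.
# Rows: diabetes (0/1); columns: CHD (0/1).
no_diab_no_chd, no_diab_chd = 2249, 91
diab_no_chd, diab_chd = 190, 26

p0 = no_diab_chd / (no_diab_no_chd + no_diab_chd)  # CHD prevalence among non-diabetics
p1 = diab_chd / (diab_no_chd + diab_chd)           # CHD prevalence among diabetics

prevalence_ratio = p1 / p0                                              # 12.04% / 3.89% = 3.10
odds_ratio = (no_diab_no_chd * diab_chd) / (no_diab_chd * diab_no_chd)  # (2249*26)/(91*190) = 3.38

print(f"P0 = {p0:.2%}, P1 = {p1:.2%}")
print(f"PR = {prevalence_ratio:.2f}, OR = {odds_ratio:.2f}")
```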

Which of these do you use? They come up with slightly different estimates.

It depends upon your primary purpose. Is your purpose to compare prevalences? Or do you wish to address the odds of diabetes as related to coronary heart disease status?

Now, let's add hypertension as a potential confounder.

Ask: "Is hypertension a risk factor for CHD (among non-diabetics)?"

First of all, prior knowledge tells us that hypertension is related to many heart-related diseases. Prior knowledge is an important first step, but let's test this with data.

We consider the 2 × 2 table below:

HYPERT (Hypertension) by CHD (Prevalent Coronary Heart Disease), non-diabetics (cell counts)

Hypertension    CHD = 0    CHD = 1    Total
0 (absent)      1572       51         1623
1 (present)     669        39         708
Total           2241       90         2331

Statistics for the table of HYPERT by CHD:

Statistic                        DF    Value    Prob
Chi-square                       1     7.435    0.006
Likelihood Ratio Chi-square      1     6.998    0.008
Continuity Adj. Chi-square       1     6.811    0.009
Mantel-Haenszel Chi-square       1     7.432    0.006
Fisher's Exact Test (Left)             0.997
Fisher's Exact Test (Right)            5.45E-03
Fisher's Exact Test (2-Tail)           9.66E-03
Phi Coefficient                        0.056
Contingency Coefficient                0.056
Cramer's V                             0.056

Effective Sample Size = 2331
Frequency Missing = 49

We are evaluating the relationship of CHD to hypertension in non-diabetics. You can calculate the prevalence ratios and odds ratios as suits your purpose.

These data show that there is a positive relationship between hypertension and CHD in non-diabetics. (note the small p-values)
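The chi-square statistics in the table above are straightforward to reproduce. A short sketch using scipy (scipy.stats.chi2_contingency applies the Yates continuity correction to 2 × 2 tables by default, so correction=False recovers the uncorrected Pearson value):

```python
import numpy as np
from scipy.stats import chi2_contingency

# Hypertension (rows: 0/1) by CHD (columns: 0/1) among non-diabetics.
table = np.array([[1572, 51],
                  [669, 39]])

chi2, p, dof, expected = chi2_contingency(table, correction=False)
print(f"Pearson chi-square = {chi2:.3f}, df = {dof}, p = {p:.3f}")  # ~7.435, p ~ 0.006

chi2_adj, p_adj, _, _ = chi2_contingency(table, correction=True)
print(f"Continuity-adjusted chi-square = {chi2_adj:.3f}, p = {p_adj:.3f}")  # ~6.811, p ~ 0.009
```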

This leads us to our next question, "Is diabetes (exposure) associated with hypertension?"

We can answer this with our data as well (below):

HYPERT (Hypertension) by DIABETES (cell counts)

Hypertension    Diabetes = 0    Diabetes = 1    Total
0 (absent)      1650            85              1735
1 (present)     721             136             857
Total           2371            221             2592

Statistics for the table of HYPERT by DIABETES:

Statistic                        DF    Value     Prob
Chi-square                       1     88.515    0.001
Likelihood Ratio Chi-square      1     82.438    0.001
Continuity Adj. Chi-square       1     87.114    0.001
Mantel-Haenszel Chi-square       1     88.481    0.001
Fisher's Exact Test (Left)             1.000
Fisher's Exact Test (Right)            1.01E-19
Fisher's Exact Test (2-Tail)           1.79E-19

Again, the results are highly significant! Therefore, our first two criteria have been met for hypertension as a confounder in the relationship between diabetes and coronary heart disease.

A final question, "Is hypertension an intermediate pathway between diabetes (exposure) and development of CHD?" – or, vice versa, does diabetes cause hypertension which then causes coronary heart disease? Based on biology, that is not the case. Diabetes in and of itself can cause coronary heart disease. Using the data and our prior knowledge, we conclude that hypertension is a major confounder in the diabetes-CHD relationship.

What do we do now that we know that hypertension is a confounder?

Stratify... Let's consider some stratified assessments.

Example 3-7: A Cross-Sectional Study

Stratification and Adjustment: the Diabetes-CHD Relationship Confounded by Hypertension

Earlier we arrived at a crude odds ratio of 3.38.

Crude Diabetes-CHD Association

Diabetes    CHD: Yes    CHD: No    Total
Yes         26          190        216
No          91          2249       2340
Total       117         2439       2556

\(OR_{\text {crude }}=(26 \times 2249) /(91 \times 190)=3.38\)

Now we will use the extended Mantel-Haenszel method to adjust for hypertension and produce an adjusted odds ratio. When we do so, the adjusted OR = 2.84.

The Mantel-Haenszel method takes into account the effect of the strata, presence or absence of hypertension.

If we limit the analysis to normotensives we get an odds ratio of 2.4.

Diabetes & CHD Among Normotensives

Diabetes    CHD: Yes    CHD: No    Total
Yes         6           77         83
No          51          1572       1623
Total       57          1649       1706

\(OR_{\text {HYP-NO }}=(6 \times 1572) /(77 \times 51)=2.40\)

Among hypertensives, we get an odds ratio of 3.04.

Diabetes & CHD Among Hypertensives

Diabetes    CHD: Yes    CHD: No    Total
Yes         20          113        133
No          39          669        708
Total       59          782        841

\(OR_{\text {HYP-YES }}=(20 \times 669) /(39 \times 113)=3.04\)

Both estimates of the odds ratio are lower than the odds ratio based on the entire sample. If you stratify a sample, without losing any data, wouldn't you expect to find the crude odds ratio to be a weighted average of the stratified odds ratios?

This is an example of confounding: the stratified results are both on the same side of the crude odds ratio. This is positive confounding because the unstratified estimate is biased away from the null hypothesis; the null is 1.0. The true odds ratio, accounting for the effect of hypertension, is 2.84 from the Mantel-Haenszel method. The crude odds ratio of 3.38 was biased away from the null of 1.0. (In some studies you are looking for a positive association; in others, a negative association, a protective effect; either way, differing from the null of 1.0.)

This is one way to demonstrate the presence of confounding. You may have a priori knowledge of confounded effects, or you may examine the data and determine whether confounding exists. Either way, when confounding is present, as in this example, the adjusted odds ratio should be reported. In this example, we report the odds ratio for the association of diabetes with CHD as 2.84, adjusted for hypertension.
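The Mantel-Haenszel pooled odds ratio is simply \(\sum_i (a_i d_i / n_i) / \sum_i (b_i c_i / n_i)\) across strata, so the 2.84 reported above can be checked by hand. A minimal sketch using the two stratum tables:

```python
# Each stratum as (a, b, c, d):
# a = diabetic & CHD, b = diabetic & no CHD,
# c = non-diabetic & CHD, d = non-diabetic & no CHD
normotensives = (6, 77, 51, 1572)
hypertensives = (20, 113, 39, 669)

def mantel_haenszel_or(strata):
    """Mantel-Haenszel pooled odds ratio: sum(a*d/n) / sum(b*c/n)."""
    num = sum(a * d / (a + b + c + d) for a, b, c, d in strata)
    den = sum(b * c / (a + b + c + d) for a, b, c, d in strata)
    return num / den

for name, (a, b, c, d) in [("normotensives", normotensives), ("hypertensives", hypertensives)]:
    print(f"OR among {name} = {a * d / (b * c):.2f}")  # 2.40 and 3.04

print(f"MH-adjusted OR = {mantel_haenszel_or([normotensives, hypertensives]):.2f}")  # 2.84
```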

If you are analyzing data using multivariable logistic regression, a rule of thumb is: if the odds ratio changes by 10% or more when the potential confounder is added, include it in the multivariable model. The question is not so much statistical significance as how much the confounding variable changes the effect. If a variable changes the effect by 10% or more, we consider it a confounder and leave it in the model.
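Applied to this example, the change-in-estimate rule looks like:

```python
crude_or, adjusted_or = 3.38, 2.84

# Rule of thumb: retain the covariate if the estimate moves by 10% or more.
relative_change = abs(crude_or - adjusted_or) / crude_or
print(f"Relative change = {relative_change:.0%}")  # ~16% -> keep hypertension in the model
```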

We will talk more about this later, but briefly here are some methods to control for a confounding variable (known a priori):

  • randomize individuals into different groups (use an experimental approach)
  • restrict/filter for certain groups
  • match in case-control studies
  • analysis (stratify, adjust)

Controlling potential confounding starts with a good study design including anticipating potential confounders.

Effect Modification (Interaction)

In the previous example, we saw both stratum-specific estimates of the odds ratio went to one side of the crude odds ratio. With effect modification, we expect the crude odds ratio to be between the estimates of the odds ratio for the stratum-specific estimates.

Consider the following examples:

  • The immunization status of an individual modifies the effect of exposure to a pathogen and specific types of infectious diseases. Why?
  • Breast cancer occurs in both men and women, but at very different rates: approximately 1.5/100,000 men and approximately 122.1/100,000 women, about an 80-fold difference. We can build a statistical model that shows that gender interacts with other risk factors for breast cancer, but why is this the case? Obviously, there are many biological reasons why this interaction should be present. This is the part that we want to look at from an epidemiological perspective. Consider whether the biology supports a statistical interaction that you might observe.

Think about it!

Why study effect modification? Why do we care?

  • to define high-risk subgroups for preventive actions,
  • to increase the precision of effect estimation by taking into account groups that may be affected differently,
  • to increase the ability to compare across studies that have different proportions of effect-modifying groups, and
  • to aid in developing a causal hypothesis for the disease

If you do not identify and properly handle an effect modifier, you will get an incorrect crude estimate. The (incorrect) crude estimator (e.g., RR, OR) is a weighted average of the (correct) stratum-specific estimators. If you do not sort out the stratum-specific results, you miss an opportunity to understand the biologic or psychosocial nature of the relationship between risk factors and outcome.

To consider effect modification in the design and conduct of a study:

  • Collect information on potential effect modifiers.
  • Power the study to test potential effect modifiers - if a priori you think that the effect may differ depending on the stratum, power the study to detect a difference.
  • Don't match on a potentially important effect modifier - if you do, you can't examine its effect.

To consider effect modification in the analysis of data:

  • Again, consider what potential effect modifiers might be.
  • Stratify the data by potential effect modifiers and calculate stratum-specific estimates of the effect of the risk on the outcome; determine if effect modification is present. If so,
  • Present stratum-specific estimates. Use Breslow-Day Test for Homogeneity of the odds ratios, from Extended Mantel-Haenszel method, or -2 log-likelihood test from logistic regression to test the statistical significance of potential effect modifiers and to calculate the estimators of exposure-disease association according to the levels of significant effect modifiers. Alternatively, if assumptions are met, use proportional hazards regression to produce an adjusted hazards ratio.

Example 3-8: Diabetes as a Risk for Coronary Heart Disease

When you combine men and women the crude odds ratio = 4.30.

Diabetes and Incident CHD - Females

Diabetes    Incident CHD = 0    Incident CHD = 1    Total
0           1191                25                  1216
1           93                  13                  106
Total       1284                38                  1322

\(CI_{0}=25 / 1216=2.06 \%\)

\(CI_{1}=13 / 106=12.26 \%\)

\(RR=12.26 / 2.06 \approx 5.97\)

\(OR=(1191 \times 13) /(25 \times 93)=6.66\)

Diabetes and Incident CHD - Males

Diabetes    Incident CHD = 0    Incident CHD = 1    Total
0           1003                70                  1073
1           77                  12                  89
Total       1080                82                  1162

\(CI_{0}=70 / 1073=6.52 \%\)

\(CI_{1}=12 / 89=13.48 \%\)

\(RR=13.48 / 6.52=2.07\)

\(OR=(1003 \times 12) /(70 \times 77)=2.23\)

Stratifying by gender, we can calculate different measures. Look at the odds ratios above. The odds ratio for women is 6.66, compared to the crude odds ratio of 4.30. Therefore, women are at much greater risk of diabetes leading to incident coronary heart disease. For men, the odds ratio is 2.23.

Is diabetes a risk for incident heart disease in men and in women? Yes. Is it the same level of risk? No. For men, the OR is 2.23, for women it is 6.66. The overall estimate is closer to a weighted average of the two stratum-specific estimates. Gender modifies the effect of diabetes on incident heart disease. We can see that numerically because the crude odds ratio is more representative of a weighted average of the two groups.

What is the most informative estimate of the risk of diabetes for heart disease? 4.30 is not very informative of the true relationship. What is much more informative is to present the stratum-specific analysis.
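If statsmodels is available, the stratified analysis above can be reproduced in a few lines. This sketch assumes the StratifiedTable class from statsmodels.stats.contingency_tables, whose test_equal_odds() performs a Breslow-Day-type test of homogeneity of the stratum odds ratios (mentioned earlier):

```python
import numpy as np
from statsmodels.stats.contingency_tables import StratifiedTable  # assumed available

# Per stratum: [[diabetic & CHD, diabetic & no CHD],
#               [non-diabetic & CHD, non-diabetic & no CHD]]
females = np.array([[13, 93], [25, 1191]])
males = np.array([[12, 77], [70, 1003]])

strat = StratifiedTable([females, males])
print(f"MH-pooled OR = {strat.oddsratio_pooled:.2f}")  # ~3.45, between 2.23 and 6.66
print(strat.test_equal_odds())  # a small p-value indicates the stratum ORs differ
```

A small homogeneity p-value here is consistent with effect modification by gender, so the stratum-specific odds ratios (6.66 and 2.23), not the pooled value, are the right quantities to report.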

During data analysis, major confounders and effect modifiers can be identified by comparing stratified results to overall results.

In summary, the process is as follows:

  • Estimate a crude (unadjusted) association between exposure and disease.
  • Stratify the analysis by any potential major confounders to produce stratum-specific estimates.
  • Compare the crude estimator with the stratum-specific estimates and examine the kind of relationships exhibited.

Confounding is indicated when:

  • the crude estimator (e.g., RR, OR) is outside the range of the two stratum-specific estimators (in the hypertension example, the crude odds ratio was higher than both of the stratum-specific ratios);
  • the adjusted estimator is importantly (not necessarily statistically) different (often 10%) from the crude estimator. In other words, if including the potential confounder changes the estimate of the risk by 10% or more, we consider it important and leave it in the model. Statistical methods (extended Mantel-Haenszel method, multiple regression, multiple logistic regression, proportional hazards) are available to calculate the “adjusted” estimator, accounting for confounders.

Effect modification is indicated when:

  • the crude estimator (e.g., RR, OR) is closer to a weighted average of the stratum-specific estimators;
  • the two stratum-specific estimators differ from each other. In that case, report separate stratified models or report an interaction term.

To review, confounders mask a true effect, and effect modifiers mean that there is a different effect for different groups.


Confounding Variables in Psychology Research


Confounding variables are external factors (typically a third variable) in research that can interfere with the relationship between dependent and independent variables .

At a Glance

A confounding variable alters the risk of the condition being studied and confuses the “true” relationship between the variables. The role of confounding variables in research is critical to understanding the causes of all kinds of physical, mental, and behavioral conditions and phenomena.

Real World Examples of Confounding Variables

Typical examples of confounding variables often relate to demographics and social and economic outcomes.

For instance, people who are relatively low in socioeconomic status during childhood tend to do, on average, worse financially than others when they reach adulthood, explains Glenn Geher, PhD, professor of psychology at State University of New York at New Paltz and author of “Own Your Psychology Major!” While we could simply conclude that poverty begets poverty, he says there are other variables conflated with poverty.

People with lower economic means tend to have less access to high quality education, which is also related to fiscal success in adulthood, Geher explained. Furthermore, poverty is often associated with limited access to healthcare and, thus, with increased risk of adverse health outcomes. These factors can also play roles in fiscal success in adulthood.

“The bottom line here is that when looking to find factors that predict adult economic success, there are many variables that predict this outcome, and so many of these factors are confounded with one another,” Geher said. 

The Impact of Confounding Variables on Research

Psychology researchers must be diligent in controlling for confounding variables, because if they are not, they may draw inaccurate conclusions.

For example, during a research project, Geher’s team found the number of stitches one received in childhood predicted one’s sexual activity in adulthood.

However, “to conclude that getting stitches causes promiscuous behavior would be unwarranted and odd. In fact, it is much more likely that childhood health outcomes, such as getting stitches, predict environmental instability during childhood, which has been found to indirectly bear on adult sexual and relationship outcomes,” said Geher.

In other words, the number of stitches is confounded with environmental instability in childhood. It's not that the number of stitches is directly correlated with sexual activity.

Another example that shows confounding variables is the idea that there is a positive correlation between ice cream sales and homicide rates. However, in fact, both these variables are confounded with time of year, said Geher. “They are both higher in summer when days are longer, days are hotter, and people are more likely to encounter others in social contexts because in the winter when it is cold people are more likely to stay home—so they are less likely to buy ice cream cones and to kill others,” he said. 

Both of these are examples of how it is in the best interest of researchers to ensure that they control for confounding variables to increase the likelihood that their conclusions are truly warranted.

Confounding variables that recur across an entire body of research on a topic can also be influential. In an evaluation of confounding variables that assessed the effect of alcohol consumption on the risk of ischemic heart disease, researchers found a large variation in the confounders considered across observational studies.

While 85 of 87 studies that the researchers analyzed made a connection between alcohol and ischemic heart disease, the confounding variables that could influence ischemic heart disease included smoking, age, BMI, height, and/or weight. This means that these factors, not just alcohol, could also have affected heart disease.

While most studies mentioned or alluded to “confounding” in their Abstract or Discussion sections, only one stated that their main findings were likely to be affected by confounding variables. The authors concluded that almost all studies ignored or eventually dismissed confounding variables in their conclusions.

Because study results and interpretations may be affected by the mix of potential confounders included within models, the researchers suggest that “efforts are necessary to standardize approaches for selecting and accounting for confounders in observational studies.”

Techniques to Identify Confounding Variables

The best way to control for confounding variables is to conduct “true experimental research,” which means researchers experimentally manipulate a variable that they think causes a certain outcome. They typically do this by randomly assigning study participants to different levels of the first variable, which is referred to as the “independent variable.”

For example, if researchers want to determine if, separate from other factors, receiving a full high-quality education, including a four-year college degree from a respected school, causes positive fiscal outcomes in adulthood, they would need to find a pool of participants, such as a group of young adults from the same broad socioeconomic group as one another. Once the group is selected, half of them would need to be randomly assigned to receive a free, high-quality education and the other half would need to be randomly assigned to not receive such an education.

“This methodology would allow you to see if there are fiscal outcomes on average for the two groups later in life and, if so, you could reasonably conclude that the cause of the differential fiscal outcomes is found in the educational differences across the two groups,” said Geher. “You can draw this conclusion because you randomly assigned the participants to these different groups—a process that naturally controls for confounding variables.”
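As a toy illustration of that random-assignment step (the participant labels and group sizes are hypothetical):

```python
import random

# Hypothetical participant pool; random assignment balances unmeasured
# confounders across conditions, in expectation.
participants = [f"P{i:03d}" for i in range(200)]

random.seed(42)  # seeded only to make the illustration reproducible
random.shuffle(participants)

education_group = participants[:100]   # would receive the free, high-quality education
comparison_group = participants[100:]  # would not
```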

However, with this process, different problems emerge. For instance, it would not be ethical or practical to randomly assign some participants to a “high-quality education” group and others to a “no-education” group.

“[Controlling] confounding variables via experimental manipulation is not always feasible,” Geher said. 

Because of this, there are also statistical ways to try to control for confounding variables, such as “partial correlation,” which looks at a correlation between two variables (e.g., childhood SES and adulthood SES) while factoring out the effects of a potential confounding variable (e.g., educational attainment).
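A minimal sketch of the partial-correlation idea, using the standard first-order formula; the simulated variables and effect sizes below are invented purely for illustration:

```python
import numpy as np

def partial_corr(x, y, z):
    """First-order partial correlation of x and y, controlling for z."""
    rxy = np.corrcoef(x, y)[0, 1]
    rxz = np.corrcoef(x, z)[0, 1]
    ryz = np.corrcoef(y, z)[0, 1]
    return (rxy - rxz * ryz) / np.sqrt((1 - rxz**2) * (1 - ryz**2))

# Simulated data in which education drives both childhood and adult SES.
rng = np.random.default_rng(0)
education = rng.normal(size=1_000)
childhood_ses = education + rng.normal(scale=0.5, size=1_000)
adult_ses = education + rng.normal(scale=0.5, size=1_000)

print(np.corrcoef(childhood_ses, adult_ses)[0, 1])        # strong raw correlation (~0.8)
print(partial_corr(childhood_ses, adult_ses, education))  # shrinks toward zero
```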

However, statistical control has limits: it can only address confounders that have been measured, and controlling for an inappropriate variable can itself introduce bias.

“This statistically oriented process is definitely not considered the gold standard compared with true experimental procedures, but often, it is the best you can do given ethical and/or practical constraints,” said Geher.

The Importance of Addressing Confounding Variables in Research

Controlling for confounding variables is critical in research primarily because it allows researchers to make sure that they are drawing valid and accurate conclusions. 

“If you don’t correct for confounding variables, you put yourself at risk for drawing conclusions regarding relationships between variables that are simply wrong (at the worst) or incomplete (at the best),” said Geher.

Controlling for confounding variables includes a basic set of skills when it comes to the social and behavioral sciences, he added. 

The Role of Confounding Variables in Valid Research

Human behavior is highly complex and any single action often has a broad array of variables that underlie it. 

“Understanding the concept of confounding variables, as well as how to control for these variables, makes for better behavioral science with conclusions that are, simply, more valid than research that does not effectively take confounding variables into account,” Geher said.

Wallach JD, Serghiou S, Chu L, et al. Evaluation of confounding in epidemiologic studies assessing alcohol consumption on the risk of ischemic heart disease. BMC Med Res Methodol. 2020;20(1):64. https://doi.org/10.1186/s12874-020-0914-6

Pourhoseingholi MA, Baghestani AR, Vahedi M. How to control confounding effects by statistical analysis. Gastroenterol Hepatol Bed Bench. 2012;5(2):79-83. https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4017459/

By Cathy Cassata, a freelance writer who specializes in stories around health, mental health, medical news, and inspirational people.

Confound (Experimental)

Sven Hilbert

Synonyms: Confounding factor; Confounding variable

An (experimental) confound is a factor affecting both the dependent and the independent variables systematically, thus being responsible for (at least part of) their statistical relationship.

Introduction

In quantitative psychological investigations, a researcher tries to discover statistical relationships between variables. This relationship is commonly quantified in terms of covariation in a statistical model. It is impossible to include all variables in the model, so any relationship revealed by the model may be caused or influenced by a variable that is not considered in the model. This variable responsible for the spurious relationship is called “confound.”

The Role of Confounds in (Psychological) Research

An empirical researcher conducting an investigation typically analyzes the relationship between dependent and independent variables using statistical models. These models have to be formulated including all variables...


What Is a Confounding Variable? Definition and Examples

A confounding variable leads to a false association between the independent and dependent variable.

A confounding variable is a variable that influences both the independent variable and dependent variable and leads to a false correlation between them. A confounding variable is also called a confounder, confounding factor, or lurking variable. Because confounding variables often exist in experiments, correlation does not mean causation. In other words, when you see a change in the independent variable and a change in the dependent variable, you can’t be certain the two variables are related.

Here are examples of confounding variables, a look at the difference between a confounder and a mediator, and ways to reduce the risk of confounding variables leading to incorrect conclusions.

Positive and Negative Confounding

Sometimes confounding points to a false cause-and-effect relationship, while other times it masks a true effect.

  • Positive Confounding: Positive confounding overestimates the relationship between the independent and dependent variables. It biases results away from the null hypothesis.
  • Negative Confounding: Negative confounding underestimates the relationship between the independent and dependent variables. It biases results toward the null hypothesis.

Confounding Variable Examples

  • In a study where the independent variable is ice cream sales and the dependent variable is shark attacks, a researcher sees that increased sales go hand-in-hand with shark attacks. The confounding variable is the heat index. When it’s hotter, more people buy ice cream and more people go swimming in (shark-infested) waters. There’s no causal relationship between people buying ice cream and getting attacked by sharks.
  • Real Positive Confounding Example: A 1981 Harvard study linked drinking coffee to pancreatic cancer. Smoking was the confounding variable in this study. Many of the coffee drinkers in the study also smoked. When the data was adjusted for smoking, the link between coffee consumption (the independent variable) and pancreatic cancer incidence (the dependent variable) vanished.
  • Real Negative Confounding Example: In a 2008 study of the toxicity (dependent variable) of methylmercury in fish and seafood (independent variable), researchers found the beneficial nutrients in the food (confounding variable) counteracted some of the negative effects of mercury toxicity.

Correlation does not imply causation. If you’re unconvinced, check out the spurious correlations compiled by Tyler Vigen.
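To make the first example concrete, here is a small simulation (all numbers invented) in which a heat index drives both variables, producing a strong association that largely disappears once the confounder is held roughly fixed:

```python
import numpy as np

rng = np.random.default_rng(1)
heat_index = rng.normal(size=500)  # the confounder

# Neither variable causes the other; both respond to heat.
ice_cream_sales = 2.0 * heat_index + rng.normal(size=500)
shark_attacks = 1.5 * heat_index + rng.normal(size=500)

print(np.corrcoef(ice_cream_sales, shark_attacks)[0, 1])  # clearly positive (~0.7)

# Restrict to days with similar heat: the association shrinks toward zero.
mild_days = np.abs(heat_index) < 0.25
print(np.corrcoef(ice_cream_sales[mild_days], shark_attacks[mild_days])[0, 1])
```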

How to Reduce the Risk of Confounding

The first step to reduce the risk of confounding variables affecting your experiment is to try to identify anything that might affect the study. It’s a good idea to check the literature or at least ask other researchers about confounders. Otherwise, you’re likely to find out about them during peer review!

When you design an experiment, consider these techniques for reducing the effect of confounding variables:

  • Introduce control variables . For example, if you think age is a confounder, only test within a certain age group. If temperature is a potential confounder, control it.
  • Be consistent about time. Take data at the same time of day. Repeat experiments at the same time of year. Don’t vary the duration of treatments within a single experiment.
  • When possible, use double blinding. In a double blind experiment , neither the researcher nor the subject knows whether or not a treatment was applied.
  • Randomize. Select control group subjects and test subjects randomly, rather than having the researcher choose the group or (in human experiments) letting the subjects select participation.
  • Use case controls or matching. If you suspect confounding variables, match the test subject and control as much as possible. In human experiments, you might select subjects of the same age, sex, ethnicity, education, diet, etc. For animal and plant studies, you’d use pure lines. In chemical studies, use samples from the same supplier and batch.
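A toy sketch of the exact-matching idea from the last bullet, with entirely hypothetical subject records:

```python
from collections import defaultdict

# (id, age band, sex, treated?)
subjects = [
    ("s1", "30-39", "F", True), ("s2", "30-39", "F", False),
    ("s3", "40-49", "M", True), ("s4", "40-49", "M", False),
    ("s5", "40-49", "M", False), ("s6", "50-59", "F", True),
]

# Pool the untreated subjects by the matching variables.
pool = defaultdict(list)
for sid, age, sex, treated in subjects:
    if not treated:
        pool[(age, sex)].append(sid)

# Pair each treated subject with an untreated subject of the same age band and sex.
pairs = []
for sid, age, sex, treated in subjects:
    if treated and pool[(age, sex)]:
        pairs.append((sid, pool[(age, sex)].pop(0)))

print(pairs)  # [('s1', 's2'), ('s3', 's4')]; 's6' has no match and is dropped
```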

Confounder vs Mediator or Effect Modifier

A confounder affects both the independent and dependent variables. In contrast, a mediator or effect modifier does not affect the independent variable, but does modify the effect the independent variable has on the dependent variable. For example, in a test of drug effectiveness, the drug may be more effective in children than adults. In this case, age is an effect modifier. Age doesn’t affect the drug itself, so it is not a confounder.

Confounder vs Bias

In a way, a confounding variable results in bias in that it distorts the outcome of an experiment. However, bias usually refers to a type of systematic error from experimental design, data collection, or data analysis. An experiment can contain bias without being affected by a confounding variable.

Confounding Variable: A factor that affects both the independent and dependent variables, leading to a false association between them.

Effect Modifier: A variable that positively or negatively modifies the effect of the independent variable on the dependent variable.

Bias: A systematic error that masks the true effect of the independent variable on the dependent variable.

  • Axelson, O. (1989). “Confounding from smoking in occupational epidemiology”.  British Journal of Industrial Medicine .  46  (8): 505–07. doi: 10.1136/oem.46.8.505
  • Kish, L (1959). “Some statistical problems in research design”.  Am Sociol .  26  (3): 328–338. doi: 10.2307/2089381
  • VanderWeele, T.J.; Shpitser, I. (2013). “On the definition of a confounder”.  Annals of Statistics .  41  (1): 196–220. doi: 10.1214/12-aos1058
  • Yule, G. Udny (1926). “Why do we Sometimes get Nonsense-Correlations between Time-Series? A Study in Sampling and the Nature of Time-Series”.  Journal of the Royal Statistical Society . 89 (1): 1–63. doi: 10.2307/2341482


Famous easy to understand examples of a confounding variable invalidating a study

Are there any well-known statistical studies that were originally published and thought to be valid, but later had to be thrown out due to a confounding variable that wasn't taken into account? I'm looking for something easy to understand that could be explained to and appreciated by a quantitative literacy class that has zero pre-requisites.

  • experiment-design
  • confounding
  • observational-study
  • 7 $\begingroup$ What is the difference between "quantitative literacy" and "numeracy" as a state of mind? $\endgroup$ –  Henry Commented Oct 23, 2019 at 23:35
  • 1 $\begingroup$ The Stanford Marshmallow Experiment is the first one that came to my mind. $\endgroup$ –  Apollys supports Monica Commented Oct 25, 2019 at 21:23
  • 1 $\begingroup$ I think of all the public opinion polls that assume no correlation between a person’s opinion and their willingness to answer questions from strangers (or their willingness to tolerate a phone call interrupting whatever). $\endgroup$ –  WGroleau Commented Oct 26, 2019 at 15:04
  • 1 $\begingroup$ It's just the preferred terminology. But I guess really to me numeracy means having a great grasp on number sense itself, whereas in our QL class we try to teach a smattering of lots of things - but probably none of them too well (such as statistics and logic). $\endgroup$ –  NathanLite Commented Oct 28, 2019 at 14:52

12 Answers

Coffee drinking & lung cancer.

My favorite example is that supposedly, "coffee drinkers have a greater risk of lung cancer", despite most coffee drinkers... well... drinking coffee, rather than inhaling it.

There have been various studies about this, but the consensus remains that studies with this conclusion usually just have a larger proportion of smoking coffee drinkers than non-smoking coffee drinkers. In other words, the effect of smoking confounds the effect of coffee consumption if not included in the model. The most recent article on this I could find is a meta-analysis by Vania Galarraga and Paolo Boffetta (2016). $^\dagger$

The Obesity Paradox

Another example that plagues clinical research, is the claim that obesity can be beneficial for certain diseases. Specifically, many articles, still to this day (just do a quick search for obesity paradox on pubmed and be amazed), claim the following:

  • While a higher BMI increases the risk of diabetes, cardiovascular disease and certain types of cancer, once a patient already has the disease, a higher BMI is associated with lower rates of major adverse events or death.

Why does this happen? Obesity is defined as excess fat negatively affecting health, yet we classify obesity based on BMI. BMI is just calculated as:

$$\text{BMI} = \frac{\text{weight in kg}}{(\text{height in m})^2},$$

so the most direct way to combat obesity is through weight loss (or by growing taller somehow).

Regimes that focus on loss of weight rather than fat tend to result in a proportionally large loss of muscle. This is likely what causes lower BMI to be associated with a higher rate of major adverse events.

Because many studies do not include measures of body fat (percentage), but only BMI as a proxy, the amount of body fat confounds the effect of BMI on health.

A nice review of this phenomenon was written by Steven G. Chrysant (2018). $^\ddagger$ He ends with:

[B]ased on the recent evidence, the obesity paradox is a misnomer and could convey the wrong message to the general public that obesity is not bad.

Followed by:

Journals [should] no longer accept articles about the 'obesity paradox'.

$\dagger$ : Vania Galarraga and Paolo Boffetta (2016): Coffee Drinking and Risk of Lung Cancer—A Meta-Analysis. Cancer Epidemiol Biomarkers Prev June 1 2016 (25) (6) 951-957; DOI: 10.1158/1055-9965.EPI-15-0727

$\ddagger$ : Steven G. Chrysant (2018): Obesity is bad regardless of the obesity paradox for hypertension and heart disease. J Clin Hypertens (Greenwich). 2018 May;20(5):842-846. doi: 10.1111/jch.13281. Epub 2018 Apr 17.

Examples of (poor) studies claiming to have demonstrated the obesity paradox:

  • McAuley et al. (2018): Exercise Capacity and the Obesity Paradox in Heart Failure: The FIT (Henry Ford Exercise Testing) Project
  • Weatherald et al. (2018): The association between body mass index and obesity with survival in pulmonary arterial hypertension
  • Patel et al. (2018): The obesity paradox: the protective effect of obesity on right ventricular function using echocardiographic strain imaging in patients with pulmonary hypertension

Articles refuting the obesity paradox as a mere confounding effect of body fat:

  • Lin et al. (2017): Impact of Misclassification of Obesity by Body Mass Index on Mortality in Patients With CKD
  • Leggio et al. (2018): High body mass index, healthy metabolic profile and low visceral adipose tissue: The paradox is to call it obesity again
  • Medina-Inojosa et al. (2018): Association Between Adiposity and Lean Mass With Long-Term Cardiovascular Events in Patients With Coronary Artery Disease: No Paradox
  • Flegal & Ioannidis (2018): The Obesity Paradox: A Misleading Term That Should Be Abandoned

Articles about the obesity paradox in cancer:

  • Cespedes et al. (2018): The Obesity Paradox in Cancer: How Important Is Muscle?
  • Caan et al. (2018): The Importance of Body Composition in Explaining the Overweight Paradox in Cancer-Counterpoint
  • 9 $\begingroup$ Another example I recall from our classes was the correlation of the amount of fire engines involved in fighting a fire with the damage of the fire. The confounding variable there was that the bigger the fire, the bigger the damages, but also the more engines were needed to fight the fire. I can't recall how serious those studies were or if it was just some toy example. $\endgroup$ –  JAD Commented Oct 24, 2019 at 9:33
  • 1 $\begingroup$ On the obesity paradox --- older subjects with higher BMI suffer fewer broken bones (hips etc) through falls it seems. Do they fall less? Are they 'cushioned' by fat (the Paradox explanation)? Have they got less athropied muscles and other health problems thus fewer falls? Etc. $\endgroup$ –  user3445853 Commented Oct 24, 2019 at 13:41
  • 1 $\begingroup$ To clarify, is the explanation of the obesity paradox that some people who are diagnosed with a disease try to lose weight, and consequently lose muscle, which has a negative effect, while others who do not try to lose weight, and therefore remain obese, do not suffer this effect? $\endgroup$ –  StackOverthrow Commented Oct 24, 2019 at 17:08
  • 1 $\begingroup$ On the obesity paradox, consider that many cancers cause the patient to lose a lot of weight, and "unexplained weight loss" is one of worrying symptoms that should lead a person to see a doctor - I'm convinced that this also has something do to with the paradox. $\endgroup$ –  Noctiphobia Commented Oct 24, 2019 at 21:05
  • $\begingroup$ I think I remember reading a "joke" study that determined that in football being ahead in the 3rd quarter was highly correlated with winning the game. The "takeaway" for coaches was to win more games, they should score more points than the other team. $\endgroup$ –  emory Commented Oct 25, 2019 at 11:35

You might want to introduce Simpson's Paradox.

The first example on that page is the UC Berkeley gender bias case, where it was thought that there was gender bias (towards males) in admissions when looking at overall acceptance rates, but this was eliminated or reversed when investigated by department. The confounding variable of department picked up on a gender difference in applying to more competitive departments.
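A small numeric illustration in the same spirit (the counts are invented, not the actual Berkeley data): women are admitted at a higher rate within each department, yet at a lower rate overall, because they mostly applied to the harder department.

```python
# (admitted, applied) per department and sex
depts = {
    "easy": {"men": (48, 80), "women": (14, 20)},
    "hard": {"men": (4, 20), "women": (24, 80)},
}

for name, groups in depts.items():
    for sex, (admitted, applied) in groups.items():
        print(f"{name:>4} {sex:>5}: {admitted / applied:.0%}")
# easy: men 60%, women 70%; hard: men 20%, women 30%

for sex in ("men", "women"):
    admitted = sum(depts[d][sex][0] for d in depts)
    applied = sum(depts[d][sex][1] for d in depts)
    print(f"overall {sex}: {admitted / applied:.0%}")  # men 52%, women 38%
```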

  • $\begingroup$ Another takeaway from that should be to investigate if this applying to more competitive departments (suppose for women) is a more general phenomenon? Anybody have references/links about that? $\endgroup$ –  kjetil b halvorsen ♦ Commented Dec 1, 2019 at 5:28

Power Lines and Cancer

After an initial study finding a link between living next to high-voltage transmission lines and cancer, follow-up studies found that when you include income in the model the effect of the power lines goes away.

Living next to power lines is a moderately accurate predictor of low household income / wealth. Put bluntly, there aren't as many fancy mansions next to transmission lines as elsewhere.

There is correlation between poverty and cancer. When comparisons were made between households on similar income brackets close to and far away from transmission lines, the effect of transmission lines disappeared.

In this case, the confounding variable was household income/wealth, which was associated both with living near a transmission line and with cancer risk.

Background reading .

  • $\begingroup$ As I recall the author of these studies was later shown to have outright falsified his data, the results were discredited and he is now prevented from doing research in this field. $\endgroup$ –  meh Commented Oct 29, 2019 at 19:21
  • $\begingroup$ Interesting. Do you have links? $\endgroup$ –  Jason Commented Oct 29, 2019 at 21:49
  • $\begingroup$ @aginensky The original study found correlation (as Jason said), but incorrectly claimed causation without sufficient data or evidence. The link does reference another researcher who did falsify a study in 1992 though (who was later fired in 1999 when the falsification was discovered). $\endgroup$ –  Graham Commented Dec 2, 2019 at 11:59

Consider the following examples. I am not sure they are necessarily very famous but they help to demonstrate the potential negative effects of confounding variables.

Say one is studying the relation between birth order (1st child, 2nd child, etc.) and the presence of Down Syndrome in the child. In this scenario, maternal age would be a confounding variable:

Higher maternal age is directly associated with Down Syndrome in the child

Higher maternal age is directly associated with Down Syndrome, regardless of birth order (a mother having her 1st vs 3rd child at age 50 confers the same risk)

Maternal age is directly associated with birth order (the 2nd child, except in the case of twins, is born when the mother is older than she was for the birth of the 1st child)

Maternal age is not a consequence of birth order (having a 2nd child does not change the mother's age)

More examples

In risk assessments, factors such as age, gender, and educational levels often affect health status and so should be controlled. Beyond these factors, researchers may not consider or have access to data on other causal factors. An example is the study of smoking tobacco on human health. Smoking, drinking alcohol, and diet are lifestyle activities that are related. A risk assessment that looks at the effects of smoking but does not control for alcohol consumption or diet may overestimate the risk of smoking (Tjønneland, Grønbaek, Stripp, & Overvad, 1999). Smoking and confounding are reviewed in occupational risk assessments such as the safety of coal mining (Axelson, 1989). When there is not a large sample population of non-smokers or non-drinkers in a particular occupation, the risk assessment may be biased towards finding a negative effect on health.

References: https://en.wikipedia.org/wiki/Confounding

Tjønneland, A., Grønbaek, M., Stripp, C., & Overvad, K. (1999). Wine intake and diet in a random sample of 48763 Danish men and women. The American Journal of Clinical Nutrition, 69(1), 49-54.

Axelson, O. (1989). Confounding from smoking in occupational epidemiology. British Journal of Industrial Medicine, 46(8), 505-507.

There was one about diet that looked at diet in different countries and concluded that meat caused all sorts of problems (e.g. heart disease), but failed to account for the average lifespan in each country: The countries that ate very little meat also had lower life expectancies and the problems that meat "caused" were ones that were linked to age.

I don't have citations for this - I read about it about 25 years ago - but maybe someone will remember or maybe you can find it.

  • 1 $\begingroup$ Are you sure it wasn’t fat rather than meat? The infamous “seven countries study” that started the don’t-eat-fat myth not only ignored a few variables but strangely omitted data from many countries. $\endgroup$ –  WGroleau Commented Oct 26, 2019 at 15:01
  • 2 $\begingroup$ I'm sort of sure it was meat. There could easily have been multiple studies. Diet is notorious for bad statistics. $\endgroup$ –  Peter Flom Commented Oct 26, 2019 at 18:54

I'm not sure it entirely counts as a confounding variable so much as confounding situations, but animals' abilities to find their way through a maze may qualify.

As described in this ScienceDirect summary, studies of rats (or other animals) in mazes were popular for a large part of the 20th century, and continue today to some extent. One possible purpose is to study the subject's ability to remember a maze which it has previously run; another popular purpose is to study any bias in the subject's choices of whether to turn left or right at junctions, in a maze which the subject has not previously run.

It should be immediately clear that if the subject has forgotten the maze, then any inherent bias in choice of route will be a confounding factor. If the "right" direction coincides with the subject's bias, then they could find their way in spite of not remembering the route.

In addition to this, studies found various other confounding features exist which might not have been considered. The height of walls and width of passages are factors, for example. And if another subject has previously navigated the maze, subjects which rely strongly on their sense of smell (mice and dogs, for instance) may find their way simply by tracking the previous subject's scent. Even the construction of the maze may be an issue - animals tend to be less happy to run over "hollow-sounding" floors.

Many animal maze studies ended up finding confounding factors instead of the intended study results. More disturbingly, according to Richard Feynman, the studies reporting these confounding factors were not picked up by researchers at the time. As a result, we simply don't know whether any animal maze studies carried out around this time have any validity whatsoever. That's decades' worth of high-end research at the finest universities around the world, by the finest psychologists and animal behaviourists, and every last shred of it had to be taken with a very large pinch of salt. Later researchers had to go back and duplicate all this work to find out what was actually valid and what wasn't repeatable.

  • 1 $\begingroup$ If I read your link correctly, it appears that Richard Feynman, who is not a researcher in ethology, animal behaviour or psychology, commented in a talk (not an academic conference or publication of any kind) on some unidentified unpublished “study” that may or may not have showed what he says it showed and then concluded that researchers in the field do not care about this because they do not talk about “Mr. Young”. And that's supposed to be an example of scientific rigor or evidence that researchers in the field do not care about confounding factors in general? $\endgroup$ –  Gala Commented Oct 26, 2019 at 10:56
  • 1 $\begingroup$ @Gala Not that they did not care, but that they did not know . Feynman's point wasn't about scientific rigor from individual scientists - he had no doubt they did the best they could. His point was that the scientific community as a whole was substandard at passing on this important information so that other people don't fall into the same trap. He was using this as a nice hook to hang his lesson on, but there are so many examples of this - Mendel, for instance. $\endgroup$ –  Graham Commented Oct 31, 2019 at 0:59

There was a great study of mobile phone use and brain cancer. Most people with a lateral brain cancer, when asked which hand they hold their phone in, answer the diseased side. This seemed to show that phone use caused cancer.

However, maybe the answers are informed by hindsight. Someone thought of a great test for this. The sample was big enough to include some people with two cancers. So you could ask: does the declared side of phone use influence the risk of a cancer on the other side of the brain? It was actually protective, thus exposing the hindsight bias in the original result.

Sorry, I don't have the reference.

'Statistics' by Freedman, Pisani, and Purves has a number of examples in the first couple of chapters. My personal favorite is that ice cream causes polio. The confounding variable is season: both ice cream consumption and polio transmission peaked in the summertime, when young children were out, about, and spreading polio. The book is Statistics (4th ed.) by David Freedman, Robert Pisani, and Roger Purves.

This is not a study, but a gallery of spurious correlations that could be appreciated by a quantitative literacy class. The downside of this is the lack of an explanation (aside from chance).

See: Subversive Subjects: Rule-Breaking and Deception in Clinical Trials https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4520402/

Hormone replacement therapy and heart disease?

https://www.teachepi.org/wp-content/uploads/OldTE/documents/courses/bfiles/The%20B%20Files_File1_HRT_Final_Complete.pdf

The benefits were inferred from observational data, and essentially it appears that the people who chose to take HRT had higher socioeconomic status, healthier lifestyles, etc.

(So one could argue about confounding vs. the limitations of observational studies.)

There are lots of good examples in Howard Wainer's books. In particular, Chapter 1, "The Most Dangerous Equation," in Picturing the Uncertain World: How to Understand, Communicate, and Control Uncertainty through Graphical Display.

Examples include:

The small schools movement. People noticed that some small schools had better performance than large schools, so money was spent to reduce school size. It turned out that some small schools also had worse performance than large schools. The pattern was largely an artefact of extreme outcomes showing up in small samples.

Kidney cancer rates (this example is also used in Daniel Kahneman's "Thinking, Fast and Slow"; see the start of Chapter 10). The lowest kidney cancer rates are found in rural, sparsely populated counties. These low rates must surely be due to the clean-living rural lifestyle. But wait: the counties with the highest incidence of kidney cancer are also rural and sparsely populated. That must surely be due to the lack of access to good medical care and too much drinking. Of course, the extremes at both ends are actually an artefact of the small populations.
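This artefact is easy to reproduce. The sketch below (with invented parameters) simulates counties that all share one true rate; the most extreme observed rates at both ends come almost entirely from the smallest counties, because the standard error of a rate shrinks like 1/sqrt(n) (de Moivre's equation, the subject of Wainer's chapter):

```python
import numpy as np

rng = np.random.default_rng(1)
true_rate = 0.0001                 # every county shares the SAME risk
populations = rng.integers(1_000, 1_000_000, size=3_000)
cases = rng.binomial(populations, true_rate)
observed = cases / populations

order = np.argsort(observed)
print("populations of the 10 lowest-rate counties: ", populations[order[:10]])
print("populations of the 10 highest-rate counties:", populations[order[-10:]])
# Both tails are dominated by small counties: with n = 1,000 residents a
# single case already yields ten times the true rate, while n = 1,000,000
# pins the observed rate close to the truth.
```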




Confounded Experimental Designs, Part 1: Incomplete Factorial Designs


Earlier we wrote about different kinds of variables. In short, dependent variables are what you get (outcomes), independent variables are what you set, and extraneous variables are what you can’t forget (to account for).

When you measure a user experience using metrics—for example, the SUPR-Q, SUS, SEQ, or completion rate—and conclude that one website or product design is good, how do you know it’s really the design that is good and not something else? While it could be due to the design, it could also be that extraneous (or nuisance) variables, such as prior experiences, brand attitudes, and recruiting practices, are  confounding your findings.

A critical skill when reviewing UX research findings and published research is the ability to identify when the experimental design is confounded.

Confounding can happen when there are variables in play that the design does not control and can also happen when there is insufficient control of an independent variable.

There are numerous strategies for dealing with confounding that are outside the scope of this article. In fact, it’s a topic that covers several years of graduate work in disciplines such as experimental psychology.

Our goal in this first of a series of articles is to show how to identify a specific type of confounded design in published experiments and demonstrate how their data can be reinterpreted once you’ve identified the confounding.

Incomplete Factorial Designs

One of the great scientific innovations in the early 20th century was the development of the analysis of variance (ANOVA) and its use in analyzing factorial designs. A full factorial design is one that includes multiple independent variables (factors), with experimental conditions set up to obtain measurements under each combination of levels of factors. This approach allows experimenters to estimate the significance of each factor individually (main effects) and see how the different levels of the factors might behave differently in combination (interactions). This is all great when the factorial design is complete, but when it’s incomplete, it becomes impossible to untangle potential interactions among the factors.

For example, imagine an experiment in which participants sort cards and there are two independent variables—the size of the cards (small and large) and the size of the print on the cards (small and large). This is the simplest full factorial experiment, having two independent variables (card size and print size), each with two levels (small and large). For this 2×2 factorial experiment, there are four experimental conditions:

  • Large cards, large print
  • Large cards, small print
  • Small cards, large print
  • Small cards, small print

The graph below shows hypothetical results for this imaginary experiment. There is an interaction such that the combination of large cards and large print led to a faster sort time (45 s), but all the other conditions have the same sort time (60 s).

[Figure: hypothetical sort times for the four card size × print size conditions]

But what if for some reason the experimenter had not collected data for the small card/small print condition? Then the marginal mean for large cards (averaging over print size) would equal the marginal mean for large print (averaging over card size): (60 + 45)/2 = 52.5. An experimenter focused on the effect of print size might claim that the data show a benefit to larger print, but the counterargument would be that the effect is due to card size instead. With this incomplete design, you couldn’t say with certainty whether the benefit in the large card/large print condition was due to card size, print size, or that specific combination.
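A tiny sketch makes the confounding concrete, using the hypothetical sort times above:

```python
import numpy as np

# Mean sort times (seconds) from the hypothetical 2x2 design, with the
# small card / small print cell never run.
times = {("large_card", "large_print"): 45,
         ("large_card", "small_print"): 60,
         ("small_card", "large_print"): 60}

def marginal(level, position):
    """Average every observed condition whose card (position 0) or
    print (position 1) level matches `level`."""
    return np.mean([t for cond, t in times.items() if cond[position] == level])

print("large cards:", marginal("large_card", 0))   # (45 + 60) / 2 = 52.5
print("large print:", marginal("large_print", 1))  # (45 + 60) / 2 = 52.5
# The two "main effects" are numerically identical contrasts against 60 s,
# so nothing in the data can attribute the speed-up to card size, print
# size, or their combination.
```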

Moving from hypothetical to published experiments, we first show confounding in a famous psychological study, then in a somewhat less famous but influential human factors study, and finally in UX measurement research.

Harry Harlow’s Monkeys and Surrogate Mothers

In the late 1950s and early 1960s, psychologist Harry Harlow conducted a series of studies with infant rhesus monkeys, most of which would be considered unethical by modern standards. In his most famous study, infant monkeys were removed from their mothers and given access to two surrogate mothers, one made of terry cloth (providing tactile comfort but no food) and one made of wire with a milk bottle (providing food but no tactile comfort). The key finding was that the infant monkeys preferred to spend more time close to the terry cloth mother, using the wire mother only to feed. The image below shows both mothers.

[Image: the wire and terry cloth surrogate mothers]

Image from Wikipedia.

In addition to the manipulation of comfort and food, there was also a clear manipulation of the surrogate mothers’ faces. The terry cloth mother’s face was rounded and had ears, nose, big eyes, and a smile. The wire mother’s face was square and devoid of potentially friendly features. With this lack of control, it’s possible that the infants’ preference for the terry cloth mother might have been due to just tactile comfort, just the friendly face, or a combination of the two. In addition to ethical issues associated with traumatizing infant monkeys, the experiment was deeply confounded.

Split Versus Standard Keyboards

Typing keyboards have been around for over 100 years, and there has been a lot of research on their design—different types of keys, different key layouts, and from the 1960s through the 1990s, different keyboard configurations. Specifically, researchers conducted studies of different types of split keyboards intended to make typing more comfortable and efficient by allowing a more natural wrist posture. The first split keyboard design was the Klockenberg keyboard, described in Klockenberg's 1926 book.

One of the most influential papers promoting split keyboards was “ Studies on Ergonomically Designed Alphanumeric Keyboards ” by Nakaseko et al., published in 1985 in the journal Human Factors. In that study, they described an experiment in which participants used three different keyboards—a split keyboard with a large wrist rest (see the figure below), a split keyboard with a small wrist rest, and a standard keyboard with a large wrist rest. They did not provide a rationale for failing to include a standard keyboard with a small wrist rest, and this omission made their experiment an incomplete factorial.

[Image: a split keyboard with a large wrist rest]

Image from Lewis et al. (1997), "Keys and Keyboards."

They had participants rank the keyboards by preference, with the following results:

Rank   Split with Large Rest   Split with Small Rest   Standard with Large Rest
1      16                      7                       9
2      6                       13                      11
3      9                       11                      11

(Cell entries are the number of participants assigning each rank.)

The researchers’ primary conclusion was “After the typing tasks, about two-thirds of the subjects asserted that they preferred the split keyboard models.” This is true because 23/32 participants’ first choice was a split keyboard condition. What they failed to note was that 25/32 participants’ first choice was a keyboard condition that included a large wrist rest. If they had collected data for a standard keyboard with a small wrist rest, it would have been possible to untangle the potential interaction—but they didn’t.

Effects of Verbal Labeling and Branching in Surveys

In recent articles, we explored the effect of verbal labeling of rating scale response options; specifically, whether partial or full labeling affects the magnitude of responses, first in a literature review, and then in a designed experiment.

One of the papers in our literature review was Krosnick and Berent (1993) [pdf]. They reported the results of a series of political science studies investigating the effects of full versus partial labeling of response options and branching. In the Branching condition, questions were split into two parts, with the first part capturing the direction of the response (e.g., “Are you a Republican, Democrat, or independent?”) and the second capturing the intensity (e.g., “How strong or weak is your party affiliation?”). In the Nonbranching condition, both direction and intensity were captured in one question. The key takeaway from their abstract was, “We report eight experiments … demonstrating that fully labeled branching measures of party identification and policy attitudes are more reliable than partially labeled nonbranching measures of those attitudes. This difference seems to be attributable to the effects of both verbal labeling and branching.”

If all you read was the abstract, you’d think that full labeling was a better measurement practice than partial labeling. But when you review research, you can’t just read and accept the claims in the abstract. The figure below shows part of Table 1 from Krosnick and Berent (1993). Note that they list only three question formats. If their experimental designs had been full factorials, there would have been four. Missing from the design is the combination of partial labeling and branching. The first four studies also omitted the combination of full labeling with nonbranching, so any “significant” findings in those studies could be due to labeling or branching differences.

[Image: part of Table 1 from Krosnick and Berent (1993), listing only three question formats]

Image from Krosnick and Berent (1993) [pdf].

The fifth study at least included the Fully Labeled Nonbranching condition and produced the following results (numbers in cells are the percentage of respondents who gave the same answer on two different administrations of the same survey questions):

             Full     Partial   Diff
Branching    68.4%    NA        NA
Nonbranching 57.8%    58.9%     1.1%
Diff         10.6%    NA

To analyze these results, Krosnick and Berent conducted two tests, one on the differences between Branching and Nonbranching holding Full Labeling constant and the second on the differences between Full and Partial Labeling holding Nonbranching constant. They concluded there was a significant effect of branching but no significant effect of labeling, bringing into question the claim they made in their abstract.

If you really want to understand the effects of labeling and branching on response consistency, the missing cell in the table above is a problem. Consider two possible hypothetical sets of results, one in which the missing cell matches the cell to its left and one in which it matches the cell below.

First hypothetical (missing cell matches the cell to its left):

             Full     Partial   Diff
Branching    68.4%    68.4%     0.0%
Nonbranching 57.8%    58.9%     1.1%
Difference   10.6%    9.5%

Second hypothetical (missing cell matches the cell below it):

             Full     Partial   Diff
Branching    68.4%    58.9%     -9.5%
Nonbranching 57.8%    58.9%     1.1%
Difference   10.6%    0.0%

In the first hypothetical, the conclusion would be that branching is more reliable than nonbranching and labeling doesn’t matter. For the second hypothetical, the conclusion would be that there is an interaction suggesting that full labeling is better than partial, but only for branching questions and not for nonbranching. But without data for the missing cell, you just don’t know!
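The arithmetic behind the two hypotheticals is easy to check. The sketch below takes the percentages from the tables above and computes the average branching effect, the average labeling effect, and the interaction contrast for each way of filling in the missing cell:

```python
# Observed cells (percent agreement across two administrations).
full_b, full_nb, part_nb = 68.4, 57.8, 58.9

for part_b, label in [(68.4, "missing cell matches Full/Branching"),
                      (58.9, "missing cell matches Partial/Nonbranching")]:
    branching = ((full_b - full_nb) + (part_b - part_nb)) / 2
    labeling = ((full_b - part_b) + (full_nb - part_nb)) / 2
    interaction = (full_b - full_nb) - (part_b - part_nb)
    print(f"{label}: branching={branching:+.2f}, "
          f"labeling={labeling:+.2f}, interaction={interaction:+.2f}")
# First fill:  branching +10.05, labeling -0.55, interaction  +1.1
# Second fill: branching  +5.30, labeling +4.20, interaction +10.6
```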

Summary and Discussion

When reading published research, it’s important to read critically. One aspect of critical reading is to identify whether the design of the reported experiment is confounded in a way that casts doubt on the researchers’ claims.

This is not a trivial issue, and as we’ve shown, influential research has been published that has affected social policy (Harlow’s infant monkeys), product claims (split keyboards), and survey design practices (labeling and branching). But upon close and critical inspection, the experimental designs were flawed by virtue of confounding; specifically, the researchers were drawing conclusions from incomplete factorial experimental designs.

In future articles, we’ll revisit this topic from time to time with analyses of other published experiments we’ve reviewed that, unfortunately, were confounded.


REVIEW article

EEG-based study of design creativity: a review on research design, experiments, and analysis

Morteza Zangeneh Soroush

  • Concordia Institute for Information Systems Engineering, Gina Cody School of Engineering and Computer Science, Concordia University, Montreal, QC, Canada

Brain dynamics associated with design creativity tasks are largely unexplored. Despite significant strides, there is a limited understanding of brain behavior during design creation tasks. The objective of this paper is to review the concepts of creativity and design creativity as well as their differences, and to explore the brain dynamics associated with design creativity tasks using electroencephalography (EEG) as a neuroimaging tool. The paper aims to provide essential insights for future researchers in the field of design creativity neurocognition. It seeks to examine fundamental studies, present key findings, and initiate a discussion on associated brain dynamics. The review employs thematic analysis and a forward and backward snowball search methodology with specific inclusion and exclusion criteria to select relevant studies. This search strategy ensured a comprehensive review focused on EEG-based creativity and design creativity experiments. Different components of those experiments, such as participants, psychometrics, experiment design, and creativity tasks, are reviewed and then discussed. The review identifies that while some studies have converged on specific findings regarding EEG alpha band activity in creativity experiments, there remain inconsistencies in the literature. The paper underscores the need for further research to unravel the interplay between these cognitive processes. This comprehensive review serves as a valuable resource for readers seeking an understanding of the current literature, principal discoveries, and areas where knowledge remains incomplete. It highlights both positive and foundational aspects, identifies gaps, and poses lingering questions to guide future research endeavors.

1 Introduction

1.1 Creativity, design, and design creativity

Investigating design creativity presents significant challenges due to its multifaceted nature, involving nonlinear cognitive processes and various subtasks such as divergent and convergent thinking, perception, memory retrieval, learning, inferring, understanding, and designing ( Gero, 1994 ; Gero, 2011 ; Nguyen and Zeng, 2012 ; Jung and Vartanian, 2018 ; Xie, 2023 ). Additionally, design creativity tasks are often ambiguous, intricate, and nonlinear, further complicating efforts to understand the underlying mechanisms and the brain dynamics associated with creative design processes.

Creativity, one of the higher-order cognitive processes, is defined as the ability to develop useful, novel, and surprising ideas ( Sternberg and Lubart, 1998 ; Boden, 2004 ; Runco and Jaeger, 2012 ; Simonton, 2012 ). Needless to say, creativity occurs in all parts of social and personal life and all situations and places, including everyday cleverness, the arts, sciences, business, social interaction, and education ( Mokyr, 1990 ; Cropley, 2015b ). However, this study particularly focuses on reviewing EEG-based studies of creativity and design creativity tasks.

Design, as a fundamental and widespread human activity, aiming at changing existing situations into desired ones ( Simon, 1996 ), is nonlinear and complex ( Zeng, 2001 ), and lies at the heart of creativity ( Guilford, 1959 ; Gero, 1996 ; Jung and Vartanian, 2018 ; Xie, 2023 ). According to the recursive logic of design ( Zeng and Cheng, 1991 ), a designer intensively interacts with the design problem, design environment (including stakeholders of design, design context, and design knowledge), and design solutions in the recursive environment-based design evolution process ( Zeng and Gu, 1999 ; Zeng, 2004 , 2015 ; Nagai and Gero, 2012 ). Zeng (2002) conceptualized the design process as an environment-changing process in which the product emerges from the environment, serves the environment, and changes the environment ( Zeng, 2015 ). Convergent and divergent thinking are two primary modes of thinking in the design process, which are involved in analytical, critical, and synthetic processes. Divergent thinking leads to possible solutions, some of which might be creative, to the design problem whereas convergent thinking will evaluate and filter the divergent solutions to choose appropriate and practical ones ( Pahl et al., 1988 ).

Creative design is inherently unpredictable; at times, it may seem implausible – yet it happens. Some argue that a good design process and methodology form the foundation of creative design, while others emphasize the significance of both design methodology and knowledge in fostering creativity. It is noteworthy that different designers may propose varied solutions to the same design problem, and even the same designer might generate diverse design solutions for the same problem over time (Zeng, 2001; Boden, 2004). Creativity may emerge spontaneously even when one does not set out to produce a creative design, and conversely a creative design may fail to materialize no matter how hard one tries. A design is considered routine if it operates within a design space of known and ordinary designs, innovative if it navigates within a defined state space of potential designs but yields different outcomes, and creative if it introduces new variables and structures into the space of potential designs (Gero, 1990). Moreover, it is conceivable that a designer may lack creativity while the product itself demonstrates creative attributes, and conversely, a designer may exhibit creativity while the resulting product does not (Yang et al., 2022).

Several models of design creativity have been proposed in the literature. In some earlier studies, design creativity was addressed as engineering creativity or creative problem-solving (Cropley, 2015b). As used in recent studies (Jia et al., 2021; Jia and Zeng, 2021), the stages of design creativity include problem understanding, idea generation, idea evolution, and idea validation (Guilford, 1959). Problem understanding and idea evaluation are assumed to be convergent cognitive tasks, whereas idea generation and idea evolution are considered divergent tasks in design creativity. An earlier model of creative thinking proposed by Wallas (1926) comprises four phases: preparation, incubation, illumination, and verification (Cropley, 2015b). The “Preparation” phase involves understanding a topic and defining the problem. During “Incubation,” one processes the information, usually subconsciously. In the “Illumination” phase, a solution appears, often unexpectedly. Lastly, “Verification” involves evaluating and implementing the derived solution. In addition to this model, a seven-phase model (an extended version of the 4-phase model) was later introduced, containing preparation, activation, generation, illumination, verification, communication, and validation (Cropley, 2015a, b). It is crucial to emphasize that these phases are not strictly sequential or distinct, in that interactions, setbacks, restarts, or premature conclusions might occur (Haner, 2005). In contrast to those empirical models of creativity, the nonlinear recursive logic of design creativity was rigorously formalized in a mathematical design creativity theory (Zeng, 2001; Zeng et al., 2004; Zeng and Yao, 2009; Nguyen and Zeng, 2012). For further details on the theories and models of creativity and design creativity, readers are directed to the referenced literature (Gero, 1994, 2011; Kaufman and Sternberg, 2010; Williams et al., 2011; Nagai and Gero, 2012; Cropley, 2015b; Jung and Vartanian, 2018; Yang et al., 2022; Xie, 2023).

1.2 Design creativity neurocognition

First, we would like to provide the definitions of “design” and “creativity” which can be integrated into the definition of “design creativity.” According to the Cambridge Dictionary, the definition of design is: “to make or draw plans for something.” In addition, the definition of creativity is: “the ability to make something new or imaginative.” So, the definition of design creativity is: “the ability to design something new and valuable.” With these definitions, we focus on design creativity neurocognition in this section.

It is of great importance to study design creativity neurocognition as the brain plays a pivotal role in the cognitive processes underlying design creativity tasks. So, to better investigate design creativity we need to concentrate on brain mechanisms associated with the related cognitive processes. However, the complexity of these tasks has led to a significant gap in our understanding; consequently, our knowledge about the neural activities associated with design creativity remains largely limited and unexplored. To address this gap, a burgeoning field known as design creativity neurocognition has emerged. This field focuses on investigating the intricate and unstructured brain dynamics involved in design creativity using various neuroimaging tools such as electroencephalography (EEG).

In a nonlinear evolutionary model of design creativity, it is suggested that the brain handles problems and ideas in a way that leads to unpredictable and potentially creative solutions (Zeng, 2001; Nguyen and Zeng, 2012). This involves cognitive processes like thinking of ideas and evolving and evaluating them, along with physical actions like drawing (Zeng et al., 2004; Jia, 2021). This indicates that the brain, as a complex and nonlinear system with characteristics like emergence and self-organization, goes through several cognitive processes which enable the generation of creative ideas and solutions. Exploring brain activities during design creativity tasks gives us better insight into the design process and improves how designers perform. As a result, design neurocognition combines traditional design study methods with approaches from cognitive neuroscience, neurophysiology, and artificial intelligence, offering unique perspectives on understanding design thinking (Balters et al., 2023). Although several studies have focused on design and creativity, the brain dynamics associated with design creativity are largely untouched. This motivated us to conduct this literature review: to explore the studies, gather the information and findings, and discuss them. Due to the advantages of electroencephalography (EEG) in design creativity experiments, which will be explained in the following paragraphs, we decided to focus on EEG-based neurocognition in design creativity.

As mentioned before, design creativity tasks are cognitive activities which are complex, dynamic, nonlinear, self-organized, and emergent. The brain dynamics of design creativity are largely unknown. Brain behavior recognition during design-oriented tasks helps scientists investigate neural mechanisms, vividly understand design tasks, enhance whole design processes, and better help designers ( Nguyen and Zeng, 2014a , b , 2017 ; Liu et al., 2016 ; Nguyen et al., 2018 , 2019 ; Zhao et al., 2018 , 2020 ; Jia, 2021 ; Jia et al., 2021 ; Jia and Zeng, 2021 ). Exploring brain neural circuits in design-related processes has recently gained considerable attention in different fields of science. Several studies have been conducted to decode brain activity in different steps of design creativity ( Petsche et al., 1997 ; Nguyen and Zeng, 2010 , 2014a , b , 2017 ; Liu et al., 2016 ; Nguyen et al., 2018 ; Vieira et al., 2019 ). Such attempts will lead to investigating the mechanism and nature of the design creativity process and consequently enhance designers’ performance ( Balters et al., 2023 ). The main question of the studies performed in design creativity neurocognition is whether and how we can explore brain dynamics and infer designers’ cognitive states using neuro-cognitive and physiological data like EEG signals.

Neuroimaging is a vital tool in understanding the brain’s structure and function, offering insights into various neurological and psychological conditions. It employs a range of techniques to visualize the brain’s activity and structure. Neuroimaging methods mainly include magnetic resonance imaging (MRI), computed tomography (CT), electroencephalography (EEG), functional near-infrared spectroscopy (fNIRS), functional MRI (fMRI), and magnetoencephalography (MEG). Neuroimaging techniques have helped researchers explore brain dynamics in complex cognitive tasks, one of which is design creativity (Nguyen and Zeng, 2014b; Gao et al., 2017; Zhao et al., 2020). Among these methods, electroencephalography (EEG) is one of the most widely used across applications. EEG, as an inexpensive and simple neuroimaging technique with high temporal resolution and acceptable spatial resolution, has been used to infer designers’ cognitive and emotional states. Zangeneh Soroush et al. (2023a, b) have recently introduced two comprehensive datasets encompassing EEG recordings in design and creativity experiments, stemming from several EEG-based design and design creativity studies (Nguyen and Zeng, 2014a; Nguyen et al., 2018, 2019; Jia, 2021; Jia et al., 2021; Jia and Zeng, 2021). In this paper, we review some of the most fundamental studies which have employed EEG to explore brain behavior in creativity and design creativity tasks.

1.3 EEG approach to studying creativity neurocognition

EEG stands out as a highly promising method for investigating brain dynamics across various fields, including cognitive, clinical, and computational neuroscience studies. In the context of design creativity, EEG offers a valuable means to explore brain activity, particularly considering the physical movements inherent in the design process. However, EEG analysis poses challenges due to its complexity, nonlinearity, and susceptibility to various artifacts. Therefore, gaining a comprehensive understanding of EEG and mastering its utilization and processing is crucial for conducting effective experiments in design creativity research. This review aims to examine studies that have utilized EEG in investigating design creativity tasks.

EEG is a technique for recording the electrical activity of the brain, primarily generated by neuronal firing within the human brain. In most cognitive studies this activity is captured non-invasively from the scalp, though intracranial EEG (iEEG) is recorded inside the skull, for instance in surgical planning for epilepsy. EEG signals are the result of voltage differences measured across two points on the scalp, reflecting the summed synchronized synaptic activities of large populations of cortical neurons, predominantly pyramidal cells (Teplan, 2002; Sanei and Chambers, 2013).

While the spatial resolution of EEG is relatively poor, EEG offers excellent temporal resolution, capturing neuronal dynamics within milliseconds, a feature not matched by other neuroimaging modalities like functional Near-Infrared Spectroscopy (fNIRS), Positron Emission Tomography (PET), or functional Magnetic Resonance Imaging (fMRI).

In contrast, fMRI provides much higher spatial resolution, offering detailed images of brain activity by measuring blood flow changes associated with neuronal activity. However, fMRI’s temporal resolution is lower than EEG, as hemodynamic responses are slower than electrical activities. PET, like fMRI, offers high spatial resolution and involves tracking a radioactive tracer injected into the bloodstream to image metabolic processes in the brain. It is particularly useful for observing brain metabolism and neurochemical changes but is invasive and has limited temporal resolution. fNIRS, measuring hemodynamic responses in the brain via near-infrared light, stands between EEG and fMRI in terms of spatial resolution. It is non-invasive and offers better temporal resolution than fMRI but is less sensitive to deep brain structures compared to fMRI and PET. Each of these techniques, with their unique strengths and limitations, provides complementary insights into brain function ( Baillet et al., 2001 ; Sanei and Chambers, 2013 ; Choi and Kim, 2018 ; Peng, 2019 ).

This understanding of EEG, from its historical development by Hans Berger in 1924 to modern digital recording and analysis techniques, underscores its significance in studying brain function and diagnosing neurological conditions. Despite advancements in technology, the fundamental methods of EEG recording have remained largely unchanged, emphasizing its enduring relevance in neuroscience ( Teplan, 2002 ; Choi and Kim, 2018 ).

1.4 Objectives and structure of the paper

Balters et al. (2023) conducted a comprehensive systematic review including 82 papers on design neurocognition, covering nine topics and a large variety of methodological approaches. A systematic review (Pidgeon et al., 2016) reported several EEG-based studies on functional neuroimaging of visual creativity. Although such a comprehensive review exists in the field of design neurocognition, only a few early reviews focused on creativity neurocognition (Fink and Benedek, 2014, 2021; Benedek and Fink, 2019).

The present review not only reports the studies but also critically discusses the previous findings and results. The rest of this paper is organized as follows: Section 2 introduces our review methodology; Section 3 presents the results from our review process, and Section 4 discusses the major implications of the existing design creativity neurocognition research in future studies. Section 5 concludes the paper.

2 Methods and materials

Figure 1 shows the main components of EEG-based design creativity studies: (1) experiment design, (2) participants, (3) psychometric tests, (4) experiments (creativity tasks), (5) EEG recording and analysis methods, and (6) final data analysis. The experiment design consists of the experimental protocol, which includes the (design) creativity tasks, the criteria for choosing participants, the conditions of the experiment, and the recorded physiological responses (here, EEG). Setting and adjusting these components plays a crucial role in successful experiments and reliable results. In this paper, we review studies based on the components in Figure 1.


Figure 1. The main components of EEG-based design creativity studies.

The components described in Figure 1 are consistent with the stress-effort model proposed by Nguyen and Zeng (Nguyen and Zeng, 2012; Zhao et al., 2018; Yang et al., 2021), which characterizes the relationship between mental stress and mental effort by a bell-shaped curve. This model defines mental stress as a ratio of the perceived task workload over the mental capability constituted by affect, skills, and knowledge. Knowledge is shaped by individual experience and understanding related to the given task workload. Skills encompass thinking styles, strategies, and reasoning ability. The degree of affect in response to a task workload can influence the effective utilization of the skills and knowledge. We thus used this model to form our research questions, determine the keywords, and conduct our analysis and discussions.

2.1 Research questions

We focused on studies assessing brain function in design creativity experiments through EEG analysis. For a comprehensive review, we followed a thorough search strategy based on thematic analysis (Braun and Clarke, 2012), which helped us code and extract themes from the initial (seed) papers. We began without a fixed topic, immersing ourselves in the existing literature to shape our research questions, keywords, and search queries. Our research questions informed the search keywords and, later, the search queries.

Our main research questions (RQs) were:

RQ1: What are effective experiment designs and protocols for ensuring high-quality EEG-based design creativity studies?
RQ2: How can we efficiently record, preprocess, and process EEG reflecting the cognitive workload associated with design creativity tasks?
RQ3: What are the existing methods to analyze the data extracted from EEG signals recorded during design creativity tasks?
RQ4: How can EEG signals provide significant insight into neural circuits and brain dynamics associated with design creativity tasks?
RQ5: What are the significant neuroscientific findings, shortcomings, and inconsistencies in the literature?

With the initial information extracted from the seed papers and the previous studies by the authors in this field ( Nguyen and Zeng, 2012 , 2014a , b ; Jia et al., 2021 ; Jia and Zeng, 2021 ; Yang et al., 2022 ; Zangeneh Soroush et al., 2024 ), we built a conceptual model represented by Figure 1 and then formed these research questions. With this understanding and the RQs, we set our search strategy.

2.2 Search strategy and inclusion-exclusion criteria

Our search started with broad terms like “design,” “creativity,” and “EEG.” These terms encapsulate the overarching cognitive activities and the physiological measurement. As we identified relevant papers, we refined our search keywords for a more targeted search. We used Boolean operators such as “OR” and “AND” to fine-tune our search queries, which we enhanced with the knowledge gained from the selected papers. The first phase started with thematic analysis and continued with choosing papers, obtaining knowledge, discussing the keywords, and updating the search queries, iteratively, until we reached a query that returned the desired results. We applied the thematic analysis only in the first iteration to make sure that we had a correct and comprehensive understanding of EEG-based design creativity, an appropriate set of keywords, and suitable search queries. Finally, we arrived at a comprehensive search query as follows:

(“EEG” OR “Electroenceph*” OR “brain” OR “neur*” OR “neural correlates” OR “cognit*”) AND (“design creativity” OR “ideation” OR “creative” OR “divergent thinking” OR “convergent thinking” OR “design neurocognition” OR “creativity” OR “creative design” OR “design thinking” OR “design cognition” OR “creation”)

The search query combines terminology related to design and creativity with terminology about EEG, neural activity, and the brain. In a general and quick evaluation, we found that the proposed query returned relevant studies in the field; this was a quick way to check how effectively our search keywords worked. We then went through well-known databases such as PubMed, Scopus, and Web of Science to collect a comprehensive set of original papers, theses, and reviews. These electronic databases were searched to reduce the risk of bias, obtain more accurate findings, and provide broad coverage of the literature. We continued our search in these databases until no more significant papers emerged. It is worth mentioning that we did not restrict the search to any specific time interval. We searched the fields “title,” “abstract,” and “keywords.” We then selected papers based on the following inclusion criteria:

1. The paper should answer one or more research questions (RQ1-RQ5).

2. The paper must be a peer-reviewed journal paper authored in English.

3. The paper should focus on EEG analysis related to creativity or design creativity for adult participants.

4. The paper should be related to creativity or design creativity in terms of the concepts, experiments, protocols, and probable models employed in the studies.

5. The paper should use established creativity tasks, including the Alternative Uses Task (AUT), the Torrance Tests of Creative Thinking (TTCT), or a specific design task. (These tasks will be detailed further on.)

6. The paper should include a quantitative analysis of EEG signals in the creativity or design creativity domain.

7. In addition to the above criteria, the authors checked the papers to make sure that the included publications had the characteristics of high-quality papers.

These criteria were used to select our initial papers from the large set of papers we collected from Scopus, Web of Science, and PubMed. It should be mentioned that conflicts were resolved through discussion and duplicate papers were removed.

After our initial selection, we used Google Scholar to perform a forward and backward snowball search. We chose the snowball search method because it offers an efficient alternative to a full systematic review: it is particularly valuable when dealing with emerging fields or when the scope of inquiry is evolving, allowing researchers to quickly uncover pertinent insights and form connections between seminal and contemporary works. During each iteration of the snowball search, we applied the aforementioned criteria to include or exclude papers. We continued the procedure until it converged on a final set of papers: after six iterations, no new significant papers were found (a sketch of this stopping rule follows).
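The loop below sketches that convergence criterion in Python. The helper callables `meets_criteria`, `cites`, and `cited_by` are hypothetical placeholders for the screening step and the citation lookups, not a real API:

```python
def snowball(seed_papers, meets_criteria, cites, cited_by):
    """Forward/backward snowball search: expand from the seeds until an
    iteration adds no new papers that pass the inclusion criteria."""
    included = set(seed_papers)
    frontier = set(seed_papers)
    while frontier:
        candidates = set()
        for paper in frontier:
            candidates |= cites(paper)      # backward: the reference list
            candidates |= cited_by(paper)   # forward: papers citing it
        frontier = {p for p in candidates
                    if p not in included and meets_criteria(p)}
        included |= frontier                # empty frontier => convergence
    return included
```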

Our search was complete by October 1, 2023. We then organized and studied the final included publications.

3 Results

3.1 Search results

Figure 2 illustrates the general flow of our search procedure, adapted from the PRISMA guidelines (Liberati et al., 2009). With the search keywords, we identified 1,878 studies during the thematic analysis phase, from which we selected the seed papers for the subsequent snowball search. After performing the snowball search and applying the inclusion and exclusion criteria, we finally selected 154 studies: 82 related to creativity (60 original papers, 12 theses, and 10 review papers) and 72 related to design creativity (63 original papers, 5 theses, and 4 review papers). In our search, we also found 6 related textbooks and 157 studies using other modalities (such as fMRI and fNIRS), which were excluded; we used the textbooks, theses, and their resources to gain more knowledge in the initial steps of our review, and some of the fMRI and fNIRS studies to evaluate the results in the discussion. In the snowball search process, a number of studies appeared consistently across all iterations, implying their high relevance and influence in the field; their recurrent selection and citation throughout the search underscores their importance in shaping the research landscape.


Figure 2. Search procedure and results (adapted from PRISMA), using thematic analysis in the first iteration and snowball search in the following iterations.

3.2 Design creativity neurocognition: history and trend

As discussed in Section 1, creativity and design creativity studies are different yet closely related, in that design creativity involves a more complex design process. In this subsection, we look at how design creativity neurocognition research followed creativity neurocognition research (though not necessarily in a causal manner).

3.2.1 History of creativity neurocognition

Three early studies in the field of creativity neurocognition are Martindale and Mines (1975), Martindale and Hasenfus (1978), and Martindale et al. (1984). In the first study (Martindale and Mines, 1975), it is stated that creative individuals may exhibit certain traits linked to lower cortical activation. This research showed distinct neural activities when participants engaged in two creativity tasks: the Alternative Uses Task (AUT) and the Remote Associates Test (RAT). The AUT, which gauges ideational fluency and involves unfocused attention, is related to higher alpha power in the brain. Conversely, the RAT, which centers on producing specific answers, shows varied alpha levels. Previous psychological research aligns with these findings, emphasizing the different nature of these tasks. Creativity, as determined by both tests, is associated with high alpha percentages during the AUT, hinting at an association between creativity and reduced cortical activation during creative tasks. However, highly creative individuals also show a mild deficit in cortical self-control, evident in their increased alpha levels, even when trying to suppress them. This behavior mirrors findings from earlier and later studies and implies that these individuals might have a predisposition to disinhibition. The varying alpha levels during cognitive tasks likely stem from their reaction to tasks rather than intentional focus shifts (Martindale and Mines, 1975).

In the second study ( Martindale and Hasenfus, 1978 ), the authors explored the relationship between creativity and EEG alpha band presence during different stages of the creative process. There were two experiments in this study. Experiment 1 found that highly creative individuals had lower alpha wave presence during the elaboration stage of the creative process, while Experiment 2 found that effort to be original during the inspiration stage was associated with higher alpha wave presence. Overall, the findings suggest that creativity is associated with changes in EEG alpha wave presence during different stages of the creative process. However, the relationship is complex and may depend on factors such as effort to be original and the specific stage of the creative process.

Finally, a series of three studies indicated a link between creativity and hemispheric asymmetry during creative tasks (Martindale et al., 1984). Creative individuals typically exhibited heightened right-hemisphere activity compared to the left during creative output. However, no distinct correlation was found between creativity and varying levels of hemispheric asymmetry during the inspiration versus elaboration phases, suggesting that this relationship is consistent across different stages of creative production. These findings laid a foundation for design creativity studies; they were explored further and confirmed by later work (Petsche et al., 1997), and later studies have used them to validate their results. In addition to these early studies, several reviews, such as Fink and Benedek (2014), Pidgeon et al. (2016), and Rominger et al. (2022a), provide a comprehensive overview of previous studies and their main findings, covering early as well as recent creativity research.

3.2.2 EEG-based creativity studies

In the preceding sections, we aimed to lay a foundational understanding of neurocognition in creativity, equipping readers with essential knowledge for the subsequent content. In this subsection, we will briefly introduce the main and most important points regarding creativity experiments. More detailed information can be found in Simonton (2000) , Srinivasan (2007) , Arden et al. (2010) , Fink and Benedek (2014) , Pidgeon et al. (2016) , Lazar (2018) , and Hu and Shepley (2022) .

This section presents key details from the selected studies in a structured format to facilitate easy understanding and comparison for readers. As outlined earlier, crucial elements in creativity research include the participants, psychometric tests used, creativity tasks, EEG recording and analysis techniques, and the methods of final data analysis. We have organized these factors, along with the principal findings of each study, into Table 1 . This approach allows readers to quickly grasp the essential information and compare various aspects of different studies. The table format not only aids in presenting data clearly and concisely but also helps in highlighting similarities and differences across studies, providing a comprehensive overview of the field. Following the table, we have included a discussion section. This discussion synthesizes the information from the table, offering insights and interpretations of the trends, implications, and significance of these studies in the broader context of creativity neurocognition. This structured presentation of studies, followed by a detailed discussion, is designed to enhance the reader’s understanding, and provide a solid foundation for future research in this dynamic and evolving field.


Table 1. A summary of EEG-based creativity neurocognition studies.

In our research, we initially conducted a thematic analysis and utilized a forward and backward snowball search method to select relevant studies. Of these, five studies employed machine learning techniques, while the remainder concentrated on statistical analysis of EEG features. It is noteworthy that all the chosen studies followed a similar methodology: recruiting participants, administering psychometric tests where applicable, conducting creativity tasks, recording EEG data, and concluding with final data analysis.

While most studies follow a similar experimental structure, some focus on other aspects of creativity, such as artistic creativity and poetry, targeting different evaluation methods and approaches. In Shemyakina and Dan’ko (2004) and Danko et al. (2009), the authors targeted creativity in producing proverbs or definitions of notions and emotions. In other studies (Leikin, 2013; Hetzroni et al., 2019), the experiments focused on creativity and problem-solving in autism and bilingualism. Moreover, some studies, such as Volf and Razumnikova (1999) and Razumnikova (2004), focus on gender differences in brain organization during creativity tasks. In another study (Petsche, 1996), approaches to verbal, visual, and musical creativity were explored through EEG coherence analysis. Similarly, Bhattacharya and Petsche (2005) analyzed brain dynamics in mentally composing drawings through differences in cortical integration patterns between artists and non-artists. We summarize the findings of EEG-based creativity studies in Table 1.

3.2.3 Neurocognitive studies of design and design creativity

Design is closely associated with creativity. On the one hand, by definition, creativity is a measure of the process of creating, for which design, either intentional or unconscious, is an indispensable constituent. On the other hand, it is important to note that not all designs are inherently creative; many designs follow established patterns and resemble existing ones, differing only in their specific context. Early research on design creativity aimed to differentiate between design and design creativity tasks by examining when and how designers exhibited creativity in their work. In recent years, much of the focus in design creativity research has shifted towards cognitive and neurocognitive investigations, as well as the development of computational models to simulate creative processes ( Borgianni and Maccioni, 2020 ; Lloyd-Cox et al., 2022 ). Neurocognitive studies employ neuroimaging methods (such as EEG) while computational models often leverage artificial intelligence or cognitive modeling techniques ( Zeng and Yao, 2009 ; Gero, 2020 ; Gero and Milovanovic, 2020 ). In this section, we review significant EEG-based studies in design creativity to focus more on design creation and highlight the differences. While most studies have processed EEG to provide more detailed insight into brain dynamics, some studies such as Goel (2014) outlined a preliminary framework encompassing cognitive and neuropsychological systems essential for explaining creativity in designing artifacts.

Several studies have recorded and analyzed EEG in design and design creativity tasks. Most neurocognitive studies have directly or indirectly employed frequency-based analysis, which examines EEG in specific frequency bands including delta (0.5–4 Hz), theta (4–8 Hz), alpha (8–13 Hz), beta (13–30 Hz), and gamma (>30 Hz). One of the main analyses is task-related power (TRP), which has provided a foundation for other analyses. It computes the power of the EEG signal associated with a design task in a specific frequency band relative to the power of the EEG in the rest condition. This analysis is simple and effective and reveals the physiological processes underlying EEG dynamics ( Rominger et al., 2018 ; Jia and Zeng, 2021 ; Gubler et al., 2022 ; Rominger et al., 2022b ).
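
To make the computation concrete, the following minimal Python sketch estimates TRP for a single channel. The two input arrays (task interval and rest interval) are hypothetical, and the band limits, Welch parameters, and log-ratio form follow the common convention just described; exact details vary across the cited studies.

```python
import numpy as np
from scipy.signal import welch

def band_power(x, fs, band):
    """Mean power spectral density of x within the given band (Hz), via Welch's method."""
    freqs, psd = welch(x, fs=fs, nperseg=min(1024, len(x)))
    mask = (freqs >= band[0]) & (freqs <= band[1])
    return psd[mask].mean()

def task_related_power(task, rest, fs, band=(8.0, 13.0)):
    """TRP as the log-ratio of task-interval to rest-interval band power.

    Negative values indicate a task-related power decrease (desynchronization);
    positive values indicate an increase (synchronization).
    """
    return np.log(band_power(task, fs, band)) - np.log(band_power(rest, fs, band))

# Hypothetical single-channel example: 10 s of task and rest EEG at 256 Hz
rng = np.random.default_rng(0)
fs = 256
print(task_related_power(rng.standard_normal(10 * fs), rng.standard_normal(10 * fs), fs))
```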

Frequency-based analyses have been widely employed. For example, Borgianni and Maccioni (2020) applied TRP analysis to compare the neurophysiological activations of mechanical engineers and industrial designers while conducting design tasks including problem-solving, basic design, and open design. These studies agree that higher alpha band activity is sensitive to specific task-related requirements, while the lower alpha band corresponds to attention processes such as vigilance and alertness ( Klimesch et al., 1998 ; Klimesch, 1999 ; Chrysikou and Gero, 2020 ). Higher alpha activity in the prefrontal region reflects complex cognitive processes, heightened internal attention (such as in imagination), and inhibition of task-irrelevant processing ( Fink et al., 2009a , b ; Fink and Benedek, 2014 ). On the other hand, higher alpha activity in the occipital and temporal lobes corresponds to visualization processes ( Vieira et al., 2022a ). In design research, frequency-based analysis has been widely employed to compare EEG characteristics across design activities such as idea generation and evaluation ( Liu et al., 2016 , 2018 ). Higher alpha is associated with open-ended tasks, visual association in expert designers, and divergent thinking ( Nguyen and Zeng, 2014b ; Nguyen et al., 2019 ). Higher beta and theta play a pivotal role in convergent thinking and constrained tasks ( Nguyen and Zeng, 2010 ; Liu et al., 2016 ; Liang and Liu, 2019 ).

The research in design and design creativity is not limited to frequency-based analyses. Nguyen et al. (2019) introduced microstate analysis to EEG-based design studies. Using microstate analysis, Jia and Zeng (2021) investigated EEG characteristics in a design creativity experiment in which EEG signals were recorded while participants performed modified TTCT tasks ( Nguyen and Zeng, 2014b ).

Following the same approach, Jia et al. (2021) analyzed EEG microstates to decode brain dynamics in design cognitive states including problem understanding, idea generation, rating idea generation, idea evaluation, and rating idea evaluation, where six design problems including designing a birthday cake, a toothbrush, a recycle bin, a drinking fountain, a workplace, and a wheelchair were used for the EEG based design experimental studies ( Nguyen and Zeng, 2017 ). The data of these two loosely controlled EEG-based design experiments are summarized and available for the research community ( Zangeneh Soroush et al., 2024 ).

We summarized the findings of EEG-based design and design creativity studies in Table 2 .


Table 2 . A summary of EEG-based design creativity neurocognition studies.

3.2.4 Trend analysis

The selected studies span a broad range of years, stretching from 1975 ( Martindale and Mines, 1975 ) to the present day, reflecting advancements in neuroimaging techniques and machine learning methods that have significantly aided researchers in their investigations. From the earliest studies to more recent ones, the primary focus has centered on EEG sub-bands, brain asymmetry, coherence analysis, and brain topography. Recently, machine learning methods have been employed to classify EEG samples into designers’ cognitive states. These studies can be roughly classified into the following distinct categories based on their proposed experiments and EEG analysis methods ( Pidgeon et al., 2016 ; Jia, 2021 ): (1) visual creativity versus baseline rest/fixation, (2) visual creativity versus non-rest control task(s), (3) individuals of high versus low creativity, (4) generation of original versus standard visual images, (5) creativity in virtual reality vs. real environments, and (6) loosely controlled vs. strictly controlled creativity experiments.

The included studies exhibited considerable variation in the tasks utilized and the primary contrasts examined. Some studies employed frequency-based or EEG power analysis to compare brain activity during visual creativity tasks with tasks involving verbal creativity or both verbal and visual demands; these comparison tasks often entail memory or convergent thinking. Several studies, however, adopted a simpler approach, comparing electrophysiological activity during visual creativity tasks against a baseline fixation or rest condition. Some studies compared neural activities between individuals characterized by high and low levels of creativity, while others compared the generation of original creative images with that of standard images. Several studies analyzed brain behavior with respect to creativity factors such as fluency and originality, typically employing statistical techniques to illustrate and elucidate differences between these factors and their corresponding brain behaviors. This variability underscores the diverse approaches taken by researchers to examine the neural correlates of design creativity ( Pidgeon et al., 2016 ). However, few studies have delved deeply into areas such as gender differences in creativity, creativity among individuals with mental or physical disorders, or creativity across job positions and skill sets. This suggests that there is significant untapped potential within the EEG-based design creativity research area.

In recent years, advancements in fMRI imaging and its applications have led several studies to replace EEG with fMRI to investigate brain behavior. fMRI measures metabolic (hemodynamic) activity, offering relatively high spatial resolution compared to EEG, though at the cost of lower temporal resolution. Despite this trade-off, the shift towards fMRI highlights the ongoing evolution of neuroimaging techniques in understanding the neural correlates of design creativity. fMRI studies provide a deep understanding of the neural circuits associated with creativity and can be used to evaluate EEG-based studies ( Abraham et al., 2018 ; Japardi et al., 2018 ; Zhuang et al., 2021 ).

The emergence of virtual reality (VR) has had a significant impact on design creativity studies, offering a wide range of experimentation possibilities. VR enables researchers to create diverse scenarios and creativity tasks, providing a dynamic and immersive environment for participants ( Agnoli et al., 2021 ; Chang et al., 2022 ). Through VR technology, various design creativity experiments can be conducted, allowing for novel approaches and innovative methodologies to explore the creative process. This advancement opens up new avenues for researchers to investigate the complexities of design creativity more interactively and engagingly.

Despite the significant progress over the past few decades, design and design creativity neurocognitive research is still in its early stages, due to the challenges identified in Zhao et al. (2020) and Jia et al. (2021) , which are summarized below:

1. Design tasks are open-ended, meaning there is no single correct outcome and countless acceptable solutions are possible. There are no predetermined or optimal design solutions; multiple feasible solutions may exist for an open-ended design task.

2. Design tasks are ill-defined as finding a solution might change or redefine the original task, leading to new tasks emerging.

3. Various emergent design tasks trigger design knowledge and solutions, which in turn can change or redefine tasks further.

4. The process of completing a design task depends on emerging tasks and the perceived priorities for completion.

5. The criteria to evaluate a design solution are set by the solution itself.

While a lot of lessons learned from creativity neurocognitive research can be borrowed to study design and design creativity neurocognition, new paradigms should be proposed, tested, and validated to advance this new discipline. This advancement will in turn move forward creativity neurocognition research.

3.3 Experiment protocol

Concerning the model described in Figure 1 , we arranged the following sections to cover all the main components of EEG-based design creativity studies. To give a general picture of such studies, we briefly explain their typical procedure. Since most design creativity neurocognition research has inherited its procedures, to varying degrees, from general creativity research, we will focus on AUT and TTCT tasks. The introduction of a loosely controlled paradigm, tEEG, can be found in Zhao et al. (2020) , Jia et al. (2021) , and Jia and Zeng (2021) . As Tables 1 , 2 show, almost all included studies record EEG signals while selected participants perform creativity tasks. The first step is determining the sample size, recruiting participants, and selecting the psychometric tests used to screen them. In some of these studies, participants take psychometric tests before performing the creativity tasks for screening or categorization. In this review, the tasks used to gauge creativity are the Alternate Uses Task (AUT) and the Torrance Test of Creative Thinking (TTCT). During these tasks, EEG is recorded and then preprocessed to remove any probable artifacts. The artifact-free EEG is then processed to extract specific features, which are subsequently subjected to either statistical analysis or machine learning methods. Statistical analysis typically compares brain dynamics across different creativity tasks like idea generation, idea evolution, and idea evaluation. Machine learning, on the other hand, categorizes EEG signals based on the associated creativity tasks. The final stage involves data analysis, which aims to deduce how brain dynamics correlate with the creativity tasks given to participants. This data analysis also compares EEG results with psychometric test findings to discern any significant differences in EEG dynamics and neural activity between groups.

3.3.1 Participants

The first component of these studies is their participants. In most studies, participants are right-handed, non-medicated, and have normal or corrected-to-normal vision. In some cases, the Edinburgh Handedness Inventory ( Oldfield, 1971 ) (with 11 items) or the hand dominance test (HDT) ( Steingrüber et al., 1971 ) was employed to determine participants’ handedness ( Rominger et al., 2020 ; Gubler et al., 2023 ; Mazza et al., 2023 ). While right-handedness is commonly considered in creativity studies, it is mentioned relatively less often in design creativity studies.

In most studies, participants are undergraduate or graduate students with different educational backgrounds and an age range of 18 to 30 years. In the included papers, participants did not report any history of psychiatric or neurological disorders or treatment. It should be noted that some studies, such as Ayoobi et al. (2022) and Gubler et al. (2022) , analyzed creativity in participants with health conditions such as multiple sclerosis and chronic pain, respectively. These studies usually conduct statistical analysis on the results of creativity tasks such as the AUT or the Remote Associates Test (RAT) and then associate the results with the health condition. In some studies, it is reported that participants were asked not to smoke for 1 h, drink coffee for 2 h, consume alcohol for 12 h, or have other stimulating beverages for several hours before the experiments. As mentioned in some design creativity studies, similar rules apply to design creativity experiments (participants are not allowed to have stimulating beverages).

In most studies, sample sizes ranged from 15 to 45 participants, except for a few studies ( Jauk et al., 2012 ; Perchtold-Stefan et al., 2020 ; Rominger et al., 2022a , b ) with larger samples of 100, 55, 93, and 74 participants, respectively. Some studies, such as Agnoli et al. (2020) and Rominger et al. (2020) , calculated their required sample size with the G*Power software ( Faul et al., 2007 ) based on the desired power for detecting a specific interaction effect involving response, hemisphere, and position ( Agnoli et al., 2020 ). Design creativity studies show the same trend, with minimum and maximum sample sizes of 8 and 84 participants, respectively. Similarly, in a few studies, sample sizes were estimated through statistical methods such as G*Power ( Giannopulu et al., 2022 ).
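
As an illustration of the kind of a priori power analysis these studies report, the sketch below reproduces typical G*Power-style calculations with Python's statsmodels library; the effect sizes, alpha, and power levels are placeholder values rather than figures taken from any cited study.

```python
from statsmodels.stats.power import FTestAnovaPower, TTestIndPower

# Two-group comparison (e.g., high- vs. low-creativity participants),
# assuming a medium effect size d = 0.5, alpha = .05, and power = .80
n_per_group = TTestIndPower().solve_power(effect_size=0.5, alpha=0.05, power=0.8)
print(f"independent t-test: n per group = {n_per_group:.1f}")

# One-way ANOVA across three task conditions, medium effect size f = 0.25
n_total = FTestAnovaPower().solve_power(effect_size=0.25, alpha=0.05,
                                        power=0.8, k_groups=3)
print(f"one-way ANOVA: total n = {n_total:.1f}")
```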

In most studies, a considerable number of participants were excluded for reasons such as not being fluent in the language used in the experiment, left-handedness, poor quality of the recorded signals, extensive EEG artifacts, failure to understand the experimental procedure, technical errors, data loss during the experiment, no variance in the ratings, and insufficient behavioral data. This shows that recording a high-quality dataset is quite challenging, as several factors determine whether the quality is acceptable. Two datasets (in design and creativity) with public access have recently been published in Mendeley Data ( Zangeneh Soroush et al., 2023a , b ). Except for these two datasets, to the best of our knowledge, there is no publicly available dataset of EEG signals recorded in design and design creativity experiments.

Regarding gender, only a few of the included papers directly examined the association between gender, design creativity, and brain dynamics ( Vieira et al., 2021 , 2022a ). In addition, most of the included papers did not use gender as an inclusion or exclusion criterion, and in some cases participants’ genders were not reported.

3.3.2 Psychometric tests

Before the EEG recording sessions, participants are often screened using psychometric tests, which are usually employed to categorize participants based on different aspects of intellectual ability, ideational fluency, and cognitive development. These tests provide scores on various cognitive abilities. Additionally, personality tests are used to create personas for participants, and self-report questionnaires measure traits such as anxiety, mood, and depression. The psychometric tests include the Intelligenz-Struktur-Test 2000-R (I-S-T 2000 R), which assesses general mental ability and specific intellectual abilities such as visuospatial, numerical, and verbal abilities. The Big Five inventory measures personality traits such as conscientiousness, extraversion, neuroticism, openness to experience, and agreeableness. Other tests, such as Spielberger’s state–trait anxiety inventory (STAI), are used for mood and anxiety, while the Eysenck Personality Questionnaire (EPQ-R) investigates possible personality correlates of task performance ( Fink and Neubauer, 2006 , 2008 ; Fink et al., 2009a ; Jauk et al., 2012 ; Wang et al., 2019 ). To the best of our knowledge, the included design creativity studies have not directly utilized psychometrics ( Table 2 ) to explore the association between participants’ cognitive characteristics and brain behavior. A few studies have used cognitive characteristics indirectly. For instance, Eymann et al. (2022) assessed the shared mechanisms of creativity and intelligence in creative reasoning and their correlations with EEG characteristics.

3.3.3 Creativity and design creativity tasks

In this section, we introduce some experimental creativity tasks, namely the Alternate Uses Task (AUT) and the Torrance Test of Creative Thinking (TTCT), and shed light on their correlation with design creativity. One of the main characteristics of design creativity is divergent thinking, its first phase, which both of these creativity tasks address. In addition, AUT and TTCT have been adopted and modified by several studies, such as Hartog et al. (2020) , Hartog (2021) , Jia et al. (2021) , Jia and Zeng (2021) , and Li et al. (2021) , for design creativity neurocognition studies. The figural version of the TTCT is aligned with the goals of design creativity tasks, where designers (specifically in engineering domains) create or draw their ideas, generate solutions, and evaluate and evolve the generated solutions ( Srinivasan, 2007 ; Mayseless et al., 2014 ; Jia et al., 2021 ).

Furthermore, design creativity studies have introduced different types of design tasks, from sequences of simple design problems to constrained and open design tasks ( Nguyen et al., 2018 ; Vieira et al., 2022a ). This variety of tasks opens new perspectives for design creativity neurocognition studies. For example, the six design problems have been employed in some studies ( Nguyen and Zeng, 2014b ), and ill-defined design tasks have been used to explore differences in brain dynamics between novice and expert designers ( Vieira et al., 2020d ).

The Alternate Uses Task (AUT), established by Guilford (1967) , is a prominent tool in psychological evaluations for assessing divergent thinking, an essential element of creativity. In the AUT ( Guilford, 1967 ), participants are prompted to think of new and unconventional uses for everyday objects. Each object is usually shown twice: initially in the normal (common) condition and subsequently in the uncommon condition. In the common condition, participants are asked to consider regular, everyday uses for the objects. Conversely, in the uncommon condition, they are encouraged to come up with unique, inventive uses for the objects ( Stevens and Zabelina, 2020 ). The test includes several items for consideration, e.g., brick, foil, hanger, helmet, key, magnet, pencil, and pipe. In the uncommon condition, participants are asked to come up with as many uses as they can for everyday objects, such as shoes; this requires them to think beyond the typical uses (e.g., foot protection) and envision novel uses (e.g., a plant pot or ashtray). The responses in this classic task do not distinguish between the two key elements of creativity: originality (being novel and unique) and appropriateness (being relevant and meaningful) ( Runco and Mraz, 1992 ; Wang et al., 2017 ). For instance, when using a newspaper in the AUT, responses can vary from common uses like reading or wrapping to more inventive ones like creating a temporary umbrella. The AUT requires participants to generate multiple uses for everyday objects, thereby measuring creativity through four main criteria: fluency (quantity of ideas), originality (uniqueness of ideas), flexibility (diversity of idea categories), and elaboration (detail in ideas) ( Cropley, 2000 ; Runco and Acar, 2012 ). In addition to the original AUT indices, some creativity tests include other indices such as fluency-valid and usefulness. Usefulness refers to how functional the ideas are ( Cropley, 2000 ; Runco and Acar, 2012 ), whereas fluency-valid, which counts only unique, non-repeated ideas, is defined as the number of valid ideas ( Prent and Smit, 2020 ). The AUT’s straightforward design and versatility make it a favored method for gauging creative capacity in diverse groups and settings, reflecting its universal applicability in creativity assessment ( Runco and Acar, 2012 ).
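
To make the scoring criteria concrete, the sketch below computes fluency, flexibility, and a frequency-based originality score for one participant's hypothetical AUT responses. The category assignments, pooled-response counts, and 5% originality cutoff are illustrative assumptions; actual scoring schemes (and the rater-based elaboration score, omitted here) differ across the studies cited above.

```python
from collections import Counter

# Hypothetical cleaned responses of one participant to "uses for a brick"
responses = ["paperweight", "doorstop", "plant stand", "heat retainer"]

# Assumed rater-assigned semantic categories for each response
categories = {"paperweight": "office", "doorstop": "household",
              "plant stand": "garden", "heat retainer": "other"}

# Assumed counts of each response pooled over a sample of 500 participants
corpus = Counter({"paperweight": 120, "doorstop": 300,
                  "plant stand": 15, "heat retainer": 3})
n_participants = 500

fluency = len(responses)                               # quantity of ideas
flexibility = len({categories[r] for r in responses})  # distinct idea categories
# A common originality rule: credit ideas given by fewer than 5% of the sample
originality = sum(1 for r in responses if corpus[r] / n_participants < 0.05)

print(f"fluency={fluency}, flexibility={flexibility}, originality={originality}")
```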

Developed by E. Paul Torrance in the late 1960s, the Torrance Test of Creative Thinking (TTCT) ( Torrance, 1966 ) is a foundational instrument for evaluating creative thinking. TTCT is recognized as a highly popular and extensively utilized tool for assessing creativity. Unlike the AUT, the TTCT is more structured and exists in two versions: verbal and figural. The verbal part of the TTCT, known as TTCT-Verbal, includes several subtests ( Almeida et al., 2008 ): (a) Asking Questions and Making Guesses (subtests 1, 2, and 3), where participants are required to pose questions and hypothesize about potential causes and effects; (b) Improvement of a Product (subtest 4), which involves suggesting modifications to the product; (c) Unusual Uses (subtest 5), where participants think of creative and atypical uses; and (d) Supposing (subtest 6), where participants imagine the outcomes of an unlikely event, as per Torrance. The figural component, TTCT-Figural, contains three tasks ( Almeida et al., 2008 ): (a) creating a drawing; (b) completing an unfinished drawing; and (c) developing a new drawing starting from parallel lines. An example of a figural TTCT task might involve uniquely finishing a partially drawn image, with evaluations based on the aforementioned criteria ( Rominger et al., 2018 ).

The TTCT includes a range of real-world reflective activities that encourage diverse thinking styles, essential for daily life and professional tasks. The TTCT assesses abilities in Questioning, Hypothesizing Causes and Effects, and Product Enhancement, each offering insights into an individual’s universal creative potential and originality ( Boden, 2004 ; Runco and Jaeger, 2012 ; Sternberg, 2020 ). It acts like a comprehensive test battery, evaluating multiple facets of creativity’s complex nature ( Guzik et al., 2023 ).

There are also other creativity tests, such as the Remote Associates Test (RAT), the Runco Creativity Assessment Battery (rCAB), and the Consensual Assessment Technique (CAT). The TTCT is valued for its extensive historical database of human responses, which serves as a benchmark for comparison, owing to the consistent demographic profile of participants over many years and the systematic gathering of responses for evaluation ( Kaufman et al., 2008 ). The Alternate Uses Task (AUT) and the Remote Associates Test (RAT) are appreciated for their straightforward administration, scoring, and analysis. The Consensual Assessment Technique (CAT) is notable for its adaptability to specific fields, made possible by employing a panel of experts in relevant domains to assess creative works. Consequently, the CAT is particularly suited for evaluating creative outputs in historical contexts or significant “Big-C” creativity ( Kaufman et al., 2010 ). In contrast, the AUT and TTCT are more relevant for examining creativity in everyday, psychological, and professional contexts. As such, AUT and TTCT tests establish a solid baseline for more complex design creativity studies employing more realistic design problems.

3.4 EEG recording and analysis: methods and algorithms

Electroencephalogram (EEG) signal analysis is a crucial component in the study of creativity whereby brain behavior associated with creativity tasks can be explored. Due to its advantages, EEG has emerged as one of the most suitable neuroimaging techniques for investigating brain activity during creativity tasks. Its affordability and suitability for studies involving physical movement, ease of recording and usage, and notably high temporal resolution make EEG a preferred choice in creativity research.

The dynamics during creative tasks are complex, nonlinear, and self-organized ( Nguyen and Zeng, 2012 ). It can thus be assumed that the brain exhibits similar characteristics, which should be reflected in EEG signals. Capturing these complex and nonlinear patterns of brain behavior can be challenging for other neuroimaging methods ( Soroush et al., 2018 ).

3.4.1 Preprocessing: artifact removal

In design creativity studies utilizing EEG, the susceptibility of EEG signals to noise and artifacts is a significant concern due to the physical movements inherent in these tasks. Consequently, EEG preprocessing is indispensable for ensuring data quality and reliability. Unfortunately, not all the studies included in this review have clearly explained their preprocessing and artifact removal approaches. There also exist well-known preprocessing pipelines such as HAPPE ( Gabard-Durnam et al., 2018 ) which, despite their efficiency, have rarely been used in design creativity neurocognition ( Jia et al., 2021 ; Jia and Zeng, 2021 ). The papers included in our analysis have introduced various preprocessing methods, including wavelet analysis, frequency-based filtering, and independent component analysis (ICA) ( Beaty et al., 2017 ; Fink et al., 2018 ; Lou et al., 2020 ). The primary objective of preprocessing remains consistent: to obtain high-quality EEG data devoid of noise or artifacts while minimizing information loss. Achieving this goal is crucial for the accurate interpretation and analysis of EEG signals in design creativity research.
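
As an illustration of a typical filtering-plus-ICA cleaning pipeline, the sketch below uses MNE-Python. The file name and EOG channel label are hypothetical, and in practice component selection often relies on visual inspection rather than the automatic EOG-based detection shown here.

```python
import mne
from mne.preprocessing import ICA

# Hypothetical recording; any format MNE can read works the same way
raw = mne.io.read_raw_edf("design_session_01.edf", preload=True)
raw.filter(l_freq=1.0, h_freq=40.0)   # band-pass; the 1 Hz high-pass stabilizes ICA
raw.notch_filter(freqs=50.0)          # mains interference (60 Hz in North America)

ica = ICA(n_components=20, random_state=97)
ica.fit(raw)

# With a recorded EOG channel, ocular components can be flagged automatically;
# otherwise they are identified by inspection, e.g., via ica.plot_components()
eog_indices, eog_scores = ica.find_bads_eog(raw, ch_name="EOG")
ica.exclude = eog_indices
raw_clean = ica.apply(raw.copy())
```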

3.4.2 Preprocessing: segmentation

Design creativity studies often encompass a multitude of cognitive tasks occurring simultaneously or sequentially, rendering them ill-defined and unstructured. This complexity leads to the generation of unstructured EEG data, posing a challenge for subsequent analysis ( Zhao et al., 2020 ). Therefore, segmentation methods play a crucial role in classifying recorded EEG signals into distinct cognitive tasks, such as idea generation, idea evolution, and idea evaluation.

Several segmentation methods have been adopted, including the ones relying on Task-Related Potential (TRP) analysis and microstate analysis, followed by clustering techniques like K-means clustering ( Nguyen and Zeng, 2014a ; Nguyen et al., 2019 ; Zhao et al., 2020 ; Jia et al., 2021 ; Jia and Zeng, 2021 ; Rominger et al., 2022b ). These methods aid in organizing EEG data into meaningful segments corresponding to different phases of the design creativity process, facilitating more targeted and insightful analysis. In addition, they provide possibilities to look into a more comprehensive list of design-related cognitions implied in but not intended by conventional AUT and TTCT experiments.

While some uniform segmentation methods (such as those based on TRP) employ frequency-based analysis, Nguyen et al. (2019) introduced a fully automatic, dynamic method based on microstate analysis. Since then, microstate analysis has been used in several studies to categorize EEG dynamics in design creativity tasks ( Jia et al., 2021 ; Jia and Zeng, 2021 ). Microstate analysis provides a novel method for EEG-based design creativity studies, offering high temporal resolution and topographic results ( Yuan et al., 2012 ; Custo et al., 2017 ; Jia et al., 2021 ; Jia and Zeng, 2021 ).
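
The sketch below illustrates the core idea of microstate-based segmentation: scalp maps at global field power (GFP) peaks are clustered, and every sample is then labeled with its nearest template map. It is a deliberately simplified stand-in, since published microstate pipelines use a polarity-invariant modified k-means plus temporal smoothing, and the array shapes here are hypothetical.

```python
import numpy as np
from scipy.signal import find_peaks
from sklearn.cluster import KMeans

def microstate_segments(eeg, n_states=4, random_state=0):
    """Crude microstate segmentation of `eeg` (channels x samples).

    Clusters normalized scalp maps at GFP peaks, then assigns each time point
    the label of its nearest cluster centroid.
    """
    gfp = eeg.std(axis=0)                 # global field power over time
    peaks, _ = find_peaks(gfp)            # moments of topographic stability
    maps = eeg[:, peaks].T
    maps /= np.linalg.norm(maps, axis=1, keepdims=True)
    km = KMeans(n_clusters=n_states, n_init=10, random_state=random_state).fit(maps)
    all_maps = eeg.T / np.linalg.norm(eeg.T, axis=1, keepdims=True)
    return km.predict(all_maps), km.cluster_centers_

# Hypothetical 32-channel recording with 5,000 samples
rng = np.random.default_rng(1)
labels, templates = microstate_segments(rng.standard_normal((32, 5000)))
```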

3.4.3 Feature extraction

The EEG data, after undergoing preprocessing, is directed to feature extraction, where relevant attributes are extracted to delve deeper into EEG dynamics and brain activity. These extracted features serve as the basis for conducting statistical analyses or employing machine learning algorithms.

In our review of the literature, we found that EEG frequency, time, and time-frequency analyses are the most commonly employed methods among the papers we considered. Specifically, the EEG alpha, beta, and gamma bands are often highlighted as critical indicators for studying brain dynamics in creativity and design creativity. Significant variations in the EEG bands have been observed during different stages of design creation tasks, including idea generation, idea evaluation, and idea elaboration ( Nguyen and Zeng, 2010 ; Liu et al., 2016 ; Rominger et al., 2019 ; Giannopulu et al., 2022 ; Lukačević et al., 2023 ; Mazza et al., 2023 ). For instance, the very first creativity studies used EEG alpha asymmetry to explore the relationship between creativity and left- and right-hemisphere brain activity ( Martindale and Mines, 1975 ; Martindale and Hasenfus, 1978 ; Martindale et al., 1984 ). Other studies divided the EEG alpha band into lower (8–10 Hz) and upper alpha (10–13 Hz) and concluded that the lower alpha band is more informative than the upper alpha band. Although the alpha band has been explored extensively, several studies have also analyzed other EEG sub-bands such as beta, gamma, and delta and concluded that these sub-bands are also significantly associated with creativity mechanisms and can explain gender differences in various creativity experiments ( Razumnikova, 2004 ; Volf et al., 2010 ; Nair et al., 2020 ; Vieira et al., 2022a ).
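
A minimal sketch of the classic alpha-asymmetry index follows, assuming two hypothetical arrays holding a homologous left/right electrode pair (e.g., F3/F4). The log-difference form is a common convention; band limits and electrode choices vary across the studies above.

```python
import numpy as np
from scipy.signal import welch

def mean_band_power(x, fs, lo=8.0, hi=13.0):
    """Mean alpha-band power spectral density via Welch's method."""
    freqs, psd = welch(x, fs=fs, nperseg=min(1024, len(x)))
    return psd[(freqs >= lo) & (freqs <= hi)].mean()

def alpha_asymmetry(left, right, fs):
    """ln(right) - ln(left) alpha power; positive values indicate
    relatively greater right-hemisphere alpha."""
    return np.log(mean_band_power(right, fs)) - np.log(mean_band_power(left, fs))

# Hypothetical 10 s of data from an F3/F4 pair sampled at 256 Hz
rng = np.random.default_rng(2)
print(alpha_asymmetry(rng.standard_normal(2560), rng.standard_normal(2560), fs=256))
```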

Several studies have utilized task-related power changes (TRP) to compare EEG dynamics across creativity tasks. TRP analysis is a high-temporal-resolution method used to examine changes in brain activity associated with specific tasks or cognitive processes. In TRP analysis, the power of EEG signals, typically measured within frequency bands (alpha, beta, theta, etc.), is analyzed to identify how brain activity varies during task performance compared to baseline or resting states. This method is particularly useful for understanding the dynamics of brain function, as it allows researchers to pinpoint which areas of the brain are more or less active during specific cognitive or motor tasks ( Rominger et al., 2022b ; Gubler et al., 2023 ). TRP is reportedly widely used in EEG-based design creativity studies ( Jia et al., 2021 ; Jia and Zeng, 2021 ; Gubler et al., 2022 ).

Event-related synchronization (ERS) and de-synchronization (ERD) have also been reported to be effective in creativity studies ( Wang et al., 2017 ). ERD refers to a decrease in EEG power (in a specific frequency band) compared to a baseline state. The reduction in alpha power, for instance, is often interpreted as an increase in cortical activity. Conversely, ERS denotes an increase in EEG power. The increase in alpha power, for example, is associated with a relative decrease in cortical activity ( Doppelmayr et al., 2002 ; Babiloni et al., 2014 ). Researchers have concluded that these two indicators play a pivotal role in creativity studies as they are significantly correlated with brain dynamics during creativity tasks ( Srinivasan, 2007 ; Babiloni et al., 2014 ; Fink and Benedek, 2014 ).
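
In the usual convention, ERD/ERS is quantified as the percentage change in band power relative to a reference interval, which the snippet below makes explicit; the numbers are illustrative, not values from any cited study.

```python
def erd_ers_percent(power_task: float, power_baseline: float) -> float:
    """Percentage band-power change: ((A - R) / R) * 100, with A the band power
    during the task interval and R during the reference (baseline) interval.
    Negative values indicate ERD (desynchronization); positive values, ERS."""
    return (power_task - power_baseline) / power_baseline * 100.0

print(erd_ers_percent(4.2, 6.0))   # -30.0, i.e., a 30% alpha desynchronization
```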

Brain functional connectivity analysis, EEG source localization, brain topography maps, and event-related potential analysis are other EEG processing methods that have been employed in a few studies ( Srinivasan, 2007 ; Dietrich and Kanso, 2010 ; Giannopulu et al., 2022 ; Kuznetsov et al., 2023 ). Given that these methods have seen little use so far, and considering their potential to provide insight into transient brain activity and into correlations between brain regions, future studies are encouraged to utilize them.

3.4.4 Data analysis and knowledge extraction

The foregoing indicates that EEG frequency analysis is an effective approach for examining brain behavior in creativity and design creativity processes ( Fink and Neubauer, 2006 ; Nguyen and Zeng, 2010 ; Benedek et al., 2011 , 2014 ; Wang et al., 2017 ; Rominger et al., 2018 ; Vieira et al., 2022b ). Analyzing EEG channels in the time or frequency domains across various creativity tasks helps identify key channels contributing to these experiments. TRP and ERD/ERS are well-known EEG analysis methods widely applied in the included studies, and some studies have used other EEG sub-bands such as delta or gamma ( Boot et al., 2017 ; Stevens and Zabelina, 2020 ; Mazza et al., 2023 ). Besides these methods, other studies have utilized EEG connectivity and produced brain topography maps to explore different stages of design creativity. The final stage of EEG-based research involves statistical analysis and classification.

In statistical analysis, researchers examine EEG characteristics like power or alpha band amplitude to determine whether there are notable differences during creativity tasks. Comparisons are made across brain lobes and participants to identify which brain regions are more active during various stages of creativity. Measures such as TRP, ERD, and ERS are scrutinized using statistical hypothesis testing to see whether brain dynamics vary among participants or across creativity tasks. Additionally, the relationship between EEG features and creativity scores is explored; for instance, researchers might investigate whether there is a link between EEG alpha power and creativity scores such as originality and fluency. These statistical analyses can be conducted on either temporal or spectral EEG data.
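
The sketch below illustrates both analyses described here — a within-subject comparison of an EEG feature across two task phases and a correlation between that feature and a creativity score — using synthetic data that stand in for hypothetical participant-level values.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)
n = 30                                       # hypothetical number of participants
trp_generation = rng.normal(-0.2, 0.5, n)    # alpha TRP during idea generation
trp_evaluation = rng.normal(0.1, 0.5, n)     # alpha TRP during idea evaluation
originality = rng.normal(3.0, 1.0, n)        # rated originality per participant

# Within-subject contrast between the two task phases
t, p = stats.ttest_rel(trp_generation, trp_evaluation)
print(f"paired t({n - 1}) = {t:.2f}, p = {p:.4f}")

# Association between the EEG feature and a creativity score
r, p_r = stats.pearsonr(trp_generation, originality)
print(f"Pearson r = {r:.2f}, p = {p_r:.4f}")
```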

In the classification phase, EEG data are classified according to different cognitive states of the brain. For example, EEG recordings might be classified based on the stages of creativity tasks, such as idea generation and idea evolution ( Hu et al., 2017 ; Stevens and Zabelina, 2020 ; Lloyd-Cox et al., 2022 ; Ahad et al., 2023 ; Şekerci et al., 2024 ). Except for a few studies that employed machine learning, most studies relied on EEG analysis and statistical methods. In the machine learning studies, the main objective is reported to be the classification of designers’ cognitive states, their emotional states, or their level of creativity. The included papers employed traditional classifiers such as support vector machines and k-nearest neighbors; modern deep learning approaches could be used in future studies to extract the hidden, valuable information of EEG in design creativity states ( Jia, 2021 ). In open-ended, loosely controlled creativity studies, where the phases of creativity are not clearly defined, clustering techniques are employed to categorize or segment EEG time intervals according to the corresponding creativity tasks ( Jia et al., 2021 ; Jia and Zeng, 2021 ). While loosely controlled design creativity studies result in more reliable and natural outcomes compared to strictly controlled ones, analyzing EEG signals in loosely controlled experiments is challenging because the recorded signals are not structured. Clustering methods are applied to microstate analysis to segment EEG signals into pre-defined states, yielding structured blocks that may align with certain cognitive functions ( Nguyen et al., 2019 ; Jia et al., 2021 ; Jia and Zeng, 2021 ). Therefore, statistical analysis, classification, and clustering form the core methods of data analysis in studies of creativity.
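
As a sketch of this classification step, the snippet below cross-validates the two traditional classifiers mentioned above with scikit-learn on hypothetical feature vectors; with the random features used here accuracy hovers at chance, whereas real band-power or microstate features would be expected to perform better.

```python
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

rng = np.random.default_rng(4)
X = rng.standard_normal((120, 16))   # hypothetical per-window EEG feature vectors
y = rng.integers(0, 2, 120)          # 0 = idea generation, 1 = idea evaluation

for name, clf in [("SVM", SVC(kernel="rbf")),
                  ("kNN", KNeighborsClassifier(n_neighbors=5))]:
    pipe = make_pipeline(StandardScaler(), clf)   # scale features before fitting
    scores = cross_val_score(pipe, X, y, cv=5)
    print(f"{name}: accuracy = {scores.mean():.2f} +/- {scores.std():.2f}")
```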

Table 2 represents EEG-based design studies with details about the number of participants, probable psychometric tests, experiment protocol, EEG analysis methods, and main findings. These studies are reported in this paper to highlight some of the differences between creativity and design creativity.

In addition to the studies reported in Table 2 , previous reviews and studies ( Srinivasan, 2007 ; Nguyen and Zeng, 2010 ; Lazar, 2018 ; Chrysikou and Gero, 2020 ; Hu and Shepley, 2022 ; Kim et al., 2022 ; Balters et al., 2023 ) comprehensively report approaches in design creativity neurocognition. Moreover, neurophysiological studies in design creativity are not limited to EEG or the components in Table 2 . For instance, in Liu et al. (2014) , EEG, heart rate (HR), and galvanic skin response (GSR) were used to detect designers’ emotions in computer-aided design tasks; the authors determined the emotional states of CAD design tasks by processing CAD operators’ physiological signals with a fuzzy logic model. Aiello (2022) investigated the effects of external factors (such as light) and human factors on design processes, and also explored the association between behavioral and neurophysiological responses in design creativity experiments. They employed ANOVA tests and found a significant correlation between neurophysiological recordings and the time of day, participants’ stress, and their performance in terms of novelty and quality. They also recognized different patterns of brain dynamics corresponding to different kinds of performance measures. Montagna et al. ( Montagna and Candusso, n.d. ; Montagna and Laspia, 2018 ) analyzed brain behavior during the creative ideation process in the earliest phases of product development. In addition to EEG, they employed eye tracking to analyze the correlations between brain responses and eye movements. They utilized statistical analysis to recognize significant differences across brain hemispheres and lobes with respect to participants’ background, academic degree, and gender during the two modes of divergent and convergent thinking. Although some of their results are not consistent with those from the literature, these experiments shed light on experiment design and provide insights and a framework for future experiments.

4 Discussion

In the present paper, we reviewed EEG-based design creativity studies in terms of their main components, such as participants, psychometrics, and creativity tasks. Numerous studies have delved into brain activities associated with design creativity tasks, examined from various angles. While Table 1 showcases studies centered on the Alternate Uses Test (AUT) and the Torrance Tests of Creative Thinking (TTCT), Table 2 summarizes the EEG-based studies on design and design creativity-related tasks. In this section, we discuss the impact of some of the most important factors, including participants, experiment design, and EEG recording and processing, on EEG-based design creativity studies. Research gaps and open questions are then presented based on the discussion.

4.1 Participants

4.1.1 Psychometrics: do we have the population we wished for?

Psychometric testing is crucial for participant selection, yet participant screening is often based merely on self-reported information or educational background. Examining Tables 1 , 2 reveals that psychometrics are not frequently utilized in design creativity studies, indicating a notable gap in these investigations. Future research should consider establishing a standard set of psychometric tests to create comprehensive participant profiles, particularly focusing on intellectual capabilities ( Jauk et al., 2015 ; Ueno et al., 2015 ; Razumnikova, 2022 ). From the studies that employed psychometrics, it can be inferred that there is a correlation between cognitive abilities such as intelligence and creativity ( Arden et al., 2010 ; Jung and Haier, 2013 ). The few psychometric tests employed primarily focus on providing a cognitive profile, encompassing factors such as mood, stress, IQ, anxiety, memory, and intelligence; intelligence-related assessments are more commonly used than other tests. These psychometrics are subject to social masking, whereby participants may provide unreliable self-reports during the experiments, which might yield less accurate findings.

4.1.2 Sample size and participants’ characteristics

Participant numbers in these studies vary widely, indicating a broad spectrum of sample sizes in this research area. Most studies had around 40 participants, predominantly students; across the selected studies, the sample size had a mean of 43.76 and a standard deviation of 20.50. It is worth noting that while some studies employed specific experimental designs to determine sample size, many did not have clear criteria for sample size determination, leaving the ideal sample size in such studies an open question. Many studies determine their sample sizes using G*Power ( Erdfelder et al., 1996 ; Faul et al., 2007 ), a prevalent tool for power analysis in social and behavioral research.

Initial investigations typically involved healthy adults to more thoroughly understand creativity’s underlying mechanisms. These foundational studies, conducted under optimal conditions, aimed to capture the essence of brain behavior during creative tasks. A handful of studies ( Ayoobi et al., 2022 ; Gubler et al., 2022 , 2023 ) have begun exploring creativity in the context of chronic pain or multiple sclerosis, but broader participant diversity remains an area for further research. Additionally, not all studies provided information on the ages of their participants. There is a noticeable gap in research involving older adults or those with health conditions, suggesting an area ripe for future exploration. Diversity in participant backgrounds, such as varying academic disciplines, could offer richer insights, given creativity’s multifaceted nature and its link to individual skills, affect, and perceived workload ( Yang et al., 2022 ). For instance, the creative approaches of students with engineering thinking might differ significantly from those with art thinking.

Gender was not examined in most included studies. There are just a few studies analyzing the effects of gender on creativity and design creativity ( Razumnikova, 2004 ; Volf et al., 2010 ; Vieira et al., 2020b , 2022a ; Gubler et al., 2022 ). There is a notable need for further investigation to fully understand the impact of gender on the brain dynamics of design creativity.

4.2 Experiment design

While the Alternate Uses Test (AUT) and the Torrance Tests of Creative Thinking (TTCT) are commonly used in creativity research, other tasks like the Remote Associates Test are also prevalent ( Schuler et al., 2019 ; Zhang et al., 2020 ). The AUT and figural TTCT are particularly favored in design creativity experiments for their compatibility with design tasks, surpassing verbal and other creativity tasks in applicability ( Boot et al., 2017 ). Among the creativity tasks in the included studies, the AUT is more frequently utilized than the TTCT, owing to its simplicity and the ease of quantifying creativity scores; the TTCT, in contrast, often requires subjective assessments and expert ratings for scoring ( Rogers et al., 2023 ). However, both the TTCT and the AUT have undergone modifications in several studies to investigate their potential characteristics further ( Nguyen and Zeng, 2014a ).

While the majority of studies have adhered to strictly controlled frameworks for their experiments, two studies ( Nguyen and Zeng, 2017 ; Nguyen et al., 2019 ; Jia, 2021 ; Jia et al., 2021 ) adopted novel, loosely controlled approaches, which reportedly yield more natural and reliable results than strictly controlled ones. The rigidity of strictly controlled creativity experiments can exert additional cognitive stress on participants, potentially impacting experimental outcomes. In contrast, loosely controlled experiments are self-paced and open-ended, allowing participants ample time to comprehend the design problem, generate ideas, evaluate them, and iterate upon them as needed. Recent behavioral and theoretical research suggests that creativity is better explored within a loosely controlled framework, where sufficient flexibility and freedom are essential. This approach, which contrasts with the highly regulated nature of traditional creativity studies, aims to capture the unpredictable elements of design activities ( Zhao et al., 2020 ). Loosely controlled design studies offer a more realistic portrayal of the actual design process: participants enjoy the liberty to develop ideas at their own pace, reflecting true design practice ( Jia, 2021 ). The flexibility of such experiments allows for a broader range of scenarios and outcomes, depending on the complexity of the tests and processes and the designers’ understanding of them. Prior research has confirmed the effectiveness of this approach, examining its validity from both neuropsychological and design perspectives. Despite their less rigid structure, loosely controlled experiments are valid and consistent with previous studies, and they allow researchers to engage with the nonlinear, ill-defined, open-ended, and intricate nature of creativity tasks. However, data collection and processing can pose challenges in loosely controlled experiments due to the resulting unstructured data; these challenges can be handled through machine learning and signal processing methods ( Zhao et al., 2020 ). For further details regarding loosely controlled experiments, readers can refer to the provided references ( Zhao et al., 2020 ; Jia et al., 2021 ; Jia and Zeng, 2021 ; Zangeneh Soroush et al., 2024 ).

Participants are affected by external and internal factors during the experiments. They are typically asked to abstain from caffeine, alcohol, and other stimulating beverages, yet the influence of such stimulants on creative brain dynamics remains an under-researched area. While some studies have investigated the impact of cognitive and affective stimulation on creativity [such as pain ( Gubler et al., 2022 , 2023 )], more extensive research is needed. Environmental factors like temperature, humidity, and lighting have also been noted to significantly influence creativity ( Kimura et al., 2023 ; Lee and Lee, 2023 ), and investigating these aspects could lead to more conclusive findings. Understanding these variables related to participants and their surroundings will enable more holistic and comprehensive creativity studies.

4.3 EEG recording and processing

4.3.1 Advantages and disadvantages of EEG in design creativity experiments

As previously discussed and generally known in the neuroscience research community, EEG stands out as a simple and cost-effective biosignal with high temporal resolution, facilitating the exploration of millisecond-scale brain dynamics and providing detailed insights into neural activity, as summarized in Balters and Steinert (2017) and Soroush et al. (2018) . However, despite these advantages, EEG recording is prone to high levels of noise and artifacts due to its low amplitude and bandwidth ( Zangeneh Soroush et al., 2022 ). The physical movements involved in design creativity experiments further increase the likelihood of artifacts such as movement and electrode displacement artifacts. Additionally, EEG has limitations, including relatively low spatial resolution; it provides less spatial information about brain behavior than methods such as fMRI, which captures detailed spatial patterns of brain activity.

4.3.2 EEG processing and data analysis

EEG preprocessing is an inseparable phase that ensures the quality of EEG data in design creativity experiments. Widely employed artifact removal methods include frequency-based filters and independent component analysis. Unfortunately, not all studies provide a detailed description of their artifact removal procedures ( Zangeneh Soroush et al., 2022 ), compromising the reproducibility of the findings. Moreover, while there are standard evaluation metrics for assessing the quality of preprocessed EEG data, these metrics are often overlooked or not discussed in the included papers. It is essential to note that EEG preprocessing extends beyond artifact removal to include the segmentation of unstructured EEG data into well-defined, structured EEG windows, each corresponding to a specific cognitive task. This presents a challenge, particularly in loosely controlled experiments where the cognitive activities of designers during drawing tasks may not be clearly delineated, since design tasks are recursive, nonlinear, self-paced, and complex, further complicating the segmentation process ( Nguyen and Zeng, 2012 ; Yang et al., 2022 ).

EEG analysis methods in creativity research predominantly utilize frequency-based analysis, with the alpha band (particularly the upper alpha band, 10–13 Hz) being a key focus due to its effectiveness in capturing various phases of creativity, including divergent and convergent thinking. Across studies, a consistent pattern of decreases in EEG power during design creativity compared to rest has been observed in the low-frequency delta and theta bands, as well as in the lower and upper alpha bands in bilateral frontal, central, and occipital brain regions ( Fink and Benedek, 2014 , 2021 ). This phenomenon, known as task-related desynchronization (TRD), is a common finding in EEG analysis during creativity tasks ( Jausovec and Jausovec, 2000 ; Pidgeon et al., 2016 ). A recurrent observation in numerous studies is the link between alpha band activity and creative cognition, particularly original idea generation and divergent thinking. Alpha synchronization, especially in the right hemisphere and frontal regions, is commonly associated with creative tasks and the generation of original ideas ( Rominger et al., 2022a ). Task-Related Power (TRP) analysis in the alpha band is widely used to decipher creativity-related brain activities. Creativity tasks typically result in increased alpha power, with more innovative responses correlating with stronger alpha synchronization in the posterior cortices. The TRP dynamics, marked by an initial rise, subsequent fall, and a final increase in alpha power, reflect the cognitive processes underlying creative ideation ( Rominger et al., 2018 ). Creativity is influenced by both cognitive processes and affective states, with studies showing that cognitive and affective interventions can enhance creative cognition through stronger prefrontal alpha activity. Different creative phases (e.g., idea generation, evolution, evaluation) exhibit unique EEG activity patterns. For instance, idea evolution is linked to a smaller decrease in lower alpha power, indicating varying attentional demands ( Fink and Benedek, 2014 , 2021 ; Rominger et al., 2019 , 2022a ; Jia and Zeng, 2021 ).

Hemispheric asymmetry plays a crucial role in creativity, with increased alpha power in the right hemisphere linked to the generation of more novel ideas. This asymmetry intensifies as the creative process unfolds. The frontal cortex, particularly through alpha synchronization, is frequently involved in creative cognition and idea evaluation, indicating a role in top-down control and internal attention ( Benedek et al., 2014 ). The parietal cortex, especially the right parietal cortex, is significant for focused internal attention during creative tasks ( Razumnikova, 2004 ; Benedek et al., 2011 , 2014 ).

EEG phase locking is another frequently employed analysis method. Most studies have focused on EEG coherence, EEG power and frequency analysis, brain asymmetry methods (hemispheric lateralization), and EEG temporal methods ( Rominger et al., 2020 ). However, since creativity is a higher-order, complex, nonlinear, and non-stationary cognitive task, linear and deterministic methods like frequency-based analysis might not fully capture its intricacies. This raises the possibility of incorporating alternative, specifically nonlinear, EEG processing methods, which, to our knowledge, have been used only sparingly in creativity research ( Stevens and Zabelina, 2020 ; Jia and Zeng, 2021 ). Additional analyses such as wavelet analysis, brain source separation, and source localization hold promise for future research in this domain.
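
As one example of a nonlinear measure that could complement frequency-based analysis, the sketch below implements the Higuchi fractal dimension, a widely used complexity index for short EEG segments. This is a generic textbook implementation offered for illustration, not a method taken from the studies reviewed here.

```python
import numpy as np

def higuchi_fd(x, k_max=10):
    """Higuchi fractal dimension of a 1-D signal: a complexity index that
    grows with signal irregularity (white noise approaches 2.0)."""
    n = len(x)
    curve_lengths = []
    for k in range(1, k_max + 1):
        lengths = []
        for m in range(k):
            idx = np.arange(m, n, k)
            if len(idx) < 2:
                continue
            # Normalized length of the curve sub-sampled at offset m, scale k
            norm = (n - 1) / ((len(idx) - 1) * k * k)
            lengths.append(np.abs(np.diff(x[idx])).sum() * norm)
        curve_lengths.append(np.mean(lengths))
    ks = np.arange(1, k_max + 1)
    # The fractal dimension is the slope of log L(k) against log(1/k)
    slope, _ = np.polyfit(np.log(1.0 / ks), np.log(curve_lengths), 1)
    return slope

rng = np.random.default_rng(5)
print(higuchi_fd(rng.standard_normal(1024)))   # close to 2 for white noise
```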

As mentioned in the previous section, most studies have recruited participants without characterizing their cognitive profiles. In addition, the included studies have adopted two main analysis approaches: traditional statistical analysis and machine learning methods ( Goel, 2014 ; Stevens and Zabelina, 2020 ; Fink and Benedek, 2021 ). It should be noted that almost all of the included studies employed traditional statistical methods to examine their hypotheses or explore the differences between participants performing creativity tasks ( Fink and Benedek, 2014 , 2021 ; Rominger et al., 2019 , 2022a ; Stevens and Zabelina, 2020 ; Jia and Zeng, 2021 ).

Individual differences, such as intelligence, personality traits, and humor comprehension, also affect EEG patterns during creative tasks. For example, individuals with higher monitoring skills and creative potential exhibit distinct alpha power changes during creative ideation and evaluation ( Perchtold-Stefan et al., 2020 ). The diversity in creativity tasks (e.g., AUT, TTCT, verbal tasks) and EEG analysis methods (e.g., ERD/ERS, TRP, phase locking) used in studies highlights the methodological variety in this field, emphasizing the complexity of creativity research and the necessity for multiple approaches to fully grasp its neurocognitive mechanisms ( Goel, 2014 ; Gero and Milovanovic, 2020 ; Rominger et al., 2020 ; Fink and Benedek, 2021 ; Jia and Zeng, 2021 ).

In statistical analysis, studies often assess the differences in extracted features across different categories. For instance, in a study ( Gopan et al., 2022 ), various features, including nonlinear and temporal features, are extracted from single-channel EEG data to evaluate levels of Visual Creativity during sketching tasks. This involves comparing different groups within the experimental population based on specific features. Notably, the traditional statistical analyses not only provide insights into differences between experimental groups but also offer valuable information for machine learning methods ( Stevens and Zabelina, 2020 ). In another study ( Gubler et al., 2023 ), researchers conducted statistical analysis on frequency-based features to explore the impact of experimentally induced pain on creative ideation among female participants using an adaptation of the Alternate Uses Task (AUT). The analysis involved examining EEG features across channels and brain hemispheres under pain and pain-free conditions. Similarly, in another study ( Benedek et al., 2014 ), researchers conducted statistical analysis on EEG alpha power to investigate the functional significance of alpha power increases in the right parietal cortex, which reflects focused internal attention. They found that the Alternate Uses Task (AUT) inherently relies on internal attention (sensory-independence). Specifically, enforcing internal attention led to increased alpha power only in tasks requiring sensory intake but not in tasks requiring sensory independence. Moreover, sensory-independent tasks generally exhibited higher task-related alpha power levels than sensory intake tasks across both experimental conditions ( Benedek et al., 2011 , 2014 ).

Although most studies have employed statistical measures and analyses to investigate brain dynamics in a limited number of participants, there is a considerable lack of within-subjects and between-subjects analyses ( Rominger et al., 2022b ). There exist several studies which differentiate the brain dynamics of expert and novice designers or engineering students in different fields ( Vieira et al., 2020c , d ); however, more investigations with a larger number of participants are required.

While statistical approaches are commonly employed in EEG-based design creativity studies, there is a notable absence of machine learning methods within this domain. Among the included studies, only one ( Gopan et al., 2022 ) utilized machine learning techniques. In this study, statistical and nonlinear features were extracted from preprocessed EEG signals to classify EEG data into predefined cognitive tasks based on EEG characteristics. The study employed machine learning algorithms such as Long Short-Term Memory (LSTM), Support Vector Machines (SVM), and k-Nearest Neighbor (KNN) to classify EEG samples. These methods were utilized to enhance the understanding of the relationship between EEG signals and cognitive tasks, offering a promising avenue for further exploration in EEG-based design creativity research ( Stevens and Zabelina, 2020 ).

4.4 Research gaps and open questions

In this review, we aimed to empower readers to decide on experiments, EEG markers, feature extraction algorithms, and processing methods based on their study objectives, requirements, and limitations. However, it is essential to acknowledge that this review, while valuable in exploring EEG-based creativity and design creativity, has certain limitations which are summarized below:

1. Our review focuses only on the neuroscientific aspects of prior creativity and design creativity studies; design methodologies and creativity models should be reviewed in other studies.

2. The included studies employed only limited numbers of adult participants with no mental or physical disorders.

3. Most studies have utilized fNIRS or EEG, as these modalities are more suitable for design creativity experiments, but we focused only on EEG-based studies.

As discussed above, EEG-based design creativity studies are a recent addition to the field of design, which means research gaps and open questions remain to be addressed in future studies. The following ten open questions emerged from this review.

1. What constitutes an optimal protocol for participant selection, creativity task design, and procedural guidelines in EEG-based design creativity research?

2. How can we reconcile inconsistencies arising from variations in creativity tests and procedures across different studies? Furthermore, how can we address disparities between findings in EEG and fMRI studies?

3. What notable disparities exist in brain dynamics when comparing different creativity tests within the realm of design creativity?

4. In what ways can additional physiological markers, such as ECG and eye tracking, contribute to understanding neurocognition in design creativity?

5. How can alternative EEG processing methods beyond frequency-based analysis enhance the study of brain behavior during design creativity tasks?

6. What strategies can be employed to integrate combinational methods like EEG-fMRI to investigate design creativity?

7. How can the utilization of advanced wearable recording systems facilitate the implementation of more naturalistic and ecologically valid design creativity experiments?

8. What are the most effective approaches for transforming unstructured data into organized formats in loosely controlled creativity experiments?

9. What neural mechanisms are associated with design creativity in various mental and physical disorders?

10. In what ways can the application of advanced EEG processing methods offer deeper insights into the neurocognitive aspects of design creativity?

5 Conclusion

Design creativity stands as one of the most intricate higher-order cognitive tasks, encompassing both mental and physical activities. It is a domain where design and creativity are intertwined, each representing a complex cognitive process. The human brain, an immensely sophisticated biological system, undergoes numerous intricate dynamics to facilitate creative abilities. The evolution of neuroimaging techniques, computational technologies, and machine learning has now enabled us to delve deeper into brain behavior during design creativity tasks.

This literature review aims to scrutinize and highlight pivotal and foundational research in this area. Our goal is to provide essential, comprehensive, and practical insights for future investigators in this field. We employed the snowball search method to reach the final set of papers meeting our inclusion criteria. More than 1,500 studies were screened and assessed as EEG-based creativity and design creativity studies, and we reviewed over 120 of them with respect to their experimental details, including participants, (design) creativity tasks, EEG analysis methods, and main findings. Our review reports the most important experimental details of EEG-based studies and highlights research gaps, potential future trends, and promising avenues for future investigations.

Author contributions

MZ: Formal analysis, Investigation, Writing – original draft, Writing – review & editing. YZ: Conceptualization, Funding acquisition, Methodology, Project administration, Resources, Supervision, Writing – review & editing.

Funding

The author(s) declare financial support was received for the research, authorship, and/or publication of this article. This work was supported by NSERC Discovery Grant (RGPIN-2019-07048), NSERC CRD Project (CRDPJ514052-17), and NSERC Design Chairs Program (CDEPJ 485989-14).

Conflict of interest

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Publisher’s note

All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.

References

Abraham, A., Rutter, B., Bantin, T., and Hermann, C. (2018). Creative conceptual expansion: a combined fMRI replication and extension study to examine individual differences in creativity. Neuropsychologia 118, 29–39. doi: 10.1016/j.neuropsychologia.2018.05.004

Agnoli, S., Zanon, M., Mastria, S., Avenanti, A., and Corazza, G. E. (2020). Predicting response originality through brain activity: an analysis of changes in EEG alpha power during the generation of alternative ideas. NeuroImage 207:116385. doi: 10.1016/j.neuroimage.2019.116385

Agnoli, S., Zenari, S., Mastria, S., and Corazza, G. E. (2021). How do you feel in virtual environments? The role of emotions and openness trait over creative performance. Creativity 8, 148–164. doi: 10.2478/ctra-2021-0010

Ahad, M. T., Hartog, T., Alhashim, A. G., Marshall, M., and Siddique, Z. (2023). Electroencephalogram experimentation to understand creativity of mechanical engineering students. ASME Open J. Eng. 2:21005. doi: 10.1115/1.4056473

Aiello, L. (2022). Time of day and background: How they affect designers neurophysiological and behavioural performance in divergent thinking. Polytechnic of Turin.

Almeida, L. S., Prieto, L. P., Ferrando, M., Oliveira, E., and Ferrándiz, C. (2008). Torrance test of creative thinking: the question of its construct validity. Think. Skills Creat. 3, 53–58. doi: 10.1016/j.tsc.2008.03.003

Arden, R., Chavez, R. S., Grazioplene, R., and Jung, R. E. (2010). Neuroimaging creativity: a psychometric view. Behav. Brain Res. 214, 143–156. doi: 10.1016/j.bbr.2010.05.015

Ayoobi, F., Charmahini, S. A., Asadollahi, Z., Solati, S., Azin, H., Abedi, P., et al. (2022). Divergent and convergent thinking abilities in multiple sclerosis patients. Think. Skills Creat. 45:101065. doi: 10.1016/j.tsc.2022.101065

Babiloni, C., Del Percio, C., Arendt-Nielsen, L., Soricelli, A., Romani, G. L., Rossini, P. M., et al. (2014). Cortical EEG alpha rhythms reflect task-specific somatosensory and motor interactions in humans. Clin. Neurophysiol. 125, 1936–1945. doi: 10.1016/j.clinph.2014.04.021

Baillet, S., Mosher, J. C., and Leahy, R. M. (2001). Electromagnetic brain mapping. IEEE Signal Process. Mag. 18, 14–30. doi: 10.1109/79.962275

Balters, S., and Steinert, M. (2017). Capturing emotion reactivity through physiology measurement as a foundation for affective engineering in engineering design science and engineering practices. J. Intell. Manuf. 28, 1585–1607. doi: 10.1007/s10845-015-1145-2

Balters, S., Weinstein, T., Mayseless, N., Auernhammer, J., Hawthorne, G., Steinert, M., et al. (2023). Design science and neuroscience: a systematic review of the emergent field of design neurocognition. Des. Stud. 84:101148. doi: 10.1016/j.destud.2022.101148

Beaty, R. E., Christensen, A. P., Benedek, M., Silvia, P. J., and Schacter, D. L. (2017). Creative constraints: brain activity and network dynamics underlying semantic interference during idea production. NeuroImage 148, 189–196. doi: 10.1016/j.neuroimage.2017.01.012

Benedek, M., Bergner, S., Könen, T., Fink, A., and Neubauer, A. C. (2011). EEG alpha synchronization is related to top-down processing in convergent and divergent thinking. Neuropsychologia 49, 3505–3511. doi: 10.1016/j.neuropsychologia.2011.09.004

Benedek, M., and Fink, A. (2019). Toward a neurocognitive framework of creative cognition: the role of memory, attention, and cognitive control. Curr. Opin. Behav. Sci. 27, 116–122. doi: 10.1016/j.cobeha.2018.11.002

Benedek, M., Schickel, R. J., Jauk, E., Fink, A., and Neubauer, A. C. (2014). Alpha power increases in right parietal cortex reflects focused internal attention. Neuropsychologia 56, 393–400. doi: 10.1016/j.neuropsychologia.2014.02.010

Bhattacharya, J., and Petsche, H. (2005). Drawing on mind’s canvas: differences in cortical integration patterns between artists and non-artists. Hum. Brain Mapp. 26, 1–14. doi: 10.1002/hbm.20104

Boden, M. A. (2004). The creative mind: Myths and mechanisms . London and New York: Routledge.

Boot, N., Baas, M., Mühlfeld, E., de Dreu, C. K. W., and van Gaal, S. (2017). Widespread neural oscillations in the delta band dissociate rule convergence from rule divergence during creative idea generation. Neuropsychologia 104, 8–17. doi: 10.1016/j.neuropsychologia.2017.07.033

Borgianni, Y., and Maccioni, L. (2020). Review of the use of neurophysiological and biometric measures in experimental design research. Artif. Intell. Eng. Des. Anal. Manuf. 34, 248–285. doi: 10.1017/S0890060420000062

Braun, V., and Clarke, V. (2012). “Thematic analysis” in APA handbook of research methods in psychology, Vol 2: Research designs: Quantitative, qualitative, neuropsychological, and biological . eds. H. Cooper, P. M. Camic, D. L. Long, A. T. Panter, D. Rindskopf, and K. J. Sher (American Psychological Association), 57–71.

Camarda, A., Salvia, É., Vidal, J., Weil, B., Poirel, N., Houdé, O., et al. (2018). Neural basis of functional fixedness during creative idea generation: an EEG study. Neuropsychologia 118, 4–12. doi: 10.1016/j.neuropsychologia.2018.03.009

Chang, Y., Kao, J.-Y., and Wang, Y.-Y. (2022). Influences of virtual reality on design creativity and design thinking. Think. Skills Creat. 46:101127. doi: 10.1016/j.tsc.2022.101127

Choi, J. W., and Kim, K. H. (2018). "Methods for functional connectivity analysis" in Computational EEG analysis. Biological and medical physics, biomedical engineering. ed. C.-H. Im (Singapore: Springer).

Chrysikou, E. G., and Gero, J. S. (2020). Using neuroscience techniques to understand and improve design cognition. AIMS Neurosci. 7, 319–326. doi: 10.3934/Neuroscience.2020018

Cropley, A. J. (2000). Defining and measuring creativity: are creativity tests worth using? Roeper Rev. 23, 72–79. doi: 10.1080/02783190009554069

Cropley, D. H. (2015a). “Chapter 2 – The importance of creativity in engineering” in Creativity in engineering . ed. D. H. Cropley (London: Academic Press), 13–34.

Cropley, D. H. (2015b). “Chapter 3 – Phases: creativity and the design process” in Creativity in engineering . ed. D. H. Cropley (London: Academic Press), 35–61.

Custo, A., Van De Ville, D., Wells, W. M., Tomescu, M. I., Brunet, D., and Michel, C. M. (2017). Electroencephalographic resting-state networks: source localization of microstates. Brain Connect. 7, 671–682. doi: 10.1089/brain.2016.0476

Danko, S. G., Shemyakina, N. V., Nagornova, Z. V., and Starchenko, M. G. (2009). Comparison of the effects of the subjective complexity and verbal creativity on EEG spectral power parameters. Hum. Physiol. 35, 381–383. doi: 10.1134/S0362119709030153

Dietrich, A., and Kanso, R. (2010). A review of EEG, ERP, and neuroimaging studies of creativity and insight. Psychol. Bull. 136, 822–848. doi: 10.1037/a0019749

Doppelmayr, M., Klimesch, W., Stadler, W., Pöllhuber, D., and Heine, C. (2002). EEG alpha power and intelligence. Intelligence 30, 289–302. doi: 10.1016/S0160-2896(01)00101-5

Erdfelder, E., Faul, F., and Buchner, A. (1996). GPOWER: a general power analysis program. Behav. Res. Methods Instrum. Comput. 28, 1–11. doi: 10.3758/BF03203630

Eymann, V., Beck, A.-K., Jaarsveld, S., Lachmann, T., and Czernochowski, D. (2022). Alpha oscillatory evidence for shared underlying mechanisms of creativity and fluid intelligence above and beyond working memory-related activity. Intelligence 91:101630. doi: 10.1016/j.intell.2022.101630

Faul, F., Erdfelder, E., Lang, A.-G., and Buchner, A. (2007). G*power 3: a flexible statistical power analysis program for the social, behavioral, and biomedical sciences. Behav. Res. Methods 39, 175–191. doi: 10.3758/BF03193146

Fink, A., and Benedek, M. (2014). EEG alpha power and creative ideation. Neurosci. Biobehav. Rev. 44, 111–123. doi: 10.1016/j.neubiorev.2012.12.002

Fink, A., and Benedek, M. (2021). The neuroscience of creativity. e-Neuroforum 25, 231–240. doi: 10.1515/nf-2019-0006

Fink, A., Benedek, M., Koschutnig, K., Papousek, I., Weiss, E. M., Bagga, D., et al. (2018). Modulation of resting-state network connectivity by verbal divergent thinking training. Brain Cogn. 128, 1–6. doi: 10.1016/j.bandc.2018.10.008

Fink, A., Grabner, R. H., Benedek, M., Reishofer, G., Hauswirth, V., Fally, M., et al. (2009a). The creative brain: investigation of brain activity during creative problem solving by means of EEG and fMRI. Hum. Brain Mapp. 30, 734–748. doi: 10.1002/hbm.20538

Fink, A., Graif, B., and Neubauer, A. C. (2009b). Brain correlates underlying creative thinking: EEG alpha activity in professional vs. novice dancers. NeuroImage 46, 854–862. doi: 10.1016/j.neuroimage.2009.02.036

Fink, A., and Neubauer, A. C. (2006). EEG alpha oscillations during the performance of verbal creativity tasks: differential effects of sex and verbal intelligence. Int. J. Psychophysiol. 62, 46–53. doi: 10.1016/j.ijpsycho.2006.01.001

Fink, A., and Neubauer, A. C. (2008). Eysenck meets Martindale: the relationship between extraversion and originality from the neuroscientific perspective. Personal. Individ. Differ. 44, 299–310. doi: 10.1016/j.paid.2007.08.010

Fink, A., Schwab, D., and Papousek, I. (2011). Sensitivity of EEG upper alpha activity to cognitive and affective creativity interventions. Int. J. Psychophysiol. 82, 233–239. doi: 10.1016/j.ijpsycho.2011.09.003

Gabard-Durnam, L. J., Mendez Leal, A. S., Wilkinson, C. L., and Levin, A. R. (2018). The Harvard automated processing pipeline for electroencephalography (HAPPE): standardized processing software for developmental and high-Artifact data. Front. Neurosci. 12:97. doi: 10.3389/fnins.2018.00097

Gao, M., Zhang, D., Wang, Z., Liang, B., Cai, Y., Gao, Z., et al. (2017). Mental rotation task specifically modulates functional connectivity strength of intrinsic brain activity in low frequency domains: a maximum uncertainty linear discriminant analysis. Behav. Brain Res. 320, 233–243. doi: 10.1016/j.bbr.2016.12.017

Gero, J. S. (1990). Design prototypes: a knowledge representation schema for design. AI Mag. 11:26. doi: 10.1609/aimag.v11i4.854

Gero, J. S. (1994). “Introduction: creativity and design” in Artificial intelligence and creativity: An interdisciplinary approach . ed. T. Dartnall (Netherlands: Springer), 259–267.

Gero, J. S. (1996). Creativity, emergence and evolution in design. Knowl. Based Syst. 9, 435–448. doi: 10.1016/S0950-7051(96)01054-4

Gero, J. (2011). Design creativity 2010. doi: 10.1007/978-0-85729-224-7

Gero, J. S. (2020). Nascent directions for design creativity research. Int. J. Des. Creat. Innov. 8, 144–146. doi: 10.1080/21650349.2020.1767885

Gero, J. S., and Milovanovic, J. (2020). A framework for studying design thinking through measuring designers’ minds, bodies and brains. Design Sci. 6:e19. doi: 10.1017/dsj.2020.15

Giannopulu, I., Brotto, G., Lee, T. J., Frangos, A., and To, D. (2022). Synchronised neural signature of creative mental imagery in reality and augmented reality. Heliyon 8:e09017. doi: 10.1016/j.heliyon.2022.e09017

Goel, V. (2014). Creative brains: designing in the real world. Front. Hum. Neurosci. 8, 1–14. doi: 10.3389/fnhum.2014.00241

Göker, M. H. (1997). The effects of experience during design problem solving. Des. Stud. 18, 405–426. doi: 10.1016/S0142-694X(97)00009-4

Gopan, K. G., Reddy, S. V. R. A., Rao, M., and Sinha, N. (2022). Analysis of single channel electroencephalographic signals for visual creativity: a pilot study. Biomed. Signal Process. Control. 75:103542. doi: 10.1016/j.bspc.2022.103542

Grabner, R. H., Fink, A., and Neubauer, A. C. (2007). Brain correlates of self-rated originality of ideas: evidence from event-related power and phase-locking changes in the EEG. Behav. Neurosci. 121, 224–230. doi: 10.1037/0735-7044.121.1.224

Gubler, D. A., Rominger, C., Grosse Holtforth, M., Egloff, N., Frickmann, F., Goetze, B., et al. (2022). The impact of chronic pain on creative ideation: an examination of the underlying attention-related psychophysiological mechanisms. Eur. J. Pain (United Kingdom) 26, 1768–1780. doi: 10.1002/ejp.2000

Gubler, D. A., Rominger, C., Jakob, D., and Troche, S. J. (2023). How does experimentally induced pain affect creative ideation and underlying attention-related psychophysiological mechanisms? Neuropsychologia 183:108514. doi: 10.1016/j.neuropsychologia.2023.108514

Guilford, J. P. (1959). “Traits of creativity” in Creativity and its cultivation . ed. H. H. Anderson (New York: Harper & Row), 142–161.

Guilford, J. P. (1967). The nature of human intelligence . New York, NY, US: McGraw-Hill.

Guzik, E. E., Byrge, C., and Gilde, C. (2023). The originality of machines: AI takes the Torrance test. Journal of Creativity 33:100065. doi: 10.1016/j.yjoc.2023.100065

Haner, U.-E. (2005). Spaces for creativity and innovation in two established organizations. Creat. Innov. Manag. 14, 288–298. doi: 10.1111/j.1476-8691.2005.00347.x

Hao, N., Ku, Y., Liu, M., Hu, Y., Bodner, M., Grabner, R. H., et al. (2016). Reflection enhances creativity: beneficial effects of idea evaluation on idea generation. Brain Cogn. 103, 30–37. doi: 10.1016/j.bandc.2016.01.005

Hartog, T. (2021). EEG investigations of creativity in engineering and engineering design. Available at: https://shareok.org/handle/11244/329532

Hartog, T., Marshall, M., Alhashim, A., Ahad, M. T., et al. (2020). Work in Progress: using neuro-responses to understand creativity, the engineering design process, and concept generation. Paper Presented at …. Available at: https://par.nsf.gov/biblio/10208519

Hetzroni, O., Agada, H., and Leikin, M. (2019). Creativity in autism: an examination of general and mathematical creative thinking among children with autism Spectrum disorder and children with typical development. J. Autism Dev. Disord. 49, 3833–3844. doi: 10.1007/s10803-019-04094-x

Hu, W.-L., Booth, J. W., and Reid, T. (2017). The relationship between design outcomes and mental states during ideation. J. Mech. Des. 139:51101. doi: 10.1115/1.4036131

Hu, Y., Ouyang, J., Wang, H., Zhang, J., Liu, A., Min, X., et al. (2022). Design meets neuroscience: an electroencephalogram study of design thinking in concept generation phase. Front. Psychol. 13:832194. doi: 10.3389/fpsyg.2022.832194

Hu, L., and Shepley, M. M. C. (2022). Design meets neuroscience: a preliminary review of design research using neuroscience tools. J. Inter. Des. 47, 31–50. doi: 10.1111/joid.12213

Japardi, K., Bookheimer, S., Knudsen, K., Ghahremani, D. G., and Bilder, R. M. (2018). Functional magnetic resonance imaging of divergent and convergent thinking in big-C creativity. Neuropsychologia 118, 59–67. doi: 10.1016/j.neuropsychologia.2018.02.017

Jauk, E., Benedek, M., and Neubauer, A. C. (2012). Tackling creativity at its roots: evidence for different patterns of EEG alpha activity related to convergent and divergent modes of task processing. Int. J. Psychophysiol. 84, 219–225. doi: 10.1016/j.ijpsycho.2012.02.012

Jauk, E., Neubauer, A. C., Dunst, B., Fink, A., and Benedek, M. (2015). Gray matter correlates of creative potential: a latent variable voxel-based morphometry study. NeuroImage 111, 312–320. doi: 10.1016/j.neuroimage.2015.02.002

Jausovec, N., and Jausovec, K. (2000). EEG activity during the performance of complex mental problems. Int. J. Psychophysiol. 36, 73–88. doi: 10.1016/S0167-8760(99)00113-0

Jia, W. (2021). Investigating neurocognition in design creativity under loosely controlled experiments supported by EEG microstate analysis [Concordia University]. Available at: https://spectrum.library.concordia.ca/id/eprint/988724/

Jia, W., von Wegner, F., Zhao, M., and Zeng, Y. (2021). Network oscillations imply the highest cognitive workload and lowest cognitive control during idea generation in open-ended creation tasks. Sci. Rep. 11:24277. doi: 10.1038/s41598-021-03577-1

Jia, W., and Zeng, Y. (2021). EEG signals respond differently to idea generation, idea evolution and evaluation in a loosely controlled creativity experiment. Sci. Rep. 11:2119. doi: 10.1038/s41598-021-81655-0

Jung, R. E., and Haier, R. J. (2013). "Creativity and intelligence: brain networks that link and differentiate the expression of genius" in Neuroscience of creativity. eds. O. Vartanian, A. S. Bristol, and J. C. Kaufman (Cambridge, MA: MIT Press), 233–254.

Jung, R. E., and Vartanian, O. (Eds.). (2018). The Cambridge handbook of the neuroscience of creativity . Cambridge: Cambridge University Press.

Kaufman, J. C., Beghetto, R. A., Baer, J., and Ivcevic, Z. (2010). Creativity polymathy: what Benjamin Franklin can teach your kindergartener. Learn. Individ. Differ. 20, 380–387. doi: 10.1016/j.lindif.2009.10.001

Kaufman, J. C., John Baer, J. C. C., and Sexton, J. D. (2008). A comparison of expert and nonexpert Raters using the consensual assessment technique. Creat. Res. J. 20, 171–178. doi: 10.1080/10400410802059929

Kaufman, J. C., and Sternberg, R. J. (Eds.). (2010). The Cambridge handbook of creativity. Cambridge University Press.

Kim, N., Chung, S., and Kim, D. I. (2022). Exploring EEG-based design studies: a systematic review. Arch. Des. Res. 35, 91–113. doi: 10.15187/adr.2022.11.35.4.91

Kimura, T., Mizumoto, T., Torii, Y., Ohno, M., Higashino, T., and Yagi, Y. (2023). Comparison of the effects of indoor and outdoor exercise on creativity: an analysis of EEG alpha power. Front. Psychol. 14:1161533. doi: 10.3389/fpsyg.2023.1161533

Klimesch, W. (1999). EEG alpha and theta oscillations reflect cognitive and memory performance: a review and analysis. Brain Res. Rev. 29, 169–195. doi: 10.1016/s0165-0173(98)00056-3

Klimesch, W., Doppelmayr, M., Russegger, H., Pachinger, T., and Schwaiger, J. (1998). Induced alpha band power changes in the human EEG and attention. Neurosci. Lett. 244, 73–76. doi: 10.1016/S0304-3940(98)00122-0

Kruk, K. A., Aravich, P. F., Deaver, S. P., and deBeus, R. (2014). Comparison of brain activity during drawing and clay sculpting: a preliminary qEEG study. Art Ther. 31, 52–60. doi: 10.1080/07421656.2014.903826

Kuznetsov, I., Kozachuk, N., Kachynska, T., Zhuravlov, O., Zhuravlova, O., and Rakovets, O. (2023). Inner speech as a brain mechanism for preconditioning creativity process. East Eur. J. Psycholinguist. 10, 136–151. doi: 10.29038/eejpl.2023.10.1.koz

Lazar, L. (2018). The cognitive neuroscience of design creativity. J. Exp. Neurosci. 12:117906951880966. doi: 10.1177/1179069518809664

Lee, J. H., and Lee, S. (2023). Relationships between physical environments and creativity: a scoping review. Think. Skills Creat. 48:101276. doi: 10.1016/j.tsc.2023.101276

Leikin, M. (2013). The effect of bilingualism on creativity: developmental and educational perspectives. Int. J. Biling. 17, 431–447. doi: 10.1177/1367006912438300

Li, S., Becattini, N., and Cascini, G. (2021). Correlating design performance to EEG activation: early evidence from experimental data. Proceedings of the Design Society. Available at: https://www.cambridge.org/core/journals/proceedings-of-the-design-society/article/correlating-design-performance-to-eeg-activation-early-evidence-from-experimental-data/8F4FCB64135209CAD9B97C1433E7CB99

Liang, C., Chang, C. C., and Liu, Y. C. (2019). Comparison of the cerebral activities exhibited by expert and novice visual communication designers during idea incubation. Int. J. Des. Creat. Innov. 7, 213–236. doi: 10.1080/21650349.2018.1562995

Liberati, A., Altman, D. G., Tetzlaff, J., Mulrow, C., Gøtzsche, P. C., Ioannidis, J. P. A., et al. (2009). The PRISMA statement for reporting systematic reviews and meta-analyses of studies that evaluate health care interventions: explanation and elaboration. J. Clin. Epidemiol. 62, e1–e34. doi: 10.1016/j.jclinepi.2009.06.006

Liu, L., Li, Y., Xiong, Y., Cao, J., and Yuan, P. (2018). An EEG study of the relationship between design problem statements and cognitive behaviors during conceptual design. Artif. Intell. Eng. Des. Anal. Manuf. 32, 351–362. doi: 10.1017/S0890060417000683

Liu, L., Nguyen, T. A., Zeng, Y., and Hamza, A. B. (2016). Identification of relationships between electroencephalography (EEG) bands and design activities. Vol. 7. doi: 10.1115/DETC2016-59104

Liu, Y., Ritchie, J. M., Lim, T., Kosmadoudi, Z., Sivanathan, A., and Sung, R. C. W. (2014). A fuzzy psycho-physiological approach to enable the understanding of an engineer’s affect status during CAD activities. Comput. Aided Des. 54, 19–38. doi: 10.1016/j.cad.2013.10.007

Lloyd-Cox, J., Chen, Q., and Beaty, R. E. (2022). The time course of creativity: multivariate classification of default and executive network contributions to creative cognition over time. Cortex 156, 90–105. doi: 10.1016/j.cortex.2022.08.008

Lou, S., Feng, Y., Li, Z., Zheng, H., and Tan, J. (2020). An integrated decision-making method for product design scheme evaluation based on cloud model and EEG data. Adv. Eng. Inform. 43:101028. doi: 10.1016/j.aei.2019.101028

Lukačević, F., Becattini, N., Perišić, M. M., and Škec, S. (2023). Differences in engineers’ brain activity when CAD modelling from isometric and orthographic projections. Sci. Rep. 13:9726. doi: 10.1038/s41598-023-36823-9

Martindale, C., and Hasenfus, N. (1978). EEG differences as a function of creativity, stage of the creative process, and effort to be original. Biol. Psychol. 6, 157–167. doi: 10.1016/0301-0511(78)90018-2

Martindale, C., Hines, D., Mitchell, L., and Covello, E. (1984). EEG alpha asymmetry and creativity. Personal. Individ. Differ. 5, 77–86. doi: 10.1016/0191-8869(84)90140-5

Martindale, C., and Mines, D. (1975). Creativity and cortical activation during creative, intellectual and eeg feedback tasks. Biol. Psychol. 3, 91–100. doi: 10.1016/0301-0511(75)90011-3

Mastria, S., Agnoli, S., Zanon, M., Acar, S., Runco, M. A., and Corazza, G. E. (2021). Clustering and switching in divergent thinking: neurophysiological correlates underlying flexibility during idea generation. Neuropsychologia 158:107890. doi: 10.1016/j.neuropsychologia.2021.107890

Mayseless, N., Aharon-Peretz, J., and Shamay-Tsoory, S. (2014). Unleashing creativity: the role of left temporoparietal regions in evaluating and inhibiting the generation of creative ideas. Neuropsychologia 64, 157–168. doi: 10.1016/j.neuropsychologia.2014.09.022

Mazza, A., Dal Monte, O., Schintu, S., Colombo, S., Michielli, N., Sarasso, P., et al. (2023). Beyond alpha-band: the neural correlate of creative thinking. Neuropsychologia 179:108446. doi: 10.1016/j.neuropsychologia.2022.108446

Mokyr, J. (1990). The lever of riches: Technological creativity and economic progress : New York and Oxford: Oxford University Press.

Montagna, F., and Candusso, A. (n.d.). Electroencephalogram: the definition of the assessment methodology for verbal responses and the analysis of brain waves in an idea creativity experiment. Available at: https://webthesis.biblio.polito.it/13445/1/tesi.pdf

Montagna, F., and Laspia, A. (2018). A new approach to investigate the design process. Available at: https://webthesis.biblio.polito.it/10011/1/tesi.pdf

Nagai, Y., and Gero, J. (2012). Design creativity. J. Eng. Des. 23, 237–239. doi: 10.1080/09544828.2011.642495

Nair, N., Hegarty, J. P., Ferguson, B. J., Hecht, P. M., Tilley, M., Christ, S. E., et al. (2020). Effects of stress on functional connectivity during problem solving. NeuroImage 208:116407. doi: 10.1016/j.neuroimage.2019.116407

Nguyen, P., Nguyen, T. A., and Zeng, Y. (2018). Empirical approaches to quantifying effort, fatigue and concentration in the conceptual design process. Res. Eng. Des. 29, 393–409. doi: 10.1007/s00163-017-0273-4

Nguyen, P., Nguyen, T. A., and Zeng, Y. (2019). Segmentation of design protocol using EEG. Artif. Intell. Eng. Des. Anal. Manuf. 33, 11–23. doi: 10.1017/S0890060417000622

Nguyen, T. A., and Zeng, Y. (2010). Analysis of design activities using EEG signals. Vol. 5: 277–286. doi: 10.1115/DETC2010-28477

Nguyen, T. A., and Zeng, Y. (2012). A theoretical model of design creativity: nonlinear design dynamics and mental stress-creativity relation. J. Integr. Des. Process. Sci. 16, 65–88. doi: 10.3233/jid-2012-0007

Nguyen, T. A., and Zeng, Y. (2014a). A physiological study of relationship between designer’s mental effort and mental stress during conceptual design. Comput. Aided Des. 54, 3–18. doi: 10.1016/j.cad.2013.10.002

Nguyen, T. A., and Zeng, Y. (2014b). A preliminary study of EEG spectrogram of a single subject performing a creativity test. Proceedings of the 2014 international conference on innovative design and manufacturing (ICIDM), 16–21. doi: 10.1109/IDAM.2014.6912664

Nguyen, T. A., and Zeng, Y. (2017). Effects of stress and effort on self-rated reports in experimental study of design activities. J. Intell. Manuf. 28, 1609–1622. doi: 10.1007/s10845-016-1196-z

Oldfield, R. C. (1971). The assessment and analysis of handedness: the Edinburgh inventory. Neuropsychologia 9, 97–113. doi: 10.1016/0028-3932(71)90067-4

Pahl, G., Beitz, W., Feldhusen, J., and Grote, K.-H. (1988). Engineering design: A systematic approach. Vol. 3. London: Springer.

Peng, W. (2019). EEG preprocessing and denoising. In EEG signal processing and feature extraction. doi: 10.1007/978-981-13-9113-2_5

Perchtold-Stefan, C. M., Papousek, I., Rominger, C., Schertler, M., Weiss, E. M., and Fink, A. (2020). Humor comprehension and creative cognition: shared and distinct neurocognitive mechanisms as indicated by EEG alpha activity. NeuroImage 213:116695. doi: 10.1016/j.neuroimage.2020.116695

Petsche, H. (1996). Approaches to verbal, visual and musical creativity by EEG coherence analysis. Int. J. Psychophysiol. 24, 145–159. doi: 10.1016/S0167-8760(96)00050-5

Petsche, H., Kaplan, S., von Stein, A., and Filz, O. (1997). The possible meaning of the upper and lower alpha frequency ranges for cognitive and creative tasks. Int. J. Psychophysiol. 26, 77–97. doi: 10.1016/S0167-8760(97)00757-5

Pidgeon, L. M., Grealy, M., Duffy, A. H. B., Hay, L., McTeague, C., Vuletic, T., et al. (2016). Functional neuroimaging of visual creativity: a systematic review and meta-analysis. Brain Behavior 6:e00540. doi: 10.1002/brb3.540

Prent, N., and Smit, D. J. A. (2020). The dynamics of resting-state alpha oscillations predict individual differences in creativity. Neuropsychologia 142:107456. doi: 10.1016/j.neuropsychologia.2020.107456

Razumnikova, O. M. (2004). Gender differences in hemispheric organization during divergent thinking: an EEG investigation in human subjects. Neurosci. Lett. 362, 193–195. doi: 10.1016/j.neulet.2004.02.066

Razumnikova, O. M. (2022). Baseline measures of EEG power as correlates of the verbal and nonverbal components of creativity and intelligence. Neurosci. Behav. Physiol. 52, 124–134. doi: 10.1007/s11055-022-01214-6

Razumnikova, O. M., Volf, N. V., and Tarasova, I. V. (2009). Strategy and results: sex differences in electrographic correlates of verbal and figural creativity. Hum. Physiol. 35, 285–294. doi: 10.1134/S0362119709030049

Rogers, C. J., Tolmie, A., Massonnié, J., et al. (2023). Complex cognition and individual variability: a mixed methods study of the relationship between creativity and executive control. Front. Psychol. 14:1191893. doi: 10.3389/fpsyg.2023.1191893

Rominger, C., Benedek, M., Lebuda, I., Perchtold-Stefan, C. M., Schwerdtfeger, A. R., Papousek, I., et al. (2022a). Functional brain activation patterns of creative metacognitive monitoring. Neuropsychologia 177:108416. doi: 10.1016/j.neuropsychologia.2022.108416

Rominger, C., Gubler, D. A., Makowski, L. M., and Troche, S. J. (2022b). More creative ideas are associated with increased right posterior power and frontal-parietal/occipital coupling in the upper alpha band: a within-subjects study. Int. J. Psychophysiol. 181, 95–103. doi: 10.1016/j.ijpsycho.2022.08.012

Rominger, C., Papousek, I., Perchtold, C. M., Benedek, M., Weiss, E. M., Schwerdtfeger, A., et al. (2019). Creativity is associated with a characteristic U-shaped function of alpha power changes accompanied by an early increase in functional coupling. Cogn. Affect. Behav. Neurosci. 19, 1012–1021. doi: 10.3758/s13415-019-00699-y

Rominger, C., Papousek, I., Perchtold, C. M., Benedek, M., Weiss, E. M., Weber, B., et al. (2020). Functional coupling of brain networks during creative idea generation and elaboration in the figural domain. NeuroImage 207:116395. doi: 10.1016/j.neuroimage.2019.116395

Rominger, C., Papousek, I., Perchtold, C. M., Weber, B., Weiss, E. M., and Fink, A. (2018). The creative brain in the figural domain: distinct patterns of EEG alpha power during idea generation and idea elaboration. Neuropsychologia 118, 13–19. doi: 10.1016/j.neuropsychologia.2018.02.013

Runco, M. A., and Acar, S. (2012). Divergent thinking as an indicator of creative potential. Creat. Res. J. 24, 66–75. doi: 10.1080/10400419.2012.652929

Runco, M. A., and Jaeger, G. J. (2012). The standard definition of creativity. Creat. Res. J. 24, 92–96. doi: 10.1080/10400419.2012.650092

Runco, M. A., and Mraz, W. (1992). Scoring divergent thinking tests using total ideational output and a creativity index. Educ. Psychol. Meas. 52, 213–221. doi: 10.1177/001316449205200126

Sanei, S., and Chambers, J. A. (2013). EEG signal processing . John Wiley & Sons.

Schuler, A. L., Tik, M., Sladky, R., Luft, C. D. B., Hoffmann, A., Woletz, M., et al. (2019). Modulations in resting state networks of subcortical structures linked to creativity. NeuroImage 195, 311–319. doi: 10.1016/j.neuroimage.2019.03.017

Schwab, D., Benedek, M., Papousek, I., Weiss, E. M., and Fink, A. (2014). The time-course of EEG alpha power changes in creative ideation. Front. Hum. Neurosci. 8:310. doi: 10.3389/fnhum.2014.00310

Şekerci, Y., Kahraman, M. U., Özturan, Ö., Çelik, E., and Ayan, S. Ş. (2024). Neurocognitive responses to spatial design behaviors and tools among interior architecture students: a pilot study. Sci. Rep. 14:4454. doi: 10.1038/s41598-024-55182-7

Shemyakina, N. V., and Dan’ko, S. G. (2004). Influence of the emotional perception of a signal on the electroencephalographic correlates of creative activity. Hum. Physiol. 30, 145–151. doi: 10.1023/B:HUMP.0000021641.41105.86

Simon, H. A. (1996). The sciences of the artificial . 3rd Edn: MIT Press.

Simonton, D. K. (2000). Creativity: cognitive, personal, developmental, and social aspects. Am. Psychol. 55, 151–158.

Simonton, D. K. (2012). Taking the U.S. patent office criteria seriously: a quantitative three-criterion creativity definition and its implications. Creat. Res. J. 24, 97–106. doi: 10.1080/10400419.2012.676974

Soroush, M. Z., Maghooli, K., Setarehdan, S. K., and Nasrabadi, A. M. (2018). A novel method of eeg-based emotion recognition using nonlinear features variability and Dempster–Shafer theory. Biomed. Eng.: Appl., Basis Commun. 30:1850026. doi: 10.4015/S1016237218500266

Srinivasan, N. (2007). Cognitive neuroscience of creativity: EEG based approaches. Methods 42, 109–116. doi: 10.1016/j.ymeth.2006.12.008

Steingrüber, H.-J., and Lienert, G. A. (1971). Hand-Dominanz-Test. Verlag für Psychologie. Available at: https://cir.nii.ac.jp/crid/1130282273024678144

Sternberg, R. J. (2020). What’s wrong with creativity testing? J. Creat. Behav. 54, 20–36. doi: 10.1002/jocb.237

Sternberg, R. J., and Lubart, T. I. (1998). “The concept of creativity: prospects and paradigms” in Handbook of creativity . ed. R. J. Sternberg (Cambridge: Cambridge University Press), 3–15.

Stevens, C. E., and Zabelina, D. L. (2020). Classifying creativity: applying machine learning techniques to divergent thinking EEG data. NeuroImage 219:116990. doi: 10.1016/j.neuroimage.2020.116990

Teplan, M. (2002). Fundamentals of EEG measurement. Available at: https://api.semanticscholar.org/CorpusID:17002960

Torrance, E. P. (1966). Torrance tests of creative thinking (TTCT). APA PsycTests . doi: 10.1037/t05532-000

Ueno, K., Takahashi, T., Takahashi, K., Mizukami, K., Tanaka, Y., and Wada, Y. (2015). Neurophysiological basis of creativity in healthy elderly people: a multiscale entropy approach. Clin. Neurophysiol. 126, 524–531. doi: 10.1016/j.clinph.2014.06.032

Vieira, S. L. D. S., Benedek, M., Gero, J. S., Cascini, G., and Li, S. (2021). Brain activity of industrial designers in constrained and open design: the effect of gender on frequency bands. Proceedings of the Design Society, 1(AUGUST), 571–580. doi: 10.1017/pds.2021.57

Vieira, S., Benedek, M., Gero, J., Li, S., and Cascini, G. (2022a). Brain activity in constrained and open design: the effect of gender on frequency bands. Artif. Intell. Eng. Des. Anal. Manuf. 36:e6. doi: 10.1017/S0890060421000202

Vieira, S., Benedek, M., Gero, J., Li, S., and Cascini, G. (2022b). Design spaces and EEG frequency band power in constrained and open design. Int. J. Des. Creat. Innov. 10, 193–221. doi: 10.1080/21650349.2022.2048697

Vieira, S. L. D. S., Gero, J. S., Delmoral, J., Gattol, V., Fernandes, C., and Fernandes, A. A. (2019). Comparing the design neurocognition of mechanical engineers and architects: a study of the effect of Designer’s domain. Proceedings of the Design Society: International Conference on Engineering Design, 1(1), 1853–1862. doi: 10.1017/dsi.2019.191

Vieira, S., Gero, J. S., Delmoral, J., Gattol, V., Fernandes, C., Parente, M., et al. (2020a). The neurophysiological activations of mechanical engineers and industrial designers while designing and problem-solving. Design Sci. 6:e26. doi: 10.1017/dsj.2020.26

Vieira, S., Gero, J. S., Delmoral, J., Li, S., Cascini, G., and Fernandes, A. (2020b). Brain activity in constrained and open design spaces: an EEG study. The Sixth International Conference on Design Creativity-ICDC2020. doi: 10.35199/ICDC.2020.09

Vieira, S., Gero, J. S., Delmoral, J., Parente, M., Fernandes, A. A., Gattol, V., et al. (2020c). “Industrial designers problem-solving and designing: An EEG study” in Research & Education in design: People & Processes & Products & philosophy . eds. R. Almendra and J. Ferreira 211–220. ( 1st ed. ) Lisbon, Portugal. CRC Press.

Vieira, S., Gero, J., Gattol, V., Delmoral, J., Li, S., Cascini, G., et al. (2020d). The neurophysiological activations of novice and experienced professionals when designing and problem-solving. Proceedings of the Design Society: DESIGN Conference, 1, 1569–1578. doi: 10.1017/dsd.2020.121

Volf, N. V., and Razumnikova, O. M. (1999). Sex differences in EEG coherence during a verbal memory task in normal adults. Int. J. Psychophysiol. 34, 113–122. doi: 10.1016/s0167-8760(99)00067-7

Volf, N. V., and Tarasova, I. V. (2010). The relationships between EEG θ and β oscillations and the level of creativity. Hum. Physiol. 36, 132–138. doi: 10.1134/S0362119710020027

Volf, N. V., Tarasova, I. V., and Razumnikova, O. M. (2010). Gender-related differences in changes in the coherence of cortical biopotentials during image-based creative thought: relationship with action efficacy. Neurosci. Behav. Physiol. 40, 793–799. doi: 10.1007/s11055-010-9328-y

Wallas, G. (1926). The art of thought . London: J. Cape.

Wang, Y., Gu, C., and Lu, J. (2019). Effects of creative personality on EEG alpha oscillation: based on the social and general creativity comparative study. J. Creat. Behav. 53, 246–258. doi: 10.1002/jocb.243

Wang, M., Hao, N., Ku, Y., Grabner, R. H., and Fink, A. (2017). Neural correlates of serial order effect in verbal divergent thinking. Neuropsychologia 99, 92–100. doi: 10.1016/j.neuropsychologia.2017.03.001

Wang, Y.-Y., Weng, T.-H., Tsai, I.-F., Kao, J.-Y., and Chang, Y.-S. (2023). Effects of virtual reality on creativity performance and perceived immersion: a study of brain waves. Br. J. Educ. Technol. 54, 581–602. doi: 10.1111/bjet.13264

Williams, A., Ostwald, M., and Askland, H. (2011). The relationship between creativity and design and its implication for design education. Des. Princ. Pract. 5, 57–71. doi: 10.18848/1833-1874/CGP/v05i01/38017

Xie, X. (2023). The cognitive process of creative design: a perspective of divergent thinking. Think. Skills Creat. 48:101266. doi: 10.1016/j.tsc.2023.101266

Yang, J., Quan, H., and Zeng, Y. (2022). Knowledge: the good, the bad, and the ways for designer creativity. J. Eng. Des. 33, 945–968. doi: 10.1080/09544828.2022.2161300

Yang, J., Yang, L., Quan, H., and Zeng, Y. (2021). Implementation barriers: a TASKS framework. J. Integr. Des. Process. Sci. 25, 134–147. doi: 10.3233/JID-210011

Yin, Y., Zuo, H., and Childs, P. R. N. (2023). An EEG-based method to decode cognitive factors in creative processes. AI EDAM. Available at: https://www.cambridge.org/core/journals/ai-edam/article/an-eegbased-method-to-decode-cognitive-factors-in-creative-processes/FD24164B3D2C4ABA3A57D9710E86EDD4

Yuan, H., Zotev, V., Phillips, R., Drevets, W. C., and Bodurka, J. (2012). Spatiotemporal dynamics of the brain at rest--exploring EEG microstates as electrophysiological signatures of BOLD resting state networks. NeuroImage 60, 2062–2072. doi: 10.1016/j.neuroimage.2012.02.031

Zangeneh Soroush, M., Tahvilian, P., Nasirpour, M. H., Maghooli, K., Sadeghniiat-Haghighi, K., Vahid Harandi, S., et al. (2022). EEG artifact removal using sub-space decomposition, nonlinear dynamics, stationary wavelet transform and machine learning algorithms. Front. Physiol. 13:910368. doi: 10.3389/fphys.2022.910368

Zangeneh Soroush, M., Zhao, M., Jia, W., and Zeng, Y. (2023a). Conceptual design exploration: EEG dataset in open-ended loosely controlled design experiments. Mendeley Data . doi: 10.17632/h4rf6wzjcr.1

Zangeneh Soroush, M., Zhao, M., Jia, W., and Zeng, Y. (2023b). Design creativity: EEG dataset in loosely controlled modified TTCT-F creativity experiments. Mendeley Data . doi: 10.17632/24yp3xp58b.1

Zangeneh Soroush, M., Zhao, M., Jia, W., and Zeng, Y. (2024). Loosely controlled experimental EEG datasets for higher-order cognitions in design and creativity tasks. Data Brief 52:109981. doi: 10.1016/j.dib.2023.109981

Zeng, Y. (2001). An axiomatic approach to the modeling of conceptual product design using set theory. Department of Mechanical and Manufacturing Engineering, 218.

Zeng, Y. (2002). Axiomatic theory of design modeling. J. Integr. Des. Process. Sci. 6, 1–28.

Zeng, Y. (2004). Environment-based formulation of design problem. J. Integr. Des. Process. Sci. 8, 45–63.

Zeng, Y. (2015). Environment-based design (EBD): a methodology for transdisciplinary design. J. Integr. Des. Process. Sci. 19, 5–24. doi: 10.3233/jid-2015-0004

Zeng, Y., and Cheng, G. D. (1991). On the logic of design. Des. Stud. 12, 137–141. doi: 10.1016/0142-694X(91)90022-O

Zeng, Y., and Gu, P. (1999). A science-based approach to product design theory part II: formulation of design requirements and products. Robot. Comput. Integr. Manuf. 15, 341–352. doi: 10.1016/S0736-5845(99)00029-0

Zeng, Y., Pardasani, A., Dickinson, J., Li, Z., Antunes, H., Gupta, V., et al. (2004). Mathematical foundation for modeling conceptual design Sketches1. J. Comput. Inf. Sci. Eng. 4, 150–159. doi: 10.1115/1.1683825

Zeng, Y., and Yao, S. (2009). Understanding design activities through computer simulation. Adv. Eng. Inform. 23, 294–308. doi: 10.1016/j.aei.2009.02.001

Zhang, W., Sjoerds, Z., and Hommel, B. (2020). Metacontrol of human creativity: the neurocognitive mechanisms of convergent and divergent thinking. NeuroImage 210:116572. doi: 10.1016/j.neuroimage.2020.116572

Zhao, M., Jia, W., Yang, D., Nguyen, P., Nguyen, T. A., and Zeng, Y. (2020). A tEEG framework for studying designer’s cognitive and affective states. Design Sci. 6:e29. doi: 10.1017/dsj.2020.28

Zhao, M., Yang, D., Liu, S., and Zeng, Y. (2018). "Mental stress-performance model" in Emotional engineering, Vol. 6. ed. S. Fukuda (Cham: Springer).

Zhuang, K., Yang, W., Li, Y., Zhang, J., Chen, Q., Meng, J., et al. (2021). Connectome-based evidence for creative thinking as an emergent property of ordinary cognitive operations. NeuroImage 227:117632. doi: 10.1016/j.neuroimage.2020.117632

Keywords: design creativity, creativity, neurocognition, EEG, higher-order cognitive tasks, thematic analysis

Citation: Zangeneh Soroush M and Zeng Y (2024) EEG-based study of design creativity: a review on research design, experiments, and analysis. Front. Behav. Neurosci. 18:1331396. doi: 10.3389/fnbeh.2024.1331396

Received: 01 November 2023; Accepted: 07 May 2024; Published: 01 August 2024.

Copyright © 2024 Zangeneh Soroush and Zeng. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY) . The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

*Correspondence: Yong Zeng, [email protected]


Confounding in Observational Studies Evaluating the Safety and Effectiveness of Medical Treatments

Magdalene M. Assimon

University of North Carolina Kidney Center, Division of Nephrology and Hypertension, Department of Medicine, University of North Carolina School of Medicine, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina

Introduction

Randomized controlled trials (RCTs) are considered the "gold standard" for establishing the safety and efficacy of medical treatments, such as drugs, devices, and procedures. Patients with kidney disease are often excluded from these studies (1), and it is well established that trial participants tend to be healthier than the broader kidney disease population (2). Furthermore, the number of nephrology-specific trials conducted continues to lag behind other subspecialties (3).

In the absence of RCT data, nephrology practitioners may look to population-specific observational evidence to guide therapy selection. Observational studies using real-world data (e.g., administrative claims and electronic healthcare record data) to evaluate the safety and effectiveness of medical treatments can provide highly generalizable and valuable information to clinicians (4). However, like nonrandomized prospective cohort studies, these studies may suffer from biases that limit their validity, such as confounding.

In this commentary, I describe what confounding is and provide a brief overview of common types of confounding that can arise in observational studies of medical treatments. I then highlight some common strategies for addressing confounding and discuss potential sources of residual confounding.

Confounding

In an observational study, confounding occurs when a risk factor for the outcome also affects the exposure of interest, either directly or indirectly. The resultant bias can strengthen, weaken, or completely reverse the true exposure-outcome association. For a factor to be a confounder, it must be associated with both the study exposure and the study outcome, and it must temporally precede the exposure (i.e., it cannot be an intermediary factor on the causal pathway between the exposure and the outcome) (5).
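A small simulation can make this definition concrete. In the sketch below (illustrative numbers only, not drawn from any cited study), a binary confounder C raises both the probability of exposure and the risk of the outcome, so the crude exposure-outcome association is biased away from the true null; stratifying on C recovers it.

```python
# Minimal simulation of confounding: C raises both the probability of
# exposure E and the risk of outcome Y, while E has no causal effect on Y.
import numpy as np

rng = np.random.default_rng(42)
n = 100_000
C = rng.binomial(1, 0.5, n)          # confounder
E = rng.binomial(1, 0.2 + 0.5 * C)   # exposure depends on C
Y = rng.binomial(1, 0.05 + 0.20 * C) # outcome depends on C only

crude_rr = Y[E == 1].mean() / Y[E == 0].mean()  # biased, well above 1
# Stratum-specific risk ratios recover the null (RR ~ 1) within levels of C:
rr_c0 = Y[(E == 1) & (C == 0)].mean() / Y[(E == 0) & (C == 0)].mean()
rr_c1 = Y[(E == 1) & (C == 1)].mean() / Y[(E == 0) & (C == 1)].mean()
print(f"crude RR = {crude_rr:.2f}, RR | C=0 = {rr_c0:.2f}, RR | C=1 = {rr_c1:.2f}")
```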

Confounding by Indication and Examples of Other Types of Confounding

Confounding by indication (6) is one of the most common forms of bias in observational studies evaluating the safety and effectiveness of medical treatments. It occurs when the clinical indication for treatment, such as the presence of a disease or disease severity, also affects the outcome of interest. Bias due to confounding by indication can make it appear that a treatment under investigation is associated with the occurrence of an outcome it is supposed to prevent, especially in studies comparing the use of a medical treatment with nonuse. For example, confounding by indication would likely be present in an observational study assessing the association between aldosterone antagonist use versus nonuse and mortality in patients with heart failure. In such a study, heart failure severity is an important confounder: clinicians are more likely to prescribe an aldosterone antagonist to patients with more severe heart failure, and more severe heart failure is also a risk factor for death. If heart failure severity is not adequately controlled for, it may appear that the use of an aldosterone antagonist increases the risk of death, contrary to existing evidence from placebo-controlled trials (7).

Confounding by frailty (8) can be another important source of bias in observational studies of medical treatments. This type of confounding occurs because frail patients, who are close to death, tend to have a lower likelihood of receiving preventative therapies than healthier individuals. When confounding by frailty is present, the preventative treatment being evaluated appears more beneficial than it actually is. For instance, confounding by frailty has been proposed as a potential explanation for the implausible 40%–60% mortality reduction seen in observational studies assessing influenza vaccine effectiveness in older adults (9). Compared with healthier patients, frailer patients with a poor short-term prognosis may be less likely to receive an influenza vaccine due to a perceived lack of benefit. In this scenario, frailty is a confounder because it is associated with both vaccine receipt and death.

Other types of confounding can arise when healthy behaviors are associated with both the medical exposure under study and the outcome of interest. For example, confounding by the healthy adherer effect (10) occurs because patients who adhere to treatments tend to have a higher likelihood of taking part in other beneficial healthy behaviors (e.g., exercising) than their nonadherent counterparts. When confounding by the healthy adherer effect is present, studies evaluating the effect of treatment adherence versus nonadherence on the occurrence of adverse clinical outcomes will often overestimate the beneficial effects of treatment adherence.

Finally, time-varying confounding occurs when the exposure of interest and potential confounders change over time. A common type in observational studies of medical treatments is "time-varying confounding affected by previous exposure" (11). It arises when the clinical parameter indicating that a treatment change is necessary is independently related to the outcome of interest and is also affected by previous exposure to the treatment (12). For example, in a study assessing the association between erythropoiesis-stimulating agent (ESA) dose and mortality in patients on hemodialysis, serum hemoglobin is a time-varying confounder that needs to be accounted for: hemoglobin levels predict ESA dose, are influenced by prior ESA dose, and are independently associated with mortality (the outcome).

Strategies for Addressing Confounding

Confounding can be addressed in the design and analytic phases of observational studies. Common strategies are discussed below, and their advantages and disadvantages are summarized in Table 1.

Table 1. Advantages and disadvantages of common strategies used to address confounding

| Method | Overview | Advantages | Disadvantages |
| --- | --- | --- | --- |
| Restriction | Setting criteria for study inclusion | Easy to implement | Only removes or reduces confounding by the inclusion criteria; reduces sample size; cannot generalize findings to those excluded |
| Matching | Creates matched sets of patients who have similar values of one or more confounders | Intuitive | Difficult to match on multiple confounders; only removes or reduces confounding by the matching factors; unmatched patients are excluded, reducing sample size, effect estimate precision, and generalizability |
| Active comparator | Comparing the treatment of interest to an active comparator rather than treatment nonuse | Mitigates confounding by indication; clinically relevant head-to-head comparison of two or more treatments | Cannot be used when there is only one treatment option |
| Multivariable adjustment | Potential confounders are included as covariates in regression models | Easy to implement in standard statistical software packages | Only controls for measured confounders; the total number of confounders that can be included in regression models is contingent on the number of outcome events |
| Propensity score matching | Each patient who received the treatment of interest is matched to one or more comparator patients with an equivalent propensity score, generating a matched cohort of treated and comparator patients with similar baseline characteristics | Preferred in studies with relatively few outcome events compared with the number of potential confounders; covariate balance between the treated and comparator groups can be checked in the matched cohort | Only controls for measured confounders; unmatched patients are excluded, reducing sample size, effect estimate precision, and generalizability |
| Propensity score weighting | The propensity score is used to generate weights applied to the original study cohort to create a pseudo-population of treated and comparator patients with similar baseline characteristics | Preferred in studies with relatively few outcome events compared with the number of potential confounders; covariate balance between the treated and comparator groups can be checked in the weighted cohort | Only controls for measured confounders; less intuitive than propensity score matching |
| G methods | Complex analytic methods that handle time-varying confounding in the setting of time-varying exposures | Appropriately handle time-varying confounding | Only controls for measured confounders; complex methods requiring advanced statistical expertise |

Addressing Confounding in the Design Phase

Restriction is a method that can be used for confounding control in the design phase. Similar to RCTs, restriction in an observational study involves setting criteria for study inclusion. By limiting the study to individuals who meet specific criteria, confounding by each respective inclusion criterion is either eliminated or reduced. For instance, in an observational study evaluating the risk of fracture associated with the use versus nonuse of benzodiazepines, age and sex are likely important confounders. Restricting the study cohort to males who are <65 years of age would eliminate confounding by sex and reduce confounding by age. Confounding by sex is eliminated because there is no variation in benzodiazepine use by sex: all benzodiazepine users and nonusers are male. Limiting the study cohort to individuals <65 years of age does not completely remove confounding by age, because benzodiazepine use patterns and fracture risk likely vary across the 18- to 64-year-old age group. Although restriction is an intuitive method that can be easily implemented, potential disadvantages include sample size reduction and decreased generalizability.
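As a minimal illustration of restriction, the pandas sketch below (hypothetical data frame and column names) limits the benzodiazepine cohort to males under 65 before any analysis:

```python
# Sketch of restriction: keep only males under 65, removing confounding
# by sex and reducing confounding by age. Data are invented.
import pandas as pd

cohort = pd.DataFrame({
    "age":   [58, 71, 44, 63, 80, 52],
    "sex":   ["M", "F", "M", "F", "M", "M"],
    "benzo": [1, 0, 0, 1, 1, 0],
})
restricted = cohort[(cohort["sex"] == "M") & (cohort["age"] < 65)]
print(restricted)  # analysis proceeds on this smaller, more homogeneous cohort
```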

Another confounding control strategy that can be used in the design phase is matching. In a cohort study, matching involves selecting a comparator group that is matched to the treatment group on one or more confounders. Usually, individual-level matching is performed. Consider the previously mentioned observational study evaluating the benzodiazepine-fracture association. Because age and sex are important confounders, one or more benzodiazepine nonusers would be matched to a patient taking a benzodiazepine on the basis of age and sex. For example, a 63-year-old female not taking a benzodiazepine would be matched to a 63-year-old female taking a benzodiazepine. Although exact matching on the basis of age is ideal, it may not be possible. Broader age-based matching categories—such as matching on age within 5 years—can be used, but residual confounding by age may remain. In addition, it is important to keep in mind that identifying matched pairs of treated and comparator patients becomes more difficult as the number of matching factors increases.
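A rough pandas sketch of individual-level exact matching on age and sex follows; the data and column names are invented, and the bookkeeping needed to match without replacement is omitted. Real studies would typically match within age calipers rather than on exact age.

```python
# Sketch of 1:1 exact matching of benzodiazepine nonusers to users
# on age and sex. Invented data; without-replacement logic omitted.
import pandas as pd

users = pd.DataFrame({"id": [1, 2], "age": [63, 58], "sex": ["F", "M"]})
nonusers = pd.DataFrame({"id": [10, 11, 12], "age": [63, 58, 70],
                         "sex": ["F", "M", "F"]})

# Join on the matching factors; users with no counterpart drop out.
matched = users.merge(nonusers, on=["age", "sex"],
                      suffixes=("_user", "_nonuser"))
matched = matched.drop_duplicates(subset="id_user")  # one match per user
print(matched)
```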

Specific to observational studies evaluating medical treatments, a design strategy that can be used to minimize confounding by indication is the use of an active comparator rather than a nonuser comparator. The treatment of interest and the selected comparator should have the same clinical indication and therapeutic role and, in the case of medications, the same mode of delivery ( 4 ). Furthermore, an active comparator is the only logical choice when intractable confounding by indication is expected. Beyond mitigating confounding by indication, head-to-head comparisons of two or more treatments with the same indication provide relevant information on comparative safety and effectiveness that can be used to inform the selection of one treatment over another in clinical practice.

Addressing Confounding in the Analytic Phase

There are several statistical approaches that can be used for confounding control in the analysis phase. Multivariable adjustment, which involves including potential confounders as covariates in regression models, is the most common analytic technique used. However, recently, propensity score methods, such as propensity score matching and propensity score weighting, have gained popularity ( 13 ).
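
To make multivariable adjustment concrete, the simulated sketch below fits a crude and an adjusted logistic regression with statsmodels; all variable names, effect sizes, and data are illustrative assumptions, not results from any study discussed here.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
n = 5000

# Simulate a confounder (age) that drives both treatment and outcome.
age = rng.uniform(40, 90, n)
treat = rng.binomial(1, 1 / (1 + np.exp(-(age - 65) / 10)))
true_log_or = 0.3  # true treatment effect on the log-odds scale
y = rng.binomial(1, 1 / (1 + np.exp(-(-5 + 0.05 * age + true_log_or * treat))))

# Crude model (treatment only) vs. model adjusted for the confounder:
# including age as a covariate moves the estimate toward the truth.
for cols in ([treat], [treat, age]):
    X = sm.add_constant(np.column_stack(cols))
    fit = sm.Logit(y, X).fit(disp=0)
    print(fit.params[1])  # treatment log-odds ratio estimate
```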

In studies evaluating medical treatments, a propensity score is a patient’s predicted probability of receiving the treatment of interest versus a comparator, given their measured baseline characteristics. This summary score is estimated for each patient in the study cohort and is subsequently used for confounding control. In propensity score matching, each patient who received the treatment of interest is matched to one or more patients who received the comparator with an equivalent propensity score. This results in the generation of a matched cohort of treated and comparator patients that have similar baseline characteristics. In propensity score weighting, the propensity score is used to generate weights that are applied to the original study cohort to create a pseudo-population of treated and comparator patients that have similar baseline characteristics ( 14 ). The resultant matched and weighted cohorts can be used to estimate the treatment-outcome association, where the influence of measured baseline confounding is minimized. Propensity score methods and multivariable adjustment typically yield similar adjusted estimates of the treatment-outcome association ( 13 ). However, because a propensity score combines multiple covariates into a single summary score, these methods are preferred when the exposure of interest is common and the outcome of interest is rare, a setting where multivariable outcome models are susceptible to overfitting. Readers interested in learning more about propensity score methods can refer to the tutorial provided by Fu et al. ( 15 ).
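
The sketch below shows the core of the weighting approach: estimating a propensity score from measured covariates, forming inverse probability of treatment weights, and checking covariate balance. It is a simulated, simplified illustration under assumed variable names and data, not the method of any particular study.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(1)
n = 2000
age = rng.uniform(40, 90, n)
sex = rng.binomial(1, 0.5, n)
p_treat = 1 / (1 + np.exp(-(0.05 * (age - 65) + 0.4 * sex)))
treat = rng.binomial(1, p_treat)

# 1) Estimate each patient's propensity score from measured baseline covariates.
X = np.column_stack([age, sex])
ps = LogisticRegression().fit(X, treat).predict_proba(X)[:, 1]

# 2) Weight treated patients by 1/ps and comparators by 1/(1 - ps),
#    creating a pseudo-population balanced on the measured covariates.
w = np.where(treat == 1, 1 / ps, 1 / (1 - ps))

# 3) Diagnostic: weighted covariate means should be similar between
#    groups (the covariate-balance check noted in the table above).
for name, col in [("age", age), ("sex", sex)]:
    m1 = np.average(col[treat == 1], weights=w[treat == 1])
    m0 = np.average(col[treat == 0], weights=w[treat == 0])
    print(f"{name}: treated={m1:.2f} comparator={m0:.2f}")
```

Note that the weights can only balance covariates that were measured and included in the propensity score model, which is why unmeasured confounding remains a limitation.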

G methods, such as inverse probability–weighted marginal structural models, are complex analytic methods that appropriately handle time-varying confounding in the setting of time-varying exposures. A thorough description of G methods is beyond the scope of this commentary and can be found elsewhere ( 11 ). However, it is important to recognize that the use of these methods is increasing in the field of nephrology.
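
As a highly simplified, illustrative sketch only: the weight construction behind an inverse probability–weighted marginal structural model multiplies, across visits, the (stabilized) probability of the treatment actually received. Everything below is simulated, the stabilizing numerator is deliberately simplified to the marginal treatment probability, and a real analysis would follow the weights with a weighted outcome model and robust variance estimation ( 11 ).

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(2)
n = 3000
w_total = np.ones(n)
L = rng.normal(size=n)   # time-varying confounder at the current visit
A_hist = np.zeros(n)     # treatment received at the previous visit

for t in range(2):       # two visits, binary treatment at each
    p_A = 1 / (1 + np.exp(-(0.8 * L + 0.5 * A_hist)))
    A = rng.binomial(1, p_A)

    # Denominator: P(A_t | confounder and treatment history).
    den = LogisticRegression().fit(
        np.column_stack([L, A_hist]), A
    ).predict_proba(np.column_stack([L, A_hist]))[:, 1]
    # Simplified stabilizing numerator: marginal P(A_t = 1).
    num = A.mean()
    w_total *= np.where(A == 1, num / den, (1 - num) / (1 - den))

    # The confounder at the next visit is affected by prior treatment —
    # the feature that breaks standard regression adjustment.
    L = 0.5 * L - 0.4 * A + rng.normal(size=n)
    A_hist = A

print(w_total[:5])  # stabilized weights for a weighted outcome model
```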

Common Sources of Residual Confounding

Despite the use of study designs and analytic strategies that aim to eliminate confounding, residual confounding may persist. Common reasons why residual confounding may be present are: ( 1 ) information on a confounder is not available; ( 2 ) the version of the confounding variable present in the data source is an imperfect surrogate or is misclassified; and ( 3 ) continuous confounders are parameterized as categorical variables, especially when overly broad categories are used ( 16 ).
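
Reason ( 3 ) can be seen in a small simulation: adjusting for an overly broad categorization of a continuous confounder leaves part of its effect mixed into the treatment estimate. The sketch below is illustrative only, with assumed names and effect sizes, and a true treatment effect of zero.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(3)
n = 20000
age = rng.uniform(18, 65, n)
treat = rng.binomial(1, 1 / (1 + np.exp(-(age - 40) / 8)))
y = 2.0 * age + 0.0 * treat + rng.normal(0, 5, n)  # true treatment effect = 0

coarse = (age >= 40).astype(float)  # one overly broad cut point

for label, conf in [("continuous age", age), ("age >= 40 only", coarse)]:
    X = sm.add_constant(np.column_stack([treat, conf]))
    est = sm.OLS(y, X).fit().params[1]
    print(f"adjusting for {label}: treatment estimate = {est:.2f}")
# Adjusting for the coarse category leaves a clearly nonzero
# treatment "effect" — residual confounding by age.
```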

Observational studies using real-world data can provide clinically actionable information on the potential benefits and harms of medical treatments in populations excluded from RCTs, such as patients with kidney disease. Confounding is a common source of bias threatening the validity of these studies. Thus, it is important to be aware of the types of confounding that may be present and to understand the advantages and disadvantages of common strategies used for confounding control.

Disclosures

M.M. Assimon reports receiving honoraria from the American Society of Nephrology and the International Society of Nephrology, and investigator-initiated research funding from the Renal Research Institute, a subsidiary of Fresenius Medical Care, North America in the last 3 years.

M.M. Assimon is supported by National Heart, Lung, and Blood Institute grant R01 HL152034.

Author Contributions

M.M. Assimon wrote the original draft and reviewed and edited the manuscript.


Former NASA Scientist Doing Experiment to Prove We Live in a Simulation

Did we really take the red pill, or the blue pill?

Could we be trapped inside a simulated reality, rather than the physical universe we usually assume?

It's a tantalizing idea, long pondered by philosophers and popularized by the 1999 blockbuster "The Matrix." What if there was a way to find out once and for all if we're living inside a computer?

A former NASA physicist named Thomas Campbell has taken it upon himself to do just that. He devised several experiments, as detailed in a 2017 paper published in The International Journal of Quantum Foundations, designed to detect if something is rendering the world around us like a video game.

Now, scientists at the California State Polytechnic University (CalPoly) have gotten started on the first experiment, putting Campbell's far-fetched hypothesis to the test.

And Campbell has set up an entire non-profit called Center for the Unification of Science and Consciousness (CUSAC) to fund these endeavors. The experiments are "expected to provide strong scientific evidence that we live in a computer-simulated virtual reality," according to a press release by the group.

Needless to say, it's an eyebrow-raising project. As always, extraordinary claims will require extraordinary evidence — but regardless, it's a fun idea.

Simulation Hypothesis

Campbell's experiments include a new spin on the double-slit experiment, a physics demonstration designed to show how light and matter can act like both waves and particles.

Campbell believes that if the observer is removed from these experiments, the recorded information will turn out never to have existed in the first place. That would contradict current quantum physics, which attributes such results to entanglement linking particles across a distance.

In simple terms, without a player, the universe around them doesn't exist, much like a video game — proof, in Campbell's thinking, that the universe is exclusively "participatory."

Campbell isn't the first to explore a simulation hypothesis. Back in 2003, Swedish philosopher Nick Bostrom published a paper titled "Are You Living in a Computer Simulation?"

Basically, his idea was that if we progress far enough technologically, we'll probably end up running a simulation of our ancestors. Give those simulated ancestors enough time, and they'll end up simulating their own ancestors. Eventually, most minds in existence will be inside layers of simulations — meaning that we probably are too.

Campbell's hypothesis takes a different tack than Bostrom's "ancestor simulation," arguing, in the words of CUSAC's press release, that "consciousness is not a product of the simulation — it is fundamental to reality."

If he were to succeed in his bid to prove that humanity is trapped in a virtual reality, it would upend our basic understanding of the world around us.

Campbell argued that the five experiments could "challenge the conventional understanding of reality and uncover profound connections between consciousness and the cosmos."

More on the simulation hypothesis: Famous Hacker Thinks We're Living in Simulation, Wants to Escape


Experimental Study on Improving the Performance of Cement Mortar with Self-Synthesized Viscosity-Reducing Polycarboxylic Acid Superplasticizer


1. Introduction

2. Experiment
2.1. Materials and Sample Preparation
2.1.1. Cement
2.1.2. Self-Synthesized VRPCE
2.1.3. Cement Paste
2.1.4. Cement Mortar
2.2. Methods
2.2.3. Zeta Potential
2.2.4. Cement Particle Size
2.2.9. Fluidity
2.2.10. Compressive Strength
2.2.11. Shrinkage
2.2.12. Creep

3. Results
3.1. Molecular Weight and Molecular Weight Distribution
3.2. Adsorption
3.3. Dispersion
3.4. Particle Size
3.5. Compositions
3.9. Fluidity
3.10. Compressive Strength
3.11. Shrinkage
3.12. Creep

4. Discussion
4.1. Molecular Weight and Molecular Weight Distribution
4.2. Adsorption
4.3. Dispersion
4.4. Particle Size
4.5. Compositions
4.9. Fluidity
4.10. Compressive Strength
4.11. Shrinkage
4.12. Creep

5. Conclusions

  • Reducing the content of HPEG and increasing the content of AA reduced the number-average and weight-average molecular weights of the VRPCE from 46,162 to 34,053 and from 82,186 to 71,985, respectively. Moreover, varying the HPEG and AA contents had only a small impact on the polydispersity index, which remained in the range of 2.0–2.2;
  • When the concentration of the VRPCEs increased by five times, the adsorption capacity of VRPCE1, VRPCE2, and VRPCE3 onto cement particles increased by 5.86, 6.01, and 7.64 times, respectively. Therefore, the effect on increasing the adsorption amount of cement particles was VRPCE1 > VRPCE2 > VRPCE3. When the concentration of the VRPCEs was 10 g/L, the absolute zeta potential values of the cement particles increased by 33.1%, 33.3%, and 32.0% compared with those at a concentration of 2 g/L, and the cement particle size increased by 22.87%, 39.56%, and 46.86%. Hence, the effect on increasing both the absolute zeta potential of the cement particle surface and the cement particle size was VRPCE1 < VRPCE2 < VRPCE3;
  • The porosity of C0, C1, C2, and C3 increased progressively; relative to C0, the porosity of C1, C2, and C3 increased by 0.03%, 30.0%, and 56.9%, and the mass loss values of C1, C2, and C3 increased by 23.0%, 37.0%, and 55.5%. Therefore, the effect on improving the hydration degree of the cement, increasing the porosity, and optimizing the pore structure was VRPCE1 < VRPCE2 < VRPCE3;
  • When the standing time was 2 h, the flow times of M1, M2, and M3 decreased by 23.8%, 44.3%, and 64.2% compared with M0. The compressive strengths of M1, M2, and M3 increased by 16.7%, 6.5%, and 2.1% compared with M0. The shrinkage of M1, M2, and M3 increased by 6.6%, 24.2%, and 35.4% compared with M0, and the creep degrees of M1, M2, and M3 decreased by 7.1%, 15.2%, and 22.5% compared with M0. Hence, the impact on increasing the fluidity and shrinkage of the cement mortar and on reducing its compressive strength and creep was VRPCE1 < VRPCE2 < VRPCE3.

Author Contributions

Data Availability Statement

Conflicts of Interest



Chemical composition of the cement (wt.%):

| SiO₂ | Al₂O₃ | Fe₂O₃ | CaO | MgO | SO₃ | K₂O | Na₂O | LOI |
|------|-------|-------|-----|-----|-----|-----|------|-----|
| 21.92 | 4.46 | 3.43 | 64.55 | 1.98 | 0.34 | 0.65 | 0.46 | 2.21 |

Synthesis proportions of the VRPCEs:

| Type | HPEG3000 | AA | PFM | MDMA |
|------|----------|----|-----|------|
| VRPCE1 | 87 | 9 | 1 | 3 |
| VRPCE2 | 81 | 15 | 1 | 3 |
| VRPCE3 | 79 | 17 | 1 | 3 |

Mix proportions of the cement pastes:

| Code | VRPCE Type | Water/Cement | VRPCE Content (%) |
|------|------------|--------------|-------------------|
| C0 | – | 0.25 | 0 |
| C1 | VRPCE1 | 0.25 | 0.3 |
| C2 | VRPCE2 | 0.25 | 0.3 |
| C3 | VRPCE3 | 0.25 | 0.3 |

Mix proportions of the cement mortars:

| Code | VRPCE Type | Water/Cement | Cement/Sand | VRPCE Content (%) |
|------|------------|--------------|-------------|-------------------|
| M0 | – | 0.25 | 0.5 | 0 |
| M1 | VRPCE1 | 0.25 | 0.5 | 0.3 |
| M2 | VRPCE2 | 0.25 | 0.5 | 0.3 |
| M3 | VRPCE3 | 0.25 | 0.5 | 0.3 |

Molecular weights of the VRPCEs:

| Code | Mn | Mw | PDI |
|------|----|----|-----|
| VRPCE1 | 40,162 | 82,186 | 2.05 |
| VRPCE2 | 37,510 | 80,598 | 2.15 |
| VRPCE3 | 34,053 | 71,985 | 2.11 |

Porosity of the hardened cement pastes:

| Code | C0 | C1 | C2 | C3 |
|------|----|----|----|----|
| Porosity (%) | 7.02 | 7.04 | 9.12 | 11.01 |

Wang, Z.; Shen, Y.; Li, Y.; Tian, Y. Experimental Study on Improving the Performance of Cement Mortar with Self-Synthesized Viscosity-Reducing Polycarboxylic Acid Superplasticizer. Buildings 2024, 14, 2418. https://doi.org/10.3390/buildings14082418
