Determining the relationship between two or more variables.
A correlational study is a type of research design that looks at the relationships between two or more variables. Correlational studies are non-experimental, which means that the experimenter does not manipulate or control any of the variables.
A correlation refers to a relationship between two variables. Correlations can be strong or weak and positive or negative. Sometimes, there is no correlation.
There are three possible outcomes of a correlation study: a positive correlation, a negative correlation, or no correlation. Researchers can present the results using a numerical value called the correlation coefficient, a measure of the correlation strength. It can range from –1.00 (negative) to +1.00 (positive). A correlation coefficient of 0 indicates no correlation.
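To make the three outcomes concrete, here is a minimal sketch that computes the correlation coefficient for three small data sets. The helper name `pearson_r` and every number below are made up for illustration; they are assumptions, not data from any study.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient for two equal-length lists of numbers."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of the products of each pair's deviations from the two means.
    co_deviation = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sums of squared deviations for each variable.
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return co_deviation / sqrt(ss_x * ss_y)

# Hypothetical measurements for five people.
hours_studied = [1, 2, 3, 4, 5]
exam_score = [52, 60, 71, 80, 88]   # rises as hours rise
errors_made = [9, 7, 6, 4, 2]       # falls as hours rise
shoe_size = [8, 10, 7, 10, 8]       # no linear trend with hours

r_positive = pearson_r(hours_studied, exam_score)   # close to +1
r_negative = pearson_r(hours_studied, errors_made)  # close to -1
r_none = pearson_r(hours_studied, shoe_size)        # essentially 0
```

The sign of the result gives the direction of the relationship, and its distance from zero gives the strength.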
Correlational studies are often used in psychology, as well as other fields like medicine. Correlational research is a preliminary way to gather information about a topic. The method is also useful if researchers are unable to perform an experiment.
Researchers use correlations to see if a relationship between two or more variables exists, but the variables themselves are not under the control of the researchers.
While correlational research can demonstrate a relationship between variables, it cannot prove that changing one variable will change another. In other words, correlational studies cannot prove cause-and-effect relationships.
When you encounter research that refers to a "link" or an "association" between two things, the authors are most likely describing a correlational study.
There are three types of correlational research: naturalistic observation, the survey method, and archival research. Each type has its own purpose, as well as its pros and cons.
The naturalistic observation method involves observing and recording variables of interest in a natural setting without interference or manipulation.
Can inspire ideas for further research
Option if lab experiment not available
Variables are viewed in natural setting
Can be time-consuming and expensive
Extraneous variables can't be controlled
No scientific control of variables
Subjects might behave differently if aware of being observed
This method is well-suited to studies where researchers want to see how variables behave in their natural setting or state. Inspiration can then be drawn from the observations to inform future avenues of research.
In some cases, it might be the only method available to researchers; for example, if lab experimentation would be precluded by access, resources, or ethics. It might be preferable to not being able to conduct research at all, but the method can be costly and usually takes a lot of time.
Naturalistic observation presents several challenges for researchers. For one, it does not allow them to control or influence the variables in any way, nor can they change any possible external variables.
In addition, observing variables in their natural setting does not guarantee that researchers will get reliable data, or that the information they gather will be free from bias.
For example, study subjects might act differently if they know that they are being watched. The researchers might not be aware that the behavior that they are observing is not necessarily the subject's natural state (i.e., how they would act if they did not know they were being watched).
Researchers also need to be aware of their biases, which can affect the observation and interpretation of a subject's behavior.
Surveys and questionnaires are some of the most common methods used for psychological research. The survey method involves having a random sample of participants complete a survey, test, or questionnaire related to the variables of interest. Random sampling is vital to the generalizability of a survey's results.
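As a rough sketch of what drawing a random sample can look like in practice, the snippet below picks 50 participants from a hypothetical list of 1,000 ID numbers. The population, the sample size, and the seed are all assumptions chosen for illustration.

```python
import random

# Hypothetical sampling frame: ID numbers for 1,000 members of the population.
population = list(range(1, 1001))

# A fixed seed is used only so that the example is repeatable.
rng = random.Random(42)

# Simple random sample without replacement: every member has the same
# chance of being selected, and nobody can be selected twice.
sample = rng.sample(population, k=50)
```

Because selection does not depend on any characteristic of the participants, the sample is more likely to resemble the population it was drawn from.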
Cheap, easy, and fast
Can collect large amounts of data in a short amount of time
Results can be affected by poor survey questions
Results can be affected by unrepresentative sample
Outcomes can be affected by participants
If researchers need to gather a large amount of data in a short period of time, a survey is likely to be the fastest, easiest, and cheapest option.
It's also a flexible method because it lets researchers create data-gathering tools that will help ensure they get the information they need (survey responses) from all the sources they want to use (a random sample of participants taking the survey).
Survey data might be cost-efficient and easy to get, but it has its downsides. For one, the data is not always reliable, particularly if the survey questions are poorly written or the overall design or delivery is weak. Data is also affected by specific faults, such as unrepresented or underrepresented samples.
The use of surveys relies on participants to provide useful data. Researchers need to be aware of the specific factors related to the people taking the survey that will affect its outcome.
For example, some people might struggle to understand the questions. A person might answer a particular way to try to please the researchers or to try to control how the researchers perceive them (such as trying to make themselves "look better").
Sometimes, respondents might not even realize that their answers are incorrect or misleading because of mistaken memories.
Many areas of psychological research benefit from analyzing studies that were conducted long ago by other researchers, as well as reviewing historical records and case studies.
For example, in a study known as "The Irritable Heart," researchers used digitized records containing information on American Civil War veterans to learn more about post-traumatic stress disorder (PTSD).
Large amount of data
Can be less expensive
Researchers cannot change participant behavior
Can be unreliable
Information might be missing
No control over data collection methods
Using records, databases, and libraries that are publicly accessible or accessible through their institution can help researchers who might not have a lot of money to support their research efforts.
Free and low-cost resources are available to researchers at all levels through academic institutions, museums, and data repositories around the world.
Another potential benefit is that these sources often provide an enormous amount of data that was collected over a very long period of time, which can give researchers a way to view trends, relationships, and outcomes related to their research.
While the inability to change variables can be a disadvantage of some methods, it can be a benefit of archival research. That said, using historical records or information that was collected a long time ago also presents challenges. For one, important information might be missing or incomplete and some aspects of older studies might not be useful to researchers in a modern context.
A primary issue with archival research is reliability. When reviewing old research, little information might be available about who conducted the research, how a study was designed, who participated in the research, as well as how data was collected and interpreted.
Researchers can also be presented with ethical quandaries—for example, should modern researchers use data from studies that were conducted unethically or with questionable ethics?
You've probably heard the phrase, "correlation does not equal causation." This means that while correlational research can suggest that there is a relationship between two variables, it cannot prove that one variable will change another.
For example, researchers might perform a correlational study that suggests there is a relationship between academic success and a person's self-esteem. However, the study cannot show that academic success changes a person's self-esteem.
To determine why the relationship exists, researchers would need to consider and experiment with other variables, such as the subject's social relationships, cognitive abilities, personality, and socioeconomic status.
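The role of such other variables can be illustrated with a small simulation. In the sketch below, a hypothetical `third_factor` influences both academic success and self-esteem, while neither of the two affects the other. All names, sample sizes, and noise levels are assumptions made for illustration only.

```python
import random
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient for two equal-length lists of numbers."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    co_deviation = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return co_deviation / sqrt(ss_x * ss_y)

rng = random.Random(0)  # fixed seed so the simulation is repeatable

# Hypothetical unmeasured variable for 500 simulated people.
third_factor = [rng.gauss(0, 1) for _ in range(500)]

# Each outcome is the third factor plus its own independent noise.
# Neither outcome appears in the formula for the other.
academic_success = [t + rng.gauss(0, 0.5) for t in third_factor]
self_esteem = [t + rng.gauss(0, 0.5) for t in third_factor]

r = pearson_r(academic_success, self_esteem)
```

Under these assumptions the theoretical correlation is 0.8, even though there is no causal path between the two measured variables in the simulation. A correlational study that measured only those two variables could not tell this situation apart from one in which academic success really does change self-esteem.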
The difference between a correlational study and an experimental study involves the manipulation of variables. Researchers do not manipulate variables in a correlational study, but they do control and systematically vary the independent variables in an experimental study. Correlational studies allow researchers to detect the presence and strength of a relationship between variables, while experimental studies allow researchers to look for cause and effect relationships.
If the study involves the systematic manipulation of the levels of a variable, it is an experimental study. If researchers are measuring what is already present without actually changing the variables, then it is a correlational study.
The variables in a correlational study are what the researcher measures. Once measured, researchers can then use statistical analysis to determine the existence, strength, and direction of the relationship. However, while a correlational study can show that variable X and variable Y are related, that does not mean that X causes Y.
The goal of correlational research is often to look for relationships, describe these relationships, and then make predictions. Such research can also often serve as a jumping off point for future experimental research.
By Kendra Cherry, MSEd Kendra Cherry, MS, is a psychosocial rehabilitation specialist, psychology educator, and author of the "Everything Psychology Book."
Psychologists agree that if their ideas and theories about human behavior are to be taken seriously, they must be backed up by data. However, the research of different psychologists is designed with different goals in mind, and the different goals require different approaches. These varying approaches, summarized in Table 2.2 “Characteristics of the Three Research Designs”, are known as research designs. A research design is the specific method a researcher uses to collect, analyze, and interpret data. Psychologists use three major types of research designs in their research, and each provides an essential avenue for scientific investigation. Descriptive research is research designed to provide a snapshot of the current state of affairs. Correlational research is research designed to discover relationships among variables and to allow the prediction of future events from present knowledge. Experimental research is research in which initial equivalence among research participants in more than one group is created, followed by a manipulation of a given experience for these groups and a measurement of the influence of the manipulation. The three research designs differ in their strengths and limitations, and it is important to understand how each differs.
Table 2.2 Characteristics of the Three Research Designs
Research design | Goal | Advantages | Disadvantages |
---|---|---|---|
Descriptive | To create a snapshot of the current state of affairs | Provides a relatively complete picture of what is occurring at a given time. Allows the development of questions for further study. | Does not assess relationships among variables. May be unethical if participants do not know they are being observed. |
Correlational | To assess the relationships between and among two or more variables | Allows testing of expected relationships between and among variables and the making of predictions. Can assess these relationships in everyday life events. | Cannot be used to draw inferences about the causal relationships between and among the variables. |
Experimental | To assess the causal impact of one or more experimental manipulations on a dependent variable | Allows drawing of conclusions about the causal relationships among variables. | Cannot experimentally manipulate many important variables. May be expensive and time consuming. |
There are three major research designs used by psychologists, and each has its own advantages and disadvantages.
Stangor, C. (2011). Research methods for the behavioral sciences (4th ed.). Mountain View, CA: Cengage.
Descriptive research is designed to create a snapshot of the current thoughts, feelings, or behavior of individuals. This section reviews three types of descriptive research: case studies, surveys, and naturalistic observation.
Sometimes the data in a descriptive research project are based on only a small set of individuals, often only one person or a single small group. These research designs are known as case studies: descriptive records of one or more individuals’ experiences and behavior. Sometimes case studies involve ordinary individuals, as when developmental psychologist Jean Piaget used his observation of his own children to develop his stage theory of cognitive development. More frequently, case studies are conducted on individuals who have unusual or abnormal experiences or characteristics or who find themselves in particularly difficult or stressful situations. The assumption is that by carefully studying individuals who are socially marginal, who are experiencing unusual situations, or who are going through a difficult phase in their lives, we can learn something about human nature.
Sigmund Freud was a master of using the psychological difficulties of individuals to draw conclusions about basic psychological processes. Freud wrote case studies of some of his most interesting patients and used these careful examinations to develop his important theories of personality. One classic example is Freud’s description of “Little Hans,” a child whose fear of horses the psychoanalyst interpreted in terms of repressed sexual impulses and the Oedipus complex (Freud, 1909/1964).
Political polls reported in newspapers and on the Internet are descriptive research designs that provide snapshots of the likely voting behavior of a population.
Another well-known case study is Phineas Gage, a man whose thoughts and emotions were extensively studied by cognitive psychologists after a railroad spike was blasted through his skull in an accident. Although there is question about the interpretation of this case study (Kotowicz, 2007), it did provide early evidence that the brain’s frontal lobe is involved in emotion and morality (Damasio et al., 2005). An interesting example of a case study in clinical psychology is described by Rokeach (1964), who investigated in detail the beliefs and interactions among three patients with schizophrenia, all of whom were convinced they were Jesus Christ.
In other cases the data from descriptive research projects come in the form of a survey: a measure administered through either an interview or a written questionnaire to get a picture of the beliefs or behaviors of a sample of people of interest. The people chosen to participate in the research (known as the sample) are selected to be representative of all the people that the researcher wishes to know about (the population). In election polls, for instance, a sample is taken from the population of all “likely voters” in the upcoming elections.
The results of surveys may sometimes be rather mundane, such as “Nine out of ten doctors prefer Tymenocin,” or “The median income in Montgomery County is $36,712.” Yet other times (particularly in discussions of social behavior), the results can be shocking: “More than 40,000 people are killed by gunfire in the United States every year,” or “More than 60% of women between the ages of 50 and 60 suffer from depression.” Descriptive research is frequently used by psychologists to get an estimate of the prevalence (or incidence) of psychological disorders.
A final type of descriptive research, known as naturalistic observation, is research based on the observation of everyday events. For instance, a developmental psychologist who watches children on a playground and describes what they say to each other while they play is conducting descriptive research, as is a biopsychologist who observes animals in their natural habitats. One example of observational research involves a systematic procedure known as the strange situation, used to get a picture of how adults and young children interact. The data that are collected in the strange situation are systematically coded in a coding sheet such as that shown in Table 2.3 “Sample Coding Form Used to Assess Child’s and Mother’s Behavior in the Strange Situation”.
Table 2.3 Sample Coding Form Used to Assess Child’s and Mother’s Behavior in the Strange Situation
Coder name: | ||||
---|---|---|---|---|
Mother and baby play alone | ||||
Mother puts baby down | ||||
Stranger enters room | ||||
Mother leaves room; stranger plays with baby | ||||
Mother reenters, greets and may comfort baby, then leaves again | ||||
Stranger tries to play with baby | ||||
Mother reenters and picks up baby | ||||
The baby moves toward, grasps, or climbs on the adult. | ||||
The baby resists being put down by the adult by crying or trying to climb back up. | ||||
The baby pushes, hits, or squirms to be put down from the adult’s arms. | ||||
The baby turns away or moves away from the adult. | ||||
This table represents a sample coding sheet from an episode of the “strange situation,” in which an infant (usually about 1 year old) is observed playing in a room with two adults: the child’s mother and a stranger. Each of the four coding categories is scored by the coder from 1 (the baby makes no effort to engage in the behavior) to 7 (the baby makes a significant effort to engage in the behavior). More information about the meaning of the coding can be found in Ainsworth, Blehar, Waters, and Wall (1978).
The results of descriptive research projects are analyzed using descriptive statistics: numbers that summarize the distribution of scores on a measured variable. Most variables have distributions similar to that shown in Figure 2.5 “Height Distribution”, where most of the scores are located near the center of the distribution, and the distribution is symmetrical and bell-shaped. A data distribution that is shaped like a bell is known as a normal distribution.
Table 2.4 Height and Family Income for 25 Students
Student name | Height in inches | Family income in dollars |
---|---|---|
Lauren | 62 | 48,000 |
Courtnie | 62 | 57,000 |
Leslie | 63 | 93,000 |
Renee | 64 | 107,000 |
Katherine | 64 | 110,000 |
Jordan | 65 | 93,000 |
Rabiah | 66 | 46,000 |
Alina | 66 | 84,000 |
Young Su | 67 | 68,000 |
Martin | 67 | 49,000 |
Hanzhu | 67 | 73,000 |
Caitlin | 67 | 3,800,000 |
Steven | 67 | 107,000 |
Emily | 67 | 64,000 |
Amy | 68 | 67,000 |
Jonathan | 68 | 51,000 |
Julian | 68 | 48,000 |
Alissa | 68 | 93,000 |
Christine | 69 | 93,000 |
Candace | 69 | 111,000 |
Xiaohua | 69 | 56,000 |
Charlie | 70 | 94,000 |
Timothy | 71 | 73,000 |
Ariane | 72 | 70,000 |
Logan | 72 | 44,000 |
Figure 2.5 Height Distribution
The distribution of the heights of the students in a class will form a normal distribution. In this sample the mean (M) = 67.12 and the standard deviation (s) = 2.74.
A distribution can be described in terms of its central tendency (that is, the point in the distribution around which the data are centered) and its dispersion, or spread. The arithmetic average, or arithmetic mean, is the most commonly used measure of central tendency. It is computed by calculating the sum of all the scores of the variable and dividing this sum by the number of participants in the distribution (denoted by the letter N). In the data presented in Figure 2.5 “Height Distribution”, the mean height of the students is 67.12 inches. The sample mean is usually indicated by the letter M.
In some cases, however, the data distribution is not symmetrical. This occurs when there are one or more extreme scores (known as outliers) at one end of the distribution. Consider, for instance, the variable of family income (see Figure 2.6 “Family Income Distribution”), which includes an outlier (a value of $3,800,000). In this case the mean is not a good measure of central tendency. Although it appears from Figure 2.6 “Family Income Distribution” that the central tendency of the family income variable should be around $70,000, the mean family income is actually $223,960. The single very extreme income has a disproportionate impact on the mean, resulting in a value that does not represent the central tendency well.
The median is used as an alternative measure of central tendency when distributions are not symmetrical. The median is the score in the center of the distribution, meaning that 50% of the scores are greater than the median and 50% of the scores are less than the median. In our case, the median family income ($73,000) is a much better indication of central tendency than is the mean family income ($223,960).
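The calculation of the mean can be checked directly against the heights listed in Table 2.4. The snippet below is only an illustrative sketch; the variable names are arbitrary.

```python
# Heights in inches for the 25 students in Table 2.4.
heights = [62, 62, 63, 64, 64, 65, 66, 66, 67, 67, 67, 67, 67,
           67, 68, 68, 68, 68, 69, 69, 69, 70, 71, 72, 72]

n = len(heights)                 # N = 25 participants
mean_height = sum(heights) / n   # 1678 / 25 = 67.12 inches
```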
Figure 2.6 Family Income Distribution
The distribution of family incomes is likely to be nonsymmetrical because some incomes can be very large in comparison to most incomes. In this case the median or the mode is a better indicator of central tendency than is the mean.
A final measure of central tendency, known as the mode, represents the value that occurs most frequently in the distribution. You can see from Figure 2.6 “Family Income Distribution” that the mode for the family income variable is $93,000 (it occurs four times).
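The three measures of central tendency can be compared side by side using the family income column of Table 2.4. This is a small sketch using Python's standard `statistics` module; the variable names are arbitrary.

```python
from statistics import mean, median, mode

# Family income in dollars for the 25 students in Table 2.4.
family_income = [48_000, 57_000, 93_000, 107_000, 110_000, 93_000, 46_000,
                 84_000, 68_000, 49_000, 73_000, 3_800_000, 107_000, 64_000,
                 67_000, 51_000, 48_000, 93_000, 93_000, 111_000, 56_000,
                 94_000, 73_000, 70_000, 44_000]

mean_income = mean(family_income)      # 223,960: pulled upward by the outlier
median_income = median(family_income)  # 73,000: the middle of the 25 scores
mode_income = mode(family_income)      # 93,000: the most frequent value
```

Removing the single $3,800,000 value would change the mean dramatically but would barely move the median, which is why the median is preferred for skewed distributions.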
In addition to summarizing the central tendency of a distribution, descriptive statistics convey information about how the scores of the variable are spread around the central tendency. Dispersion refers to the extent to which the scores are all tightly clustered around the central tendency, like this:
Or they may be more spread out away from it, like this:
One simple measure of dispersion is to find the largest (the maximum) and the smallest (the minimum) observed values of the variable and to compute the range of the variable as the maximum observed score minus the minimum observed score. You can check that the range of the height variable in Figure 2.5 “Height Distribution” is 72 – 62 = 10. The standard deviation, symbolized as s, is the most commonly used measure of dispersion. Distributions with a larger standard deviation have more spread. The standard deviation of the height variable is s = 2.74, and the standard deviation of the family income variable is s = $745,337.
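Both measures of dispersion can be reproduced from the data in Table 2.4. The sketch below assumes the sample formula for the standard deviation (dividing by N – 1), which is what `statistics.stdev` computes and which matches the reported values.

```python
from statistics import stdev

# Heights in inches and family income in dollars from Table 2.4.
heights = [62, 62, 63, 64, 64, 65, 66, 66, 67, 67, 67, 67, 67,
           67, 68, 68, 68, 68, 69, 69, 69, 70, 71, 72, 72]
family_income = [48_000, 57_000, 93_000, 107_000, 110_000, 93_000, 46_000,
                 84_000, 68_000, 49_000, 73_000, 3_800_000, 107_000, 64_000,
                 67_000, 51_000, 48_000, 93_000, 93_000, 111_000, 56_000,
                 94_000, 73_000, 70_000, 44_000]

# Range: maximum observed score minus minimum observed score.
height_range = max(heights) - min(heights)   # 72 - 62 = 10

# Sample standard deviation (divides the sum of squared deviations by N - 1).
s_height = stdev(heights)         # about 2.74
s_income = stdev(family_income)   # about 745,337
```

The very large standard deviation for income is driven almost entirely by the one outlier, which shows that the standard deviation, like the mean, is sensitive to extreme scores.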
An advantage of descriptive research is that it attempts to capture the complexity of everyday behavior. Case studies provide detailed information about a single person or a small group of people, surveys capture the thoughts or reported behaviors of a large population of people, and naturalistic observation objectively records the behavior of people or animals as it occurs naturally. Thus descriptive research is used to provide a relatively complete understanding of what is currently happening.
Despite these advantages, descriptive research has a distinct disadvantage in that, although it allows us to get an idea of what is currently happening, it is usually limited to static pictures. Although descriptions of particular experiences may be interesting, they are not always transferable to other individuals in other situations, nor do they tell us exactly why specific behaviors or events occurred. For instance, descriptions of individuals who have suffered a stressful event, such as a war or an earthquake, can be used to understand the individuals’ reactions to the event but cannot tell us anything about the long-term effects of the stress. And because there is no comparison group that did not experience the stressful situation, we cannot know what these individuals would be like if they hadn’t had the stressful experience.
In contrast to descriptive research, which is designed primarily to provide static pictures, correlational research involves the measurement of two or more relevant variables and an assessment of the relationship between or among those variables. For instance, the variables of height and weight are systematically related (correlated) because taller people generally weigh more than shorter people. In the same way, study time and memory errors are also related, because the more time a person is given to study a list of words, the fewer errors he or she will make. When there are two variables in the research design, one of them is called the predictor variable and the other the outcome variable. The research design can be visualized like this, where the curved arrow represents the expected correlation between the two variables:
Figure 2.2.2
One way of organizing the data from a correlational study with two variables is to graph the values of each of the measured variables using a scatter plot. As you can see in Figure 2.10 “Examples of Scatter Plots”, a scatter plot is a visual image of the relationship between two variables. A point is plotted for each individual at the intersection of his or her scores for the two variables. When the association between the variables on the scatter plot can be easily approximated with a straight line, as in parts (a) and (b) of Figure 2.10 “Examples of Scatter Plots”, the variables are said to have a linear relationship.
When the straight line indicates that individuals who have above-average values for one variable also tend to have above-average values for the other variable, as in part (a), the relationship is said to be positive linear. Examples of positive linear relationships include those between height and weight, between education and income, and between age and mathematical abilities in children. In each case people who score higher on one of the variables also tend to score higher on the other variable. Negative linear relationships, in contrast, as shown in part (b), occur when above-average values for one variable tend to be associated with below-average values for the other variable. Examples of negative linear relationships include those between the age of a child and the number of diapers the child uses, and between the amount of practice on a learning task and the number of errors made on it. In these cases people who score higher on one of the variables tend to score lower on the other variable.
Relationships between variables that cannot be described with a straight line are known as nonlinear relationships. Part (c) of Figure 2.10 “Examples of Scatter Plots” shows a common pattern in which the distribution of the points is essentially random. In this case there is no relationship at all between the two variables, and they are said to be independent. Parts (d) and (e) of Figure 2.10 “Examples of Scatter Plots” show patterns of association in which, although there is an association, the points are not well described by a single straight line. For instance, part (d) shows the type of relationship that frequently occurs between anxiety and performance. Increases in anxiety from low to moderate levels are associated with performance increases, whereas increases in anxiety from moderate to high levels are associated with decreases in performance. Relationships that change in direction and thus are not described by a single straight line are called curvilinear relationships.
Figure 2.10 Examples of Scatter Plots
Some examples of relationships between two variables as shown in scatter plots. Note that the Pearson correlation coefficient (r) between variables that have curvilinear relationships will likely be close to zero.
Adapted from Stangor, C. (2011). Research methods for the behavioral sciences (4th ed.). Mountain View, CA: Cengage.
The most common statistical measure of the strength of linear relationships among variables is the Pearson correlation coefficient, which is symbolized by the letter r. The value of the correlation coefficient ranges from r = –1.00 to r = +1.00. The direction of the linear relationship is indicated by the sign of the correlation coefficient. Positive values of r (such as r = .54 or r = .67) indicate that the relationship is positive linear (i.e., the pattern of the dots on the scatter plot runs from the lower left to the upper right), whereas negative values of r (such as r = –.30 or r = –.72) indicate negative linear relationships (i.e., the dots run from the upper left to the lower right). The strength of the linear relationship is indexed by the distance of the correlation coefficient from zero (its absolute value). For instance, r = –.54 is a stronger relationship than r = .30, and r = .72 is a stronger relationship than r = –.57. Because the Pearson correlation coefficient only measures linear relationships, variables that have curvilinear relationships are not well described by r, and the observed correlation will often be close to zero even when the variables are clearly related.
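Two points from this paragraph can be checked with a short sketch: strength depends only on the absolute value of r, and a curvilinear relationship can produce an r of zero. The anxiety and performance scores below are invented for illustration (an inverted U that is symmetric around its peak), and `pearson_r` and `is_stronger` are hypothetical helper names.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient for two equal-length lists of numbers."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    co_deviation = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return co_deviation / sqrt(ss_x * ss_y)

def is_stronger(r1, r2):
    """True if r1 describes a stronger linear relationship than r2."""
    # Strength is the distance from zero, so the sign is ignored.
    return abs(r1) > abs(r2)

# The comparisons made in the text.
first_comparison = is_stronger(-0.54, 0.30)    # True
second_comparison = is_stronger(0.72, -0.57)   # True

# Hypothetical inverted-U data: performance peaks at moderate anxiety.
anxiety = [1, 2, 3, 4, 5, 6, 7]
performance = [1, 6, 9, 10, 9, 6, 1]

r_curvilinear = pearson_r(anxiety, performance)  # 0 for this symmetric pattern
```

Performance is completely determined by anxiety in this made-up example, yet r is zero, which is why a scatter plot should be inspected before a low correlation is read as "no relationship."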
It is also possible to study relationships among more than two measures at the same time. A research design in which more than one predictor variable is used to predict a single outcome variable is analyzed through multiple regression (Aiken & West, 1991). Multiple regression is a statistical technique, based on correlation coefficients among variables, that allows predicting a single outcome variable from more than one predictor variable . For instance, Figure 2.11 “Prediction of Job Performance From Three Predictor Variables” shows a multiple regression analysis in which three predictor variables are used to predict a single outcome. The use of multiple regression analysis shows an important advantage of correlational research designs—they can be used to make predictions about a person’s likely score on an outcome variable (e.g., job performance) based on knowledge of other variables.
Figure 2.11 Prediction of Job Performance From Three Predictor Variables
Multiple regression allows scientists to predict the scores on a single outcome variable using more than one predictor variable.
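As a rough illustration of the idea, the sketch below fits a multiple regression with three predictors by ordinary least squares. Everything in it is an assumption made for the example: the predictor scores are invented, the outcome is constructed to follow an exact linear rule so that the recovered weights are easy to check, and the helper names (`solve_linear_system`, `fit_multiple_regression`, `predict`) are hypothetical. Real analyses would normally use a statistics package and would involve noisy data.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    # Build the augmented matrix [a | b] so the inputs are not modified
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        # Swap in the row with the largest pivot to limit rounding error
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [v - factor * p for v, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]

def fit_multiple_regression(predictors, outcome):
    """Least-squares fit of outcome = b0 + b1*x1 + ... + bk*xk.

    predictors: one row of predictor scores per person.
    Returns the list [b0, b1, ..., bk].
    """
    # Prepend a constant 1 to each row so that b0 acts as the intercept
    x = [[1.0] + list(row) for row in predictors]
    k = len(x[0])
    # Normal equations: (X^T X) b = X^T y
    xtx = [[sum(row[i] * row[j] for row in x) for j in range(k)] for i in range(k)]
    xty = [sum(row[i] * y for row, y in zip(x, outcome)) for i in range(k)]
    return solve_linear_system(xtx, xty)

def predict(coefficients, scores):
    """Predicted outcome for one person's predictor scores."""
    return coefficients[0] + sum(b * s for b, s in zip(coefficients[1:], scores))

# Hypothetical scores on three predictor variables for six people
predictor_scores = [
    [3, 5, 2], [4, 2, 1], [5, 6, 3],
    [2, 7, 4], [6, 3, 2], [7, 8, 1],
]
# Outcome built as 1 + 2*x1 + 0.5*x2 - 1*x3 (no noise, for illustration)
job_performance = [7.5, 9.0, 11.0, 4.5, 12.5, 18.0]

coefficients = fit_multiple_regression(predictor_scores, job_performance)
new_applicant = predict(coefficients, [5, 5, 2])
print([round(b, 3) for b in coefficients], round(new_applicant, 3))
```

Once the weights have been estimated, `predict` applies them to the scores of a person who was not in the original sample, which is the prediction use of correlational designs described above.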
An important limitation of correlational research designs is that they cannot be used to draw conclusions about the causal relationships among the measured variables. Consider, for instance, a researcher who has hypothesized that viewing violent behavior will cause increased aggressive play in children. He has collected, from a sample of fourth-grade children, a measure of how many violent television shows each child views during the week, as well as a measure of how aggressively each child plays on the school playground. From his collected data, the researcher discovers a positive correlation between the two measured variables.
Although this positive correlation appears to support the researcher’s hypothesis, it cannot be taken to indicate that viewing violent television causes aggressive behavior. Although the researcher is tempted to assume that viewing violent television causes aggressive play, there are other possibilities. One alternate possibility is that the causal direction is exactly opposite from what has been hypothesized. Perhaps children who have behaved aggressively at school develop residual excitement that leads them to want to watch violent television shows at home:
Although this possibility may seem less likely, there is no way to rule out the possibility of such reverse causation on the basis of this observed correlation. It is also possible that both causal directions are operating and that the two variables cause each other:
Still another possible explanation for the observed correlation is that it has been produced by the presence of a common-causal variable (also known as a third variable ). A common-causal variable is a variable that is not part of the research hypothesis but that causes both the predictor and the outcome variable and thus produces the observed correlation between them . In our example a potential common-causal variable is the discipline style of the children’s parents. Parents who use a harsh and punitive discipline style may produce children who both like to watch violent television and who behave aggressively in comparison to children whose parents use less harsh discipline:
In this case, television viewing and aggressive play would be positively correlated (as indicated by the curved arrow between them), even though neither one caused the other but they were both caused by the discipline style of the parents (the straight arrows). When the predictor and outcome variables are both caused by a common-causal variable, the observed relationship between them is said to be spurious . A spurious relationship is a relationship between two variables in which a common-causal variable produces and “explains away” the relationship . If effects of the common-causal variable were taken away, or controlled for, the relationship between the predictor and outcome variables would disappear. In the example the relationship between aggression and television viewing might be spurious because by controlling for the effect of the parents’ disciplining style, the relationship between television viewing and aggressive behavior might go away.
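A small simulation can make the idea of a spurious relationship concrete. The sketch below is only an illustration under stated assumptions: the scores are randomly generated rather than collected, the variable names are hypothetical, and parental discipline is built in as the only cause of both television viewing and aggression. Controlling for the common-causal variable is done here with the standard first-order partial correlation formula.

```python
import math
import random

def pearson_r(xs, ys):
    """Pearson correlation coefficient for two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def partial_r(r_xy, r_xz, r_yz):
    """Correlation between x and y after controlling for z."""
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))

rng = random.Random(0)  # fixed seed so the simulation is repeatable
n = 5000

# Simulated common-causal variable: harshness of parental discipline
discipline = [rng.gauss(0, 1) for _ in range(n)]

# Each outcome depends on discipline plus its own independent noise;
# neither outcome has any direct effect on the other
tv_violence = [d + rng.gauss(0, 1) for d in discipline]
aggression = [d + rng.gauss(0, 1) for d in discipline]

r_tv_aggression = pearson_r(tv_violence, aggression)
r_controlled = partial_r(
    r_tv_aggression,
    pearson_r(tv_violence, discipline),
    pearson_r(aggression, discipline),
)

print(f"observed correlation:         {r_tv_aggression:.2f}")
print(f"controlling for discipline:   {r_controlled:.2f}")
```

The observed correlation between the two simulated measures is clearly positive, but it shrinks to roughly zero once discipline style is controlled for. In real research the difficulty is that the common-causal variable often has not been measured, so this check cannot be run.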
Common-causal variables in correlational research designs can be thought of as “mystery” variables because, as they have not been measured, their presence and identity are usually unknown to the researcher. Since it is not possible to measure every variable that could cause both the predictor and outcome variables, the existence of an unknown common-causal variable is always a possibility. For this reason, we are left with the basic limitation of correlational research: Correlation does not demonstrate causation. It is important that when you read about correlational research projects, you keep in mind the possibility of spurious relationships, and be sure to interpret the findings appropriately. Although correlational research is sometimes reported as demonstrating causality without any mention being made of the possibility of reverse causation or common-causal variables, informed consumers of research, like you, are aware of these interpretational problems.
In sum, correlational research designs have both strengths and limitations. One strength is that they can be used when experimental research is not possible because the predictor variables cannot be manipulated. Correlational designs also have the advantage of allowing the researcher to study behavior as it occurs in everyday life. And we can also use correlational designs to make predictions—for instance, to predict from the scores on their battery of tests the success of job trainees during a training session. But we cannot use such correlational information to determine whether the training caused better job performance. For that, researchers rely on experiments.
The goal of experimental research design is to provide more definitive conclusions about the causal relationships among the variables in the research hypothesis than is available from correlational designs. In an experimental research design, the variables of interest are called the independent variable (or variables ) and the dependent variable . The independent variable in an experiment is the causing variable that is created (manipulated) by the experimenter . The dependent variable in an experiment is a measured variable that is expected to be influenced by the experimental manipulation . The research hypothesis suggests that the manipulated independent variable or variables will cause changes in the measured dependent variables. We can diagram the research hypothesis by using an arrow that points in one direction. This demonstrates the expected direction of causality:
Figure 2.2.3
Consider an experiment conducted by Anderson and Dill (2000). The study was designed to test the hypothesis that playing violent video games would increase aggressive behavior. In this research, male and female undergraduates from Iowa State University were given a chance to play either a violent video game (Wolfenstein 3D) or a nonviolent video game (Myst). During the experimental session, the participants played their assigned video games for 15 minutes. Then, after the play, each participant played a competitive game with an opponent in which the participant could deliver blasts of white noise through the earphones of the opponent. The operational definition of the dependent variable (aggressive behavior) was the level and duration of noise delivered to the opponent. The design of the experiment is shown in Figure 2.17 “An Experimental Research Design”.
Figure 2.17 An Experimental Research Design
Two advantages of the experimental research design are (1) the assurance that the independent variable (also known as the experimental manipulation) occurs prior to the measured dependent variable, and (2) the creation of initial equivalence between the conditions of the experiment (in this case by using random assignment to conditions).
Experimental designs have two very nice features. For one, they guarantee that the independent variable occurs prior to the measurement of the dependent variable. This eliminates the possibility of reverse causation. Second, the influence of common-causal variables is controlled, and thus eliminated, by creating initial equivalence among the participants in each of the experimental conditions before the manipulation occurs.
The most common method of creating equivalence among the experimental conditions is through random assignment to conditions , a procedure in which the condition that each participant is assigned to is determined through a random process, such as drawing numbers out of an envelope or using a random number table . Anderson and Dill first randomly assigned about 100 participants to each of their two groups (Group A and Group B). Because they used random assignment to conditions, they could be confident that, before the experimental manipulation occurred, the students in Group A were, on average, equivalent to the students in Group B on every possible variable, including variables that are likely to be related to aggression, such as parental discipline style, peer relationships, hormone levels, diet—and in fact everything else.
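In practice, the random process is often carried out by a computer rather than by an envelope or a printed table. The following is a minimal sketch of random assignment to two conditions. The participant IDs, the group labels, and the function name `randomly_assign` are hypothetical, and this is not a description of the procedure Anderson and Dill actually used.

```python
import random

def randomly_assign(participants, conditions, seed=None):
    """Shuffle participants, then deal them out evenly across conditions."""
    rng = random.Random(seed)
    shuffled = list(participants)  # copy so the original order is untouched
    rng.shuffle(shuffled)
    groups = {condition: [] for condition in conditions}
    for index, person in enumerate(shuffled):
        # Dealing in turn keeps the group sizes as equal as possible
        groups[conditions[index % len(conditions)]].append(person)
    return groups

# 200 hypothetical participant IDs, giving 100 per group
participants = [f"P{i:03d}" for i in range(1, 201)]
groups = randomly_assign(participants, ["Group A", "Group B"], seed=42)

for name, members in groups.items():
    print(name, len(members), members[:3])
```

Because chance alone decides who ends up in each group, no characteristic of the participants can systematically differ between the groups, which is what is meant by initial equivalence.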
Then, after they had created initial equivalence, Anderson and Dill created the experimental manipulation: they had the participants in Group A play the violent game and the participants in Group B play the nonviolent game. Then they compared the dependent variable (the white noise blasts) between the two groups, finding that the students who had played the violent video game gave significantly longer noise blasts than did the students who had played the nonviolent game.
Anderson and Dill had from the outset created initial equivalence between the groups. This initial equivalence allowed them to observe differences in the white noise levels between the two groups after the experimental manipulation, leading to the conclusion that it was the independent variable (and not some other variable) that caused these differences. The idea is that the only thing that was different between the students in the two groups was the video game they had played.
Despite the advantage of determining causation, experiments do have limitations. One is that they are often conducted in laboratory situations rather than in the everyday lives of people. Therefore, we do not know whether results that we find in a laboratory setting will necessarily hold up in everyday life. Second, and more important, is that some of the most interesting and key social variables cannot be experimentally manipulated. If we want to study the influence of the size of a mob on the destructiveness of its behavior, or to compare the personality characteristics of people who join suicide cults with those of people who do not join such cults, these relationships must be assessed using correlational designs, because it is simply not possible to experimentally manipulate these variables.
Aiken, L., & West, S. (1991). Multiple regression: Testing and interpreting interactions . Newbury Park, CA: Sage.
Ainsworth, M. S., Blehar, M. C., Waters, E., & Wall, S. (1978). Patterns of attachment: A psychological study of the strange situation . Hillsdale, NJ: Lawrence Erlbaum Associates.
Anderson, C. A., & Dill, K. E. (2000). Video games and aggressive thoughts, feelings, and behavior in the laboratory and in life. Journal of Personality and Social Psychology, 78 (4), 772–790.
Damasio, H., Grabowski, T., Frank, R., Galaburda, A. M., Damasio, A. R., Cacioppo, J. T., & Berntson, G. G. (2005). The return of Phineas Gage: Clues about the brain from the skull of a famous patient. In Social neuroscience: Key readings. (pp. 21–28). New York, NY: Psychology Press.
Freud, S. (1964). Analysis of phobia in a five-year-old boy. In E. A. Southwell & M. Merbaum (Eds.), Personality: Readings in theory and research (pp. 3–32). Belmont, CA: Wadsworth. (Original work published 1909)
Kotowicz, Z. (2007). The strange case of Phineas Gage. History of the Human Sciences, 20 (1), 115–131.
Rokeach, M. (1964). The three Christs of Ypsilanti: A psychological study . New York, NY: Knopf.
Introduction to Psychology Copyright © 2015 by University of Minnesota is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License , except where otherwise noted.
Published on May 8, 2019 by Shona McCombes . Revised on November 20, 2023.
A case study is a detailed study of a specific subject, such as a person, group, place, event, organization, or phenomenon. Case studies are commonly used in social, educational, clinical, and business research.
A case study research design usually involves qualitative methods , but quantitative methods are sometimes also used. Case studies are good for describing , comparing, evaluating and understanding different aspects of a research problem .
Table of contents: When to do a case study; Step 1: Select a case; Step 2: Build a theoretical framework; Step 3: Collect your data; Step 4: Describe and analyze the case; Other interesting articles.
A case study is an appropriate research design when you want to gain concrete, contextual, in-depth knowledge about a specific real-world subject. It allows you to explore the key characteristics, meanings, and implications of the case.
Case studies are often a good choice in a thesis or dissertation . They keep your project focused and manageable when you don’t have the time or resources to do large-scale research.
You might use just one complex case study where you explore a single subject in depth, or conduct multiple case studies to compare and illuminate different aspects of your research problem.
Research question | Case study |
---|---|
What are the ecological effects of wolf reintroduction? | Case study of wolf reintroduction in Yellowstone National Park |
How do populist politicians use narratives about history to gain support? | Case studies of Hungarian prime minister Viktor Orbán and US president Donald Trump |
How can teachers implement active learning strategies in mixed-level classrooms? | Case study of a local school that promotes active learning |
What are the main advantages and disadvantages of wind farms for rural communities? | Case studies of three rural wind farm development projects in different parts of the country |
How are viral marketing strategies changing the relationship between companies and consumers? | Case study of the iPhone X marketing campaign |
How do experiences of work in the gig economy differ by gender, race and age? | Case studies of Deliveroo and Uber drivers in London |
Once you have developed your problem statement and research questions , you should be ready to choose the specific case that you want to focus on. A good case study should have the potential to:
Tip: If your research is more practical in nature and aims to simultaneously investigate an issue as you solve it, consider conducting action research instead.
Unlike quantitative or experimental research , a strong case study does not require a random or representative sample. In fact, case studies often deliberately focus on unusual, neglected, or outlying cases which may shed new light on the research problem.
Example of an outlying case study: In the 1960s the town of Roseto, Pennsylvania was discovered to have extremely low rates of heart disease compared to the US average. It became an important case study for understanding previously neglected causes of heart disease.
However, you can also choose a more common or representative case to exemplify a particular category, experience or phenomenon.
Example of a representative case study: In the 1920s, two sociologists used Muncie, Indiana as a case study of a typical American city that supposedly exemplified the changing culture of the US at the time.
While case studies focus more on concrete details than general theories, they should usually have some connection with theory in the field. This way the case study is not just an isolated description, but is integrated into existing knowledge about the topic. It might aim to:
To ensure that your analysis of the case has a solid academic grounding, you should conduct a literature review of sources related to the topic and develop a theoretical framework . This means identifying key concepts and theories to guide your analysis and interpretation.
There are many different research methods you can use to collect data on your subject. Case studies tend to focus on qualitative data using methods such as interviews , observations , and analysis of primary and secondary sources (e.g., newspaper articles, photographs, official records). Sometimes a case study will also collect quantitative data.
Example of a mixed methods case study: For a case study of a wind farm development in a rural area, you could collect quantitative data on employment rates and business revenue, collect qualitative data on local people’s perceptions and experiences, and analyze local and national media coverage of the development.
The aim is to gain as thorough an understanding as possible of the case and its context.
In writing up the case study, you need to bring together all the relevant aspects to give as complete a picture as possible of the subject.
How you report your findings depends on the type of research you are doing. Some case studies are structured like a standard scientific paper or thesis , with separate sections or chapters for the methods , results and discussion .
Others are written in a more narrative style, aiming to explore the case from various angles and analyze its meanings and implications (for example, by using textual analysis or discourse analysis ).
In all cases, though, make sure to give contextual details about the case, connect it back to the literature and theory, and discuss how it fits into wider patterns or debates.
McCombes, S. (2023, November 20). What Is a Case Study? | Definition, Examples & Methods. Scribbr. Retrieved September 3, 2024, from https://www.scribbr.com/methodology/case-study/
Case studies and experiments are two prominent approaches in scholarly inquiry. Case studies examine the complexities of real-life situations, aiming for depth and contextual understanding, whereas experiments seek to uncover causal relationships through controlled manipulation and observation. Both research methods are indispensable tools for understanding phenomena, yet they diverge significantly in their approaches, aims, and applications.
In this article, we’ll unpack the key differences between case studies and experiments, exploring their strengths, limitations, and the unique insights they offer.
A case study is a research method that involves an in-depth examination of a particular individual, group, event, or phenomenon within its real-life context. It aims to provide a detailed and comprehensive analysis of the subject under investigation, often using multiple data sources such as interviews, observations, documents, and archival records.
The case study method is used in psychology, sociology, anthropology, education, and business to explore complex issues, understand unique situations, and generate rich, contextualized insights. Case studies allow scholars to explore the intricacies of real-world phenomena, uncovering patterns, relationships, and underlying factors that may not be readily apparent through other research methods.
Overall, case studies offer a holistic and nuanced understanding of the subject of interest, facilitating deeper exploration and interpretation of complex social and human phenomena.
In contrast to the case study method, an experiment investigates cause-and-effect relationships by systematically manipulating one or more variables and observing the effects on other variables. In an experiment, researchers aim to establish causal relationships between an independent variable (the factor being manipulated) and a dependent variable (the outcome being measured).
Experiments are characterized by their controlled and systematic approach, often involving the random assignment of participants to different experimental conditions to minimize bias and support the validity of the findings. They are commonly used in fields such as psychology, biology, physics, and medicine to test hypotheses, identify causal mechanisms, and provide empirical evidence for theories.
The experimental method allows scholars to establish causal relationships with high confidence, providing valuable insights into the underlying mechanisms of behavior, natural phenomena, and social processes. Other research methods include:
Case study and experiment definitions.
The case study method involves a deep investigation into a specific individual, group, event, or phenomenon within its real-life context, aiming to provide rich and detailed insights into complex issues. Researchers gather data from multiple sources, such as interviews, observations, documents, and archival records, to understand the subject under study comprehensively.
Case studies are particularly useful for exploring unique or rare phenomena, offering a holistic view that captures the intricacies and nuances of the situation. However, findings from case studies may be challenging to generalize to broader populations due to the specificity of the case and the lack of experimental control. To learn more about how to write a case study, please refer to our guide.
An experiment is a research method that systematically manipulates one or more variables to observe their effects on other variables, aiming to establish cause-and-effect relationships under controlled conditions. Researchers design experiments with high control over variables, often using standardized procedures and quantitative measures for data collection.
Experiments are well suited to testing hypotheses and identifying causal relationships in controlled environments, allowing researchers to draw conclusions about the effects of specific interventions or manipulations. However, experiments may lack the depth and contextual richness of case studies, and findings are typically limited to the specific conditions of the experiment.
Statistics By Jim
Making statistics intuitive
By Jim Frost
A correlational study is a non-experimental research design that evaluates only the correlation between variables. The researchers record measurements but do not control or manipulate the variables. Correlational research is a form of observational study.
A correlation indicates that as the value of one variable increases, the other tends to change in a specific direction:
For example, researchers conducting correlational research explored the relationship between social media usage and levels of anxiety in young adults. Participants reported their demographic information and daily time on various social media platforms and completed a standardized anxiety assessment tool.
The correlational study looked for relationships between social media usage and anxiety. Is increased social media usage associated with higher anxiety? Is it worse for particular demographics?
Learn more about Interpreting Correlation .
Correlational research design is crucial in various disciplines, notably psychology and medicine. This type of design is generally cheaper, easier, and quicker to conduct than an experiment because the researchers don’t control any variables or conditions. Consequently, these studies often serve as an initial assessment, especially when random assignment and controlling variables for a true experiment are not feasible or unethical.
However, an unfortunate aspect of a correlational study is its limitation in establishing causation. While these studies can reveal connections between variables, they cannot prove that altering one variable will cause changes in another. Hence, correlational research can determine whether relationships exist but cannot confirm causality.
Remember, correlation doesn’t necessarily imply causation !
The difference between the two designs is simple.
In a correlational study, the researchers don’t systematically control any variables. They’re simply observing events and do not want to influence outcomes.
In an experiment, researchers manipulate variables and explicitly hope to affect the outcomes. For example, they might control the treatment condition by giving a medication or placebo to each subject. They also randomly assign subjects to the control and treatment groups, which helps establish causality.
Learn more about Randomized Controlled Trials (RCTs) , which statisticians consider to be true experiments.
Researchers divide these studies into three broad types.
One approach to correlational research is to utilize pre-existing data, which may include official records, public polls, or data from earlier studies. This method can be cost-effective and time-efficient because other researchers have already gathered the data. These existing data sources can provide large sample sizes and longitudinal data , thereby showing relationship trends.
However, it also comes with potential drawbacks. The data may be incomplete or irrelevant to the new research question. Additionally, as a researcher, you won’t have control over the original data collection methods, potentially impacting the data’s reliability and validity .
Using existing data makes this approach a retrospective study .
Surveys are a great way to collect data for correlational studies while using a consistent instrument across all respondents. You can use various formats, such as in-person, online, and by phone. And you can ask the questions necessary to obtain the particular variables you need for your project. In short, it’s easy to customize surveys to match your study’s requirements.
However, you’ll need to carefully word all the questions to be clear and not introduce bias in the results. This process can take multiple iterations and pilot studies to produce the finished survey.
For example, you can use a survey to find correlations between various demographic variables and political opinions.
Naturalistic observation is a method of collecting field data for a correlational study. Researchers observe and measure variables in a natural environment. The process can include counting events, categorizing behavior, and describing outcomes without interfering with the activities.
For example, researchers might observe and record children’s behavior after watching television. Does a relationship exist between the type of television program and behaviors?
Naturalistic observations occur in a prospective study .
Statistical analysis of correlational research frequently involves correlation and regression analysis .
A correlation coefficient describes the strength and direction of the relationship between two variables with a single number.
Regression analysis can evaluate how multiple variables relate to a single outcome. For example, in the social media correlational study example, how do the demographic variables and daily social media usage collectively correlate with anxiety?
Curtis EA, Comiskey C, Dempsey O. Importance and use of correlational research . Nurse Researcher . 2016;23(6):20-25. doi:10.7748/nr.2016.e1382
This guide from Santos Research Center, Corp. explains the fundamental differences between experimental and observational studies. It introduces the concepts that underpin both methodologies, including control groups, random samples, cohort studies, response variables, and explanatory variables. It also covers randomized controlled trials and case-control studies, and the role of dependent and independent variables in examining causal relationships.
The guide also looks at the broader scientific study process, including surveys, systematic reviews, and statistical analyses. It describes how researchers balance control and treatment groups, assign variables, and analyze statistical patterns to draw meaningful conclusions. With examples ranging from lung cancer to sleep patterns, it emphasizes the precision of controlled experiments and controlled trials, in which variables are isolated and scrutinized.
Observational and experimental studies are the cornerstones of scientific inquiry, each offering a distinct approach to investigating the natural world.
Observational studies allow us to observe, document, and gather data without direct intervention. They provide a means to explore real-world scenarios and trends, making them valuable when manipulating variables is not feasible or ethical. From surveys to meticulous observations, these studies shed light on existing conditions and relationships.
Experimental studies , in contrast, put researchers in the driver's seat. They involve the deliberate manipulation of variables to understand their impact on specific outcomes. By controlling the conditions, experimental studies establish causal relationships, answering questions of causality with precision. This approach is pivotal for hypothesis testing and informed decision-making.
At Santos Research Center, Corp., we recognize the importance of both observational and experimental studies. We employ these methodologies in our diverse research projects to ensure the highest quality of scientific investigation and to answer a wide range of research questions.
In our exploration of research methodologies, let's zoom in on observational research studies—an essential facet of scientific inquiry that we at Santos Research Center, Corp., expertly employ in our diverse research projects.
Observational research studies involve the passive observation of subjects without any intervention or manipulation by researchers. These studies are designed to scrutinize the relationships between variables and test subjects, uncover patterns, and draw conclusions grounded in real-world data.
In an observational study, researchers refrain from interfering with the natural course of events. Instead, they gather data by carefully observing and documenting information about the test subjects and their surroundings. This approach permits the examination of variables that cannot be ethically or feasibly manipulated, making it particularly valuable in certain research scenarios.
Now, let's delve into the various forms that observational studies can take, each with its distinct characteristics and applications.
Cohort Studies: A cohort study is a type of observational study that entails tracking one group of individuals over an extended period. Its primary goal is to identify potential causes or risk factors for specific outcomes. Cohort studies provide valuable insights into the development of conditions or diseases and the factors that influence them.
Case-Control Studies: Case-control studies, on the other hand, involve the comparison of individuals with a particular condition or outcome to those without it (the control group). These studies aim to discern potential causal factors or associations that may have contributed to the development of the condition under investigation.
Cross-Sectional Studies: Cross-sectional studies take a snapshot of a diverse group of individuals at a single point in time. By collecting data from this snapshot, researchers gain insights into the prevalence of a specific condition or the relationships between variables at that precise moment. Cross-sectional studies are often used to assess the health status of the different groups within a population or explore the interplay between various factors.
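Each of the three designs above is commonly summarized with a different measure. The following is a minimal sketch, not taken from the original guide: the function names and all counts are hypothetical, chosen only to illustrate the arithmetic of a risk ratio (cohort), an odds ratio (case-control), and a prevalence (cross-sectional).

```python
def relative_risk(exposed_cases, exposed_total, unexposed_cases, unexposed_total):
    """Cohort study: risk among the exposed divided by risk among the unexposed."""
    return (exposed_cases / exposed_total) / (unexposed_cases / unexposed_total)


def odds_ratio(cases_exposed, cases_unexposed, controls_exposed, controls_unexposed):
    """Case-control study: odds of exposure among cases divided by odds among controls."""
    return (cases_exposed / cases_unexposed) / (controls_exposed / controls_unexposed)


def prevalence(people_with_condition, people_surveyed):
    """Cross-sectional study: share of the sample with the condition at one point in time."""
    return people_with_condition / people_surveyed


# Hypothetical counts, for illustration only.
rr = relative_risk(30, 100, 10, 100)      # 30% risk vs. 10% risk
odds = odds_ratio(40, 60, 20, 80)         # exposure odds 40:60 vs. 20:80
prev = prevalence(150, 1000)              # 150 of 1,000 surveyed have the condition

print(f"relative risk: {rr:.2f}")
print(f"odds ratio:    {odds:.2f}")
print(f"prevalence:    {prev:.1%}")
```

A case-control study cannot estimate risk directly because the researcher fixes the number of cases and controls, which is why the odds ratio is used there instead of the risk ratio.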
Observational studies, as we've explored, are a vital pillar of scientific research, offering unique insights into real-world phenomena. In this section, we will dissect the advantages and limitations that characterize these studies, shedding light on the intricacies that researchers grapple with when employing this methodology.
Advantages: One of the paramount advantages of observational studies lies in their utilization of real-world data. Unlike controlled experiments that operate in artificial settings, observational studies embrace the complexities of the natural world. This approach enables researchers to capture genuine behaviors, patterns, and occurrences as they unfold. As a result, the data collected reflects the intricacies of real-life scenarios, making it highly relevant and applicable to diverse settings and populations.
Moreover, observational studies excel in their capacity to examine long-term trends. By observing one group of subjects over extended periods, research scientists gain the ability to track developments, trends, and shifts in behavior or outcomes. This longitudinal perspective is invaluable when studying phenomena that evolve gradually, such as chronic diseases, societal changes, or environmental shifts. It allows for the detection of subtle changes that may be missed in shorter-term investigations.
Limitations: However, like any research methodology, observational studies are not without their limitations. One significant challenge lies in the potential for bias. Since researchers do not intervene in the subjects' experiences, various biases can creep into the data collection process. These may arise from participant self-reporting, observer bias, or selection bias in sampling, among others. Careful design and rigorous data analysis are crucial for mitigating these biases.
Another limitation is the presence of confounding variables. In observational studies, it can be challenging to isolate the effect of a specific variable from the myriad of other factors at play. These confounding variables can obscure the true relationship between the variables of interest, making it difficult to establish causation definitively. Research scientists must employ statistical techniques to control or adjust for these confounding variables.
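To make the confounding problem concrete, here is a small sketch using invented counts (they are not data from any study). Age group acts as the confounder: older people are both more often exposed and more often ill. The crude risk ratio suggests an effect, while the stratum-specific ratios and a Mantel-Haenszel pooled ratio, one common stratified adjustment, show none.

```python
# Hypothetical counts per stratum of the confounder (age group):
# (exposed_cases, exposed_total, unexposed_cases, unexposed_total)
strata = {
    "young": (10, 100, 40, 400),
    "old": (120, 400, 30, 100),
}


def risk_ratio(exposed_cases, exposed_total, unexposed_cases, unexposed_total):
    """Risk among the exposed divided by risk among the unexposed."""
    return (exposed_cases / exposed_total) / (unexposed_cases / unexposed_total)


# Crude estimate: ignore the confounder and pool all counts.
pooled = [sum(counts[i] for counts in strata.values()) for i in range(4)]
crude_rr = risk_ratio(*pooled)

# Stratified estimates: compare like with like inside each age group.
stratum_rr = {name: risk_ratio(*counts) for name, counts in strata.items()}


def mantel_haenszel_rr(strata):
    """Pooled risk ratio that weights each stratum by its size."""
    numerator = 0.0
    denominator = 0.0
    for exp_cases, exp_total, unexp_cases, unexp_total in strata.values():
        stratum_total = exp_total + unexp_total
        numerator += exp_cases * unexp_total / stratum_total
        denominator += unexp_cases * exp_total / stratum_total
    return numerator / denominator


adjusted_rr = mantel_haenszel_rr(strata)

print(f"crude risk ratio:    {crude_rr:.2f}")
print(f"by stratum:          {stratum_rr}")
print(f"adjusted risk ratio: {adjusted_rr:.2f}")
```

Adjustment of this kind only handles confounders that were measured; unmeasured confounders remain a limitation of observational designs.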
Additionally, observational studies face constraints in their ability to establish causation. While they can identify associations and correlations between variables, they cannot prove a causal relationship. Establishing causation typically requires controlled experiments where researchers can manipulate independent variables systematically. In observational studies, researchers can only infer potential causation based on the observed associations.
In the intricate landscape of scientific research, we now turn our gaze toward experimental studies—a dynamic and powerful method that Santos Research Center, Corp. skillfully employs in our pursuit of knowledge.
While some studies observe and gather data passively, experimental studies take a more proactive approach. Here, researchers actively introduce an intervention or treatment to an experimental group to study its effects on one or more variables. This methodology empowers researchers to manipulate independent variables deliberately and examine their direct impact on dependent variables.
Experimental studies are distinguished by their ability to establish cause-and-effect relationships. This characteristic allows researchers to determine how one variable influences another, offering insight into the scientific questions at hand. Within the controlled environment of an experimental study, researchers can systematically test hypotheses, shedding light on complex phenomena.
Central to the rigor and reliability of experimental studies are several key features that ensure the validity of their findings.
Randomized Controlled Trials: Randomization is a critical element in experimental studies, as it ensures that subjects are assigned to groups at random. Random allocation minimizes the risk of unintentional bias and confounding, strengthening the credibility of the study's outcomes.
Control Groups: Control groups play a pivotal role in experimental studies by serving as a baseline for comparison. They enable researchers to assess the true impact of the intervention being studied. By comparing the outcomes of the intervention group to those of the control group, researchers can discern whether the intervention caused the observed changes.
Blinding: Both single-blind and double-blind techniques are employed in experimental studies to prevent biases from influencing the study or controlled trial's outcomes. Single-blind studies keep either the subjects or the researchers unaware of certain aspects of the study, while double-blind studies extend this blindness to both parties, enhancing the objectivity of the study.
These key features work in concert to uphold the integrity and trustworthiness of the results generated through experimental studies.
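Random assignment and the comparison against a control group can be sketched in a few lines. This is an illustrative example only: the participant IDs, the fixed seed, and the outcome values are all hypothetical, and a real trial would follow a pre-registered allocation procedure.

```python
import random
from statistics import mean


def randomize(participants, seed=None):
    """Shuffle participants and split them into treatment and control groups."""
    rng = random.Random(seed)      # a seed makes the allocation reproducible
    shuffled = list(participants)  # copy so the caller's list is not modified
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]


def difference_in_means(treatment_outcomes, control_outcomes):
    """Estimate the intervention's effect using the control group as the baseline."""
    return mean(treatment_outcomes) - mean(control_outcomes)


# Hypothetical participant IDs P01..P20.
participants = [f"P{i:02d}" for i in range(1, 21)]
treatment, control = randomize(participants, seed=42)

# Hypothetical outcome scores measured after the intervention.
effect = difference_in_means([5, 7, 9], [4, 5, 6])

print("treatment group:", treatment)
print("control group:  ", control)
print("estimated effect:", effect)
```

Because chance alone decides who lands in which group, known and unknown participant characteristics tend to balance out across groups as the sample grows.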
As with any research methodology, this one comes with its unique set of advantages and limitations.
Advantages: These studies offer the distinct advantage of establishing causal relationships between two or more variables. The controlled environment allows researchers to control variables, ensuring that changes in the dependent variable can be attributed to the independent variable. This meticulous control results in high-quality, reliable data that can significantly contribute to scientific knowledge.
Limitations: However, experimental studies are not without their challenges. They may raise ethical concerns, particularly when the interventions involve potential risks to subjects. Additionally, their controlled nature can limit their real-world applicability, as the conditions in experiments may not accurately mirror those in the natural world. Moreover, executing an experimental study, particularly a randomized controlled trial, often demands substantial resources, including time, funding, and personnel.
Having previously examined observational and experimental studies individually, we now embark on a side-by-side comparison to illuminate the key distinctions and commonalities between these foundational research approaches.
Methodologies
Observational studies excel at exploring associations and uncovering patterns within the intricacies of real-world settings, while experimental studies shine as the gold standard for discerning cause-and-effect relationships through meticulous control and manipulation in controlled environments. Understanding these differences and similarities empowers researchers to choose the most appropriate method for their specific research objectives.
The decision to employ either observational or experimental studies hinges on the research objectives at hand and the available resources. Observational studies prove invaluable when variable manipulation is impractical or ethically challenging, making them ideal for delving into long-term trends and uncovering associations between response and explanatory variables. On the other hand, experimental studies emerge as indispensable tools when the aim is to definitively establish causation and methodically control variables.
At Santos Research Center, Corp., our approach to both scientific study and methodology is characterized by meticulous consideration of the specific research goals. We recognize that the quality of outcomes hinges on selecting the most appropriate method of research study. Our unwavering commitment to employing both observational and experimental research studies further underscores our dedication to advancing scientific knowledge across diverse domains.
In conclusion, both observational and experimental studies are integral to scientific research, offering complementary approaches with unique strengths and limitations. At Santos Research Center, Corp., we leverage these methodologies to contribute meaningfully to the scientific community.
Explore our projects and initiatives at Santos Research Center, Corp. by visiting our website or contacting us at (813) 249-9100, where our unwavering commitment to rigorous research practices and advancing scientific knowledge awaits.
Correlational vs. experimental, empirical vs. non-empirical.
Qualitative Research gathers data about lived experiences, emotions or behaviors, and the meanings individuals attach to them. It assists in enabling researchers to gain a better understanding of complex concepts, social interactions or cultural phenomena. This type of research is useful in the exploration of how or why things have occurred, interpreting events and describing actions.
Quantitative Research gathers numerical data which can be ranked, measured or categorized through statistical analysis. It assists with uncovering patterns or relationships, and for making generalizations. This type of research is useful for finding out how many, how much, how often, or to what extent.
Qualitative methods | Quantitative methods |
---|---|
Interviews: can be structured, semi-structured or unstructured. | Surveys: the same questions asked to large numbers of participants (e.g., Likert scale response). |
Focus groups: several participants discussing a topic or set of questions. | Experiments: test hypothesis in controlled conditions. |
Observations: can be on-site, in-context, or role play. | Observations: counting the number of times a phenomenon occurs or coding observed data in order to translate it into numbers. |
Document analysis: analysis of correspondence or reports. | Document analysis: using numerical data from financial reports or counting word occurrences. |
Oral history: memories told to a researcher. | |
Correlational Research cannot determine causal relationships. Instead, it examines relationships between variables.
Experimental Research can establish causal relationships because the researcher manipulates variables.
Empirical Studies are based on evidence. The data is collected through experimentation or observation.
Non-empirical Studies do not require researchers to collect first-hand data.
What's the difference.
Case study and single-case experimental designs are both research methods used in psychology and other social sciences to investigate individual cases or subjects. However, they differ in their approach and purpose. Case studies involve in-depth examination of a single case, such as an individual, group, or organization, to gain a comprehensive understanding of the phenomenon being studied. On the other hand, single-case experimental designs focus on studying the effects of an intervention or treatment on a single subject over time. These designs use repeated measures and control conditions to establish cause-and-effect relationships. While case studies provide rich qualitative data, single-case experimental designs offer more rigorous experimental control and allow for the evaluation of treatment effectiveness.
Attribute | Case Study | Single-Case Experimental Designs |
---|---|---|
Research Design | Qualitative | Quantitative |
Focus | Exploratory | Hypothesis Testing |
Sample Size | Usually small | Usually small |
Data Collection | Observations, interviews, documents | Observations, measurements |
Data Analysis | Qualitative analysis | Statistical analysis |
Generalizability | Low | Low |
Internal Validity | Low | High |
External Validity | Low | Low |
Introduction.
When conducting research in various fields, it is essential to choose the appropriate study design to answer research questions effectively. Two commonly used designs are case study and single-case experimental designs. While both approaches aim to provide valuable insights into specific phenomena, they differ in several key attributes. This article will compare and contrast the attributes of case study and single-case experimental designs, highlighting their strengths and limitations.
A case study is an in-depth investigation of a particular individual, group, or event. It involves collecting and analyzing qualitative or quantitative data to gain a comprehensive understanding of the subject under study. Case studies are often used to explore complex phenomena, generate hypotheses, or provide detailed descriptions of unique cases.
On the other hand, single-case experimental designs are a type of research design that focuses on studying a single individual or a small group over time. These designs involve manipulating an independent variable and measuring its effects on a dependent variable. Single-case experimental designs are particularly useful for examining cause-and-effect relationships and evaluating the effectiveness of interventions or treatments.
In terms of data collection, case studies rely on various sources such as interviews, observations, documents, and artifacts. Researchers often employ multiple methods to gather rich and diverse data, allowing for a comprehensive analysis of the case. The data collected in case studies are typically qualitative in nature, although quantitative data may also be included.
In contrast, single-case experimental designs primarily rely on quantitative data collection methods. Researchers use standardized measures and instruments to collect data on the dependent variable before, during, and after the manipulation of the independent variable. This allows for a systematic analysis of the effects of the intervention or treatment on the individual or group being studied.
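The before, during, and after measurement pattern described above can be summarized by comparing phase means. The sketch below uses made-up scores for a single participant; the phase names and values are hypothetical, and real single-case analyses typically also inspect trend and variability within each phase.

```python
from statistics import mean

# Hypothetical repeated measurements of the dependent variable for one participant,
# grouped by study phase.
phases = {
    "baseline": [8, 9, 7, 8],       # before the intervention
    "intervention": [5, 4, 4, 3],   # while the intervention is applied
    "follow_up": [4, 3, 4, 3],      # after the intervention ends
}

# Average level of the dependent variable in each phase.
phase_means = {name: mean(values) for name, values in phases.items()}

# Change in level from baseline to intervention.
change = phase_means["intervention"] - phase_means["baseline"]

for name, value in phase_means.items():
    print(f"{name:>12}: {value}")
print(f"change from baseline to intervention: {change}")
```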
One of the key differences between case studies and single-case experimental designs is their generalizability. Case studies are often conducted on unique or rare cases, making it challenging to generalize the findings to a larger population. The focus of case studies is on providing detailed insights into specific cases rather than making broad generalizations.
On the other hand, single-case experimental designs aim to establish causal relationships and can provide evidence for generalizability. By systematically manipulating the independent variable and measuring its effects on the dependent variable, researchers can draw conclusions about the effectiveness of interventions or treatments that may be applicable to similar cases or populations.
Internal validity refers to the extent to which a study accurately measures the cause-and-effect relationship between variables. In case studies, establishing internal validity can be challenging due to the lack of control over extraneous variables. The presence of multiple data sources and the potential for subjective interpretation may also introduce bias.
In contrast, single-case experimental designs prioritize internal validity by employing rigorous control over extraneous variables. Researchers carefully design the intervention or treatment, implement it consistently, and measure the dependent variable under controlled conditions. This allows for a more confident determination of the causal relationship between the independent and dependent variables.
Case studies often require significant time and resources due to their in-depth nature. Researchers need to spend considerable time collecting and analyzing data from various sources, conducting interviews, and immersing themselves in the case. Additionally, case studies may involve multiple researchers or a research team, further increasing the required resources.
On the other hand, single-case experimental designs can be more time and resource-efficient. Since they focus on a single individual or a small group, data collection and analysis can be more streamlined. Researchers can also implement interventions or treatments in a controlled manner, reducing the time and resources needed for data collection.
Both case studies and single-case experimental designs require researchers to consider ethical implications. In case studies, researchers must ensure the privacy and confidentiality of the individuals or groups being studied. Informed consent and ethical guidelines for data collection and analysis should be followed to protect the rights and well-being of the participants.
Similarly, in single-case experimental designs, researchers must consider ethical considerations when implementing interventions or treatments. The well-being and safety of the individual or group being studied should be prioritized, and informed consent should be obtained. Additionally, researchers should carefully monitor and evaluate the potential risks and benefits associated with the intervention or treatment.
Case studies and single-case experimental designs are valuable research approaches that offer unique insights into specific phenomena. While case studies provide in-depth descriptions and exploratory analyses of individual cases, single-case experimental designs focus on establishing causal relationships and evaluating interventions or treatments. Researchers should carefully consider the attributes and goals of their study when choosing between these two designs, ensuring that the selected approach aligns with their research questions and objectives.
Case studies and experiments are two distinct research methods used across various disciplines, providing researchers with the ability to study and analyze a subject through different approaches. This variety in research methods allows the researcher to gather both qualitative and quantitative data, cross-check the data, and assign greater validity to the conclusions and overall findings of the research. A case study is a research method in which the researcher explores the subject in depth, while an experiment is a research method where two specific groups or variables are used to test a hypothesis. This article will examine the differences between case study and experiment further.
A case study is a research method where an individual, event, or significant place is studied in depth. In the case of an individual, the researcher studies the person’s life history, which can include important days or special experiences. The case study method is used in various social sciences such as sociology, anthropology, and psychology. Through a case study, the researcher can identify and understand the subjective experiences of an individual regarding a specific topic. For example, a researcher studying the impact of second rape on the lives of rape victims can conduct several case studies to understand the subjective experiences of individuals and social mechanisms that contribute to this phenomenon. The case study is a qualitative research method that can be subjective.
An experiment, unlike a case study, can be classified as a quantitative research method, as it yields numerical data suited to statistical analysis and takes an objective, empirical approach. Experiments are primarily used in the natural sciences, as they allow the scientist to control variables. In the social sciences, controlling variables can be challenging, and poor control may lead to faulty conclusions. An experiment has two main kinds of variables: the independent variable and the dependent variable. The researcher tests a hypothesis by manipulating the independent variable and measuring the resulting change in the dependent variable. There are different types of experiments, such as laboratory experiments (conducted in laboratories where conditions can be strictly controlled) and natural experiments (which take place in real-life settings). Case studies and experiments are thus very different from one another. However, many researchers combine methods through triangulation to minimize bias.
Descriptive research and Correlational research are two important types of research studies that help researchers make ambitious and measured decisions in their respective fields. Both descriptive research and correlational research are used in descriptive correlational research.
Descriptive research is defined as a research method that involves observing behavior to describe attributes objectively and systematically. A descriptive research project seeks to comprehend phenomena or groups in depth.
Correlational research, on the other hand, is a method that describes and predicts how variables are naturally related in the real world without the researcher attempting to alter them or assign causation between them.
The main objective of descriptive research is to create a snapshot of the current state of affairs, whereas correlational research helps in comparing two or more entities or variables.
Descriptive correlational research is a type of research design that tries to explain the relationship between two or more variables without making any claims about cause and effect. It includes collecting and analyzing data on at least two variables to see if there is a link between them.
In descriptive correlational research, researchers collect data to explain the variables of interest and figure out how they relate. The main goal is to give a full account of the variables and how they are related without changing them or assuming that one thing causes another.
In descriptive correlational research, researchers do not change any variables or try to find cause-and-effect connections. Instead, they just watch and measure the variables of interest and then look at the patterns and relationships that emerge from the data.
Experimental research involves manipulating the independent variable to see how it affects the dependent variable, while descriptive correlational research just describes the relationship between variables.
Correlational research designs measure the magnitude and direction of the relationship between two or more variables, revealing their associations. At the outset, creating initial equivalence between the groups or variables being compared is essential in descriptive correlational research.
In descriptive correlational research, the independent variable occurs prior to the measurement of the dependent variable. The goal is to explain the traits or actions of a certain population or group and look at the connections between independent and dependent variables.
Descriptive research is carried out using three methods, namely:
Correlational research also uses naturalistic observation to collect data. In addition, it uses archival data, which is drawn from previously conducted research of a similar nature, that is, data originally gathered through primary research.
In contrast to naturalistic observation, collecting information from archives is straightforward. An example is counting the number of people named Jacinda in the United States using Social Security records.
Descriptive research | Correlational research |
---|---|
Descriptive research is used to uncover new facts and the meaning of research. | Correlational research is carried out to measure two variables. |
Descriptive research is analytical, where in-depth studies help collect information during research. | Correlational research is mathematical in nature. A correlation coefficient is used to statistically measure the relationship between two variables. |
Descriptive research provides a knowledge base for carrying out other research. | This type of research is used to explore the extent to which two variables in a study are related. |
Example: research done to obtain information on the hospitality industry’s most widely used employee motivation tools. | Example: research done to find out whether cancer and marriage are related. |
The key features of descriptive correlational research include the following:
The main goal, just like with descriptive research, is to describe the variables of interest thoroughly. Researchers aim to explain a certain group or event’s traits, behaviors, or attitudes.
Like correlational research, descriptive correlational research looks at how two or more factors are related. It looks at how variables are connected to each other, such as how they change over time or how they are linked.
Descriptive correlational research draws on most methods of quantitative data analysis. Researchers use statistical methods to study and measure the size and direction of relationships between variables.
As with correlational research, the researcher does not change or control the variables. The data is taken in its natural environment without any changes or interference.
Cross-sectional or longitudinal designs can be used for descriptive correlational research. Cross-sectional research collects data at one point in time, while longitudinal research collects data over a long period of time to look at changes and relationships over time.
For example, descriptive correlational research could look at the link between a person’s age and how much money they make. The researcher would take a sample of people’s ages and incomes and then look at the data to see if there is a link between the two factors.
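The age and income example can be sketched with a Pearson correlation coefficient, which measures the size and direction of a linear relationship. The sample values below are invented for illustration, and the helper function is a plain implementation of the standard formula rather than anything prescribed by the article.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation: covariance of x and y scaled by both spreads."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sum((x - mean_x) ** 2 for x in xs)
    spread_y = sum((y - mean_y) ** 2 for y in ys)
    return covariance / sqrt(spread_x * spread_y)


# Hypothetical sample: one age and one annual income per respondent.
ages = [25, 32, 41, 48, 56]
incomes = [31000, 42000, 50000, 58000, 61000]

r = pearson_r(ages, incomes)
print(f"correlation between age and income: {r:.2f}")
```

A value near +1 or -1 indicates a strong linear association and a value near 0 a weak one. In keeping with the design, the coefficient describes the link between the two variables without showing that age causes income to rise.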
Descriptive correlational research is a good way to learn about the characteristics of a population or group and the relationships between its different parts. It lets researchers describe variables in detail and look into their relationships without suggesting that one variable caused another.
Descriptive correlational research gives useful insights and can be used as a starting point for more research or to come up with hypotheses. It’s important to be aware of the problems with this type of study, such as the fact that it can’t show cause and effect and often relies on cross-sectional data.
Still, descriptive correlational research helps us understand things and makes making decisions in many areas easier.
QuestionPro is a very useful tool for descriptive correlational research. Its many features and easy-to-use interface help researchers collect and study data quickly, giving them a better understanding of the characteristics and relationships between variables in a certain population or group.
The different kinds of questions, analytical research tools, and reporting features on the software improve the research process and help researchers come up with useful results. QuestionPro makes it easier to do descriptive correlational research, which makes it a useful tool for learning important things and making decisions in many fields.
Lau F, Kuziemsky C, editors. Handbook of eHealth Evaluation: An Evidence-based Approach [Internet]. Victoria (BC): University of Victoria; 2017 Feb 27.
Chapter 12 Methods for Correlational Studies
Francis Lau.
Correlational studies aim to find out if there are differences in the characteristics of a population depending on whether or not its subjects have been exposed to an event of interest in the naturalistic setting. In eHealth, correlational studies are often used to determine whether the use of an eHealth system is associated with a particular set of user characteristics and/or quality of care patterns (Friedman & Wyatt, 2006). An example is a computerized provider order entry (CPOE) study to differentiate the background, usage and performance between clinical users and non-users of the CPOE system after its implementation in a hospital.
Correlational studies are different from comparative studies in that the evaluator does not control the allocation of subjects into comparison groups or assignment of the intervention to specific groups. Instead, the evaluator defines a set of variables including an outcome of interest then tests for hypothesized relations among these variables. The outcome is known as the dependent variable and the variables being tested for association are the independent variables. Correlational studies are similar to comparative studies in that they take on an objectivist view where the variables can be defined, measured and analyzed for the presence of hypothesized relations. As such, correlational studies face the same challenges as comparative studies in terms of their internal and external validity. Of particular importance are the issues of design choices, selection bias, confounders, and reporting consistency.
In this chapter we describe the basic types of correlational studies seen in the eHealth literature and their methodological considerations. Also included are three case examples to show how these studies are done.
Correlational studies, better known as observational studies in epidemiology, are used to examine event exposure, disease prevalence and risk factors in a population (Elwood, 2007). In eHealth, the exposure typically refers to the use of an eHealth system by a population of subjects in a given setting. These subjects may be patients, providers or organizations identified through a set of variables that are thought to differ in their measured values depending on whether or not the subjects were “exposed” to the eHealth system.
There are three basic types of correlational studies that are used in eHealth evaluation: cohort, cross-sectional, and case-control studies (Vandenbroucke et al., 2014). These are described below.
A cross-sectional survey is a type of cross-sectional study where the data source is drawn from postal questionnaires and interviews. This topic will be covered in the chapter on methods for survey studies.
While correlational studies are considered less rigorous than RCTs, they are the preferred designs when it is neither feasible nor ethical to conduct experimental trials. Key methodological issues arise in terms of: (a) design options, (b) biases and confounders, (c) controlling for confounding effects, (d) adherence to good practices, and (e) reporting consistency. These issues are discussed below.
Growing populations with multiple chronic conditions and healthcare interventions have made it difficult to design RCTs with sufficient sample size and long-term follow-up to account for all the variability this phenomenon entails. Also, RCTs are intended to test the efficacy of an intervention in a restricted sample of subjects under ideal settings, so they have limited generalizability to the population at large in routine settings (Fleurence, Naci, & Jansen, 2010). As such, correlational studies, especially those involving the use of routinely collected EHR data from the general population, have become viable alternatives to RCTs. There are advantages and disadvantages to each of the three design options presented above. They are listed below.
Shamliyan, Kane, and Dickinson (2010) conducted a systematic review on tools used to assess the quality of observational studies. Despite the large number of quality scales and checklists found in the literature, they concluded that the universal concerns are in the areas of selection bias, confounding, and misclassification. These concerns, also mentioned by Vandenbroucke and colleagues (2014) in their reporting guidelines for observational studies, are summarized below.
It is important to note that bias and confounding are not synonymous. Bias is caused by finding the wrong association from flawed information or subject selection. Confounding is factually correct with respect to the relationship found, but is incorrect in its interpretation due to an extraneous factor that is associated with both the exposure and outcome.
There are three common methods to control for confounding effects: matching, stratification, and modelling (Higgins & Green, 2011). They are described below.
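To make stratification concrete, the following minimal sketch compares a crude odds ratio with a stratified one. All counts are invented for illustration, the stratifying variable (clinician seniority) is a hypothetical confounder, and the Mantel-Haenszel estimator is one standard way to pool strata; the chapter itself does not name a specific estimator.

```python
# Each stratum is a 2x2 table (a, b, c, d) where
#   a = exposed with outcome,   b = exposed without outcome,
#   c = unexposed with outcome, d = unexposed without outcome.
# All counts are hypothetical; "seniority" is an assumed confounder.
STRATA = {
    "senior": (80, 20, 40, 10),
    "junior": (10, 40, 20, 80),
}


def odds_ratio(a, b, c, d):
    """Odds ratio of a single 2x2 table."""
    return (a * d) / (b * c)


def crude_odds_ratio(strata):
    """Collapse all strata into one table, ignoring the confounder."""
    a, b, c, d = (sum(cells) for cells in zip(*strata.values()))
    return odds_ratio(a, b, c, d)


def mantel_haenszel_odds_ratio(strata):
    """Pool the stratum-specific tables with Mantel-Haenszel weights."""
    numerator = sum(a * d / (a + b + c + d) for a, b, c, d in strata.values())
    denominator = sum(b * c / (a + b + c + d) for a, b, c, d in strata.values())
    return numerator / denominator


crude = crude_odds_ratio(STRATA)
adjusted = mantel_haenszel_odds_ratio(STRATA)
print(f"crude OR = {crude:.2f}, stratified OR = {adjusted:.2f}")
```

With these made-up counts the crude odds ratio is 2.25, while the odds ratio within each stratum, and therefore the pooled estimate, is 1.00. The apparent association comes entirely from the uneven mix of senior and junior clinicians across exposure groups, which is the kind of confounding that stratification is meant to expose.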
The ISPOR Good Research Practices Task Force published a set of recommendations in designing, conducting and reporting prospective observational studies for comparative effectiveness research (Berger et al., 2012) that are relevant to eHealth evaluation. Their key recommendations are listed below.
Vandenbroucke et al. (2014) published an expanded version of the Strengthening the Reporting of Observational Studies in Epidemiology (STROBE) statement to improve the reporting of observational studies that can be applied in eHealth evaluation. It is made up of 22 items, of which 18 are common to cohort, case-control and cross-sectional studies, with four being specific to each of the three designs. The 22 reporting items are listed below (for details refer to the cited reference).
The four items specific to study design relate to the reporting of participants, statistical methods, descriptive results and outcome data. They are briefly described below for the three types of designs.
12.4.1. Cohort Study of Automated Immunosuppressive Care
Park and colleagues (2010) conducted a retrospective cohort study to examine the association between the use of a CDS (clinical decision support) system in post-liver transplant immunosuppressive care and the rates of rejection episode and drug toxicity. The study is summarized below.
Linder, Schnipper, and Middleton (2012) conducted a cross-sectional study to examine the association between the type of EHR documentation used by physicians and the quality of care provided. The study is summarized below.
Nielsen, Halamka, and Kinkel (2012) conducted a case-control study to evaluate whether there was an association between active Internet patient portal use by Multiple Sclerosis (MS) patients and medical resource utilization. Patient predictors and barriers to portal use were also identified. The study is summarized below.
A general limitation of a correlational study is that it can determine association between exposure and outcomes but cannot establish causation. The more specific limitations of the three case examples cited by the authors are listed below.
In this chapter we described cohort, case-control and cross-sectional studies as three types of correlational studies used in eHealth evaluation. The methodological issues addressed include bias and confounding, controlling for confounders, adherence to good practices and consistency in reporting. Three case examples were included to show how eHealth correlational studies are done.
1 ISPOR – International Society for Pharmacoeconomics and Outcomes Research
This publication is licensed under a Creative Commons License, Attribution-Noncommercial 4.0 International License (CC BY-NC 4.0): see https://creativecommons.org/licenses/by-nc/4.0/
Nature Communications volume 15, Article number: 7108 (2024)
Climate warming disproportionately impacts countries in the Global South by increasing extreme heat exposure. However, geographic disparities in adaptation capacity are unclear. Here, we assess global inequality in green spaces, which urban residents critically rely on to mitigate outdoor heat stress. We use remote sensing data to quantify daytime cooling by urban greenery in the warm seasons across the ~500 largest cities globally. We show a striking contrast, with Global South cities having ~70% of the cooling capacity of cities in the Global North (2.5 ± 1.0 °C vs. 3.6 ± 1.7 °C). A similar gap occurs for the cooling adaptation benefits received by an average resident in these cities (2.2 ± 0.9 °C vs. 3.4 ± 1.7 °C). This cooling adaptation inequality is due to discrepancies in green space quantity and quality between cities in the Global North and South, shaped by socioeconomic and natural factors. Our analyses further suggest a vast potential for enhancing cooling adaptation while reducing global inequality.
Introduction.
Heat extremes are projected to be substantially intensified by global warming 1 , 2 , imposing a major threat to human mortality and morbidity in the coming decades 3 , 4 , 5 , 6 . This threat is particularly concerning as a majority of people now live in cities 7 , including those cities suffering some of the hottest climate extremes. Cities face two forms of warming: warming due to climate change and warming due to the urban heat island effect 8 , 9 , 10 . These two forms of warming have the potential to be additive, or even multiplicative. Climate change in itself is projected to result in rising maximum temperatures above 50 °C for a considerable fraction of the world if 2 °C global warming is exceeded 2 ; the urban heat island effect will cause up to >10 °C additional (surface) warming 11 . Exposures to temperatures above 35 °C with high humidity or above 40 °C with low humidity can lead to lethal heat stress for humans 12 . Even before such lethal temperatures are reached, worker productivity 13 and general health and well-being 14 can suffer. Heat extremes are especially risky for people living in the Global South 15 , 16 due to warmer climates at low latitudes. Climate models project that the lethal temperature thresholds will be exceeded with increasing frequencies and durations, and such extreme conditions will be concentrated in low-latitude regions 17 , 18 , 19 . These low-latitude regions overlap with the major parts of the Global South where population densities are already high and where population growth rates are also high. Consequently, the number of people exposed to extreme heat will likely increase even further, all things being equal 16 , 20 . That population growth will be accompanied by expanded urbanization and intensified urban heat island effects 21 , 22 , potentially exacerbating future Global North-Global South heat stress exposure inequalities.
Fortunately, we know that heat stress can be buffered, in part, by urban vegetation 23 . Urban green spaces, and especially urban forests, have proven an effective means through which to ameliorate heat stress through shading 24 , 25 and transpirational cooling 26 , 27 . The buffering effect of urban green spaces is influenced by their area (relative to the area of the city) and their spatial configuration 28 . In this context, green spaces become a kind of infrastructure that can and should be actively managed. At broad spatial scales, the effect of this urban green infrastructure is also mediated by differences among regions, whether in their background climate 29 , composition of green spaces 30 , or other factors 31 , 32 , 33 , 34 . The geographic patterns of the buffering effects of green spaces, whether due to geographic patterns in their areal extent or region-specific effects, have so far been poorly characterized.
On their own, the effects of climate change and urban heat islands on human health are likely to become severe. However, these effects will become even worse if they fall disproportionately in cities or countries with less economic ability to invest in green space 35 or in other forms of cooling 36 , 37 . A number of studies have now documented the so-called ‘luxury effect,’ wherein lower-income parts of cities tend to have less green space and, as a result, reduced biodiversity 38 , 39 . Where the luxury effect exists, green space and its benefits become, in essence, a luxury good 40 . If the luxury effect holds among cities, and lower-income cities also have smaller green spaces, the Global South may have the least potential to mitigate the combined effects of climate warming and urban heat islands, leading to exacerbated and rising inequalities in heat exposure 41 .
Here, we assess the global inequalities in the cooling capability of existing urban green infrastructure across urban areas worldwide. To this end, we use remotely sensed data to quantify three key variables, i.e., (1) cooling efficiency, (2) cooling capacity, and (3) cooling benefit of existing urban green infrastructure for ~500 major cities across the world. Urban green infrastructure and temperature are generally negatively and relatively linearly correlated at landscape scales, i.e., higher quantities of urban green infrastructure yield lower temperatures 42 , 43 . Cooling efficiency is widely used as a measure of the extent to which a given proportional increase in the area of urban green infrastructure leads to a decrease in temperature, i.e., the slope of the urban green infrastructure-temperature relationship 42 , 44 , 45 (see Methods for details). This simple metric allows quantifying the quality of urban green infrastructure in terms of ameliorating the urban heat island effect. Meanwhile, the extent to which existing urban green infrastructure cools down an entire city’s surface temperatures (compared to the non-vegetated built-up areas) is referred to as cooling capacity. Hence, cooling capacity is a function of the total quantity of urban green infrastructure and its cooling efficiency (see Methods).
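The definition of cooling efficiency as a slope can be sketched in a few lines. This is a minimal illustration, not the procedure in the paper's Methods: the grid-cell values are invented, NDVI stands in for green infrastructure quantity, and an ordinary least-squares fit is assumed as the way to estimate the slope.

```python
def ols_slope(x, y):
    """Least-squares slope of y regressed on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    return sxy / sxx


# Hypothetical grid cells of one city: greenness and land surface temperature.
ndvi = [0.1, 0.2, 0.3, 0.4, 0.5]
surface_temp_c = [40.0, 38.5, 37.0, 35.5, 34.0]

slope = ols_slope(ndvi, surface_temp_c)  # negative: greener cells are cooler
cooling_efficiency = -slope              # degrees C of cooling per unit NDVI
print(f"cooling efficiency = {cooling_efficiency:.1f} degC per unit NDVI")
```

Flipping the sign of the slope turns a negative temperature-greenness relationship into a positive efficiency, so that a larger value means more cooling per unit of added greenery.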
As a third step, we account for the spatial distributions of urban green infrastructure and populations to quantify the benefit of cooling mitigation received by an average urban inhabitant in each city given their location. This cooling benefit is a more direct measure of the cooling realized by people, after accounting for the within-city geography of urban green infrastructure and population density. We focus on cooling capacity and cooling benefit as the measures of the cooling capability of individual cities for assessing their global inequalities. We are particularly interested in linking cooling adaptation inequality with income inequality 40 , 46 . While this can be achieved using existing income metrics for country classifications 47 , here we use the traditional Global North/South classification due to its historical ties to geography which is influential in climate research.
Our analyses indicate that existing green infrastructure of an average city has a capability of cooling down surface temperatures by ~3 °C during warm seasons. However, a concerning disparity is evident; on average Global South cities have only two-thirds the cooling capacity and cooling benefit compared to Global North cities. This inequality is attributable to the differences in both quantity and quality of existing urban green infrastructure among cities. Importantly, we find that there exists considerable potential for many cities to enhance the cooling capability of their green infrastructure; achieving this potential could dramatically reduce global inequalities in adaptation to outdoor heat stress.
Our analyses showed that both the quantity and quality of the existing urban green infrastructure vary greatly among the world’s ~500 most populated cities (see Methods for details, and Fig. 1 for examples). The quantity of urban green infrastructure measured based on remotely sensed indicators of spectral greenness (Normalized Difference Vegetation Index, NDVI, see Methods) had a coefficient of variation (CV) of 35%. Similarly, the quality of urban green infrastructure in terms of cooling efficiency (daytime land surface temperatures during peak summer) had a CV of 37% (Supplementary Figs. 1 , 2 ). The global mean value of cooling capacity is 2.9 °C; existing urban green infrastructure ameliorates warm-season heat stress by 2.9 °C of surface temperature in an average city. In truth, however, the variation in cooling capacity was great (global CV in cooling capacity as large as ~50%), such that few cities were average. This variation is strongly geographically structured. Cities closer to the equator - tropical and subtropical cities - tend to have relatively weak cooling capacities (Fig. 2a, b ). As Global South countries are predominantly located at low latitudes, this pattern leads to a situation in which Global South cities, which tend to be hotter and relatively lower-income, have, on average, approximately two-thirds the cooling capacity of the Global North cities (2.5 ± 1.0 vs. 3.6 ± 1.7°C, Wilcoxon test, p = 2.7e-12; Fig. 2c ). The cities that most need to rely on green infrastructure are, at present, those that are least able to do so.
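The summary statistics in this paragraph are easy to check by hand. The short sketch below recomputes the North-South ratio from the two reported means and shows how a coefficient of variation is obtained; the sample list passed to the CV function is invented, and the use of the population standard deviation is an assumption, since the text does not say which variant was used.

```python
import statistics


def coefficient_of_variation(values):
    """Standard deviation divided by the mean (population SD assumed)."""
    return statistics.pstdev(values) / statistics.mean(values)


# Mean cooling capacities reported in the text (degrees C).
south_mean_c = 2.5
north_mean_c = 3.6

south_to_north = south_mean_c / north_mean_c  # share of Northern capacity
north_to_south = north_mean_c / south_mean_c  # fold difference

print(f"South/North = {south_to_north:.2f}, North/South = {north_to_south:.2f}")

# Hypothetical values, only to show the CV calculation itself.
example_cv = coefficient_of_variation([2, 4, 4, 4, 5, 5, 7, 9])
print(f"example CV = {example_cv:.0%}")
```

The ratio of the means is about 0.69, which matches the "~70%" and "approximately two-thirds" wording, and its inverse of 1.44 corresponds to the roughly 1.5-fold gap.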
a , e , i , m , q Los Angeles, US. b , f , j , n , r Paris, France. c , g , k , o , s Shanghai, China. d , h , l , p , t Cairo, Egypt. Local cooling efficiency is calculated for different local climate zone types to account for within-city heterogeneity. In densely populated parts of cities, local cooling capacity tends to be lower due to reduced green space area, whereas local cooling benefit (local cooling capacity multiplied by a weight term of local population density relative to city mean) tends to be higher as more urban residents can receive cooling amelioration.
a Global distribution of cooling capacity for the 468 major urbanized areas. b Latitudinal pattern of cooling capacity. c Cooling capacity difference between the Global North and South cities. The cooling capacity offered by urban green infrastructure evinces a latitudinal pattern wherein lower-latitude cities have weaker cooling capacity (b, cubic-spline fitting of cooling capacity with 95% confidence interval is shown), representing a significant inequality between Global North and South countries: city-level cooling capacity for Global North cities is about 1.5-fold higher than in Global South cities (c). Data are presented as box plots, where median values (center black lines), 25th percentiles (box lower bounds), 75th percentiles (box upper bounds), whiskers extending to 1.5-fold of the interquartile range (IQR), and outliers are shown. The tails of the cooling capacity distributions are truncated at zero as all cities have positive values of cooling capacity. Notice that no cities in the Global South have a cooling capacity greater than 5.5 °C (c). This is because no cities in the Global South have proportional green space areas as great as those seen in the Global North (see also Fig. 4b). A similar pattern is found for cooling benefit (Supplementary Fig. 3). The two-sided non-parametric Wilcoxon test was used for statistical comparisons.
When we account for the locations of urban green infrastructure relative to humans within cities, the cooling benefit of urban green infrastructure realized by an average urban resident generally becomes slightly lower than suggested by cooling capacity (see Methods; Supplementary Fig. 3 ). Urban residents tend to be densest in the parts of cities with less green infrastructure. As a result, the average urban resident experiences less cooling amelioration than expected. However, this heterogeneity has only a minor effect on global-scale inequality. As a result, the geographic trends in cooling capacity and cooling benefit are similar: mean cooling benefit for an average urban resident also presents a 1.5-fold gap between Global South and North cities (2.2 ± 0.9 vs. 3.4 ± 1.7 °C, Wilcoxon test, p = 3.2e-13; Supplementary Fig. 3c ). Urban green infrastructure is a public good that has the potential to help even the most marginalized populations stay cool; unfortunately, this public benefit is least available in the Global South. When walking outdoors, the average person in an average Global South city receives only two-thirds the cooling amelioration from urban green infrastructure experienced by a person in an average Global North city. The high cooling amelioration capacity and benefit of the Global North cities is heavily influenced by North America (specifically, Canada and the US), which have both the highest cooling efficiency and the largest area of green infrastructure, followed by Europe (Supplementary Fig. 4 ).
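The gap between cooling capacity and cooling benefit can be illustrated with the weighting described in the Fig. 1 caption, where local cooling benefit is local cooling capacity multiplied by local population density relative to the city mean. The four grid cells below are invented, and averaging the local benefits to get a city-level figure is an assumption made for this sketch rather than a step confirmed by the text.

```python
def local_cooling_benefit(local_capacity, local_density, city_mean_density):
    """Local capacity weighted by population density relative to the city mean."""
    return local_capacity * (local_density / city_mean_density)


# Hypothetical city of four grid cells: the greenest cells are the emptiest.
capacity_c = [4.0, 3.0, 2.0, 1.0]          # local cooling capacity, degrees C
density = [100.0, 200.0, 300.0, 400.0]     # residents per cell

mean_density = sum(density) / len(density)
benefits = [
    local_cooling_benefit(c, d, mean_density)
    for c, d in zip(capacity_c, density)
]

city_capacity = sum(capacity_c) / len(capacity_c)  # area-based average
city_benefit = sum(benefits) / len(benefits)       # resident-based average

print(f"capacity = {city_capacity:.1f} degC, benefit = {city_benefit:.1f} degC")
```

Averaging the weighted values is equivalent to taking a population-weighted mean of local capacity. In this toy city the benefit (2.0 °C) falls below the capacity (2.5 °C) because most residents live in the least green cells.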
One way to illustrate the global inequality of cooling capacity or benefit is to separately look at the cities that are most and least effective in ameliorating outdoor heat stress. Our results showed that ~85% of the 50 most effective cities (with highest cooling capacity or cooling benefit) are located in the Global North, while ~80% of the 50 least effective are Global South cities (Fig. 3, Supplementary Fig. 5). This is true without taking into account the differences in the background temperatures and climate warming of these cities, which will exacerbate the effects on human health. Cities in the Global South are likely to be closer to the limits of human thermal comfort and even, increasingly, the limits of the temperatures and humidities (wet-bulb temperatures) at which humans can safely work or even walk, such that the ineffectiveness of green spaces in cooling those cities will lead to greater negative effects on human health 48, work 14, and gross domestic product (GDP) 49. In addition, Global South cities commonly have higher population densities (Fig. 3, Supplementary Fig. 5) and are projected to have faster population growth 50. This situation will plausibly intensify the urban heat island effect because of the need of those populations for housing (and hence tensions between the need for buildings and the need for green spaces). It will also increase the number of people exposed to extreme urban heat island effects. Therefore, it is critical to increase cooling benefit by expanding urban green spaces; where residents live closer together, more people receive the cooling mitigation from any given new neighboring green space. Doing so will require policies that incentivize urban green spaces as well as architectural innovations that make approaches such as plant-covered buildings easier and cheaper to implement.
The axes on the right are an order of magnitude greater than those on the left, such that the cooling capacity of Charlotte in the United States is about 37-fold greater than that of Mogadishu (Somalia) and 29-fold greater than that of Sana’a (Yemen). The cities with the lowest cooling capacities are mostly Global South cities with higher population densities.
Of course, cities differ even within the Global North or within the Global South. For example, some Global South cities have high green space areas (or relatively high cooling efficiency in combination with moderate green space areas) and hence high cooling capacity. These cities, such as Pune (India), will be important to study in more detail, to shed light on the mechanistic details of their cooling abilities as well as the sociopolitical and other factors that facilitated their high green area coverage and cooling capabilities (Supplementary Figs. 6 , 7 ).
We conducted our primary analyses using a spatial grain of 100-m grid cells and Landsat NDVI data for quantifying spectral greenness. Our results, however, were robust at the coarser spatial grain of 1 km. We find a slightly larger global cooling inequality (~2-fold gap between Global South and North cities) at the 1-km grain using MODIS data (see Methods and Supplementary Fig. 17 ). MODIS data have been frequently used for quantifying urban heat island effects and cooling mitigation 44 , 45 , 51 . Our results reinforce its robustness for comparing urban thermal environments between cities across broad scales.
The global inequality of cooling amelioration could have a number of proximate causes. To understand their relative influence, we first separately examined the effects of quality (cooling efficiency) and quantity (NDVI as a proxy indicator of urban green space area) of urban green infrastructure. The simplest null model is one in which cooling capacity (at the city scale) and cooling benefit (at the human scale) are driven primarily by the proportional area in a city dedicated to green spaces. Indeed, we found that both cooling capacity and cooling benefit were strongly correlated with urban green space area (Fig. 4 , Supplementary Fig. 8 ). This finding is useful with regards to practical interventions. In general, cities that invest in saving or restoring more green spaces will receive more cooling benefits from those green spaces. By contrast, differences among cities in cooling efficiency played a more minor role in determining the cooling capacity and benefit of cities (Fig. 4 , Supplementary Fig. 8 ).
a Relationship between cooling efficiency and cooling capacity. b Relationship between green space area (measured by mean Landsat NDVI in the hottest month of 2018) and cooling capacity. Note that the highest level of urban green space area in the Global South cities is much lower than that in the Global North (dashed line in b ). Gray bands indicate 95% confidence intervals. Two-sided t-tests were conducted. c A piecewise structural equation model based on assumed direct and indirect (through influencing cooling efficiency and urban green space area) effects of essential natural and socioeconomic factors on cooling capacity. Mean annual temperature and precipitation, and topographic variation (elevation range) are selected to represent basic background natural conditions; GDP per capita is selected to represent basic socioeconomic conditions. The spatial extent of built-up areas is included to correct for city size. A bi-directional relationship (correlation) is fitted between mean annual temperature and precipitation. Red and blue solid arrows indicate significantly negative and positive coefficients with p ≤ 0.05, respectively. Gray dashed arrows indicate p > 0.05. The arrow width illustrates the effect size. Similar relationships are found for cooling benefits realized by an average urban resident (see Supplementary Fig. 8 ).
A further question is what shapes the quality and quantity of urban green infrastructure (which in turn are driving cooling capacity)? Many inter-correlated factors are possibly operating at multiple scales, making it difficult to disentangle their effects, especially since experiment-based causal inference is usually not feasible for large-scale urban systems. From a macroscopic perspective, we test the simple hypothesis that the background natural and socioeconomic conditions of cities jointly affect their cooling capacity and benefit in both direct and indirect ways. To this end, we constructed a minimal structural equation model including only the most essential variables reflecting background climate (mean annual temperature and precipitation), topographic variation (elevation range), as well as gross domestic product (GDP) per capita and city area (see Methods; Fig. 4c ).
We found that the quantity of green spaces in a city (again, in proportion to its size) was positively correlated with GDP per capita and city area; wealthier cities have more green spaces. It is well known that wealth and green spaces are positively correlated within cities (the luxury effect) 40 , 46 ; our analysis shows that a similar luxury effect occurs among them at a global scale. In addition, larger cities often have proportionally more green spaces, an effect that may be due to the tendency for large cities (particularly in the US and Canada) to have lower population densities. Cities that were hotter and had more topographic variation tended to have fewer green spaces and those that were more humid tended to have more green spaces. Given that temperature and humidity are highly correlated with the geography of the Global South and Global North, it is difficult to know whether these effects are due to the direct effects of temperature and precipitation, for example, on the growth rate of vegetation and hence the transition of abandoned lots into green spaces, or are associated with historical, cultural and political differences that via various mechanisms correlate to climate. Our structural equation model explained only a small fraction of variation among cities in their cooling efficiency, which is to say the quality of their green space. Cooling efficiency was modestly influenced by background temperature and precipitation—the warmer a city, the greater the cooling efficiency in that city; conversely, the more humid a city the less the cooling efficiency of that city.
Our analyses suggested that the lower cooling adaptation capabilities of Global South cities can be explained by their lower quantity of green infrastructure and, to a much lesser extent, their weaker cooling efficiency (quality; Supplementary Fig. 2 ). These patterns appear to be in part structured by GDP, but are also associated with climatic conditions 39 , and other factors. A key question, unresolved by our work, is whether the climatic correlates of the size of green spaces in cities are due to the effects of climate per se or if they, instead, reflect correlates between contemporary climate and the social, cultural, and political histories of cities in the Global South 52 . Since urban planning has much inertia, especially in big cities, those choices might be correlated with climate because of the climatic correlates of political histories. It is also possible that these dynamics relate, in part, to the ways in which climate influences vegetation structure. However, this seems less likely given that under non-urban conditions vegetation cover (and hence cooling capacity) is normally positively correlated with mean annual temperature across the globe, opposite to our observed negative relationships for urban systems (Supplementary Fig. 9g ). Still, it is possible that increased temperatures in cities due to the urban heat island effects may lead to temperature-vegetation cover-cooling capacity relationships that differ from those in natural environments 53 , 54 . Indeed, a recent study found that climate warming will put urban forests at risk, and the risk is disproportionately higher in the Global South 55 .
Our model serves as a starting point for unraveling the mechanisms underlying global cooling inequality. We cannot rule out the possibility that other unconsidered factors correlated with the studied variables play important roles. We invite systematic studies incorporating detailed sociocultural and ecological variables to address this question across scales.
Can we reduce the inequality in cooling capacity and benefits that we have discovered among the world’s largest cities? Nuanced assessments of the potential to improve cooling mitigation require comprehensive considerations of socioeconomic, cultural, and technological aspects of urban management and policy. It is likely that cities differ greatly in their capacity to implement cooling through green infrastructure, whether as a function of culture, governance, policy or some mix thereof. However, any practical attempts to achieve greater cooling will occur in the context of the realities of climate and existing land use. To understand these realities, we modeled the maximum additional cooling capacity that is possible in cities, given existing constraints. We assume that this capacity depends on the quality (cooling efficiency) and quantity of urban green infrastructure. Our approach provides a straightforward metric of the cooling that could be achieved if all parts of a city’s green infrastructure were to be enhanced systematically.
The positive outlook is that our analyses suggest a considerable potential for improving cooling capacity by optimizing urban green infrastructure. An obvious way is through increases in urban green infrastructure quantity. We employ an approach in which we consider each local climate zone 56 to have a maximum NDVI and cooling efficiency (see Methods). For a given local climate zone, the city with the largest NDVI values or cooling efficiency sets the regional upper bounds for urban green infrastructure quantities or quality that can be achieved. Notably, these maxima are below the maxima for forests or other non-urban spaces for the simple reason that, as currently imagined, cities must contain gray (non-green) spaces in the form of roads and buildings. In this context, we conduct a thought experiment. What if we could systematically increase NDVI of all grid cells in each city, per local climate zone type, to a level corresponding to the median NDVI of grid cells in that upper bound city while keeping cooling efficiency unchanged (see Methods)? If we were able to achieve this goal, the cooling capacity of cities would increase by ~2.4 °C worldwide. The increase would be even greater, ~3.8 °C, if the 90th percentile (within the reference maximum city) were reached (Fig. 5a). The potential for cooling benefit to the average urban resident is similar to that of cooling capacity (Supplementary Fig. 10a). There is also potential to reduce urban temperatures if we can enhance cooling efficiency. However, the benefits of increases in cooling efficiency are modest (~1.5 °C increases at the 90th percentile of regional upper bounds) when holding urban green infrastructure quantity constant.
In theory, if we could maximize both quantity and cooling efficiency of urban green infrastructure (to 90th percentiles of their regional upper bounds respectively), we would yield increases in cooling capacity and benefit up to ~10 °C, much higher than enhancing green space area or cooling efficiency alone (Fig. 5a , Supplementary Fig. 10a ). Notably, such co-maximization of green space area and cooling efficiency would substantially reduce global inequality to Gini <0.1 (Fig. 5b , Supplementary Fig. 10b ). Our analyses thus provide an important suggestion that enhancing both green space quantity and quality can yield a synergistic effect leading to much larger gains than any single aspect alone.
Fig. 5. a The potential of enhancing cooling capacity via either enhancing urban green infrastructure quality (i.e., cooling efficiency) while holding quantity (i.e., green space area) fixed (yellow), or enhancing quantity while holding quality fixed (blue), is much lower than that of enhancing both quantity and quality (green). The x-axis indicates the targets of enhancing urban green infrastructure quantity and/or quality relative to the 50–90th percentiles of NDVI or cooling efficiency (see Methods). The dashed horizontal lines indicate the median cooling capacity of current cities. Data are presented as median values with the colored bands corresponding to 25–75th percentiles. b The potential of reducing cooling capacity inequality is also higher when enhancing both urban green infrastructure quantity and quality. The Gini index weighted by population density is used to measure inequality. Similar results were found for cooling benefit (Supplementary Fig. 10).
Different estimates of cooling capacity potential may be reached based on varying estimates and assumptions regarding the maximum possible quantity and quality of urban green infrastructure. There is no single, simple way to make these estimates, especially considering the huge between-city differences in society, culture, and structure across the globe. Our example case (above) begins from the upper bound city’s median NDVI, taking into account different local climate zone types and background climate regions (regional upper bounds). This is based on the assumption that, for cities within the same climate region, their average green space quantity may serve as an attainable target. Still, urban planning is often done at the level of individual cities, is often implemented only to a limited extent, and gives limited consideration to cities in other regions and countries. A potentially more realistic reference may be taken from the existing green infrastructure (again, per local climate zone type) within each particular city itself (see Methods): if a city’s sparsely vegetated areas were systematically elevated to the 50–90th percentiles of NDVI within their corresponding local climate zones within the city, cooling capacity would still increase, but only by 0.5–1.5 °C and with only slightly reduced inequalities among cities (Supplementary Fig. 11). This highlights that ambitious policies, inspired by the greener cities worldwide, are necessary to realize the large cooling potential in urban green infrastructure.
In summary, our results demonstrate clear inequality in the extent to which urban green infrastructure cools cities and their denizens between the Global North and South. Much attention has been paid to the global inequality of indoor heat adaptation arising from the inequality of resources (e.g., less affordable air conditioning and more frequent power shortages in the Global South) 36 , 57 , 58 , 59 . Our results suggest that the inequality in outdoor adaptation is particularly concerning, especially as urban populations in the Global South are growing rapidly and are likely to face the most severe future temperature extremes 60 .
Previous studies have focused on characterizing urban heat island effects, urban vegetation patterns, resident exposure, and cooling effects in particular cities 26, 28, 34, 61, regions 22, 25, 62, or continents 32, 44, 63. Recent studies have started to examine global patterns with respect to cooling efficiency or green space exposure 35, 45, 64, 65. Our approach is drawn from the fields of large-scale ecology and macroecology. It is complementary to, and can in the future be combined with, (1) mechanism-driven biophysical models 66, 67 to predict the influence of the composition and climate of green spaces on their cooling efficiency, (2) social theory aimed at understanding the factors that govern the amount of green space in cities as well as the disparity among cities 68, (3) economic models of the effects of policy changes on the amount of green space, and even (4) artist-driven projects that seek to understand the ways in which we might reimagine future cities 69. Our simple explanatory model is, ultimately, one lens on a complex, global phenomenon.
Our results offer a positive outlook in that there is considerable potential to strengthen the cooling capability of cities and to reduce inequalities in cooling capacities at the same time. Realizing this nature-based solution, however, will be challenging. First, enhancing urban green infrastructure requires massive investments, which are more difficult to achieve in Global South cities. Second, it also requires smart planning strategies and advanced urban design and greening technologies 37, 70, 71, 72. Spatial planning of urban green spaces needs to consider not only their cooling effect, but also their multifunctional aspects, which involve multiple ecosystem services, mental health benefits, accessibility, and security 73. In theory, a city can maximize its cooling while also maximizing density through the combination of high-density living, ground-level green spaces, and vertical and rooftop gardens (or even forests). In practice, the current cities with the most green spaces tend to be lower-density cities 74 (Supplementary Fig. 12). Still, innovation and implementation of new technologies that allow green spaces and high-density living to be combined have the potential to weaken or decouple the negative relationship between green space area and population density 71, 75. However, this development has yet to be realized. Another dimension of green spaces that deserves more attention is the geography of green spaces relative to where people are concentrated within cities. A critical question is how best to distribute green spaces within cities to maximize cooling efficiency 76 and minimize within-city cooling inequality in the interest of social equity 77. Last but not least, it is crucial to design and manage urban green spaces to be as resilient as possible to future climate stress 78.
For many cities, green infrastructure is likely to remain the primary means people will have to rely on to mitigate the escalating urban outdoor heat stress in the coming decades 79 .
We used the world population data from the World’s Cities in 2018 Data Booklet 80 to select 502 major cities with populations over 1 million people (see Supplementary Data 1 for the complete list of the studied cities). Cities are divided into the Global North and Global South based on the Human Development Index (HDI) from the Human Development Report 2019 81. For each selected city, we used the 2018 Global Artificial Impervious Area (GAIA) data at 30 m resolution 82 to determine its geographic extent. The derived urban boundary polygons thus encompass a majority of the built-up areas and urban residents. By using this approach, rather than urban administrative boundaries, we can focus on the relatively densely populated areas where cooling mitigation is most needed, and exclude areas dominated by (semi) natural landscapes that may bias the subsequent quantifications of the cooling effect. Our analyses of the cooling effect were conducted at 100 m spatial resolution using Landsat data and WorldPop Global Project Population Data of 2018 83. To test the robustness of the results to coarser spatial scales, we also repeated the analyses at 1 km resolution using MODIS data, which have been extensively used for quantifying urban heat island effects and cooling mitigation 44, 45, 51. We discarded the five cities with sizes <30 km² as they were too small for us to estimate their cooling efficiency based on linear regression (see section below for details). We combined closely located cities that form contiguous urban areas or urban agglomerations if their urban boundary polygons from GAIA merged (e.g., Phoenix and Mesa in the United States were combined). Our approach yielded 468 polygons, each representing a major urbanized area, which formed the basis for all subsequent analyses.
Because large water bodies can exert substantial and confounding cooling effects, we excluded permanent water bodies including lakes, reservoirs, rivers, and oceans using the Copernicus Global Land Service (CGLS) Land Cover data for 2018 at 10 m resolution 84 .
As a first step, we calculated cooling efficiency for each studied city within the GAIA-derived urban boundary. Cooling efficiency quantifies the extent to which a given area of green spaces in a city can reduce temperatures. It is a measure of the effectiveness (quality) of urban green spaces in terms of heat amelioration. Cooling efficiency is typically measured as the slope of the relationship between remotely sensed land surface temperature (LST) and vegetation cover, estimated through ordinary least squares regression 42, 44, 45. It is known that cooling efficiency varies between cities. Influencing factors might include background climate 29, species composition 30, 85, landscape configuration 28, topography 86, proximity to large water bodies 33, 87, urban morphology 88, and city management practices 31. However, the mechanism underlying the global pattern of cooling efficiency remains unclear.
We used Landsat satellite data provided by the United States Geological Survey (USGS) to calculate the cooling efficiency of each studied city. We used the cloud-free Landsat 8 Level 2 LST and NDVI data. For each city, we calculated the mean LST in each month of 2018 to identify the hottest month and derived the mean LST for that month; we then calculated the mean NDVI for the same month from the cloud-free Landsat 8 data.
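The hottest-month selection described above can be illustrated with a minimal sketch. This is not the code used in the study: the function name `hottest_month` and the input layout (a mapping from month number to a list of cloud-free LST values for one city) are assumptions made for illustration.

```python
from statistics import mean


def hottest_month(monthly_lst):
    """Return the month (1-12) with the highest mean LST.

    monthly_lst maps a month number to a list of cloud-free LST values
    (in deg C) for one city. Months without valid observations are skipped.
    """
    means = {month: mean(values) for month, values in monthly_lst.items() if values}
    if not means:
        raise ValueError("no cloud-free observations in any month")
    return max(means, key=means.get)


# Hypothetical values for one city; September has no cloud-free scenes.
example = {6: [31.0, 33.0], 7: [35.5, 36.5], 8: [34.0], 9: []}
print(hottest_month(example))
```

The mean NDVI would then be computed from the scenes of the returned month only.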
We quantified cooling efficiency for different local climate zones 56 separately for each city, to account for within-city variability of thermal environments. To this end, we used the Copernicus Global Land Service data (CGLS) 84 and Global Human Settlement Layers (GHSL) Built-up height data 89 of 2018 at the 100 m resolution to identify five types of local climate zones: non-tree vegetation (shrubs, herbaceous vegetation, and cultivated vegetation according to the CGLS classification system), low-rise buildings (built up and bare according to the CGLS classification system, with building heights ≤10 m according to the GHSL data), medium-high-rise buildings (built up and bare areas with building heights >10 m), open tree cover (open forest with tree cover 15–70% according to the CGLS system), and closed tree cover (closed forest with tree cover >70%).
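The five-way classification can be written as a small rule table. The sketch below is illustrative only: the string labels for land-cover classes (`"shrub"`, `"built_up"`, `"open_forest"`, and so on) are hypothetical stand-ins for the CGLS class codes, while the 10 m height threshold and the five zone types follow the description above.

```python
# Hypothetical labels standing in for CGLS land-cover classes.
NON_TREE = {"shrub", "herbaceous", "cultivated"}
BUILT = {"built_up", "bare"}


def classify_lcz(land_cover, building_height_m=0.0):
    """Assign one 100-m grid cell to a local climate zone type.

    Returns None for cells outside the five types (e.g., water).
    Open and closed forest are distinguished by the land-cover class itself
    (tree cover 15-70% vs. >70% in the CGLS system).
    """
    if land_cover in NON_TREE:
        return "non_tree_vegetation"
    if land_cover in BUILT:
        return "low_rise" if building_height_m <= 10 else "medium_high_rise"
    if land_cover == "open_forest":
        return "open_tree_cover"
    if land_cover == "closed_forest":
        return "closed_tree_cover"
    return None


print(classify_lcz("built_up", 8.0))   # low_rise
print(classify_lcz("built_up", 25.0))  # medium_high_rise
```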
For each local climate zone type in each city, we constructed a regression model with NDVI as the predictor variable and LST as the response variable (using the ordinary least squares method). We took into account potential confounding factors, including topographic elevation (derived from the MERIT DEM dataset 90), building height (derived from the GHSL dataset 89), and distance to water bodies (derived from the GSHHG dataset 91). The model thus became: LST ~ NDVI + topography + building height + distance to water. Cooling efficiency was calculated as the absolute value of the regression coefficient of NDVI, after correcting for those confounding factors. To address multicollinearity, we conducted variable selection based on the variance inflation factor (VIF) to achieve VIF < 5. Before the analysis, we discarded low-quality Landsat pixels and filtered out the pixels with NDVI < 0 (normally less than 1% in a single city). Cooling efficiency is known to be influenced by within-city heterogeneity 92, 93 and, as a result, might sometimes better fit non-linear relationships at local scales 65, 76. However, our central aim is to assess global cooling inequality based on generalized relationships that fit the majority of global cities. Previous studies have shown that linear relationships are adequate for this purpose 42, 44, 45; we therefore used linear models to assess cooling efficiency.
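To make the definition concrete, the sketch below fits a multiple linear regression by solving the least-squares normal equations and returns the absolute value of the NDVI coefficient. It is an illustration in plain Python, not the study's R code; the function names and the synthetic data are assumptions, and the VIF-based variable selection and pixel filtering are omitted.

```python
def ols_coefficients(rows, y):
    """Least-squares fit via the normal equations (X'X) b = X'y.

    rows: one list of predictor values per observation (intercept added here).
    Returns [intercept, b1, b2, ...].
    """
    X = [[1.0] + list(r) for r in rows]
    k = len(X[0])
    # Augmented matrix [X'X | X'y].
    A = [[sum(x[i] * x[j] for x in X) for j in range(k)]
         + [sum(x[i] * yi for x, yi in zip(X, y))] for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting.
    for c in range(k):
        p = max(range(c, k), key=lambda r: abs(A[r][c]))
        if abs(A[p][c]) < 1e-12:
            raise ValueError("predictors are collinear")
        A[c], A[p] = A[p], A[c]
        for r in range(k):
            if r != c:
                f = A[r][c] / A[c][c]
                A[r] = [a - f * b for a, b in zip(A[r], A[c])]
    return [A[i][k] / A[i][i] for i in range(k)]


def cooling_efficiency(ndvi, covariates, lst):
    """|coefficient of NDVI| in LST ~ NDVI + covariates."""
    rows = [[n] + list(c) for n, c in zip(ndvi, covariates)]
    return abs(ols_coefficients(rows, lst)[1])


# Synthetic pixels: LST falls by 8 deg C per unit NDVI and rises with elevation.
ndvi = [0.1, 0.2, 0.4, 0.5, 0.7, 0.3]
elev = [10, 50, 20, 80, 40, 60]
covs = [[e] for e in elev]
lst = [40 - 8 * n + 0.01 * e for n, e in zip(ndvi, elev)]
print(round(cooling_efficiency(ndvi, covs, lst), 6))
```

Because the synthetic LST values are generated without noise, the fitted NDVI coefficient recovers the slope of 8 used to build them.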
As a second step, we calculated the cooling capacity of each city. Cooling capacity is a positive function of the magnitude of cooling efficiency and the proportional area of green spaces in a city and is calculated based on NDVI and the derived cooling efficiency (Eq. 1, Supplementary Fig. 13):
\[ CC_{lcz} = CE_{lcz} \times \frac{\sum_{i=1}^{n} \left( NDVI_{i} - NDVI_{\min} \right)}{n} \qquad (1) \]
where \(CC_{lcz}\) and \(CE_{lcz}\) are the cooling capacity and cooling efficiency for a given local climate zone type in a city, respectively; \(NDVI_{i}\) is the mean NDVI for 100-m grid cell \(i\); \(NDVI_{\min}\) is the minimum NDVI across the city; and \(n\) is the total number of grid cells within the local climate zone. Local cooling capacity for each grid cell \(i\) (Fig. 1, Supplementary Fig. 7) can be derived in this way as well (Supplementary Fig. 13). For a particular city, cooling capacity may depend on the spatial configuration of its land use/cover 28, 94, but here we condensed cooling capacity to a city average (Eq. 2) and thus did not take these local-scale factors into account:
\[ CC = \sum_{lcz} \frac{n_{lcz}}{m} \, CC_{lcz} \qquad (2) \]
where \(CC\) is the average cooling capacity of a city; \(n_{lcz}\) is the number of grid cells of the local climate zone; and \(m\) is the total number of grid cells within the whole city.
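A minimal numerical sketch of Eqs. 1 and 2, as read from the symbol definitions above: the zone-level cooling capacity is taken to be cooling efficiency times the mean NDVI excess over the city minimum, and the city value is the grid-cell-weighted mean over zones. The function names and example values are hypothetical, and this reading of the equations is an assumption rather than the authors' code.

```python
def cooling_capacity_lcz(ce, ndvi, ndvi_min):
    """Eq. 1 (as read here): CE times the mean of (NDVI_i - NDVI_min)."""
    return ce * sum(v - ndvi_min for v in ndvi) / len(ndvi)


def city_cooling_capacity(zones, ndvi_min):
    """Eq. 2 (as read here): mean of zone values weighted by n_lcz / m.

    zones: list of (cooling_efficiency, ndvi_values), one tuple per
    local climate zone; ndvi_min is the minimum NDVI across the city.
    """
    m = sum(len(ndvi) for _, ndvi in zones)
    return sum(len(ndvi) / m * cooling_capacity_lcz(ce, ndvi, ndvi_min)
               for ce, ndvi in zones)


# Hypothetical city with two zones and a city-wide minimum NDVI of 0.2.
zones = [(10.0, [0.2, 0.4]), (5.0, [0.6, 0.8, 0.6, 0.8])]
print(round(city_cooling_capacity(zones, ndvi_min=0.2), 6))
```

In this example the zone values are 1.0 °C and 2.5 °C, and the city average weights them by 2/6 and 4/6 of the grid cells.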
As a third step, we calculated the cooling benefit realized by an average urban resident (cooling benefit in short) in each city. Cooling benefit depends not only on the cooling capacity of a city, but also on where people live within a city relative to greener or grayer areas of the city. For example, cooling benefits in a city might be low even if the cooling capacity is high if the green parts and the densely populated parts of a city are inversely correlated. Here, we calculate these averages while aware that, in any particular city, the exposure of a particular person will depend on the distribution of green spaces in the city and on that person’s occupation and movement trajectories, among other factors. On the scale of a city, we calculated cooling benefit following a previous study 35, that is, simply adding a weight term of population size per 100-m grid cell into cooling capacity in Eq. (1):
\[ CB_{lcz} = CE_{lcz} \times \frac{\sum_{i=1}^{n} \left( NDVI_{i} - NDVI_{\min} \right) \times \dfrac{pop_{i}}{\overline{pop}}}{n} \qquad (3) \]
where \(CB_{lcz}\) is the cooling benefit of a given local climate zone type in a specific city, \(pop_{i}\) is the number of people within grid cell \(i\), and \(\overline{pop}\) is the mean population of the city.
\[ CB = \sum_{lcz} \frac{n_{lcz}}{m} \, CB_{lcz} \qquad (4) \]
where \(CB\) is the average cooling benefit of a city. The population data were obtained from the 100-m resolution WorldPop Global Project Population Data of 2018 83. Local cooling benefit for a given grid cell \(i\) can be calculated in a similar way, i.e., local cooling capacity multiplied by a weight term of local population density relative to mean population density. Local cooling benefits were mapped for example cities for the purpose of illustrating the effect of population spatial distribution (Fig. 1, Supplementary Fig. 7), but their patterns were not examined here.
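The effect of the population weight can be seen in a small sketch. It assumes, following the verbal description above, that each grid cell's NDVI excess is multiplied by its population relative to the city-wide mean population per grid cell before averaging; the function names and numbers are hypothetical, not taken from the study.

```python
def cooling_benefit_lcz(ce, ndvi, pop, ndvi_min, pop_mean):
    """Population-weighted analogue of zone-level cooling capacity."""
    weighted = sum((v - ndvi_min) * (p / pop_mean) for v, p in zip(ndvi, pop))
    return ce * weighted / len(ndvi)


def city_cooling_benefit(zones, ndvi_min):
    """City average, weighting each zone by its share of grid cells.

    zones: list of (cooling_efficiency, ndvi_values, population_values).
    """
    m = sum(len(ndvi) for _, ndvi, _ in zones)
    pop_mean = sum(sum(pop) for _, _, pop in zones) / m
    return sum(len(ndvi) / m
               * cooling_benefit_lcz(ce, ndvi, pop, ndvi_min, pop_mean)
               for ce, ndvi, pop in zones)


# Same greenery, opposite population patterns (hypothetical numbers).
people_in_gray = [(10.0, [0.1, 0.5], [300, 100])]
people_in_green = [(10.0, [0.1, 0.5], [100, 300])]
print(round(city_cooling_benefit(people_in_gray, ndvi_min=0.1), 6))
print(round(city_cooling_benefit(people_in_green, ndvi_min=0.1), 6))
```

Both hypothetical cities have the same cooling capacity (2.0 °C under the same reading), but the cooling benefit is lower where most residents live in the less vegetated cell.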
Based on the aforementioned three key variables quantified at 100 m grid cells, we conducted multivariate analyses to examine if and to what extent cooling efficiency and cooling benefit are shaped by essential natural and socioeconomic factors, including background climate (mean annual temperature from ECMWF ERA5 dataset 95 and precipitation from TerraClimate dataset 96 ), topography (elevation range 90 ), and GDP per capita 97 , with city size (geographic extent) corrected for. We did not include humidity because it is strongly correlated with temperature and precipitation, causing serious multi-collinearity problems. We used piecewise structural equation modeling to test the direct effects of these factors and indirect effects via influencing cooling efficiency and vegetation cover (Fig. 4c , Supplementary Fig. 8c ). To account for the potential influence of spatial autocorrelation, we used spatially autoregressive models (SAR) to test for the robustness of the observed effects of natural and socioeconomic factors on cooling capacity and benefit (Supplementary Fig. 14 ).
We conducted the following additional analyses to test for robustness. We obtained consistent results from these robustness analyses.
(1) We looked at the mean hottest-month LST and NDVI within 3 years (2017-2019) to check the consistency between the results based on relatively short (1 year) vs. long (3-year average) time periods (Supplementary Fig. 15 ).
(2) We carried out the approach at a coarser spatial scale of 1 km, using MODIS-derived NDVI and LST, as well as the population data 83 in the hottest month of 2018. In line with our finer-scale analysis of Landsat data, we selected the hottest month and excluded low-quality grids affected by cloud cover and water bodies 98 (water cover > 20% in 1 × 1 km 2 grid cells) of MODIS LST, and calculated the mean NDVI for the hottest month. We ultimately obtained 441 cities (or urban agglomerations) for analysis. At the 1 km resolution, some local climate zone types would yield insufficient samples for constructing cooling efficiency models. Therefore, instead of identifying local climate zone explicitly, we took an indirect approach to account for local climate confounding factors, that is, we constructed a multiple regression model for a whole city incorporating the hottest-month local temperature 95 , precipitation 96 , and humidity (based on NASA FLDAS dataset 99 ), albedo (derived from the MODIS MCD43A3 product 100 ), aerosol loading (derived from the MODIS MCD19A2 product 101 ), wind speed (based on TerraClimate dataset 96 ), topography elevation 90 , distance to water 91 , urban morphology (building height 102 ), and human activity intensity (VIIRS nighttime light data as a proxy indicator 103 ). We used the absolute value of the linear regression coefficient of NDVI as the cooling efficiency of the whole city (model: LST ~ NDVI + temperature + precipitation + humidity + distance to water + topography + building height + albedo + aerosol + wind speed + nighttime light), and calculated cooling capacity and cooling benefit based on the same method. Variable selection was conducted using the criterion of VIF < 5.
Our results indicated that MODIS-based cooling capacity and cooling benefit are significantly correlated with their Landsat-based counterparts (Supplementary Fig. 16); importantly, the gap between the Global South and North cities is around two-fold, close to the Landsat-based result (Supplementary Fig. 17).
(3) For the calculation of cooling benefit, we considered different spatial scales of human accessibility to green spaces: assuming that the population in each 100 × 100 m² grid cell could access green spaces within neighborhoods of certain extents, we calculated cooling benefit by replacing NDVI i in Eq. (3) with the mean NDVI within the 300 × 300 m² and 500 × 500 m² extents centered at the focal grid cell (Supplementary Fig. 18).
(4) Considering cities may vary in minimum NDVI, we assessed if this variation could affect resulting cooling capacity patterns. To this end, we calculated the cooling capacity for each studied city using NDVI = 0 as the reference (i.e., using NDVI = 0 instead of minimum NDVI in Supplementary Fig. 13b ), and correlated it with that using minimum NDVI as the reference (Supplementary Fig. 19 ).
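The neighborhood averaging in check (3) amounts to a moving-window mean over the 100-m NDVI grid: a radius of one cell gives a 300 × 300 m² window and a radius of two cells gives 500 × 500 m². The sketch below is illustrative; the function name is hypothetical, and truncating the window at the grid edge is an assumption, since the edge treatment is not specified above.

```python
def neighborhood_mean(grid, radius):
    """Mean of each cell's (2*radius + 1)-cell square neighborhood.

    grid: list of rows of NDVI values on a regular 100-m grid.
    Windows are truncated at the grid edge (an assumption).
    """
    rows, cols = len(grid), len(grid[0])
    out = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            window = [grid[rr][cc]
                      for rr in range(max(0, r - radius), min(rows, r + radius + 1))
                      for cc in range(max(0, c - radius), min(cols, c + radius + 1))]
            out[r][c] = sum(window) / len(window)
    return out


ndvi_grid = [[0.1, 0.2, 0.3],
             [0.4, 0.5, 0.6],
             [0.7, 0.8, 0.9]]
smoothed = neighborhood_mean(ndvi_grid, radius=1)  # 300 m x 300 m window
print(round(smoothed[1][1], 6), round(smoothed[0][0], 6))
```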
Inequalities in access to the benefits of green spaces in cities exist within cities, as is increasingly well-documented 104 . Here, we focus instead on the inequalities among cities. We used the Gini coefficient to measure the inequality in cooling capacity and cooling benefit between all studied cities across the globe as well as between Global North or South cities. We calculated Gini using the population-density weighted method (Fig. 5b ), as well as the unweighted and population-size weighted methods (Supplementary Fig. 20 ).
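For readers unfamiliar with weighted Gini coefficients, the sketch below computes one from the Lorenz curve using the trapezoid rule. It is one common formulation and is offered as an assumption; the exact weighting routine used in the study is not described here, and the function name is hypothetical. Values must be non-negative.

```python
def weighted_gini(values, weights=None):
    """Gini coefficient from the weighted Lorenz curve (trapezoid rule).

    values: non-negative quantities (e.g., cooling capacity per city).
    weights: optional non-negative weights (e.g., population density);
    equal weights are used when omitted.
    """
    if weights is None:
        weights = [1.0] * len(values)
    pairs = sorted(zip(values, weights))
    total_w = sum(w for _, w in pairs)
    total_vw = sum(v * w for v, w in pairs)
    if total_w <= 0 or total_vw <= 0:
        raise ValueError("weights and weighted values must sum to a positive number")
    cum_vw = 0.0
    area = 0.0
    for v, w in pairs:
        prev_share = cum_vw / total_vw
        cum_vw += v * w
        # Width of this step times the sum of the two Lorenz ordinates.
        area += (w / total_w) * (prev_share + cum_vw / total_vw)
    return 1.0 - area


print(round(weighted_gini([2.0, 2.0, 2.0]), 6))         # perfectly equal
print(round(weighted_gini([1.0, 3.0], [3.0, 1.0]), 6))  # same as [1, 1, 1, 3]
```

A weight of 3 on a value is equivalent to listing that value three times, which is why the second call matches the unweighted result for `[1, 1, 1, 3]`.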
We estimated the potential of enhancing cooling amelioration based on the assumptions that urban green space quality (cooling efficiency) and quantity (NDVI) can be increased to different levels, and that relative spatial distributions of green spaces and population can be idealized (so that their spatial matches can maximize cooling benefit). We assumed that macro-climate conditions act as the constraints of vegetation cover and cooling efficiency. We calculated the 50th, 60th, 70th, 80th, and 90th percentiles of NDVI within each type of local climate zone of each city. For a given local climate zone type, we obtained the city with the highest NDVI per percentile value as the regional upper bounds of urban green infrastructure quantity. The regional upper bounds of cooling efficiency are derived in a similar way. For each local climate zone in a city, we generated a potential NDVI distribution where all grid cells reach the regional upper bound values for the 50th, 60th, 70th, 80th, or 90th percentile of urban green space quantity or quality, respectively. NDVI values below these percentiles were increased, whereas those above these percentiles remained unchanged. The potential estimates are essentially dependent on the references, i.e., the optimal cooling efficiency and NDVI that a given city can reach. However, such references are obviously difficult to determine, because complex natural and socioeconomic conditions could play important roles in determining those cooling optima, and the dominant factors are unknown at a global scale. We employed the simplifying assumption that background climate could act as an essential constraint according to our results. We therefore used the Köppen climate classification system 105 to determine the reference separately in each climate region (tropical, arid, temperate, and continental climate regions were involved for all studied cities).
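The quantity scenario can be sketched in three small steps: compute a percentile of NDVI per city, take the highest such value among cities in the same climate region as the regional upper bound, and raise every grid cell below that bound up to it. The code is a minimal illustration for a single local climate zone type; the function names, the linear-interpolation percentile, and the example values are assumptions.

```python
def percentile(values, q):
    """Linear-interpolation percentile for q in [0, 100]."""
    s = sorted(values)
    pos = (len(s) - 1) * q / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(s) - 1)
    return s[lo] + (s[hi] - s[lo]) * (pos - lo)


def regional_upper_bound(ndvi_by_city, q):
    """Highest q-th percentile NDVI among cities in one climate region."""
    return max(percentile(v, q) for v in ndvi_by_city.values())


def raise_to_target(ndvi, target):
    """Lift cells below the target; cells above it stay unchanged."""
    return [max(v, target) for v in ndvi]


# Two hypothetical cities in the same climate region, same zone type.
region = {"A": [0.1, 0.2, 0.3], "B": [0.3, 0.5, 0.7]}
bound = regional_upper_bound(region, q=50)
print(bound, raise_to_target(region["A"], bound), raise_to_target(region["B"], bound))
```

Potential cooling capacity would then be recomputed from the raised NDVI values with cooling efficiency held fixed.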
We calculated potential cooling capacity and cooling benefit based on these potential NDVI maps (Fixed cooling efficiency in Fig. 5). We then calculated the potentials if the cooling efficiency of each city could be enhanced to the 50–90th percentiles across all urban local climate zones within the corresponding biogeographic region (Fixed green space area in Fig. 5). We also calculated the potentials if both NDVI and cooling efficiency were enhanced (Enhancing both in Fig. 5) to a certain corresponding level (i.e., i th percentile NDVI + i th percentile cooling efficiency). We examined whether there are additional effects of idealizing the relative spatial distributions of urban green spaces and humans on cooling benefits. To this end, the pixel values of NDVI and population count remained unchanged, but their one-to-one correspondences were based on their ranking: the largest population corresponds to the highest NDVI, and so forth. Under each scenario, we calculated cooling capacity and cooling benefit for each city, and the between-city inequality was measured by the Gini coefficient.
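The rank-based idealization can be sketched as a reassignment of the existing population values across grid cells so that population rank matches NDVI rank. The function name is hypothetical and ties are broken by sort order, which is an assumption.

```python
def rank_match(ndvi, pop):
    """Reassign population values so the greenest cell gets the most people.

    The sets of NDVI and population values are unchanged; only their
    pairing changes. Returns the reassigned population, aligned with ndvi.
    """
    order = sorted(range(len(ndvi)), key=lambda i: ndvi[i])  # cells, least to most green
    sorted_pop = sorted(pop)                                  # smallest to largest
    matched = [0] * len(pop)
    for rank, cell in enumerate(order):
        matched[cell] = sorted_pop[rank]
    return matched


ndvi = [0.6, 0.1, 0.3]
pop = [10, 500, 200]
print(rank_match(ndvi, pop))  # [500, 10, 200]
```

Recomputing cooling benefit with the reassigned population gives the upper limit attainable by spatial matching alone, without changing vegetation or total population.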
We used the Google Earth Engine to process the spatial data. The statistical analyses were conducted using R v4.3.3 106 , with car v3.1-2 107 , piecewiseSEM v2.1.2 108 , and ineq v0.2-13 109 packages. The global maps of cooling were created using the ArcGIS v10.3 software.
Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.
City population statistics were obtained from the Population Division of the Department of Economic and Social Affairs of the United Nations (https://www.un.org/development/desa/pd/content/worlds-cities-2018-data-booklet). The Global North-South division is based on the Human Development Report 2019 from the United Nations Development Programme (https://hdr.undp.org/content/human-development-report-2019). Global urban boundaries from GAIA data are available from the Star Cloud Data Service Platform (https://data-starcloud.pcl.ac.cn/resource/14). Global water data are derived from the 2018 Copernicus Global Land Service (CGLS 100-m) data (https://developers.google.com/earth-engine/datasets/catalog/COPERNICUS_Landcover_100m_Proba-V-C3_Global), the European Space Agency (ESA) WorldCover 10 m 2020 product (https://developers.google.com/earth-engine/datasets/catalog/ESA_WorldCover_v100), and GSHHG (A Global Self-consistent, Hierarchical, High-resolution Geography Database) at https://www.soest.hawaii.edu/pwessel/gshhg/. Landsat 8 LST and NDVI data at 30 m resolution are available at https://developers.google.com/earth-engine/datasets/catalog/LANDSAT_LC08_C02_T1_L2. Land surface temperature (LST) data at 1 km resolution from the MODIS Aqua product (MYD11A1) are available at https://developers.google.com/earth-engine/datasets/catalog/MODIS_061_MYD11A1. The NDVI (1 km) dataset from MYD13A2 is available at https://developers.google.com/earth-engine/datasets/catalog/MODIS_061_MYD13A2. Population data (100 m) are derived from WorldPop (https://developers.google.com/earth-engine/datasets/catalog/WorldPop_GP_100m_pop). Local climate zones are also based on 2018 CGLS data (https://developers.google.com/earth-engine/datasets/catalog/COPERNICUS_Landcover_100m_Proba-V-C3_Global), and built-up height data are available from Global Human Settlement Layers (GHSL, 100 m) (https://developers.google.com/earth-engine/datasets/catalog/JRC_GHSL_P2023A_GHS_BUILT_H).
Temperature data is calculated from ERA5-Land Monthly Aggregated dataset ( https://developers.google.com/earth-engine/datasets/catalog/ECMWF_ERA5_LAND_MONTHLY_AGGR ). Precipitation and wind data are calculated from TerraClimate (Monthly Climate and Climatic Water Balance for Global Terrestrial Surfaces, University of Idaho) ( https://developers.google.com/earth-engine/datasets/catalog/IDAHO_EPSCOR_TERRACLIMATE ). Humidity data is calculated from Famine Early Warning Systems Network (FEWS NET) Land Data Assimilation System ( https://developers.google.com/earth-engine/datasets/catalog/NASA_FLDAS_NOAH01_C_GL_M_V001 ). Topography data from MERIT DEM (Multi-Error-Removed Improved-Terrain DEM) product is available at https://developers.google.com/earth-engine/datasets/catalog/MERIT_DEM_v1_0_3 . GDP from Gross Domestic Product and Human Development Index dataset is available at https://doi.org/10.5061/dryad.dk1j0 . VIIRS nighttime light data is available at https://developers.google.com/earth-engine/datasets/catalog/NOAA_VIIRS_DNB_MONTHLY_V1_VCMSLCFG . City building volume data from Global 3D Building Structure (1 km) is available at https://doi.org/10.34894/4QAGYL . Albedo data is derived from the MODIS MCD43A3 product ( https://developers.google.com/earth-engine/datasets/catalog/MODIS_061_MCD43A3 ), and aerosol data is derived from the MODIS MCD19A2 product ( https://developers.google.com/earth-engine/datasets/catalog/MODIS_061_MCD19A2_GRANULES ). All data used for generating the results are publicly available at https://doi.org/10.6084/m9.figshare.26340592.v1 .
The code used for data collection and analyses is publicly available at https://doi.org/10.6084/m9.figshare.26340592.v1 .
Dosio, A., Mentaschi, L., Fischer, E. M. & Wyser, K. Extreme heat waves under 1.5 °C and 2 °C global warming. Environ. Res. Lett. 13 , 054006 (2018).
Suarez-Gutierrez, L., Müller, W. A., Li, C. & Marotzke, J. Hotspots of extreme heat under global warming. Clim. Dyn. 55 , 429–447 (2020).
Guo, Y. et al. Global variation in the effects of ambient temperature on mortality: a systematic evaluation. Epidemiology 25 , 781–789 (2014).
Mora, C. et al. Global risk of deadly heat. Nat. Clim. Chang. 7 , 501–506 (2017).
Ebi, K. L. et al. Hot weather and heat extremes: health risks. Lancet 398 , 698–708 (2021).
Lüthi, S. et al. Rapid increase in the risk of heat-related mortality. Nat. Commun. 14 , 4894 (2023).
United Nations Department of Economic and Social Affairs, Population Division. in World Population Prospects 2022: Summary of Results (United Nations Fund for Population Activities, 2022).
Sachindra, D., Ng, A., Muthukumaran, S. & Perera, B. Impact of climate change on urban heat island effect and extreme temperatures: a case‐study. Q. J. R. Meteorol. Soc. 142 , 172–186 (2016).
Guo, L. et al. Evaluating contributions of urbanization and global climate change to urban land surface temperature change: a case study in Lagos, Nigeria. Sci. Rep. 12 , 14168 (2022).
Liu, Z. et al. Surface warming in global cities is substantially more rapid than in rural background areas. Commun. Earth Environ. 3 , 219 (2022).
Mentaschi, L. et al. Global long-term mapping of surface temperature shows intensified intra-city urban heat island extremes. Glob. Environ. Change 72 , 102441 (2022).
Asseng, S., Spänkuch, D., Hernandez-Ochoa, I. M. & Laporta, J. The upper temperature thresholds of life. Lancet Planet. Health 5 , e378–e385 (2021).
Zander, K. K., Botzen, W. J., Oppermann, E., Kjellstrom, T. & Garnett, S. T. Heat stress causes substantial labour productivity loss in Australia. Nat. Clim. Chang. 5 , 647–651 (2015).
Flouris, A. D. et al. Workers’ health and productivity under occupational heat strain: a systematic review and meta-analysis. Lancet Planet. Health 2 , e521–e531 (2018).
Xu, C., Kohler, T. A., Lenton, T. M., Svenning, J.-C. & Scheffer, M. Future of the human climate niche. Proc. Natl Acad. Sci. USA 117 , 11350–11355 (2020).
Lenton, T. M. et al. Quantifying the human cost of global warming. Nat. Sustain. 6 , 1237–1247 (2023).
Harrington, L. J. et al. Poorest countries experience earlier anthropogenic emergence of daily temperature extremes. Environ. Res. Lett. 11 , 055007 (2016).
Bathiany, S., Dakos, V., Scheffer, M. & Lenton, T. M. Climate models predict increasing temperature variability in poor countries. Sci. Adv. 4 , eaar5809 (2018).
Alizadeh, M. R. et al. Increasing heat‐stress inequality in a warming climate. Earth Future 10 , e2021EF002488 (2022).
Tuholske, C. et al. Global urban population exposure to extreme heat. Proc. Natl Acad. Sci. USA 118 , e2024792118 (2021).
Manoli, G. et al. Magnitude of urban heat islands largely explained by climate and population. Nature 573 , 55–60 (2019).
Wang, J. et al. Anthropogenic emissions and urbanization increase risk of compound hot extremes in cities. Nat. Clim. Chang. 11 , 1084–1089 (2021).
Bowler, D. E., Buyung-Ali, L., Knight, T. M. & Pullin, A. S. Urban greening to cool towns and cities: a systematic review of the empirical evidence. Landsc. Urban Plan. 97 , 147–155 (2010).
Armson, D., Stringer, P. & Ennos, A. The effect of tree shade and grass on surface and globe temperatures in an urban area. Urban For. Urban Green. 11 , 245–255 (2012).
Wang, C., Wang, Z. H. & Yang, J. Cooling effect of urban trees on the built environment of contiguous United States. Earth Future 6 , 1066–1081 (2018).
Pataki, D. E., McCarthy, H. R., Litvak, E. & Pincetl, S. Transpiration of urban forests in the Los Angeles metropolitan area. Ecol. Appl. 21 , 661–677 (2011).
Konarska, J. et al. Transpiration of urban trees and its cooling effect in a high latitude city. Int. J. Biometeorol. 60 , 159–172 (2016).
Li, X., Zhou, W., Ouyang, Z., Xu, W. & Zheng, H. Spatial pattern of greenspace affects land surface temperature: evidence from the heavily urbanized Beijing metropolitan area, China. Landsc. Ecol. 27 , 887–898 (2012).
Yu, Z., Xu, S., Zhang, Y., Jørgensen, G. & Vejre, H. Strong contributions of local background climate to the cooling effect of urban green vegetation. Sci. Rep. 8 , 6798 (2018).
Richards, D. R., Fung, T. K., Belcher, R. & Edwards, P. J. Differential air temperature cooling performance of urban vegetation types in the tropics. Urban For. Urban Green. 50 , 126651 (2020).
Winbourne, J. B. et al. Tree transpiration and urban temperatures: current understanding, implications, and future research directions. BioScience 70 , 576–588 (2020).
Schwaab, J. et al. The role of urban trees in reducing land surface temperatures in European cities. Nat. Commun. 12 , 6763 (2021).
Vo, T. T. & Hu, L. Diurnal evolution of urban tree temperature at a city scale. Sci. Rep. 11 , 10491 (2021).
Wang, J. et al. Comparing relationships between urban heat exposure, ecological structure, and socio-economic patterns in Beijing and New York City. Landsc. Urban Plan. 235 , 104750 (2023).
Chen, B. et al. Contrasting inequality in human exposure to greenspace between cities of Global North and Global South. Nat. Commun. 13 , 4636 (2022).
Pavanello, F. et al. Air-conditioning and the adaptation cooling deficit in emerging economies. Nat. Commun. 12 , 6460 (2021).
Turner, V. K., Middel, A. & Vanos, J. K. Shade is an essential solution for hotter cities. Nature 619 , 694–697 (2023).
Hope, D. et al. Socioeconomics drive urban plant diversity. Proc. Natl Acad. Sci. USA 100 , 8788–8792 (2003).
Leong, M., Dunn, R. R. & Trautwein, M. D. Biodiversity and socioeconomics in the city: a review of the luxury effect. Biol. Lett. 14 , 20180082 (2018).
Schwarz, K. et al. Trees grow on money: urban tree canopy cover and environmental justice. PloS ONE 10 , e0122051 (2015).
Chakraborty, T., Hsu, A., Manya, D. & Sheriff, G. Disproportionately higher exposure to urban heat in lower-income neighborhoods: a multi-city perspective. Environ. Res. Lett. 14 , 105003 (2019).
Wang, J. et al. Significant effects of ecological context on urban trees’ cooling efficiency. ISPRS J. Photogramm. Remote Sens. 159 , 78–89 (2020).
Marando, F. et al. Urban heat island mitigation by green infrastructure in European Functional Urban Areas. Sust. Cities Soc. 77 , 103564 (2022).
Cheng, X., Peng, J., Dong, J., Liu, Y. & Wang, Y. Non-linear effects of meteorological variables on cooling efficiency of African urban trees. Environ. Int. 169 , 107489 (2022).
Yang, Q. et al. Global assessment of urban trees’ cooling efficiency based on satellite observations. Environ. Res. Lett. 17 , 034029 (2022).
Yin, Y., He, L., Wennberg, P. O. & Frankenberg, C. Unequal exposure to heatwaves in Los Angeles: Impact of uneven green spaces. Sci. Adv. 9 , eade8501 (2023).
Fantom N., Serajuddin U. The World Bank’s Classification of Countries by Income (The World Bank, 2016).
Iungman, T. et al. Cooling cities through urban green infrastructure: a health impact assessment of European cities. Lancet 401 , 577–589 (2023).
He, C. et al. The inequality labor loss risk from future urban warming and adaptation strategies. Nat. Commun. 13 , 3847 (2022).
Kii, M. Projecting future populations of urban agglomerations around the world and through the 21st century. npj Urban Sustain 1 , 10 (2021).
Paschalis, A., Chakraborty, T., Fatichi, S., Meili, N. & Manoli, G. Urban forests as main regulator of the evaporative cooling effect in cities. AGU Adv. 2 , e2020AV000303 (2021).
Hunte, N., Roopsind, A., Ansari, A. A. & Caughlin, T. T. Colonial history impacts urban tree species distribution in a tropical city. Urban For. Urban Green. 41 , 313–322 (2019).
Kabano, P., Harris, A. & Lindley, S. Sensitivity of canopy phenology to local urban environmental characteristics in a tropical city. Ecosystems 24 , 1110–1124 (2021).
Frank, S. D. & Backe, K. M. Effects of urban heat islands on temperate forest trees and arthropods. Curr. Rep. 9 , 48–57 (2023).
Esperon-Rodriguez, M. et al. Climate change increases global risk to urban forests. Nat. Clim. Chang. 12 , 950–955 (2022).
Stewart, I. D. & Oke, T. R. Local climate zones for urban temperature studies. Bull. Am. Meteorol. Soc. 93 , 1879–1900 (2012).
Biardeau, L. T., Davis, L. W., Gertler, P. & Wolfram, C. Heat exposure and global air conditioning. Nat. Sustain. 3 , 25–28 (2020).
Davis, L., Gertler, P., Jarvis, S. & Wolfram, C. Air conditioning and global inequality. Glob. Environ. Change 69 , 102299 (2021).
Colelli, F. P., Wing, I. S. & Cian, E. D. Air-conditioning adoption and electricity demand highlight climate change mitigation–adaptation tradeoffs. Sci. Rep. 13 , 4413 (2023).
Sun, L., Chen, J., Li, Q. & Huang, D. Dramatic uneven urbanization of large cities throughout the world in recent decades. Nat. Commun. 11 , 5366 (2020).
Liu, D., Kwan, M.-P. & Kan, Z. Analysis of urban green space accessibility and distribution inequity in the City of Chicago. Urban For. Urban Green. 59 , 127029 (2021).
Hsu, A., Sheriff, G., Chakraborty, T. & Manya, D. Disproportionate exposure to urban heat island intensity across major US cities. Nat. Commun. 12 , 2721 (2021).
Zhao, L., Lee, X., Smith, R. B. & Oleson, K. Strong contributions of local background climate to urban heat islands. Nature 511 , 216–219 (2014).
Wu, S., Chen, B., Webster, C., Xu, B. & Gong, P. Improved human greenspace exposure equality during 21st century urbanization. Nat. Commun. 14 , 6460 (2023).
Zhao, J., Zhao, X., Wu, D., Meili, N. & Fatichi, S. Satellite-based evidence highlights a considerable increase of urban tree cooling benefits from 2000 to 2015. Glob. Chang. Biol. 29 , 3085–3097 (2023).
Nice, K. A., Coutts, A. M. & Tapper, N. J. Development of the VTUF-3D v1.0 urban micro-climate model to support assessment of urban vegetation influences on human thermal comfort. Urban Clim. 24 , 1052–1076 (2018).
Meili, N. et al. An urban ecohydrological model to quantify the effect of vegetation on urban climate and hydrology (UT&C v1.0). Geosci. Model Dev. 13 , 335–362 (2020).
Nesbitt, L., Meitner, M. J., Sheppard, S. R. & Girling, C. The dimensions of urban green equity: a framework for analysis. Urban For. Urban Green. 34 , 240–248 (2018).
Hedblom, M., Prévot, A.-C. & Grégoire, A. Science fiction blockbuster movies—a problem or a path to urban greenery? Urban For. Urban Green. 74 , 127661 (2022).
Norton, B. A. et al. Planning for cooler cities: a framework to prioritise green infrastructure to mitigate high temperatures in urban landscapes. Landsc. Urban Plan 134 , 127–138 (2015).
Medl, A., Stangl, R. & Florineth, F. Vertical greening systems—a review on recent technologies and research advancement. Build. Environ. 125 , 227–239 (2017).
Chen, B., Lin, C., Gong, P. & An, J. Optimize urban shade using digital twins of cities. Nature 622 , 242–242 (2023).
Pamukcu-Albers, P. et al. Building green infrastructure to enhance urban resilience to climate change and pandemics. Landsc. Ecol. 36 , 665–673 (2021).
Haaland, C. & van Den Bosch, C. K. Challenges and strategies for urban green-space planning in cities undergoing densification: a review. Urban For. Urban Green. 14 , 760–771 (2015).
Shafique, M., Kim, R. & Rafiq, M. Green roof benefits, opportunities and challenges—a review. Renew. Sust. Energ. Rev. 90 , 757–773 (2018).
Wang, J., Zhou, W. & Jiao, M. Location matters: planting urban trees in the right places improves cooling. Front. Ecol. Environ. 20 , 147–151 (2022).
Lan, T., Liu, Y., Huang, G., Corcoran, J. & Peng, J. Urban green space and cooling services: opposing changes of integrated accessibility and social equity along with urbanization. Sust. Cities Soc. 84 , 104005 (2022).
Wood, S. & Dupras, J. Increasing functional diversity of the urban canopy for climate resilience: Potential tradeoffs with ecosystem services? Urban For. Urban Green. 58 , 126972 (2021).
Wong, N. H., Tan, C. L., Kolokotsa, D. D. & Takebayashi, H. Greenery as a mitigation and adaptation strategy to urban heat. Nat. Rev. Earth Environ. 2 , 166–181 (2021).
United Nations. Department of economic and social affairs, population division. in The World’s Cities in 2018—Data Booklet (UN, 2018).
United Nations Development Programme (UNDP). Human Development Report 2019: Beyond Income, Beyond Averages, Beyond Today: Inequalities in Human Development in the 21st Century (United Nations Development Programme (UNDP), 2019)
Li, X. et al. Mapping global urban boundaries from the global artificial impervious area (GAIA) data. Environ. Res. Lett. 15 , 094044 (2020).
Stevens, F. R., Gaughan, A. E., Linard, C. & Tatem, A. J. Disaggregating census data for population mapping using random forests with remotely-sensed and ancillary data. PloS ONE 10 , e0107042 (2015).
Buchhorn, M. et al. Copernicus global land cover layers—collection 2. Remote Sens 12 , 1044 (2020).
Gillerot, L. et al. Forest structure and composition alleviate human thermal stress. Glob. Change Biol. 28 , 7340–7352 (2022).
Hamada, S., Tanaka, T. & Ohta, T. Impacts of land use and topography on the cooling effect of green areas on surrounding urban areas. Urban For. Urban Green. 12 , 426–434 (2013).
Sun, X. et al. Quantifying landscape-metrics impacts on urban green-spaces and water-bodies cooling effect: the study of Nanjing, China. Urban For. Urban Green. 55 , 126838 (2020).
Zhang, Q., Zhou, D., Xu, D. & Rogora, A. Correlation between cooling effect of green space and surrounding urban spatial form: Evidence from 36 urban green spaces. Build. Environ. 222 , 109375 (2022).
Pesaresi, M., Politis, P. GHS-BUILT-H R2023A - GHS building height, derived from AW3D30, SRTM30, and Sentinel2 composite (2018) . European Commission, Joint Research Centre (JRC) https://doi.org/10.2905/85005901-3A49-48DD-9D19-6261354F56FE (2023).
Yamazaki, D. et al. A high‐accuracy map of global terrain elevations. Geophys. Res. Lett. 44 , 5844–5853 (2017).
Wessel, P. & Smith, W. H. A global, self‐consistent, hierarchical, high‐resolution shoreline database. J. Geophys. Res. Solid Earth 101 , 8741–8743 (1996).
Ren et al. climatic map studies: a review. Int. J. Climatol. 31 , 2213–2233 (2011).
Zhou, X. et al. Evaluation of urban heat islands using local climate zones and the influence of sea-land breeze. Sust. Cities Soc. 55 , 102060 (2020).
Zhou, W., Huang, G. & Cadenasso, M. L. Does spatial configuration matter? Understanding the effects of land cover pattern on land surface temperature in urban landscapes. Landsc. Urban Plan 102 , 54–63 (2011).
Muñoz Sabater, J. ERA5-Land monthly averaged data from 1981 to present . Copernicus Climate Change Service (C3S) Climate Data Store (CDS) https://doi.org/10.24381/cds.68d2bb30 (2019).
Abatzoglou, J. T., Dobrowski, S. Z., Parks, S. A. & Hegewisch, K. C. TerraClimate, a high-resolution global dataset of monthly climate and climatic water balance from 1958–2015. Sci. Data 5 , 1–12 (2018).
Kummu, M., Taka, M. & Guillaume, J. H. Gridded global datasets for gross domestic product and Human Development Index over 1990–2015. Sci. Data 5 , 1–15 (2018).
Zanaga, D. et al. ESA WorldCover 10 m 2020 v100. https://doi.org/10.5281/zenodo.5571936 (2021).
McNally, A. et al. A land data assimilation system for sub-Saharan Africa food and water security applications. Sci. Data 4 , 1–19 (2017).
Schaaf C., & Wang Z. MODIS/Terra+Aqua BRDF/Albedo Daily L3 Global - 500m V061 . NASA EOSDIS Land Processes Distributed Active Archive Center. https://doi.org/10.5067/MODIS/MCD43A3.061 (2021).
Lyapustin A., & Wang Y. MODIS/Terra+Aqua Land Aerosol Optical Depth Daily L2G Global 1km SIN Grid V061 . NASA EOSDIS Land Processes Distributed Active Archive Center. https://doi.org/10.5067/MODIS/MCD19A2.061 (2022).
Li, M., Wang, Y., Rosier, J. F., Verburg, P. H. & Vliet, J. V. Global maps of 3D built-up patterns for urban morphological analysis. Int. J. Appl. Earth Obs. Geoinf. 114 , 103048 (2022).
Elvidge, C. D., Baugh, K., Zhizhin, M., Hsu, F. C. & Ghosh, T. VIIRS night-time lights. Int. J. Remote Sens. 38 , 5860–5879 (2017).
Zhou, W. et al. Urban tree canopy has greater cooling effects in socially vulnerable communities in the US. One Earth 4 , 1764–1775 (2021).
Beck, H. E. et al. Present and future Köppen-Geiger climate classification maps at 1-km resolution. Sci. Data 5 , 1–12 (2018).
R. Core Team. R: A Language and Environment for Statistical Computing . R Foundation for Statistical Computing, Vienna, Austria. https://www.R-project.org/ (2023).
Fox J., & Weisberg S. An R Companion to Applied Regression 3rd edn (Sage, 2019). https://socialsciences.mcmaster.ca/jfox/Books/Companion/ .
Lefcheck, J. S. piecewiseSEM: Piecewise structural equation modelling in r for ecology, evolution, and systematics. Methods Ecol. Evol. 7 , 573–579 (2016).
Zeileis, A. ineq: Measuring Inequality, Concentration, and Poverty. R package version 0.2-13. https://CRAN.R-project.org/package=ineq (2014).
We thank all the data providers. We thank Marten Scheffer for valuable discussion. C.X. is supported by the National Natural Science Foundation of China (Grant No. 32061143014). J.-C.S. was supported by Center for Ecological Dynamics in a Novel Biosphere (ECONOVO), funded by Danish National Research Foundation (grant DNRF173), and his VILLUM Investigator project “Biodiversity Dynamics in a Changing World”, funded by VILLUM FONDEN (grant 16549). W.Z. was supported by the National Science Foundation of China through Grant No. 42225104. T.M.L. and J.F.A. are supported by the Open Society Foundations (OR2021-82956). W.J.R. is supported by the funding received from Roger Worthington.
Authors and affiliations.
School of Life Sciences, Nanjing University, Nanjing, China
Yuxiang Li, Shuqing N. Teng & Chi Xu
Center for Ecological Dynamics in a Novel Biosphere (ECONOVO), Department of Biology, Aarhus University, Aarhus, Denmark
Jens-Christian Svenning
State Key Laboratory of Urban and Regional Ecology, Research Center for Eco-Environmental Sciences, Chinese Academy of Sciences, Beijing, China
University of Chinese Academy of Sciences, Beijing, China
Beijing Urban Ecosystem Research Station, Research Center for Eco-Environmental Sciences, Chinese Academy of Sciences, Beijing, China
School for Environment and Sustainability, University of Michigan, Ann Arbor, MI, USA
Global Systems Institute, University of Exeter, Exeter, UK
Jesse F. Abrams & Timothy M. Lenton
Department of Forest Ecosystems and Society, Oregon State University, Corvallis, OR, USA
William J. Ripple
Department of Environmental Science and Engineering, Fudan University, Shanghai, China
Department of Applied Ecology, North Carolina State University, Raleigh, NC, USA
Robert R. Dunn
Y.L., S.N.T., R.R.D., and C.X. designed the study. Y.L. collected the data, generated the code, performed the analyses, and produced the figures with inputs from J.-C.S., W.Z., K.Z., J.F.A., T.M.L., W.J.R., Z.Y., S.N.T., R.R.D. and C.X. Y.L., S.N.T., R.R.D. and C.X. wrote the first draft with inputs from J.-C.S., W.Z., K.Z., J.F.A., T.M.L., W.J.R., and Z.Y. All coauthors interpreted the results and revised the manuscript.
Correspondence to Shuqing N. Teng, Robert R. Dunn or Chi Xu.
Competing interests.
The authors declare no competing interests.
Peer review information.
Nature Communications thanks Chris Webster and the other, anonymous, reviewer(s) for their contribution to the peer review of this work. A peer review file is available.
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/ .
Cite this article.
Li, Y., Svenning, JC., Zhou, W. et al. Green spaces provide substantial but unequal urban cooling globally. Nat Commun 15 , 7108 (2024). https://doi.org/10.1038/s41467-024-51355-0
Received: 06 December 2023
Accepted: 05 August 2024
Published: 02 September 2024
DOI: https://doi.org/10.1038/s41467-024-51355-0
Respiratory Research volume 25, Article number: 329 (2024)
Preserved ratio impaired spirometry (PRISm) is considered a precursor of chronic obstructive pulmonary disease (COPD). A radiomics nomogram can effectively identify PRISm subjects among non-COPD subjects, especially during large-scale CT lung cancer screening.
A total of 1481 participants (864, 370 and 247 in the training, internal validation and external validation cohorts, respectively) were included. The whole lung was segmented on thin-section computed tomography (CT) with a fully automated segmentation algorithm. PyRadiomics was used to extract radiomics features, and clinical features were also obtained. Spearman correlation analysis, minimum redundancy maximum relevance (mRMR) feature ranking and the least absolute shrinkage and selection operator (LASSO) classifier were used to select features and build the radiomics signature. A nomogram incorporating clinical features and the radiomics signature was constructed through multivariable logistic regression. Finally, calibration, discrimination and clinical usefulness were assessed in the validation cohorts.
The radiomics signature, which comprised 14 stable features, was associated with PRISm in the training and validation cohorts (p < 0.001). The radiomics nomogram incorporating the independent predictors (radiomics signature, age, BMI and gender) discriminated PRISm subjects among non-COPD subjects better than the clinical model or the radiomics signature alone in the training cohort (AUC 0.787 vs. 0.675 vs. 0.778), the internal validation cohort (AUC 0.773 vs. 0.682 vs. 0.767) and the external validation cohort (AUC 0.702 vs. 0.610 vs. 0.699). Decision curve analysis suggested that the radiomics nomogram outperformed the clinical model.
The CT-based whole-lung radiomics nomogram could identify PRISm and support clinical decision-making.
Identifying PRISm subjects among non-COPD subjects, especially in the context of large-scale CT lung cancer screening, is currently a challenge.
In this retrospective, multicenter study of 1481 subjects, a radiomics nomogram that integrated the radiomics signature and clinical features achieved good performance in identifying PRISm, with AUCs of 0.787, 0.773 and 0.702 in the training, internal validation and external validation cohorts, respectively.
The radiomics nomogram is a promising tool for identifying PRISm subjects among non-COPD subjects. It holds great potential for guiding timely treatment and shows the added value of chest CT in evaluating lung function status beyond morphological evaluation, especially during large-scale CT lung cancer screening.
Chronic obstructive pulmonary disease (COPD) is a primary cause of morbidity and mortality and the third leading cause of death globally, resulting in tremendous health care, social and economic burdens [1, 2, 3]. The COPD burden may increase significantly in the coming decades owing to the rapid aging of the Chinese population. According to recent demographic data from the National Bureau of Statistics of China, the proportion of the population aged 65 and above is projected to increase from 12.6% in 2020 to 28.1% by 2050 [4]. This rapid aging trend could significantly heighten the COPD burden in China relative to other populations. A recent meta-analysis found that the prevalence of COPD rises markedly with age, from 4.37% (95% CI 2.76%–6.33%) among individuals aged 40–49 years to 24.03% (95% CI 20.04%–28.26%) in those aged 70 years and older [5]. Moreover, China has been reported to bear the largest absolute economic burden of COPD in the world, alone accounting for 83.5% of the economic losses in upper-middle-income countries [6]. Early screening and identification of COPD can prevent disease progression and reduce health and economic burdens.
Preserved ratio impaired spirometry (PRISm), also known as restrictive pattern or unclassified spirometry, is defined as a forced expiratory volume in 1 second (FEV1) of less than 80% predicted despite a normal or preserved FEV1/forced vital capacity (FVC) ratio (≥ 0.70) [7]. PRISm can transition to normal, obstructive or restrictive spirometry over time [8], and its prognostic significance has been increasingly recognized [9, 10, 11, 12]. In population-based studies of Western populations, PRISm is associated with an increased rate of airflow limitation (AFL) and an increased mortality risk relative to normal lung function [9, 10, 11, 12]. Similar observations have been reported from Asia [13]. In a community survey of 3032 Japanese people with a 5-year follow-up [13], 31 with PRISm at the first visit showed increased overall mortality and a higher probability of progression to COPD compared with people with normal spirometry. Finding markers that can accurately identify PRISm and provide a foundation for early prognosis prediction is therefore of great significance for improving clinical management.
Although pulmonary function tests (PFTs) remain the gold standard for diagnosing PRISm, their utilization rate is relatively low. In China, only 6.7% of individuals over 40 have undergone PFTs, and limited accessibility leaves a significant number of cases undiagnosed [14]. Imaging, particularly chest CT, offers considerable advantages and potential. With the widespread adoption of lung cancer screening programs, CT is used more often and provides detailed anatomical information. Large-scale screening studies such as NELSON and NLST have shown reduced lung cancer mortality among people undergoing CT screening [15], which offers greater survival chances for cancer patients. Screening for PRISm at the same time as lung cancer screening could enable more timely medical intervention and bring greater benefit to the population [16].
In recent years, radiomics has attracted increasing attention. It refers to the conversion of medical images into high-dimensional, mineable data through high-throughput quantitative feature extraction, followed by data analysis to support decision-making [17]. Radiomics has been used to identify chest diseases and evaluate prognosis [18, 19, 20, 21, 22]. Recently, several imaging studies have explored the imaging features of PRISm and the significance of quantitative high-resolution CT (HRCT) in early diagnosis. However, to the best of our knowledge, no study has yet investigated the relationship between radiomics and PRISm. The purpose of this study was to investigate the performance of CT radiomics in identifying PRISm subjects among non-COPD subjects with one-stop CT screening.
A total of 1513 subjects with PFT results from five hospitals were retrospectively recruited between February 2013 and December 2022. The trial was registered in the Chinese Clinical Trial Registry on 29 March 2023 (Number: ChiCTR2300069929, URL: https://www.chictr.org.cn/showproj.html?proj=192439). Subjects were enrolled based on the following inclusion criteria: (1) both chest CT and PFT were performed in the same hospital; (2) the interval between PFT and chest CT was less than 2 weeks; (3) complete thin-section (< 2 mm) chest CT images were available; (4) the postbronchodilator FEV1/FVC was ≥ 0.7. The exclusion criteria were as follows: (1) other comorbid thoracic disease (e.g., pneumonia, pulmonary atelectasis, lung nodules larger than 6 mm or masses, asthma, and pleural effusion); (2) concomitant malignant neoplasms; and (3) artifacts. Finally, 1481 subjects were included in the study. Among them, 1234 subjects from one hospital were randomly divided into a training cohort (n = 864) and an internal validation cohort (n = 370) at a ratio of 7:3 using the “caret” R package. Subjects from the other four hospitals formed the independent external validation cohort (n = 247). Figure 1 displays the subject screening workflow. Basic clinical information, including age, sex, height, weight, BMI and smoking status, was obtained from the electronic medical records system. The institutional review board of the leading hospital approved this retrospective study and, owing to its retrospective nature, informed consent was waived.
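The 7:3 random division described above can be illustrated with a minimal sketch. The study used the “caret” R package; the Python version below is only an analogous illustration, and the function name, the seed and the placeholder subject IDs are assumptions rather than details from the study.

```python
import random


def split_train_validation(subject_ids, train_fraction=0.7, seed=2024):
    """Randomly split subject IDs into training and internal validation sets.

    The seed is a hypothetical value chosen only to make the sketch reproducible.
    """
    ids = list(subject_ids)
    rng = random.Random(seed)
    rng.shuffle(ids)
    n_train = round(train_fraction * len(ids))
    return ids[:n_train], ids[n_train:]


# 1234 placeholder IDs stand in for the subjects from the single hospital.
train_ids, validation_ids = split_train_validation(range(1234))
print(len(train_ids), len(validation_ids))  # 864 370
```

Rounding 70% of 1234 gives 864, which matches the reported cohort sizes; whether the original split was stratified by any variable is not stated in the text.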
Flowchart for the selection of the study population
Table S1 details the CT acquisition parameters and pulmonary function test apparatus. Lung function was categorized in line with modified GOLD criteria and prior studies [12, 23]. Based on the PFT results, the non-COPD subjects in the training, internal validation and independent external validation cohorts were classified into PRISm and normal spirometry groups. Normal spirometry was defined as FEV1/FVC ≥ 0.70 and FEV1 ≥ 80% predicted; PRISm was defined as FEV1/FVC ≥ 0.70 and FEV1 < 80% predicted.
A pretrained convolutional neural network (CNN) with a U-Net structure was used to process the chest CT images of each subject [24]. First, the right and left lungs were automatically segmented using a publicly available deep-learning model, U-net (R231) (https://github.com/JoHof/lungmask), which was trained on several large-scale datasets covering broad visual variability. Second, the right and left lungs were merged into a combined region of interest (ROI) (Figure S1). Third, a chest radiologist with 8 years of experience visually examined the segmentation results using ITK-SNAP software (version 3.8.0, www.itksnap.org) and manually corrected any inaccurate segmentation.
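The spirometry definitions above amount to a two-threshold rule, sketched below. The function name and the "excluded" label for subjects with FEV1/FVC < 0.70 (who fall outside the non-COPD study population) are illustrative assumptions.

```python
def classify_spirometry(fev1_fvc_ratio, fev1_percent_predicted):
    """Classify a subject using the thresholds stated in the text.

    fev1_fvc_ratio: postbronchodilator FEV1/FVC as a fraction (e.g. 0.75)
    fev1_percent_predicted: FEV1 as a percentage of the predicted value
    """
    if fev1_fvc_ratio < 0.70:
        # Does not meet the inclusion criterion FEV1/FVC >= 0.7.
        return "excluded"
    if fev1_percent_predicted < 80.0:
        return "PRISm"
    return "normal"


print(classify_spirometry(0.75, 72.0))  # PRISm
print(classify_spirometry(0.75, 95.0))  # normal
print(classify_spirometry(0.65, 60.0))  # excluded
```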
Three image preprocessing steps were applied before radiomics feature extraction. First, linear interpolation was used to resample the images to 1 mm × 1 mm × 1 mm. Second, gray-level discretization was used to convert continuous image intensities to discrete integer values, with a bin width of 25 to reduce the effect of imaging noise. Third, wavelet and log image filters were applied to eliminate mixed noise introduced during image digitization and to obtain high- or low-frequency features. Lung radiomics features were extracted with the open-source Pyradiomics software (version 3.0.1, https://pyradiomics.readthedocs.io/en/latest/). A total of 1218 features were obtained from each volume of interest, including first-order, gray-level cooccurrence matrix (GLCM), gray-level run length matrix (GLRLM), gray-level size zone matrix (GLSZM), gray-level dependence matrix (GLDM), and shape features. The radiomics features comply with the image biomarker standardization initiative (IBSI) [25], including standardization of acquisition and feature extraction, guidelines for annotation, segmentation, feature selection, model building and validation and clinical implementation. The features were normalized with the Z score method, which removes differences in numerical scale.
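Two of the steps above, fixed-bin-width discretization and Z score normalization, can be sketched in a few lines. This is an illustration only: it assumes bins are anchored as floor(x / W) − floor(min / W) + 1, a common fixed-bin-width convention, and it uses the sample standard deviation for the Z score; neither detail is stated in the text, and the intensity values are hypothetical.

```python
import math
import statistics


def discretize_fixed_bin_width(intensities, bin_width=25):
    """Map intensities to integer bin indices starting at 1.

    Assumed convention: floor(x / W) - floor(min(x) / W) + 1.
    """
    lowest_bin = math.floor(min(intensities) / bin_width)
    return [math.floor(v / bin_width) - lowest_bin + 1 for v in intensities]


def z_score(values):
    """Standardize one feature across subjects (sample SD is an assumption)."""
    mean = statistics.fmean(values)
    sd = statistics.stdev(values)
    return [(v - mean) / sd for v in values]


# Hypothetical voxel intensities inside the lung ROI.
voxels = [-1000, -990, -976, -975, -850, -500]
print(discretize_fixed_bin_width(voxels))  # [1, 1, 1, 2, 7, 21]

# Hypothetical values of one feature for three subjects.
print(z_score([1.0, 2.0, 3.0]))  # [-1.0, 0.0, 1.0]
```

After Z scoring, each feature has mean 0 and unit standard deviation, so features measured on very different numerical scales become comparable before selection.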
The optimal radiomics features were selected in three steps. First, redundant features with a correlation coefficient > 0.90 were eliminated. Second, the minimum redundancy maximum relevance (mRMR) algorithm was used to eliminate irrelevant or redundant features. mRMR has been shown to be an efficient and reliable feature selection method for radiomics; it finds the optimal subset of features by considering both the importance of the features and the correlation between them. Third, the least absolute shrinkage and selection operator (LASSO), an embedded approach that has been extensively applied to select high-dimensional radiomics features [26], was used. The penalty parameter of the LASSO regression was tuned by 10-fold cross-validation. The feature set with the lowest cross-validation binomial deviance was chosen, and the nonzero coefficients were taken as the weights of the selected features, representing the relation of each feature to PRISm. The Radscore of each subject was calculated as a linear combination of the selected features weighted by their coefficients. The radiomics model was then constructed.
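The first and last steps of this selection pipeline can be sketched as follows. This is not the authors' code: the feature names, values, coefficients and intercept are hypothetical, the filter uses Pearson correlation for brevity (the abstract mentions Spearman correlation analysis), and the greedy keep-first ordering is an assumption.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def drop_correlated(columns, threshold=0.90):
    """Greedily keep a feature only if |r| <= threshold with all kept features."""
    kept = []
    for name, values in columns.items():
        if all(abs(pearson(values, columns[k])) <= threshold for k in kept):
            kept.append(name)
    return kept


def radscore(features, coefficients, intercept=0.0):
    """Linear combination of the selected features and their coefficients."""
    return intercept + sum(w * features[name] for name, w in coefficients.items())


# Hypothetical feature columns (one value per subject).
columns = {
    "feature_a": [1.0, 2.0, 3.0, 4.0],
    "feature_b": [2.0, 4.0, 6.0, 8.1],  # nearly a multiple of feature_a
    "feature_c": [4.0, 1.0, 3.0, 2.0],
}
print(drop_correlated(columns))  # ['feature_a', 'feature_c']

# Hypothetical nonzero LASSO coefficients and one subject's feature values.
weights = {"feature_a": 0.5, "feature_c": -0.25}
subject = {"feature_a": 1.2, "feature_c": 0.4}
print(round(radscore(subject, weights, intercept=-0.1), 3))  # 0.4
```

The mRMR ranking and the cross-validated LASSO fit sit between these two functions and are omitted here; in the study they were run with dedicated R packages.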
Three models were built: a clinical model, a radiomics model and a combined model. Statistically significant risk variables were identified through univariable logistic regression and then entered into multivariable regression to construct the clinical and combined models. A nomogram was then constructed to visualize the combined model, graphically evaluate variable importance and calculate prediction accuracy. The areas under the ROC curve (AUCs) of the three models were compared using the DeLong test. Nomogram calibration was assessed with calibration curves (Hosmer-Lemeshow test). The clinical practicability of the nomogram was assessed by decision curve analysis (DCA).
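As a reminder of what the compared AUCs measure, the sketch below computes an AUC as the probability that a randomly chosen PRISm subject receives a higher predicted score than a randomly chosen subject with normal spirometry (ties count one half). The scores are hypothetical, and the DeLong test itself, which compares two correlated AUCs, is not implemented here.

```python
def auc(scores_positive, scores_negative):
    """AUC as the fraction of positive/negative pairs ranked correctly."""
    wins = 0.0
    for p in scores_positive:
        for n in scores_negative:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties contribute half a win
    return wins / (len(scores_positive) * len(scores_negative))


# Hypothetical predicted probabilities of PRISm.
prism_scores = [0.9, 0.8, 0.4]
normal_scores = [0.7, 0.4, 0.2, 0.1]
print(auc(prism_scores, normal_scores))  # 0.875
```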
Statistical analysis was performed with R software (version 4.2.2; http://www.Rproject.org) and IBM SPSS Statistics (version 26.0; IBM Corp., New York, USA). Measurement data are presented as mean ± standard deviation. Continuous variables of normal distribution were evaluated with the Mann-Whitney/Wilcoxon nonparametric test. Categorical data were compared between groups with the chi-square test. Independent predictors were determined from the clinical variables by multivariate logistic regression. P < 0.05 was considered statistically significant. The “glmnet” package was used for LASSO analysis, the “caret” package for random division, the “rms” package for drawing calibration plots and conducting multivariate logistic regression, and the “pROC” package for drawing the receiver operating characteristic (ROC) curve of the radiomics signature. The “rmda” package was used for DCA to evaluate net benefit, defined as the proportion of true positives minus the proportion of false positives, with the latter weighted by the relative harm of false-positive and false-negative results.
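The net benefit definition above is commonly written as TP/n − (FP/n) × pt/(1 − pt), where pt is the threshold probability and pt/(1 − pt) expresses the relative harm of a false positive versus a false negative. The sketch below assumes that standard form; the counts and threshold are hypothetical and not taken from the study.

```python
def net_benefit(true_positives, false_positives, n_subjects, threshold_probability):
    """Net benefit at one threshold probability on a decision curve."""
    harm_ratio = threshold_probability / (1.0 - threshold_probability)
    return (true_positives / n_subjects
            - (false_positives / n_subjects) * harm_ratio)


# Hypothetical example: 200 subjects, 60 true positives and 30 false
# positives when everyone with predicted risk >= 0.25 is flagged.
print(round(net_benefit(60, 30, 200, 0.25), 4))  # 0.25
```

A decision curve repeats this calculation over a range of threshold probabilities and compares the model with the "treat all" and "treat none" strategies.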
Table 1 displays the basic demographic features of the subjects. A total of 864 non-COPD subjects were in the training cohort (500 males, 364 females; average age, 60.3 ± 12.7 years), 370 subjects were in the internal validation cohort (207 males, 163 females; average age, 60.9 ± 11.8 years) and 247 subjects were in the external validation cohort (163 males, 84 females; average age, 61.6 ± 12.4 years). The proportions of PRISm subjects were 37.5% (324 of 864), 42.2% (156 of 370) and 40.5% (100 of 247) in the training, internal validation and external validation cohorts, respectively. Gender and BMI differed significantly in all three cohorts (p < 0.05). Age and smoking status differed significantly in the training and internal validation cohorts (p < 0.001), but not in the external validation cohort. In addition, the age distribution did not differ significantly between the two validation cohorts. Table 1 displays the results of univariate and multivariate logistic regression. Age, BMI and gender, identified by multivariable regression, were included to develop the clinical model. Figure 2 presents the ROC curves for the clinical model. The corresponding AUCs were 0.675 (95% CI: 0.637, 0.712), 0.682 (95% CI: 0.627, 0.737) and 0.610 (95% CI: 0.538, 0.682) in the training, internal validation and external validation cohorts, respectively.
Diagnostic performance of the clinical factors model, radiomics signature, and radiomics nomogram was assessed and compared through ROC curves in the training ( A ), internal validation ( B ) and external validation ( C ) cohorts. ROC = receiver operating characteristics; AUC = area under the receiver operating characteristic curve
Of the 1218 radiomics features extracted from chest CT images, 245 exhibited high stability; these were reduced to 30 features by minimum redundancy maximum relevance (mRMR). Finally, LASSO was applied for feature selection (Fig. 3 A, B), and the 14 features with the highest importance were retained (Fig. 3 C). The formulas for calculating the Radscore are listed in Supplementary Results .
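As an illustration of what a Radscore formula usually looks like, the sketch below assumes the common form in which the score is an intercept plus the sum of the selected features weighted by their LASSO coefficients. The actual formula is given only in the Supplementary Results; the function name, feature names, coefficients, and values here are all hypothetical.

```python
from typing import Mapping


def radscore(features: Mapping[str, float],
             coefficients: Mapping[str, float],
             intercept: float) -> float:
    """Intercept plus the coefficient-weighted sum of the selected features."""
    missing = set(coefficients) - set(features)
    if missing:
        raise KeyError(f"missing feature values: {sorted(missing)}")
    # Only features with a coefficient contribute; extra features are ignored.
    return intercept + sum(coefficients[name] * features[name]
                           for name in coefficients)


# Hypothetical coefficients and standardized feature values (not from the study).
coefs = {"wavelet_feature_a": 0.8, "wavelet_feature_b": -0.5}
values = {"wavelet_feature_a": 1.5, "wavelet_feature_b": 2.0, "unused_feature": 9.0}
score = radscore(values, coefs, intercept=-0.3)
print(round(score, 2))
```

Features that LASSO shrank to zero simply do not appear in the coefficient table, so they have no influence on the score.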
Radiomics feature selection by using the least absolute shrinkage and selection operator (LASSO) logistic regression. ( A ) Selection of the tuning parameter (λ) in the LASSO model via 10-fold cross-validation based on minimum criteria. Binomial deviances from the LASSO regression cross-validation model are plotted as a function of log(λ). The y-axis shows binomial deviances and the lower x-axis the log(λ). Numbers along the upper x-axis indicate the average number of predictors. Red dots indicate average deviance values for each model with a given λ, and vertical bars through the red dots indicate the upper and lower values of the deviances. The vertical black lines define the optimal values of λ, where the model provides its best fit to the data. ( B ) The coefficients have been plotted vs. log(λ). ( C ) The 14 features with nonzero coefficients are shown in the plot
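The way a larger λ drives more coefficients to exactly zero can be illustrated with the soft-thresholding operator. This is a deliberately simplified sketch: it shows the closed-form LASSO solution for a single coefficient in a least-squares setting, not the cross-validated LASSO logistic regression fitted with “glmnet” in the study. The input coefficients are hypothetical.

```python
def soft_threshold(z: float, lam: float) -> float:
    """Minimiser of 0.5 * (z - b)**2 + lam * abs(b) over b."""
    if lam < 0:
        raise ValueError("lam must be non-negative")
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0  # coefficients within [-lam, lam] are set exactly to zero


# Hypothetical unpenalised coefficients for four features.
unpenalised = [2.0, -1.2, 0.4, -0.1]
for lam in (0.0, 0.5, 1.5):
    shrunk = [soft_threshold(z, lam) for z in unpenalised]
    n_nonzero = sum(1 for b in shrunk if b != 0.0)
    print(f"lambda={lam}: {n_nonzero} nonzero coefficients")
```

As λ grows, fewer coefficients remain nonzero, which is the pattern behind the decreasing predictor counts along the upper x-axis of panel A.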
Figure 2 displays ROC curves for radiomics signature. The AUCs of our radiomics signature were 0.778 (95% CI: 0.746, 0.810), 0.767 (95% CI: 0.718, 0.815) and 0.699 (95% CI: 0.633, 0.766) for training, internal and external validation cohorts, respectively.
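For readers less familiar with the AUC, the sketch below computes it as the proportion of positive-negative pairs in which the positive subject has the higher score, with ties counted as one half. This is equivalent to the area under the empirical ROC curve. The scores are hypothetical and the function name is illustrative; the study itself used the R package “pROC”.

```python
from typing import Sequence


def auc(scores_pos: Sequence[float], scores_neg: Sequence[float]) -> float:
    """AUC as the fraction of correctly ordered positive-negative pairs."""
    if not scores_pos or not scores_neg:
        raise ValueError("both groups must be non-empty")
    wins = 0.0
    for p in scores_pos:
        for q in scores_neg:
            if p > q:
                wins += 1.0   # positive ranked above negative
            elif p == q:
                wins += 0.5   # ties count as half
    return wins / (len(scores_pos) * len(scores_neg))


# Hypothetical Radscores for PRISm (positive) and normal spirometry (negative) subjects.
prism = [1.2, 0.4, 2.5, -0.3]
normal = [-1.0, 0.4, -0.8, 0.1, -2.0]
print(auc(prism, normal))
```

An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation of the two groups.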
The Radscore and clinical features were combined to develop the radiomics nomogram in the training cohort ( Fig. 4 A). The calibration curves of the radiomics nomogram showed good agreement between the predicted and observed probabilities of PRISm (Fig. 4 B, C, D). The Hosmer–Lemeshow test yielded P -values of 0.9995, 0.4521, and 0.1049 for the training, internal validation, and external validation cohorts, respectively; none was significant, indicating no significant departure of the nomogram predictions from the actual observations. Figure 2 shows the ROC curves for the radiomics nomogram. AUC was used as an index of diagnostic accuracy; a higher AUC reflects greater accuracy. The AUC, sensitivity, specificity, and accuracy of the nomogram were 0.787 (95% CI: 0.756, 0.818), 75.0%, 68.5%, and 70.9% in the training cohort; 0.773 (95% CI: 0.725, 0.821), 64.1%, 78.8%, and 71.6% in the internal validation cohort; and 0.702 (95% CI: 0.636, 0.767), 63.9%, 68.0%, and 65.6% in the external validation cohort.
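The relationship between sensitivity, specificity, and accuracy can be checked with a small worked example. The confusion-matrix counts below are not reported in the text; they are back-calculated, as an assumption, from the training cohort size (324 PRISm and 540 normal spirometry subjects) and the reported 75.0% sensitivity and 68.5% specificity. The function name is illustrative.

```python
def diagnostic_metrics(tp: int, fn: int, tn: int, fp: int) -> dict:
    """Sensitivity, specificity and accuracy from confusion-matrix counts."""
    total = tp + fn + tn + fp
    return {
        "sensitivity": tp / (tp + fn),   # true positives among all positives
        "specificity": tn / (tn + fp),   # true negatives among all negatives
        "accuracy": (tp + tn) / total,   # all correct calls among all subjects
    }


# Assumed counts: 243 of 324 PRISm subjects and 370 of 540 normal subjects
# classified correctly (inferred from the reported percentages, not reported).
metrics = diagnostic_metrics(tp=243, fn=81, tn=370, fp=170)
for name, value in metrics.items():
    print(f"{name}: {value:.1%}")
```

Under these assumed counts, the accuracy works out to 613 of 864 subjects, which rounds to the reported 70.9%.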
Radiomics nomogram, calibration curves and DCA curves. ( A ) The radiomics nomogram, combining age, BMI, gender and Radscore, was developed in the training cohort. (B–D) The nomogram calibration curves in training ( B ), internal validation ( C ), and external validation ( D ) cohorts. Calibration curves indicate the goodness-of-fit of the model. ( E ) Decision curve analysis for different models
Table 2 displays the diagnostic accuracy of the models. ROC curves were compared using the DeLong test, which showed a statistically significant difference in AUC between the radiomics nomogram and the clinical model (Z = 6.64, p < 0.001; Z = 3.72, p < 0.001; and Z = 2.46, p = 0.014 in the training, internal validation, and external validation cohorts, respectively). The radiomics signature and the radiomics nomogram differed significantly in the training cohort (Z = -2.01, p = 0.044) but not in the internal validation cohort (Z = -0.84, p = 0.401) or the external validation cohort (Z = 0.18, p = 0.855). The correlation between the radiomics signature and the clinical model was moderate in the training and internal validation cohorts ( R = 0.4) and weak in the external validation cohort ( R = 0.2).
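The link between a reported Z statistic and its p-value can be sketched as follows. This covers only the final step of the DeLong test, in which Z is referred to a standard normal distribution to obtain a two-sided p-value; estimating Z itself from the AUC variances and covariance is not shown. The function name is illustrative.

```python
import math


def two_sided_p(z: float) -> float:
    """Two-sided p-value for a standard normal test statistic."""
    # erfc(|z| / sqrt(2)) equals 2 * (1 - Phi(|z|)) for the normal CDF Phi.
    return math.erfc(abs(z) / math.sqrt(2.0))


# Z statistics quoted in the text for three of the comparisons.
for z in (2.46, -2.01, -0.84):
    print(z, round(two_sided_p(z), 3))
```

For these three Z values the computed p-values round to the figures quoted in the text (0.014, 0.044, and 0.401).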
Decision curve analysis was used to evaluate the clinical utility of the nomogram ( Fig. 4 E ). The nomogram provided a greater net benefit than the “treat all” strategy, the “treat none” strategy, and the clinical model when the threshold probability was in the range of 4–70%. An example of the nomogram in use is shown in Fig. 5 . As in a points-based scoring system, each predictor of PRISm is assigned points that reflect its contribution to the risk of PRISm. For each predictor (BMI, age, gender, and Radscore), the points are read by projecting upward from the patient’s value to the points scale at the top. The points for all predictors are then summed to give a total score, and the total score is converted to the probability of PRISm by reading the corresponding value on the total points scale.
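A nomogram of this kind is a graphical form of a logistic regression model: the points are a rescaling of each predictor's contribution to the linear predictor, and the total-points scale maps that sum to a probability. The sketch below shows the underlying calculation under that assumption. The coefficients and the patient values are hypothetical, since the fitted values are not given in the text.

```python
import math


def linear_predictor(age: float, bmi: float, female: bool,
                     radscore: float, coef: dict) -> float:
    """Sum of each predictor's contribution plus the intercept."""
    return (coef["intercept"]
            + coef["age"] * age
            + coef["bmi"] * bmi
            + coef["female"] * (1.0 if female else 0.0)
            + coef["radscore"] * radscore)


def predicted_probability(lp: float) -> float:
    """Logistic transform from the linear predictor to a probability."""
    return 1.0 / (1.0 + math.exp(-lp))


# Hypothetical coefficients, for illustration only.
coef = {"intercept": -4.0, "age": 0.02, "bmi": 0.08,
        "female": 0.5, "radscore": 0.9}
lp = linear_predictor(age=60, bmi=25.0, female=True, radscore=0.0, coef=coef)
prob = predicted_probability(lp)
print(round(lp, 2), round(prob, 3))
```

Reading the nomogram by hand and evaluating the fitted model directly should give the same probability, up to the precision of the printed scales.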
An example of the nomogram in clinical practice, illustrating how the risk of PRISm is scored. ( A ) Thin-section chest CT image of a 49-year-old normal female subject. Her clinical features were as follows: BMI = 19.5 kg/m², Radscore = -2.64. After summing all points, the nomogram gave this subject a total of 174 points, corresponding to a probability of PRISm of approximately 4.00%. ( B ) Thin-section chest CT image of a 43-year-old male subject. His clinical features were as follows: BMI = 30.80 kg/m², Radscore = 4.30. After summing all points, the nomogram gave this subject a total of 225 points, corresponding to a probability of PRISm of approximately 96.9%
Identifying subjects with PRISm is important for early, effective, and individualized decision-making in the prevention of COPD, because many subjects with PRISm progress to COPD. In the present study, a radiomics nomogram incorporating clinical factors and a radiomics signature based on whole-lung CT radiomics was established and validated to identify PRISm subjects. The nomogram showed favorable discrimination in the training cohort (AUC, 0.787), internal validation cohort (AUC, 0.773), and external validation cohort (AUC, 0.702), outperforming the radiomics signature (training, 0.778; internal validation, 0.767; external validation, 0.699) and the clinical factor model (training, 0.675; internal validation, 0.682; external validation, 0.610).
The incidence and disease burden of COPD are high in China, yet the overall rate of pulmonary function testing remains low and many people are underdiagnosed. In contrast, chest CT is widely used, especially in large-scale screening for lung cancer, and more and more community health service centers will be equipped with CT. Therefore, the most important clinical scenario for our model is the large-scale lung cancer screening population, which usually does not undergo PFT. Our model can identify many high-risk people (those most likely to develop COPD) in this population, which can help enhance early intervention for PRISm, reduce the socioeconomic burden, and improve patients’ quality of life. Many clinical factors have been explored in PRISm. Female sex, old age, smoking, and extreme weight have been found to be related to PRISm [ 27 ]. In a cross-sectional and follow-up study of former and current smokers in COPDGene [ 10 , 28 ], PRISm patients had a higher BMI than COPD patients and normal subjects, while persistent smoking independently predicted reduced quality of life in COPD patients. Many studies suggest that age and BMI are important risk factors for PRISm [ 7 ]. Older cohorts may show a higher rate of impaired spirometry, particularly during longitudinal follow-up. BMI may increase the risk of PRISm through distinct pathways, including the inflammatory and metabolic effects of adipose tissue [ 29 ]. Previous population-based studies suggest an increased risk of a restrictive pattern among females [ 30 , 31 ]. In our study, age, sex, and BMI were selected as independent predictors of PRISm. Table 3 shows that female subjects with increased age and BMI are more likely to have PRISm, which is consistent with previous studies.
To the best of our knowledge, few studies have addressed the identification of PRISm with CT-based methods. Wei et al. [ 32 ] evaluated CT-based quantitative features with an in-house system and found that lung capacity, emphysema index, and airway wall area did not predict intermediate-stage chronic bronchitis that progresses from normal lung function to early COPD. Moreover, their study did not evaluate CT textural features or relevant clinical factors. To date, studies using radiomics to identify the PRISm population are rare. In our study, a CT-based radiomics nomogram was established by integrating clinical factors and a radiomics signature to identify PRISm. Because it is very difficult to delineate the margins of diffuse lung lesions, fully automatic lung lobe segmentation was performed using U-Net, which has proven effective in pulmonary disease, especially diffuse pulmonary disease [ 33 , 34 , 35 , 36 ]. The radiomics signature included 14 radiomics features and distinguished PRISm subjects from normal spirometry subjects well, with AUCs of 0.778 (95% CI, 0.746–0.810), 0.767 (95% CI, 0.718–0.815), and 0.699 (95% CI, 0.633–0.766) in the training, internal validation, and external validation cohorts, respectively. Most features in our radiomics signature were derived with a wavelet filter, which splits imaging data into different frequency components along the three axes of the whole lung region [ 37 ]. This suggests that wavelet features may capture spatial heterogeneity in the whole lung region at multiple scales. In addition, the constructed radiomics signature was combined with the clinical factors. Lu et al. distinguished PRISm from normal subjects by combining CT quantitative parameters with clinical features, with an AUC of 0.786 [ 38 ]. Our nomogram performed slightly better (AUC = 0.787).
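The wavelet filtering mentioned above can be illustrated with a minimal one-dimensional sketch. It uses one level of the Haar wavelet, chosen here only for simplicity; the wavelet family actually used in the study is not stated in this section, and the real filter operates along all three image axes. The intensity profile is hypothetical.

```python
import math
from typing import List, Tuple


def haar_step(signal: List[float]) -> Tuple[List[float], List[float]]:
    """One level of the orthonormal Haar transform for an even-length signal."""
    if len(signal) % 2 != 0:
        raise ValueError("signal length must be even")
    scale = math.sqrt(2.0)
    # Low-frequency part: scaled sums of neighbouring samples.
    approx = [(signal[i] + signal[i + 1]) / scale
              for i in range(0, len(signal), 2)]
    # High-frequency part: scaled differences of neighbouring samples.
    detail = [(signal[i] - signal[i + 1]) / scale
              for i in range(0, len(signal), 2)]
    return approx, detail


# Hypothetical intensity profile along one axis of an image.
profile = [4.0, 6.0, 10.0, 12.0, 8.0, 8.0, 3.0, 1.0]
low, high = haar_step(profile)
print([round(v, 3) for v in low])
print([round(v, 3) for v in high])
```

The low-frequency output tracks the smooth trend of the profile, while the high-frequency output is large only where neighbouring values differ, which is why texture features computed on such components can reflect heterogeneity at different scales.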
The PRISm prediction nomogram was further evaluated by decision curve analysis to clarify its clinical utility. Decision curve analysis offers insight into clinical outcomes on the basis of threshold probability, from which the net benefit can be derived [ 39 , 40 ]. The results showed that when the threshold probability for a patient is > 4%, using our radiomics nomogram to predict PRISm is more beneficial than either the treat-none or the treat-all-patients scheme. The nomogram thus provides a quantitative indicator and reference for decision-making and the management of treatment regimens for PRISm subjects. This approach relates clinical outcomes to threshold probability, and net benefit is obtained on this basis [ 41 ].
However, this study has several limitations. First, owing to the retrospective and multi-institutional nature of the study, CT acquisition parameters and reconstruction techniques were not consistent. We used techniques such as regularization, normalization, and resampling to mitigate this variability, thereby enhancing the accuracy and reliability of diagnosis. Second, common CT quantitative parameters that provide more information on pulmonary lesions, such as air trapping and pulmonary vascular disease, were not included. In the future, we will incorporate these quantitative features to optimize our prediction model. Third, to keep pace with advances in technology, other advanced deep-learning algorithms should be applied in further studies. Fourth, lung cancer patients were excluded from this study; in future research, we will include lung cancer patients and apply whole-lung radiomics to determine whether they have PRISm.
In summary, a CT radiomics model incorporating clinical factors and a radiomics signature was established and validated to identify PRISm in non-COPD subjects. The radiomics approach may help delay the initiation and progression of COPD.
No datasets were generated or analysed during the current study.
Chronic obstructive pulmonary disease
Global initiative for chronic obstructive lung disease
Pulmonary function test
The ratio of forced expiratory volume in 1 s to forced vital capacity
Preserved Ratio Impaired Spirometry
Airflow limitation
Region of interest
Body mass index
Gray level cooccurrence matrix
Gray level size zone matrix
Gray level dependence matrix
Minimum redundancy maximum relevance
Least absolute shrinkage and selection operator
Decision curve analysis
The area under the curve
The receiver operating characteristic curve
Yin P, Wu J, Wang L, et al. The Burden of COPD in China and its provinces: findings from the global burden of Disease Study 2019. Front Public Health. 2022;10:859499. https://doi.org/10.3389/fpubh.2022.859499 .
Yadav AK, Gu W, Zhang T, Xu X, Yu L. Current perspectives on Biological Therapy for COPD. COPD. 2023;20(1):197–209. https://doi.org/10.1080/15412555.2023.2187210 .
Wang C, Xu J, Yang L, et al. Prevalence and risk factors of chronic obstructive pulmonary disease in China (the China Pulmonary Health [CPH] study): a national cross-sectional study. Lancet. 2018;28(10131):1706–17. https://doi.org/10.1016/s0140-6736(18)30841-9 .
China. https://www.stats.gov.cn/
Al Wachami N, Guennouni M, Iderdar Y, et al. Estimating the global prevalence of chronic obstructive pulmonary disease (COPD): a systematic review and meta-analysis. BMC Public Health. 2024;25(1):297. https://doi.org/10.1186/s12889-024-17686-9 .
Chen S, Kuhn M, Prettner K, et al. The global economic burden of chronic obstructive pulmonary disease for 204 countries and territories in 2020-50: a health-augmented macroeconomic modelling study. Lancet Glob Health. 2023;11(8):e1183–93. https://doi.org/10.1016/s2214-109x(23)00217-6 .
Higbee DH, Granell R, Davey Smith G, Dodd JW. Prevalence, risk factors, and clinical implications of preserved ratio impaired spirometry: a UK Biobank cohort analysis. Lancet Respir Med. 2022;10(2):149–57. https://doi.org/10.1016/s2213-2600(21)00369-6 .
Wan ES. The clinical spectrum of PRISm. Am J Respir Crit Care Med. 2022;1(5):524–5. https://doi.org/10.1164/rccm.202205-0965ED .
Park HJ, Byun MK, Rhee CK, Kim K, Kim HJ, Yoo KH. Significant predictors of medically diagnosed chronic obstructive pulmonary disease in patients with preserved ratio impaired spirometry: a 3-year cohort study. Respir Res. 2018;24(1):185. https://doi.org/10.1186/s12931-018-0896-7 .
Wan ES, Fortis S, Regan EA, et al. Longitudinal phenotypes and mortality in preserved ratio impaired spirometry in the COPDGene Study. Am J Respir Crit Care Med. 2018;1(11):1397–405. https://doi.org/10.1164/rccm.201804-0663OC .
Wan ES, Hokanson JE, Regan EA, et al. Significant spirometric transitions and preserved ratio impaired Spirometry among ever smokers. Chest. 2022;161(3):651–61. https://doi.org/10.1016/j.chest.2021.09.021 .
Wijnant SRA, De Roos E, Kavousi M, et al. Trajectory and mortality of preserved ratio impaired spirometry: the Rotterdam Study. Eur Respir J. 2020;55(1). https://doi.org/10.1183/13993003.01217-2019 .
Washio Y, Sakata S, Fukuyama S, et al. Risks of mortality and airflow limitation in Japanese individuals with preserved ratio impaired spirometry. Am J Respir Crit Care Med. 2022;1(5):563–72. https://doi.org/10.1164/rccm.202110-2302OC .
Tong H, Cong S, Fang LW, et al. [Performance of pulmonary function test in people aged 40 years and above in China, 2019–2020]. Zhonghua Liu Xing Bing Xue Za Zhi. 2023;10(5):727–34. https://doi.org/10.3760/cma.j.cn112338-20230202-00051 .
Oudkerk M, Liu S, Heuvelmans MA, Walter JE, Field JK. Lung cancer LDCT screening and mortality reduction - evidence, pitfalls and future perspectives. Nat Rev Clin Oncol. 2021;18(3):135–51. https://doi.org/10.1038/s41571-020-00432-6 .
Sunyi Zheng, Peter MA, van Ooijen OM. Lung Cancer Screening and Nodule Detection: the role of Artificial Intelligence. In: Artificial Intelligence in cardiothoracic imaging. 2020:459. https://doi.org/10.1007/978-3-030-92087-6_43
Gillies RJ, Kinahan PE, Hricak H. Radiomics: images are more than pictures, they are data. Radiology. 2016;278(2):563–77. https://doi.org/10.1148/radiol.2015151169 .
Huang W, Deng H, Li Z, et al. Baseline whole-lung CT features deriving from deep learning and radiomics: prediction of benign and malignant pulmonary ground-glass nodules. Front Oncol. 2023;13:1255007. https://doi.org/10.3389/fonc.2023.1255007 .
Huang W, Zhang H, Ge Y, et al. Radiomics-based machine learning methods for volume doubling time prediction of Pulmonary Ground-glass nodules with baseline chest computed Tomography. J Thorac Imaging. 2023;1(5):304–14. https://doi.org/10.1097/rti.0000000000000725 .
Tu W, Sun G, Fan L, et al. Radiomics signature: a potential and incremental predictor for EGFR mutation status in NSCLC patients, comparison with CT morphology. Lung Cancer. 2019;132:28–35. https://doi.org/10.1016/j.lungcan.2019.03.025 .
Wang Y, Lyu D, Fan L, Liu S. Advances in the prediction of spread through air spaces with imaging in lung cancer: a narrative review. Transl Cancer Res. 2023;31(3):624–30. https://doi.org/10.21037/tcr-22-2593 .
Zhou T, Tu W, Dong P et al. CT-Based Radiomic Nomogram for the Prediction of Chronic Obstructive Pulmonary Disease in Patients with Lung cancer. Acad Radiol. 14. 2023; https://doi.org/10.1016/j.acra.2023.03.021
Agustí A, Celli BR, Criner GJ, et al. Global Initiative for Chronic Obstructive Lung Disease 2023 Report: GOLD Executive Summary. Eur Respir J. 2023;61(4). https://doi.org/10.1183/13993003.00239-2023 .
Hofmanninger J, Prayer F, Pan J, Röhrich S, Prosch H, Langs G. Automatic lung segmentation in routine imaging is primarily a data diversity problem, not a methodology problem. Eur Radiol Exp. 2020;20(1):50. https://doi.org/10.1186/s41747-020-00173-2 .
Yang K, Yang Y, Kang Y, et al. The value of radiomic features in chronic obstructive pulmonary disease assessment: a prospective study. Clin Radiol. 2022;77(6):e466–72. https://doi.org/10.1016/j.crad.2022.02.015 .
Remeseiro B, Bolon-Canedo V. A review of feature selection methods in medical applications. Comput Biol Med. 2019;112:103375. https://doi.org/10.1016/j.compbiomed.2019.103375 .
Guerra S, Carsin AE, Keidel D, et al. Health-related quality of life and risk factors associated with spirometric restriction. Eur Respir J. 2017;49(5). https://doi.org/10.1183/13993003.02096-2016 .
Wan ES, Castaldi PJ, Cho MH, et al. Epidemiology, genetics, and subtyping of preserved ratio impaired spirometry (PRISm) in COPDGene. Respir Res. 2014;6(1):89. https://doi.org/10.1186/s12931-014-0089-y .
Maclay JD, MacNee W. Cardiovascular disease in COPD: mechanisms. Chest. 2013;143(3):798–807. https://doi.org/10.1378/chest.12-0938 .
Guerra S, Sherrill DL, Venker C, Ceccato CM, Halonen M, Martinez FD. Morbidity and mortality associated with the restrictive spirometric pattern: a longitudinal study. Thorax. 2010;65(6):499–504. https://doi.org/10.1136/thx.2009.126052 .
Mannino DM, McBurnie MA, Tan W, et al. Restricted spirometry in the Burden of Lung Disease Study. Int J Tuberc Lung Dis. 2012;16(10):1405–11. https://doi.org/10.5588/ijtld.12.0054 .
Wei X, Ding Q, Yu N, et al. Imaging Features of Chronic Bronchitis with preserved ratio and impaired spirometry (PRISm). Lung. 2018;196(6):649–58. https://doi.org/10.1007/s00408-018-0162-2 .
Yang Y, Li W, Guo Y, et al. Early COPD risk decision for adults aged from 40 to 79 years based on lung Radiomics features. Front Med (Lausanne). 2022;9:845286. https://doi.org/10.3389/fmed.2022.845286 .
Yang Y, Li W, Guo Y, et al. Lung radiomics features for characterizing and classifying COPD stage based on feature combination strategy and multi-layer perceptron classifier. Math Biosci Eng. 2022;25(8):7826–55. https://doi.org/10.3934/mbe.2022366 .
Yang Y, Li W, Kang Y, et al. A novel lung radiomics feature for characterizing resting heart rate and COPD stage evolution based on radiomics feature combination strategy. Math Biosci Eng. 2022;17(4):4145–65. https://doi.org/10.3934/mbe.2022191 .
Yang Y, Wang S, Zeng N, et al. Lung Radiomics features selection for COPD Stage classification based on Auto-Metric graph neural network. Diagnostics (Basel). 2022;20(10). https://doi.org/10.3390/diagnostics12102274 .
Wilson R, Devaraj A. Radiomics of pulmonary nodules and lung cancer. Transl Lung Cancer Res. 2017;6(1):86–91. https://doi.org/10.21037/tlcr.2017.01.04 .
Lu J, Ge H, Qi L, et al. Subtyping preserved ratio impaired spirometry (PRISm) by using quantitative HRCT imaging characteristics. Respir Res. 2022;11(1):309. https://doi.org/10.1186/s12931-022-02113-7 .
Localio AR, Goodman S. Beyond the usual prediction accuracy metrics: reporting results for clinical decision making. Ann Intern Med. 2012;157(4):294–5. https://doi.org/10.7326/0003-4819-157-4-201208210-00014 .
Van Calster B, Vickers AJ. Calibration of risk prediction models: impact on decision-analytic performance. Med Decis Mak. 2015;35(2):162–9. https://doi.org/10.1177/0272989x14547233 .
Balachandran VP, Gonen M, Smith JJ, DeMatteo RP. Nomograms in oncology: more than meets the eye. Lancet Oncol. 2015;16(4):e173–80. https://doi.org/10.1016/s1470-2045(14)71116-7 .
We thank the colleagues in our department for their help in our study.
This work was supported by the National Key Research and Development Program of China (2022YFC2010002, 2022YFC2010000 and 2022YFC2010005), the National Natural Science Foundation of China (82171926, 81930049 and 82430065), the Medical Imaging Database Construction Program of National Health Commission (YXFSC2022JJSJ002), the Clinical Innovative Project of Shanghai Changzheng Hospital (2020YLCYJY24), the Program of Science and Technology Commission of Shanghai Municipality (21DZ2202600), and the Shanghai Sailing Program (20YF1449000).
TaoHu Zhou, Yu Guan and XiaoQing Lin contributed equally to this work.
Department of Radiology, Second Affiliated Hospital of Naval Medical University, No. 415 Fengyang Road, Shanghai, 200003, China
TaoHu Zhou, Yu Guan, XiaoQing Lin, XiuXiu Zhou, Jie Li, ShiYuan Liu & Li Fan
School of Medical Imaging, Shandong Second Medical University, Weifang, 261053, Shandong, China
College of Health Sciences and Engineering, University of Shanghai for Science and Technology, No.516 Jungong Road, Shanghai, 200093, China
XiaoQing Lin & Jie Li
Department of Medical Imaging, Affiliated Hospital of Ji Ning Medical University, Ji Ning, 272000, China
Department of Radiology, Zhejiang Provincial People’s Hospital, Affiliated People’s Hospital of Hangzhou Medical College, Hangzhou, ZJ, China
Department of Radiology, Jiangxi Provincial People’s Hospital, The First Affiliated Hospital of Nanchang Medical College, Nanchang, China
TaoHu Zhou and Li Fan had full access to all of the data in the study and take responsibility for the integrity of the data and the accuracy of the data analysis. Yu Guan and XiaoQing Lin contributed equally to this work. Concept and design: TaoHu Zhou and Li Fan. Acquisition, analysis, or interpretation of data: XiuXiu Zhou, Liang Mao, YanQing Ma, Bing Fan, Jie Li. Drafting of the manuscript: TaoHu Zhou, Yu Guan and XiaoQing Lin. Statistical analysis: TaoHu Zhou. Obtained funding: ShiYuan Liu, Li Fan. Supervision: ShiYuan Liu, Li Fan. Image processing: TaoHu Zhou.
Correspondence to Li Fan .
Ethics approval and consent to participate.
This retrospective study involving human participants was reviewed and approved by the ethics committee of Second Affiliated Hospital of Naval Medical University. Patient informed consent was waived.
No individual participant data is reported that would require consent to publish from the participant (or legal parent or guardian for children).
The authors declare no competing interests.
Publisher’s note.
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions.
Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/ .
Cite this article.
Zhou, T., Guan, Y., Lin, X. et al. CT-based whole lung radiomics nomogram for identification of PRISm from non-COPD subjects. Respir Res 25 , 329 (2024). https://doi.org/10.1186/s12931-024-02964-2
Received : 11 March 2024
Accepted : 28 August 2024
Published : 03 September 2024
DOI : https://doi.org/10.1186/s12931-024-02964-2
ISSN: 1465-993X
CT image acquisition and pulmonary function test. Table S1 shows the CT acquisition parameters and pulmonary function test apparatus in detail. Lung function was categorized in line with modified GOLD criteria and prior studies [12, 23].In this study, based on the PFT results, the non-COPD subjects were classified into the PRISm and normal spirometry groups for the training, internal ...