Exploratory Research | Definition, Guide, & Examples

Published on December 6, 2021 by Tegan George. Revised on November 20, 2023.

Exploratory research is a methodological approach that investigates research questions that have not previously been studied in depth.

Exploratory research is often qualitative and primary in nature. However, a study with a large sample conducted in an exploratory manner can be quantitative as well. It is also often referred to as interpretive research or a grounded theory approach due to its flexible and open-ended nature.

Table of contents

  • When to use exploratory research
  • Exploratory research questions
  • Exploratory research data collection
  • Step-by-step example of exploratory research
  • Exploratory vs. explanatory research
  • Advantages and disadvantages of exploratory research
  • Other interesting articles
  • Frequently asked questions about exploratory research

Exploratory research is often used when the issue you’re studying is new or when the data collection process is challenging for some reason.

You can use this type of research if you have a general idea or a specific question that you want to study but there is no preexisting knowledge or paradigm with which to study it.

Exploratory research questions are designed to help you understand more about a particular topic of interest. They can help you connect ideas to understand the groundwork of your analysis without adding any preconceived notions or assumptions yet.

Here are some examples:

  • What effect does using a digital notebook have on the attention span of middle schoolers?
  • What factors influence mental health in undergraduates?
  • What outcomes are associated with an authoritative parenting style?
  • In what ways does the presence of a non-native accent affect intelligibility?
  • How can the use of a grocery delivery service reduce food waste in single-person households?

Collecting information on a previously unexplored topic can be challenging. Exploratory research can help you narrow down your topic and formulate a clear hypothesis and problem statement, and it gives you the “lay of the land” on your topic.

Data collection using exploratory research is often divided into primary and secondary research methods, with data analysis following the same model.

Primary research

In primary research, your data is collected directly from primary sources: your participants. There is a variety of ways to collect primary data.

Some examples include:

  • Survey methodology: Sending a survey out to the student body asking them if they would eat vegan meals
  • Focus groups: Compiling groups of 8–10 students and discussing what they think of vegan options for dining hall food
  • Interviews: Interviewing students entering and exiting the dining hall, asking if they would eat vegan meals

Secondary research

In secondary research, your data is collected from preexisting primary research, such as experiments or surveys.

Some other examples include:

  • Case studies: Health of an all-vegan diet
  • Literature reviews: Preexisting research about students’ eating habits and how they have changed over time
  • Online polls, surveys, blog posts, or interviews; social media: Have other schools done something similar?

For some subjects, it’s possible to use large-n government data, such as the decennial census or yearly American Community Survey (ACS) open-source data.

How you proceed with your exploratory research design depends on the research method you choose to collect your data. In most cases, you will follow five steps.

We’ll walk you through the steps using the following example.

Therefore, you would like to focus on improving intelligibility instead of reducing the learner’s accent.

Step 1: Identify your problem

The first step in conducting exploratory research is identifying what the problem is and whether this type of research is the right avenue for you to pursue. Remember that exploratory research is most advantageous when you are investigating a previously unexplored problem.

Step 2: Hypothesize a solution

The next step is to come up with a solution to the problem you’re investigating. Formulate a hypothetical statement to guide your research.

Step 3: Design your methodology

Next, conceptualize your data collection and data analysis methods and write them up in a research design.

Step 4: Collect and analyze data

Next, you proceed with collecting and analyzing your data so you can determine whether your preliminary results are in line with your hypothesis.

In most types of research, you should formulate your hypotheses a priori and refrain from changing them due to the increased risk of Type I errors and data integrity issues. However, in exploratory research, you are allowed to change your hypothesis based on your findings, since you are exploring a previously unexplained phenomenon that could have many explanations.
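To see why revising hypotheses after looking at the data raises the risk of Type I errors, it can help to treat each revised hypothesis as one more test run on the same data. The following is a minimal sketch, not a method from this article: it assumes the tests are independent and all use the same significance level, and the function name and the example values (0.05 and 1, 5, or 20 tests) are purely illustrative.

```python
def familywise_error_rate(alpha: float, num_tests: int) -> float:
    """Chance of at least one false positive across independent tests.

    Assumes every test uses the same significance level `alpha` and that
    the tests are independent, which real exploratory analyses rarely are.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be between 0 and 1")
    if num_tests < 0:
        raise ValueError("num_tests must be non-negative")
    # P(no false positive in any test) is (1 - alpha) ** num_tests,
    # so the complement is the chance of at least one Type I error.
    return 1.0 - (1.0 - alpha) ** num_tests


if __name__ == "__main__":
    for k in (1, 5, 20):
        rate = familywise_error_rate(0.05, k)
        print(f"{k:>2} hypotheses tested: {rate:.3f}")
```

Under these assumptions the chance of a false positive grows quickly with the number of hypotheses tried, which is why exploratory findings are usually treated as leads to confirm in a later study.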

Step 5: Avenues for future research

Decide if you would like to continue studying your topic. If so, it is likely that you will need to change to another type of research. As exploratory research is often qualitative in nature, you may need to conduct quantitative research with a larger sample size to achieve more generalizable results.

It can be easy to confuse exploratory research with explanatory research. To understand the relationship, it can help to remember that exploratory research lays the groundwork for later explanatory research.

Exploratory research investigates research questions that have not been studied in depth. The preliminary results often lay the groundwork for future analysis.

Explanatory research questions tend to start with “why” or “how”, and the goal is to explain why or how a previously studied phenomenon takes place.

Exploratory vs explanatory research

Like any other research design, exploratory studies have their trade-offs: they provide a unique set of benefits but also come with downsides.

Advantages

  • It can be very helpful in narrowing down a challenging or nebulous problem that has not been previously studied.
  • It can serve as a great guide for future research, whether your own or another researcher’s. With new and challenging research problems, adding to the body of research in the early stages can be very fulfilling.
  • It is very flexible, cost-effective, and open-ended. You are free to proceed however you think is best.

Disadvantages

  • It usually lacks conclusive results, and results can be biased or subjective due to a lack of preexisting knowledge on your topic.
  • It’s typically not externally valid and generalizable, and it suffers from many of the challenges of qualitative research.
  • Since you are not operating within an existing research paradigm, this type of research can be very labor-intensive.

Exploratory research aims to explore the main aspects of an under-researched problem, while explanatory research aims to explain the causes and consequences of a well-defined problem.

Quantitative research deals with numbers and statistics, while qualitative research deals with words and meanings.

Quantitative methods allow you to systematically measure variables and test hypotheses. Qualitative methods allow you to explore concepts and experiences in more detail.

George, T. (2023, November 20). Exploratory Research | Definition, Guide, & Examples. Scribbr. Retrieved August 27, 2024, from https://www.scribbr.com/methodology/exploratory-research/

  • First Online: 27 October 2022

  • R. M. Channaveer &
  • Rajendra Baikady

This chapter reviews the strengths and limitations of the case study as a research method in the social sciences. It presents an evidence base for why a case study suits some research questions and not others. It gives a holistic view of the method: designing the case study around the research context, defining its structure and modality, conducting the study, collecting data through triangulation, analysing and interpreting the data, and building theory at the end. The chapter also covers the types of case study and when and where to use the case study as a research method in social science research.

Ang, C. S., Lee, K. F., & Dipolog-Ubanan, G. F. (2019). Determinants of first-year student identity and satisfaction in higher education: A quantitative case study. SAGE Open, 9 (2), 215824401984668. https://doi.org/10.1177/2158244019846689

Baxter, P., & Jack, S. (2015). Qualitative case study methodology: Study design and implementation for novice researchers. The Qualitative Report . Published. https://doi.org/10.46743/2160-3715/2008.1573

Bhatta, T. P. (2018). Case study research, philosophical position and theory building: A methodological discussion. Dhaulagiri Journal of Sociology and Anthropology, 12 , 72–79. https://doi.org/10.3126/dsaj.v12i0.22182

Article   Google Scholar  

Bromley, P. D. (1990). Academic contributions to psychological counselling. A philosophy of science for the study of individual cases. Counselling Psychology Quarterly , 3 (3), 299–307.

Google Scholar  

Crowe, S., Cresswell, K., Robertson, A., Huby, G., Avery, A., & Sheikh, A. (2011). The case study approach. BMC Medical Research Methodology, 11 (1), 1–9.

Grässel, E., & Schirmer, B. (2006). The use of volunteers to support family carers of dementia patients: Results of a prospective longitudinal study investigating expectations towards and experience with training and professional support. Zeitschrift Fur Gerontologie Und Geriatrie, 39 (3), 217–226.

Greenwood, D., & Lowenthal, D. (2005). Case study as a means of researching social work and improving practitioner education. Journal of Social Work Practice, 19 (2), 181–193. https://doi.org/10.1080/02650530500144782

Gülseçen, S., & Kubat, A. (2006). Teaching ICT to teacher candidates using PBL: A qualitative and quantitative evaluation. Journal of Educational Technology & Society, 9 (2), 96–106.

Gomm, R., Hammersley, M., & Foster, P. (2000). Case study and generalization. Case study method , 98–115.

Hamera, J., Denzin, N. K., & Lincoln, Y. S. (2011). Performance ethnography . SAGE.

Hayes, N. (2000). Doing psychological research (p. 133). Open University Press.

Harrison, H., Birks, M., Franklin, R., & Mills, J. (2017). Case study research: Foundations and methodological orientations. In Forum qualitative sozialforschung/forum: Qualitative social research (Vol. 18, No. 1).

Iwakabe, S., & Gazzola, N. (2009). From single-case studies to practice-based knowledge: Aggregating and synthesizing case studies. Psychotherapy Research, 19 (4–5), 601–611. https://doi.org/10.1080/10503300802688494

Johnson, M. P. (2006). Decision models for the location of community corrections centers. Environment and Planning b: Planning and Design, 33 (3), 393–412. https://doi.org/10.1068/b3125

Kaarbo, J., & Beasley, R. K. (1999). A practical guide to the comparative case study method in political psychology. Political Psychology, 20 (2), 369–391. https://doi.org/10.1111/0162-895x.00149

Lovell, G. I. (2006). Justice excused: The deployment of law in everyday political encounters. Law Society Review, 40 (2), 283–324. https://doi.org/10.1111/j.1540-5893.2006.00265.x

McDonough, S., & McDonough, S. (1997). Research methods as part of English language teacher education. English Language Teacher Education and Development, 3 (1), 84–96.

Meredith, J. (1998). Building operations management theory through case and field research. Journal of Operations Management, 16 (4), 441–454. https://doi.org/10.1016/s0272-6963(98)00023-0

Mills, A. J., Durepos, G., & Wiebe, E. (Eds.). (2009). Encyclopedia of case study research . Sage Publications.

Ochieng, P. A. (2009). An analysis of the strengths and limitation of qualitative and quantitative research paradigms. Problems of Education in the 21st Century , 13 , 13.

Page, E. B., Webb, E. J., Campell, D. T., Schwart, R. D., & Sechrest, L. (1966). Unobtrusive measures: Nonreactive research in the social sciences. American Educational Research Journal, 3 (4), 317. https://doi.org/10.2307/1162043

Rashid, Y., Rashid, A., Warraich, M. A., Sabir, S. S., & Waseem, A. (2019). Case study method: A step-by-step guide for business researchers. International Journal of Qualitative Methods, 18 , 160940691986242. https://doi.org/10.1177/1609406919862424

Ridder, H. G. (2017). The theory contribution of case study research designs. Business Research, 10 (2), 281–305. https://doi.org/10.1007/s40685-017-0045-z

Sadeghi Moghadam, M. R., Ghasemnia Arabi, N., & Khoshsima, G. (2021). A Review of case study method in operations management research. International Journal of Qualitative Methods, 20 , 160940692110100. https://doi.org/10.1177/16094069211010088

Sommer, B. B., & Sommer, R. (1997). A practical guide to behavioral research: Tools and techniques . Oxford University Press.

Stake, R. E. (2010). Qualitative research: Studying how things work .

Stake, R. E. (1995). The Art of Case Study Research . Sage Publications.

Stoecker, R. (1991). Evaluating and rethinking the case study. The Sociological Review, 39 (1), 88–112.

Suryani, A. (2013). Comparing case study and ethnography as qualitative research approaches .

Taylor, S., & Berridge, V. (2006). Medicinal plants and malaria: An historical case study of research at the London School of Hygiene and Tropical Medicine in the twentieth century. Transactions of the Royal Society of Tropical Medicine and Hygiene, 100 (8), 707–714. https://doi.org/10.1016/j.trstmh.2005.11.017

Tellis, W. (1997). Introduction to case study. The Qualitative Report . Published. https://doi.org/10.46743/2160-3715/1997.2024

Towne, L., & Shavelson, R. J. (2002). Scientific research in education . National Academy Press Publications Sales Office.

Widdowson, M. D. J. (2011). Case study research methodology. International Journal of Transactional Analysis Research, 2 (1), 25–34.

Yin, R. K. (2004). The case study anthology . Sage.

Yin, R. K. (2003). Design and methods. Case Study Research , 3 (9.2).

Yin, R. K. (1994). Case study research: Design and methods (2nd ed.). Sage Publishing.

Yin, R. (1984). Case study research: Design and methods . Sage Publications Beverly Hills.

Yin, R. (1993). Applications of case study research . Sage Publishing.

Zainal, Z. (2003). An investigation into the effects of discipline-specific knowledge, proficiency and genre on reading comprehension and strategies of Malaysia ESP Students. Unpublished Ph. D. Thesis. University of Reading , 1 (1).

Zeisel, J. (1984). Inquiry by design: Tools for environment-behaviour research (No. 5). CUP archive.

Download references

Author information

Authors and affiliations.

Department of Social Work, Central University of Karnataka, Kadaganchi, India

R. M. Channaveer

Department of Social Work, University of Johannesburg, Johannesburg, South Africa

Rajendra Baikady

Corresponding author

Correspondence to R. M. Channaveer.

Editor information

Editors and affiliations.

Centre for Family and Child Studies, Research Institute of Humanities and Social Sciences, University of Sharjah, Sharjah, United Arab Emirates

M. Rezaul Islam

Department of Development Studies, University of Dhaka, Dhaka, Bangladesh

Niaz Ahmed Khan

Department of Social Work, School of Humanities, University of Johannesburg, Johannesburg, South Africa

Copyright information

© 2022 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.

About this chapter

Channaveer, R.M., Baikady, R. (2022). Case Study. In: Islam, M.R., Khan, N.A., Baikady, R. (eds) Principles of Social Research Methodology. Springer, Singapore. https://doi.org/10.1007/978-981-19-5441-2_21

DOI: https://doi.org/10.1007/978-981-19-5441-2_21

Published: 27 October 2022

Publisher Name: Springer, Singapore

Print ISBN: 978-981-19-5219-7

Online ISBN: 978-981-19-5441-2

eBook Packages: Social Sciences, Social Sciences (R0)

Case Study Research Method in Psychology

Saul McLeod, PhD

Editor-in-Chief for Simply Psychology

BSc (Hons) Psychology, MRes, PhD, University of Manchester

Saul McLeod, PhD, is a qualified psychology teacher with over 18 years of experience in further and higher education. He has been published in peer-reviewed journals, including the Journal of Clinical Psychology.

Olivia Guy-Evans, MSc

Associate Editor for Simply Psychology

BSc (Hons) Psychology, MSc Psychology of Education

Olivia Guy-Evans is a writer and associate editor for Simply Psychology. She has previously worked in healthcare and educational sectors.

Case studies are in-depth investigations of a person, group, event, or community. Typically, data is gathered from various sources using several methods (e.g., observations & interviews).

The case study research method originated in clinical medicine (the case history, i.e., the patient’s personal history). In psychology, case studies are often confined to the study of a particular individual.

The information is mainly biographical and relates to events in the individual’s past (i.e., retrospective), as well as to significant events that are currently occurring in his or her everyday life.

The case study is not itself a research method; rather, researchers select methods of data collection and analysis that will generate material suitable for case studies.

Freud (1909a, 1909b) conducted very detailed investigations into the private lives of his patients in an attempt to both understand and help them overcome their illnesses.

This makes it clear that the case study is a method that should only be used by a psychologist, therapist, or psychiatrist, i.e., someone with a professional qualification.

There is an ethical issue of competence. Only someone qualified to diagnose and treat a person can conduct a formal case study relating to atypical (i.e., abnormal) behavior or atypical development.

Famous Case Studies

  • Anna O – One of the most famous case studies, documenting psychoanalyst Josef Breuer’s treatment of “Anna O” (real name Bertha Pappenheim) for hysteria in the late 1800s using early psychoanalytic theory.
  • Little Hans – A child psychoanalysis case study published by Sigmund Freud in 1909 analyzing his five-year-old patient Herbert Graf’s horse phobia as related to the Oedipus complex.
  • Bruce/Brenda – Gender identity case of the boy (Bruce) whose botched circumcision led psychologist John Money to advise gender reassignment, so that he was raised as a girl (Brenda) in the 1960s.
  • Genie Wiley – Linguistics/psychological development case of the victim of extreme isolation abuse who was studied in 1970s California for effects of early language deprivation on acquiring speech later in life.
  • Phineas Gage – One of the most famous neuropsychology case studies analyzes personality changes in railroad worker Phineas Gage after an 1848 brain injury involving a tamping iron piercing his skull.

Clinical Case Studies

  • Studying the effectiveness of psychotherapy approaches with an individual patient
  • Assessing and treating mental illnesses like depression, anxiety disorders, PTSD
  • Neuropsychological cases investigating brain injuries or disorders

Child Psychology Case Studies

  • Studying psychological development from birth through adolescence
  • Cases of learning disabilities, autism spectrum disorders, ADHD
  • Effects of trauma, abuse, deprivation on development

Types of Case Studies

  • Explanatory case studies: Used to explore causation in order to find underlying principles. Helpful for doing qualitative analysis to explain presumed causal links.
  • Exploratory case studies: Used to explore situations where an intervention being evaluated has no clear set of outcomes. Helpful for defining questions and hypotheses for future research.
  • Descriptive case studies: Describe an intervention or phenomenon and the real-life context in which it occurred. Helpful for illustrating certain topics within an evaluation.
  • Multiple-case studies: Used to explore differences between cases and replicate findings across cases. Helpful for comparing and contrasting specific cases.
  • Intrinsic: Used to gain a better understanding of a particular case. Helpful for capturing the complexity of a single case.
  • Collective: Used to explore a general phenomenon using multiple case studies. Helpful for jointly studying a group of cases in order to inquire into the phenomenon.

Where Do You Find Data for a Case Study?

There are several places to find data for a case study. The key is to gather data from multiple sources to get a complete picture of the case and corroborate facts or findings through triangulation of evidence. Most of this information is likely qualitative (i.e., verbal description rather than measurement), but the psychologist might also collect numerical data.

1. Primary sources

  • Interviews – Interviewing key people related to the case to get their perspectives and insights. The interview is an extremely effective procedure for obtaining information about an individual, and it may be used to collect comments from the person’s friends, parents, employer, workmates, and others who have a good knowledge of the person, as well as to obtain facts from the person him or herself.
  • Observations – Observing behaviors, interactions, processes, etc., related to the case as they unfold in real-time.
  • Documents & Records – Reviewing private documents, diaries, public records, correspondence, meeting minutes, etc., relevant to the case.

2. Secondary sources

  • News/Media – News coverage of events related to the case study.
  • Academic articles – Journal articles, dissertations etc. that discuss the case.
  • Government reports – Official data and records related to the case context.
  • Books/films – Books, documentaries or films discussing the case.

3. Archival records

Searching historical archives, museum collections and databases to find relevant documents, visual/audio records related to the case history and context.

Public archives like newspapers, organizational records, photographic collections could all include potentially relevant pieces of information to shed light on attitudes, cultural perspectives, common practices and historical contexts related to psychology.

4. Organizational records

Organizational records offer the advantage of often having large datasets collected over time that can reveal or confirm psychological insights.

Of course, privacy and ethical concerns regarding confidential data must be navigated carefully.

However, with proper protocols, organizational records can provide invaluable context and empirical depth to qualitative case studies exploring the intersection of psychology and organizations.

  • Organizational/industrial psychology research : Organizational records like employee surveys, turnover/retention data, policies, incident reports etc. may provide insight into topics like job satisfaction, workplace culture and dynamics, leadership issues, employee behaviors etc.
  • Clinical psychology : Therapists/hospitals may grant access to anonymized medical records to study aspects like assessments, diagnoses, treatment plans etc. This could shed light on clinical practices.
  • School psychology : Studies could utilize anonymized student records like test scores, grades, disciplinary issues, and counseling referrals to study child development, learning barriers, effectiveness of support programs, and more.

How do I Write a Case Study in Psychology?

Follow specified case study guidelines provided by a journal or your psychology tutor. General components of clinical case studies include: background, symptoms, assessments, diagnosis, treatment, and outcomes. Interpreting the information means the researcher decides what to include or leave out. A good case study should always clarify which information is the factual description and which is an inference or the researcher’s opinion.

1. Introduction

  • Provide background on the case context and why it is of interest, presenting background information like demographics, relevant history, and presenting problem.
  • Compare briefly to similar published cases if applicable. Clearly state the focus/importance of the case.

2. Case Presentation

  • Describe the presenting problem in detail, including symptoms, duration, and impact on daily life.
  • Include client demographics like age and gender, information about social relationships, and mental health history.
  • Describe all physical, emotional, and/or sensory symptoms reported by the client.
  • Use patient quotes to describe the initial complaint verbatim. Follow with full-sentence summaries of relevant history details gathered, including key components that led to a working diagnosis.
  • Summarize clinical exam results, namely orthopedic/neurological tests, imaging, lab tests, etc. Note actual results rather than subjective conclusions. Provide images if clearly reproducible/anonymized.
  • Clearly state the working diagnosis or clinical impression before transitioning to management.

3. Management and Outcome

  • Indicate the total duration of care and number of treatments given over what timeframe. Use specific names/descriptions for any therapies/interventions applied.
  • Present the results of the intervention, including any quantitative or qualitative data collected.
  • For outcomes, utilize visual analog scales for pain, medication usage logs, etc., if possible. Include patient self-reports of improvement/worsening of symptoms. Note the reason for discharge/end of care.

4. Discussion

  • Analyze the case, exploring contributing factors, limitations of the study, and connections to existing research.
  • Analyze the effectiveness of the intervention, considering factors like participant adherence, limitations of the study, and potential alternative explanations for the results.
  • Identify any questions raised in the case analysis and relate insights to established theories and current research if applicable. Avoid definitive claims about physiological explanations.
  • Offer clinical implications, and suggest future research directions.

5. Additional Items

  • Thank specific assistants for writing support only. No patient acknowledgments.
  • References should directly support any key claims or quotes included.
  • Use tables/figures/images only if substantially informative. Include permissions and legends/explanatory notes.

Strengths

  • Provides detailed (rich qualitative) information.
  • Provides insight for further research.
  • Permits investigation of otherwise impractical (or unethical) situations.

Case studies allow a researcher to investigate a topic in far more detail than might be possible if they were trying to deal with a large number of research participants (nomothetic approach) with the aim of ‘averaging’.

Because of their in-depth, multi-sided approach, case studies often shed light on aspects of human thinking and behavior that would be unethical or impractical to study in other ways.

Research that only looks into the measurable aspects of human behavior is not likely to give us insights into the subjective dimension of experience, which is important to psychoanalytic and humanistic psychologists.

Case studies are often used in exploratory research. They can help us generate new ideas (that might be tested by other methods). They are an important way of illustrating theories and can help show how different aspects of a person’s life are related to each other.

The method is, therefore, important for psychologists who adopt a holistic point of view (i.e., humanistic psychologists).

Limitations

  • Lacking scientific rigor and providing little basis for generalization of results to the wider population.
  • Researchers’ own subjective feelings may influence the case study (researcher bias).
  • Difficult to replicate.
  • Time-consuming and expensive.
  • The volume of data, together with time restrictions, can limit the depth of analysis that is possible within the available resources.

Because a case study deals with only one person/event/group, we can never be sure if the case study investigated is representative of the wider body of “similar” instances. This means the conclusions drawn from a particular case may not be transferable to other settings.

Because case studies are based on the analysis of qualitative (i.e., descriptive) data, a lot depends on the psychologist’s interpretation of the information she has acquired.

This means that there is a lot of scope for researcher bias, and it could be that the subjective opinions of the psychologist intrude in the assessment of what the data means.

For example, Freud has been criticized for producing case studies in which the information was sometimes distorted to fit particular behavioral theories (e.g., Little Hans).

This is also true of Money’s interpretation of the Bruce/Brenda case study (Diamond, 1997) when he ignored evidence that went against his theory.

References

Breuer, J., & Freud, S. (1895). Studies on hysteria. Standard Edition 2: London.

Curtiss, S. (1981). Genie: The case of a modern wild child.

Diamond, M., & Sigmundson, K. (1997). Sex Reassignment at Birth: Long-term Review and Clinical Implications. Archives of Pediatrics & Adolescent Medicine, 151(3), 298-304.

Freud, S. (1909a). Analysis of a phobia of a five year old boy. In The Pelican Freud Library (1977), Vol 8, Case Histories 1, pages 169-306.

Freud, S. (1909b). Bemerkungen über einen Fall von Zwangsneurose (Der “Rattenmann”). Jb. psychoanal. psychopathol. Forsch., I, p. 357-421; GW, VII, p. 379-463; Notes upon a case of obsessional neurosis, SE, 10: 151-318.

Harlow, J. M. (1848). Passage of an iron rod through the head. Boston Medical and Surgical Journal, 39, 389–393.

Harlow, J. M. (1868). Recovery from the Passage of an Iron Bar through the Head. Publications of the Massachusetts Medical Society, 2(3), 327-347.

Money, J., & Ehrhardt, A. A. (1972). Man & Woman, Boy & Girl: The Differentiation and Dimorphism of Gender Identity from Conception to Maturity. Baltimore, Maryland: Johns Hopkins University Press.

Money, J., & Tucker, P. (1975). Sexual signatures: On being a man or a woman.

Further Information

  • Case Study Approach
  • Case Study Method
  • Enhancing the Quality of Case Studies in Health Services Research
  • “We do things together” A case study of “couplehood” in dementia
  • Using mixed methods for evaluating an integrative approach to cancer care: a case study

  • Open access
  • Published: 27 June 2011

The case study approach

  • Sarah Crowe 1 ,
  • Kathrin Cresswell 2 ,
  • Ann Robertson 2 ,
  • Guro Huby 3 ,
  • Anthony Avery 1 &
  • Aziz Sheikh 2  

BMC Medical Research Methodology volume 11, Article number: 100 (2011)

The case study approach allows in-depth, multi-faceted explorations of complex issues in their real-life settings. The value of the case study approach is well recognised in the fields of business, law and policy, but somewhat less so in health services research. Based on our experiences of conducting several health-related case studies, we reflect on the different types of case study design, the specific research questions this approach can help answer, the data sources that tend to be used, and the particular advantages and disadvantages of employing this methodological approach. The paper concludes with key pointers to aid those designing and appraising proposals for conducting case study research, and a checklist to help readers assess the quality of case study reports.

Introduction

The case study approach is particularly useful to employ when there is a need to obtain an in-depth appreciation of an issue, event or phenomenon of interest, in its natural real-life context. Our aim in writing this piece is to provide insights into when to consider employing this approach and an overview of key methodological considerations in relation to the design, planning, analysis, interpretation and reporting of case studies.

The illustrative 'grand round', 'case report' and 'case series' have a long tradition in clinical practice and research. Presenting detailed critiques, typically of one or more patients, aims to provide insights into aspects of the clinical case and, in doing so, illustrate broader lessons that may be learnt. In research, the conceptually-related case study approach can be used, for example, to describe in detail a patient's episode of care, explore professional attitudes to and experiences of a new policy initiative or service development or more generally to 'investigate contemporary phenomena within its real-life context' [ 1 ]. Based on our experiences of conducting a range of case studies, we reflect on when to consider using this approach, discuss the key steps involved and illustrate, with examples, some of the practical challenges of attaining an in-depth understanding of a 'case' as an integrated whole. In keeping with previously published work, we acknowledge the importance of theory to underpin the design, selection, conduct and interpretation of case studies[ 2 ]. In so doing, we make passing reference to the different epistemological approaches used in case study research by key theoreticians and methodologists in this field of enquiry.

This paper is structured around the following main questions: What is a case study? What are case studies used for? How are case studies conducted? What are the potential pitfalls and how can these be avoided? We draw in particular on four of our own recently published examples of case studies (see Tables 1 , 2 , 3 and 4 ) and those of others to illustrate our discussion[ 3 – 7 ].

What is a case study?

A case study is a research approach that is used to generate an in-depth, multi-faceted understanding of a complex issue in its real-life context. It is an established research design that is used extensively in a wide variety of disciplines, particularly in the social sciences. A case study can be defined in a variety of ways (Table 5 ), the central tenet being the need to explore an event or phenomenon in depth and in its natural context. It is for this reason sometimes referred to as a "naturalistic" design; this is in contrast to an "experimental" design (such as a randomised controlled trial) in which the investigator seeks to exert control over and manipulate the variable(s) of interest.

Stake's work has been particularly influential in defining the case study approach to scientific enquiry. He has helpfully characterised three main types of case study: intrinsic , instrumental and collective [ 8 ]. An intrinsic case study is typically undertaken to learn about a unique phenomenon. The researcher should define the uniqueness of the phenomenon, which distinguishes it from all others. In contrast, the instrumental case study uses a particular case (some of which may be better than others) to gain a broader appreciation of an issue or phenomenon. The collective case study involves studying multiple cases simultaneously or sequentially in an attempt to generate a still broader appreciation of a particular issue.

These are, however, not necessarily mutually exclusive categories. In the first of our examples (Table 1), we undertook an intrinsic case study to investigate the issue of recruitment of minority ethnic people into the specific context of asthma research studies, but it developed into an instrumental case study through seeking to understand the issue of recruitment of these marginalised populations more generally, generating a number of findings that are potentially transferable to other disease contexts[3]. In contrast, the other three examples (see Tables 2, 3 and 4) employed collective case study designs to study the introduction of workforce reconfiguration in primary care, the implementation of electronic health records into hospitals, and the ways in which healthcare students learn about patient safety considerations[4–6]. Although our study focusing on the introduction of General Practitioners with Specialist Interests (Table 2) was explicitly collective in design (four contrasting primary care organisations were studied), it was also instrumental in that this particular professional group was studied as an exemplar of the more general phenomenon of workforce redesign[4].

What are case studies used for?

According to Yin, case studies can be used to explain, describe or explore events or phenomena in the everyday contexts in which they occur[1]. These can, for example, help to understand and explain causal links and pathways resulting from a new policy initiative or service development (see Tables 2 and 3, for example)[1]. In contrast to experimental designs, which seek to test a specific hypothesis through deliberately manipulating the environment (for example, in a randomised controlled trial, giving a new drug to randomly selected individuals and then comparing outcomes with controls)[9], the case study approach lends itself well to capturing information on more explanatory 'how', 'what' and 'why' questions, such as 'how is the intervention being implemented and received on the ground?'. The case study approach can offer additional insights into what gaps exist in its delivery or why one implementation strategy might be chosen over another. This in turn can help develop or refine theory, as shown in our study of the teaching of patient safety in undergraduate curricula (Table 4)[6, 10]. Key questions to consider when selecting the most appropriate study design are whether it is desirable or indeed possible to undertake a formal experimental investigation in which individuals and/or organisations are allocated to an intervention or control arm, or whether the wish is to obtain a more naturalistic understanding of an issue. The former is ideally studied using a controlled experimental design, whereas the latter is more appropriately studied using a case study design.

Case studies may be approached in different ways depending on the epistemological standpoint of the researcher, that is, whether they take a critical (questioning one's own and others' assumptions), interpretivist (trying to understand individual and shared social meanings) or positivist approach (orientating towards the criteria of natural sciences, such as focusing on generalisability considerations) (Table 6 ). Whilst such a schema can be conceptually helpful, it may be appropriate to draw on more than one approach in any case study, particularly in the context of conducting health services research. Doolin has, for example, noted that in the context of undertaking interpretative case studies, researchers can usefully draw on a critical, reflective perspective which seeks to take into account the wider social and political environment that has shaped the case[ 11 ].

How are case studies conducted?

Here, we focus on the main stages of research activity when planning and undertaking a case study; the crucial stages are: defining the case; selecting the case(s); collecting and analysing the data; interpreting data; and reporting the findings.

Defining the case

Carefully formulated research question(s), informed by the existing literature and a prior appreciation of the theoretical issues and setting(s), are all important in appropriately and succinctly defining the case[ 8 , 12 ]. Crucially, each case should have a pre-defined boundary which clarifies the nature and time period covered by the case study (i.e. its scope, beginning and end), the relevant social group, organisation or geographical area of interest to the investigator, the types of evidence to be collected, and the priorities for data collection and analysis (see Table 7 )[ 1 ]. A theory driven approach to defining the case may help generate knowledge that is potentially transferable to a range of clinical contexts and behaviours; using theory is also likely to result in a more informed appreciation of, for example, how and why interventions have succeeded or failed[ 13 ].

For example, in our evaluation of the introduction of electronic health records in English hospitals (Table 3 ), we defined our cases as the NHS Trusts that were receiving the new technology[ 5 ]. Our focus was on how the technology was being implemented. However, if the primary research interest had been on the social and organisational dimensions of implementation, we might have defined our case differently as a grouping of healthcare professionals (e.g. doctors and/or nurses). The precise beginning and end of the case may however prove difficult to define. Pursuing this same example, when does the process of implementation and adoption of an electronic health record system really begin or end? Such judgements will inevitably be influenced by a range of factors, including the research question, theory of interest, the scope and richness of the gathered data and the resources available to the research team.

Selecting the case(s)

The decision on how to select the case(s) to study is a very important one that merits some reflection. In an intrinsic case study, the case is selected on its own merits[ 8 ]. The case is selected not because it is representative of other cases, but because of its uniqueness, which is of genuine interest to the researchers. This was, for example, the case in our study of the recruitment of minority ethnic participants into asthma research (Table 1 ) as our earlier work had demonstrated the marginalisation of minority ethnic people with asthma, despite evidence of disproportionate asthma morbidity[ 14 , 15 ]. In another example of an intrinsic case study, Hellstrom et al.[ 16 ] studied an elderly married couple living with dementia to explore how dementia had impacted on their understanding of home, their everyday life and their relationships.

For an instrumental case study, selecting a "typical" case can work well[ 8 ]. In contrast to the intrinsic case study, the particular case which is chosen is of less importance than selecting a case that allows the researcher to investigate an issue or phenomenon. For example, in order to gain an understanding of doctors' responses to health policy initiatives, Som undertook an instrumental case study interviewing clinicians who had a range of responsibilities for clinical governance in one NHS acute hospital trust[ 17 ]. Sampling a "deviant" or "atypical" case may however prove even more informative, potentially enabling the researcher to identify causal processes, generate hypotheses and develop theory.

In collective or multiple case studies, a number of cases are carefully selected. This offers the advantage of allowing comparisons to be made across several cases and/or replication. Choosing a "typical" case may enable the findings to be generalised to theory (i.e. analytical generalisation) or to test theory by replicating the findings in a second or even a third case (i.e. replication logic)[ 1 ]. Yin suggests two or three literal replications (i.e. predicting similar results) if the theory is straightforward and five or more if the theory is more subtle. However, critics might argue that selecting 'cases' in this way is insufficiently reflexive and ill-suited to the complexities of contemporary healthcare organisations.

The selected case study site(s) should allow the research team access to the group of individuals, the organisation, the processes or whatever else constitutes the chosen unit of analysis for the study. Access is therefore a central consideration; the researcher needs to come to know the case study site(s) well and to work cooperatively with them. Selected cases need to be not only interesting but also hospitable to the inquiry [ 8 ] if they are to be informative and answer the research question(s). Case study sites may also be pre-selected for the researcher, with decisions being influenced by key stakeholders. For example, our selection of case study sites in the evaluation of the implementation and adoption of electronic health record systems (see Table 3 ) was heavily influenced by NHS Connecting for Health, the government agency that was responsible for overseeing the National Programme for Information Technology (NPfIT)[ 5 ]. This prominent stakeholder had already selected the NHS sites (through a competitive bidding process) to be early adopters of the electronic health record systems and had negotiated contracts that detailed the deployment timelines.

It is also important to consider in advance the likely burden and risks associated with participation for those who (or the site(s) which) comprise the case study. Of particular importance is the obligation for the researcher to think through the ethical implications of the study (e.g. the risk of inadvertently breaching anonymity or confidentiality) and to ensure that potential participants/participating sites are provided with sufficient information to make an informed choice about joining the study. The outcome of providing this information might be that the emotive burden associated with participation, or the organisational disruption associated with supporting the fieldwork, is considered so high that the individuals or sites decide against participation.

In our example of evaluating implementations of electronic health record systems, given the restricted number of early adopter sites available to us, we sought purposively to select a diverse range of implementation cases among those that were available[ 5 ]. We chose a mixture of teaching, non-teaching and Foundation Trust hospitals, and examples of each of the three electronic health record systems procured centrally by the NPfIT. At one recruited site, it quickly became apparent that access was problematic because of competing demands on that organisation. Recognising the importance of full access and co-operative working for generating rich data, the research team decided not to pursue work at that site and instead to focus on other recruited sites.

Collecting the data

In order to develop a thorough understanding of the case, the case study approach usually involves the collection of multiple sources of evidence, using a range of quantitative (e.g. questionnaires, audits and analysis of routinely collected healthcare data) and more commonly qualitative techniques (e.g. interviews, focus groups and observations). The use of multiple sources of data (data triangulation) has been advocated as a way of increasing the internal validity of a study (i.e. the extent to which the method is appropriate to answer the research question)[ 8 , 18 – 21 ]. An underlying assumption is that data collected in different ways should lead to similar conclusions, and approaching the same issue from different angles can help develop a holistic picture of the phenomenon (Table 2 )[ 4 ].

Brazier and colleagues used a mixed-methods case study approach to investigate the impact of a cancer care programme[22]. Here, quantitative measures were collected with questionnaires before, and five months after, the start of the intervention; these did not yield any statistically significant results. Qualitative interviews with patients, however, helped provide an insight into potentially beneficial process-related aspects of the programme, such as greater perceived patient involvement in care. The authors reported how this case study approach identified a number of contextual factors likely to influence the effectiveness of the intervention and which were not likely to have been obtained from quantitative methods alone.

In collective or multiple case studies, data collection needs to be flexible enough to allow a detailed description of each individual case to be developed (e.g. the nature of different cancer care programmes), before considering the emerging similarities and differences in cross-case comparisons (e.g. to explore why one programme is more effective than another). It is important that data sources from different cases are, where possible, broadly comparable for this purpose even though they may vary in nature and depth.

Analysing, interpreting and reporting case studies

Making sense and offering a coherent interpretation of the typically disparate sources of data (whether qualitative alone or together with quantitative) is far from straightforward. Repeated reviewing and sorting of the voluminous and detail-rich data are integral to the process of analysis. In collective case studies, it is helpful to analyse data relating to the individual component cases first, before making comparisons across cases. Attention needs to be paid to variations within each case and, where relevant, the relationship between different causes, effects and outcomes[ 23 ]. Data will need to be organised and coded to allow the key issues, both derived from the literature and emerging from the dataset, to be easily retrieved at a later stage. An initial coding frame can help capture these issues and can be applied systematically to the whole dataset with the aid of a qualitative data analysis software package.
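The idea of applying an initial coding frame systematically, so that coded material can be retrieved later, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the code labels, keywords, site names and excerpts are all invented, and simple keyword matching is a crude stand-in for the interpretive judgement a researcher (or a dedicated qualitative analysis package) would bring to coding.

```python
from collections import defaultdict

# Hypothetical coding frame: code label -> indicative keywords (illustrative only).
CODING_FRAME = {
    "access": ["access", "appointment", "waiting"],
    "trust": ["trust", "confidence"],
    "workload": ["workload", "time pressure", "busy"],
}

# Invented excerpts: (case_id, excerpt_id, text).
EXCERPTS = [
    ("site_A", 1, "Getting an appointment took weeks."),
    ("site_A", 2, "I have confidence in the nurses, but they are so busy."),
    ("site_B", 1, "The workload went up after the new system arrived."),
]


def code_excerpt(text, frame=CODING_FRAME):
    """Return the sorted list of codes whose keywords appear in the excerpt."""
    lowered = text.lower()
    return sorted(
        code for code, words in frame.items()
        if any(word in lowered for word in words)
    )


def build_index(excerpts, frame=CODING_FRAME):
    """Map each code to the (case_id, excerpt_id) pairs tagged with it."""
    index = defaultdict(list)
    for case_id, excerpt_id, text in excerpts:
        for code in code_excerpt(text, frame):
            index[code].append((case_id, excerpt_id))
    return dict(index)


if __name__ == "__main__":
    index = build_index(EXCERPTS)
    for code in sorted(index):
        # Keeping case identifiers in the index supports later cross-case comparison.
        print(code, index[code])
```

Keeping the case identifier alongside each coded excerpt means that, in a collective case study, each case can be reviewed on its own before codes are compared across cases.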

The Framework approach is a practical approach to managing and analysing large datasets, comprising five stages (familiarisation; identifying a thematic framework; indexing; charting; mapping and interpretation). It is particularly useful if time is limited, as was the case in our study of recruitment of South Asians into asthma research (Table 1)[3, 24]. Theoretical frameworks may also play an important role in integrating different sources of data and examining emerging themes. For example, we drew on a socio-technical framework to help explain the connections between different elements (technology, people, and the organisational settings within which they worked) in our study of the introduction of electronic health record systems (Table 3)[5]. Our study of patient safety in undergraduate curricula drew on an evaluation-based approach to design and analysis, which emphasised the importance of the academic, organisational and practice contexts through which students learn (Table 4)[6].

Case study findings can have implications both for theory development and theory testing. They may establish, strengthen or weaken historical explanations of a case and, in certain circumstances, allow theoretical (as opposed to statistical) generalisation beyond the particular cases studied[ 12 ]. These theoretical lenses should not, however, constitute a strait-jacket and the cases should not be "forced to fit" the particular theoretical framework that is being employed.

When reporting findings, it is important to provide the reader with enough contextual information to understand the processes that were followed and how the conclusions were reached. In a collective case study, researchers may choose to present the findings from individual cases separately before amalgamating across cases. Care must be taken to ensure the anonymity of both case sites and individual participants (if agreed in advance) by allocating appropriate codes or withholding descriptors. In the example given in Table 3 , we decided against providing detailed information on the NHS sites and individual participants in order to avoid the risk of inadvertent disclosure of identities[ 5 , 25 ].

What are the potential pitfalls and how can these be avoided?

The case study approach is, as with all research, not without its limitations. When investigating the formal and informal ways undergraduate students learn about patient safety (Table 4 ), for example, we rapidly accumulated a large quantity of data. The volume of data, together with the time restrictions in place, impacted on the depth of analysis that was possible within the available resources. This highlights a more general point of the importance of avoiding the temptation to collect as much data as possible; adequate time also needs to be set aside for data analysis and interpretation of what are often highly complex datasets.

Case study research has sometimes been criticised for lacking scientific rigour and providing little basis for generalisation (i.e. producing findings that may be transferable to other settings)[1]. There are several ways to address these concerns, including: the use of theoretical sampling (i.e. drawing on a particular conceptual framework); respondent validation (i.e. participants checking emerging findings and the researcher's interpretation, and providing an opinion as to whether they feel these are accurate); and transparency throughout the research process (see Table 8)[8, 18–21, 23, 26]. Transparency can be achieved by describing in detail the steps involved in case selection, data collection, the reasons for the particular methods chosen, and the researcher's background and level of involvement (i.e. being explicit about how the researcher has influenced data collection and interpretation). Seeking potential alternative explanations, and being explicit about how interpretations and conclusions were reached, helps readers to judge the trustworthiness of the case study report. Stake provides a critique checklist for a case study report (Table 9)[8].

Conclusions

The case study approach allows, amongst other things, critical events, interventions, policy developments and programme-based service reforms to be studied in detail in a real-life context. It should therefore be considered when an experimental design is either inappropriate to answer the research questions posed or impossible to undertake. Considering the frequency with which implementations of innovations are now taking place in healthcare settings and how well the case study approach lends itself to in-depth, complex health service research, we believe this approach should be more widely considered by researchers. Though inherently challenging, the research case study can, if carefully conceptualised and thoughtfully undertaken and reported, yield powerful insights into many important aspects of health and healthcare delivery.

References

1. Yin RK: Case study research, design and method. 2009, London: Sage Publications Ltd., 4th edition

2. Keen J, Packwood T: Qualitative research; case study evaluation. BMJ. 1995, 311: 444-446.

3. Sheikh A, Halani L, Bhopal R, Netuveli G, Partridge M, Car J, et al: Facilitating the Recruitment of Minority Ethnic People into Research: Qualitative Case Study of South Asians and Asthma. PLoS Med. 2009, 6 (10): 1-11.

4. Pinnock H, Huby G, Powell A, Kielmann T, Price D, Williams S, et al: The process of planning, development and implementation of a General Practitioner with a Special Interest service in Primary Care Organisations in England and Wales: a comparative prospective case study. Report for the National Co-ordinating Centre for NHS Service Delivery and Organisation R&D (NCCSDO). 2008, [http://www.sdo.nihr.ac.uk/files/project/99-final-report.pdf]

5. Robertson A, Cresswell K, Takian A, Petrakaki D, Crowe S, Cornford T, et al: Prospective evaluation of the implementation and adoption of NHS Connecting for Health's national electronic health record in secondary care in England: interim findings. BMJ. 2010, 41: c4564-

6. Pearson P, Steven A, Howe A, Sheikh A, Ashcroft D, Smith P, the Patient Safety Education Study Group: Learning about patient safety: organisational context and culture in the education of healthcare professionals. J Health Serv Res Policy. 2010, 15: 4-10. 10.1258/jhsrp.2009.009052.

7. van Harten WH, Casparie TF, Fisscher OA: The evaluation of the introduction of a quality management system: a process-oriented case study in a large rehabilitation hospital. Health Policy. 2002, 60 (1): 17-37. 10.1016/S0168-8510(01)00187-7.

8. Stake RE: The art of case study research. 1995, London: Sage Publications Ltd.

9. Sheikh A, Smeeth L, Ashcroft R: Randomised controlled trials in primary care: scope and application. Br J Gen Pract. 2002, 52 (482): 746-51.

10. King G, Keohane R, Verba S: Designing Social Inquiry. 1996, Princeton: Princeton University Press

11. Doolin B: Information technology as disciplinary technology: being critical in interpretative research on information systems. Journal of Information Technology. 1998, 13: 301-311. 10.1057/jit.1998.8.

12. George AL, Bennett A: Case studies and theory development in the social sciences. 2005, Cambridge, MA: MIT Press

13. Eccles M, the Improved Clinical Effectiveness through Behavioural Research Group (ICEBeRG): Designing theoretically-informed implementation interventions. Implementation Science. 2006, 1: 1-8. 10.1186/1748-5908-1-1.

14. Netuveli G, Hurwitz B, Levy M, Fletcher M, Barnes G, Durham SR, Sheikh A: Ethnic variations in UK asthma frequency, morbidity, and health-service use: a systematic review and meta-analysis. Lancet. 2005, 365 (9456): 312-7.

15. Sheikh A, Panesar SS, Lasserson T, Netuveli G: Recruitment of ethnic minorities to asthma studies. Thorax. 2004, 59 (7): 634-

16. Hellström I, Nolan M, Lundh U: 'We do things together': A case study of 'couplehood' in dementia. Dementia. 2005, 4: 7-22. 10.1177/1471301205049188.

17. Som CV: Nothing seems to have changed, nothing seems to be changing and perhaps nothing will change in the NHS: doctors' response to clinical governance. International Journal of Public Sector Management. 2005, 18: 463-477. 10.1108/09513550510608903.

18. Lincoln Y, Guba E: Naturalistic inquiry. 1985, Newbury Park: Sage Publications

19. Barbour RS: Checklists for improving rigour in qualitative research: a case of the tail wagging the dog?. BMJ. 2001, 322: 1115-1117. 10.1136/bmj.322.7294.1115.

20. Mays N, Pope C: Qualitative research in health care: Assessing quality in qualitative research. BMJ. 2000, 320: 50-52. 10.1136/bmj.320.7226.50.

21. Mason J: Qualitative researching. 2002, London: Sage

22. Brazier A, Cooke K, Moravan V: Using Mixed Methods for Evaluating an Integrative Approach to Cancer Care: A Case Study. Integr Cancer Ther. 2008, 7: 5-17. 10.1177/1534735407313395.

23. Miles MB, Huberman M: Qualitative data analysis: an expanded sourcebook. 1994, CA: Sage Publications Inc., 2nd edition

24. Pope C, Ziebland S, Mays N: Analysing qualitative data. Qualitative research in health care. BMJ. 2000, 320: 114-116. 10.1136/bmj.320.7227.114.

25. Cresswell KM, Worth A, Sheikh A: Actor-Network Theory and its role in understanding the implementation of information technology developments in healthcare. BMC Med Inform Decis Mak. 2010, 10 (1): 67. 10.1186/1472-6947-10-67.

26. Malterud K: Qualitative research: standards, challenges, and guidelines. Lancet. 2001, 358: 483-488. 10.1016/S0140-6736(01)05627-6.

27. Yin R: Case study research: design and methods. 1994, Thousand Oaks, CA: Sage Publishing, 2nd edition

28. Yin R: Enhancing the quality of case studies in health services research. Health Serv Res. 1999, 34: 1209-1224.

29. Green J, Thorogood N: Qualitative methods for health research. 2009, Los Angeles: Sage, 2nd edition

30. Howcroft D, Trauth E: Handbook of Critical Information Systems Research, Theory and Application. 2005, Cheltenham, UK: Northampton, MA, USA: Edward Elgar

31. Blakie N: Approaches to Social Enquiry. 1993, Cambridge: Polity Press

32. Doolin B: Power and resistance in the implementation of a medical management information system. Info Systems J. 2004, 14: 343-362. 10.1111/j.1365-2575.2004.00176.x.

33. Bloomfield BP, Best A: Management consultants: systems development, power and the translation of problems. Sociological Review. 1992, 40: 533-560.

34. Shanks G, Parr A: Positivist, single case study research in information systems: A critical analysis. Proceedings of the European Conference on Information Systems. 2003, Naples

Pre-publication history

The pre-publication history for this paper can be accessed here: http://www.biomedcentral.com/1471-2288/11/100/prepub

Acknowledgements

We are grateful to the participants and colleagues who contributed to the individual case studies that we have drawn on. This work received no direct funding, but it has been informed by projects funded by Asthma UK, the NHS Service Delivery Organisation, NHS Connecting for Health Evaluation Programme, and Patient Safety Research Portfolio. We would also like to thank the expert reviewers for their insightful and constructive feedback. Our thanks are also due to Dr. Allison Worth who commented on an earlier draft of this manuscript.

Author information

Authors and affiliations.

Division of Primary Care, The University of Nottingham, Nottingham, UK

Sarah Crowe & Anthony Avery

Centre for Population Health Sciences, The University of Edinburgh, Edinburgh, UK

Kathrin Cresswell, Ann Robertson & Aziz Sheikh

School of Health in Social Science, The University of Edinburgh, Edinburgh, UK


Corresponding author

Correspondence to Sarah Crowe.

Additional information

Competing interests.

The authors declare that they have no competing interests.

Authors' contributions

AS conceived this article. SC, KC and AR wrote this paper with GH, AA and AS all commenting on various drafts. SC and AS are guarantors.

Rights and permissions

This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.


About this article

Cite this article.

Crowe, S., Cresswell, K., Robertson, A. et al. The case study approach. BMC Med Res Methodol 11, 100 (2011). https://doi.org/10.1186/1471-2288-11-100


Received: 29 November 2010

Accepted: 27 June 2011

Published: 27 June 2011

DOI: https://doi.org/10.1186/1471-2288-11-100


  • Case Study Approach
  • Electronic Health Record System
  • Case Study Design
  • Case Study Site
  • Case Study Report

BMC Medical Research Methodology

ISSN: 1471-2288


Chapter 8: Case study

Darshini Ayton

Learning outcomes

Upon completion of this chapter, you should be able to:

  • Identify the key terms and concepts used in qualitative case study research.
  • Discuss the advantages and disadvantages of qualitative case study research.

What is a case study?

The key concept in a case study is context.

In qualitative research, case studies provide in-depth accounts of events, relationships, experiences or processes. Stemming from the fields of evaluation, political science and law, the aim of a qualitative case study is to explore a phenomenon within the context of the case 1 and to answer how and why research questions. 2 The contextual conditions are relevant to the phenomenon under study and the contextual factors tend to lie with the case. 1 From the outset it is important (a) to determine who or what is your case – this can be a person, program, organisation or group, or a process – and (b) to articulate the phenomenon of interest.

An example of why context is important in understanding the phenomenon of interest is a study of health promotion action by local churches in Victoria, Australia. 3 The phenomenon under study was health promotion action, with 10 churches comprising the cases, which were mapped across the framework of health promotion approaches. 4 The contextual factors included church denomination (Baptist, Church of Christ, Uniting, Anglican, Catholic and Salvation Army), size (small, medium and large), location (rural and metropolitan), partnerships with external organisations (government, local schools and social welfare organisations) and theological orientation (traditional, modern or postmodern), to understand the phenomenon of health promotion action. Data collection took 12 months and involved interviews with 37 church leaders, 10 focus groups with volunteers, 17 instances of participant observation of church activities, including church services, youth events, food banks and community meals, and 12 documentary analyses of church websites, newsletters and annual reports. The case studies identified and illustrated how and why three different expressions of church – traditional, new modern and emerging – led to different levels and types of health promotion activities.

Three prominent qualitative case study methodologists, Robert Stake, Robert Yin and Sharan Merriam, have articulated different approaches to case studies and their underpinning philosophical and paradigmatic assumptions. Table 8.1 outlines these approaches, based on work by Yazan, 5 whose expanded table covers characteristics of case studies, data collection and analysis.

Table 8.1. Comparison of case study terms used by three key methodologists

Title

  • Stake: The Art of Case Study Research
  • Yin: Case Study Research: Design and Methods
  • Merriam: Qualitative Research and Case Study Applications in Education

First author and year

  • Stake, 1995
  • Yin, 2002
  • Merriam, 1998

Definition of qualitative case study

  • Stake: ‘study of the particularity and complexity of a single case, coming to understand its activity within important circumstances’
  • Yin: ‘empirical inquiry that investigates a contemporary phenomenon within its real-life context… relies on multiple sources of evidence’
  • Merriam: ‘an intensive, holistic description and analysis of a bounded phenomenon such as a program, an institution, a person, a process, or a social unit’

Paradigm

  • Stake: Constructivism
  • Yin: Positivism
  • Merriam: Constructivism

Definition of a case

  • Stake: ‘a specific, a complex, functioning thing, more specifically an integrated system’ which ‘has a boundary and working parts’ and purposive (in social sciences and human services)
  • Yin: ‘a contemporary phenomenon within its real life context, especially when the boundaries between a phenomenon and context are not clear and the researcher has little control over the phenomenon and context’
  • Merriam: ‘a thing, a single entity, a unit around which there are boundaries’. It can ‘be a person… a program, a group… a specific policy, and so on’

Table 8.1 is derived from ‘Three Approaches to Case Study Methods in Education: Yin, Merriam, and Stake’ by Bedrettin Yazan, licensed under CC BY-NC-SA 4.0. 5

There are several forms of qualitative case studies. 1,2

Discovery-led case studies, which:

  • describe what is happening in the setting
  • explore the key issues affecting people within the setting
  • compare settings, to learn from the similarities and differences between them.

Theory-led case studies, which:

  • explain the causes of events, processes or relationships within a setting
  • illustrate how a particular theory applies to a real-life setting
  • experiment with changes in the setting to test specific factors or variables.

Single and collective case studies, where: 2, 9

  • the researcher wants to understand a unique phenomenon in detail, known as an intrinsic case study
  • the researcher is seeking insight and understanding of a particular situation or phenomenon, known as an illustrative case study or instrumental case study.

In intrinsic, instrumental and illustrative case studies, the exploration might take place within a single case. In contrast, a collective case study includes multiple individual cases, and the exploration occurs both within and between cases. Collective case studies may include comparative cases, whereby cases are sampled to provide points of comparison for either context or the phenomenon. Embedded case studies are increasingly common within multi-site, randomised controlled trials, where each of the study sites is considered a case.

Multiple forms of data collection and methods of analysis (e.g. thematic, content, framework and constant comparative analyses) can be employed, since case studies are characterised by the depth of knowledge they provide and their nuanced approaches to understanding phenomena within context. 2,5 This approach enables triangulation between data sources (interviews, focus groups, participant observations), researchers and theory. Refer to Chapter 19 for information about triangulation.

Advantages and disadvantages of qualitative case studies

Advantages of using a case study approach include the ability to explore the subtleties and intricacies of complex social situations, and the use of multiple data collection methods and data from multiple sources within the case, which enables rigour through triangulation. Collective case studies enable comparison and contrasting within and across cases.

However, it can be challenging to define the boundaries of the case and to gain appropriate access to the case for the ‘deep dive’ form of analysis. Participant observation, which is a common form of data collection, can lead to observer bias. Data collection can take a long time, and the study may require substantial resources and funding. 9

Table 8.2 provides an example of a single case study and of a collective case study.

Table 8.2. Examples of qualitative case studies

Title

  • Nayback-Beebe, 2012: The lived experiences of a male survivor of intimate partner violence: a qualitative case study
  • Clack, 2018: Implementing infection prevention practices across European hospitals: an in-depth qualitative assessment

Aim

  • Nayback-Beebe, 2012: ‘The purpose of this phenomenological qualitative case study… was to gain a holistic understanding of the lived-experience of a male victim of intimate partner violence and the real-life context in which the violence emerged.’
  • Clack, 2018: ‘in-depth investigation of the main barriers, facilitators and contextual factors relevant to successfully implementing these strategies in European acute care hospitals’

Research question

  • Nayback-Beebe, 2012: ‘What is the lived experience of living in and leaving an abusive intimate relationship for a white middle class male?’
  • Clack, 2018: ‘(1) what are the main barriers and facilitators to successfully implementing CRBSI prevention procedures?; and (2) what role do contextual factors play?’

Case study design

  • Nayback-Beebe, 2012: A single, intrinsic qualitative research study. Following Yin’s case study approach, the authors wished to uncover the contextual conditions relevant to the phenomenon under study – living in and leaving an abusive intimate relationship as a white, middle-class male. The researchers wanted to understand and explore the contextual conditions related to female-to-male perpetrated intimate partner violence.
  • Clack, 2018: A qualitative comparative case study of 6 of the 14 hospitals participating in the Prevention of Hospital Infections by Intervention and Training (PROHIBIT) randomised controlled study on the prevention of catheter-related bloodstream infection. The case study examined contextual factors that affect the implementation of an intervention, particularly across culturally, politically and economically diverse hospital settings in Europe.

Setting

  • Nayback-Beebe, 2012: United States of America, insights from a case study to provide nurses with an understanding that intimate partner violence occurs in the lives of men and women, and to be aware of this in the inpatient and outpatient settings.
  • Clack, 2018: European acute-care hospitals that were participating in the PROHIBIT randomised controlled trial.

Data collection

  • Nayback-Beebe, 2012: Three in-depth interviews conducted over one month. The participant was a 44-year-old man who met the following inclusion criteria:
      • self-reported survivor of physical, emotional, verbal abuse, harassment and/or humiliation by a current or former partner
      • the violence occurred in the context of a heterosexual relationship
      • was in the process of leaving or had left the relationship
  • Clack, 2018: Data were collected before and after the implementation of an intervention and included 129 interviews (133 hours) with hospital administration, IPC and ICU leadership and staff, and telephone interviews with onsite investigators, alongside 41 hours of direct observations.

Analysis

  • Nayback-Beebe, 2012: Existential phenomenology following Colaizzi’s method for data analysis.
  • Clack, 2018: Thematic analysis was inductive (first site visit) and deductive (second site visit), with cross-case analysis using a stacking technique; cases were grouped according to common characteristics and differences, and similarities were examined.

Themes

  • Nayback-Beebe, 2012:
      • Theme 1. Living in the relationship – confrontation from within
      • Theme 2. Living in the relationship – confrontation from without
      • Theme 3. Leaving the relationship – realisation and relinquishment
      • Overarching theme: Living with a knot in your stomach
  • Clack, 2018: Three meta themes were identified:
      • implementation agendas
      • resourcing
      • boundary spanning

Qualitative case studies provide a study design with diverse methods to examine the contextual factors relevant to understanding the why and how of a phenomenon within a case. The design incorporates single case studies and collective cases, which can also be embedded within randomised controlled trials as a form of process evaluation.

  • Creswell J, Hanson W, Clark Plano V, et al. Qualitative research designs: selection and implementation. Couns Psychol. 2007;35(2):236-264. doi:10.1177/0011000006287390
  • Crowe S, Cresswell K, Robertson A, et al. The case study approach. BMC Med Res Methodol. 2011;11:100. doi:10.1186/1471-2288-11-100
  • Ayton D, Manderson L, Smith BJ, et al. Health promotion in local churches in Victoria: an exploratory study. Health Soc Care Community. 2016;24(6):728-738. doi:10.1111/hsc.12258
  • Keleher H, Murphy C. Understanding Health: A Determinants Approach. Oxford University Press; 2004.
  • Yazan B. Three approaches to case study methods in education: Yin, Merriam, and Stake. The Qualitative Report. 2015;20(2):134-152. doi:10.46743/2160-3715/2015.2102
  • Stake RE. The Art of Case Study Research. SAGE Publications; 1995.
  • Yin RK. Case Study Research: Design and Methods. SAGE Publications; 2002.
  • Merriam SB. Qualitative Research and Case Study Applications in Education. Jossey-Bass; 1998.
  • Kekeya J. Qualitative case study research design: the commonalities and differences between collective, intrinsic and instrumental case studies. Contemporary PNG Studies. 2021;36:28-37.
  • Nayback-Beebe AM, Yoder LH. The lived experiences of a male survivor of intimate partner violence: a qualitative case study. Medsurg Nurs. 2012;21(2):89-95; quiz 96.
  • Clack L, Zingg W, Saint S, et al. Implementing infection prevention practices across European hospitals: an in-depth qualitative assessment. BMJ Qual Saf. 2018;27(10):771-780. doi:10.1136/bmjqs-2017-007675

Qualitative Research – a practical guide for health and social care researchers and practitioners. Copyright © 2023 by Darshini Ayton is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License, except where otherwise noted.


The Advantages and Limitations of Single Case Study Analysis


As Andrew Bennett and Colin Elman have recently noted, qualitative research methods presently enjoy “an almost unprecedented popularity and vitality… in the international relations sub-field”, such that they are now “indisputably prominent, if not pre-eminent” (2010: 499). This is, they suggest, due in no small part to the considerable advantages that case study methods in particular have to offer in studying the “complex and relatively unstructured and infrequent phenomena that lie at the heart of the subfield” (Bennett and Elman, 2007: 171). Using selected examples from within the International Relations literature[1], this paper aims to provide a brief overview of the main principles and distinctive advantages and limitations of single case study analysis. Divided into three inter-related sections, the paper therefore begins by first identifying the underlying principles that serve to constitute the case study as a particular research strategy, noting the somewhat contested nature of the approach in ontological, epistemological, and methodological terms. The second part then looks to the principal single case study types and their associated advantages, including those from within the recent ‘third generation’ of qualitative International Relations (IR) research. The final section of the paper then discusses the most commonly articulated limitations of single case studies; while accepting their susceptibility to criticism, it is however suggested that such weaknesses are somewhat exaggerated. The paper concludes that single case study analysis has a great deal to offer as a means of both understanding and explaining contemporary international relations.

The term ‘case study’, John Gerring has suggested, is “a definitional morass… Evidently, researchers have many different things in mind when they talk about case study research” (2006a: 17). It is possible, however, to distil some of the more commonly-agreed principles. One of the most prominent advocates of case study research, Robert Yin (2009: 14) defines it as “an empirical enquiry that investigates a contemporary phenomenon in depth and within its real-life context, especially when the boundaries between phenomenon and context are not clearly evident”. What this definition usefully captures is that case studies are intended – unlike more superficial and generalising methods – to provide a level of detail and understanding, similar to the ethnographer Clifford Geertz’s (1973) notion of ‘thick description’, that allows for the thorough analysis of the complex and particularistic nature of distinct phenomena. Another frequently cited proponent of the approach, Robert Stake, notes that as a form of research the case study “is defined by interest in an individual case, not by the methods of inquiry used”, and that “the object of study is a specific, unique, bounded system” (2008: 443, 445). As such, three key points can be derived from this – respectively concerning issues of ontology, epistemology, and methodology – that are central to the principles of single case study research.

First, the vital notion of ‘boundedness’ when it comes to the particular unit of analysis means that defining principles should incorporate both the synchronic (spatial) and diachronic (temporal) elements of any so-called ‘case’. As Gerring puts it, a case study should be “an intensive study of a single unit… a spatially bounded phenomenon – e.g. a nation-state, revolution, political party, election, or person – observed at a single point in time or over some delimited period of time” (2004: 342). It is important to note, however, that – whereas Gerring refers to a single unit of analysis – attention may also need to be given to particular sub-units. This points to the important difference between what Yin refers to as a ‘holistic’ case design, with a single unit of analysis, and an ‘embedded’ case design with multiple units of analysis (Yin, 2009: 50-52). The former, for example, would examine only the overall nature of an international organization, whereas the latter would also look to specific departments, programmes, or policies.

Secondly, as Tim May notes of the case study approach, “even the most fervent advocates acknowledge that the term has entered into understandings with little specification or discussion of purpose and process” (2011: 220). One of the principal reasons for this, he argues, is the relationship between the use of case studies in social research and the differing epistemological traditions – positivist, interpretivist, and others – within which it has been utilised. Philosophy of science concerns are obviously a complex issue, and beyond the scope of much of this paper. That said, the issue of how it is that we know what we know – of whether or not a single independent reality exists of which we as researchers can seek to provide explanation – does lead us to an important distinction to be made between so-called idiographic and nomothetic case studies (Gerring, 2006b). The former refers to those which purport to explain only a single case, are concerned with particularisation, and hence are typically (although not exclusively) associated with more interpretivist approaches. The latter are those focused studies that reflect upon a larger population and are more concerned with generalisation, as is often so with more positivist approaches[2]. The importance of this distinction, and its relation to the advantages and limitations of single case study analysis, is returned to below.

Thirdly, in methodological terms, given that the case study has often been seen as more of an interpretivist and idiographic tool, it has also been associated with a distinctly qualitative approach (Bryman, 2009: 67-68). However, as Yin notes, case studies can – like all forms of social science research – be exploratory, descriptive, and/or explanatory in nature. It is “a common misconception”, he notes, “that the various research methods should be arrayed hierarchically… many social scientists still deeply believe that case studies are only appropriate for the exploratory phase of an investigation” (Yin, 2009: 6). If case studies can reliably perform any or all three of these roles – and given that their in-depth approach may also require multiple sources of data and the within-case triangulation of methods – then it becomes readily apparent that they should not be limited to only one research paradigm. Exploratory and descriptive studies usually tend toward the qualitative and inductive, whereas explanatory studies are more often quantitative and deductive (David and Sutton, 2011: 165-166). As such, the association of case study analysis with a qualitative approach is a “methodological affinity, not a definitional requirement” (Gerring, 2006a: 36). It is perhaps better to think of case studies as transparadigmatic; it is mistaken to assume single case study analysis to adhere exclusively to a qualitative methodology (or an interpretivist epistemology) even if it – or rather, practitioners of it – may be so inclined. By extension, this also implies that single case study analysis therefore remains an option for a multitude of IR theories and issue areas; it is how this can be put to researchers’ advantage that is the subject of the next section.

Having elucidated the defining principles of the single case study approach, the paper now turns to an overview of its main benefits. As noted above, a lack of consensus still exists within the wider social science literature on the principles and purposes – and by extension the advantages and limitations – of case study research. Given that this paper is directed towards the particular sub-field of International Relations, it suggests Bennett and Elman’s (2010) more discipline-specific understanding of contemporary case study methods as an analytical framework. It begins, however, by discussing Harry Eckstein’s seminal (1975) contribution to the potential advantages of the case study approach within the wider social sciences.

Eckstein proposed a taxonomy which usefully identified what he considered to be the five most relevant types of case study. First were so-called configurative-idiographic studies, distinctly interpretivist in orientation and predicated on the assumption that “one cannot attain prediction and control in the natural science sense, but only understanding (verstehen)… subjective values and modes of cognition are crucial” (1975: 132). Eckstein’s own sceptical view was that any interpreter ‘simply’ considers a body of observations that are not self-explanatory and “without hard rules of interpretation, may discern in them any number of patterns that are more or less equally plausible” (1975: 134). Those of a more post-modernist bent, of course – sharing an “incredulity towards meta-narratives”, in Lyotard’s (1994: xxiv) evocative phrase – would instead suggest that this more free-form approach is actually advantageous in delving into the subtleties and particularities of individual cases.

Eckstein’s four other types of case study, meanwhile, promote a more nomothetic (and positivist) usage. As described, disciplined-configurative studies were essentially about the use of pre-existing general theories, with a case acting “passively, in the main, as a receptacle for putting theories to work” (Eckstein, 1975: 136). As opposed to the opportunity this presented primarily for theory application, Eckstein identified heuristic case studies as explicit theoretical stimulants – thus having instead the intended advantage of theory-building. So-called plausibility probes entailed preliminary attempts to determine whether initial hypotheses should be considered sound enough to warrant more rigorous and extensive testing. Finally, and perhaps most notably, Eckstein then outlined the idea of crucial case studies, within which he also included the idea of ‘most-likely’ and ‘least-likely’ cases; the essential characteristic of crucial cases being their specific theory-testing function.

Whilst Eckstein’s was an early contribution to refining the case study approach, Yin’s (2009: 47-52) more recent delineation of possible single case designs similarly assigns them roles in the applying, testing, or building of theory, as well as in the study of unique cases[3]. As a subset of the latter, however, Jack Levy (2008) notes that the advantages of idiographic cases are actually twofold. Firstly, as inductive/descriptive cases – akin to Eckstein’s configurative-idiographic cases – whereby they are highly descriptive, lacking in an explicit theoretical framework and therefore taking the form of “total history”. Secondly, they can operate as theory-guided case studies, but ones that seek only to explain or interpret a single historical episode rather than generalise beyond the case. Not only does this therefore incorporate ‘single-outcome’ studies concerned with establishing causal inference (Gerring, 2006b), it also provides room for the more postmodern approaches within IR theory, such as discourse analysis, that may have developed a distinct methodology but do not seek traditional social scientific forms of explanation.

Applying specifically to the state of the field in contemporary IR, Bennett and Elman identify a ‘third generation’ of mainstream qualitative scholars – rooted in a pragmatic scientific realist epistemology and advocating a pluralistic approach to methodology – that have, over the last fifteen years, “revised or added to essentially every aspect of traditional case study research methods” (2010: 502). They identify ‘process tracing’ as having emerged from this as a central method of within-case analysis. As Bennett and Checkel observe, this carries the advantage of offering a methodologically rigorous “analysis of evidence on processes, sequences, and conjunctures of events within a case, for the purposes of either developing or testing hypotheses about causal mechanisms that might causally explain the case” (2012: 10).

Harnessing various methods, process tracing may entail the inductive use of evidence from within a case to develop explanatory hypotheses, and deductive examination of the observable implications of hypothesised causal mechanisms to test their explanatory capability[4]. It involves providing not only a coherent explanation of the key sequential steps in a hypothesised process, but also sensitivity to alternative explanations as well as potential biases in the available evidence (Bennett and Elman 2010: 503-504). John Owen (1994), for example, demonstrates the advantages of process tracing in analysing whether the causal factors underpinning democratic peace theory are – as liberalism suggests – not epiphenomenal, but variously normative, institutional, or some given combination of the two or other unexplained mechanism inherent to liberal states. Within-case process tracing has also been identified as advantageous in addressing the complexity of path-dependent explanations and critical junctures – as for example with the development of political regime types – and their constituent elements of causal possibility, contingency, closure, and constraint (Bennett and Elman, 2006b).

Bennett and Elman (2010: 505-506) also identify the advantages of single case studies that are implicitly comparative: deviant, most-likely, least-likely, and crucial cases. Of these, so-called deviant cases are those whose outcome does not fit with prior theoretical expectations or wider empirical patterns – again, the use of inductive process tracing has the advantage of potentially generating new hypotheses from these, either particular to that individual case or potentially generalisable to a broader population. A classic example here is that of post-independence India as an outlier to the standard modernisation theory of democratisation, which holds that higher levels of socio-economic development are typically required for the transition to, and consolidation of, democratic rule (Lipset, 1959; Diamond, 1992). Absent these factors, MacMillan’s single case study analysis (2008) suggests the particularistic importance of the British colonial heritage, the ideology and leadership of the Indian National Congress, and the size and heterogeneity of the federal state.

Most-likely cases, as per Eckstein above, are those in which a theory is to be considered likely to provide a good explanation if it is to have any application at all, whereas least-likely cases are ‘tough test’ ones in which the posited theory is unlikely to provide good explanation (Bennett and Elman, 2010: 505). Levy (2008) neatly refers to the inferential logic of the least-likely case as the ‘Sinatra inference’ – if a theory can make it here, it can make it anywhere. Conversely, if a theory cannot pass a most-likely case, it is seriously impugned. Single case analysis can therefore be valuable for the testing of theoretical propositions, provided that predictions are relatively precise and measurement error is low (Levy, 2008: 12-13). As Gerring rightly observes of this potential for falsification:

“a positivist orientation toward the work of social science militates toward a greater appreciation of the case study format, not a denigration of that format, as is usually supposed” (Gerring, 2007: 247, emphasis added).

In summary, the various forms of single case study analysis can – through the application of multiple qualitative and/or quantitative research methods – provide a nuanced, empirically-rich, holistic account of specific phenomena. This may be particularly appropriate for those phenomena that are simply less amenable to more superficial measures and tests (or indeed any substantive form of quantification) as well as those for which our reasons for understanding and/or explaining them are irreducibly subjective – as, for example, with many of the normative and ethical issues associated with the practice of international relations. From various epistemological and analytical standpoints, single case study analysis can incorporate both idiographic sui generis cases and, where the potential for generalisation may exist, nomothetic case studies suitable for the testing and building of causal hypotheses. Finally, it should not be ignored that a signal advantage of the case study – with particular relevance to international relations – also exists at a more practical rather than theoretical level. This is, as Eckstein noted, “that it is economical for all resources: money, manpower, time, effort… especially important, of course, if studies are inherently costly, as they are if units are complex collective individuals” (1975: 149-150, emphasis added).

Limitations

Single case study analysis has, however, been subject to a number of criticisms, the most common of which concern the inter-related issues of methodological rigour, researcher subjectivity, and external validity. With regard to the first point, the prototypical view here is that of Zeev Maoz (2002: 164-165), who suggests that “the use of the case study absolves the author from any kind of methodological considerations. Case studies have become in many cases a synonym for freeform research where anything goes”. The absence of systematic procedures for case study research is something that Yin (2009: 14-15) sees as traditionally the greatest concern due to a relative absence of methodological guidelines. As the previous section suggests, this critique seems somewhat unfair; many contemporary case study practitioners – and representing various strands of IR theory – have increasingly sought to clarify and develop their methodological techniques and epistemological grounding (Bennett and Elman, 2010: 499-500).

A second issue, again also incorporating issues of construct validity, concerns that of the reliability and replicability of various forms of single case study analysis. This is usually tied to a broader critique of qualitative research methods as a whole. However, whereas the latter obviously tend toward an explicitly-acknowledged interpretive basis for meanings, reasons, and understandings:

“quantitative measures appear objective, but only so long as we don’t ask questions about where and how the data were produced… pure objectivity is not a meaningful concept if the goal is to measure intangibles [as] these concepts only exist because we can interpret them” (Berg and Lune, 2010: 340).

The question of researcher subjectivity is a valid one, and it may be intended only as a methodological critique of what are obviously less formalised and researcher-independent methods (Verschuren, 2003). Owen (1994) and Layne’s (1994) contradictory process tracing results of interdemocratic war-avoidance during the Anglo-American crisis of 1861 to 1863 – from liberal and realist standpoints respectively – are a useful example. However, it does also rest on certain assumptions that can raise deeper and potentially irreconcilable ontological and epistemological issues. There are, regardless, plenty such as Bent Flyvbjerg (2006: 237) who suggest that the case study contains no greater bias toward verification than other methods of inquiry, and that “on the contrary, experience indicates that the case study contains a greater bias toward falsification of preconceived notions than toward verification”.

The third and arguably most prominent critique of single case study analysis is the issue of external validity or generalisability. How is it that one case can reliably offer anything beyond the particular? “We always do better (or, in the extreme, no worse) with more observation as the basis of our generalization”, as King et al. write; “in all social science research and all prediction, it is important that we be as explicit as possible about the degree of uncertainty that accompanies our prediction” (1994: 212). This is an unavoidably valid criticism. It may be that theories which pass a single crucial case study test, for example, require rare antecedent conditions and therefore actually have little explanatory range. These conditions may emerge more clearly, as Van Evera (1997: 51-54) notes, from large-N studies in which cases that lack them present themselves as outliers exhibiting a theory’s cause but without its predicted outcome. As with the case of Indian democratisation above, it would logically be preferable to conduct large-N analysis beforehand to identify that state’s non-representative nature in relation to the broader population.

There are, however, three important qualifiers to the argument about generalisation that deserve particular mention here. The first is that with regard to an idiographic single-outcome case study, as Eckstein notes, the criticism is “mitigated by the fact that its capability to do so [is] never claimed by its exponents; in fact it is often explicitly repudiated” (1975: 134). Criticism of generalisability is of little relevance when the intention is one of particularisation. A second qualifier relates to the difference between statistical and analytical generalisation; single case studies are clearly less appropriate for the former but arguably retain significant utility for the latter – the difference also between explanatory and exploratory, or theory-testing and theory-building, as discussed above. As Gerring puts it, “theory confirmation/disconfirmation is not the case study’s strong suit” (2004: 350). A third qualification relates to the issue of case selection. As Seawright and Gerring (2008) note, the generalisability of case studies can be increased by the strategic selection of cases. Representative or random samples may not be the most appropriate, given that they may not provide the richest insight (or indeed, that a random and unknown deviant case may appear). Instead, and properly used, atypical or extreme cases “often reveal more information because they activate more actors… and more basic mechanisms in the situation studied” (Flyvbjerg, 2006). Of course, this also points to the very serious limitation, as hinted at with the case of India above, that poor case selection may alternatively lead to overgeneralisation and/or grievous misunderstandings of the relationship between variables or processes (Bennett and Elman, 2006a: 460-463).

As Tim May (2011: 226) notes, “the goal for many proponents of case studies […] is to overcome dichotomies between generalizing and particularizing, quantitative and qualitative, deductive and inductive techniques”. Research aims should drive methodological choices, rather than narrow and dogmatic preconceived approaches. As demonstrated above, there are various advantages to both idiographic and nomothetic single case study analyses – notably the empirically-rich, context-specific, holistic accounts that they have to offer, and their contribution to theory-building and, to a lesser extent, that of theory-testing. Furthermore, while they do possess clear limitations, any research method involves necessary trade-offs; the inherent weaknesses of any one method, however, can potentially be offset by situating them within a broader, pluralistic mixed-method research strategy. Whether or not single case studies are used in this fashion, they clearly have a great deal to offer.

References 

Bennett, A. and Checkel, J. T. (2012) ‘Process Tracing: From Philosophical Roots to Best Practice’, Simons Papers in Security and Development, No. 21/2012, School for International Studies, Simon Fraser University: Vancouver.

Bennett, A. and Elman, C. (2006a) ‘Qualitative Research: Recent Developments in Case Study Methods’, Annual Review of Political Science, 9, 455-476.

Bennett, A. and Elman, C. (2006b) ‘Complex Causal Relations and Case Study Methods: The Example of Path Dependence’, Political Analysis, 14, 3, 250-267.

Bennett, A. and Elman, C. (2007) ‘Case Study Methods in the International Relations Subfield’, Comparative Political Studies, 40, 2, 170-195.

Bennett, A. and Elman, C. (2010) Case Study Methods. In C. Reus-Smit and D. Snidal (eds) The Oxford Handbook of International Relations. Oxford University Press: Oxford. Ch. 29.

Berg, B. and Lune, H. (2012) Qualitative Research Methods for the Social Sciences. Pearson: London.

Bryman, A. (2012) Social Research Methods. Oxford University Press: Oxford.

David, M. and Sutton, C. D. (2011) Social Research: An Introduction. SAGE Publications Ltd: London.

Diamond, J. (1992) ‘Economic development and democracy reconsidered’, American Behavioral Scientist, 35, 4/5, 450-499.

Eckstein, H. (1975) Case Study and Theory in Political Science. In R. Gomm, M. Hammersley, and P. Foster (eds) Case Study Method. SAGE Publications Ltd: London.

Flyvbjerg, B. (2006) ‘Five Misunderstandings About Case-Study Research’, Qualitative Inquiry, 12, 2, 219-245.

Geertz, C. (1973) The Interpretation of Cultures: Selected Essays by Clifford Geertz. Basic Books Inc: New York.

Gerring, J. (2004) ‘What is a Case Study and What Is It Good for?’, American Political Science Review, 98, 2, 341-354.

Gerring, J. (2006a) Case Study Research: Principles and Practices. Cambridge University Press: Cambridge.

Gerring, J. (2006b) ‘Single-Outcome Studies: A Methodological Primer’, International Sociology, 21, 5, 707-734.

Gerring, J. (2007) ‘Is There a (Viable) Crucial-Case Method?’, Comparative Political Studies, 40, 3, 231-253.

King, G., Keohane, R. O. and Verba, S. (1994) Designing Social Inquiry: Scientific Inference in Qualitative Research. Princeton University Press: Chichester.

Layne, C. (1994) ‘Kant or Cant: The Myth of the Democratic Peace’, International Security, 19, 2, 5-49.

Levy, J. S. (2008) ‘Case Studies: Types, Designs, and Logics of Inference’, Conflict Management and Peace Science, 25, 1-18.

Lipset, S. M. (1959) ‘Some Social Requisites of Democracy: Economic Development and Political Legitimacy’, The American Political Science Review, 53, 1, 69-105.

Lyotard, J-F. (1984) The Postmodern Condition: A Report on Knowledge. University of Minnesota Press: Minneapolis.

MacMillan, A. (2008) ‘Deviant Democratization in India’, Democratization, 15, 4, 733-749.

Maoz, Z. (2002) Case study methodology in international studies: from storytelling to hypothesis testing. In F. P. Harvey and M. Brecher (eds) Evaluating Methodology in International Studies. University of Michigan Press: Ann Arbor.

May, T. (2011) Social Research: Issues, Methods and Process. Open University Press: Maidenhead.

Owen, J. M. (1994) ‘How Liberalism Produces Democratic Peace’, International Security, 19, 2, 87-125.

Seawright, J. and Gerring, J. (2008) ‘Case Selection Techniques in Case Study Research: A Menu of Qualitative and Quantitative Options’, Political Research Quarterly, 61, 2, 294-308.

Stake, R. E. (2008) Qualitative Case Studies. In N. K. Denzin and Y. S. Lincoln (eds) Strategies of Qualitative Inquiry. Sage Publications: Los Angeles. Ch. 17.

Van Evera, S. (1997) Guide to Methods for Students of Political Science. Cornell University Press: Ithaca.

Verschuren, P. J. M. (2003) ‘Case study as a research strategy: some ambiguities and opportunities’, International Journal of Social Research Methodology, 6, 2, 121-139.

Yin, R. K. (2009) Case Study Research: Design and Methods. SAGE Publications Ltd: London.

[1] The paper follows convention by differentiating between ‘International Relations’ as the academic discipline and ‘international relations’ as the subject of study.

[2] There is some similarity here with Stake’s (2008: 445-447) notion of intrinsic cases, those undertaken for a better understanding of the particular case, and instrumental ones that provide insight for the purposes of a wider external interest.

[3] These may be unique in the idiographic sense, or in nomothetic terms as an exception to the generalising suppositions of either probabilistic or deterministic theories (as per deviant cases, below).

[4] Although there are “philosophical hurdles to mount”, according to Bennett and Checkel, there exists no a priori reason as to why process tracing (as typically grounded in scientific realism) is fundamentally incompatible with various strands of positivism or interpretivism (2012: 18-19). By extension, it can therefore be incorporated by a range of contemporary mainstream IR theories.

— Written by: Ben Willis Written at: University of Plymouth Written for: David Brockington Date written: January 2013

Exploratory Research | Definition, Guide, & Examples

Published on 6 May 2022 by Tegan George. Revised on 20 January 2023.

Exploratory research is a methodology approach that investigates topics and research questions that have not previously been studied in depth.

Exploratory research is often qualitative in nature. However, a study with a large sample conducted in an exploratory manner can be quantitative as well. It is also often referred to as interpretive research or a grounded theory approach due to its flexible and open-ended nature.

Table of contents

  • When to use exploratory research
  • Exploratory research questions
  • Exploratory research data collection
  • Step-by-step example of exploratory research
  • Exploratory vs explanatory research
  • Advantages and disadvantages of exploratory research
  • Frequently asked questions about exploratory research

Exploratory research is often used when the issue you’re studying is new or when the data collection process is challenging for some reason.

You can use this type of research if you have a general idea or a specific question that you want to study but there is no preexisting knowledge or paradigm with which to study it.

Exploratory research questions are designed to help you understand more about a particular topic of interest. They can help you connect ideas to understand the groundwork of your analysis without adding any preconceived notions or assumptions yet.

Here are some examples:

  • What effect does using a digital notebook have on the attention span of primary schoolers?
  • What factors influence mental health in undergraduates?
  • What outcomes are associated with an authoritative parenting style?
  • In what ways does the presence of a non-native accent affect intelligibility?
  • How can the use of a grocery delivery service reduce food waste in single-person households?

Collecting information on a previously unexplored topic can be challenging. Exploratory research can help you narrow down your topic and formulate a clear hypothesis, as well as giving you the ‘lay of the land’ on your topic.

Data collection using exploratory research is often divided into primary and secondary research methods, with data analysis following the same model.

Primary research

In primary research, your data is collected directly from primary sources: your participants. There is a variety of ways to collect primary data.

Some examples include:

  • Survey methodology: Sending a survey out to the student body asking them if they would eat vegan meals
  • Focus groups: Compiling groups of 8–10 students and discussing what they think of vegan options for dining hall food
  • Interviews: Interviewing students entering and exiting the dining hall, asking if they would eat vegan meals

Secondary research

In secondary research, your data is collected from preexisting primary research, such as experiments or surveys.

Some other examples include:

  • Case studies: Health of an all-vegan diet
  • Literature reviews: Preexisting research about students’ eating habits and how they have changed over time
  • Online polls, surveys, blog posts, or interviews; social media: Have other universities done something similar?

For some subjects, it’s possible to use large-n government data, such as the decennial census or yearly American Community Survey (ACS) open-source data.

How you proceed with your exploratory research design depends on the research method you choose to collect your data. In most cases, you will follow five steps.

We’ll walk you through the steps using the following example.

Therefore, you would like to focus on improving intelligibility instead of reducing the learner’s accent.

Step 1: Identify your problem

The first step in conducting exploratory research is identifying what the problem is and whether this type of research is the right avenue for you to pursue. Remember that exploratory research is most advantageous when you are investigating a previously unexplored problem.

Step 2: Hypothesise a solution

The next step is to come up with a solution to the problem you’re investigating. Formulate a hypothetical statement to guide your research.

Step 3: Design your methodology

Next, conceptualise your data collection and data analysis methods and write them up in a research design.

Step 4: Collect and analyse data

Next, you proceed with collecting and analysing your data so you can determine whether your preliminary results are in line with your hypothesis.

In most types of research, you should formulate your hypotheses a priori and refrain from changing them due to the increased risk of Type I errors and data integrity issues. However, in exploratory research, you are allowed to change your hypothesis based on your findings, since you are exploring a previously unexplained phenomenon that could have many explanations.
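
To make the Type I error concern concrete, here is a minimal sketch of how the chance of at least one false positive grows when several hypotheses are tried on the same data. It rests on assumptions that are not in the article: each hypothesis is tested independently at the same significance level, and the function name `familywise_error_rate` is purely illustrative.

```python
def familywise_error_rate(alpha: float, num_tests: int) -> float:
    """Probability of at least one false positive across independent tests.

    Assumes every test uses the same significance level `alpha` and that
    the tests are independent (an illustrative simplification).
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be between 0 and 1")
    if num_tests < 0:
        raise ValueError("num_tests must be non-negative")
    # P(no false positive in any test) = (1 - alpha) ** num_tests
    return 1.0 - (1.0 - alpha) ** num_tests


if __name__ == "__main__":
    # Compare a single pre-specified hypothesis with several revised ones.
    for m in (1, 5, 20):
        rate = familywise_error_rate(0.05, m)
        print(f"{m:>2} hypotheses tested -> error rate {rate:.3f}")
```

Under these assumptions, the error rate rises quickly with the number of hypotheses tried, which is one reason exploratory findings are usually treated as preliminary and followed up with confirmatory research.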

Step 5: Avenues for future research

Decide if you would like to continue studying your topic. If so, it is likely that you will need to change to another type of research. As exploratory research is often qualitative in nature, you may need to conduct quantitative research with a larger sample size to achieve more generalisable results.

It can be easy to confuse exploratory research with explanatory research. To understand the relationship, it can help to remember that exploratory research lays the groundwork for later explanatory research.

Exploratory research investigates research questions that have not been studied in depth. The preliminary results often lay the groundwork for future analysis.

Explanatory research questions tend to start with ‘why’ or ‘how’, and the goal is to explain why or how a previously studied phenomenon takes place.

Exploratory vs explanatory research

Like any other research design, exploratory research has its trade-offs: it provides a unique set of benefits but also comes with downsides.

Advantages

  • It can be very helpful in narrowing down a challenging or nebulous problem that has not been previously studied.
  • It can serve as a great guide for future research, whether your own or another researcher’s. With new and challenging research problems, adding to the body of research in the early stages can be very fulfilling.
  • It is very flexible, cost-effective, and open-ended. You are free to proceed however you think is best.

Disadvantages

  • It usually lacks conclusive results, and results can be biased or subjective due to a lack of preexisting knowledge on your topic.
  • It’s typically not externally valid and generalisable, and it suffers from many of the challenges of qualitative research .
  • Since you are not operating within an existing research paradigm, this type of research can be very labour-intensive.

Exploratory research is a methodology approach that explores research questions that have not previously been studied in depth. It is often used when the issue you’re studying is new, or the data collection process is challenging in some way.

You can use exploratory research if you have a general idea or a specific question that you want to study but there is no preexisting knowledge or paradigm with which to study it.

Exploratory research explores the main aspects of a new or barely researched question.

Explanatory research explains the causes and effects of an already widely researched question.

Quantitative research deals with numbers and statistics, while qualitative research deals with words and meanings.

Quantitative methods allow you to test a hypothesis by systematically collecting and analysing data, while qualitative methods allow you to explore ideas and experiences in depth.

Cite this Scribbr article

George, T. (2023, January 20). Exploratory Research | Definition, Guide, & Examples. Scribbr. Retrieved 26 August 2024, from https://www.scribbr.co.uk/research-methods/exploratory-research-design/

A Quick Guide to Case Study with Examples

Published by Alvin Nicolas on August 14th, 2021. Revised on August 29, 2023.

A case study is a documented history and detailed analysis of a situation concerning organisations, industries, and markets.

A case study:

  • Focuses on discovering new facts of the situation under observation.
  • Includes data collection from multiple sources over time.
  • Is widely used in social sciences to study the underlying information, organisation, community, or event.
  • Does not provide any solution to the problem.

When to Use a Case Study?

You can use a case study in your research when:

  • The focus of your study is to find answers to how and why questions.
  • You don’t have enough time to conduct extensive research; case studies are convenient for completing your project successfully.
  • You want to analyse real-world problems in depth.

You can consider a single case to gain in-depth knowledge about the subject, or you can choose multiple cases to learn about various aspects of your research problem.

What are the Aims of the Case Study?

  • The case study aims at identifying weak areas that can be improved.
  • This method is often used for idiographic research (focuses on individual cases or events).
  • Another aim of the case study is nomothetic research (aims to discover new theories through data analysis of multiple cases).

Types of Case Studies

There are different types of case studies that can be categorised based on the purpose of the investigation.

| Type of Case Study | Definition | Example |
| --- | --- | --- |
| Explanatory case study | Explanatory research is used to determine the answers to why and how two or more variables are interrelated. Researchers usually conduct experiments to know the effect of specific changes among two or more variables. | A study to identify the impact of a nutritious diet on pregnant women. |
| Exploratory case study | Exploratory research is conducted to understand the nature of the problem. It does not focus on finding evidence or a conclusion of the problem. It studies the problem in depth and covers topics that were not considered before. | An investigation of the growing crimes against women in India. |
| Descriptive case study | Descriptive research is carried out to describe real-life situations and programs. It provides information about the issue through surveys and various fact-finding methods. | Widespread contamination-related disease in a specific area of a town. Investigation reveals that there is no trash removal system in that area. A researcher can hypothesise why the lack of proper trash removal leads to the spread of such disease. |
| Intrinsic case study | This type of case study is conducted to get an in-depth understanding of a specific case. | A case study of the academic performance of class 12th students. |
| Instrumental case study | This type of case study supports other interests by providing a base to understand other issues. | The challenges of learning a new language can be studied in a case study of a bilingual school. |
| Collective/Multiple case study | A researcher focuses on a single issue but selects multiple cases. It aims at analysing various cases. A researcher repeats the procedures for each case. | If you want to research the national child care program, you also need to focus on children’s services agencies and the reasons for child labour or abandonment, as they may be separate cases that are interrelated to your case. These multiple cases may help you find your primary answers and uncover various other facts about the other relevant cases. |
| Longitudinal cumulative case study | Researchers collect the information at multiple points in time. Usually, a specific group of participants is selected and examined numerous times at various periods. | A researcher experiments on a group of women to determine the impact of a low-carb diet over six months. The women’s weight and health will be checked multiple times to gather evidence for the study. |

How to Conduct a Case Study?

  • Select the Case to Investigate
  • Formulate the Research Question
  • Review of Literature
  • Choose the Precise Case to Use in your Study
  • Select Data Collection and Analysis Techniques
  • Collect the Data
  • Analyse the Data
  • Prepare the Report

Step 1: Select the Case to Investigate

The first step is to select a case to conduct your investigation. You should remember the following points.

  • Make sure that you perform the study in the available timeframe.
  • There should not be too much information available about the organisation.
  • You should be able to get access to the organisation.
  • There should be enough information available about the subject to conduct further research.

Step 2: Formulate the Research Question

It’s necessary to formulate a research question to proceed with your case study. Most research questions begin with how, why, what, or what can.

You can also use a research statement instead of a research question to conduct your research. The statement can be conditional or non-conditional.

| Case Topic | Research Question | Research Statement |
| --- | --- | --- |
| The decision-making process of men between 25 and 40 years of age | How do men between 25 and 40 decide whether to set up their own business or continue their job? What factors influence their decision? | There is a difference between decision-making among the men of 25-30 years of age related to their career options. |
| The experience of men of 25-40 years of age while choosing their career options, whether to set up their business or take a job | How do men of 25-40 years of age describe their experiences of doing a job and running their own business? Do these experiences influence their decision-making related to their career? | Men of 25-30 years of age share various experiences related to their field of work. These experiences play a crucial role in deciding on their career. |
| The decision-making of men of 25-40 years of age attending various career guidance seminars | How do men of 25-30 years of age attending various career guidance seminars describe their decision-making related to their career? | Men of 25-30 years of age attending various career guidance seminars describe their career decision-making experiences. |

Step 3: Review of Literature

Once you formulate your research statement or question, you need to extensively review the documentation about the existing discoveries related to your research question or statement.

Step 4: Choose the Precise Case to Use in your Study

You need to select a specific case or multiple cases related to your research. When using multiple cases, treat each case individually. The outcomes of each case can be used as contributors to the outcomes of the entire study. You can select cases of the following kinds:

  • Representing various geographic regions
  • Cases with various size parameters
  • Explaining the existing theories or assumptions
  • Leading to discoveries
  • Providing a base for future research.

Step 5: Select Data Collection and Analysis Techniques

You can choose qualitative approaches, quantitative approaches, or both for collecting the data. You can use interviews, surveys, artifacts, documentation, newspapers, photographs, etc. To avoid biased observation, you can triangulate your research to provide different views of your case. Even if you are focusing on a single case, you need to observe it from various angles. You should also establish construct validity, internal and external validity, and reliability.

Example: Identifying the impacts of contaminated water on people’s health and the factors responsible for it. In such cases, you need to gather data using both qualitative and quantitative approaches to understand the case.

Construct validity: You should select the most suitable measurement tool for your research.

Internal validity: You should use various methodological tools to triangulate the data. Try different methods to study the same hypothesis.

External validity: You need to effectively apply the data beyond the case’s circumstances to more general issues.

Reliability: You need to be confident enough to formulate the new direction for future studies based on your findings.

Step 6: Collect the Data

Keep the following in mind when collecting data:

  • Information should be gathered systematically, and the collected evidence from various sources should contribute to your research objectives.
  • Don’t collect your data randomly.
  • Recheck your research questions to avoid mistakes.
  • You should save the collected data in any popular format for clear understanding.
  • If you change how you collect information, make sure to record the changes in a document.
  • You should maintain a case diary and note your opinions and thoughts as they evolve throughout the study.

Step 7: Analyse the Data

The research data identifies the relationship between the objects of study and the research questions or statements. You need to reconfirm the collected information and tabulate it correctly for better understanding. 
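
The tabulation step can be sketched in code. The following is a minimal illustration, assuming the collected evidence has already been coded as (source, theme) pairs. The sample records, the theme labels, and the `cross_tabulate` helper are all hypothetical and only loosely echo the contaminated-water example above.

```python
from collections import Counter

# Hypothetical coded observations: (data source, theme assigned by the researcher).
observations = [
    ("interview", "health complaints"),
    ("interview", "water quality"),
    ("interview", "water quality"),
    ("survey", "health complaints"),
    ("survey", "water quality"),
    ("document", "water quality"),
]


def cross_tabulate(records):
    """Count how often each theme appears in each data source."""
    counts = Counter(records)
    sources = sorted({source for source, _ in records})
    themes = sorted({theme for _, theme in records})
    # Counter returns 0 for combinations that never occurred.
    return {s: {t: counts[(s, t)] for t in themes} for s in sources}


table = cross_tabulate(observations)

if __name__ == "__main__":
    for source, row in table.items():
        print(source, row)
```

A table like this makes it easier to see whether a theme is supported by several independent sources or by only one, which ties back to the triangulation advice in Step 5.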

Step 8: Prepare the Report

It’s essential to prepare a report for your case study. You can write your case study in the form of a scientific paper or thesis discussing its detail with supporting evidence.

A case study can be represented by incorporating quotations, stories, anecdotes, interview transcripts, etc., with empirical data in the result section.

You can also write it in narrative styles using textual analysis or discourse analysis. Your report should also include evidence from published literature, and you can put it in the discussion section.

Advantages and Disadvantages of Case Study

| Advantages | Disadvantages |
| --- | --- |
| It’s useful for rare outcomes. | It consumes a lot of time compared to other research methods. |
| An ample amount of information is obtained with few participants. | It cannot estimate the incidence of disease. |
| Helps in developing strong reading, analytical, and planning skills. | Limited results can be studied. |
| Develops analytical thinking. | The information obtained can be biased. |

Frequently Asked Questions

What is a case study?

A case study is a research method where a specific instance, event, or situation is deeply examined to gain insights into real-world complexities. It involves detailed analysis of context, data, and variables to understand patterns, causes, and effects, often used in various disciplines for in-depth exploration.

Exploratory Research

Exploratory research, as the name implies, intends merely to explore the research questions and does not intend to offer final and conclusive solutions to existing problems. This type of research is usually conducted to study a problem that has not been clearly defined yet. Conducted in order to determine the nature of the problem, exploratory research is not intended to provide conclusive evidence, but helps us to have a better understanding of the problem.

When conducting exploratory research, the researcher ought to be willing to change direction as new data and new insights are revealed. [1] Accordingly, exploratory studies are often conducted using interpretive research methods, and they answer questions such as what, why and how.

Exploratory research design does not aim to provide the final and conclusive answers to the research questions, but merely explores the research topic with varying levels of depth. It has been noted that “exploratory research is the initial research, which forms the basis of more conclusive research. It can even help in determining the research design, sampling methodology and data collection method” [2]. Exploratory research “tends to tackle new problems on which little or no previous research has been done” [3].

Unstructured interviews are the most popular primary data collection method in exploratory studies. Additionally, surveys, focus groups and observation methods can be used to collect primary data for this type of study.

Examples of Exploratory Research Design

The following are some examples of studies with an exploratory research design in business studies:

  • A study into the role of social networking sites as an effective marketing communication channel
  • An investigation into the ways of improvement of quality of customer services within hospitality sector in London
  • An assessment of the role of corporate social responsibility on consumer behaviour in pharmaceutical industry in the USA

Differences between Exploratory and Conclusive Research

Sandhusen (2000) [4] draws the difference between exploratory and conclusive research as follows: exploratory studies result in a range of causes and alternative options for a solution to a specific problem, whereas conclusive studies identify the final information that is the only solution to an existing research problem.

In other words, exploratory research design simply explores the research questions, leaving room for further research, whereas conclusive research design aims to provide final findings for the research.

Moreover, it has been stated that “an exploratory study may not have as rigorous as methodology as it is used in conclusive studies, and sample sizes may be smaller. But it helps to do the exploratory study as methodically as possible, if it is going to be used for major decisions about the way we are going to conduct our next study” [5] (Nargundkar, 2003, p.41).

Exploratory studies usually create scope for future research, and the future research may have a conclusive design. For example, ‘a study into the implications of the COVID-19 pandemic for the global economy’ is exploratory research. The COVID-19 pandemic is a recent phenomenon, and the study can generate initial knowledge about the economic implications of the phenomenon.

A follow-up study building on the findings of this research, ‘a study into the effects of the COVID-19 pandemic on tourism revenues in Morocco’, is causal conclusive research. This second study can produce findings that are of practical use for decision making.

Advantages of Exploratory Research

  • Lower costs of conducting the study
  • Flexibility and adaptability to change
  • Exploratory research is effective in laying the groundwork that will lead to future studies.
  • Exploratory studies can potentially save time by determining at the earlier stages the types of research that are worth pursuing

Disadvantages of Exploratory Research

  • Inconclusive nature of research findings
  • Exploratory studies generate qualitative information, and the interpretation of this type of information is subject to bias
  • These types of studies usually make use of a modest number of samples that may not adequately represent the target population. Accordingly, findings of exploratory research cannot be generalized to a wider population.
  • Findings of such studies are not usually useful for decision making at a practical level.

John Dudovskiy

[1] Saunders, M., Lewis, P. & Thornhill, A. (2012) “Research Methods for Business Students” 6th edition, Pearson Education Limited

[2] Singh, K. (2007) “Quantitative Social Research Methods” SAGE Publications, p.64

[3] Brown, R.B. (2006) “Doing Your Dissertation in Business and Management: The Reality of Research and Writing” Sage Publications, p.43

[4] Sandhusen, R.L. (2000) “Marketing” Barrons

[5] Nargundkar, R. (2008) “Marketing Research: Text and Cases” 3rd edition, p.38

10 Case Study Advantages and Disadvantages

Chris Drew (PhD)

Dr. Chris Drew is the founder of the Helpful Professor. He holds a PhD in education and has published over 20 articles in scholarly journals. He is the former editor of the Journal of Learning Development in Higher Education.

A case study in academic research is a detailed and in-depth examination of a specific instance or event, generally conducted through a qualitative approach to data.

The most common case study definition that I come across is Robert K. Yin’s (2003, p. 13) quote provided below:

“An empirical inquiry that investigates a contemporary phenomenon within its real-life context, especially when the boundaries between phenomenon and context are not clearly evident.”

Researchers conduct case studies for a number of reasons, such as to explore complex phenomena within their real-life context, to look at a particularly interesting instance of a situation, or to dig deeper into something of interest identified in a wider-scale project.

While case studies render extremely interesting data, they have many limitations and are not suitable for all studies. One key limitation is that a case study’s findings are not usually generalizable to broader populations because one instance cannot be used to infer trends across populations.

Case Study Advantages and Disadvantages

1. In-depth Analysis of Complex Phenomena

Case study design allows researchers to delve deeply into intricate issues and situations.

By focusing on a specific instance or event, researchers can uncover nuanced details and layers of understanding that might be missed with other research methods, especially large-scale survey studies.

As Lee and Saunders (2017) argue,

“It allows that particular event to be studied in detail so that its unique qualities may be identified.”

This depth of analysis can provide rich insights into the underlying factors and dynamics of the studied phenomenon.

2. Holistic Understanding

Building on the above point, case studies can help us to understand a topic holistically and from multiple angles.

This means the researcher isn’t restricted to just examining a topic by using a pre-determined set of questions, as with questionnaires. Instead, researchers can use qualitative methods to delve into the many different angles, perspectives, and contextual factors related to the case study.

We can turn to Lee and Saunders (2017) again, who note that case study researchers “develop a deep, holistic understanding of a particular phenomenon”.

3. Examination of Rare and Unusual Phenomena

We need to use case study methods when we stumble upon “rare and unusual” (Lee & Saunders, 2017) phenomena that would tend to be seen as mere outliers in population studies.

Take, for example, a child genius. A population study of all children of that child’s age would merely see this child as an outlier in the dataset, and this child may even be removed in order to predict overall trends.

So, to truly come to an understanding of this child and get insights into the environmental conditions that led to this child’s remarkable cognitive development, we need to do an in-depth study of this child specifically – so, we’d use a case study.

4. Helps Reveal the Experiences of Marginalized Groups

Just as rare and unusual cases can be overlooked in population studies, so too can the experiences, beliefs, and perspectives of marginalized groups.

As Lee and Saunders (2017) argue, “case studies are also extremely useful in helping the expression of the voices of people whose interests are often ignored.”

Take, for example, the experiences of minority populations as they navigate healthcare systems. This was for many years a “hidden” phenomenon, not examined by researchers. It took case study designs to truly reveal this phenomenon, which helped to raise practitioners’ awareness of the importance of cultural sensitivity in medicine.

5. Ideal in Situations where Researchers cannot Control the Variables

Experimental designs – where a study takes place in a lab or controlled environment – are excellent for determining cause and effect. But not all studies can take place in controlled environments (Tetnowski, 2015).

When we’re out in the field doing observational studies or similar fieldwork, we don’t have the freedom to isolate dependent and independent variables. We need to use alternate methods.

Case studies are ideal in such situations.

A case study design will allow researchers to deeply immerse themselves in a setting (potentially combining it with methods such as ethnography or researcher observation) in order to see how phenomena take place in real-life settings.

6. Supports the Generation of New Theories or Hypotheses

While large-scale quantitative studies such as cross-sectional designs and population surveys are excellent at testing theories and hypotheses on a large scale, they need a hypothesis to start off with!

This is where case studies – in the form of grounded research – come in. Often, a case study doesn’t start with a hypothesis. Instead, it ends with a hypothesis based upon the findings within a singular setting.

The deep analysis allows for hypotheses to emerge, which can then be taken to larger-scale studies in order to conduct further, more generalizable, testing of the hypothesis or theory.

7. Reveals the Unexpected

When a large-scale quantitative research project has a clear hypothesis that it will test, it often becomes very rigid and develops tunnel vision, exploring only that hypothesis.

Of course, a structured scientific examination of the effects of specific interventions targeted at specific variables is extremely valuable.

But narrowly-focused studies often fail to shine a spotlight on unexpected and emergent data. Here, case studies come in very useful. Oftentimes, researchers set their eyes on a phenomenon and, when examining it closely with case studies, identify data and come to conclusions that are unprecedented, unforeseen, and outright surprising.

As Lars Meier (2009, p. 975) marvels, “where else can we become a part of foreign social worlds and have the chance to become aware of the unexpected?”

Disadvantages

1. Not Usually Generalizable

Case studies are not generalizable because they tend not to look at a broad enough corpus of data to be able to infer that there is a trend across a population.

As Yang (2022) argues, “by definition, case studies can make no claims to be typical.”

Case studies focus on one specific instance of a phenomenon. They explore the context, nuances, and situational factors that have come to bear on the case study. This is really useful for bringing to light important, new, and surprising information, as I’ve already covered.

But it’s not often useful for generating data that has validity beyond the specific case study being examined.

2. Subjectivity in Interpretation

Case studies usually (but not always) use qualitative data which helps to get deep into a topic and explain it in human terms, finding insights unattainable by quantitative data.

But qualitative data in case studies relies heavily on researcher interpretation. While researchers can be trained and work hard to minimize subjectivity (through methods like triangulation), it often emerges – some might argue it’s inevitable in qualitative studies.

So, a criticism of case studies could be that they’re more prone to subjectivity – and researchers need to take steps to address this in their studies.

3. Difficulty in Replicating Results

Case study research is often non-replicable because the study takes place in complex real-world settings where variables are not controlled.

So, when returning to a setting to re-do or attempt to replicate a study, we often find that the variables have changed to such an extent that replication is difficult. Furthermore, new researchers (with new subjective eyes) may catch things that the original researchers overlooked.

Replication is even harder when researchers attempt to replicate a case study design in a new setting or with different participants.

Comprehension Quiz for Students

Question 1: What benefit do case studies offer when exploring the experiences of marginalized groups?

a) They provide generalizable data.
b) They help express the voices of often-ignored individuals.
c) They control all variables for the study.
d) They always start with a clear hypothesis.

Question 2: Why might case studies be considered ideal for situations where researchers cannot control all variables?

a) They provide a structured scientific examination.
b) They allow for generalizability across populations.
c) They focus on one specific instance of a phenomenon.
d) They allow for deep immersion in real-life settings.

Question 3: What is a primary disadvantage of case studies in terms of data applicability?

a) They always focus on the unexpected.
b) They are not usually generalizable.
c) They support the generation of new theories.
d) They provide a holistic understanding.

Question 4: Why might case studies be considered more prone to subjectivity?

a) They always use quantitative data.
b) They heavily rely on researcher interpretation, especially with qualitative data.
c) They are always replicable.
d) They look at a broad corpus of data.

Question 5: In what situations are experimental designs, such as those conducted in labs, most valuable?

a) When there’s a need to study rare and unusual phenomena.
b) When a holistic understanding is required.
c) When determining cause-and-effect relationships.
d) When the study focuses on marginalized groups.

Question 6: Why is replication challenging in case study research?

a) Because they always use qualitative data.
b) Because they tend to focus on a broad corpus of data.
c) Due to the changing variables in complex real-world settings.
d) Because they always start with a hypothesis.

Lee, B., & Saunders, M. N. K. (2017). Conducting Case Study Research for Business and Management Students. SAGE Publications.

Meier, L. (2009). Feasting on the Benefits of Case Study Research. In Mills, A. J., Wiebe, E., & Durepos, G. (Eds.). Encyclopedia of Case Study Research (Vol. 2). London: SAGE Publications.

Tetnowski, J. (2015). Qualitative case study research design. Perspectives on fluency and fluency disorders, 25(1), 39-45.

Yang, S. L. (2022). The War on Corruption in China: Local Reform and Innovation. Taylor & Francis.

Yin, R. (2003). Case Study research. Thousand Oaks, CA: Sage.

Exploring the Pros and Cons of Exploratory Research

  • October 13, 2021

Exploratory research is conducted to improve the understanding of a problem or phenomenon that is not rigidly defined. In this blog, we will focus on the pros and cons of exploratory research.

What are the Characteristics of Exploratory Research?

Exploratory research helps you to gain more understanding of a topic. It helps you to gather information about your analysis without any preconceived assumptions. 

  • The beginning phase of the study.
  • Trial and error approach.
  • Study of an undefined phenomenon.
  • Uses small samples.
  • Unstructured and flexible.
  • Tentative results.
  • Guide for future research.
  • Qualitative and unrestricted. 

With an understanding of the characteristics, let’s dig into the pros and cons of exploratory research.

Pros of Exploratory Research

The following pros make the case for using exploratory research:

  • Exploratory research offers a great amount of researcher discretion. The lack of structure enables the researcher to direct the progression of the research processes and in that sense, it offers a greater degree of flexibility and freedom.
  • Another pro of exploratory research is the economical way in which the process can be conducted. Exploratory research uses a relatively smaller group of people for defining and understanding the research issue.
  • Exploratory research, when done properly, can lay a strong foundation for any future study carried out on the same issue. It helps in determining research design, sampling methodology, and data collection. This also places a responsibility on the researcher to inspect the issue in depth and to concentrate on authentic reporting of results.
  • Analyzing the feasibility and viability of the research issue is another pro of exploratory research. No organization wants to invest time, effort, and resources in an area that cannot add value to its overall functioning. By carrying out an early study, exploratory research gauges the future importance of the research topic and directs the organization's attitude accordingly.
  • Exploratory research builds a greater understanding of a previously unresearched topic: the researcher uncovers facts and brings new issues to light. In doing so, it helps refine future research questions. It also helps decide the best approach to reach the objective.

Cons of Exploratory Research

Exploratory research also comes with its own set of cons, which can act as roadblocks to a seamless data collection experience and to the groundwork it lays for future probes:

  • Exploratory research brings up tentative results and so is inconclusive. The focus of such research is to grasp and formulate a better understanding of the issue at hand. These research insights cannot be relied upon for effective decision-making.
  • Another con of exploratory research is its qualitative data and subsequent analysis. It is difficult to derive accurate insights that can be summarized in an objective manner. The variability in qualitative data makes evaluating the collected data a difficult and cumbersome process.
  • The small sample used for exploratory research increases the risk of the sample responses being non-representative of the target audience. Smaller groups of people as samples, however useful for a quick study, can hinder a cohesive understanding which not only deteriorates the current quality of research but also adversely impacts the future research carried out along similar lines.
  • Data, when gathered through secondary resources, can supply obsolete information which may not generate any significant contribution to the understanding of an issue in the current scenario. Outdated information is neither actionable nor supportive in offering any sort of clarity under dynamic market conditions.

Conclusion

In this article, we have discussed the pros and cons of exploratory research to make them easier to understand. You can conduct exploratory research via primary or secondary methods of data collection. By weighing the pros and cons mentioned above, you can choose the best way to proceed with your research.

Dynamic : Researchers decide the directional flow of the research based on changing circumstances

Pocket Friendly : The resource investment is minimal and so does not act as a financial plough

Foundational : Lays the groundwork for future research

Feasibility of future assessment : Exploratory research studies the scope of the issue and determines the need for a future investigation

Nature : Exploratory research sheds light upon previously undiscovered

Inconclusive : Exploratory research offers inconclusive results

Difficult to interpret : Exploratory research offers a qualitative approach to data collection which is highly subjective and complex.

Sampling problem : Exploratory research makes use of a small number of respondents which opens up the risk of sampling bias and the consequent reduction in reliability and validity.

Incorrect sourcing : The collection of secondary data from sources that provide outdated information deteriorates the research quality.

Exploratory research design is a mechanism that explores issues that have not been clearly defined by adopting a qualitative method of data collection.

Exploratory research comes with disadvantages that include inconclusive results, a lack of standardized analysis, small sample populations, and outdated information that can adversely affect the authenticity of information. A lack of preventive measures to minimise the effect of such hindrances can result in a poor understanding of the topic under consideration.

Exploratory research is carried out with the purpose of formulating an initial understanding of issues that haven’t been clearly defined yet.

Yes, due to a lack of previous knowledge about the research problem, researchers establish a suitable hypothesis that fuels the initial investigation.

A retail study that focuses on the impact of individual product sales vs packaged hamper sales on overall demand can provide a layout about how the customer looks at the two concepts differently and the variation in buying behaviour observed therein.

Explore Voxco Survey Software

Online page new product image3 02.png 1

+ Omnichannel Survey Software 

+ Online Survey Software 

+ CATI Survey Software 

+ IVR Survey Software 

+ Market Research Tool

+ Customer Experience Tool 

+ Product Experience Software 

+ Enterprise Survey Software 

TYPES OF MARKET SEGMENTATION1

Market Segmentation: Definition, importance, types and benefits

Market Segmentation: Definition, importance, types and benefits Voxco is trusted by 450+ Global Brands in 40+ countries See what question types are possible with a

ludde lorentz YfCVCPMNd38 unsplash scaled

Discover the Power of Ratio Data in Research and Market Analysis Transform your insight generation process Use our in-depth online survey guide to create an

gdpr blog post 400x250 1

Still struggling to get your survey process GDPR compliant?

Those dreaded letters, G.D.P.R! The new European data protection regulation introduced fear in the marketplace with its menacing fines and broad reach. Organizations have scrambled

Making advocacy marketing work for you 25 1

The common threads tying companies with high NPS together

The common threads tying companies with high NPS® together Maximize NPS® Insights Unlock insights to drive growth, improve customer engagement and provide detailed customer feedback

Market Segmentation Strategy A boon for Market Research 05

Market Segmentation Strategy: A boon for Market Research

MARKET RESEARCH Market Segmentation Strategy: A boon for Market Research SHARE THE ARTICLE ON Share on facebook Share on twitter Share on linkedin GET DEMO

skip logic blog 400x250 1

Why You Should Be Using Skip Logic to Enhance Respondent Experience

Why You Should Be Using Skip Logic to Enhance Respondent Experience Transform your insight generation process Use our in-depth online survey guide to create an

We use cookies in our website to give you the best browsing experience and to tailor advertising. By continuing to use our website, you give us consent to the use of cookies. Read More

Name Domain Purpose Expiry Type
hubspotutk www.voxco.com HubSpot functional cookie. 1 year HTTP
lhc_dir_locale amplifyreach.com --- 52 years ---
lhc_dirclass amplifyreach.com --- 52 years ---
Name Domain Purpose Expiry Type
_fbp www.voxco.com Facebook Pixel advertising first-party cookie 3 months HTTP
__hstc www.voxco.com Hubspot marketing platform cookie. 1 year HTTP
__hssrc www.voxco.com Hubspot marketing platform cookie. 52 years HTTP
__hssc www.voxco.com Hubspot marketing platform cookie. Session HTTP
Name Domain Purpose Expiry Type
_gid www.voxco.com Google Universal Analytics short-time unique user tracking identifier. 1 days HTTP
MUID bing.com Microsoft User Identifier tracking cookie used by Bing Ads. 1 year HTTP
MR bat.bing.com Microsoft User Identifier tracking cookie used by Bing Ads. 7 days HTTP
IDE doubleclick.net Google advertising cookie used for user tracking and ad targeting purposes. 2 years HTTP
_vwo_uuid_v2 www.voxco.com Generic Visual Website Optimizer (VWO) user tracking cookie. 1 year HTTP
_vis_opt_s www.voxco.com Generic Visual Website Optimizer (VWO) user tracking cookie that detects if the user is new or returning to a particular campaign. 3 months HTTP
_vis_opt_test_cookie www.voxco.com A session (temporary) cookie used by Generic Visual Website Optimizer (VWO) to detect if the cookies are enabled on the browser of the user or not. 52 years HTTP
_ga www.voxco.com Google Universal Analytics long-time unique user tracking identifier. 2 years HTTP
_uetsid www.voxco.com Microsoft Bing Ads Universal Event Tracking (UET) tracking cookie. 1 days HTTP
vuid vimeo.com Vimeo tracking cookie 2 years HTTP
Name Domain Purpose Expiry Type
__cf_bm hubspot.com Generic CloudFlare functional cookie. Session HTTP
Name Domain Purpose Expiry Type
_gcl_au www.voxco.com --- 3 months ---
_gat_gtag_UA_3262734_1 www.voxco.com --- Session ---
_clck www.voxco.com --- 1 year ---
_ga_HNFQQ528PZ www.voxco.com --- 2 years ---
_clsk www.voxco.com --- 1 days ---
visitor_id18452 pardot.com --- 10 years ---
visitor_id18452-hash pardot.com --- 10 years ---
lpv18452 pi.pardot.com --- Session ---
lhc_per www.voxco.com --- 6 months ---
_uetvid www.voxco.com --- 1 year ---

BMC Res Notes

The contribution of case study design to supporting research on Clubhouse psychosocial rehabilitation

Toby Raeburn

School of Nursing and Midwifery, Western Sydney University, Locked Bag 1797, Penrith, NSW 2751 Australia

Virginia Schmied

Catherine Hungerford

School of Nursing, Midwifery, and Indigenous Health, Faculty of Science, Charles Sturt University, Wagga Wagga, NSW Australia

Michelle Cleary

Psychosocial Clubhouses provide recovery-focused psychosocial rehabilitation to people with serious mental illness at over 300 sites in more than 30 countries worldwide. To deliver the services involved, Clubhouses employ a complex mix of theory, programs and relationships, with this complexity presenting a number of challenges to those undertaking Clubhouse research. This paper provides an overview of the usefulness of case study designs for Clubhouse researchers; and suggests ways in which the evaluation of Clubhouse models can be facilitated.

The paper begins by providing a brief explanation of the Clubhouse model of psychosocial rehabilitation, and the need for ongoing evaluation of the services delivered. This explanation is followed by an introduction to case study design, with consideration given to the way in which case studies have been used in past Clubhouse research. It is posited that case study design provides a methodological framework that supports the analysis of either quantitative, qualitative or a mixture of both types of data to investigate complex phenomena in their everyday contexts, and thereby support the development of theory. As such, case study approaches to research are well suited to the Clubhouse environment. The paper concludes with recommendations for future Clubhouse researchers who choose to employ a case study design.

Conclusions

While the quality of case study research that explores Clubhouses has been variable in the past, if applied in a diligent manner, case study design has a valuable contribution to make in future Clubhouse research.

Established towards the end of the 1940s, the Clubhouse model is one of the world’s oldest approaches to psychosocial rehabilitation [ 1 ]. Popular worldwide, there are currently over 300 Clubhouses operating in more than 30 countries [ 2 ]. People who attend Clubhouses typically have a history of serious mental illness and face a number of challenges, including those related to their physical health, social welfare and employment [ 3 ]. In response, Clubhouses provide a wide range of social, health, educational and employment support programs [ 2 ]. To encourage a sense of empowerment and belonging, participants in these programs are referred to as ‘members’ rather than ‘patients’ or ‘consumers’ [ 4 ].

Clubhouse members follow an activity schedule referred to as the ‘work ordered day’ [ 5 ], where they work alongside paid staff, often assuming lead roles and taking responsibility for all aspects involved in running the Clubhouse. By contributing in these proactive ways, members embrace opportunities to build confidence, friendships and skills, while also being encouraged to pursue educational and employment goals in the wider society [ 6 ]. Building on these activities, Clubhouse programs referred to as Transitional Employment Programs (TEP) are then tailored to support members who decide to seek work in the competitive job market [ 6 ].

Clubhouses have been at the forefront of advocacy for consumer centred, recovery-oriented practice [ 7 , 8 ]. Despite this, researching the complex nature of these services has proved challenging [ 9 , 10 ]. Clubhouse research is further complicated by the highly personalised and context-dependent ways that people experience mental health recovery [ 11 ]. Reflection on such challenges has led to long consideration of the research design that best supports the exploration and explanation of the way in which Clubhouses work to support recovery—that is, the ‘recovery orientation’ of the Clubhouse model [ 12 ]. One research method with the potential to provide a rigorous framework for exploring phenomena within organisations such as the Clubhouse is case study design [ 13 , 14 ].

Case study design typically uses multiple perspectives to facilitate the examination of a particular phenomenon in its natural context [ 15 , 16 ]. While this may sound similar to the goal of many qualitative research approaches, case study design is different because it can be flexibly adapted as a framework that incorporates either qualitative, quantitative or a mixture of qualitative and quantitative research approaches [ 13 ]. Case study design is also unconstrained by a particular theoretical approach, meaning it can be pragmatically informed by or used to build or critique any theory related to the phenomena in question [ 17 ].

According to Tight [ 18 ], publications on the topic of case study from the past decade have been dominated by the work of two leading theorists, Yin [ 16 ] and Stake [ 19 ]. Yin [ 16 ] divides case studies into two broad groups. First, those that focus on an individual case, involving detailed exploration of either a person or an organisation. These are referred to as a ‘single case study’. Second, those that involve investigation of a group of cases for comparison and contrast are referred to as ‘multiple case studies’. Yin then makes a further division, categorising each case study as either exploratory, descriptive or explanatory.

Exploratory case studies are commonly pilot projects that seek to reveal what phenomena or theory exists within a field of interest. For example, a researcher interested in how services assist people with mental illness to achieve recovery may seek to discover if there are any guiding recovery principles used by mental health services. Such a study may uncover phenomena and/or theory that can then lead to further investigation.

In contrast, descriptive case studies begin with a theory about a phenomenon, and then seek to chronicle how the phenomenon is displayed through the lens of those theoretical assumptions. For example, a descriptive study may set out to elucidate how certain recovery principles are reflected in the practices of a Clubhouse. A risk with this type of case study is that the researcher may find that the theory brought to the project is not applicable which, in turn, may lead to the need for further exploratory work.

Finally, explanatory case studies seek to interpret why a particular phenomenon or theory has been revealed in the data. This approach is cited as being particularly useful in a multiple case study design, because pattern-matching can be used. For example, a study may seek to explain why work seems to be important to the rehabilitation of people with mental illness at three different Clubhouses located across a variety of cultural contexts [ 16 ].

For Stake [ 19 ], case study design is focused on the exploration of a case and refining or revealing related concepts. Stake [ 19 ] divides case studies into intrinsic, instrumental or collective designs. Intrinsic design is used when researchers have a particular interest in improving their understanding of a phenomenon. This method is described as being primarily aimed at exploring rather than understanding theoretical constructs. In contrast, instrumental design refers to those case studies that seek to elucidate phenomena and test or strengthen theory. With this approach, the case and its context are studied in depth to facilitate deep understanding of a concept. Finally, collective case studies include any study involving more than one case, similar to Yin’s [ 16 ] description of ‘multiple case design’.

Consideration of the explanations provided by Yin [ 16 ] and Stake [ 19 ] suggests that case study may be described as a flexible research design that may utilize either qualitative, quantitative or a mixture of both types of data, to illuminate, elucidate or interpret phenomena in their everyday context and support the development of theory. This definition is important in this paper because it provides a framework for considering case study design in relation to Clubhouse research. For example, while several studies have described people’s subjective experience of recovery in psychosocial Clubhouses [ 11 ], there has been limited research exploring the way Clubhouses implement recovery-oriented practices. In this paper we review how case study research has contributed to the field of Clubhouse psychosocial rehabilitation.

Initially, this paper was conceived as an integrative literature review that examined the published case studies that have contributed to Clubhouse research. An electronic literature search was conducted seeking to identify full text peer reviewed journal articles written in English and published between 1960 and January 2015. The papers were required to refer to themselves as a ‘case study’ or derivative, and to have a focus on a Clubhouse or Fountain House. The search term ‘Fountain House’ was included because, as the name of the original Clubhouse, this term is popular in Clubhouse related literature.

The search terms, “case stud*” AND “clubhouse” OR “fountain house” were combined across three databases, leading to initial identification of 41 papers from PsycINFO, 20 from CINAHL and 16 from Proquest Social Science Journals. Reference lists were checked for other relevant papers, then following article screening and removal of duplicates, five papers were identified as relevant to the review [ 20 – 24 ]. All based in North America, the five articles were all published more than a decade ago, with one published as early as 1960.
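The search string above can be read in more than one way, because database interfaces differ in how they order AND and OR. As a minimal sketch, assuming the intended grouping is "case stud*" AND ("clubhouse" OR "fountain house"), the Python below applies that filter to a few hypothetical record titles and totals the initial hit counts reported in the text. The record titles and the `matches_search` helper are illustrative assumptions, not the authors' procedure or any database's interface.

```python
import re

# Assumed grouping: "case stud*" AND ("clubhouse" OR "fountain house").
# The trailing \w* stands in for the * truncation (study, studies, ...).
CASE_STUDY = re.compile(r"\bcase stud\w*", re.IGNORECASE)
SETTING = re.compile(r"\b(clubhouse|fountain house)\b", re.IGNORECASE)


def matches_search(text: str) -> bool:
    """Return True if the text satisfies the assumed Boolean grouping."""
    return bool(CASE_STUDY.search(text)) and bool(SETTING.search(text))


# Hypothetical record titles, for illustration only.
records = [
    "A case study of a Clubhouse outreach program",
    "Fountain House: case studies in rehabilitation",
    "A randomised trial of Clubhouse employment services",
    "Case study methods in nursing research",
]
hits = [title for title in records if matches_search(title)]

# Initial per-database counts reported in the text,
# before screening and removal of duplicates.
initial_counts = {
    "PsycINFO": 41,
    "CINAHL": 20,
    "Proquest Social Science Journals": 16,
}
total_initial = sum(initial_counts.values())

print(hits)
print(total_initial)
```

Under this reading, only the first two hypothetical titles pass the filter, and the three databases together yield 77 initial records before screening and de-duplication reduce the set to the five papers reviewed.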

The quality of each paper was initially assessed by the Chief Investigator (TR), using the Critical Appraisal Skills Programme (CASP) [ 25 ]. CASP posits that there are three broad issues that should be considered when appraising qualitative research:

  • Are study results valid?
  • What are the results?
  • Will the results help locally? [ 25 ]

A ten-question, three-point scale was used to assess for validity, results and relevance. CASP ratings and notes were reviewed by all authors. The assessment was problematic, however, as the majority of papers identified had been published in an era when diligent approaches to case study research and reporting (such as ethics approval) were often not applied. The consensus view amongst the authors was that this small sample of case studies could not bear the scrutiny of modern analytical techniques as part of an integrative literature review. Despite this, the results did provide useful information regarding the use of case study design in Clubhouse research, including the advantages and disadvantages. In turn, this prompts a variety of considerations for researchers who may consider using case study design in Clubhouse settings in future, with these considerations outlined in the results and discussion section presented below.

Results and discussion

Advantages and disadvantages of case study design in Clubhouse research

In common with qualitative research approaches such as ethnography, an emphasis on studying phenomena in their natural context means case study design incorporates the perspectives of participants who may come from vulnerable and voiceless groups in society [ 26 ]. For this reason, case studies have often been used to provide a framework to critique oppression and question social norms [ 27 ]. This suggestion was exemplified in the earliest evidence of a published Clubhouse case study, a paper by Goertzel et al. [ 22 ] published in 1960 that described the original Clubhouse in New York City during its early development. Using multiple data sources, the paper provided a rich description of the theoretical orientation, history, facilities, staff, volunteers, membership and programs available [ 22 ]. The research is important because it was written in an era when society held stigmatizing attitudes towards people with serious mental illness, who often spent their lives in custodial psychiatric institutions [ 28 , 29 ]. The paper by Goertzel et al. [ 22 ] conveyed ideas ahead of its time regarding the importance of involving people with a lived experience of mental illness in the development and delivery of mental health services. This case study, then, provides evidence of the early role that Clubhouses played in advocating for recovery-oriented models of mental health care.

Another advantage of case study design is the way in which it can be flexibly adapted to incorporate a mixture of qualitative and quantitative methods, as promoted by researchers such as Creswell [ 26 , 30 ]. An example of a mixed methods case study was conducted by Boll [ 20 ], who undertook a case study of a Clubhouse in New Jersey to explore the phenomenon of empowerment among Clubhouse members involved in a service evaluation. Using a combination of quantitative and qualitative data collection methods, including survey questionnaires, participant observation, and individual interviews, the study found that researching Clubhouse members within the regular Clubhouse environment led to benefits such as enhanced engagement with new members and improved program quality [ 20 ].

A final advantage of undertaking case study research relates to the way in which it can support the testing of connections between theory and phenomena [ 31 ]. This characteristic was demonstrated in a Clubhouse case study conducted by Cowell et al. [ 24 ]. The study explored the concept of ‘function cost’, a theory designed to explain the financial cost to services that utilize co-production, where consumers are involved in both delivery and receipt of services. The boundaries in the study were difficult to ascertain because Clubhouse members were involved in the provision of tasks normally delivered by paid staff in hospital-based services. The researchers addressed this dilemma pragmatically by using two standardised research scales to collect separate financial data about costs associated with paid staff and voluntary labour invested in activities. Results from the study suggested that the concept of ‘function cost’ may provide a way to explain the financial costs of Clubhouse programs utilising co-production practices [ 24 ].

As is evident from the above examples drawn from Clubhouse research, there is no standardised way to apply case study design. Instead, this flexible approach offers researchers the opportunity to select from a variety of methods and data collection techniques to ensure a ‘best fit’ for the case in question. As with any style of research however, case study design also has some disadvantages.

One of the most commonly cited disadvantages of case studies is that findings can lack generalizability [ 15 , 16 ]. This suggestion, along with arguments that case studies lack scientific credibility because replication is difficult, has led to research regulators such as Australia’s National Health and Medical Research Council (NHMRC) [ 32 ] ranking case study as the lowest form of credible research design. Following scientific convention, the NHMRC [ 32 ] has ranked the quality of research designs, with some designs posited as producing more rigorous evidence than others. For example, when evaluating the effectiveness of an intervention, a Randomised Controlled Trial (RCT) is regarded as providing the most reliable evidence [ 33 ].

The NHMRC [ 32 ] suggests that the processes integral to RCTs minimize the risk of confounding factors, and highlights that internal validity is generally stronger in randomised controlled trials. However, external validity can be stronger in multiple case study designs, and can be weak in randomised controlled trials. Such weaknesses in RCT design have been exposed in a number of systematic reviews and secondary analyses. For example, Hunt, Siegfried, Morley, Sitharthan and Cleary [ 34 ] completed a Cochrane review of psychosocial interventions for people with serious mental illness examining 32 RCTs. Contrary to the view that RCTs provide a rigorous, dependable research design, the authors reported substantial difficulties with skewed data, risk of bias, poor trial methods, small sample sizes, low event rates and wide confidence intervals [ 34 ]. In another example related to Clubhouse employment programs, Johnsen et al. [ 35 ] conducted a secondary analysis of a multisite RCT and found that a limited definition of ‘competitive employment’ and variability in ‘control’ conditions across sites led to skewed findings. Johnsen et al. [ 35 , 36 ], together with other researchers, have gone on to observe that these kinds of variation in definition and control conditions in RCTs have led to substantial inconsistencies in research of employment services for people with serious mental illness.

Responding to criticism of case study design, theorists such as Yin [ 16 ] have suggested that generalisation of findings from case studies should focus on assessing the efficacy of theoretical constructs, rather than on the transferability of statistics. As mentioned previously, such a focus on theoretical concepts was exemplified in a case study by Cowell et al. [ 24 ], which explored the usefulness of the ‘function cost’ concept. Stake [ 19 ] has also argued that case study findings can be transferable, but from a different point of view. He suggests that readers can normally relate to the findings of case studies, which facilitate a kind of generalised understanding of phenomena [ 19 ]. For example, Jacobs used a case study design to provide an illuminating description of the challenges associated with improving access to psychiatry for members at a Clubhouse [ 23 ].

In contrast to his strong advocacy for the efficacy of case study design, one disadvantage observed by Yin [ 37 ] is that case study researchers can lack discipline, sometimes allowing detailed description and illustrative quotes to dominate findings. According to Yin, this is often at the expense of detailed accounts of research design procedures such as ethics, data collection and analytic procedures. An interesting technical point consistent across the five papers identified in this review was the lack of clarity regarding ethics and consent [ 20 – 24 ]. For example, Asmussen et al. [ 21 ] completed an interesting case study of a Clubhouse outreach program for homeless people, but failed to include any reference to ethical considerations.

In an effort to promote quality case study research, theorists such as Feagin [ 38 ], Yin [ 16 ] and Stake [ 39 ] have sought to develop protocols and structures for applying case studies. The following section will outline some considerations for effective application of case study design in future Clubhouse research.

Considerations for conducting case studies in Clubhouse settings

Assuming that a research question has been identified and that the researchers’ choice of case study design is driven by a desire to explore a phenomenon in depth in its everyday context, the next logical step is to identify whether the case best fits a single or multiple case design [ 16 ]. Single-case design may be a suitable choice if the case displays particular uniqueness—for example, a study into the unique cultural experience of needing to ‘save face’ experienced by members of a Hong Kong Clubhouse [ 40 ]; or the development of an innovative program integrating a psychiatry clinic into a Clubhouse [ 41 ]. A single case approach may also be useful for a study that has limited time and access to resources, such as a student undertaking higher degree studies that involve a research project. It is important at the outset that the researcher is clear about how findings will be analysed, and compared to or tested against a theoretical paradigm [ 19 ].

Alternatively, multiple-case design may work well in situations where there are several similar cases that can provide pathways for replication and comparison [ 39 ]. Replicating a case study in this way would then present the opportunity for pattern-matching, a technique that links several pieces of information from the same case to a theoretical proposition, thereby enhancing the rigour of findings and generalizability of theory [ 42 ]. For example, research providing theoretical observations about the Clubhouse’s supported employment programs might be strengthened by using a multiple case study design that includes Clubhouses of different sizes across a variety of cultures. This could then potentially enable generalisation of findings to the Clubhouse model as a whole.

Following the identification of whether a single or multiple case study is best suited to a research question, Yin [ 16 ] contends that a structured approach to design should be taken to ensure quality and exploratory power in case study research. He suggests that case study design should include:

  • An overview of the case study project citing objectives, issues and background.
  • Written field procedures describing research location and access to data.
  • Identification of research questions to be focused on during data collection.
  • A reporting guide outlining a general format for the report.

By employing such points as a guide, then, researchers will support consistency across case study research undertaken in a Clubhouse context.

Common data sources include, but are not limited to, documentation, archival records, interviews, direct observation, participant observation and physical objects [ 16 ]. While no individual source should be considered better than another, the rationale for using several sources of data is the triangulation of evidence. Triangulation provides checks and balances for the reliability of data collection [ 43 ]. For example, data drawn from participant observation and interviews could be used to corroborate the meaning and application of data revealed through review of a Clubhouse’s documentation.

Conducting research in any service for people living with mental illness requires special sensitivity [ 44 ]. To encourage empowerment and guard against any potential harm to participants, the Clubhouse model has a strong commitment to the co-production of research, with members regularly encouraged to ask questions and share points of view [ 45 ]. With this in mind, a collaborative approach should be planned, actioned and reflected upon when conducting any Clubhouse case study.

A further consideration is promoting quality mental health research. People with serious mental illness often experience stigma and marginalization, and so it is important that research does not perpetuate this [ 44 ]. Developing a strong evidence base is crucial however, and within fields of mental health research there is robust debate regarding the merits and weaknesses of the different research paradigms [ 44 ]. Regardless of what approach is taken, consumers must be positioned at the centre of any mental health research—and genuine consultation with stakeholders is essential, including respectful processes, as well as ethical behaviours and practices, to ensure that research contributes to the nature, quality and the validity of the data gained [ 46 ].

Evaluation of case study designs may be conducted in a number of ways. As mentioned previously, the CASP [ 25 ] provides a ten-point tool for systematic consideration of study design, results, validity and relevance. Alternatively, Popay’s [ 47 ] method of appraisal places a high value on studies that validate the expertise of consumers of healthcare and the theoretical generalizability of findings. Using this appraisal method, the research is rated as ‘thin’ if there is little consideration of consumer insights, limited explanation, and low relevance for generalization. On the other hand, studies are considered ‘thick’ if they lend weight to consumer descriptions, including detailed description of phenomena, and show potential for generalizability [ 48 ]. Much of the data found in older Clubhouse research struggles to find relevance when tools like CASP [ 25 ] and Popay’s [ 47 ] approach are applied. While this does not diminish the value of early research, as the Clubhouse model continues to evolve, appraisal tools may provide substantial benefit for evaluating and improving the quality of modern Clubhouse case studies.

Psychosocial Clubhouses serve some of the most vulnerable and marginalised people in society. The Clubhouse model has become an internationally regarded provider of consumer-centred recovery-focused psychosocial rehabilitation [ 7 , 11 , 49 ]. With these considerations in mind, there is high need for research designs capable of exploring and describing how Clubhouses implement recovery practices.

This paper has identified case study design as a flexible research design that may utilize either qualitative, quantitative or a mixture of both types of data, to illuminate, elucidate or interpret phenomena in their everyday context and support the development of theory. As health science continues to evolve, case study design can provide a flexible framework for exploring the complex challenges presented by multidimensional mental health services like Clubhouses. Case study design enables consumers to play a central role in the development, implementation, analysis and synthesis of research. It also supports the conduct of genuine consultation with stakeholders, including respectful processes, ethical behaviours and practices to ensure the quality and validity of data gained.

Authors’ contributions

The authors have confirmed that all authors meet the ICMJE criteria for authorship credit ( http://www.icmje.org/recommendations/browse/roles-and-responsibilities/defining-the-role-of-authors-and-contributors.html ), as follows: (1) substantial contributions to conception and design, acquisition of data, or analysis and interpretation of data; (2) drafting the article or revising it critically for important intellectual content; and (3) final approval of the version to be published. All authors were involved in the development of the paper. TR drafted the initial manuscript. MC, VC and CH were also involved in manuscript revisions and supervision. All authors read and approved the final manuscript.

Acknowledgements

There was no funding source, and there are no applicable acknowledgements for this discussion paper beyond the authors’ contributions.

Compliance with ethical guidelines

Competing interests: The authors report no conflicts of interest. The authors alone are responsible for the content and writing of the paper.

Abbreviations

CASP: Critical Appraisal Skills Programme
NHMRC: Australia’s National Health and Medical Research Council
RCT: randomised controlled trial



SurveyPoint

Exploratory Research: Overview, Application, Advantages and Disadvantages 

  • Author Kultar Singh
  • Published November 4, 2022

Research is usually divided into three broad categories: exploratory, descriptive, and explanatory. Research that explores issues at an early stage of development is considered exploratory research. Exploratory research is conducted when the topic or issue is novel and data collection is challenging. It is adaptable and can be applied to any research issue. In most cases, this method is used to formulate formal hypotheses.

Simply put, exploratory research is any study on questions without clear answers. It often happens before we have enough information to make meaningful distinctions or to establish a causal relationship. In addition, it facilitates the identification of the best research approach, data collection method, and subject selection. In many cases, exploratory analysis shows that a reported problem does not actually exist.

When is the appropriate time to use it?

An exploratory study is appropriate when the aim is to formulate a more precise problem or to generate hypotheses, because it helps the researcher better understand a phenomenon. Developing a hypothesis is impossible if the theory is too general or too specific.

Therefore, exploratory research is required to obtain experience that will aid in formulating suitable hypotheses for further exploration. While exploratory research is not sufficient for decision-making, it can provide valuable insight into specific situations. The results of qualitative research can give a glimpse into “why,” “how,” and “when” events occur. Still, it cannot tell us “how often” or “how many.”

Methods used for exploratory research

In exploratory research, many methodologies are employed. Researchers can choose between primary and secondary methods or a hybrid approach.

Primary research consists of firsthand data from a specially assembled group of individuals. Organizations acquire preliminary data through interviews, focus groups, consumer surveys, or any other method that allows for feedback collection. For example, social media and blogs are excellent channels for business owners to get client feedback. 

Secondary research is the examination and synthesis of primary research conducted in the past. Any relevant data source can be used, including marketing research, periodicals, outdated publications, etc.

Often, exploratory research involves secondary research methods, such as reviewing relevant literature. In primary research, informal qualitative techniques, such as conversations with customers, employees, management, or competitors, can be used alongside more formal approaches, such as in-depth interviews, focus groups, projective methods, case studies, and pilot studies.

Further, while conducting a case study, the case may include a single person or several people. A survey or exchange of insights can also help one gain experience. Researchers may also conduct an informal investigation of the problems. One may also conduct a pilot study or create a focus group design.

Primary research encompasses a wide variety of methodologies, both qualitative and quantitative, including but not limited to: 

  • casual talks with consumers, staff, management, or competitors
  • in-depth interviews
  • focus groups
  • projective methods
  • case studies
  • pilot studies

Merits of using Exploratory research

Investigating new alternatives and possibilities is crucial in light of our constantly evolving environment. In order to accomplish this, exploratory research is a fantastic tool. This strategy has numerous benefits, including creativity and innovation.

  • You can be receptive to new concepts and opportunities by conducting exploratory research. This may lead to more creative solutions to problems.
  • It promotes problem-solving: when you investigate novel concepts through exploratory research, you are likely to come up with answers to issues. Therefore, you can tackle difficult challenges more effectively.
  • A major benefit of exploratory research is that it is adaptable and enables the testing of several hypotheses, which increases the flexibility of your study. This means that you can try out several strategies to find the most effective one.
  • Using exploratory research techniques will increase the likelihood that you will produce reliable, valid research findings. Using this data, you can make more reliable inferences.
  • The more you conduct research using the exploratory research approach, the more proficient you become. For example, you can learn to distinguish between good and poor questions and to ask them effectively.
  • Using exploratory research techniques can make it simpler to make judgments based on more information than what you already know about the issue. 
  • When you employ exploratory research techniques, it will be simpler to present your facts accurately and truthfully. Adopting these techniques makes eliminating biases that might result from reporting on prior hypotheses and facts easier.

Demerits of using Exploratory research

Exploratory research can be a very effective way to learn new things, but it has its challenges. A few of them are:

  • Because it is exploratory, this kind of research often has unclear objectives. The researcher may not have all the information needed to carry out the study, which can leave both researcher and participants frustrated and confused.
  • It can be difficult and time-consuming. Deciding which questions to ask, how to gather data, and how to evaluate it can take a lot of work, so researchers may struggle to finish within the allotted time or budget.
  • It does not always yield accurate or valid conclusions. Exploratory studies frequently rest on tentative hypotheses rather than on solid data, and the findings can be misleading or invalid if those hypotheses are unreliable or unsubstantiated.
  • It may fail to pin down the problem. The open-ended questions that exploratory research typically relies on cannot always elicit all the data needed to resolve an issue.

Kultar Singh  –  Chief Executive Officer, Sambodhi

  • Open access
  • Published: 28 August 2024

AI generates covertly racist decisions about people based on their dialect

  • Valentin Hofmann   ORCID: orcid.org/0000-0001-6603-3428 1 , 2 , 3 ,
  • Pratyusha Ria Kalluri 4 ,
  • Dan Jurafsky   ORCID: orcid.org/0000-0002-6459-7745 4 &
  • Sharese King 5  

Nature (2024)

  • Computer science

Hundreds of millions of people now interact with language models, with uses ranging from help with writing 1 , 2 to informing hiring decisions 3 . However, these language models are known to perpetuate systematic racial prejudices, making their judgements biased in problematic ways about groups such as African Americans 4 , 5 , 6 , 7 . Although previous research has focused on overt racism in language models, social scientists have argued that racism with a more subtle character has developed over time, particularly in the United States after the civil rights movement 8 , 9 . It is unknown whether this covert racism manifests in language models. Here, we demonstrate that language models embody covert racism in the form of dialect prejudice, exhibiting raciolinguistic stereotypes about speakers of African American English (AAE) that are more negative than any human stereotypes about African Americans ever experimentally recorded. By contrast, the language models’ overt stereotypes about African Americans are more positive. Dialect prejudice has the potential for harmful consequences: language models are more likely to suggest that speakers of AAE be assigned less-prestigious jobs, be convicted of crimes and be sentenced to death. Finally, we show that current practices of alleviating racial bias in language models, such as human preference alignment, exacerbate the discrepancy between covert and overt stereotypes, by superficially obscuring the racism that language models maintain on a deeper level. Our findings have far-reaching implications for the fair and safe use of language technology.

Language models are a type of artificial intelligence (AI) that has been trained to process and generate text. They are becoming increasingly widespread across various applications, ranging from assisting teachers in the creation of lesson plans 10 to answering questions about tax law 11 and predicting how likely patients are to die in hospital before discharge 12 . As the stakes of the decisions entrusted to language models rise, so does the concern that they mirror or even amplify human biases encoded in the data they were trained on, thereby perpetuating discrimination against racialized, gendered and other minoritized social groups 4 , 5 , 6 , 13 , 14 , 15 , 16 , 17 , 18 , 19 , 20 .

Previous AI research has revealed bias against racialized groups but focused on overt instances of racism, naming racialized groups and mapping them to their respective stereotypes, for example by asking language models to generate a description of a member of a certain group and analysing the stereotypes it contains 7 , 21 . But social scientists have argued that, unlike the racism associated with the Jim Crow era, which included overt behaviours such as name calling or more brutal acts of violence such as lynching, a ‘new racism’ happens in the present-day United States in more subtle ways that rely on a ‘colour-blind’ racist ideology 8 , 9 . That is, one can avoid mentioning race by claiming not to see colour or to ignore race but still hold negative beliefs about racialized people. Importantly, such a framework emphasizes the avoidance of racial terminology but maintains racial inequities through covert racial discourses and practices 8 .

Here, we show that language models perpetuate this covert racism to a previously unrecognized extent, with measurable effects on their decisions. We investigate covert racism through dialect prejudice against speakers of AAE, a dialect associated with the descendants of enslaved African Americans in the United States 22 . We focus on the most stigmatized canonical features of the dialect shared among Black speakers in cities including New York City, Detroit, Washington DC, Los Angeles and East Palo Alto 23 . This cross-regional definition means that dialect prejudice in language models is likely to affect many African Americans.

Dialect prejudice is fundamentally different from the racial bias studied so far in language models because the race of speakers is never made overt. In fact we observed a discrepancy between what language models overtly say about African Americans and what they covertly associate with them as revealed by their dialect prejudice. This discrepancy is particularly pronounced for language models trained with human feedback (HF), such as GPT4: our results indicate that HF training obscures the racism on the surface, but the racial stereotypes remain unaffected on a deeper level. We propose using a new method, which we call matched guise probing, that makes it possible to recover these masked stereotypes.

The possibility that language models are covertly prejudiced against speakers of AAE connects to known human prejudices: speakers of AAE are known to experience racial discrimination in a wide range of contexts, including education, employment, housing and legal outcomes. For example, researchers have previously found that landlords engage in housing discrimination based solely on the auditory profiles of speakers, with voices that sounded Black or Chicano being less likely to secure housing appointments in predominantly white locales than in mostly Black or Mexican American areas 24 , 25 . Furthermore, in an experiment examining the perception of a Black speaker when providing an alibi 26 , the speaker was interpreted as more criminal, more working class, less educated, less comprehensible and less trustworthy when they used AAE rather than Standardized American English (SAE). Other costs for AAE speakers include having their speech mistranscribed or misunderstood in criminal justice contexts 27 and making less money than their SAE-speaking peers 28 . These harms connect to themes in broader racial ideology about African Americans and stereotypes about their intelligence, competence and propensity to commit crimes 29 , 30 , 31 , 32 , 33 , 34 , 35 . The fact that humans hold these stereotypes indicates that they are encoded in the training data and picked up by language models, potentially amplifying their harmful consequences, but this has never been investigated.

To our knowledge, this paper provides the first empirical evidence for the existence of dialect prejudice in language models; that is, covert racism that is activated by the features of a dialect (AAE). Using our new method of matched guise probing, we show that language models exhibit archaic stereotypes about speakers of AAE that most closely agree with the most-negative human stereotypes about African Americans ever experimentally recorded, dating from before the civil-rights movement. Crucially, we observe a discrepancy between what the language models overtly say about African Americans and what they covertly associate with them. Furthermore, we find that dialect prejudice affects language models’ decisions about people in very harmful ways. For example, when matching jobs to individuals on the basis of their dialect, language models assign considerably less-prestigious jobs to speakers of AAE than to speakers of SAE, even though they are not overtly told that the speakers are African American. Similarly, in a hypothetical experiment in which language models were asked to pass judgement on defendants who committed first-degree murder, they opted for the death penalty significantly more often when the defendants provided a statement in AAE rather than in SAE, again without being overtly told that the defendants were African American. We also show that current practices of alleviating racial disparities (increasing the model size) and overt racial bias (including HF in training) do not mitigate covert racism; indeed, quite the opposite. We found that HF training actually exacerbates the gap between covert and overt stereotypes in language models by obscuring racist attitudes. Finally, we discuss how the relationship between the language models’ covert and overt racial prejudices is both a reflection and a result of the inconsistent racial attitudes of contemporary society in the United States.

Probing AI dialect prejudice

To explore how dialect choice impacts the predictions that language models make about speakers in the absence of other cues about their racial identity, we took inspiration from the ‘matched guise’ technique used in sociolinguistics, in which subjects listen to recordings of speakers of two languages or dialects and make judgements about various traits of those speakers 36 , 37 . Applying the matched guise technique to the AAE–SAE contrast, researchers have shown that people identify speakers of AAE as Black with above-chance accuracy 24 , 26 , 38 and attach racial stereotypes to them, even without prior knowledge of their race 39 , 40 , 41 , 42 , 43 . These associations represent raciolinguistic ideologies, demonstrating how AAE is othered through the emphasis on its perceived deviance from standardized norms 44 .

Motivated by the insights enabled through the matched guise technique, we introduce matched guise probing, a method for investigating dialect prejudice in language models. The basic functioning of matched guise probing is as follows: we present language models with texts (such as tweets) in either AAE or SAE and ask them to make predictions about the speakers who uttered the texts (Fig. 1 and Methods ). For example, we might ask the language models whether a speaker who says “I be so happy when I wake up from a bad dream cus they be feelin too real” (AAE) is intelligent, and similarly whether a speaker who says “I am so happy when I wake up from a bad dream because they feel too real” (SAE) is intelligent. Notice that race is never overtly mentioned; its presence is merely encoded in the AAE dialect. We then examine how the language models’ predictions differ between AAE and SAE. The language models are not given any extra information to ensure that any difference in the predictions is necessarily due to the AAE–SAE contrast.
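The probing loop described above can be sketched in a few lines. This is only an illustration, not the authors' implementation: the prompt template, the `association_score` helper and the `toy_prob` stand-in for a language model are all assumptions, and the probabilities in `toy_prob` are made up. The sketch summarizes each adjective by the mean log-ratio of its probability after the AAE prompt to its probability after the SAE prompt, which is one plausible way to compare the two guises.

```python
import math
from typing import Callable, Sequence

# Hypothetical prompt template; the paper's actual prompts are in its Methods.
PROMPT = 'A person who says "{text}" is'


def association_score(
    adjective: str,
    aae_texts: Sequence[str],
    sae_texts: Sequence[str],
    prob_fn: Callable[[str, str], float],
) -> float:
    """Mean log-ratio of P(adjective | AAE prompt) to P(adjective | SAE prompt).

    Positive values mean the adjective is associated more with the AAE guise.
    prob_fn(prompt, adjective) stands in for a real language-model call.
    """
    if len(aae_texts) != len(sae_texts):
        raise ValueError("texts must come in matched pairs")
    log_ratios = [
        math.log(
            prob_fn(PROMPT.format(text=a), adjective)
            / prob_fn(PROMPT.format(text=s), adjective)
        )
        for a, s in zip(aae_texts, sae_texts)
    ]
    return sum(log_ratios) / len(log_ratios)


def toy_prob(prompt: str, adjective: str) -> float:
    """Toy stand-in for a model: fixed, made-up probabilities."""
    is_aae = " be " in prompt  # crude marker of invariant 'be', illustration only
    table = {"intelligent": (0.02, 0.04), "lazy": (0.06, 0.03)}
    aae_p, sae_p = table[adjective]
    return aae_p if is_aae else sae_p


aae = ["I be so happy when I wake up from a bad dream cus they be feelin too real"]
sae = ["I am so happy when I wake up from a bad dream because they feel too real"]
for adj in ("intelligent", "lazy"):
    print(adj, round(association_score(adj, aae, sae, toy_prob), 3))
```

Because race is never mentioned in either prompt, any non-zero score in such a setup can only come from the difference between the two texts.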

Fig. 1. a, We used texts in SAE (green) and AAE (blue). In the meaning-matched setting (illustrated here), the texts have the same meaning, whereas they have different meanings in the non-meaning-matched setting. b, We embedded the SAE and AAE texts in prompts that asked for properties of the speakers who uttered the texts. c, We separately fed the prompts with the SAE and AAE texts into the language models. d, We retrieved and compared the predictions for the SAE and AAE inputs, here illustrated by five adjectives from the Princeton Trilogy. See Methods for more details.

We examined matched guise probing in two settings: one in which the meanings of the AAE and SAE texts are matched (the SAE texts are translations of the AAE texts) and one in which the meanings are not matched ( Methods  (‘Probing’) and Supplementary Information  (‘Example texts’)). Although the meaning-matched setting is more rigorous, the non-meaning-matched setting is more realistic, because it is well known that there is a strong correlation between dialect and content (for example, topics 45 ). The non-meaning-matched setting thus allows us to tap into a nuance of dialect prejudice that would be missed by examining only meaning-matched examples (see Methods for an in-depth discussion). Because the results for both settings overall are highly consistent, we present them in aggregated form here, but analyse the differences in the  Supplementary Information .

We examined GPT2 (ref. 46 ), RoBERTa 47 , T5 (ref. 48 ), GPT3.5 (ref. 49 ) and GPT4 (ref. 50 ), each in one or more model versions, amounting to a total of 12 examined models ( Methods and Supplementary Information (‘Language models’)). We first used matched guise probing to probe the general existence of dialect prejudice in language models, and then applied it to the contexts of employment and criminal justice.

Covert stereotypes in language models

We started by investigating whether the attitudes that language models exhibit about speakers of AAE reflect human stereotypes about African Americans. To do so, we replicated the experimental set-up of the Princeton Trilogy 29 , 30 , 31 , 34 , a series of studies investigating the racial stereotypes held by Americans, with the difference that instead of overtly mentioning race to the language models, we used matched guise probing based on AAE and SAE texts ( Methods ).

Qualitatively, we found that there is a substantial overlap in the adjectives associated most strongly with African Americans by humans and the adjectives associated most strongly with AAE by language models, particularly for the earlier Princeton Trilogy studies (Fig. 2a ). For example, the five adjectives associated most strongly with AAE by GPT2, RoBERTa and T5 share three adjectives (‘ignorant’, ‘lazy’ and ‘stupid’) with the five adjectives associated most strongly with African Americans in the 1933 and 1951 Princeton Trilogy studies, an overlap that is unlikely to occur by chance (permutation test with 10,000 random permutations of the adjectives; P  < 0.01). Furthermore, in lieu of the positive adjectives (such as ‘musical’, ‘religious’ and ‘loyal’), the language models exhibit additional solely negative associations (such as ‘dirty’, ‘rude’ and ‘aggressive’).
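A permutation test of this kind can be sketched as follows. The sketch is an assumption-laden illustration rather than the authors' code: the adjective pool is a list of 84 placeholder names, the pool size is chosen for illustration only, and `overlap_p_value` is a hypothetical helper. It asks how often a random top five drawn from the pool shares at least as many adjectives with the human top five as the observed model top five does.

```python
import random
from typing import Sequence


def overlap_p_value(
    pool: Sequence[str],
    human_top: Sequence[str],
    model_top: Sequence[str],
    n_perm: int = 10_000,
    seed: int = 0,
) -> float:
    """One-sided permutation p-value for the overlap of two top-k lists."""
    human = set(human_top)
    observed = len(human & set(model_top))
    k = len(model_top)
    rng = random.Random(seed)
    shuffled = list(pool)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        # The first k items of a random permutation act as a random top-k list.
        if len(human & set(shuffled[:k])) >= observed:
            hits += 1
    # Add-one correction so the estimate is never exactly zero.
    return (hits + 1) / (n_perm + 1)


# Placeholder adjective pool; the names and the pool size are assumptions.
pool = [f"adj{i}" for i in range(84)]
human_top5 = pool[:5]
model_top5 = pool[2:7]  # shares three adjectives with human_top5
print(overlap_p_value(pool, human_top5, model_top5))
```

With a pool this large, sharing three of five adjectives by chance is rare, so the estimated p-value falls well below 0.01.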

Fig. 2. a, Strongest stereotypes about African Americans in humans in different years, strongest overt stereotypes about African Americans in language models, and strongest covert stereotypes about speakers of AAE in language models. Colour coding as positive (green) and negative (red) is based on ref. 34. Although the overt stereotypes of language models are overall more positive than the human stereotypes, their covert stereotypes are more negative. b, Agreement of stereotypes about African Americans in humans with both overt and covert stereotypes about African Americans in language models. The black dotted line shows chance agreement using a random bootstrap. Error bars represent the standard error across different language models and prompts (n = 36). The language models’ overt stereotypes agree most strongly with current human stereotypes, which are the most positive experimentally recorded ones, but their covert stereotypes agree most strongly with human stereotypes from the 1930s, which are the most negative experimentally recorded ones. c, Stereotype strength for individual linguistic features of AAE. Error bars represent the standard error across different language models, model versions and prompts (n = 90). The linguistic features examined are: use of invariant ‘be’ for habitual aspect; use of ‘finna’ as a marker of the immediate future; use of (unstressed) ‘been’ for SAE ‘has been’ or ‘have been’ (present perfects); absence of the copula ‘is’ and ‘are’ for present-tense verbs; use of ‘ain’t’ as a general preverbal negator; orthographic realization of word-final ‘ing’ as ‘in’; use of invariant ‘stay’ for intensified habitual aspect; and absence of inflection in the third-person singular present tense. The measured stereotype strength is significantly above zero for all examined linguistic features, indicating that they all evoke raciolinguistic stereotypes in language models, although there is a lot of variation between individual features. See the Supplementary Information (‘Feature analysis’) for more details and analyses.

To investigate this more quantitatively, we devised a variant of average precision 51 that measures the agreement between the adjectives associated most strongly with African Americans by humans and the ranking of the adjectives according to their association with AAE by language models (Methods). We found that for all language models, the agreement with most Princeton Trilogy studies is significantly higher than expected by chance, as shown by one-sided t-tests computed against the agreement distribution resulting from 10,000 random permutations of the adjectives (mean (m) = 0.162, standard deviation (s) = 0.106; Extended Data Table 1); and that the agreement is particularly pronounced for the stereotypes reported in 1933 and falls for each study after that, almost reaching the level of chance agreement for 2012 (Fig. 2b). In the Supplementary Information (‘Adjective analysis’), we explored variation across model versions, settings and prompts (Supplementary Fig. 2 and Supplementary Table 4).
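For orientation, the sketch below computes standard average precision for a set of human-reported adjectives against a model's adjective ranking. It is not the authors' variant, whose exact definition is given in the Methods; the function name and the example adjectives' ordering are assumptions made for illustration.

```python
from typing import Iterable, Sequence


def average_precision(ranking: Sequence[str], relevant: Iterable[str]) -> float:
    """Standard average precision of `relevant` items within `ranking`.

    `ranking` lists adjectives from most to least associated with AAE;
    `relevant` holds the adjectives humans associated most strongly with
    African Americans in a given study.
    """
    relevant_set = set(relevant)
    if not relevant_set:
        raise ValueError("need at least one relevant adjective")
    hits = 0
    total = 0.0
    for rank, item in enumerate(ranking, start=1):
        if item in relevant_set:
            hits += 1
            total += hits / rank  # precision at this rank
    return total / len(relevant_set)


# Made-up ranking: two human-reported adjectives sit at ranks 1 and 3.
model_ranking = ["lazy", "loyal", "ignorant", "musical"]
print(round(average_precision(model_ranking, {"lazy", "ignorant"}), 3))  # 0.833
```

The score is 1.0 when every human-reported adjective sits at the top of the model's ranking and shrinks as those adjectives move down, which is why it can be compared against a chance distribution from random permutations.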

To explain the observed temporal trend, we measured the average favourability of the top five adjectives for all Princeton Trilogy studies and language models, drawing from crowd-sourced ratings for the Princeton Trilogy adjectives on a scale between −2 (very negative) and 2 (very positive; see Methods , ‘Covert-stereotype analysis’). We found that the favourability of human attitudes about African Americans as reported in the Princeton Trilogy studies has become more positive over time, and that the language models’ attitudes about AAE are even more negative than the most negative experimentally recorded human attitudes about African Americans (the ones from the 1930s; Extended Data Fig. 1 ). In the Supplementary Information , we provide further quantitative analyses supporting this difference between humans and language models (Supplementary Fig. 7 ).

Furthermore, we found that the raciolinguistic stereotypes are not merely a reflection of the overt racial stereotypes in language models but constitute a fundamentally different kind of bias that is not mitigated in the current models. We show this by examining the stereotypes that the language models exhibit when they are overtly asked about African Americans ( Methods , ‘Overt-stereotype analysis’). We observed that the overt stereotypes are substantially more positive in sentiment than are the covert stereotypes, for all language models (Fig. 2a and Extended Data Fig. 1 ). Strikingly, for RoBERTa, T5, GPT3.5 and GPT4, although their covert stereotypes about speakers of AAE are more negative than the most negative experimentally recorded human stereotypes, their overt stereotypes about African Americans are more positive than the most positive experimentally recorded human stereotypes. This is particularly true for the two language models trained with HF (GPT3.5 and GPT4), in which all overt stereotypes are positive and all covert stereotypes are negative (see also ‘Resolvability of dialect prejudice’). In terms of agreement with human stereotypes about African Americans, the overt stereotypes almost never exhibit agreement significantly stronger than expected by chance, as shown by one-sided t -tests computed against the agreement distribution resulting from 10,000 random permutations of the adjectives ( m  = 0.162, s  = 0.106; Extended Data Table 2 ). Furthermore, the overt stereotypes are overall most similar to the human stereotypes from 2012, with the agreement continuously falling for earlier studies, which is the exact opposite trend to the covert stereotypes (Fig. 2b ).

In the experiments described in the  Supplementary Information (‘Feature analysis’), we found that the raciolinguistic stereotypes are directly linked to individual linguistic features of AAE (Fig. 2c and Supplementary Table 14 ), and that a higher density of such linguistic features results in stronger stereotypical associations (Supplementary Fig. 11 and Supplementary Table 13 ). Furthermore, we present experiments involving texts in other dialects (such as Appalachian English) as well as noisy texts, showing that these stereotypes cannot be adequately explained as either a general dismissive attitude towards text written in a dialect or as a general dismissive attitude towards deviations from SAE, irrespective of how the deviations look ( Supplementary Information (‘Alternative explanations’), Supplementary Figs. 12 and 13 and Supplementary Tables 15 and 16 ). Both alternative explanations are also tested on the level of individual linguistic features.

Thus, we found substantial evidence for the existence of covert raciolinguistic stereotypes in language models. Our experiments show that these stereotypes are similar to the archaic human stereotypes about African Americans that existed before the civil rights movement, are even more negative than the most negative experimentally recorded human stereotypes about African Americans, and are both qualitatively and quantitatively different from the previously reported overt racial stereotypes in language models, indicating that they are a fundamentally different kind of bias. Finally, our analyses demonstrate that the detected stereotypes are inherently linked to AAE and its linguistic features.

Impact of covert racism on AI decisions

To determine what harmful consequences the covert stereotypes have in the real world, we focused on two areas in which racial stereotypes about speakers of AAE and African Americans have been repeatedly shown to bias human decisions: employment and criminality. There is a growing impetus to use AI systems in these areas. Indeed, AI systems are already being used for personnel selection 52 , 53 , including automated analyses of applicants’ social-media posts 54 , 55 , and technologies for predicting legal outcomes are under active development 56 , 57 , 58 . Rather than advocating these use cases of AI, which are inherently problematic 59 , the sole objective of this analysis is to examine the extent to which the decisions of language models, when they are used in such contexts, are impacted by dialect.

First, we examined decisions about employability. Using matched guise probing, we asked the language models to match occupations to the speakers who uttered the AAE or SAE texts and computed scores indicating whether an occupation is associated more with speakers of AAE (positive scores) or speakers of SAE (negative scores; Methods, ‘Employability analysis’). The average score of the occupations was negative (m = −0.046, s = 0.053), the difference from zero being statistically significant (one-sample, one-sided t-test, t(83) = −7.9, P < 0.001). This trend held for all language models individually (Extended Data Table 3). Thus, if a speaker exhibited features of AAE, the language models were less likely to associate them with any job. Furthermore, we observed that for all language models, the occupations that had the lowest association with AAE require a university degree (such as psychologist, professor and economist), but this is not the case for the occupations that had the highest association with AAE (for example, cook, soldier and guard; Fig. 3a). Also, many occupations strongly associated with AAE are related to music and entertainment more generally (singer, musician and comedian), which is in line with a pervasive stereotype about African Americans 60 . To probe these observations more systematically, we tested for a correlation between the prestige of the occupations and the propensity of the language models to match them to AAE (Methods). Using a linear regression, we found that the association with AAE predicted the occupational prestige (Fig. 3b; β = −7.8, R² = 0.193, F(1, 63) = 15.1, P < 0.001). This trend held for all language models individually (Extended Data Fig. 2 and Extended Data Table 4), albeit in a less pronounced way for GPT3.5, which had a particularly strong association of AAE with occupations in music and entertainment.
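The reported statistics can be sanity-checked from the summary values alone. The sketch below assumes that t(83) implies 84 occupations and uses the textbook identities t = m / (s / √n) for a one-sample test and F = R² / (1 − R²) × df for a simple regression with one predictor. The helper names and the toy regression data are assumptions; small gaps to the reported values are expected because the inputs are already rounded.

```python
import math
from typing import Sequence, Tuple


def one_sample_t(mean: float, sd: float, n: int) -> float:
    """t statistic for testing a sample mean against zero."""
    return mean / (sd / math.sqrt(n))


def f_from_r2(r2: float, df_resid: int) -> float:
    """F statistic of a one-predictor linear regression, from R squared."""
    return r2 / (1.0 - r2) * df_resid


def ols(xs: Sequence[float], ys: Sequence[float]) -> Tuple[float, float, float]:
    """Least-squares slope, intercept and R squared for y = a + b * x."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx, sxy ** 2 / (sxx * syy)


print(one_sample_t(-0.046, 0.053, 84))  # about -7.95; reported: -7.9
print(f_from_r2(0.193, 63))             # about 15.07; reported: 15.1
print(ols([0, 1, 2, 3], [1, 3, 5, 7]))  # toy data lying on y = 1 + 2x
```

Both recomputed values land close to the figures in the text, which suggests the reported summary statistics are internally consistent.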

Fig. 3. a, Association of different occupations with AAE or SAE. Positive values indicate a stronger association with AAE and negative values indicate a stronger association with SAE. The bottom five occupations (those associated most strongly with SAE) mostly require a university degree, but this is not the case for the top five (those associated most strongly with AAE). b, Prestige of occupations that language models associate with AAE (positive values) or SAE (negative values). The shaded area shows a 95% confidence band around the regression line. The association with AAE or SAE predicts the occupational prestige. Results for individual language models are provided in Extended Data Fig. 2. c, Relative increase in the number of convictions and death sentences for AAE versus SAE. Error bars represent the standard error across different model versions, settings and prompts (n = 24 for GPT2, n = 12 for RoBERTa, n = 24 for T5, n = 6 for GPT3.5 and n = 6 for GPT4). In cases of small sample size (n ≤ 10 for GPT3.5 and GPT4), we plotted the individual results as overlaid dots. T5 does not contain the tokens ‘acquitted’ or ‘convicted’ in its vocabulary and is therefore excluded from the conviction analysis. Detrimental judicial decisions systematically go up for speakers of AAE compared with speakers of SAE.

We then examined decisions about criminality. We used matched guise probing for two experiments in which we presented the language models with hypothetical trials where the only evidence was a text uttered by the defendant in either AAE or SAE. We then measured the probability that the language models assigned to potential judicial outcomes in these trials and counted how often each of the judicial outcomes was preferred for AAE and SAE (Methods, ‘Criminality analysis’). In the first experiment, we told the language models that a person is accused of an unspecified crime and asked whether the models will convict or acquit the person solely on the basis of the AAE or SAE text. Overall, we found that the rate of convictions was greater for AAE (r = 68.7%) than SAE (r = 62.1%; Fig. 3c, left). A chi-squared test found a strong effect (χ²(1, N = 96) = 184.7, P < 0.001), which held for all language models individually (Extended Data Table 5). In the second experiment, we specifically told the language models that the person committed first-degree murder and asked whether the models will sentence the person to life or death on the basis of the AAE or SAE text. The overall rate of death sentences was greater for AAE (r = 27.7%) than for SAE (r = 22.8%; Fig. 3c, right). A chi-squared test found a strong effect (χ²(1, N = 144) = 425.4, P < 0.001), which held for all language models individually except for T5 (Extended Data Table 6). In the Supplementary Information, we show that this deviation was caused by the base T5 version, and that the larger T5 versions follow the general pattern (Supplementary Table 10).
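As a generic illustration of a chi-squared test on outcome counts, the sketch below computes the Pearson statistic for a 2×2 table of dialect by verdict. It does not reproduce the authors' analysis, whose exact set-up is described in the Methods. The counts are hypothetical: they assume 1,000 trials per dialect and borrow only the reported conviction rates (68.7% and 62.1%). The function name is also an assumption.

```python
from typing import Sequence


def chi2_2x2(table: Sequence[Sequence[float]]) -> float:
    """Pearson chi-squared statistic (no continuity correction), 2x2 table."""
    (a, b), (c, d) = table
    n = a + b + c + d
    row_totals = (a + b, c + d)
    col_totals = (a + c, b + d)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if dialect and verdict were independent.
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    return stat


# Hypothetical counts: rows are AAE and SAE, columns are convicted and acquitted.
counts = [[687, 313], [621, 379]]
print(chi2_2x2(counts))
```

With one degree of freedom, a statistic above roughly 3.84 corresponds to P < 0.05, so even these made-up counts would indicate a dependence between dialect and verdict.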

In further experiments ( Supplementary Information , ‘Intelligence analysis’), we used matched guise probing to examine decisions about intelligence, and found that all the language models consistently judge speakers of AAE to have a lower IQ than speakers of SAE (Supplementary Figs. 14 and 15 and Supplementary Tables 17 – 19 ).

Resolvability of dialect prejudice

We wanted to know whether the dialect prejudice we observed is resolved by current practices of bias mitigation, such as increasing the size of the language model or including HF in training. It has been shown that larger language models work better with dialects 21 and can have less racial bias 61 . Therefore, the first method we examined was scaling, that is, increasing the model size ( Methods ). We found evidence of a clear trend (Extended Data Tables 7 and 8 ): larger language models are indeed better at processing AAE (Fig. 4a , left), but they are not less prejudiced against speakers of it. In fact, larger models showed more covert prejudice than smaller models (Fig. 4a , right). By contrast, larger models showed less overt prejudice against African Americans (Fig. 4a , right). Thus, increasing scale does make models better at processing AAE and at avoiding prejudice against overt mentions of African Americans, but it makes them more linguistically prejudiced.

Fig. 4. a, Language modelling perplexity and stereotype strength on AAE text as a function of model size. Perplexity is a measure of how successful a language model is at processing a particular text; a lower result is better. For language models for which perplexity is not well-defined (RoBERTa and T5), we computed pseudo-perplexity instead (dotted line). Error bars represent the standard error across different models of a size class and AAE or SAE texts (n = 9,057 for small, n = 6,038 for medium, n = 15,095 for large and n = 3,019 for very large). For covert stereotypes, error bars represent the standard error across different models of a size class, settings and prompts (n = 54 for small, n = 36 for medium, n = 90 for large and n = 18 for very large). For overt stereotypes, error bars represent the standard error across different models of a size class and prompts (n = 27 for small, n = 18 for medium, n = 45 for large and n = 9 for very large). Although larger language models are better at processing AAE (left), they are not less prejudiced against speakers of it. Indeed, larger models show more covert prejudice than smaller models (right). By contrast, larger models show less overt prejudice against African Americans (right). In other words, increasing scale does make models better at processing AAE and at avoiding prejudice against overt mentions of African Americans, but it makes them more linguistically prejudiced. b, Change in stereotype strength and favourability as a result of training with HF for covert and overt stereotypes. Error bars represent the standard error across different prompts (n = 9). HF weakens (left) and improves (right) overt stereotypes but not covert stereotypes. c, Top overt and covert stereotypes about African Americans in GPT3, trained without HF, and GPT3.5, trained with HF. Colour coding as positive (green) and negative (red) is based on ref. 34. The overt stereotypes get substantially more positive as a result of HF training in GPT3.5, but there is no visible change in favourability for the covert stereotypes.

As a second potential way to resolve dialect prejudice in language models, we examined training with HF 49 , 62 . Specifically, we compared GPT3.5 (ref. 49 ) with GPT3 (ref. 63 ), its predecessor that was trained without using HF ( Methods ). Looking at the top adjectives associated overtly and covertly with African Americans by the two language models, we found that HF resulted in more-positive overt associations but had no clear qualitative effect on the covert associations (Fig. 4c ). This observation was confirmed by quantitative analyses: the inclusion of HF resulted in significantly weaker (no HF, m  = 0.135,  s  = 0.142; HF, m  = −0.119,  s  = 0.234;  t (16) = 2.6,  P  < 0.05) and more favourable (no HF, m  = 0.221,  s  = 0.399; HF, m  = 1.047,  s  = 0.387;  t (16) = −6.4,  P  < 0.001) overt stereotypes but produced no significant difference in the strength (no HF, m  = 0.153,  s  = 0.049; HF, m  = 0.187,  s  = 0.066;  t (16) = −1.2, P  = 0.3) or unfavourability (no HF, m  = −1.146, s  = 0.580; HF, m = −1.029, s  = 0.196; t (16) = −0.5, P  = 0.6) of covert stereotypes (Fig. 4b ). Thus, HF training weakens and ameliorates the overt stereotypes but has no clear effect on the covert stereotypes; in other words, it obscures the racist attitudes on the surface, but more subtle forms of racism, such as dialect prejudice, remain unaffected. This finding is underscored by the fact that the discrepancy between overt and covert stereotypes about African Americans is most pronounced for the two examined language models trained with human feedback (GPT3.5 and GPT4; see ‘Covert stereotypes in language models’). Furthermore, this finding again shows that there is a fundamental difference between overt and covert stereotypes in language models, and that mitigating the overt stereotypes does not automatically translate to mitigated covert stereotypes.

To sum up, neither scaling nor training with HF as applied today resolves the dialect prejudice. The fact that these two methods effectively mitigate racial performance disparities and overt racial stereotypes in language models indicates that this form of covert racism constitutes a different problem that is not addressed by current approaches for improving and aligning language models.

The key finding of this article is that language models maintain a form of covert racial prejudice against African Americans that is triggered by dialect features alone. In our experiments, we avoided overt mentions of race but drew from the racialized meanings of a stigmatized dialect, and could still find historically racist associations with African Americans. The implicit nature of this prejudice, that is, the fact it is about something that is not explicitly expressed in the text, makes it fundamentally different from the overt racial prejudice that has been the focus of previous research. Strikingly, the language models’ covert and overt racial prejudices are often in contradiction with each other, especially for the most recent language models that have been trained with HF (GPT3.5 and GPT4). These two language models obscure the racism, overtly associating African Americans with exclusively positive attributes (such as ‘brilliant’), but our results show that they covertly associate African Americans with exclusively negative attributes (such as ‘lazy’).

We argue that this paradoxical relation between the language models’ covert and overt racial prejudices manifests the inconsistent racial attitudes present in the contemporary society of the United States 8 , 64 . In the Jim Crow era, stereotypes about African Americans were overtly racist, but the normative climate after the civil rights movement made expressing explicitly racist views distasteful. As a result, racism acquired a covert character and continued to exist on a more subtle level. Thus, most white people nowadays report positive attitudes towards African Americans in surveys but perpetuate racial inequalities through their unconscious behaviour, such as their residential choices 65 . It has been shown that negative stereotypes persist, even if they are superficially rejected 66 , 67 . This ambivalence is reflected by the language models we analysed, which are overtly non-racist but covertly exhibit archaic stereotypes about African Americans, showing that they reproduce a colour-blind racist ideology. Crucially, the civil rights movement is generally seen as the period during which racism shifted from overt to covert 68 , 69 , and this is mirrored by our results: all the language models overtly agree the most with human stereotypes from after the civil rights movement, but covertly agree the most with human stereotypes from before the civil rights movement.

Our findings raise the question of how dialect prejudice got into the language models. Language models are pretrained on web-scraped corpora such as WebText 46, C4 (ref. 48) and the Pile 70, which encode raciolinguistic stereotypes about AAE. A drastic example of this is the use of ‘mock ebonics’ to parody speakers of AAE 71. Crucially, a growing body of evidence indicates that language models pick up prejudices present in the pretraining corpus 72, 73, 74, 75, which would explain how they become prejudiced against speakers of AAE, and why they show varying levels of dialect prejudice as a function of the pretraining corpus. However, the web also abounds with overt racism against African Americans 76, 77, so we wondered why the language models exhibit much less overt than covert racial prejudice. We argue that the reason for this is that the existence of overt racism is generally known to people 32, which is not the case for covert racism 69. Crucially, this also holds for the field of AI. The typical pipeline of training language models includes steps such as data filtering 48 and, more recently, HF training 62 that remove overt racial prejudice. As a result, much of the overt racism on the web does not end up in the language models. However, there are currently no measures in place to curtail covert racial prejudice when training language models. For example, common datasets for HF training 62, 78 do not include examples that would train the language models to treat speakers of AAE and SAE equally. As a result, the covert racism encoded in the training data can make its way into the language models in an unhindered fashion. It is worth mentioning that the lack of awareness of covert racism also manifests during evaluation, where it is common to test language models for overt racism but not for covert racism 21, 63, 79, 80.

As well as the representational harms, by which we mean the pernicious representation of AAE speakers, we also found evidence for substantial allocational harms. This refers to the inequitable allocation of resources to AAE speakers 81 (Barocas et al., unpublished observations), and adds to known cases of language technology putting speakers of AAE at a disadvantage by performing worse on AAE 82 , 83 , 84 , 85 , 86 , 87 , 88 , misclassifying AAE as hate speech 81 , 89 , 90 , 91 or treating AAE as incorrect English 83 , 85 , 92 . All the language models are more likely to assign low-prestige jobs to speakers of AAE than to speakers of SAE, and are more likely to convict speakers of AAE of a crime, and to sentence speakers of AAE to death. Although the details of our tasks are constructed, the findings reveal real and urgent concerns because business and jurisdiction are areas for which AI systems involving language models are currently being developed or deployed. As a consequence, the dialect prejudice we uncovered might already be affecting AI decisions today, for example when a language model is used in application-screening systems to process background information, which might include social-media text. Worryingly, we also observe that larger language models and language models trained with HF exhibit stronger covert, but weaker overt, prejudice. Against the backdrop of continually growing language models and the increasingly widespread adoption of HF training, this has two risks: first, that language models, unbeknownst to developers and users, reach ever-increasing levels of covert prejudice; and second, that developers and users mistake ever-decreasing levels of overt prejudice (the only kind of prejudice currently tested for) for a sign that racism in language models has been solved. 
There is therefore a realistic possibility that the allocational harms caused by dialect prejudice in language models will increase further in the future, perpetuating the racial discrimination experienced by generations of African Americans.

Matched guise probing examines how strongly a language model associates certain tokens, such as personality traits, with AAE compared with SAE. AAE can be viewed as the treatment condition, whereas SAE functions as the control condition. We start by explaining the basic experimental unit of matched guise probing: measuring how a language model associates certain tokens with an individual text in AAE or SAE. Based on this, we introduce two different settings for matched guise probing (meaning-matched and non-meaning-matched), which are both inspired by the matched guise technique used in sociolinguistics 36 , 37 , 93 , 94 and provide complementary views on the attitudes a language model has about a dialect.

The basic experimental unit of matched guise probing is as follows. Let θ be a language model, t be a text in AAE or SAE, and x be a token of interest, typically a personality trait such as ‘intelligent’. We embed the text in a prompt v , for example v ( t ) = ‘a person who says t tends to be’, and compute P ( x ∣ v ( t );  θ ), which is the probability that θ assigns to x after processing v ( t ). We calculate P ( x ∣ v ( t );  θ ) for equally sized sets T a of AAE texts and T s of SAE texts, comparing various tokens from a set X as possible continuations. It has been shown that P ( x ∣ v ( t );  θ ) can be affected by the precise wording of v , so small modifications of v can have an unpredictable effect on the predictions made by the language model 21 , 95 , 96 . To account for this fact, we consider a set V containing several prompts ( Supplementary Information ). For all experiments, we have provided detailed analyses of variation across prompts in the  Supplementary Information .
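A minimal sketch of this experimental unit, assuming a function that returns the probability of a token given a prompt. Here `toy_next_token_prob` is a hypothetical stand-in for a real language model θ (it is not part of any library), and the second prompt template is invented for illustration; only the first template is quoted in the text.

```python
from typing import Callable, Dict, List

# Prompt templates v; the first is quoted in the text, the second is a made-up variant.
PROMPTS = [
    "a person who says {t} tends to be",
    "people who say {t} are usually",  # hypothetical, for illustration only
]


def toy_next_token_prob(prompt: str, token: str) -> float:
    """Hypothetical stand-in for P(x | v(t); theta); a real model call would go here."""
    # Deterministic pseudo-probability in (0, 1) so the sketch runs without a model.
    return (sum(map(ord, prompt + token)) % 97 + 1) / 100.0


def probe(text: str, tokens: List[str],
          prob_fn: Callable[[str, str], float]) -> Dict[str, Dict[str, float]]:
    """Return {prompt: {token: P(token | prompt)}} for a single text t."""
    results = {}
    for template in PROMPTS:
        prompt = template.format(t=text)  # embed t in the prompt v
        results[prompt] = {x: prob_fn(prompt, x) for x in tokens}
    return results


scores = probe("I be so happy", ["intelligent", "lazy"], toy_next_token_prob)
```

Keeping the results separate per prompt makes it possible to inspect variation across prompts before any averaging.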

We conducted matched guise probing in two settings. In the first setting, the texts in T a and T s formed pairs expressing the same underlying meaning, that is, the i -th text in T a (for example, ‘I be so happy when I wake up from a bad dream cus they be feelin too real’) matches the i -th text in T s (for example, ‘I am so happy when I wake up from a bad dream because they feel too real’). For this setting, we used the dataset from ref. 87 , which contains 2,019 AAE tweets together with their SAE translations. In the second setting, the texts in T a and T s did not form pairs, so they were independent texts in AAE and SAE. For this setting, we sampled 2,000 AAE and SAE tweets from the dataset in ref. 83 and used tweets strongly aligned with African Americans for AAE and tweets strongly aligned with white people for SAE ( Supplementary Information (‘Analysis of non-meaning-matched texts’), Supplementary Fig. 1 and Supplementary Table 3 ). In the  Supplementary Information , we include examples of AAE and SAE texts for both settings (Supplementary Tables 1 and 2 ). Tweets are well suited for matched guise probing because they are a rich source of dialectal variation 97 , 98 , 99 , especially for AAE 100 , 101 , 102 , but matched guise probing can be applied to any type of text. Although we do not consider it here, matched guise probing can in principle also be applied to speech-based models, with the potential advantage that dialectal variation on the phonetic level could be captured more directly, which would make it possible to study dialect prejudice specific to regional variants of AAE 23 . However, note that a great deal of phonetic variation is reflected orthographically in social-media texts 101 .

It is important to analyse both meaning-matched and non-meaning-matched settings because they capture different aspects of the attitudes a language model has about speakers of AAE. Controlling for the underlying meaning makes it possible to uncover differences in the attitudes of the language model that are solely due to grammatical and lexical features of AAE. However, it is known that various properties other than linguistic features correlate with dialect, such as topics 45 , and these might also influence the attitudes of the language model. Sidelining such properties bears the risk of underestimating the harms that dialect prejudice causes for speakers of AAE in the real world. For example, in a scenario in which a language model is used in the context of automated personnel selection to screen applicants’ social-media posts, the texts of two competing applicants typically differ in content and do not come in pairs expressing the same meaning. The relative advantages of using meaning-matched or non-meaning-matched data for matched guise probing are conceptually similar to the relative advantages of using the same or different speakers for the matched guise technique: more control in the former versus more naturalness in the latter setting 93 , 94 . Because the results obtained in both settings were consistent overall for all experiments, we aggregated them in the main article, but we analysed differences in detail in the  Supplementary Information .

We applied matched guise probing to five language models: RoBERTa 47, which is an encoder-only language model; GPT2 (ref. 46), GPT3.5 (ref. 49) and GPT4 (ref. 50), which are decoder-only language models; and T5 (ref. 48), which is an encoder–decoder language model. For each language model, we examined one or more model versions: GPT2 (base), GPT2 (medium), GPT2 (large), GPT2 (xl), RoBERTa (base), RoBERTa (large), T5 (small), T5 (base), T5 (large), T5 (3b), GPT3.5 (text-davinci-003) and GPT4 (0613). Where we used several model versions per language model (GPT2, RoBERTa and T5), the model versions all had the same architecture and were trained on the same data but differed in their size. Furthermore, we note that GPT3.5 and GPT4 are the only language models examined in this paper that were trained with HF, specifically reinforcement learning from human feedback 103. When it is clear from the context what is meant, or when the distinction does not matter, we use the term ‘language models’, or sometimes ‘models’, in a more general way that includes individual model versions.

Regarding matched guise probing, the exact method for computing P ( x ∣ v ( t );  θ ) varies across language models and is detailed in the  Supplementary Information . For GPT4, for which computing P ( x ∣ v ( t );  θ ) for all tokens of interest was often not possible owing to restrictions imposed by the OpenAI application programming interface (API), we used a slightly modified method for some of the experiments, and this is also discussed in the  Supplementary Information . Similarly, some of the experiments could not be done for all language models because of model-specific constraints, which we highlight below. We note that there was at most one language model per experiment for which this was the case.

Covert-stereotype analysis

In the covert-stereotype analysis, the tokens x whose probabilities are measured for matched guise probing are trait adjectives from the Princeton Trilogy 29 , 30 , 31 , 34 , such as ‘aggressive’, ‘intelligent’ and ‘quiet’. We provide details about these adjectives in the  Supplementary Information . In the Princeton Trilogy, the adjectives are provided to participants in the form of a list, and participants are asked to select from the list the five adjectives that best characterize a given ethnic group, such as African Americans. The studies that we compare in this paper, which are the original Princeton Trilogy studies 29 , 30 , 31 and a more recent reinstallment 34 , all follow this general set-up and observe a gradual improvement of the expressed stereotypes about African Americans over time, but the exact interpretation of this finding is disputed 32 . Here, we used the adjectives from the Princeton Trilogy in the context of matched guise probing.

Specifically, we first computed P ( x ∣ v ( t ); θ ) for all adjectives, for both the AAE texts and the SAE texts. The method for aggregating the probabilities P ( x ∣ v ( t ); θ ) into association scores between an adjective x and AAE varies for the two settings of matched guise probing. Let \({t}_{{\rm{a}}}^{i}\) be the i -th AAE text in T a and \({t}_{{\rm{s}}}^{i}\) be the i -th SAE text in T s . In the meaning-matched setting, in which \({t}_{{\rm{a}}}^{i}\) and \({t}_{{\rm{s}}}^{i}\) express the same meaning, we computed the prompt-level association score for an adjective x as

\[q(x;v,\theta )=\frac{1}{n}\sum _{i=1}^{n}\log \frac{P(x\mid v({t}_{{\rm{a}}}^{i});\theta )}{P(x\mid v({t}_{{\rm{s}}}^{i});\theta )},\]

where n = ∣ T a ∣ = ∣ T s ∣ . Thus, we measure for each pair of AAE and SAE texts the log ratio of the probability assigned to x following the AAE text and the probability assigned to x following the SAE text, and then average the log ratios of the probabilities across all pairs. In the non-meaning-matched setting, we computed the prompt-level association score for an adjective x as

\[q(x;v,\theta )=\log \frac{\frac{1}{n}\sum _{i=1}^{n}P(x\mid v({t}_{{\rm{a}}}^{i});\theta )}{\frac{1}{n}\sum _{i=1}^{n}P(x\mid v({t}_{{\rm{s}}}^{i});\theta )},\]

where again n = ∣ T a ∣ = ∣ T s ∣ . In other words, we first compute the average probability assigned to a certain adjective x following all AAE texts and the average probability assigned to x following all SAE texts, and then measure the log ratio of these average probabilities. The interpretation of q ( x ;  v ,  θ ) is identical in both settings; q ( x ;  v , θ ) > 0 means that for a certain prompt v , the language model θ associates the adjective x more strongly with AAE than with SAE, and q ( x ;  v ,  θ ) < 0 means that for a certain prompt v , the language model θ associates the adjective x more strongly with SAE than with AAE. In the  Supplementary Information (‘Calibration’), we show that q ( x ;  v , θ ) is calibrated 104 , meaning that it does not depend on the prior probability that θ assigns to x in a neutral context.
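The two aggregation schemes described above (average of per-pair log ratios versus log ratio of average probabilities) can be sketched as follows. The probabilities are invented toy values, and the function names are illustrative rather than taken from the released code.

```python
import math
from typing import Sequence


def matched_score(p_aae: Sequence[float], p_sae: Sequence[float]) -> float:
    """Meaning-matched setting: average of the per-pair log ratios."""
    if len(p_aae) != len(p_sae):
        raise ValueError("T_a and T_s must have the same size")
    return sum(math.log(a / s) for a, s in zip(p_aae, p_sae)) / len(p_aae)


def unmatched_score(p_aae: Sequence[float], p_sae: Sequence[float]) -> float:
    """Non-meaning-matched setting: log ratio of the average probabilities."""
    if len(p_aae) != len(p_sae):
        raise ValueError("T_a and T_s must have the same size")
    mean_a = sum(p_aae) / len(p_aae)
    mean_s = sum(p_sae) / len(p_sae)
    return math.log(mean_a / mean_s)


# Toy probabilities of one adjective x after three AAE and three SAE texts.
p_aae = [0.02, 0.04, 0.08]
p_sae = [0.01, 0.02, 0.08]
q_matched = matched_score(p_aae, p_sae)
q_unmatched = unmatched_score(p_aae, p_sae)
```

Both scores are positive here, meaning that the toy adjective is associated more strongly with AAE than with SAE, but their magnitudes differ because the first scheme weights every pair equally whereas the second is dominated by texts with high absolute probabilities.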

The prompt-level association scores q ( x ;  v ,  θ ) are the basis for further analyses. We start by averaging q ( x ;  v ,  θ ) across model versions, prompts and settings, and this allows us to rank all adjectives according to their overall association with AAE for individual language models (Fig. 2a ). In this and the following adjective analyses, we focus on the five adjectives that exhibit the highest association with AAE, making it possible to consistently compare the language models with the results from the Princeton Trilogy studies, most of which do not report the full ranking of all adjectives. Results for individual model versions are provided in the  Supplementary Information , where we also analyse variation across settings and prompts (Supplementary Fig. 2 and Supplementary Table 4 ).

Next, we wanted to measure the agreement between language models and humans through time. To do so, we considered the five adjectives most strongly associated with African Americans for each study and evaluated how highly these adjectives are ranked by the language models. Specifically, let R l = [ x 1 , …, x ∣ X ∣ ] be the adjective ranking generated by a language model and \({R}_{h}^{5}\) = [ x 1 , …, x 5 ] be the ranking of the top five adjectives generated by the human participants in one of the Princeton Trilogy studies. A typical measure to evaluate how highly the adjectives from \({R}_{h}^{5}\) are ranked within R l is average precision, AP 51. However, AP does not take the internal ranking of the adjectives in \({R}_{h}^{5}\) into account, which is not ideal for our purposes; for example, AP does not distinguish whether the top-ranked adjective for humans is on the first or on the fifth rank for a language model. To remedy this, we computed the mean average precision, MAP, for different subsets of \({R}_{h}^{5}\),

\[{\rm{MAP}}=\frac{1}{5}\sum _{i=1}^{5}{\rm{AP}}({R}_{l},{R}_{h}^{i}),\]

where \({R}_{h}^{i}\) denotes the top i adjectives from the human ranking. MAP = 1 if, and only if, the top five adjectives from \({R}_{h}^{5}\) have an exact one-to-one correspondence with the top five adjectives from R l , so, unlike AP, it takes the internal ranking of the adjectives into account. We computed an individual agreement score for each language model and prompt, so we average the q ( x ;  v ,  θ ) association scores for all model versions of a language model (GPT2, for example) and the two settings (meaning-matched and non-meaning-matched) to generate R l . Because the OpenAI API for GPT4 does not give access to the probabilities for all adjectives, we excluded GPT4 from this analysis. Results are presented in Fig. 2b and Extended Data Table 1 . In the Supplementary Information (‘Agreement analysis’), we analyse variation across model versions, settings and prompts (Supplementary Figs. 3 – 5 ).
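A small sketch of this agreement measure, assuming the standard definition of average precision (the mean of precision at each rank where a relevant item occurs) and averaging it over the nested subsets of the human top five. The adjective rankings below are hypothetical and serve only to show how the score reacts to a swap in rank.

```python
from typing import Sequence, Set


def average_precision(ranking: Sequence[str], relevant: Set[str]) -> float:
    """Standard AP: mean of precision@k over the ranks k holding relevant items."""
    hits, total = 0, 0.0
    for k, item in enumerate(ranking, start=1):
        if item in relevant:
            hits += 1
            total += hits / k
    return total / len(relevant) if relevant else 0.0


def mean_average_precision(model_ranking: Sequence[str],
                           human_top: Sequence[str]) -> float:
    """Average AP over the nested subsets: top 1, top 2, ..., top len(human_top)."""
    aps = [average_precision(model_ranking, set(human_top[:i]))
           for i in range(1, len(human_top) + 1)]
    return sum(aps) / len(aps)


# Hypothetical rankings, for illustration only.
human = ["a1", "a2", "a3", "a4", "a5"]
same_order = ["a1", "a2", "a3", "a4", "a5", "a6"]
swapped = ["a2", "a1", "a3", "a4", "a5", "a6"]

map_same = mean_average_precision(same_order, human)
map_swapped = mean_average_precision(swapped, human)
```

With identical top-five rankings the score is 1, whereas swapping the first two adjectives lowers it, which is the sensitivity to internal ranking that plain AP lacks.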

To analyse the favourability of the stereotypes about African Americans, we drew from crowd-sourced favourability ratings collected previously 34 for the adjectives from the Princeton Trilogy that range between −2 (‘very unfavourable’, meaning very negative) and 2 (‘very favourable’, meaning very positive). For example, the favourability rating of ‘cruel’ is −1.81 and the favourability rating of ‘brilliant’ is 1.86. We computed the average favourability of the top five adjectives, weighting the favourability ratings of individual adjectives by their association scores with AAE and African Americans. More formally, let R 5 = [ x 1 , …, x 5 ] be the ranking of the top five adjectives generated by either a language model or humans. Furthermore, let f ( x ) be the favourability rating of adjective x as reported in ref. 34, and let q ( x ) be the overall association score of adjective x with AAE or African Americans that is used to generate R 5 . For the Princeton Trilogy studies, q ( x ) is the percentage of participants who have assigned x to African Americans. For language models, q ( x ) is the average value of q ( x ; v , θ ). We then computed the weighted average favourability, F , of the top five adjectives as

\[F=\frac{{\sum }_{i=1}^{5}f({x}_{i})\,q({x}_{i})}{{\sum }_{i=1}^{5}q({x}_{i})}.\]

As a result of the weighting, the top-ranked adjective contributed more to the average than the second-ranked adjective, and so on. Results are presented in Extended Data Fig. 1. To check for consistency, we also computed the average favourability of the top five adjectives without weighting, which yields similar results (Supplementary Fig. 6).
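The weighted average can be illustrated with a short sketch. The first two favourability ratings are the values for ‘cruel’ and ‘brilliant’ quoted above; the remaining ratings and all association scores are invented, and the sketch assumes that the association scores of the top five adjectives are positive.

```python
from typing import Sequence


def weighted_favourability(favourability: Sequence[float],
                           association: Sequence[float]) -> float:
    """Average of f(x) over the top adjectives, weighted by association q(x)."""
    if len(favourability) != len(association):
        raise ValueError("need one association score per favourability rating")
    numerator = sum(f * q for f, q in zip(favourability, association))
    return numerator / sum(association)


f_ratings = [-1.81, 1.86, 0.5, -0.5, 1.0]  # last three values are hypothetical
q_scores = [0.4, 0.3, 0.1, 0.1, 0.1]       # hypothetical association scores
F = weighted_favourability(f_ratings, q_scores)

# Unweighted consistency check: every adjective counts equally.
F_unweighted = sum(f_ratings) / len(f_ratings)
```

In this toy case the weighted value is slightly negative although the unweighted mean is positive, because the most strongly associated adjective is also the least favourable one.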

Overt-stereotype analysis

The overt-stereotype analysis closely followed the methodology of the covert-stereotype analysis, with the difference being that instead of providing the language models with AAE and SAE texts, we provided them with overt descriptions of race (specifically, ‘Black’/‘black’ and ‘White’/‘white’). This methodological difference is also reflected by a different set of prompts ( Supplementary Information ). As a result, the experimental set-up is very similar to existing studies on overt racial bias in language models 4 , 7 . All other aspects of the analysis (such as computing adjective association scores) were identical to the analysis for covert stereotypes. This also holds for GPT4, for which we again could not conduct the agreement analysis.

We again present average results for the five language models in the main article. Results broken down for individual model versions are provided in the  Supplementary Information , where we also analyse variation across prompts (Supplementary Fig. 8 and Supplementary Table 5 ).

Employability analysis

The general set-up of the employability analysis was identical to the stereotype analyses: we fed text written in either AAE or SAE, embedded in prompts, into the language models and analysed the probabilities that they assigned to different continuation tokens. However, instead of trait adjectives, we considered occupations for X and also used a different set of prompts ( Supplementary Information ). We created a list of occupations, drawing from previously published lists 6 , 76 , 105 , 106 , 107 . We provided details about these occupations in the  Supplementary Information . We then computed association scores q ( x ;  v ,  θ ) between individual occupations x and AAE, following the same methodology as for computing adjective association scores, and ranked the occupations according to q ( x ;  v ,  θ ) for the language models. To probe the prestige associated with the occupations, we drew from a dataset of occupational prestige 105 that is based on the 2012 US General Social Survey and measures prestige on a scale from 1 (low prestige) to 9 (high prestige). For GPT4, we could not conduct the parts of the analysis that require scores for all occupations.

We again present average results for the five language models in the main article. Results for individual model versions are provided in the  Supplementary Information , where we also analyse variation across settings and prompts (Supplementary Tables 6 – 8 ).

Criminality analysis

The set-up of the criminality analysis is different from the previous experiments in that we did not compute aggregate association scores between certain tokens (such as trait adjectives) and AAE but instead asked the language models to make discrete decisions for each AAE and SAE text. More specifically, we simulated trials in which the language models were prompted to use AAE or SAE texts as evidence to make a judicial decision. We then aggregated the judicial decisions into summary statistics.

We conducted two experiments. In the first experiment, the language models were asked to determine whether a person accused of committing an unspecified crime should be acquitted or convicted. The only evidence provided to the language models was a statement made by the defendant, which was an AAE or SAE text. In the second experiment, the language models were asked to determine whether a person who committed first-degree murder should be sentenced to life or death. Similarly to the first (general conviction) experiment, the only evidence provided to the language models was a statement made by the defendant, which was an AAE or SAE text. Note that the AAE and SAE texts were the same texts as in the other experiments and did not come from a judicial context. Rather than testing how well language models could perform the tasks of predicting acquittal or conviction and life penalty or death penalty (an application of AI that we do not support), we were interested to see to what extent the decisions of the language models, made in the absence of any real evidence, were impacted by dialect. Although providing the language models with extra evidence as well as the AAE and SAE texts would have made the experiments more similar to real trials, it would have confounded the effect that dialect has on its own (the key effect of interest), so we did not consider this alternative set-up here. We focused on convictions and death penalties specifically because these are the two areas of the criminal justice system for which racial disparities have been described in the most robust and indisputable way: African Americans represent about 12% of the adult population of the United States, but they represent 33% of inmates 108 and more than 41% of people on death row 109 .

Methodologically, we used prompts that asked the language models to make a judicial decision ( Supplementary Information ). For a specific text, t , which is in AAE or SAE, we computed P ( x ∣ v ( t ); θ ) for the tokens x that correspond to the judicial outcomes of interest (‘acquitted’ or ‘convicted’, and ‘life’ or ‘death’). T5 does not contain the tokens ‘acquitted’ and ‘convicted’ in its vocabulary, so it was excluded from the conviction analysis. Because the language models might assign different prior probabilities to the outcome tokens, we calibrated them using their probabilities in a neutral context following v , meaning without text t 104. Whichever outcome had the higher calibrated probability was counted as the decision. We aggregated the detrimental decisions (convictions and death penalties) and compared their rates (percentages) between AAE and SAE texts. An alternative approach would have been to generate the judicial decision by sampling from the language models, which would have allowed us to induce the language models to generate justifications of their decisions. However, this approach has three disadvantages: first, encoder-only language models such as RoBERTa do not lend themselves to text generation; second, it would have been necessary to apply jail-breaking for some of the language models, which can have unpredictable effects, especially in the context of socially sensitive tasks; and third, model-generated justifications are frequently not aligned with actual model behaviours 110.
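The decision rule can be sketched as follows. This is an illustration under a stated assumption: calibration is implemented here as dividing each outcome probability by its probability in the neutral context, which is one common form of calibration and may differ in detail from the procedure documented in the Supplementary Information. All probabilities are invented.

```python
from typing import Dict, List

DETRIMENTAL = {"convicted", "death"}


def decide(p_with_text: Dict[str, float], p_neutral: Dict[str, float]) -> str:
    """Return the outcome with the highest calibrated probability.

    Assumption: calibrated score = P(x | v(t)) / P(x | v without t).
    """
    calibrated = {x: p_with_text[x] / p_neutral[x] for x in p_with_text}
    return max(calibrated, key=calibrated.get)


def detrimental_rate(decisions: List[str]) -> float:
    """Percentage of decisions that are convictions or death sentences."""
    return 100.0 * sum(d in DETRIMENTAL for d in decisions) / len(decisions)


# Toy prior of a model that favours 'convicted' even without any text.
neutral = {"acquitted": 0.2, "convicted": 0.6}
d1 = decide({"acquitted": 0.25, "convicted": 0.6}, neutral)
d2 = decide({"acquitted": 0.1, "convicted": 0.7}, neutral)
rate = detrimental_rate([d1, d2])
```

The first toy text yields an acquittal even though ‘convicted’ has the higher raw probability, because only ‘acquitted’ rises relative to the neutral context; this is the effect that calibration is meant to capture.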

We again present average results on the level of language models in the main article. Results for individual model versions are provided in the  Supplementary Information , where we also analyse variation across settings and prompts (Supplementary Figs. 9 and 10 and Supplementary Tables 9 – 12 ).

Scaling analysis

In the scaling analysis, we examined whether increasing the model size alleviated the dialect prejudice. Because the content of the covert stereotypes is quite consistent and does not vary substantially between models with different sizes, we instead analysed the strength with which the language models maintain these stereotypes. We split the model versions of all language models into four groups according to their size using the thresholds of 1.5 × 10 8 , 3.5 × 10 8 and 1.0 × 10 10 (Extended Data Table 7 ).
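As a rough illustration, the grouping by parameter count can be expressed with a sorted threshold list. The class labels follow the Fig. 4 caption; how a model that lies exactly on a threshold is treated is an assumption of this sketch (it is placed in the larger class), and the parameter counts in the example are arbitrary.

```python
from bisect import bisect_right

THRESHOLDS = [1.5e8, 3.5e8, 1.0e10]  # thresholds on the number of parameters
LABELS = ["small", "medium", "large", "very large"]


def size_class(n_params: float) -> str:
    """Map a parameter count to one of the four size groups."""
    # bisect_right counts how many thresholds are <= n_params.
    return LABELS[bisect_right(THRESHOLDS, n_params)]


# Arbitrary example counts, not the sizes of any specific model version.
examples = {n: size_class(n) for n in (1.0e8, 2.0e8, 1.0e9, 1.0e11)}
```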

To evaluate the familiarity of the models with AAE, we measured their perplexity on the datasets used for the two evaluation settings 83 , 87 . Perplexity is defined as the exponentiated average negative log-likelihood of a sequence of tokens 111 , with lower values indicating higher familiarity. Perplexity requires the language models to assign probabilities to full sequences of tokens, which is only the case for GPT2 and GPT3.5. For RoBERTa and T5, we resorted to pseudo-perplexity 112 as the measure of familiarity. Results are only comparable across language models with the same familiarity measure. We excluded GPT4 from this analysis because it is not possible to compute perplexity using the OpenAI API.
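The definition of perplexity given above translates directly into code. The sketch assumes that per-token log-probabilities (natural logarithm) have already been obtained from a model; the values used here are toy inputs.

```python
import math
from typing import Sequence


def perplexity(token_log_probs: Sequence[float]) -> float:
    """Exponentiated average negative log-likelihood of a token sequence."""
    if not token_log_probs:
        raise ValueError("need at least one token")
    return math.exp(-sum(token_log_probs) / len(token_log_probs))


# A model that assigns probability 0.5 to every token has perplexity 2.
ppl_half = perplexity([math.log(0.5)] * 4)
# Lower probabilities (less familiarity with the text) give higher perplexity.
ppl_low = perplexity([math.log(0.1)] * 4)
```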

To evaluate the stereotype strength, we focused on the stereotypes about African Americans reported in ref. 29, which the language models’ covert stereotypes agree with most strongly. We split the set of adjectives X into two subsets: the set of stereotypical adjectives in ref. 29, X s , and the set of non-stereotypical adjectives, X n = X \ X s . For each model with a specific size, we then computed the average value of q ( x ; v , θ ) for all adjectives in X s , which we denote as q s ( θ ), and the average value of q ( x ; v , θ ) for all adjectives in X n , which we denote as q n ( θ ). The stereotype strength of a model θ , or more specifically the strength of the stereotypes about African Americans reported in ref. 29, can then be computed as

\[\delta (\theta )={q}_{s}(\theta )-{q}_{n}(\theta ).\]

A positive value of δ ( θ ) means that the model associates the stereotypical adjectives in X s more strongly with AAE than the non-stereotypical adjectives in X n , whereas a negative value of δ ( θ ) indicates anti-stereotypical associations, meaning that the model associates the non-stereotypical adjectives in X n more strongly with AAE than the stereotypical adjectives in X s . For the overt stereotypes, we used the same split of adjectives into X s and X n because we wanted to directly compare the strength with which models of a certain size endorse the stereotypes overtly as opposed to covertly. All other aspects of the experimental set-up are identical to the main analyses of covert and overt stereotypes.
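A brief sketch of this measure, taking the stereotype strength to be the difference between the average association score of the stereotypical adjectives and that of the remaining adjectives, consistent with the sign interpretation above. The adjective names and scores are placeholders.

```python
from typing import Dict, Set


def stereotype_strength(q: Dict[str, float], stereotypical: Set[str]) -> float:
    """Mean score of adjectives in X_s minus mean score of adjectives in X_n."""
    in_s = [score for x, score in q.items() if x in stereotypical]
    in_n = [score for x, score in q.items() if x not in stereotypical]
    if not in_s or not in_n:
        raise ValueError("both adjective subsets must be non-empty")
    return sum(in_s) / len(in_s) - sum(in_n) / len(in_n)


# Placeholder adjectives with hypothetical association scores.
q_scores = {"adj1": 0.3, "adj2": 0.1, "adj3": -0.2, "adj4": 0.0}
delta = stereotype_strength(q_scores, {"adj1", "adj2"})
delta_anti = stereotype_strength(q_scores, {"adj3", "adj4"})
```

The first call gives a positive value (stereotypical adjectives are more strongly associated with AAE), and swapping the subsets gives the corresponding negative, anti-stereotypical value.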

HF analysis

We compared GPT3.5 (ref. 49; text-davinci-003) with GPT3 (ref. 63; davinci), its predecessor language model that was trained without HF. Similarly to other studies that compare these two language models (ref. 113), this set-up allowed us to examine the effects of HF training as done for GPT3.5 in isolation. We compared the two language models in terms of favourability and stereotype strength. For favourability, we followed the methodology we used for the overt-stereotype analysis and evaluated the average weighted favourability of the top five adjectives associated with AAE. For stereotype strength, we followed the methodology we used for the scaling analysis and evaluated the average strength of the stereotypes as reported in ref. 29.
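As a rough illustration of an average weighted favourability over the top five adjectives, the sketch below ranks adjectives by an association score and uses those same scores as weights. The choice of weights is an assumption made here for illustration, because this section does not specify the weighting. All names and numbers are hypothetical.

```python
def weighted_favourability(association, favourability, k=5):
    """Weighted mean favourability of the k most strongly associated adjectives.

    association: dict of adjective -> association score (assumed weights).
    favourability: dict of adjective -> favourability rating.
    """
    top = sorted(association, key=association.get, reverse=True)[:k]
    total_weight = sum(association[adj] for adj in top)
    if total_weight <= 0:
        raise ValueError("weights of the top adjectives must sum to a positive value")
    weighted_sum = sum(association[adj] * favourability[adj] for adj in top)
    return weighted_sum / total_weight


# Hypothetical scores and ratings, made up for illustration only.
association = {"a": 4.0, "b": 2.0, "c": 1.0, "d": 0.5, "e": 0.5, "f": 0.1}
ratings = {"a": -1.0, "b": 1.0, "c": 0.0, "d": 2.0, "e": -2.0, "f": 2.0}

result = weighted_favourability(association, ratings)
print(result)
```

Running the same function on the scores of two models would give one favourability value per model, which could then be compared directly.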

Reporting summary

Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.

Data availability

All the datasets used in this study are publicly available. The dataset released as ref. 87 can be found at https://aclanthology.org/2020.emnlp-main.473/. The dataset released as ref. 83 can be found at http://slanglab.cs.umass.edu/TwitterAAE/. The human stereotype scores used for evaluation can be found in the published articles of the Princeton Trilogy studies (refs. 29, 30, 31, 34). The most recent of these articles (ref. 34) also contains the human favourability scores for the trait adjectives. The dataset of occupational prestige that we used for the employability analysis can be found in the corresponding paper (ref. 105). The Brown Corpus (ref. 114), which we used for the Supplementary Information ('Feature analysis'), can be found at http://www.nltk.org/nltk_data/. The dataset containing the parallel AAE, Appalachian English and Indian English texts (ref. 115), which we used in the Supplementary Information ('Alternative explanations'), can be found at https://huggingface.co/collections/SALT-NLP/value-nlp-666b60a7f76c14551bda4f52.

Code availability

Our code is written in Python and draws on the Python packages openai and transformers for language-model probing, as well as numpy, pandas, scipy and statsmodels for data analysis. The feature analysis described in the Supplementary Information also uses the VALUE Python library (ref. 88). Our code is publicly available on GitHub at https://github.com/valentinhofmann/dialect-prejudice.

Zhao, W. et al. WildChat: 1M ChatGPT interaction logs in the wild. In Proc. Twelfth International Conference on Learning Representations (OpenReview.net, 2024).

Zheng, L. et al. LMSYS-Chat-1M: a large-scale real-world LLM conversation dataset. In Proc. Twelfth International Conference on Learning Representations (OpenReview.net, 2024).

Gaebler, J. D., Goel, S., Huq, A. & Tambe, P. Auditing the use of language models to guide hiring decisions. Preprint at https://arxiv.org/abs/2404.03086 (2024).

Sheng, E., Chang, K.-W., Natarajan, P. & Peng, N. The woman worked as a babysitter: on biases in language generation. In Proc. 2019 Conference on Empirical Methods in Natural Language Processing and 9th International Joint Conference on Natural Language Processing (eds Inui, K. et al.) 3407–3412 (Association for Computational Linguistics, 2019).

Nangia, N., Vania, C., Bhalerao, R. & Bowman, S. R. CrowS-Pairs: a challenge dataset for measuring social biases in masked language models. In Proc. 2020 Conference on Empirical Methods in Natural Language Processing (eds Webber, B. et al.) 1953–1967 (Association for Computational Linguistics, 2020).

Nadeem, M., Bethke, A. & Reddy, S. StereoSet: measuring stereotypical bias in pretrained language models. In Proc. 59th Annual Meeting of the Association for Computational Linguistics and 11th International Joint Conference on Natural Language Processing (eds Zong, C. et al.) 5356–5371 (Association for Computational Linguistics, 2021).

Cheng, M., Durmus, E. & Jurafsky, D. Marked personas: using natural language prompts to measure stereotypes in language models. In Proc. 61st Annual Meeting of the Association for Computational Linguistics (eds Rogers, A. et al.) 1504–1532 (Association for Computational Linguistics, 2023).

Bonilla-Silva, E. Racism without Racists: Color-Blind Racism and the Persistence of Racial Inequality in America 4th edn (Rowman & Littlefield, 2014).

Golash-Boza, T. A critical and comprehensive sociological theory of race and racism. Sociol. Race Ethn. 2 , 129–141 (2016).

Kasneci, E. et al. ChatGPT for good? On opportunities and challenges of large language models for education. Learn. Individ. Differ. 103 , 102274 (2023).

Nay, J. J. et al. Large language models as tax attorneys: a case study in legal capabilities emergence. Philos. Trans. R. Soc. A 382 , 20230159 (2024).

Jiang, L. Y. et al. Health system-scale language models are all-purpose prediction engines. Nature 619 , 357–362 (2023).

Bolukbasi, T., Chang, K.-W., Zou, J., Saligrama, V. & Kalai, A. Man is to computer programmer as woman is to homemaker? Debiasing word embeddings. Adv. Neural Inf. Process. Syst. 30 , 4356–4364 (2016).

Caliskan, A., Bryson, J. J. & Narayanan, A. Semantics derived automatically from language corpora contain human-like biases. Science 356 , 183–186 (2017).

Basta, C., Costa-jussà, M. R. & Casas, N. Evaluating the underlying gender bias in contextualized word embeddings. In Proc. First Workshop on Gender Bias in Natural Language Processing (eds Costa-jussà, M. R. et al.) 33–39 (Association for Computational Linguistics, 2019).

Kurita, K., Vyas, N., Pareek, A., Black, A. W. & Tsvetkov, Y. Measuring bias in contextualized word representations. In Proc. First Workshop on Gender Bias in Natural Language Processing (eds Costa-jussà, M. R. et al.) 166–172 (Association for Computational Linguistics, 2019).

Abid, A., Farooqi, M. & Zou, J. Persistent anti-muslim bias in large language models. In Proc. 2021 AAAI/ACM Conference on AI, Ethics, and Society (eds Fourcade, M. et al.) 298–306 (Association for Computing Machinery, 2021).

Bender, E. M., Gebru, T., McMillan-Major, A. & Shmitchell, S. On the dangers of stochastic parrots: can language models be too big? In Proc. 2021 ACM Conference on Fairness, Accountability, and Transparency 610–623 (Association for Computing Machinery, 2021).

Li, L. & Bamman, D. Gender and representation bias in GPT-3 generated stories. In Proc. Third Workshop on Narrative Understanding (eds Akoury, N. et al.) 48–55 (Association for Computational Linguistics, 2021).

Tamkin, A. et al. Evaluating and mitigating discrimination in language model decisions. Preprint at https://arxiv.org/abs/2312.03689 (2023).

Rae, J. W. et al. Scaling language models: methods, analysis & insights from training Gopher. Preprint at https://arxiv.org/abs/2112.11446 (2021).

Green, L. J. African American English: A Linguistic Introduction (Cambridge Univ. Press, 2002).

King, S. From African American Vernacular English to African American Language: rethinking the study of race and language in African Americans’ speech. Annu. Rev. Linguist. 6 , 285–300 (2020).

Purnell, T., Idsardi, W. & Baugh, J. Perceptual and phonetic experiments on American English dialect identification. J. Lang. Soc. Psychol. 18 , 10–30 (1999).

Massey, D. S. & Lundy, G. Use of Black English and racial discrimination in urban housing markets: new methods and findings. Urban Aff. Rev. 36 , 452–469 (2001).

Dunbar, A., King, S. & Vaughn, C. Dialect on trial: an experimental examination of raciolinguistic ideologies and character judgments. Race Justice https://doi.org/10.1177/21533687241258772 (2024).

Rickford, J. R. & King, S. Language and linguistics on trial: Hearing Rachel Jeantel (and other vernacular speakers) in the courtroom and beyond. Language 92 , 948–988 (2016).

Grogger, J. Speech patterns and racial wage inequality. J. Hum. Resour. 46 , 1–25 (2011).

Katz, D. & Braly, K. Racial stereotypes of one hundred college students. J. Abnorm. Soc. Psychol. 28 , 280–290 (1933).

Gilbert, G. M. Stereotype persistence and change among college students. J. Abnorm. Soc. Psychol. 46 , 245–254 (1951).

Karlins, M., Coffman, T. L. & Walters, G. On the fading of social stereotypes: studies in three generations of college students. J. Pers. Soc. Psychol. 13 , 1–16 (1969).

Devine, P. G. & Elliot, A. J. Are racial stereotypes really fading? The Princeton Trilogy revisited. Pers. Soc. Psychol. Bull. 21 , 1139–1150 (1995).

Madon, S. et al. Ethnic and national stereotypes: the Princeton Trilogy revisited and revised. Pers. Soc. Psychol. Bull. 27 , 996–1010 (2001).

Bergsieker, H. B., Leslie, L. M., Constantine, V. S. & Fiske, S. T. Stereotyping by omission: eliminate the negative, accentuate the positive. J. Pers. Soc. Psychol. 102 , 1214–1238 (2012).

Ghavami, N. & Peplau, L. A. An intersectional analysis of gender and ethnic stereotypes: testing three hypotheses. Psychol. Women Q. 37 , 113–127 (2013).

Lambert, W. E., Hodgson, R. C., Gardner, R. C. & Fillenbaum, S. Evaluational reactions to spoken languages. J. Abnorm. Soc. Psychol. 60 , 44–51 (1960).

Ball, P. Stereotypes of Anglo-Saxon and non-Anglo-Saxon accents: some exploratory Australian studies with the matched guise technique. Lang. Sci. 5 , 163–183 (1983).

Thomas, E. R. & Reaser, J. Delimiting perceptual cues used for the ethnic labeling of African American and European American voices. J. Socioling. 8 , 54–87 (2004).

Atkins, C. P. Do employment recruiters discriminate on the basis of nonstandard dialect? J. Employ. Couns. 30 , 108–118 (1993).

Payne, K., Downing, J. & Fleming, J. C. Speaking Ebonics in a professional context: the role of ethos/source credibility and perceived sociability of the speaker. J. Tech. Writ. Commun. 30 , 367–383 (2000).

Rodriguez, J. I., Cargile, A. C. & Rich, M. D. Reactions to African-American vernacular English: do more phonological features matter? West. J. Black Stud. 28 , 407–414 (2004).

Billings, A. C. Beyond the Ebonics debate: attitudes about Black and standard American English. J. Black Stud. 36 , 68–81 (2005).

Kurinec, C. A. & Weaver, C. III “Sounding Black”: speech stereotypicality activates racial stereotypes and expectations about appearance. Front. Psychol. 12 , 785283 (2021).

Rosa, J. & Flores, N. Unsettling race and language: toward a raciolinguistic perspective. Lang. Soc. 46 , 621–647 (2017).

Salehi, B., Hovy, D., Hovy, E. & Søgaard, A. Huntsville, hospitals, and hockey teams: names can reveal your location. In Proc. 3rd Workshop on Noisy User-generated Text (eds Derczynski, L. et al.) 116–121 (Association for Computational Linguistics, 2017).

Radford, A. et al. Language models are unsupervised multitask learners. OpenAI https://cdn.openai.com/better-language-models/language_models_are_unsupervised_multitask_learners.pdf (2019).

Liu, Y. et al. RoBERTa: a robustly optimized BERT pretraining approach. Preprint at https://arxiv.org/abs/1907.11692 (2019).

Raffel, C. et al. Exploring the limits of transfer learning with a unified text-to-text transformer. J. Mach. Learn. Res. 21 , 1–67 (2020).

Ouyang, L. et al. Training language models to follow instructions with human feedback. In Proc. 36th Conference on Neural Information Processing Systems (eds Koyejo, S. et al.) 27730–27744 (NeurIPS, 2022).

OpenAI et al. GPT-4 technical report. Preprint at https://arxiv.org/abs/2303.08774 (2023).

Zhang, E. & Zhang, Y. Average precision. In Encyclopedia of Database Systems (eds Liu, L. & Özsu, M. T.) 192–193 (Springer, 2009).

Black, J. S. & van Esch, P. AI-enabled recruiting: what is it and how should a manager use it? Bus. Horiz. 63 , 215–226 (2020).

Hunkenschroer, A. L. & Luetge, C. Ethics of AI-enabled recruiting and selection: a review and research agenda. J. Bus. Ethics 178 , 977–1007 (2022).

Upadhyay, A. K. & Khandelwal, K. Applying artificial intelligence: implications for recruitment. Strateg. HR Rev. 17 , 255–258 (2018).

Tippins, N. T., Oswald, F. L. & McPhail, S. M. Scientific, legal, and ethical concerns about AI-based personnel selection tools: a call to action. Pers. Assess. Decis. 7 , 1 (2021).

Aletras, N., Tsarapatsanis, D., Preoţiuc-Pietro, D. & Lampos, V. Predicting judicial decisions of the European Court of Human Rights: a natural language processing perspective. PeerJ Comput. Sci. 2 , e93 (2016).

Surden, H. Artificial intelligence and law: an overview. Ga State Univ. Law Rev. 35 , 1305–1337 (2019).

Medvedeva, M., Vols, M. & Wieling, M. Using machine learning to predict decisions of the European Court of Human Rights. Artif. Intell. Law 28 , 237–266 (2020).

Weidinger, L. et al. Taxonomy of risks posed by language models. In Proc. 2022 ACM Conference on Fairness, Accountability, and Transparency 214–229 (Association for Computing Machinery, 2022).

Czopp, A. M. & Monteith, M. J. Thinking well of African Americans: measuring complimentary stereotypes and negative prejudice. Basic Appl. Soc. Psychol. 28 , 233–250 (2006).

Chowdhery, A. et al. PaLM: scaling language modeling with pathways. J. Mach. Learn. Res. 24 , 11324–11436 (2023).

Bai, Y. et al. Training a helpful and harmless assistant with reinforcement learning from human feedback. Preprint at https://arxiv.org/abs/2204.05862 (2022).

Brown, T. B. et al. Language models are few-shot learners. In Proc. 34th International Conference on Neural Information Processing Systems (eds Larochelle, H. et al.) 1877–1901 (NeurIPS, 2020).

Dovidio, J. F. & Gaertner, S. L. Aversive racism. Adv. Exp. Soc. Psychol. 36 , 1–52 (2004).

Schuman, H., Steeh, C., Bobo, L. D. & Krysan, M. (eds) Racial Attitudes in America: Trends and Interpretations (Harvard Univ. Press, 1998).

Crosby, F., Bromley, S. & Saxe, L. Recent unobtrusive studies of Black and White discrimination and prejudice: a literature review. Psychol. Bull. 87 , 546–563 (1980).

Terkel, S. Race: How Blacks and Whites Think and Feel about the American Obsession (New Press, 1992).

Jackman, M. R. & Muha, M. J. Education and intergroup attitudes: moral enlightenment, superficial democratic commitment, or ideological refinement? Am. Sociol. Rev. 49 , 751–769 (1984).

Bonilla-Silva, E. The New Racism: Racial Structure in the United States, 1960s–1990s. In Race, Ethnicity, and Nationality in the United States: Toward the Twenty-First Century 1st edn (ed. Wong, P.) Ch. 4 (Westview Press, 1999).

Gao, L. et al. The Pile: an 800GB dataset of diverse text for language modeling. Preprint at https://arxiv.org/abs/2101.00027 (2021).

Ronkin, M. & Karn, H. E. Mock Ebonics: linguistic racism in parodies of Ebonics on the internet. J. Socioling. 3 , 360–380 (1999).

Dodge, J. et al. Documenting large webtext corpora: a case study on the Colossal Clean Crawled Corpus. In Proc. 2021 Conference on Empirical Methods in Natural Language Processing (eds Moens, M.-F. et al.) 1286–1305 (Association for Computational Linguistics, 2021).

Steed, R., Panda, S., Kobren, A. & Wick, M. Upstream mitigation is not all you need: testing the bias transfer hypothesis in pre-trained language models. In Proc. 60th Annual Meeting of the Association for Computational Linguistics (eds Muresan, S. et al.) 3524–3542 (Association for Computational Linguistics, 2022).

Feng, S., Park, C. Y., Liu, Y. & Tsvetkov, Y. From pretraining data to language models to downstream tasks: tracking the trails of political biases leading to unfair NLP models. In Proc. 61st Annual Meeting of the Association for Computational Linguistics (eds Rogers, A. et al.) 11737–11762 (Association for Computational Linguistics, 2023).

Köksal, A. et al. Language-agnostic bias detection in language models with bias probing. In Findings of the Association for Computational Linguistics: EMNLP 2023 (eds Bouamor, H. et al.) 12735–12747 (Association for Computational Linguistics, 2023).

Garg, N., Schiebinger, L., Jurafsky, D. & Zou, J. Word embeddings quantify 100 years of gender and ethnic stereotypes. Proc. Natl Acad. Sci. USA 115 , E3635–E3644 (2018).

Ferrer, X., van Nuenen, T., Such, J. M. & Criado, N. Discovering and categorising language biases in Reddit. In Proc. Fifteenth International AAAI Conference on Web and Social Media (eds Budak, C. et al.) 140–151 (Association for the Advancement of Artificial Intelligence, 2021).

Ethayarajh, K., Choi, Y. & Swayamdipta, S. Understanding dataset difficulty with V-usable information. In Proc. 39th International Conference on Machine Learning (eds Chaudhuri, K. et al.) 5988–6008 (Proceedings of Machine Learning Research, 2022).

Hoffmann, J. et al. Training compute-optimal large language models. Preprint at https://arxiv.org/abs/2203.15556 (2022).

Liang, P. et al. Holistic evaluation of language models. Transactions on Machine Learning Research https://openreview.net/forum?id=iO4LZibEqW (2023).

Blodgett, S. L., Barocas, S., Daumé III, H. & Wallach, H. Language (technology) is power: A critical survey of “bias” in NLP. In Proc. 58th Annual Meeting of the Association for Computational Linguistics (eds Jurafsky, D. et al.) 5454–5476 (Association for Computational Linguistics, 2020).

Jørgensen, A., Hovy, D. & Søgaard, A. Challenges of studying and processing dialects in social media. In Proc. Workshop on Noisy User-generated Text (eds Xu, W. et al.) 9–18 (Association for Computational Linguistics, 2015).

Blodgett, S. L., Green, L. & O’Connor, B. Demographic dialectal variation in social media: a case study of African-American English. In Proc. 2016 Conference on Empirical Methods in Natural Language Processing (eds Su, J. et al.) 1119–1130 (Association for Computational Linguistics, 2016).

Jørgensen, A., Hovy, D. & Søgaard, A. Learning a POS tagger for AAVE-like language. In Proc. 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (eds Knight, K. et al.) 1115–1120 (Association for Computational Linguistics, 2016).

Blodgett, S. L. & O’Connor, B. Racial disparity in natural language processing: a case study of social media African-American English. Preprint at https://arxiv.org/abs/1707.00061 (2017).

Blodgett, S. L., Wei, J. & O’Connor, B. Twitter universal dependency parsing for African-American and mainstream American English. In Proc. 56th Annual Meeting of the Association for Computational Linguistics (eds Gurevych, I. & Miyao, Y.) 1415–1425 (Association for Computational Linguistics, 2018).

Groenwold, S. et al. Investigating African-American vernacular English in transformer-based text generation. In Proc. 2020 Conference on Empirical Methods in Natural Language Processing (eds Webber, B. et al.) 5877–5883 (Association for Computational Linguistics, 2020).

Ziems, C., Chen, J., Harris, C., Anderson, J. & Yang, D. VALUE: Understanding dialect disparity in NLU. In Proc. 60th Annual Meeting of the Association for Computational Linguistics (eds Muresan, S. et al.) 3701–3720 (Association for Computational Linguistics, 2022).

Davidson, T., Bhattacharya, D. & Weber, I. Racial bias in hate speech and abusive language detection datasets. In Proc. Third Workshop on Abusive Language Online (eds Roberts, S. T. et al.) 25–35 (Association for Computational Linguistics, 2019).

Sap, M., Card, D., Gabriel, S., Choi, Y. & Smith, N. A. The risk of racial bias in hate speech detection. In Proc. 57th Annual Meeting of the Association for Computational Linguistics (eds Korhonen, A. et al.) 1668–1678 (Association for Computational Linguistics, 2019).

Harris, C., Halevy, M., Howard, A., Bruckman, A. & Yang, D. Exploring the role of grammar and word choice in bias toward African American English (AAE) in hate speech classification. In Proc. 2022 ACM Conference on Fairness, Accountability, and Transparency 789–798 (Association for Computing Machinery, 2022).

Gururangan, S. et al. Whose language counts as high quality? Measuring language ideologies in text data selection. In Proc. 2022 Conference on Empirical Methods in Natural Language Processing (eds Goldberg, Y. et al.) 2562–2580 (Association for Computational Linguistics, 2022).

Gaies, S. J. & Beebe, J. D. The matched-guise technique for measuring attitudes and their implications for language education: a critical assessment. In Language Acquisition and the Second/Foreign Language Classroom (ed. Sadtano, E.) 156–178 (SEAMEO Regional Language Centre, 1991).

Hudson, R. A. Sociolinguistics (Cambridge Univ. Press, 1996).

Delobelle, P., Tokpo, E., Calders, T. & Berendt, B. Measuring fairness with biased rulers: a comparative study on bias metrics for pre-trained language models. In Proc. 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (eds Carpuat, M. et al.) 1693–1706 (Association for Computational Linguistics, 2022).

Mattern, J., Jin, Z., Sachan, M., Mihalcea, R. & Schölkopf, B. Understanding stereotypes in language models: Towards robust measurement and zero-shot debiasing. Preprint at https://arxiv.org/abs/2212.10678 (2022).

Eisenstein, J., O’Connor, B., Smith, N. A. & Xing, E. P. A latent variable model for geographic lexical variation. In Proc. 2010 Conference on Empirical Methods in Natural Language Processing (eds Li, H. & Màrquez, L.) 1277–1287 (Association for Computational Linguistics, 2010).

Doyle, G. Mapping dialectal variation by querying social media. In Proc. 14th Conference of the European Chapter of the Association for Computational Linguistics (eds Wintner, S. et al.) 98–106 (Association for Computational Linguistics, 2014).

Huang, Y., Guo, D., Kasakoff, A. & Grieve, J. Understanding U.S. regional linguistic variation with Twitter data analysis. Comput. Environ. Urban Syst. 59 , 244–255 (2016).

Eisenstein, J. What to do about bad language on the internet. In Proc. 2013 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (eds Vanderwende, L. et al.) 359–369 (Association for Computational Linguistics, 2013).

Eisenstein, J. Systematic patterning in phonologically-motivated orthographic variation. J. Socioling. 19 , 161–188 (2015).

Jones, T. Toward a description of African American vernacular English dialect regions using “Black Twitter”. Am. Speech 90 , 403–440 (2015).

Christiano, P. F. et al. Deep reinforcement learning from human preferences. In Proc. 31st International Conference on Neural Information Processing Systems (eds von Luxburg, U. et al.) 4302–4310 (NeurIPS, 2017).

Zhao, T. Z., Wallace, E., Feng, S., Klein, D. & Singh, S. Calibrate before use: Improving few-shot performance of language models. In Proc. 38th International Conference on Machine Learning (eds Meila, M. & Zhang, T.) 12697–12706 (Proceedings of Machine Learning Research, 2021).

Smith, T. W. & Son, J. Measuring Occupational Prestige on the 2012 General Social Survey (NORC at Univ. Chicago, 2014).

Zhao, J., Wang, T., Yatskar, M., Ordonez, V. & Chang, K.-W. Gender bias in coreference resolution: evaluation and debiasing methods. In Proc. 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (eds Walker, M. et al.) 15–20 (Association for Computational Linguistics, 2018).

Hughes, B. T., Srivastava, S., Leszko, M. & Condon, D. M. Occupational prestige: the status component of socioeconomic status. Collabra Psychol. 10 , 92882 (2024).

Gramlich, J. The gap between the number of blacks and whites in prison is shrinking. Pew Research Centre https://www.pewresearch.org/short-reads/2019/04/30/shrinking-gap-between-number-of-blacks-and-whites-in-prison (2019).

Walsh, A. The criminal justice system is riddled with racial disparities. Prison Policy Initiative Briefing https://www.prisonpolicy.org/blog/2016/08/15/cjrace (2016).

Röttger, P. et al. Political compass or spinning arrow? Towards more meaningful evaluations for values and opinions in large language models. Preprint at https://arxiv.org/abs/2402.16786 (2024).

Jurafsky, D. & Martin, J. H. Speech and Language Processing: An Introduction to Natural Language Processing, Computational Linguistics, and Speech Recognition (Prentice Hall, 2000).

Salazar, J., Liang, D., Nguyen, T. Q. & Kirchhoff, K. Masked language model scoring. In Proc. 58th Annual Meeting of the Association for Computational Linguistics (eds Jurafsky, D. et al.) 2699–2712 (Association for Computational Linguistics, 2020).

Santurkar, S. et al. Whose opinions do language models reflect? In Proc. 40th International Conference on Machine Learning (eds Krause, A. et al.) 29971–30004 (Proceedings of Machine Learning Research, 2023).

Francis, W. N. & Kucera, H. Brown Corpus Manual (Brown Univ., 1979).

Ziems, C. et al. Multi-VALUE: a framework for cross-dialectal English NLP. In Proc. 61st Annual Meeting of the Association for Computational Linguistics (eds Rogers, A. et al.) 744–768 (Association for Computational Linguistics, 2023).

Acknowledgements

V.H. was funded by the German Academic Scholarship Foundation. P.R.K. was funded in part by the Open Phil AI Fellowship. This work was also funded by the Hoffman-Yee Research Grants programme and the Stanford Institute for Human-Centered Artificial Intelligence. We thank A. Köksal, D. Hovy, K. Gligorić, M. Harrington, M. Casillas, M. Cheng and P. Röttger for feedback on an earlier version of the article.

Author information

Authors and affiliations.

Allen Institute for AI, Seattle, WA, USA

Valentin Hofmann

University of Oxford, Oxford, UK

LMU Munich, Munich, Germany

Stanford University, Stanford, CA, USA

Pratyusha Ria Kalluri & Dan Jurafsky

The University of Chicago, Chicago, IL, USA

Sharese King

Contributions

V.H., P.R.K., D.J. and S.K. designed the research. V.H. performed the research and analysed the data. V.H., P.R.K., D.J. and S.K. wrote the paper.

Corresponding authors

Correspondence to Valentin Hofmann or Sharese King.

Ethics declarations

Competing interests.

The authors declare no competing interests.

Peer review

Peer review information.

Nature thanks Rodney Coates and the other, anonymous, reviewer(s) for their contribution to the peer review of this work.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Extended data figures and tables

Extended Data Fig. 1 Weighted average favourability of top stereotypes about African Americans in humans and top overt as well as covert stereotypes about African Americans in language models (LMs).

The overt stereotypes are more favourable than the reported human stereotypes, except for GPT2. The covert stereotypes are substantially less favourable than the least favourable reported human stereotypes from 1933. Results without weighting, which are very similar, are provided in Supplementary Fig. 6.

Extended Data Fig. 2 Prestige of occupations associated with AAE (positive values) versus SAE (negative values), for individual language models.

The shaded areas show 95% confidence bands around the regression lines. The association with AAE versus SAE is negatively correlated with occupational prestige, for all language models. We cannot conduct this analysis with GPT4 since the OpenAI API does not give access to the probabilities for all occupations.

Supplementary information

Supplementary information, reporting summary, rights and permissions.

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/ .

About this article

Cite this article.

Hofmann, V., Kalluri, P.R., Jurafsky, D. et al. AI generates covertly racist decisions about people based on their dialect. Nature (2024). https://doi.org/10.1038/s41586-024-07856-5

Received: 08 February 2024

Accepted: 19 July 2024

Published: 28 August 2024

DOI: https://doi.org/10.1038/s41586-024-07856-5

exploratory case study advantages and disadvantages

COMMENTS

  1. Exploratory Research

    Advantages and disadvantages of exploratory research. Like any other research design, exploratory studies have their trade-offs: they provide a unique set of benefits but also come with downsides. Advantages. It can be very helpful in narrowing down a challenging or nebulous problem that has not been previously studied.

  2. Case Study Methodology of Qualitative Research: Key Attributes and

    A case study is one of the most commonly used methodologies of social research. This article attempts to look into the various dimensions of a case study research strategy, the different epistemological strands which determine the particular case study type and approach adopted in the field, discusses the factors which can enhance the effectiveness of a case study research, and the debate ...

  3. (PDF) Case study as a research method

    Case study method enables a researcher to closely examine the data within a specific context. In most cases, a case study method selects a small geographical area or a very limited number of ...

  4. Case Study

    Yin posits three categories of case study—exploratory, descriptive, and explanatory. A pilot study is generally considered an exploratory case study. Descriptive case studies focus on the characteristics of the case. The explanatory case studies are employed for causal studies. ... Case study as a method has both advantages and disadvantages ...

  5. Case Study Research Method in Psychology

    Case studies are in-depth investigations of a person, group, event, or community. Typically, data is gathered from various sources using several methods (e.g., observations & interviews). The case study research method originated in clinical medicine (the case history, i.e., the patient's personal history). In psychology, case studies are ...

  6. (PDF) The case study as a type of qualitative research

    Its aim is to give a detailed description of a case study - its definition, some classifications, and several advantages and disadvantages - in order to provide a better understanding of this ...

  7. The case study approach

    The case study approach allows in-depth, multi-faceted explorations of complex issues in their real-life settings. The value of the case study approach is well recognised in the fields of business, law and policy, but somewhat less so in health services research. Based on our experiences of conducting several health-related case studies, we reflect on the different types of case study design ...

  8. (PDF) Qualitative Case Study Methodology: Study Design and

    For this project, we used an intrinsic exploratory case study approach (Baxter & Jack, 2008; Stake, 2002;Yin, 2002). Case studies are "aimed at description and exploration of complex and entangled ...

  9. Chapter 8: Case study

    Advantages and disadvantages of qualitative case studies. Advantages of using a case study approach include the ability to explore the subtleties and intricacies of complex social situations, and the use of multiple data collection methods and data from multiple sources within the case, which enables rigour through triangulation.

  10. (PDF) The utility of case study as a methodology for work-integrated learning

    Exploratory case studies explore situations in which the case (intervention for example) being evaluated has no clear or single set of outcomes. According to Yin a ... both within and between cases i.e. not just one case as in a holistic case study. Advantages and Disadvantages of Case Study Irrespective of the type of case study, ...

  11. Toward Developing a Framework for Conducting Case Study Research

    The guide for the case study report is often omitted from case study plans because investigators view the reporting phase as being far in the future. Yin (1994) proposed that the report is planned at the start. Case studies do not have a widely accepted reporting format - hence the experience of the investigator is a key factor (Tellis, 1997).

  12. (PDF) Chapter 3: Method (Exploratory Case Study)

    Field The method phenomenon a variety used for an exploratory case (Merriam, of a resources, understanding evaluating Yin (2010) of described a qualitative the perceptions analyzing qualitative evaluations of people regarding Qualitative methods and presenting the findings. to produce as collecting a particular.

  13. (PDF) Case study as a research method

    It also explores on the advantages and disadvantages of case study as a research method. Introduction Case study research, through reports of past studies, allows the exploration and understanding ... is considered an example of an exploratory case study (Yin, 1984; McDonough and McDonough, 1997) and is crucial in determining the protocol that ...

  14. The Advantages and Limitations of Single Case Study Analysis

    However, as Yin notes, case studies can - like all forms of social science research - be exploratory, descriptive, and/or explanatory in nature. It is "a common misconception", he notes, "that the various research methods should be arrayed hierarchically… many social scientists still deeply believe that case studies are only ...

  15. Exploratory Research

    Advantages and disadvantages of exploratory research. Like any other research design, exploratory research has its trade-offs: ... A case study is a detailed study of a specific subject in its real-world context, focusing on a person, group, event, or organisation.

  16. A Quick Guide to Case Study with Examples

    Exploratory case study: Exploratory research is conducted to understand the nature of the problem. It does not focus on finding evidence or a conclusion of the problem. ... Advantages and Disadvantages of Case Study. Advantages Disadvantages; It's useful for rare outcomes. An ample amount of information is obtained with few participants.

  17. Exploratory Research

    It has been noted that "exploratory research is the initial research, which forms the basis of more conclusive research. It can even help in determining the research design, sampling methodology and data collection method" [2]. Exploratory research "tends to tackle new problems on which little or no previous research has been done" [3].

  18. Grounded Theory: A Guide for Exploratory Studies in Management Research

    Gustafsson (2017) defines case studies as "an intensive study about a person, a group of people or a unit, which is aimed to generalize over several units." Gerring (2004) provides a similar definition and further argues that the case study methodology is "not a way of analyzing casual relations" (Gerring, 2004, p. 341).

  19. 10 Case Study Advantages and Disadvantages (2024)

    Advantages. 1. In-depth analysis of complex phenomena. Case study design allows researchers to delve deeply into intricate issues and situations. By focusing on a specific instance or event, researchers can uncover nuanced details and layers of understanding that might be missed with other research methods, especially large-scale survey studies.

  20. The Essential Pros and Cons of Exploratory Research

    Exploratory research offers a great amount of researcher discretion. The lack of structure enables the researcher to direct the progression of the research processes and in that sense, it offers a greater degree of flexibility and freedom. Another pro of exploratory research is the economical way in which the process can be conducted.

  21. The contribution of case study design to supporting research on

    Despite this, the results did provide useful information regarding the use of case study design in Clubhouse research, including the advantages and disadvantages. In turn, this prompts a variety of considerations for researchers who may consider using case study design in Clubhouse settings in future, with these considerations outlined in the ...

  22. (PDF) Case Study Research

    The case study method is a research strategy that aims to gain an in-depth understanding of a specific phenomenon by collecting and analyzing specific data within its true context (Rebolj, 2013 ...

  23. Exploratory Research: Overview, Application, Advantages and

    Research that explores issues at an early stage of development is considered exploratory research. Exploratory research is conducted when the topic or issue is novel, and data collection is challenging. It is adaptable and may handle any research issue. In most cases, this method is used to formulate formal hypotheses.

  24. Day One: Placebo Workshop: Translational Research Domains and ...

    The National Institute of Mental Health (NIMH) hosted a virtual workshop on the placebo effect. The purpose of this workshop was to bring together experts in neurobiology, clinical trials, and regulatory science to examine placebo effects in drug, device, and psychosocial interventions for mental health conditions. Topics included interpretability of placebo signals within the context of ...

  25. AI generates covertly racist decisions about people based on their

    Hundreds of millions of people now interact with language models, with uses ranging from help with writing to informing hiring decisions. However, these language models are known to perpetuate ...