S371 Social Work Research - Jill Chonody: What is Quantitative Research?


Quantitative Research in the Social Sciences

This page is courtesy of University of Southern California: http://libguides.usc.edu/content.php?pid=83009&sid=615867

Quantitative methods emphasize objective measurements and the statistical, mathematical, or numerical analysis of data collected through polls, questionnaires, and surveys, or by manipulating pre-existing statistical data using computational techniques. Quantitative research focuses on gathering numerical data and generalizing it across groups of people or to explain a particular phenomenon.

Babbie, Earl R. The Practice of Social Research. 12th ed. Belmont, CA: Wadsworth Cengage, 2010; Muijs, Daniel. Doing Quantitative Research in Education with SPSS. 2nd ed. London: SAGE Publications, 2010.

Characteristics of Quantitative Research

Your goal in conducting a quantitative research study is to determine the relationship between one thing [an independent variable] and another [a dependent or outcome variable] within a population. Quantitative research designs are either descriptive [subjects usually measured once] or experimental [subjects measured before and after a treatment]. A descriptive study establishes only associations between variables; an experimental study establishes causality.

Quantitative research deals in numbers, logic, and an objective stance. Quantitative research focuses on numeric, unchanging data and detailed, convergent reasoning rather than divergent reasoning [i.e., the generation of a variety of ideas about a research problem in a spontaneous, free-flowing manner].

Its main characteristics are:

  • The data is usually gathered using structured research instruments.
  • The results are based on larger sample sizes that are representative of the population.
  • The research study can usually be replicated or repeated, given its high reliability.
  • Researcher has a clearly defined research question to which objective answers are sought.
  • All aspects of the study are carefully designed before data is collected.
  • Data are in the form of numbers and statistics, often arranged in tables, charts, figures, or other non-textual forms.
  • Project can be used to generalize concepts more widely, predict future results, or investigate causal relationships.
  • Researcher uses tools, such as questionnaires or computer software, to collect numerical data.

The overarching aim of a quantitative research study is to classify features, count them, and construct statistical models in an attempt to explain what is observed.
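The classify-and-count step described above can be pictured with a minimal sketch in Python; the survey responses here are invented for illustration:

```python
from collections import Counter

# Hypothetical responses to a single Likert-type survey item.
responses = ["agree", "agree", "neutral", "disagree", "agree", "neutral"]

# Classify features and count them: the frequency table that many
# statistical summaries and models start from.
counts = Counter(responses)
print(counts.most_common())  # [('agree', 3), ('neutral', 2), ('disagree', 1)]
```

From a table like this, a researcher can compute proportions, compare groups, or feed the counts into a formal statistical model.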

  Things to keep in mind when reporting the results of a study using quantitative methods:

  • Explain the data collected and their statistical treatment as well as all relevant results in relation to the research problem you are investigating. Interpretation of results is not appropriate in this section.
  • Report unanticipated events that occurred during your data collection. Explain how the actual analysis differs from the planned analysis. Explain your handling of missing data and why any missing data does not undermine the validity of your analysis.
  • Explain the techniques you used to "clean" your data set.
  • Choose a minimally sufficient statistical procedure; provide a rationale for its use and a reference for it. Specify any computer programs used.
  • Describe the assumptions for each procedure and the steps you took to ensure that they were not violated.
  • When using inferential statistics, provide the descriptive statistics, confidence intervals, and sample sizes for each variable as well as the value of the test statistic, its direction, the degrees of freedom, and the significance level [report the actual p value].
  • Avoid inferring causality, particularly in nonrandomized designs or without further experimentation.
  • Use tables to provide exact values; use figures to convey global effects. Keep figures small in size; include graphic representations of confidence intervals whenever possible.
  • Always tell the reader what to look for in tables and figures.
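As a hedged sketch of what that reporting checklist produces in practice, the plain-Python example below computes the descriptive statistics, a confidence interval, and Welch's t statistic with its degrees of freedom for two invented samples. The data and group labels are hypothetical; the normal-approximation interval is a simplification (a t critical value is more appropriate for small samples), and the actual p value would come from statistical software or a t table.

```python
import math
import statistics

def describe(sample):
    """Descriptive statistics to report alongside any inferential test."""
    n = len(sample)
    mean = statistics.mean(sample)
    sd = statistics.stdev(sample)
    se = sd / math.sqrt(n)
    # 95% CI via a normal approximation; use a t critical value for small n.
    ci = (mean - 1.96 * se, mean + 1.96 * se)
    return {"n": n, "mean": mean, "sd": sd, "ci95": ci}

def welch_t(a, b):
    """Welch's t statistic and degrees of freedom for two independent samples."""
    na, nb = len(a), len(b)
    va, vb = statistics.variance(a), statistics.variance(b)
    t = (statistics.mean(a) - statistics.mean(b)) / math.sqrt(va / na + vb / nb)
    df = (va / na + vb / nb) ** 2 / (
        (va / na) ** 2 / (na - 1) + (vb / nb) ** 2 / (nb - 1)
    )
    return t, df

# Invented outcome scores for a treatment and a comparison group.
treatment = [12.1, 14.3, 11.8, 13.5, 12.9, 14.0, 13.2, 12.4]
control   = [10.2, 11.1, 10.8, 9.9, 11.4, 10.5, 10.9, 11.0]

print(describe(treatment))
t, df = welch_t(treatment, control)
print(f"t = {t:.2f}, df = {df:.1f}")  # report the exact p value from software
```

A write-up would then state each group's n, mean, SD, and CI, the value and direction of t, the degrees of freedom, and the actual p value, exactly as the checklist above requires.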

NOTE: When using pre-existing statistical data gathered and made available by anyone other than yourself [e.g., a government agency], you still must report on the methods that were used to gather the data, describe any missing data, and, if there is any, provide a clear explanation of why the missing data does not undermine the validity of your final analysis.

Babbie, Earl R. The Practice of Social Research. 12th ed. Belmont, CA: Wadsworth Cengage, 2010; Brians, Craig Leonard et al. Empirical Political Analysis: Quantitative and Qualitative Research Methods. 8th ed. Boston, MA: Longman, 2011; McNabb, David E. Research Methods in Public Administration and Nonprofit Management: Quantitative and Qualitative Approaches. 2nd ed. Armonk, NY: M.E. Sharpe, 2008; Quantitative Research Methods. Writing@CSU. Colorado State University; Singh, Kultar. Quantitative Social Research Methods. Los Angeles, CA: Sage, 2007.

Basic Research Designs for Quantitative Studies

Before designing a quantitative research study, you must decide whether it will be descriptive or experimental because this will dictate how you gather, analyze, and interpret the results. A descriptive study is governed by the following rules: subjects are generally measured once; the intention is to only establish associations between variables; and the study may include a sample population of hundreds or thousands of subjects to ensure that a valid estimate of a generalized relationship between variables has been obtained. An experimental design includes subjects measured before and after a particular treatment, the sample population may be very small and purposefully chosen, and it is intended to establish causality between variables.

Introduction

The introduction to a quantitative study is usually written in the present tense and from the third person point of view. It covers the following information:

  • Identifies the research problem -- as with any academic study, you must state clearly and concisely the research problem being investigated.
  • Reviews the literature -- review scholarship on the topic, synthesizing key themes and, if necessary, noting studies that have used similar methods of inquiry and analysis. Note where key gaps exist and how your study helps to fill these gaps or clarifies existing knowledge.
  • Describes the theoretical framework -- provide an outline of the theory or hypothesis underpinning your study. If necessary, define unfamiliar or complex terms, concepts, or ideas and provide the appropriate background information to place the research problem in proper context [e.g., historical, cultural, economic, etc.].

Methodology

The methods section of a quantitative study should describe how each objective of your study will be achieved. Be sure to provide enough detail to enable the reader to make an informed assessment of the methods being used to obtain results associated with the research problem. The methods section should be presented in the past tense.

  • Study population and sampling -- describe where the data came from and how robust they are; note where gaps exist or what was excluded; and describe the procedures used to select subjects.
  • Data collection -- describe the tools and methods used to collect information and identify the variables being measured; describe the methods used to obtain the data; and note if the data were pre-existing [i.e., government data] or you gathered them yourself. If you gathered them yourself, describe what type of instrument you used and why. Note that no data set is perfect--describe any limitations in methods of gathering data.
  • Data analysis -- describe the procedures for processing and analyzing the data. If appropriate, describe the specific instruments of analysis used to study each research objective, including mathematical techniques and the type of computer software used to manipulate the data.
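One concrete way to make these processing decisions transparent and reportable is to script them, keeping a log of every exclusion so it can be described in the methods section. The records, field names, and exclusion rules below are hypothetical:

```python
# Hypothetical raw survey records; None marks a missing response.
raw = [
    {"id": 1, "age": 34, "score": 7},
    {"id": 2, "age": None, "score": 9},   # missing age, outcome present
    {"id": 3, "age": 29, "score": 7},
    {"id": 4, "age": 151, "score": 5},    # out-of-range data-entry error
    {"id": 5, "age": 42, "score": None},  # missing outcome
]

def clean(records, age_range=(18, 99)):
    """Drop cases with a missing outcome, flag out-of-range ages, and
    log every exclusion so it can be reported in the write-up."""
    kept, log = [], []
    for r in records:
        if r["score"] is None:
            log.append((r["id"], "missing outcome"))
        elif r["age"] is not None and not (age_range[0] <= r["age"] <= age_range[1]):
            log.append((r["id"], "age out of range"))
        else:
            kept.append(r)
    return kept, log

kept, log = clean(raw)
print(len(kept), log)  # cases kept, plus each exclusion and its reason
```

Because the rules are written down rather than applied by hand, the analysis can be replicated and the handling of missing data reported exactly.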

Results

The findings of your study should be written objectively and in a succinct and precise format. In quantitative studies, it is common to use graphs, tables, charts, and other non-textual elements to help the reader understand the data. Make sure that non-textual elements do not stand in isolation from the text but are used to supplement the overall description of the results and to help clarify key points being made.

  • Statistical analysis -- how did you analyze the data? What were the key findings from the data? The findings should be presented in a logical, sequential order. Describe but do not interpret these trends or negative results; save that for the discussion section. The results should be presented in the past tense.

Discussion

Discussions should be analytic, logical, and comprehensive. The discussion should meld together your findings and those identified in the literature review, and place them within the context of the theoretical framework underpinning the study. The discussion should be presented in the present tense.

  • Interpretation of results -- reiterate the research problem being investigated and compare and contrast the findings with the research questions underlying the study. Did the data affirm predicted outcomes or refute them?
  • Description of trends, comparison of groups, or relationships among variables -- describe any trends that emerged from your analysis and explain all unanticipated and statistically insignificant findings.
  • Discussion of implications – what is the meaning of your results? Highlight key findings based on the overall results and note findings that you believe are important. How have the results helped fill gaps in understanding the research problem?
  • Limitations -- describe any limitations or unavoidable bias in your study and, if necessary, note why these limitations did not inhibit effective interpretation of the results.

Conclusion

End your study by summarizing the topic and providing a final comment and assessment of the study.

  • Summary of findings – synthesize the answers to your research questions. Do not report any statistical data here; just provide a narrative summary of the key findings and describe what was learned that you did not know before conducting the study.
  • Recommendations – if appropriate to the aim of the assignment, tie key findings with policy recommendations or actions to be taken in practice.
  • Future research – note the need for future research linked to your study’s limitations or to any remaining gaps in the literature that were not addressed in your study.

Black, Thomas R. Doing Quantitative Research in the Social Sciences: An Integrated Approach to Research Design, Measurement and Statistics. London: Sage, 1999; Gay, L. R. and Peter Airasian. Educational Research: Competencies for Analysis and Applications. 7th ed. Upper Saddle River, NJ: Merrill Prentice Hall, 2003; Hector, Anestine. An Overview of Quantitative Research in Composition and TESOL. Department of English, Indiana University of Pennsylvania; Hopkins, Will G. "Quantitative Research Design." Sportscience 4, 1 (2000); A Strategy for Writing Up Research Results. The Structure, Format, Content, and Style of a Journal-Style Scientific Paper. Department of Biology. Bates College; Nenty, H. Johnson. "Writing a Quantitative Research Thesis." International Journal of Educational Science 1 (2009): 19-32; Ouyang, Ronghua (John). Basic Inquiry of Quantitative Research. Kennesaw State University.

  • Last Updated: Jul 11, 2023 1:03 PM
  • URL: https://libguides.iun.edu/S371socialworkresearch

Social Work Research Methods That Drive the Practice

A social worker surveys a community member.

Social workers advocate for the well-being of individuals, families and communities. But how do social workers know what interventions are needed to help an individual? How do they assess whether a treatment plan is working? What do social workers use to write evidence-based policy?

Social work involves research-informed practice and practice-informed research. At every level, social workers need to know objective facts about the populations they serve, the efficacy of their interventions and the likelihood that their policies will improve lives. A variety of social work research methods make that possible.

Data-Driven Work

Data is a collection of facts used for reference and analysis. In a field as broad as social work, data comes in many forms.

Quantitative vs. Qualitative

As with any research, social work research involves both quantitative and qualitative studies.

Quantitative Research

Answers to questions like these can help social workers know about the populations they serve — or hope to serve in the future.

  • How many students currently receive reduced-price school lunches in the local school district?
  • How many hours per week does a specific individual consume digital media?
  • How frequently did community members access a specific medical service last year?

Quantitative data — facts that can be measured and expressed numerically — are crucial for social work.

Quantitative research has advantages for social scientists. Such research can be more generalizable to large populations, as it uses specific sampling methods and lends itself to large datasets. It can provide important descriptive statistics about a specific population. Furthermore, by operationalizing variables, it can help social workers easily compare similar datasets with one another.

Qualitative Research

Qualitative data — facts that cannot be measured or expressed in terms of mere numbers or counts — offer rich insights into individuals, groups and societies. They can be collected via interviews and observations.

  • What attitudes do students have toward the reduced-price school lunch program?
  • What strategies do individuals use to moderate their weekly digital media consumption?
  • What factors made community members more or less likely to access a specific medical service last year?

Qualitative research can thereby provide a textured view of social contexts and systems that may not have been possible with quantitative methods. Plus, it may even suggest new lines of inquiry for social work research.

Mixed Methods Research

Combining quantitative and qualitative methods into a single study is known as mixed methods research. This form of research has gained popularity in the study of social sciences, according to a 2019 report in the academic journal Theory and Society. Since quantitative and qualitative methods answer different questions, merging them into a single study can balance the limitations of each and potentially produce more in-depth findings.

However, mixed methods research is not without its drawbacks. Combining research methods increases the complexity of a study and generally requires a higher level of expertise to collect, analyze and interpret the data. It also requires a greater level of effort, time and often money.

The Importance of Research Design

Data-driven practice plays an essential role in social work. Unlike philanthropists and altruistic volunteers, social workers are obligated to operate from a scientific knowledge base.

To know whether their programs are effective, social workers must conduct research to determine results, aggregate those results into comprehensible data, analyze and interpret their findings, and use evidence to justify next steps.

Employing the proper design ensures that any evidence obtained during research enables social workers to reliably answer their research questions.

Research Methods in Social Work

The various social work research methods have specific benefits and limitations determined by context. Common research methods include surveys, program evaluations, needs assessments, randomized controlled trials, descriptive studies and single-system designs.

Surveys

Surveys involve a hypothesis and a series of questions in order to test that hypothesis. Social work researchers will send out a survey, receive responses, aggregate the results, analyze the data, and form conclusions based on trends.

Surveys are one of the most common research methods social workers use — and for good reason. They tend to be relatively simple and are usually affordable. However, surveys generally require large participant groups, and self-reports from survey respondents are not always reliable.

Program Evaluations

Social workers ally with all sorts of programs: after-school programs, government initiatives, nonprofit projects and private programs, for example.

Crucially, social workers must evaluate a program’s effectiveness in order to determine whether the program is meeting its goals and what improvements can be made to better serve the program’s target population.

Evidence-based programming helps everyone save money and time, and comparing programs with one another can help social workers make decisions about how to structure new initiatives. Evaluating programs becomes complicated, however, when programs have multiple goal metrics, some of which may be vague or difficult to assess (e.g., “we aim to promote the well-being of our community”).

Needs Assessments

Social workers use needs assessments to identify services and necessities that a population lacks access to.

Common social work populations that researchers may perform needs assessments on include:

  • People in a specific income group
  • Everyone in a specific geographic region
  • A specific ethnic group
  • People in a specific age group

In the field, a social worker may use a combination of methods (e.g., surveys and descriptive studies) to learn more about a specific population or program. Social workers look for gaps between the actual context and a population’s or individual’s “wants” or desires.

For example, a social worker could conduct a needs assessment with an individual with cancer trying to navigate the complex medical-industrial system. The social worker may ask the client questions about the number of hours they spend scheduling doctor’s appointments, commuting and managing their many medications. After learning more about the specific client needs, the social worker can identify opportunities for improvements in an updated care plan.

In policy and program development, social workers conduct needs assessments to determine where and how to effect change on a much larger scale. Integral to social work at all levels, needs assessments reveal crucial information about a population’s needs to researchers, policymakers and other stakeholders. Needs assessments may fall short, however, in revealing the root causes of those needs (e.g., structural racism).

Randomized Controlled Trials

Randomized controlled trials are studies in which a randomly selected group is subjected to a variable (e.g., a specific stimulus or treatment) and a control group is not. Social workers then measure and compare the results of the randomized group with the control group in order to glean insights about the effectiveness of a particular intervention or treatment.

Randomized controlled trials are easily reproducible and highly measurable. They’re useful when results are easily quantifiable. However, this method is less helpful when results are not easily quantifiable (i.e., when rich data such as narratives and on-the-ground observations are needed).
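The treatment-versus-control comparison at the heart of a randomized controlled trial can be sketched in a few lines of Python. The outcome scores below are invented, and Cohen's d with a pooled standard deviation is used here as one common way to express the size of an intervention effect:

```python
import math
import statistics

# Hypothetical post-intervention outcome scores (higher = better).
treatment = [6, 7, 8, 7, 9, 8, 7, 8]   # randomized intervention group
control   = [5, 6, 5, 7, 6, 5, 6, 6]   # control group, no intervention

# Raw difference between group means.
effect = statistics.mean(treatment) - statistics.mean(control)

# Cohen's d: the mean difference standardized by the pooled SD,
# so effects can be compared across studies and outcome scales.
na, nb = len(treatment), len(control)
pooled_var = ((na - 1) * statistics.variance(treatment) +
              (nb - 1) * statistics.variance(control)) / (na + nb - 2)
d = effect / math.sqrt(pooled_var)

print(f"mean difference = {effect:.2f}, Cohen's d = {d:.2f}")
```

This quantifiability is exactly the strength noted above; when the outcome of interest is narrative or observational, no comparable single number exists, which is where qualitative methods come in.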

Descriptive Studies

Descriptive studies immerse the researcher in another context or culture to study specific participant practices or ways of living. Descriptive studies, including descriptive ethnographic studies, may overlap with and include other research methods:

  • Informant interviews
  • Census data
  • Observation

By using descriptive studies, researchers may glean a richer, deeper understanding of a nuanced culture or group on-site. The main limitations of this research method are that it tends to be time-consuming and expensive.

Single-System Designs

Unlike most medical studies, which involve testing a drug or treatment on two groups — an experimental group that receives the drug/treatment and a control group that does not — single-system designs allow researchers to study just one group (e.g., an individual or family).

Single-system designs typically entail studying a single group over a long period of time and may involve assessing the group’s response to multiple variables.

For example, consider a study on how media consumption affects a person’s mood. One way to test a hypothesis that consuming media correlates with low mood would be to observe two groups: a control group (no media) and an experimental group (two hours of media per day). When employing a single-system design, however, researchers would observe a single participant as they watch two hours of media per day for one week and then four hours per day of media the next week.

These designs allow researchers to test multiple variables over a longer period of time. However, similar to descriptive studies, single-system designs can be fairly time-consuming and costly.

Learn More About Social Work Research Methods

Social workers have the opportunity to improve the social environment by advocating for the vulnerable — including children, older adults and people with disabilities — and facilitating and developing resources and programs.

Learn more about how you can earn your  Master of Social Work online at Virginia Commonwealth University . The highest-ranking school of social work in Virginia, VCU has a wide range of courses online. That means students can earn their degrees with the flexibility of learning at home. Learn more about how you can take your career in social work further with VCU.


Gov.uk, Mixed Methods Study

MVS Open Press, Foundations of Social Work Research

Open Social Work Education, Scientific Inquiry in Social Work

Open Social Work, Graduate Research Methods in Social Work: A Project-Based Approach

Routledge, Research for Social Workers: An Introduction to Methods

SAGE Publications, Research Methods for Social Work: A Problem-Based Approach

Theory and Society, Mixed Methods Research: What It Is and What It Could Be


Nature and Extent of Quantitative Research in Social Work Journals: A Systematic Review from 2016 to 2020


Sebastian Kurten, Nausikaä Brimmel, Kathrin Klein, Katharina Hutter, Nature and Extent of Quantitative Research in Social Work Journals: A Systematic Review from 2016 to 2020, The British Journal of Social Work , Volume 52, Issue 4, June 2022, Pages 2008–2023, https://doi.org/10.1093/bjsw/bcab171


This study reviews 1,406 research articles published between 2016 and 2020 in the European Journal of Social Work (EJSW), the British Journal of Social Work (BJSW) and Research on Social Work Practice (RSWP). It assesses the proportion and complexity of quantitative research designs amongst published articles and investigates differences between the journals. Furthermore, the review investigates the complexity of the statistical methods employed and identifies the most frequently addressed topics. Of the 1,406 articles, 504 (35.8 percent) used a qualitative methodology, 389 (27.7 percent) used a quantitative methodology, 85 (6 percent) used mixed methods, 253 (18 percent) were theoretical in nature, 148 (10.5 percent) conducted reviews and 27 (1.9 percent) gave project overviews. The proportion of quantitative research articles was higher in RSWP (55.4 percent) than in the EJSW (14.1 percent) and the BJSW (20.5 percent). The topic analysis identified at least forty different topics addressed by the articles. Although the proportion of quantitative research is rather small in social work research, the review could not find evidence that it is of low sophistication. Finally, this study concludes that future research would benefit from making explicit why a certain methodology was chosen.




11. Quantitative measurement

Chapter outline

  • Overview of measurement (11 minute read)
  • Operationalization and levels of measurement (20 minute read)
  • Scales and indices (15 minute read)
  • Reliability and validity (20 minute read)
  • Ethical and social justice considerations for measurement (6 minute read)

Content warning: Discussions of immigration issues, parents and gender identity, anxiety, and substance use.

11.1 Overview of measurement

Learning Objectives

Learners will be able to…

  • Provide an overview of the measurement process in social work research
  • Describe why accurate measurement is important for research

This chapter begins with an interesting question: Is my apple the same as your apple? Let’s pretend you want to study apples. Perhaps you have read that chemicals in apples may impact neurotransmitters and you want to test if apple consumption improves mood among college students. So, in order to conduct this study, you need to make sure that you provide apples to a treatment group, right? In order to increase the rigor of your study, you may also want to have a group of students, ones who do not get to eat apples, to serve as a comparison group. Don’t worry if this seems new to you. We will discuss this type of design in Chapter 13 . For now, just concentrate on apples.

In order to test your hypothesis about apples, you need to define exactly what is meant by the term "apple" so you ensure everyone is consuming the same thing. You also need to know what counts as a "dose" of this thing we call "apple," and you need a way to ensure that everyone in your treatment group receives the same kind and amount of apples. So, let's start by making sure we understand what the term "apple" means. Say you have an object that you identify as an apple and I have an object that I identify as an apple. Perhaps my "apple" is a chocolate apple, one that looks similar to an apple but is made of chocolate and red dye, and yours is a Honeycrisp. Perhaps yours is papier-mâché and mine is a MacBook Pro. All of these are defined as apples, right?


You can see the multitude of ways we could conceptualize “apple,” and how that could create a problem for our research. If I get a Red Delicious (ick) apple and you get a Granny Smith (yum) apple and we observe a change in neurotransmitters, it’s going to be even harder than usual to say the apple influenced the neurotransmitters because we didn’t define “apple” well enough. Measurement in this case is essential to treatment fidelity, which means ensuring that everyone receives the same treatment, or as close to the same treatment as possible. In other words, you need to make sure everyone is consuming the same kind of apples, and you need a way to ensure that you give the same amount of apples to everyone in your treatment group.

In social science, when we use the term  measurement , we mean the process by which we describe and ascribe meaning to the key facts, concepts, or other phenomena that we are investigating. At its core, measurement is about defining one’s terms in as clear and precise a way as possible. Of course, measurement in social science isn’t quite as simple as using a measuring cup or spoon, but there are some basic tenets on which most social scientists agree when it comes to measurement. We’ll explore those, as well as some of the ways that measurement might vary depending on your unique approach to the study of your topic.

An important point here is that measurement does not require any particular instruments or procedures. What it does require is some systematic procedure for assigning scores, meanings, and descriptions to individuals or objects so that those scores represent the characteristic of interest. You can measure phenomena in many different ways, but you must be sure that how you choose to measure gives you information and data that lets you answer your research question. If you’re looking for information about a person’s income, but your main points of measurement have to do with the money they have in the bank, you’re not really going to find the information you’re looking for!

What do social scientists measure?

The question of what social scientists measure can be answered by asking yourself what social scientists study. Think about the topics you’ve learned about in other social work classes you’ve taken or the topics you’ve considered investigating yourself. Let’s consider Melissa Milkie and Catharine Warner’s study (2011) [1] of first graders’ mental health. In order to conduct that study, Milkie and Warner needed to have some idea about how they were going to measure mental health. What does mental health mean, exactly? And how do we know when we’re observing someone whose mental health is good and when we see someone whose mental health is compromised? Understanding how measurement works in research methods helps us answer these sorts of questions.

As you might have guessed, social scientists will measure just about anything that they have an interest in investigating. For example, those who are interested in learning something about the correlation between social class and levels of happiness must develop some way to measure both social class and happiness. Those who wish to understand how well immigrants cope in their new locations must measure immigrant status and coping. Those who wish to understand how a person’s gender shapes their workplace experiences must measure gender and workplace experiences. You get the idea. Social scientists can and do measure just about anything you can imagine observing or wanting to study. Of course, some things are easier to observe or measure than others.

In 1964, philosopher Abraham Kaplan (1964) [2] wrote The Conduct of Inquiry, which has since become a classic work in research methodology (Babbie, 2010). [3] In his text, Kaplan describes different categories of things that behavioral scientists observe. One of those categories, which Kaplan called “observational terms,” is probably the simplest to measure in social science. Observational terms are the sorts of things that we can see with the naked eye simply by looking at them. Kaplan roughly defines them as conditions that are easy to identify and verify through direct observation. If, for example, we wanted to know how the conditions of playgrounds differ across different neighborhoods, we could directly observe the variety, amount, and condition of equipment at various playgrounds.

Indirect observables, on the other hand, are less straightforward to assess. In Kaplan’s framework, they are conditions that are subtle and complex, which we must use existing knowledge and intuition to define. If we conducted a study for which we wished to know a person’s income, we’d probably have to ask them their income, perhaps in an interview or a survey. Thus, we have observed income, even if it has only been observed indirectly. Birthplace might be another indirect observable. We can ask study participants where they were born, but chances are good we won’t have directly observed any of those people being born in the locations they report.

How do social scientists measure?

Measurement in social science is a process. It occurs at multiple stages of a research project: in the planning stages, in the data collection stage, and sometimes even in the analysis stage. Recall that previously we defined measurement as the process by which we describe and ascribe meaning to the key facts, concepts, or other phenomena that we are investigating. Once we’ve identified a research question, we begin to think about what some of the key ideas are that we hope to learn from our project. In describing those key ideas, we begin the measurement process.

Let’s say that our research question is the following: How do new college students cope with the adjustment to college? In order to answer this question, we’ll need some idea about what coping means. We may come up with an idea about what coping means early in the research process, as we begin to think about what to look for (or observe) in our data-collection phase. Once we’ve collected data on coping, we also have to decide how to report on the topic. Perhaps, for example, there are different types or dimensions of coping, some of which lead to more successful adjustment than others. However we decide to proceed, and whatever we decide to report, the point is that measurement is important at each of these phases.

As the preceding example demonstrates, measurement is a process in part because it occurs at multiple stages of conducting research. We could also think of measurement as a process because it involves multiple stages. From identifying your key terms to defining them to figuring out how to observe them and how to know if your observations are any good, there are multiple steps involved in the measurement process. An additional step in the measurement process involves deciding what elements your measures contain. A measure’s elements might be very straightforward and clear, particularly if they are directly observable. Other measures are more complex and might require the researcher to account for different themes or types. These sorts of complexities require paying careful attention to a concept’s level of measurement and its dimensions. We’ll explore these complexities in greater depth at the end of this chapter, but first let’s look more closely at the early steps involved in the measurement process, starting with conceptualization.

The idea of coming up with your own measurement tool might sound pretty intimidating at this point. The good news is that if you find something in the literature that works for you, you can use it with proper attribution. If there are only pieces of it that you like, you can just use those pieces, again with proper attribution. You don’t always have to start from scratch!

Key Takeaways

  • Measurement (i.e. the measurement process) gives us the language to define/describe what we are studying.
  • In research, when we develop measurement tools, we move beyond concepts that may be subjective and abstract to a definition that is clear and concise.
  • Good social work researchers are intentional with the measurement process.
  • Engaging in the measurement process requires us to think critically about what we want to study. This process may be challenging and potentially time-consuming.
  • How easy or difficult do you believe it will be to study these topics?
  • Think about the chapter on literature reviews. Is there a significant body of literature on the topics you are interested in studying?
  • Are there existing measurement tools that may be appropriate to use for the topics you are interested in studying?

11.2 Operationalization and levels of measurement

Learning Objectives

Learners will be able to…

  • Define constructs and operationalization and describe their relationship
  • Be able to start operationalizing variables in your research project
  • Identify the level of measurement for each type of variable
  • Demonstrate knowledge of how each type of variable can be used

Now we have some ideas about what and how social scientists need to measure, so let’s get into the details. In this section, we are going to talk about how to make your variables measurable (operationalization) and how you ultimately characterize your variables in order to analyze them (levels of measurement).

Operationalizing your variables

“Operationalizing” is not a word I’d ever heard before I became a researcher, and actually, my browser’s spell check doesn’t even recognize it. I promise it’s a real thing, though. In the most basic sense, when we operationalize a variable, we break it down into measurable parts. Operationalization is the process of determining how to measure a construct that cannot be directly observed, and a construct is a condition that is not directly observable and represents a state of being, an experience, or an idea. But why construct? We call them constructs because they are built from different ideas and parameters.

As we know from Section 11.1, sometimes the measures that we are interested in are more complex and more abstract than observational terms or indirect observables. Think about some of the things you’ve learned about in other social work classes—for example, ethnocentrism. What is ethnocentrism? Well, from completing an introduction to social work class you might know that it’s a construct that has something to do with the way a person judges another’s culture. But how would you measure it? Here’s another construct: bureaucracy. We know this term has something to do with organizations and how they operate, but measuring such a construct is trickier than measuring, say, a person’s income. In both cases, ethnocentrism and bureaucracy, these theoretical notions represent ideas whose meaning we have come to agree on. Though we may not be able to observe these abstractions directly, we can observe the things that they are made up of.


Now, let’s operationalize bureaucracy and ethnocentrism. The construct of bureaucracy could be measured by counting the number of supervisors that need to approve routine spending by public administrators. The greater the number of administrators that must sign off on routine matters, the greater the degree of bureaucracy. Similarly, we might be able to ask a person the degree to which they trust people from different cultures around the world and then assess the ethnocentrism inherent in their answers. We can measure constructs like bureaucracy and ethnocentrism by defining them in terms of what we can observe.

How we operationalize our constructs (and ultimately measure our variables) can affect the conclusions we can draw from our research. Let’s say you’re reviewing a state program to make it more efficient in connecting people to public services. What might be different if we decide to measure bureaucracy by the number of forms someone has to fill out to get a public service instead of the number of people who have to review the forms, like we talked about above? Maybe you find that there is an unnecessary amount of paperwork based on comparisons to other state programs, so you recommend that some of it be eliminated. This is probably a good thing, but will it actually make the program more efficient like eliminating some of the reviews that paperwork has to go through would? I’m not really making a judgment on which way is better to measure bureaucracy, but I encourage you to think about the costs and benefits of each way we operationalized the construct of bureaucracy, and extend this to the way you operationalize your own concepts in your research project.

Levels of Measurement

Now, we’re going to move into some more concrete characterizations of variables. You now hopefully understand how to operationalize your concepts so that you can turn them into variables. Imagine a process kind of like what you see in Figure 11.1 below.

[Figure 11.1: constructs are operationalized into measurable variables, with arrows pointing from the constructs toward the research question]

Notice that the arrows from the construct point toward the research question, because ultimately, measuring them will help answer your question!

The level of measurement of a variable tells us how the values of the variable relate to each other and what mathematical operations we can perform with the variable. (That second part will become important once we move into quantitative analysis in Chapter 14 and Chapter 15.) Many students find this definition a bit confusing. What does it mean when we say that the level of measurement tells us about mathematical operations? So before we move on, let’s clarify this a bit.

Let’s say you work for a community nonprofit that wants to develop programs relevant to community members’ ages (i.e., tutoring for kids in school, job search and resume help for adults, and home visiting for elderly community members). However, you do not have a good understanding of the ages of the people who visit your community center. Below is part of a questionnaire that you developed to gather that information.

  • How old are you?
      – Under 18 years old
      – 18-30 years old
      – 31-50 years old
      – 51-60 years old
      – Over 60 years old
  • How old are you? _____ years

Look at the two items on this questionnaire. They both ask about age, but the first item asks the participant to identify an age range, while the second item asks for their actual age in years. These two questions give us data that represent the same information measured at different levels.

It would help your agency if you knew the average age of clients, right? So, which item on the questionnaire will provide this information? Item one’s choices are grouped into categories. Can you compute an average age from these choices? No. Conversely, participants completing item two are asked to provide an actual number, one that you could use to determine an average age. In summary, the two items both ask the participants to report their age. However, the type of data collected from both items is different and must be analyzed differently. 

We can think about the four levels of measurement as going from less to more specific or, as it’s more commonly called, lower to higher: nominal, ordinal, interval, and ratio. Each of these levels differs from the others and helps the researcher understand something about their data. Think about levels of measurement as a hierarchy.

In order to determine the level of measurement, please examine your data and then ask these four questions (in order).

  1. Do I have mutually exclusive categories? If the answer is yes, continue to question #2.
  2. Do my item choices have a hierarchy or order? In other words, can you put your item choices in order? If no, stop–you have nominal level data. If the answer is yes, continue to question #3.
  3. Can I add, subtract, divide, and multiply my answer choices? If no, stop–you have ordinal level data. If the answer is yes, continue to question #4.
  4. Is it possible that the answer to this item can be zero? If the answer is no—you have interval level data. If the answer is yes, you are at the ratio level of measurement.
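
The four questions above form a decision tree, and one way to see how they work together is to sketch them as a small function (a convenience for illustration, not a standard tool):

```python
def level_of_measurement(mutually_exclusive, ordered, arithmetic_ok, has_true_zero):
    """Walk the four questions from this section, in order, and
    return the level of measurement they imply."""
    if not mutually_exclusive:
        return "not measurable as-is: categories overlap"
    if not ordered:
        return "nominal"
    if not arithmetic_ok:
        return "ordinal"
    if not has_true_zero:
        return "interval"
    return "ratio"

# Examples drawn from this chapter:
print(level_of_measurement(True, False, False, False))  # car ownership -> nominal
print(level_of_measurement(True, True, False, False))   # age ranges -> ordinal
print(level_of_measurement(True, True, True, False))    # age in years -> interval
print(level_of_measurement(True, True, True, True))     # number of A's -> ratio
```

Notice that each question only matters once all the earlier ones have been answered yes, which is exactly why the text says to ask them in order.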

Nominal level. The nominal level of measurement is the lowest level of measurement. It contains categories that are mutually exclusive, which means that anyone who falls into one category cannot fall into another. The data can be represented with words (like yes/no) or numbers that correspond to words or a category (like 1 equaling yes and 0 equaling no). Even when the categories are represented as numbers in our data, the number itself does not have an actual numerical value. It is merely a number we have assigned so that we can use the variable in mathematical operations (which we will start talking about in Chapter 14.1). We say this level of measurement is lowest or least specific because someone who falls into a category we’ve designated could differ from someone else in the same category. Let’s say on our questionnaire above, we also asked folks whether they own a car. They can answer yes or no, and they fall into mutually exclusive categories. In this case, we would know whether they own a car, but not whether owning a car really affects their life significantly. Maybe they have chosen not to own one and are happy to take the bus, bike, or walk. Maybe they do not own one but would like to. We cannot get this information from a nominal variable, which is okay when we have meaningful categories. Nominal variables are especially useful when we just need the frequency of a particular characteristic in our sample.

The nominal level of measurement usually includes many demographic characteristics like race, gender, or marital status.

Ordinal level. The ordinal level of measurement is the next level of measurement and contains slightly more specific information than the nominal level. This level has mutually exclusive categories and a hierarchy or order. Let’s go back to the first item on the questionnaire we talked about above.

Do we have mutually exclusive categories? Yes. Someone who selects one age range cannot also select another. So, we know that we have at least nominal level data. However, the next question that we need to ask is “Do my answer choices have order?” or “Can I put my answer choices in order?” The answer is yes: someone who selects the first range is younger than someone who selects the second or third. So, you have at least ordinal level data.

From a data analysis and statistical perspective, ordinal variables get treated exactly like nominal variables because they are both categorical variables, or variables whose values are organized into mutually exclusive groups but whose numerical values cannot be used in mathematical operations. You’ll see this term used again when we get into bivariate analysis in Chapter 15.

Interval level. The interval level of measurement is a higher level of measurement. It contains all of the characteristics of the previous levels (mutually exclusive categories and order). What distinguishes it from the ordinal level is that interval data can be used to conduct mathematical computations (like an average, for instance).

Let’s think back to our questionnaire about age again and take a look at the second question where we asked for a person’s exact age in years. Age in years is mutually exclusive – someone can’t be 14 and 15 at the same time – and the order of ages is meaningful, since being 18 means something different than being 32. Now, we can also take the answers to this question and do math with them, like addition, subtraction, multiplication, and division.

Ratio level. Ratio level data is the highest level of measurement. It has mutually exclusive categories and order, and you can perform mathematical operations on it. The main difference between the interval and ratio levels is that the ratio level has an absolute zero, meaning that a value of zero is both possible and meaningful. You might be thinking, “Well, age has an absolute zero,” but someone who is not yet born does not have an age, and the minute they’re born, they are not zero years old anymore.

Data at the ratio level of measurement are usually amounts or counts of things, and because zero represents a true absence of the thing being counted, ratio-level values cannot meaningfully be negative. For example, you could ask someone to report how many A’s they have on their transcript or how many semesters they have earned a 4.0. They could have zero A’s, and that would be a valid answer.

From a data analysis and statistical perspective, interval and ratio variables are treated exactly the same because they are both continuous variables , or variables whose values are mutually exclusive and can be used in mathematical operations. Technically, a continuous variable could have an infinite number of values.

What does the level of measurement tell us?

We have spent time learning how to determine our data’s level of measurement. Now what? How could we use this information to help us as we measure concepts and develop measurement tools? First, the types of statistical tests that we are able to use are dependent on our data’s level of measurement. (We will discuss this soon in Chapter 15.) The higher the level of measurement, the more complex statistical tests we are able to conduct. This knowledge may help us decide what kind of data we need to gather, and how. That said, we have to balance this knowledge with the understanding that sometimes, collecting data at a higher level of measurement could negatively impact our studies. For instance, sometimes providing answers in ranges may make prospective participants feel more comfortable responding to sensitive items. Imagine that you were interested in collecting information on topics such as income, number of sexual partners, number of times used illicit drugs, etc. You would have to think about the sensitivity of these items and determine if it would make more sense to collect some data at a lower level of measurement.

Finally, sometimes when analyzing data, researchers find a need to change a variable’s level of measurement. For example, a few years ago, a student of mine was interested in studying the relationship between mental health and life satisfaction. This student collected a variety of data. One item asked about the number of mental health diagnoses, reported as the actual number. When analyzing the data, the student examined the mental health diagnosis variable and noticed that she had two groups: those with zero or one diagnosis and those with many diagnoses. Instead of using the ratio level data (the actual number of mental health diagnoses), she collapsed her cases into two categories, few and many, and used this variable in her analyses. It is important to note that you can move data from a higher level of measurement to a lower level; however, you cannot move data from a lower level to a higher level.
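
The student’s recoding step, collapsing a ratio-level count into two categories, can be sketched like this (the cutoff of two diagnoses is a hypothetical choice for illustration; the text does not report her exact threshold):

```python
def collapse_diagnoses(counts, cutoff=2):
    """Recode a ratio-level count (number of diagnoses) down to a
    categorical variable with two values, 'few' and 'many'.
    The cutoff of 2 is a hypothetical threshold for illustration."""
    return ["few" if n < cutoff else "many" for n in counts]

raw = [0, 1, 1, 4, 6, 2]            # ratio-level data
print(collapse_diagnoses(raw))      # ['few', 'few', 'few', 'many', 'many', 'many']
```

The reverse move is impossible: once only “few”/“many” is stored, the raw counts cannot be recovered, which is exactly why higher-level data can be collapsed downward but lower-level data cannot be promoted upward.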

Key Takeaways

  • Operationalization involves figuring out how to measure a construct you cannot directly observe.
  • Nominal variables have mutually exclusive categories with no natural order. They cannot be used for mathematical operations like addition or subtraction. Race and gender are examples.
  • Ordinal variables have mutually exclusive categories and a natural order. They also cannot be used for mathematical operations like addition or subtraction. Age measured in categories (e.g., 18-25 years old) would be an example.
  • Interval variables have mutually exclusive categories, a natural order, and can be used for mathematical operations. Age as a raw number would be an example.
  • Ratio variables have mutually exclusive categories, a natural order, can be used for mathematical operations, and have an absolute zero value. The number of times someone calls a legislator to advocate for a policy would be an example.
  • Nominal and ordinal variables are categorical variables, meaning they have mutually exclusive categories and cannot be used for mathematical operations, even when assigned a number.
  • Interval and ratio variables are continuous variables, meaning their values are mutually exclusive and can be used in mathematical operations.
  • Researchers should consider the costs and benefits of how they operationalize their variables, including what level of measurement they choose, since the level of measurement can affect how you must gather your data.
  • What are the primary constructs being explored in the research?
  • Could you (or the study authors) have chosen another way to operationalize this construct?
  • What are these variables’ levels of measurement?
  • Are they categorical or continuous?

11.3 Scales and indices

Learning Objectives

Learners will be able to…

  • Identify different types of scales and compare them to each other
  • Understand how to begin the process of constructing scales or indices

Quantitative data analysis often requires the construction of two types of composite measures of variables: indices and scales. These measures are frequently used and are important because social scientists often study concepts that, unlike age or gender, possess no single clear and unambiguous indicator. Much of this work centers on the attitudes and orientations of a group of people, which require several items to measure adequately. In addition, researchers often seek to order cases into categories ranging from very low to very high (or vice versa), which a single data item usually cannot ensure but an index or scale can.

Although they exhibit differences (which will be discussed later), the two share several characteristics.

  • Both are ordinal measures of variables.
  • Both can order the units of analysis in terms of specific variables.
  • Both are composite measures of variables (measurements based on more than one data item).

In general, indices are a sum of a series of individual yes/no questions that are then combined into a single numeric score. They are usually a measure of the quantity of some social phenomenon and are constructed at a ratio level of measurement. More sophisticated indices weight individual items according to their importance in the concept being measured (e.g., a multiple-choice test where different questions are worth different numbers of points). Some interval-level indices are not weighted counts but instead contain other indices or scales within them (e.g., college admissions scores that combine GPA, SAT scores, and essays, assigning a different number of points from each source).
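
A weighted index like the admissions example can be sketched as a weighted sum; the component names and weights below are hypothetical illustrations, not any real admissions formula:

```python
def weighted_index(scores, weights):
    """Combine several component scores into a single index score
    using a weighted sum, one weight per component."""
    if set(scores) != set(weights):
        raise ValueError("every component needs exactly one weight")
    return sum(scores[name] * weights[name] for name in scores)

# A hypothetical admissions-style index:
applicant = {"gpa": 3.6, "sat": 1380, "essay": 8}
weights = {"gpa": 10.0, "sat": 0.05, "essay": 2.0}
print(weighted_index(applicant, weights))  # 36.0 + 69.0 + 16.0 = 121.0
```

The weights encode each item’s importance in the concept being measured, which is the “more sophisticated” weighting idea described above.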

This section discusses two formats used for measurement in research: scales and indices (sometimes called indexes). These two formats are helpful in research because they use multiple indicators to develop a composite (or total) score. Composite scores provide a much greater understanding of concepts than a single item could. Although we won’t delve too deeply into the process of scale development, we will cover some important topics for you to understand how scales and indices can be used.

Types of scales

As a student, you are very familiar with end-of-semester course evaluations. These evaluations usually include statements such as “My instructor created an environment of respect” and ask students to use a scale to indicate how much they agree or disagree with each statement. These scales, if developed and administered appropriately, provide a wealth of information that instructors may use to refine and update courses. If you examine your end-of-semester evaluations, you will notice that they are organized, use language specific to your course, and have very intentional methods of implementation. In essence, these tools are developed to encourage completion.

As you read about these scales, think about the information that you want to gather from participants. What type or types of scales would be the best for you to use and why? Are there existing scales or do you have to create your own?

The Likert scale

Most people have seen some version of a Likert scale. Designed by Rensis Likert (Likert, 1932) [4] , a Likert scale is a very popular rating scale for measuring ordinal data in social work research. This scale includes Likert items that are simply-worded statements to which participants can indicate their extent of agreement or disagreement on a five- or seven-point scale ranging from “strongly disagree” to “strongly agree.” You will also see Likert scales used for importance, quality, frequency, and likelihood, among lots of other concepts. Below is an example of how we might use a Likert scale to assess your attitudes about research as you work your way through this textbook.

Table 11.1 Likert scale
(For each statement, respondents choose one of: Strongly disagree / Disagree / Neutral / Agree / Strongly agree.)

  • I like research more now than when I started reading this book.
  • This textbook is easy to use.
  • I feel confident about how well I understand levels of measurement.
  • This textbook is helping me plan my research proposal.

Likert scales are excellent ways to collect information. They are popular; thus, your prospective participants may already be familiar with them. However, they do pose some challenges. You have to be very clear about your question prompts. What does strongly agree mean, and how is it differentiated from agree? In order to clarify this for participants, some researchers place definitions of these terms at the beginning of the tool.
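
Once the response labels are mapped to ordinal codes, a Likert scale is typically scored by summing or averaging across items. A minimal sketch (the 1-5 coding is the conventional one for a five-point agreement scale, not something prescribed by this chapter):

```python
# Conventional 1-5 coding for a five-point agreement scale
LIKERT_CODES = {"strongly disagree": 1, "disagree": 2, "neutral": 3,
                "agree": 4, "strongly agree": 5}

def likert_score(responses):
    """Return the total and mean score for one participant's
    responses. The codes are ordinal labels we assign, not true
    quantities, so treat the results with that caveat in mind."""
    codes = [LIKERT_CODES[r.lower()] for r in responses]
    return sum(codes), sum(codes) / len(codes)

# One participant's answers to the four items in Table 11.1:
total, mean = likert_score(["agree", "strongly agree", "neutral", "agree"])
print(total, mean)  # 16 4.0
```

Note the tension here: Likert items are ordinal, yet summing and averaging treats them as if the distances between codes were equal, a common and debated practice in applied research.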

There are a few other, less commonly used, scales discussed next.

Semantic differential scale

This is a composite (multi-item) scale in which respondents are asked to indicate their opinions or feelings toward a single statement using pairs of adjectives framed as polar opposites. For instance, in the Likert scale above, the participant is asked how much they agree or disagree with a statement. In a semantic differential scale, the participant is instead asked to indicate how they feel about a specific item. This makes the semantic differential scale an excellent technique for measuring people’s attitudes or feelings toward objects, events, or behaviors. The following is an example of a semantic differential scale created to assess participants’ feelings about the content taught in their research class.

Feelings About My Research Class

Directions: Please review the pair of words and then select the one that most accurately reflects your feelings about the content of your research class.

Boring……………………………………….Exciting

Waste of Time…………………………..Worthwhile

Dry…………………………………………….Engaging

Irrelevant…………………………………..Relevant

Guttman scale

This composite scale was designed by Louis Guttman and uses a series of items arranged in increasing order of intensity (least intense to most intense) of the concept. This type of scale allows us to understand the intensity of beliefs or feelings. Each item in a Guttman scale has a weight (this is not indicated on the tool itself), which varies with the intensity of that item, and the weighted combination of the responses is used as an aggregate measure of an observation. Let’s pretend that you are working with a group of parents whose children identify as part of the transgender community. You want to know how comfortable the parents feel with their children. You could develop the following items.

Example Guttman Scale Items

  • I would allow my child to use a name that was not gender-specific (e.g., Ryan, Taylor)    Yes/No
  • I would allow my child to wear clothing of the opposite gender (e.g., dresses for boys)   Yes/No
  • I would allow my child to use the pronoun of the opposite sex                                             Yes/No
  • I would allow my child to live as the opposite gender                                                             Yes/No

Notice how the items move from lower intensity to higher intensity. A researcher reviews the yes answers and creates a score for each participant.
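
Scoring the example items might look like the following sketch. The weights 1 through 4 are hypothetical: the text notes that weights exist and grow with an item’s intensity, but it does not list actual values.

```python
def guttman_score(answers, weights=(1, 2, 3, 4)):
    """Score one parent's yes/no answers to the four items above.
    Later items are more intense, so they carry larger weights.
    The weights 1-4 are hypothetical illustrations."""
    return sum(w for answered_yes, w in zip(answers, weights) if answered_yes)

print(guttman_score([True, True, False, False]))  # 1 + 2 = 3
print(guttman_score([True, True, True, True]))    # 1 + 2 + 3 + 4 = 10
```

In a well-behaved Guttman scale, a “yes” on a more intense item implies “yes” on all the less intense ones, so the pattern of answers itself carries information, not just the total.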

Indices (Indexes)

An index is a composite score derived from aggregating measures of multiple concepts (called components) using a set of rules and formulas. It is different from a scale. Scales also aggregate measures; however, these measures examine different dimensions or the same dimension of a single construct. A well-known example of an index is the consumer price index (CPI), which is computed every month by the Bureau of Labor Statistics of the U.S. Department of Labor. The CPI is a measure of how much consumers have to pay for goods and services (in general) and is divided into eight major categories (food and beverages, housing, apparel, transportation, healthcare, recreation, education and communication, and “other goods and services”), which are further subdivided into more than 200 smaller items. Each month, government employees call all over the country to get the current prices of more than 80,000 items. Using a complicated weighting scheme that takes into account the location and probability of purchase for each item, analysts then combine these prices into an overall index score using a series of formulas and rules.

Another example of an index is the Duncan Socioeconomic Index (SEI). This index is used to quantify a person’s socioeconomic status (SES) and is a combination of three concepts: income, education, and occupation. Income is measured in dollars, education in years or degrees achieved, and occupation is classified into categories or levels by status. These very different measures are combined to create an overall SES index score. However, SES index measurement has generated a lot of controversy and disagreement among researchers.

The process of creating an index is similar to that of a scale. First, conceptualize (define) the index and its constituent components. Though this appears simple, there may be a lot of disagreement on what components (concepts/constructs) should be included or excluded from an index. For instance, in the SES index, isn’t income correlated with education and occupation? And if so, should we include one component only or all three components? Reviewing the literature, using theories, and/or interviewing experts or key stakeholders may help resolve this issue. Second, operationalize and measure each component. For instance, how will you categorize occupations, particularly since some occupations may have changed with time (e.g., there were no Web developers before the Internet)? Third, create a rule or formula for calculating the index score. Again, this process may involve a lot of subjectivity. Lastly, validate the index score using existing or new data.
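As a sketch of the third step (a rule or formula for combining components), one common approach is to standardize each component so that dollars, years, and occupational ratings are on a comparable footing, and then average them. The data and the equal-weight rule below are assumptions for illustration, not the actual Duncan SEI procedure.

```python
from statistics import mean, pstdev

# Invented data for four people: income (dollars), education (years),
# and occupational status (an illustrative 0-100 prestige rating).
income     = [28000, 52000, 75000, 41000]
education  = [12, 16, 20, 14]
occupation = [35, 60, 80, 45]

def z_scores(values):
    """Standardize: subtract the mean, divide by the standard deviation."""
    m, s = mean(values), pstdev(values)
    return [(v - m) / s for v in values]

# Equal-weight composite: average each person's three z-scores.
ses = [mean(triple) for triple in
       zip(z_scores(income), z_scores(education), z_scores(occupation))]
```

Person three, who is highest on all three components, ends up with the highest composite score; changing the combination rule (say, weighting income more heavily) is exactly the kind of subjective decision the paragraph above describes.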

Differences Between Scales and Indices

Though indices and scales yield a single numerical score or value representing a concept of interest, they are different in many ways. First, indices often comprise components that are very different from each other (e.g., income, education, and occupation in the SES index) and are measured in different ways. Conversely, scales typically involve a set of similar items that use the same rating scale (such as a five-point Likert scale about customer satisfaction).

Second, indices often combine objectively measurable values such as prices or income, while scales are designed to assess subjective or judgmental constructs such as attitude, prejudice, or self-esteem. Some argue that the sophistication of the scaling methodology makes scales different from indexes, while others suggest that indexing methodology can be equally sophisticated. Nevertheless, indexes and scales are both essential tools in social science research.

A note on scales and indices

Scales and indices seem like clean, convenient ways to measure different phenomena in social science, but just like with a lot of research, we have to be mindful of the assumptions and biases underneath. What if a scale or an index was developed using only White women as research participants? Is it going to be useful for other groups? It very well might be, but when using a scale or index on a group for whom it hasn’t been tested, it will be very important to evaluate the validity and reliability of the instrument, which we address in the next section.

It’s important to note that while scales and indices are often made up of nominal or ordinal variables, once we combine them into composite scores, we treat those scores as interval/ratio variables.

  • Scales and indices are common ways to collect information and involve using multiple indicators in measurement.
  • A key difference between a scale and an index is that a scale contains multiple indicators for one concept, whereas an index combines measures of multiple concepts (components).
  • In order to create scales or indices, researchers must have a clear understanding of the indicators for what they are studying.
  • What is the level of measurement for each item on each tool? Take a second and think about why the tool’s creator decided to include these levels of measurement. Identify any levels of measurement you would change and why.
  • If these tools don’t exist for what you are interested in studying, why do you think that is?

11.4 Reliability and validity in measurement

  • Discuss measurement error, its different types, and how to minimize the probability of each
  • Differentiate between reliability and validity and understand how these are related to each other and relevant to understanding the value of a measurement tool
  • Compare and contrast the types of reliability and demonstrate how to evaluate each type
  • Compare and contrast the types of validity and demonstrate how to evaluate each type

The previous chapter provided insight into measuring concepts in social work research. We discussed the importance of identifying concepts and their corresponding indicators as a way to help us operationalize them. In essence, we now understand that when we think about our measurement process, we must be intentional and thoughtful in the choices that we make. Before we talk about how to evaluate our measurement process, let’s discuss why we want to evaluate our process. We evaluate our process so that we minimize our chances of error . But what is measurement error?

Types of Errors

We need to be concerned with two types of errors in measurement: systematic and random errors. Systematic errors are generally predictable: they "are due to the process that biases the results." [5] For instance, my cat stepping on the scale with me each morning is a systematic error in measuring my weight. I could predict that each measurement would be off by 13 pounds. (He's a bit of a chonk.)

There are multiple categories of systematic errors.

  • Social desirability bias occurs when you ask participants a question and they answer in the way they feel is most socially desirable. For instance, let's imagine that you want to understand the level of prejudice that participants feel regarding immigrants and decide to conduct face-to-face interviews. Some participants may feel compelled to answer in a way that indicates they are less prejudiced than they really are.
  • Acquiescence bias occurs when participants answer items in some type of pattern, usually skewed toward more favorable responses. For example, imagine that you took a research class and loved it. The professor was great and you learned so much. When asked to complete the end-of-course questionnaire, you immediately mark "strongly agree" on all items without really reading them. Instead of reading and reflecting on each item, you "acquiesced" and used your overall impression of the experience to answer all of the items.
  • Leading questions are questions worded so that the participant is "led" to a specific answer. For instance, think about the question, "Have you ever hurt a sweet, innocent child?" Most people, regardless of their true response, may answer "no" simply because the wording leads them to believe that "no" is the correct answer.

In order to minimize these types of errors, you should think about what you are studying and examine potential public perceptions of this issue. Next, think about how your questions are worded and how you will administer your tool (we will discuss these in greater detail in the next chapter). This will help you determine if your methods inadvertently increase the probability of these types of errors. 

These errors differ from random errors, which are "due to chance and are not systematic in any way." [6] Sometimes it is difficult to "tease out" random errors. When you take your statistics class, you will learn more about random errors and what to do about them. They're hard to observe until you start diving deeper into statistical analysis, so put a pin in them for now.
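A small simulation can make the distinction concrete. Assuming an invented true weight of 150 pounds, a constant 13-pound cat (systematic error), and zero-mean noise from the scale itself (random error), averaging many readings cancels the random error but leaves the bias untouched.

```python
import random

random.seed(0)  # reproducible illustration

true_weight = 150.0
cat_bias = 13.0  # systematic error: shifts every reading the same way

# Each reading = truth + constant bias + zero-mean random noise.
readings = [true_weight + cat_bias + random.gauss(0, 0.5)
            for _ in range(1000)]
average = sum(readings) / len(readings)

# The average lands near 163, not 150: the random error washes out
# across many readings, but the systematic error does not.
```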

Now that we have a good understanding of the two types of errors, let's discuss what we can do to evaluate our measurement process and minimize the chances of these occurring. Remember, quality projects are clear on what is measured, how it is measured, and why it is measured. In addition, quality projects are attentive to the appropriateness of measurement tools and evaluate whether tools are used correctly and consistently. But how do we do that? Good researchers do not simply assume that their measures work. Instead, they collect data to demonstrate that they work. If their research does not demonstrate that a measure works, they stop using it. There are two key factors to consider in deciding whether your measurements are good: reliability and validity.

Reliability

Reliability refers to the consistency of a measure. Psychologists consider three types of reliability: over time (test-retest reliability), across items (internal consistency), and across different researchers (inter-rater reliability).

Test-retest reliability

When researchers measure a construct that they assume to be consistent across time, then the scores they obtain should also be consistent across time. Test-retest reliability is the extent to which this is actually the case. For example, intelligence is generally thought to be consistent across time. A person who is highly intelligent today will be highly intelligent next week. This means that any good measure of intelligence should produce roughly the same scores for this individual next week as it does today. Clearly, a measure that produces highly inconsistent scores over time cannot be a very good measure of a construct that is supposed to be consistent.

Assessing test-retest reliability requires using the measure on a group of people at one time and then using it again on the same group of people at a later time. At neither point has the research participant received any sort of intervention. Once you have these two measurements, you look at the correlation between the two sets of scores. This is typically done by graphing the data in a scatterplot and computing the correlation coefficient. Figure 11.2 shows the correlation between two sets of scores of several university students on the Rosenberg Self-Esteem Scale, administered two times, a week apart. The correlation coefficient for these data is +.95. In general, a test-retest correlation of +.80 or greater is considered to indicate good reliability.

Figure 11.2. Scatterplot of scores at time 1 (x-axis) against scores at time 2 (y-axis), both ranging from 0 to 30. The dots indicate a strong, positive correlation.
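The correlation computation itself can be sketched in a few lines of Python. The scores below are invented for illustration (on the 0–30 Rosenberg range), not the actual data behind Figure 11.2.

```python
from statistics import mean

# Invented scores for eight students, measured a week apart.
time1 = [22, 25, 18, 28, 15, 20, 24, 27]
time2 = [21, 26, 17, 29, 16, 19, 25, 26]

def pearson_r(x, y):
    """Pearson correlation coefficient between two lists of scores."""
    mx, my = mean(x), mean(y)
    num = sum((a - mx) * (b - my) for a, b in zip(x, y))
    den = (sum((a - mx) ** 2 for a in x)
           * sum((b - my) ** 2 for b in y)) ** 0.5
    return num / den

r = pearson_r(time1, time2)
# r here exceeds +.80, the conventional cutoff for good reliability
```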

Again, high test-retest correlations make sense when the construct being measured is assumed to be consistent over time, which is the case for intelligence, self-esteem, and the Big Five personality dimensions. But other constructs are not assumed to be stable over time. The very nature of mood, for example, is that it changes. So a measure of mood that produced a low test-retest correlation over a period of a month would not be a cause for concern.

Internal consistency

Another kind of reliability is internal consistency , which is the consistency of people’s responses across the items on a multiple-item measure. In general, all the items on such measures are supposed to reflect the same underlying construct, so people’s scores on those items should be correlated with each other. On the Rosenberg Self-Esteem Scale, people who agree that they are a person of worth should tend to agree that they have a number of good qualities. If people’s responses to the different items are not correlated with each other, then it would no longer make sense to claim that they are all measuring the same underlying construct. This is as true for behavioral and physiological measures as for self-report measures. For example, people might make a series of bets in a simulated game of roulette as a measure of their level of risk seeking. This measure would be internally consistent to the extent that individual participants’ bets were consistently high or low across trials.
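Internal consistency is commonly quantified with Cronbach's alpha, a standard statistic that this chapter does not introduce by name but which follows directly from the idea that item scores should covary. A sketch, with invented responses (rows are respondents, columns are items on a 1–5 scale):

```python
from statistics import pvariance

# Invented data: five respondents answering a four-item measure.
responses = [
    [4, 5, 4, 4],
    [2, 2, 3, 2],
    [5, 5, 4, 5],
    [3, 3, 3, 4],
    [1, 2, 2, 1],
]

def cronbach_alpha(rows):
    k = len(rows[0])                    # number of items
    columns = list(zip(*rows))          # item-wise score columns
    item_var = sum(pvariance(col) for col in columns)
    total_var = pvariance([sum(row) for row in rows])
    return (k / (k - 1)) * (1 - item_var / total_var)

alpha = cronbach_alpha(responses)
# alpha close to 1 means responses move together across items
```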

Interrater Reliability

Many behavioral measures involve significant judgment on the part of an observer or a rater. Interrater reliability is the extent to which different observers are consistent in their judgments. For example, if you were interested in measuring university students’ social skills, you could make video recordings of them as they interacted with another student whom they are meeting for the first time. Then you could have two or more observers watch the videos and rate each student’s level of social skills. To the extent that each participant does, in fact, have some level of social skills that can be detected by an attentive observer, different observers’ ratings should be highly correlated with each other.
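For interval-level ratings, interrater reliability can be checked with the same correlation coefficient used for test-retest reliability. For categorical judgments, a standard alternative (not named in this chapter) is Cohen's kappa, which corrects raw agreement for agreement expected by chance. A sketch with invented ratings:

```python
# Two raters classify eight students' social skills into categories.
rater_a = ["high", "high", "low", "med", "low", "high", "med", "low"]
rater_b = ["high", "med", "low", "med", "low", "high", "med", "low"]

def cohens_kappa(a, b):
    n = len(a)
    observed = sum(x == y for x, y in zip(a, b)) / n  # raw agreement
    categories = set(a) | set(b)
    # Agreement expected by chance, from each rater's category proportions.
    expected = sum((a.count(c) / n) * (b.count(c) / n) for c in categories)
    return (observed - expected) / (1 - expected)

kappa = cohens_kappa(rater_a, rater_b)  # about 0.81: strong agreement
```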

Validity, another key element of assessing measurement quality, is the extent to which the scores from a measure represent the variable they are intended to. But how do researchers make this judgment? We have already considered one factor that they take into account—reliability. When a measure has good test-retest reliability and internal consistency, researchers should be more confident that the scores represent what they are supposed to. There has to be more to it, however, because a measure can be extremely reliable but have no validity whatsoever. As an absurd example, imagine someone who believes that people’s index finger length reflects their self-esteem and therefore tries to measure self-esteem by holding a ruler up to people’s index fingers. Although this measure would have extremely good test-retest reliability, it would have absolutely no validity. The fact that one person’s index finger is a centimeter longer than another’s would indicate nothing about which one had higher self-esteem.

Discussions of validity usually divide it into several distinct “types.” But a good way to interpret these types is that they are other kinds of evidence—in addition to reliability—that should be taken into account when judging the validity of a measure.

Face validity

Face validity is the extent to which a measurement method appears “on its face” to measure the construct of interest. Most people would expect a self-esteem questionnaire to include items about whether they see themselves as a person of worth and whether they think they have good qualities. So a questionnaire that included these kinds of items would have good face validity. The finger-length method of measuring self-esteem, on the other hand, seems to have nothing to do with self-esteem and therefore has poor face validity. Although face validity can be assessed quantitatively—for example, by having a large sample of people rate a measure in terms of whether it appears to measure what it is intended to—it is usually assessed informally.

Face validity is at best a very weak kind of evidence that a measurement method is measuring what it is supposed to. One reason is that it is based on people’s intuitions about human behavior, which are frequently wrong. It is also the case that many established measures in psychology work quite well despite lacking face validity. The Minnesota Multiphasic Personality Inventory-2 (MMPI-2) measures many personality characteristics and disorders by having people decide whether each of its 567 statements applies to them—where many of the statements do not have any obvious relationship to the construct that they measure. For example, the items “I enjoy detective or mystery stories” and “The sight of blood doesn’t frighten me or make me sick” both measure the suppression of aggression. In this case, it is not the participants’ literal answers to these questions that are of interest, but rather whether the pattern of the participants’ responses to a series of questions matches those of individuals who tend to suppress their aggression.

Content validity

Content validity is the extent to which a measure “covers” the construct of interest. For example, if a researcher conceptually defines test anxiety as involving both sympathetic nervous system activation (leading to nervous feelings) and negative thoughts, then his measure of test anxiety should include items about both nervous feelings and negative thoughts. Or consider that attitudes are usually defined as involving thoughts, feelings, and actions toward something. By this conceptual definition, a person has a positive attitude toward exercise to the extent that they think positive thoughts about exercising, feel good about exercising, and actually exercise. So to have good content validity, a measure of people’s attitudes toward exercise would have to reflect all three of these aspects. Like face validity, content validity is not usually assessed quantitatively. Instead, it is assessed by carefully checking the measurement method against the conceptual definition of the construct.

Criterion validity

Criterion validity is the extent to which people’s scores on a measure are correlated with other variables (known as criteria) that one would expect them to be correlated with. For example, people’s scores on a new measure of test anxiety should be negatively correlated with their performance on an important school exam. If it were found that people’s scores were in fact negatively correlated with their exam performance, then this would be a piece of evidence that these scores really represent people’s test anxiety. But if it were found that people scored equally well on the exam regardless of their test anxiety scores, then this would cast doubt on the validity of the measure.

A criterion can be any variable that one has reason to think should be correlated with the construct being measured, and there will usually be many of them. For example, one would expect test anxiety scores to be negatively correlated with exam performance and course grades and positively correlated with general anxiety and with blood pressure during an exam. Or imagine that a researcher develops a new measure of physical risk taking. People’s scores on this measure should be correlated with their participation in “extreme” activities such as snowboarding and rock climbing, the number of speeding tickets they have received, and even the number of broken bones they have had over the years. When the criterion is measured at the same time as the construct, criterion validity is referred to as concurrent validity ; however, when the criterion is measured at some point in the future (after the construct has been measured), it is referred to as predictive validity (because scores on the measure have “predicted” a future outcome).

Discriminant validity

Discriminant validity, on the other hand, is the extent to which scores on a measure are not correlated with measures of variables that are conceptually distinct. For example, self-esteem is a general attitude toward the self that is fairly stable over time. It is not the same as mood, which is how good or bad one happens to be feeling right now. So people’s scores on a new measure of self-esteem should not be very highly correlated with their moods. If the new measure of self-esteem were highly correlated with a measure of mood, it could be argued that the new measure is not really measuring self-esteem; it is measuring mood instead.

Increasing the reliability and validity of measures

We have reviewed the types of errors and how to evaluate our measures based on reliability and validity considerations. However, what can we do while selecting or creating our tool so that we minimize the potential of errors? Many of our options were covered in our discussion about reliability and validity. Nevertheless, the following table provides a quick summary of things that you should do when creating or selecting a measurement tool.

Table 11.2 Increasing the reliability and validity of items

  • Engage in a rigorous literature review so that you understand the concept you are studying, including the different ways it may manifest itself. This review should include a search for existing instruments. (Note: if an instrument is standardized, that means it has been rigorously studied and tested.)
  • Use content experts to review your instrument. This is a good way to check the face validity of your items. Content experts can also help you evaluate content validity.
  • Pilot test your instrument on a sufficient number of people and get detailed feedback. Ask your group to comment on the wording and clarity of items. Keep detailed notes and make adjustments BEFORE you administer your final tool.
  • Provide training for anyone helping to administer your tool. Give helpers a written research protocol that explains all of the steps of the project, and problem-solve and answer any questions they may have. This will increase the chances that your tool is administered in a consistent manner.
  • When writing items, use a higher level of measurement, if possible. This will provide more information.
  • Use multiple indicators for a variable, and think about the number of items that you will include in your tool.
  • Conduct an item-by-item assessment of multiple-item measures. As you do, think about each word and how it changes the meaning of your item.
  • In measurement, two types of errors can occur: systematic, which we might be able to predict, and random, which are difficult to predict but can sometimes be addressed during statistical analysis.
  • There are two distinct criteria by which researchers evaluate their measures: reliability and validity. Reliability is consistency across time (test-retest reliability), across items (internal consistency), and across researchers (interrater reliability). Validity is the extent to which the scores actually represent the variable they are intended to.
  • Validity is a judgment based on various types of evidence. The relevant evidence includes the measure’s reliability, whether it covers the construct of interest, and whether the scores it produces are correlated with other variables they are expected to be correlated with and not correlated with variables that are conceptually distinct.
  • Once you have used a measure, you should reevaluate its reliability and validity based on your new data. Remember that the assessment of reliability and validity is an ongoing process.
  • Provide a clear statement regarding the reliability and validity of these tools. What strengths did you notice? What were the limitations?
  • Think about your target population. Are there changes that need to be made in order for one of these tools to be appropriate for your population?
  • If you decide to create your own tool, how will you assess its validity and reliability?

11.5 Ethical and social justice considerations for measurement

  • Identify potential cultural, ethical, and social justice issues in measurement.

Just like with other parts of the research process, how we decide to measure what we are researching is influenced by our backgrounds, including our culture, implicit biases, and individual experiences. For me as a middle-class, cisgender white woman, if I don't think carefully about it, the decisions I make about measurement will probably default to ones that make the most sense to me and others like me, and will thus measure characteristics about us most accurately. There are major implications for research here, because this could affect the validity of my measurements for other populations.

This doesn't mean that standardized scales or indices, for instance, won't work for diverse groups of people. What it means is that researchers must not ignore difference in deciding how to measure a variable in their research. Doing so may serve to push already marginalized people further into the margins of academic research and, consequently, social work intervention. Social work researchers, with our strong orientation toward celebrating difference and working for social justice, are obligated to keep this in mind for ourselves and encourage others to think about it in their research, too.

This involves reflecting on what we are measuring, how we are measuring, and why we are measuring. Do we have biases that impacted how we operationalized our concepts? Did we include stakeholders and gatekeepers in the development of our concepts? This can be a way to gain access to vulnerable populations. What feedback did we receive on our measurement process, and how was it incorporated into our work? These are all questions we should ask as we think about measurement. Further, engaging in this intentionally reflective process will help us maximize the chances that our measurement will be accurate and as free from bias as possible.

The NASW Code of Ethics discusses social work research and the importance of engaging in practices that do not harm participants. [14] This is especially important considering that many of the topics studied by social workers are those that are disproportionately experienced by marginalized and oppressed populations. Some of these populations have had negative experiences with the research process: historically, their stories have been viewed through lenses that reinforced the dominant culture's standpoint. Thus, when thinking about measurement in research projects, we must remember that the way in which concepts or constructs are measured will impact how marginalized or oppressed persons are viewed.  It is important that social work researchers examine current tools to ensure appropriateness for their population(s). Sometimes this may require researchers to use or adapt existing tools. Other times, this may require researchers to develop completely new measures. In summary, the measurement protocols selected should be tailored and attentive to the experiences of the communities to be studied.

But it's not just about reflecting and identifying problems and biases in our measurement, operationalization, and conceptualization - what are we going to do about it? Consider this as you move through this book and become a more critical consumer of research. Sometimes there isn't something you can do in the immediate sense - the literature base at this moment just is what it is. But how does that inform what you will do later?

  • Social work researchers must be attentive to personal and institutional biases in the measurement process that affect marginalized groups.
  • What are the potential social justice considerations surrounding your methods?
  • What are some strategies you could employ to ensure that you engage in ethical research?
  • Milkie, M. A., & Warner, C. H. (2011). Classroom learning environments and the mental health of first grade children. Journal of Health and Social Behavior, 52, 4–22.
  • Kaplan, A. (1964). The conduct of inquiry: Methodology for behavioral science. San Francisco, CA: Chandler Publishing Company.
  • Earl Babbie offers a more detailed discussion of Kaplan's work in his text. You can read it in: Babbie, E. (2010). The practice of social research (12th ed.). Belmont, CA: Wadsworth.
  • Likert, R. (1932). A technique for the measurement of attitudes. Archives of Psychology, 140, 1–55.
  • Engel, R., & Schutt, R. (2013). The practice of research in social work (3rd ed.). Thousand Oaks, CA: SAGE.
  • Engel, R., & Schutt, R. (2013). The practice of research in social work (3rd ed.). Thousand Oaks, CA: SAGE.
  • Sullivan, G. M. (2011). A primer on the validity of assessment instruments. Journal of Graduate Medical Education, 3(2), 119–120. doi:10.4300/JGME-D-11-00075.1
  • https://www.socialworkers.org/about/ethics/code-of-ethics/code-of-ethics-english

The process by which we describe and ascribe meaning to the key facts, concepts, or other phenomena that we are investigating.

In measurement, conditions that are easy to identify and verify through direct observation.

In measurement, conditions that are subtle and complex that we must use existing knowledge and intuition to define.

The process of determining how to measure a construct that cannot be directly observed

Conditions that are not directly observable and represent states of being, experiences, and ideas.

“a logical grouping of attributes that can be observed and measured and is expected to vary from person to person in a population” (Gillespie & Wagner, 2018, p. 9)

The level that describes the type of operations that can be conducted with your data. There are four: nominal, ordinal, interval, and ratio.

Level of measurement that follows nominal level. Has mutually exclusive categories and a hierarchy (order).

A higher level of measurement. Denoted by having mutually exclusive categories, a hierarchy (order), and equal spacing between values. This last item means that values may be added, subtracted, divided, and multiplied.

The highest level of measurement. Denoted by mutually exclusive categories, a hierarchy (order), values can be added, subtracted, multiplied, and divided, and the presence of an absolute zero.

variables whose values are organized into mutually exclusive groups but whose numerical values cannot be used in mathematical operations.

variables whose values are mutually exclusive and can be used in mathematical operations

The difference between the value that we get when we measure something and the true value.

Errors that are generally predictable.

Errors that lack any perceptible pattern.

The ability of a measurement tool to measure a phenomenon the same way, time after time. Note: Reliability does not imply validity.

The extent to which scores obtained on a scale or other measure are consistent across time

The extent to which different observers are consistent in their assessment or rating of a particular characteristic or item.

The extent to which the scores from a measure represent the variable they are intended to.

The extent to which a measurement method appears “on its face” to measure the construct of interest

The extent to which a measure “covers” the construct of interest, i.e., its comprehensiveness in measuring the construct.

The extent to which people’s scores on a measure are correlated with other variables (known as criteria) that one would expect them to be correlated with.

A type of criterion validity. Examines how well a tool provides the same scores as an already existing tool.

A type of criterion validity that examines how well your tool predicts a future criterion.

the group of people whose needs your study addresses

individuals or groups who have an interest in the outcome of the study you conduct

the people or organizations who control access to the population you want to study

Graduate research methods in social work Copyright © 2020 by Matthew DeCarlo, Cory Cummings, Kate Agnelli is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License , except where otherwise noted.


Causality and Causal Inference in Social Work: Quantitative and Qualitative Perspectives

Lawrence A. Palinkas

1 School of Social Work, University of Southern California, Los Angeles, CA, USA

Achieving the goals of social work requires matching a specific solution to a specific problem. Understanding why the problem exists and why the solution should work requires a consideration of cause and effect. However, it is unclear whether it is desirable for social workers to identify cause and effect, whether it is possible for social workers to identify cause and effect, and, if so, what is the best means for doing so. These questions are central to determining the possibility of developing a science of social work and how we go about doing it. This article has four aims: (1) provide an overview of the nature of causality; (2) examine how causality is treated in social work research and practice; (3) highlight the role of quantitative and qualitative methods in the search for causality; and (4) demonstrate how both methods can be employed to support a “science” of social work.

In defining the mission of the profession of social work to enhance human well-being and help meet the basic needs of all people, the Preamble of the National Association of Social Workers Code of Ethics (2013) places great emphasis on the environmental forces that create, contribute to, and address problems in living. Implied in this emphasis is the assumption of a causal link between these environmental forces and the problems they create or contribute to. For instance, when faced with the challenge of providing care to a client with a depressive disorder, we first attempt to identify the factors that contributed to the onset of the disorder. Furthermore, to address these problems, we must appropriately and effectively match a specific solution to a specific problem. This, too, requires us to consider a causal link between the solution and its outcome (elimination of the problem or mitigation and treatment of its impacts). Thus, a client with a depressive disorder may benefit from treatment that addresses the symptoms, which may involve pharmacotherapy and/or psychotherapy.

However, the complexity of the issues we face as social workers forces us to consider whether it is desirable, much less possible, to identify cause and effect and, if so, what is the best means for doing so. The issue of desirability has been raised in conjunction with criticism of the value of the scientific method in general and of scientifically based evidence-based practice in particular ( Heineman, 1981 ; Karger, 1983 ; Otto & Ziegler, 2008 ; Tyson, 1995 ). The issue of feasibility has been raised in conjunction with the claim that the complexity of social phenomena renders the use of scientific methods problematic and incomplete ( Otto & Ziegler, 2008 ; Rosen, 2003 ). These questions are by no means limited to social work, but they are central to our consideration of whether it is possible to develop a science of social work and, if so, how we go about doing it.

This article has four aims: (1) provide an overview of the nature of causality and causal inference; (2) examine how causality and causal inference are treated in social work research and practice; (3) highlight the role of quantitative and qualitative methods in the search for causality; and (4) demonstrate how both methods can be employed to support a “science” of social work.

The Nature of Causality and Causal Inference

The human sciences, including social work, place great emphasis on understanding the causes and effects of human behavior, yet there is a lack of consensus as to how cause and effect can and should be linked ( Parascandola & Weed, 2001 ; Salmon, 1998 ; Susser, 1973 ). What little consensus exists seems to be that effects are assumed to be consequences of causes. Causes and effects may be singular in nature (e.g., cigarette smoking causes cancer) or they may be multifactorial (e.g., cancer is caused by genetic predisposition, certain health behaviors like cigarette smoking and diet, and exposure to environmental hazards like toxic chemicals; cigarette smoking causes cancer, hypertension, diabetes, and emphysema). This relationship can be viewed both spatially and temporally. For instance, the presence of a depressive disorder in an individual may have some determinants that are distal (genetic predisposition, childhood experience) and some that are proximal (e.g., recent life events like loss of employment or the death of a spouse) to the current episode of depressive symptoms. A link between one or more causes and one or more effects may also be viewed as direct or indirect (mediating, moderating; Koeske, 1993 ; Kramer, 1988 ; Susser, 1973 ). Thus, while the death of a spouse may contribute to the onset of a depressive disorder, it may do so directly or indirectly by virtue of the fact that it deprives the survivor of an important source of social support. Likewise, the death of a spouse may contribute differentially to the risk of a depressive disorder depending on whether the survivor is male or female. Causal inference, in turn, may be viewed as the process of establishing the link between the perceived cause or causes and the perceived effect or effects.

Causation may also be viewed from the perspective of the distinction between necessary and sufficient causes. For instance, “a given exposure is considered a necessary cause of an outcome if the outcome does not occur in its absence. It is a sufficient cause if it always (i.e., in all individuals) leads to an outcome without requiring the presence or absence of any other factors” ( Kramer, 1988 , p. 256). However, causes may also be multifactorial, in which case causes are neither necessary nor sufficient for any given individual. The necessary and sufficient cause definitions assume that all causes are deterministic, while a probabilistic view of causation is one in which a cause increases the probability or chance that its effect will occur but may be neither necessary nor sufficient for its occurrence ( Parascandola & Weed, 2001 ). Kramer (1988) and others ( Kleinbaum, Kupper, & Morgenstern, 1982 ; Parascandola & Weed, 2001 ) argue that a probabilistic definition of causation is more consistent with the aims of applied human sciences like public health.
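The distinctions above can be made concrete in code. The following is a hypothetical Python sketch (the cases and variable names are invented for illustration, not drawn from any study cited here) that tests an exposure against the necessary, sufficient, and probabilistic definitions of causation:

```python
# Illustrative sketch (not from the article): checking whether an exposure
# behaves as a necessary and/or sufficient cause across observed cases,
# and estimating a probabilistic effect as a risk difference.
# Each case records whether the exposure and the outcome were present.
cases = [
    {"exposed": True,  "outcome": True},
    {"exposed": True,  "outcome": True},
    {"exposed": True,  "outcome": False},   # exposure without outcome
    {"exposed": False, "outcome": False},
    {"exposed": False, "outcome": False},
]

# Necessary cause: the outcome never occurs without the exposure.
necessary = all(c["exposed"] for c in cases if c["outcome"])

# Sufficient cause: the exposure always leads to the outcome.
sufficient = all(c["outcome"] for c in cases if c["exposed"])

# Probabilistic cause: exposure raises the probability of the outcome.
def risk(group):
    return sum(c["outcome"] for c in group) / len(group)

exposed = [c for c in cases if c["exposed"]]
unexposed = [c for c in cases if not c["exposed"]]
risk_difference = risk(exposed) - risk(unexposed)

print(necessary, sufficient, round(risk_difference, 2))  # True False 0.67
```

In this toy data the exposure is necessary (no outcome occurs without it) but not sufficient (one exposed case shows no outcome), while the positive risk difference is consistent with a probabilistic cause.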

Our current notions of causation and causal inference generally owe their intellectual origins to the British social philosopher David Hume (1738/1975) . Hume's criteria of causation emphasize the importance of temporal priority, in which causes must necessarily occur or exist prior to the occurrence or existence of an effect (e.g., the cause and effect must be contiguous in space and time, the cause must be prior to the effect, and the relationship between cause and effect must be constant). Hume's criteria also stress the one-to-one relationship between cause and effect (e.g., the same cause always produces the same effect, and the same effect only occurs in the presence of the same cause; where several different objects produce the same effect, it must be the result of some characteristic the causes have in common). However, Hume's criteria do not specify the tools used to describe that relationship; in other words, they provide no guidance on the methods used to determine whether a relationship exists between two variables or phenomena and whether the nature of that relationship is causal, correlational, or coincidental. A more contemporary version of these criteria, widely used in the field of public health, was developed by the British biostatistician Austin Bradford Hill ( Hill, 1965 ; see Table 1 ). Like Hume's, these criteria give priority to the temporal relationship between a cause and effect (i.e., the first must precede the second) and to specificity (i.e., a single cause produces a specific effect), but they also suggest the importance of measurement or quantification of the relationship (i.e., strength of association and existence of a dose–response relationship) and of experimental designs. They further suggest that support for a causal inference requires confirmation using other types of information or knowledge (i.e., consistency, plausibility, and coherence).

Table 1. Hill's Criteria of Causation.

  • Temporal relationship: Exposure must always precede the outcome.
  • Strength: The stronger the association, the more likely it is that the relation is causal.
  • Dose–response relationship: An increasing amount of exposure increases the risk.
  • Consistency: The association is consistent when results are replicated in studies in different settings using different methods.
  • Plausibility: The association has some theoretical basis or agrees with currently accepted understanding of pathological processes.
  • Consideration of alternative explanations: Have other possible explanations been considered and ruled out?
  • Experiment: The condition can be altered (prevented or mitigated) by an appropriate experimental regimen.
  • Specificity: A single putative cause produces a specific effect.
  • Coherence: The association should be compatible with existing theory and knowledge.

Source: Hill (1965) .
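Several of Hill's criteria lend themselves to simple computational checks. As a hedged illustration (the dose levels and risks below are invented numbers, not data from any study), a monotonic dose–response relationship can be tested like this:

```python
# Hypothetical sketch: a monotonic dose-response check, corresponding to
# Hill's criterion that "an increasing amount of exposure increases the
# risk". The dose bands and risks are illustrative, not real data.
doses = [0, 1, 2, 3]              # e.g., exposure bands (none to heavy)
risks = [0.01, 0.03, 0.07, 0.12]  # observed outcome risk at each band

# True when risk strictly increases at every step up in dose.
dose_response = all(a < b for a, b in zip(risks, risks[1:]))
print(dose_response)  # True
```

A check like this supports only one criterion; by Hill's logic it would need to be combined with temporality, strength, consistency, and the rest before a causal reading is warranted.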

Causality in Social Work Research and Practice

Lewis (1975) argues that causal inference is an essential part of social work practice as well as social work research. However, the association of causality and causal inference in the field of social work with logical positivism and critical rationalism, with their emphasis on universal laws, has subjected the search for causal linkages to criticism from those who view it as deterministic, limited in its ability to address the complexity of social phenomena, and inconsistent with the goals of the profession ( Otto & Ziegler, 2008 ). As Padgett (2008 , p. 168) observes, “anti-positivistic skeptics question whether the search for causation is plausible or desirable, given the postmodern premise that facts are ‘fictitious’ ( Lofland & Lofland, 1995 ).” Nevertheless, embedded in much of social work research is an implicit understanding that actions have consequences and that most characteristics of the human condition can be linked directly or indirectly to one or more factors or events that are in some way responsible for that condition.

In social work research, randomized controlled trials (RCTs) have been used primarily to demonstrate causal linkages between specific interventions, treated as independent variables, and specific outcomes, treated as dependent variables. For instance, Ell and colleagues (2010) assessed the effectiveness of an evidence-based, socioculturally adapted, collaborative depression care intervention for the treatment of depression and diabetes in a group of 387 predominately Hispanic primary care patients recruited from two safety net clinics. The causal chain tested in this study was that the intervention (which included problem-solving therapy and/or antidepressant medication based on a stepped-care algorithm; first-line treatment choice; telephone treatment response, adherence, and relapse prevention follow-up over 12 months; plus systems navigation assistance) resulted in an improvement in mood (or a reduction in depressive symptoms), which, in turn, resulted in an improvement in hemoglobin A1C levels. In this instance, improvement in A1C levels was a direct effect of the reduction in depressive symptoms and an indirect effect of the depression treatment intervention. In another example, Glisson and colleagues (2010) conducted an RCT of the effectiveness of Multisystemic Therapy (MST) and the Availability, Responsiveness, and Continuity (ARC) organizational intervention in reducing problem behavior in delinquent youth residing in 14 rural counties in Tennessee, using a 2 × 2 design in which youth were randomized to receive MST or treatment as usual, and counties were randomized to receive the ARC intervention or not. A multilevel mixed effects regression analysis of 6-month treatment outcomes found that total youth problem behavior in the MST plus ARC condition was at a nonclinical level and significantly lower than in the other conditions. The causal chain tested in this study was that the ARC intervention resulted in the successful implementation of MST, which, in turn, resulted in a reduction of youth problem behavior. In this instance, the reduction of youth problem behaviors was a direct effect of the MST intervention and an indirect effect of the ARC organizational intervention.
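The 2 × 2 cluster design described above can be sketched in code. This is an illustrative Python sketch only; the county labels, the fixed seed, and the assignment logic are assumptions for demonstration, not the actual procedure Glisson and colleagues used:

```python
# Illustrative sketch of a 2x2 cluster-randomized design: counties are
# randomized to the organizational intervention (ARC vs. control) and
# youths within counties to the clinical treatment (MST vs. usual care).
# All names and the seeded assignment are hypothetical.
import random

rng = random.Random(0)  # fixed seed so the sketch is reproducible

counties = [f"county_{i}" for i in range(14)]
rng.shuffle(counties)
half = len(counties) // 2
county_arm = {c: ("ARC" if i < half else "control")
              for i, c in enumerate(counties)}

def assign_youth(youth_id, county):
    """Randomize a youth to MST or usual care within a county's arm."""
    treatment = rng.choice(["MST", "usual care"])
    return {"youth": youth_id, "county_arm": county_arm[county],
            "treatment": treatment}

cell = assign_youth("y001", "county_3")
print(cell["county_arm"], cell["treatment"])
```

Crossing the two randomizations yields four cells (MST plus ARC, MST alone, usual care plus ARC, usual care alone), which is what lets the analysis separate the direct effect of MST from the indirect effect of the ARC intervention.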

However, qualitative methods have also been used in social work research to make causal inferences linking two sets of phenomena. For instance, Gutierrez, GlenMaye, and DeLois (1995) conducted interviews with administrators and staff at six different agencies to identify elements of the organizational context of empowerment practice. Using a modified grounded theory approach, they identified four sets of factors (funding sources, social environment, intrapersonal issues, and interpersonal issues) that constitute barriers to maintaining and implementing an empowerment-based approach in social work practice. For instance, “differing philosophies or politics of more traditional service providers (cause) negatively affected the willingness or ability of empowerment-based agencies to refer clients to other services (effect)” ( Gutierrez, GlenMaye, & DeLois, 1995 , p. 252, parentheses added). Alaggia and Millington (2008) conducted a phenomenological analysis of the lived experience of 14 men who were sexually abused in childhood to “generate knowledge … on the effects of boyhood sexual abuse on the present lives of men, and to understand how those effects found expression in men's everyday lives” (p. 267). In this instance, sexual abuse during childhood is treated as the cause, and anger and rage, sexual disturbance and ambivalence, and loss and hope were identified as effects. The attempt to examine effects of childhood sexual abuse using a phenomenological approach is especially noteworthy because the focus on interpretative understanding or verstehen is often seen as a rejection of causal understanding (cf. Otto & Ziegler, 2008 ).

Qualitative and Quantitative Perspectives on Causality

Although these two studies are representative of the use of different qualitative methodological approaches to identify connections between certain phenomena and certain outcomes, in social work as in other fields, priority in the determination of causality is given to quantitative methods in general and RCTs in particular. Otto and Ziegler (2008) note that RCTs are considered the best form of evidence of practice effectiveness ( McNeece & Thyer, 2004 ) and, therefore, of causality. “These designs serve to control or cancel out any differences that are effects of other Events (Z) to assess whether Event X (cause)—as independent variable—is nonspuriously conjunct with Event Y (effect) in the context of a controlled ceteris paribus condition” ( Otto & Ziegler, 2008 , p. 275). They further argue that the criteria of using the RCT design to determine causal connections between an intervention and its outcomes can hardly be applied to qualitative research such as ethnographic studies or deep hermeneutical interviews ( Otto & Ziegler, 2008 , p. 275). Consequently, qualitative studies are placed on a lower rank of evidence of causality ( McNeece & Thyer, 2004 ), below what Cook and Campbell (1979) considered the minimum interpretable design necessary and adequate for drawing valid conclusions about the effectiveness of treatments ( Otto & Ziegler, 2008 , p. 275).

However, there are inherent limitations to relying on RCTs to determine causality in social work research. Circumstances may preclude the use of the RCT design, including small sample sizes, especially in multilevel studies where single individuals are embedded in organizations like schools or agencies; concerns about external validity; the ethics of providing service to one group and denying the same service to another group of clients; the expense and logistics involved in conducting such research; the unwillingness of participants or organizations to accept randomization; and the expense and logistical challenges in conducting longitudinal follow-up assessments ( Glasgow, Magid, Beck, Ritzwoller, & Estabrooks, 2005 ; Landsverk, Brown, Chamberlain, Palinkas, & Horwitz, 2012 ; Palinkas & Soydan, 2012 ).

Furthermore, causal models can be constructed using quantitative or qualitative data. In the example presented in Figure 1 , the model of social capital effects on psychosocial adjustment of Chinese migrant children was developed by Wu, Palinkas, and He (2010) using structural equation modeling. On the other hand, using qualitative data collected from leaders of county-level child welfare, mental health and juvenile justice systems in California, Palinkas and colleagues (2014) also developed a model of interorganizational collaboration that posited causal linkages between characteristics of the outer context (availability of funding, legislative mandates, size of jurisdiction, and extent of responsibility for same client population), inner context (characteristics of the participating organizations and individual members of those organizations), and characteristics of the collaboration itself (focus on a single vs. multiple initiatives, formality, frequency of interaction) and the structure of social networks that, in turn, are linked to the pace and progress of implementation of evidence-based practices (see Figure 2 ).

[Figure 1] Standardized solutions for the structural model of social capital effects on the psychosocial adjustment of Chinese migrant children. Source: Wu, Palinkas, and He (2010) .

[Figure 2] Heuristic model of interorganizational collaboration for implementation of evidence-based practices. Source: Palinkas et al. (2014) .

Finally, not all qualitative methodologists have rejected the notion that the construction of causal inferences is both desirable and possible. Miles and Huberman (1994 , p. 4), for instance, “aim to account for events, rather than simply to document their sequence. We look for an individual or a social process, a mechanism, a structure at the core of events that can be captured to provide a causal description of the forces at work” (italics in original). Sayer (2000) argues that causal explanation is not only legitimate in qualitative research, but a particular strength of this approach, although it uses a different strategy from quantitative research, based on a process rather than a variance concept of causality. Ragin's (1987) qualitative comparative analysis involves representing each case as a combination of causes and effects that can then be compared with each other. Another qualitative comparative method, analytic induction, is described as an “exhaustive examination of cases in order to prove universal, causal generalizations” ( Vidich & Lyman, 2000 , p. 57). Denzin (1978) considered analytic induction to be one of three major strategies for establishing the existence of a causal relationship, the other two being the statistical method and the experimental method. Even Lofland (1971) , considered a skeptic of the search for causation, argued that the strong suit of the qualitative researcher is the ability to provide order, rich descriptive detail, stating that “it is perfectly appropriate that one be curious about causes, so long as one recognizes that whatever account or explanation he develops is conjecture” (p. 62).
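Ragin's idea of representing each case as a combination of causes and effects can be illustrated with a small truth-table sketch in the spirit of qualitative comparative analysis. The conditions, case data, and consistency rule below are invented for illustration, not taken from Ragin's own examples:

```python
# Hypothetical sketch of a Ragin-style qualitative comparative analysis:
# each case is a combination of binary causal conditions plus an outcome,
# and cases are grouped into a truth table to see which configurations
# of conditions consistently co-occur with the outcome.
from collections import defaultdict

# (conditions, outcome) pairs; 1 = present, 0 = absent. Invented data.
cases = [
    ({"funding": 1, "mandate": 1, "leadership": 1}, 1),
    ({"funding": 1, "mandate": 0, "leadership": 1}, 1),
    ({"funding": 0, "mandate": 1, "leadership": 0}, 0),
    ({"funding": 1, "mandate": 1, "leadership": 1}, 1),
    ({"funding": 0, "mandate": 0, "leadership": 0}, 0),
]

# Group cases by their configuration of conditions.
truth_table = defaultdict(list)
for conditions, outcome in cases:
    key = tuple(sorted(conditions.items()))
    truth_table[key].append(outcome)

# A configuration is "consistent" if every case sharing it shows the outcome.
consistent = [dict(k) for k, outcomes in truth_table.items()
              if all(outcomes)]
print(len(consistent))  # 2 consistent configurations in this toy data
```

Comparing consistent and inconsistent configurations is what allows the analyst to argue, case by case rather than variable by variable, which combinations of conditions plausibly produce the outcome.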

It would seem, therefore, that quantitative and qualitative methods each present certain advantages and disadvantages in making causal inferences, whether one identifies with a logical positivist or postpositivist view of human nature or with a postmodernist, social constructivist one, and whether one is more at ease counting quantitative data or interpreting qualitative data. However, as no single method is adequate to the challenge of linking cause and effect in a deterministic or probabilistic fashion, it is perhaps prudent to heed the advice of Campbell (1999) , who maintained that because proving causality with certainty in explaining social phenomena is problematic, and because all methods for proving causality are imperfect, multiple methods, both quantitative and qualitative, are needed to generate and test theory, improve understanding over time of how the world operates, and support informed policy making and social program decision making.

Causality and the Science of Social Work

The path to causality can be viewed as moving across a series of steps that begin with identification and proceed to description, explanation generation, explanation testing, and prescription or control. Identification first occurs through reports or studies that point to the existence of a previously unknown or unrecognized phenomenon. Description of the phenomenon may involve qualitative (narratives, case studies) and/or quantitative (frequencies, percentages) data. Both methodological approaches may be employed in the next step, which is the identification of associations between variables and the generation of hypotheses to be tested that can help to explain why the variables are in association with one another. The next step is then to test the hypotheses and the validity of the presumed explanation. This step usually requires the use of prospective longitudinal designs and the use of quantitative methods. The final step is the construction of experimental conditions that enable the investigator to simultaneously control for the possibility of alternate explanations for the observed association between one variable presumed to be the cause and the other variable or variables presumed to be the effect. This step usually requires the use of the RCT design and the use of quantitative methods.

One can conceive of two separate arguments that link these discrete steps in a meaningful way. In the first argument, the further we proceed along the path of scientific inquiry, the more we rely on quantitative methods to make causal inferences and support the existence of a causal link or relationship. However, as noted previously, there are inherent limitations to relying on RCTs to determine causality in social work research. In the second argument, qualitative and quantitative methods each make distinct contributions to the task of proving causality. Thus, in using quantitative methods, priority is placed on confirmation of hypotheses through experimentation and a narrow or segmented focus on potential causal explanations, while in using qualitative methods, priority is placed on exploration of phenomena and generation of hypotheses through observation and a broad or holistic focus on the social context in which causal links occur.

Although they may differ with respect to the value placed on each set of methods (with quantitative methods being considered dominant in the first argument and coequal with qualitative methods in the second), both arguments posit a relationship between qualitative and quantitative methods, and both assume that each set of methods has a role to play in understanding causality and in making causal inferences. Relationships between the two sets of methods have been increasingly articulated using the terminology of mixed methods, defined as the integrated use of quantitative and qualitative methods in ways that provide greater understanding or insight into a phenomenon than might be obtainable from either method used alone ( Palinkas, Horwitz, Chamberlain, Hurlburt, & Landsverk, 2011 ). Creswell and Plano Clark (2011) identify five different types of mixed methods designs. A Triangulation design is used when there is a need to compare results from different sources of information regarding the same hypothesized phenomenon or parameter to seek corroboration. An Explanatory or complementary design is used to understand a phenomenon more comprehensively or completely. An Exploratory design is used for instrument, taxonomy, or typology development, where qualitative data serve as an initial exploration to identify variables, constructs, taxonomies, or instruments for a subsequent quantitative study phase. An Embedded or Expansion design is used to assess hypothesized different phenomena or parameters using different methods. Finally, an Initiation or Transformative design is used to understand a phenomenon more insightfully, discovering new ideas, perspectives, and meanings. Each of these designs may be used to identify, describe, explain, verify, and control the relationships linking one phenomenon or set of phenomena to another in a causal fashion.
This combined use of quantitative and qualitative methods may occur simultaneously, in which one method usually drives the project theoretically with the supplemental project designed to elicit information that the base method cannot achieve or for the results to inform in greater detail about one part of the dominant project, or sequentially, in which the method that theoretically drives the project is used first, with the second method designed to resolve problems/issues uncovered by the first study or to provide a logical extension from the findings of the first study.

An illustration of the use of mixed method designs to examine causality and causal inference can be found in the Child STEPS Effectiveness Trial (CSET), carried out by the Research Network on Youth Mental Health and funded by the John D. and Catherine T. MacArthur Foundation ( Chorpita et al., 2013 ; Weisz et al., 2012 ). The CSET focused on children aged 8–13 who had been referred for treatment of problems involving disruptive conduct, depression, anxiety, or any combination of these. Ten clinical service organizations in Honolulu and Boston, 84 therapists, and 174 youths participated in the project. Youth participants were treated with the usual treatment procedures in their settings or with one or more of three selected evidence-based treatments (EBTs): cognitive-behavioral therapy (CBT) for anxiety, CBT for depression, and behavioral parent training (BPT) for conduct problems. These evidence-based treatments were tested in two forms: standard manual treatment (standard), using full treatment manuals; and modular treatment (modular) in which therapists learn all the component practices of the evidence-based treatments but individualize the use of the components for each child, guided by a clinical algorithm and measurement feedback on practices and clinical progress. A cluster randomization design was employed with therapists assigned to one of three conditions (usual care, standard, and modular) and youth who met study criteria randomized to treatment delivered by one of these three groups of therapists.

Mixed effects regression analyses showed significantly superior outcome trajectories for modular treatment (cause) relative to usual care on weekly measures of a standardized Brief Problem Checklist and a patient-generated Top Problems Assessment (effect), and youths receiving modular treatment had significantly fewer diagnoses than usual care youths at posttreatment ( Chorpita et al., 2013 ; Weisz et al., 2012 ). In contrast, none of these outcomes showed significant differences between standard treatment and usual care. Follow-up tests also showed significantly better outcomes for modular treatment than standard treatment on the weekly trajectory measures. In general, the modular approach outperformed usual care and the standard approach on the clinical outcome measures, and the standard approach did not outperform usual care.

Although the use of the modular approach to evidence-based treatment was assumed to have caused an improvement in behavioral health outcomes in this population, the quantitative data alone could not explain why the modular approach was more successful than the standard approach. To address that question, a qualitative study of the process of EBT dissemination and implementation was embedded in the RCT. Semistructured interviews and focus groups were conducted with 38 therapists, six project supervisors, and eight clinical organization directors or senior administrators to identify patterns of use of the EBTs once the randomized trial had been concluded. Twenty-six of the 28 therapists (93%) who had been assigned to the standard or the modular conditions reported using the techniques with nonstudy cases subsequent to the conclusion of the trial. However, the pattern of use among all therapists, including those in the standard manualized condition, was more consistent with the modular approach. While all of the therapists in these two conditions thought the EBTs were helpful, what distinguished the two groups of therapists was the perception that the modular approach (cause) allowed for more flexibility, accommodation, and control over the therapeutic alliance with clients (effects) than the standard approach. Both therapists and supervisors felt that the modular approach gave them more “license” to negotiate with researchers with respect to circumstances in which the modules could themselves be modified or, more often than not, supplemented with additional materials and techniques acquired through experience working with similar clients ( Palinkas et al., 2013 ).

We began by asking three questions. The first was whether it is desirable for social workers to identify cause and effect. It is desirable if we believe social work to be an applied, empirically grounded social and cultural science aiming at both causal explanation and interpretative understanding ( Otto & Ziegler, 2008 , p. 273), one that includes elements of logical positivism and postmodernist social constructivism. It is also desirable if the foundation of our profession is to change the lives of our clients for the better. Kramer (1988 , p. 255) makes a similar argument for examining causality in public health, stating that “an understanding of cause is essential for change … A deliberate intervention (change in exposure) will be successful in altering outcome only to the extent that the exposure is a true cause of that outcome.” Alternatively, we might question whether it is possible to develop and implement a solution without a comprehensive understanding of the problem one is trying to solve (Can we achieve y without understanding x?). To answer that question, we would have to determine whether that understanding can be comprehensive without understanding the cause of a problem (Is an understanding of x necessary to produce y?). Further, even if the solution mitigated the consequences of the problem (e.g., reducing symptoms of depression or anxiety), is it truly an effective solution if the cause remains unaddressed (Can we produce y without changing x?)?

The second question we addressed was whether it is possible for social workers to determine causality. Social workers face inherent challenges in adopting exclusively positivist criteria for determining causality. Making connections between a cause and an effect is possible whether one adheres to a positivist or a social constructivist view of society and behavior. If understanding cause and effect is the foundation of any science, then that understanding is possible if it is seen as a process and not as a specific outcome, especially if the process and outcome are both context-specific.

Finally, we asked about the best means of determining causality or making causal inferences if it is both possible and desirable for social workers to do so. The answer is that both qualitative and quantitative methods can and should be used to fulfill specific roles in that process. Qualitative methods would be especially important in the early exploratory stages of scientific inquiry and for providing in-depth understanding of the causal chain and the context in which it exists. Quantitative methods would be especially important in the later confirmatory stages of scientific inquiry and for generalizing findings to other populations in other settings. Both methods are fundamental to a science of social work.

The integrated use of quantitative and qualitative methods is certainly not a novel concept. Haight (2010) , for instance, called for the integration of postpositivist perspectives of critical realism, with an emphasis on quantitative methods and research designs, and interpretative perspectives, with an emphasis on qualitative or mixed research designs and methods. While “postpositivist research using quantitative methods can help to identify generally effective interventions and eliminate the use of harmful or ineffective interventions, … interpretivist research using qualitative methods can enhance understanding of the ways in which cultural context (cause) interacts with interventions, resulting in diverse outcomes (effects)” ( Haight, 2010 , p. 102). Epstein's (2009) model of “evidence-informed practice” calls for the integrated use of evidence-based practice, with its emphasis on standardized quantitative measures and RCT designs, and reflective practice, with its emphasis on qualitative observation. What is novel here is that the process of making causal inferences is not limited to quantitative methods or RCT designs.

Perhaps the greatest challenge we face in creating a science of social work is being faithful to the principles of scientific inquiry on one hand while simultaneously being responsive to the needs, activities, traditions, and multiple perspectives of our discipline. The diversity of these needs, activities, traditions, and perspectives reflects the complexity of the problems we seek to solve and the underlying factors that are responsible for those problems. This complexity makes it difficult to identify single or specific causes of single or specific effects. However, while this complexity may be viewed as an obstacle to the creation of a science of social work, it also represents a unique opportunity to create a science that acknowledges the importance of qualitative as well as quantitative methods, of practice-based evidence as well as evidence-based practice, and of explanation grounded in social constructivism as well as logical positivism.

Acknowledgments

Funding : The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: Support for this article was provided by grants from the William T. Grant Foundation (Grant no. 10648: L. Palinkas, PI), National Institute of Mental Health (P30-MH074678: J. Landsverk, PI; and R01MH076158: P. Chamberlain, PI), and National Institute on Drug Abuse (P30 DA027828-01-A1: C. Hendricks Brown, PI).

Declaration of Conflicting Interests : The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

  • Alaggia R, Millington G. Male child sexual abuse: A phenomenology of betrayal. Clinical Social Work. 2008;36:265–275.
  • Campbell DT. Legacies of logical positivism and beyond. In: Campbell DT, Russo MJ, editors. Social experimentation. Thousand Oaks, CA: Sage; 1999. pp. 131–144.
  • Chorpita BF, Weisz JR, Daleiden EL, Schoenwald SK, Palinkas LA, Miranda J, et al., the Research Network on Youth Mental Health. Long-term outcomes for the Child STEPs randomized effectiveness trial: A comparison of modular and standard treatment designs with usual care. Journal of Consulting and Clinical Psychology. 2013;81:999–1009.
  • Cook TD, Campbell DT. Quasi-experimentation: Design and analysis issues for field settings. Chicago, IL: Rand McNally; 1979.
  • Cresswell J, Plano Clark VL. Designing and conducting mixed method research. 2nd ed. Thousand Oaks, CA: Sage; 2011.
  • Denzin NK. The logic of naturalistic inquiry. In: Denzin NK, editor. Sociological methods. Thousand Oaks, CA: Sage; 1978.
  • Ell K, Katon W, Xie B, Lee PJ, Kapetanovic S, Guterman J, Chou CP. Collaborative care management of major depression among low-income, predominantly Hispanic subjects with diabetes: A randomized controlled trial. Diabetes Care. 2010;33:706–713.
  • Epstein I. Promoting harmony where there is commonly conflict: Evidence-informed practice as an integrative strategy. Social Work in Health Care. 2009;48:216–231.
  • Glasgow RE, Magid DJ, Beck A, Ritzwoller D, Estabrooks PA. Practical clinical trials for translating research to practice: Design and measurement recommendations. Medical Care. 2005;43:551–557.
  • Glisson C, Schoenwald SK, Hemmelgarn A, Green P, Dukes D, Armstrong KS, Chapman JE. Randomized trial of MST and ARC in a two-level evidence-based treatment implementation strategy. Journal of Consulting and Clinical Psychology. 2010;78:537–550.
  • Gutierrez L, GlenMaye L, DeLois K. The organizational context of empowerment practice: Implications for social work administration. Social Work. 1995;40:249–258.
  • Haight WL. The multiple roles of applied social science research in evidence-based practice. Social Work. 2010;55:101–103.
  • Heineman MH. The obsolete scientific imperative in social work research. Social Service Review. 1981;55:371–395.
  • Hill AB. The environment and disease: Association or causation? Proceedings of the Royal Society of Medicine. 1965;58:295–300.
  • Hume D. A treatise of human nature: Reprinted from the original edition in three volumes and edited, with an analytical index, by L. A. Selby-Bigge. London, England: Oxford University Press; 1975. (Original work published 1738)
  • Karger HJ. Science, research, and social work: Who controls the profession? Social Work. 1983;28:200–205.
  • Kleinbaum DG, Kupper LL, Morgenstern HL. Epidemiologic research: Principles and quantitative methods. Belmont, CA: Lifetime Learning; 1982.
  • Koeske GF. Moderator variables in social work research. Journal of Social Service Research. 1993;16:159–178.
  • Kramer MS. Clinical epidemiology and biostatistics: A primer for clinical investigators and decision-makers. London, England: Springer-Verlag; 1988.
  • Landsverk J, Brown CH, Chamberlain P, Palinkas LA, Horwitz SM. Design and analysis in dissemination and implementation research. In: Brownson RC, Colditz GA, Proctor EK, editors. Dissemination and implementation research in health: Translating science to practice. New York, NY: Oxford University Press; 2012. pp. 225–260.
  • Lewis H. Reasoning in practice. Smith College Studies in Social Work. 1975;46:3–15.
  • Lofland J. Analyzing social settings: A guide to qualitative observation and analysis. Belmont, CA: Wadsworth; 1971.
  • Lofland J, Lofland L. Analyzing social settings: A guide to qualitative observation and analysis. 3rd ed. Belmont, CA: Wadsworth; 1995.
  • McNeece CA, Thyer BA. Evidence-based practice and social work. Journal of Evidence-Based Social Work. 2004;1:7–25.
  • Miles MB, Huberman AM. Qualitative data analysis: An expanded sourcebook. 2nd ed. Thousand Oaks, CA: Sage; 1994.
  • National Association of Social Workers. Code of ethics. 2013. Retrieved from http://www.socialworkers.org/pubs/code/code.asp
  • Otto HU, Ziegler H. The notion of causal impact in evidence-based social work: An introduction to the special issue on what works? Research on Social Work Practice. 2008;18:273–277.
  • Padgett DK. Qualitative methods in social work research. 2nd ed. Thousand Oaks, CA: Sage; 2008.
  • Palinkas LA, Fuentes D, Garcia AR, Finno M, Holloway IW, Chamberlain P. Inter-organizational collaboration in the implementation of evidence-based practices among agencies serving abused and neglected youth. Administration and Policy in Mental Health and Mental Health Services Research. 2014;41:74–85.
  • Palinkas LA, Holloway IW, Rice E, Fuentes D, Wu Q, Chamberlain P. Social networks and implementation of evidence-based practices in public youth-serving systems: A mixed methods study. Implementation Science. 2011;6:113.
  • Palinkas LA, Horwitz SM, Chamberlain P, Hurlburt M, Landsverk J. Mixed method designs in mental health services research. Psychiatric Services. 2011;62:255–263.
  • Palinkas LA, Soydan H. Translation and implementation of evidence-based practice. New York, NY: Oxford University Press; 2012.
  • Palinkas LA, Weisz JR, Chorpita B, Levine B, Garland A, Hoagwood KE, Landsverk J. Use of evidence-based treatments for youth mental health subsequent to a randomized controlled effectiveness trial: A qualitative study. Psychiatric Services. 2013;64:1110–1118.
  • Parascandola M, Weed DL. Causation in epidemiology. Journal of Epidemiology and Community Health. 2001;55:905–912.
  • Ragin CC. The comparative method: Moving beyond qualitative and quantitative strategies. Berkeley: University of California Press; 1987.
  • Rosen A. Evidence-based social work practice: Challenges and promise. Social Work Research. 2003;27:197–208.
  • Salmon WC. Causality and explanation. New York, NY: Oxford University Press; 1998.
  • Sayer A. Realism and social science. Thousand Oaks, CA: Sage; 2000.
  • Susser M. Causal thinking in the health sciences: Concepts and strategies of epidemiology. New York, NY: Oxford University Press; 1973.
  • Tyson K. New foundations for scientific social and behavioral research: The heuristic paradigm. Boston, MA: Allyn & Bacon; 1995.
  • Vidich AJ, Lyman SM. Qualitative methods: Their history in sociology and anthropology. In: Denzin NK, Lincoln YS, editors. Handbook of qualitative research. Thousand Oaks, CA: Sage; 2000. pp. 37–84.
  • Weisz JR, Chorpita BF, Palinkas LA, Schoenwald SK, Miranda J, Bearman SK, et al., the Research Network on Youth Mental Health. Testing standard and modular designs for psychotherapy with youth depression, anxiety, and conduct problems: A randomized effectiveness trial. Archives of General Psychiatry. 2012;69:274–282.
  • Wu Q, Palinkas LA, He X. An ecological examination of social capital effects on the academic achievement of Chinese migrant children. British Journal of Social Work. 2010;40:2578–2597.


11. Quantitative measurement

Chapter outline.

  • Conceptual definitions (17 minute read)
  • Operational definitions (36 minute read)
  • Measurement quality (21 minute read)
  • Ethical and social justice considerations (15 minute read)

Content warning: examples in this chapter contain references to ethnocentrism, toxic masculinity, racism in science, drug use, mental health and depression, psychiatric inpatient care, poverty and basic needs insecurity, pregnancy, and racism and sexism in the workplace and higher education.

11.1 Conceptual definitions

Learning objectives.

Learners will be able to…

  • Define measurement and conceptualization
  • Apply Kaplan’s three categories to determine the complexity of measuring a given variable
  • Identify the role previous research and theory play in defining concepts
  • Distinguish between unidimensional and multidimensional concepts
  • Critically apply reification to how you conceptualize the key variables in your research project

In social science, when we use the term measurement, we mean the process by which we describe and ascribe meaning to the key facts, concepts, or other phenomena that we are investigating. At its core, measurement is about defining one’s terms in as clear and precise a way as possible. Of course, measurement in social science isn’t quite as simple as using a measuring cup or spoon, but there are some basic tenets on which most social scientists agree when it comes to measurement. We’ll explore those, as well as some of the ways that measurement might vary depending on your unique approach to the study of your topic.

An important point here is that measurement does not require any particular instruments or procedures. What it does require is a systematic procedure for assigning scores, meanings, and descriptions to individuals or objects so that those scores represent the characteristic of interest. You can measure phenomena in many different ways, but you must be sure that how you choose to measure gives you information and data that lets you answer your research question. If you’re looking for information about a person’s income, but your main points of measurement have to do with the money they have in the bank, you’re not really going to find the information you’re looking for!

The question of what social scientists measure can be answered by asking yourself what social scientists study. Think about the topics you’ve learned about in other social work classes you’ve taken or the topics you’ve considered investigating yourself. Let’s consider Melissa Milkie and Catharine Warner’s study (2011) [1] of first graders’ mental health. In order to conduct that study, Milkie and Warner needed to have some idea about how they were going to measure mental health. What does mental health mean, exactly? And how do we know when we’re observing someone whose mental health is good and when we see someone whose mental health is compromised? Understanding how measurement works in research methods helps us answer these sorts of questions.

As you might have guessed, social scientists will measure just about anything that they have an interest in investigating. For example, those who are interested in learning something about the correlation between social class and levels of happiness must develop some way to measure both social class and happiness. Those who wish to understand how well immigrants cope in their new locations must measure immigrant status and coping. Those who wish to understand how a person’s gender shapes their workplace experiences must measure gender and workplace experiences (and get more specific about which experiences are under examination). You get the idea. Social scientists can and do measure just about anything you can imagine observing or wanting to study. Of course, some things are easier to observe or measure than others.


Observing your variables

In 1964, philosopher Abraham Kaplan wrote The Conduct of Inquiry, [2] which has since become a classic work in research methodology (Babbie, 2010). [3] In his text, Kaplan describes different categories of things that behavioral scientists observe. One of those categories, which Kaplan called “observational terms,” is probably the simplest to measure in social science. Observational terms are the sorts of things that we can see with the naked eye simply by looking at them. Kaplan roughly defines them as conditions that are easy to identify and verify through direct observation. If, for example, we wanted to know how the conditions of playgrounds differ across neighborhoods, we could directly observe the variety, amount, and condition of equipment at various playgrounds.

Indirect observables, on the other hand, are less straightforward to assess. In Kaplan’s framework, they are conditions that are subtle and complex, and that we must use existing knowledge and intuition to define. If we conducted a study for which we wished to know a person’s income, we’d probably have to ask them their income, perhaps in an interview or a survey. Thus, we have observed income, even if it has only been observed indirectly. Birthplace might be another indirect observable. We can ask study participants where they were born, but chances are good we won’t have directly observed any of those people being born in the locations they report.

Sometimes the measures that we are interested in are more complex and more abstract than observational terms or indirect observables. Think about some of the concepts you’ve learned about in other social work classes—for example, ethnocentrism. What is ethnocentrism? Well, from completing an introduction to social work class you might know that it has something to do with the way a person judges another’s culture. But how would you  measure  it? Here’s another construct: bureaucracy. We know this term has something to do with organizations and how they operate but measuring such a construct is trickier than measuring something like a person’s income. The theoretical concepts of ethnocentrism and bureaucracy represent ideas whose meanings we have come to agree on. Though we may not be able to observe these abstractions directly, we can observe their components.

Kaplan referred to these more abstract things that behavioral scientists measure as constructs.  Constructs  are “not observational either directly or indirectly” (Kaplan, 1964, p. 55), [4] but they can be defined based on observables. For example, the construct of bureaucracy could be measured by counting the number of supervisors that need to approve routine spending by public administrators. The greater the number of administrators that must sign off on routine matters, the greater the degree of bureaucracy. Similarly, we might be able to ask a person the degree to which they trust people from different cultures around the world and then assess the ethnocentrism inherent in their answers. We can measure constructs like bureaucracy and ethnocentrism by defining them in terms of what we can observe. [5]
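The idea of defining a construct in terms of observables can be sketched in code. The snippet below is a minimal, hypothetical illustration of the bureaucracy example: the function name and the rule that one sign-off equals one point are invented for this sketch, not drawn from Kaplan.

```python
def bureaucracy_score(approvals_required: int) -> int:
    """Measure the construct 'bureaucracy' through an observable:
    the number of supervisors who must sign off on routine spending."""
    if approvals_required < 0:
        raise ValueError("approval count cannot be negative")
    # More required approvals -> greater degree of bureaucracy
    return approvals_required

# Two hypothetical agencies
agency_a = bureaucracy_score(2)  # routine spending needs 2 sign-offs
agency_b = bureaucracy_score(7)  # routine spending needs 7 sign-offs
print(agency_b > agency_a)       # prints True: agency B is more bureaucratic
```

The point is not the arithmetic but the move it encodes: an unobservable construct becomes measurable only once we commit to a specific observable stand-in.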

The idea of coming up with your own measurement tool might sound pretty intimidating at this point. The good news is that if you find something in the literature that works for you, you can use it (with proper attribution, of course). If there are only pieces of it that you like, you can reuse those pieces (with proper attribution and describing/justifying any changes). You don’t always have to start from scratch!

Look at the variables in your research question.

  • Classify them as direct observables, indirect observables, or constructs.
  • Do you think measuring them will be easy or hard?
  • What are your first thoughts about how to measure each variable? No wrong answers here, just write down a thought about each variable.


Measurement starts with conceptualization

In order to measure the concepts in your research question, we first have to understand what we think about them. As an aside, the word concept has come up quite a bit, and it is important to be sure we have a shared understanding of that term. A concept is the notion or image that we conjure up when we think of some cluster of related observations or ideas. For example, masculinity is a concept. What do you think of when you hear that word? Presumably, you imagine some set of behaviors and perhaps even a particular style of self-presentation. Of course, we can’t necessarily assume that everyone conjures up the same set of ideas or images when they hear the word masculinity. While there are many possible ways to define the term, and some may be more common or have more support than others, there is no universal definition of masculinity. What counts as masculine may shift over time, from culture to culture, and even from individual to individual (Kimmel, 2008). This is why defining our concepts is so important.

Not all researchers clearly explain their theoretical or conceptual framework for their study, but they should! Without understanding how a researcher has defined their key concepts, it would be nearly impossible to understand the meaning of that researcher’s findings and conclusions. Back in Chapter 7, you developed a theoretical framework for your study based on a survey of the theoretical literature in your topic area. If you haven’t done that yet, consider flipping back to that section to familiarize yourself with some of the techniques for finding and using theories relevant to your research question. Continuing with our example on masculinity, we would need to survey the literature on theories of masculinity. After a few queries on masculinity, I found a wonderful article by Wong (2010) [6] that analyzed eight years of the journal Psychology of Men & Masculinity to determine how often different theories of masculinity were used. Not only can I get a sense of which theories are more accepted and which are more marginal in the social science on masculinity, but I can also identify a range of options from which to choose the theory or theories that will inform my project.

Identify a specific theory (or more than one theory) and how it helps you understand…

  • Your independent variable(s).
  • Your dependent variable(s).
  • The relationship between your independent and dependent variables.

Rather than completing this exercise from scratch, build from your theoretical or conceptual framework developed in previous chapters.

In quantitative methods, conceptualization involves writing out clear, concise definitions for our key concepts. These are the kind of definitions you are used to, like the ones in a dictionary. A conceptual definition involves defining a concept in terms of other concepts, usually by making reference to how other social scientists and theorists have defined those concepts in the past. Of course, new conceptual definitions are created all the time because our conceptual understanding of the world is always evolving.

Conceptualization is deceptively challenging—spelling out exactly what the concepts in your research question mean to you. Following along with our example, think about what comes to mind when you read the term masculinity. How do you know masculinity when you see it? Does it have something to do with men or with social norms? If so, perhaps we could define masculinity as the social norms that men are expected to follow. That seems like a reasonable start, and at this early stage of conceptualization, brainstorming about the images conjured up by concepts and playing around with possible definitions is appropriate. However, this is just the first step. At this point, you should be beyond brainstorming for your key variables because you have read a good amount of research about them.

In addition, we should consult previous research and theory to understand the definitions that other scholars have already given for the concepts we are interested in. This doesn’t mean we must use their definitions, but understanding how concepts have been defined in the past will help us to compare our conceptualizations with how other scholars define and relate concepts. Understanding prior definitions of our key concepts will also help us decide whether we plan to challenge those conceptualizations or rely on them for our own work. Finally, working on conceptualization is likely to help in the process of refining your research question to one that is specific and clear in what it asks. Conceptualization and operationalization (next section) are where “the rubber meets the road,” so to speak, and you have to specify what you mean by the question you are asking. As your conceptualization deepens, you will often find that your research question becomes more specific and clear.

If we turn to the literature on masculinity, we will surely come across work by Michael Kimmel, one of the preeminent masculinity scholars in the United States. After consulting Kimmel’s prior work (2000; 2008), [7] we might tweak our initial definition of masculinity. Rather than defining masculinity as “the social norms that men are expected to follow,” perhaps instead we’ll define it as “the social roles, behaviors, and meanings prescribed for men in any given society at any one time” (Kimmel & Aronson, 2004, p. 503). [8] Our revised definition is more precise and complex because it goes beyond addressing one aspect of men’s lives (norms), and addresses three aspects: roles, behaviors, and meanings. It also implies that roles, behaviors, and meanings may vary across societies and over time. Using definitions developed by theorists and scholars is a good idea, though you may find that you want to define things your own way.

As you can see, conceptualization isn’t as simple as applying any random definition that we come up with to a term. Defining our terms may involve some brainstorming at the very beginning. But conceptualization must go beyond that, to engage with or critique existing definitions and conceptualizations in the literature. Once we’ve brainstormed about the images associated with a particular word, we should also consult prior work to understand how others define the term in question. After we’ve identified a clear definition that we’re happy with, we should make sure that every term used in our definition will make sense to others. Are there terms used within our definition that also need to be defined? If so, our conceptualization is not yet complete. Our definition includes the concept of “social roles,” so we should have a definition for what those mean and become familiar with role theory to help us with our conceptualization. If we don’t know what roles are, how can we study them?

Let’s say we do all of that. We have a clear definition of the term masculinity with reference to previous literature, and we also have a good understanding of the terms in our conceptual definition…then we’re done, right? Not so fast. You’ve likely met more than one man in your life, and you’ve probably noticed that they are not the same, even if they live in the same society during the same historical time period. This could mean there are dimensions of masculinity. In terms of social scientific measurement, concepts can be said to have multiple dimensions when there are multiple elements that make up a single concept. With respect to the term masculinity, dimensions could be based on gender identity, gender performance, sexual orientation, and so on. In any of these cases, the concept of masculinity would be considered to have multiple dimensions.

While you do not need to spell out every possible dimension of the concepts you wish to measure, it is important to identify whether your concepts are unidimensional (and therefore relatively easy to define and measure) or multidimensional (and therefore require multi-part definitions and measures). In this way, how you conceptualize your variables determines how you will measure them in your study. Unidimensional concepts are those that are expected to have a single underlying dimension. These concepts can be measured using a single measure or test. Examples include simple concepts such as a person’s weight, time spent sleeping, and so forth. 

One frustrating thing is that there is no clear demarcation between concepts that are inherently unidimensional or multidimensional. Even something as simple as age could be broken down into multiple dimensions, including mental age and chronological age, so where does conceptualization stop? How far down the dimensional rabbit hole do we have to go? Researchers should consider two things. First, how important is this variable in your study? If age is not important in your study (maybe it is a control variable), it seems like a waste of time to do a lot of work drawing from developmental theory to conceptualize this variable. A unidimensional measure from zero to dead is all the detail we need. On the other hand, if we were measuring the impact of age on masculinity, conceptualizing our independent variable (age) as multidimensional may provide a richer understanding of its impact on masculinity. Second, your conceptualization will lead directly to your operationalization of the variable, and once your operationalization is complete, make sure someone reading your study could follow how your conceptual definitions informed the measures you chose for your variables.
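The unidimensional/multidimensional distinction can be made concrete with a short sketch. The dimension names and the equal-weight averaging rule below are assumptions chosen purely for illustration; they are not a validated measure of masculinity.

```python
def unidimensional_measure(age_in_years: float) -> float:
    """A unidimensional concept (e.g., chronological age) needs a single measure."""
    return age_in_years

def multidimensional_measure(subscores: dict) -> float:
    """A multidimensional concept combines one subscore per dimension.
    Here: an equal-weight average of hypothetical 0-100 subscores."""
    return sum(subscores.values()) / len(subscores)

# Hypothetical dimension subscores for one respondent
composite = multidimensional_measure({
    "gender_identity": 60.0,
    "gender_performance": 80.0,
    "role_expectations": 70.0,
})
print(composite)  # prints 70.0
```

Notice that the multidimensional version forces two extra decisions the unidimensional one never raises: which dimensions to include, and how to weight them.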

Write a conceptual definition for your independent and dependent variables.

  • Cite and attribute definitions to other scholars, if you use their words.
  • Describe how your definitions are informed by your theoretical framework.
  • Place your definition in conversation with other theories and conceptual definitions commonly used in the literature.
  • Are there multiple dimensions of your variables?
  • Are any of these dimensions important for you to measure?


Do researchers actually know what we’re talking about?

Conceptualization proceeds differently in qualitative research compared to quantitative research. Since qualitative researchers are interested in the understandings and experiences of their participants, it is less important for them to find one fixed definition for a concept before starting to interview or interact with participants. The researcher’s job is to accurately and completely represent how their participants understand a concept, not to test their own definition of that concept.

If you were conducting qualitative research on masculinity, you would likely consult previous literature like Kimmel’s work mentioned above. From your literature review, you may come up with a  working definition  for the terms you plan to use in your study, which can change over the course of the investigation. However, the definition that matters is the definition that your participants share during data collection. A working definition is merely a place to start, and researchers should take care not to think it is the only or best definition out there.

In qualitative inquiry, your participants are the experts (sound familiar, social workers?) on the concepts that arise during the research study. Your job as the researcher is to accurately and reliably collect and interpret their understanding of the concepts they describe while answering your questions. Conceptualization is likely to change over the course of qualitative inquiry as you learn more information from your participants. Indeed, getting participants to comment on, extend, or challenge the definitions and understandings of other participants is a hallmark of qualitative research. This is the opposite of quantitative research, in which definitions must be completely set in stone before the inquiry can begin.

The contrast between qualitative and quantitative conceptualization is instructive for understanding how quantitative methods (and positivist research in general) privilege the knowledge of the researcher over the knowledge of study participants and community members. Positivism holds that the researcher is the “expert,” and can define concepts based on their expert knowledge of the scientific literature. This knowledge is in contrast to the lived experience that participants possess from experiencing the topic under examination day-in, day-out. For this reason, it would be wise to remind ourselves not to take our definitions too seriously and be critical about the limitations of our knowledge.

Conceptualization must be open to revisions, even radical revisions, as scientific knowledge progresses. While I’ve suggested consulting prior scholarly definitions of our concepts, you should not assume that prior, scholarly definitions are more real than the definitions we create. Likewise, we should not think that our own made-up definitions are any more real than any other definition. It would also be wrong to assume that just because definitions exist for some concept, the concept itself exists beyond some abstract idea in our heads. Building on the paradigmatic ideas behind interpretivism and the critical paradigm, the assumption that our abstract concepts exist in some concrete, tangible way is known as reification. Thinking about reification draws attention to the power dynamics behind how we create reality by how we define it.

Returning again to our example of masculinity, think about how our notions of masculinity have developed over the past few decades, and how different and yet so similar they are to patriarchal definitions throughout history. Conceptual definitions become more or less popular based on the power arrangements inside of social science and the broader world. Western knowledge systems are privileged, while others are viewed as unscientific and marginal. The historical domination of social science by white men from WEIRD countries meant that definitions of masculinity were imbued with their cultural biases and were designed, explicitly and implicitly, to preserve their power. This has inspired movements for cognitive justice as we seek to use social science to achieve global development.

Key Takeaways

  • Measurement is the process by which we describe and ascribe meaning to the key facts, concepts, or other phenomena that we are investigating.
  • Kaplan identified three categories of things that social scientists measure: observational terms, indirect observables, and constructs.
  • Some concepts have multiple elements or dimensions.
  • Researchers often use measures previously developed and studied by other researchers.
  • Conceptualization is a process that involves coming up with clear, concise definitions.
  • Conceptual definitions are based on the theoretical framework you are using for your study (and the paradigmatic assumptions underlying those theories).
  • Whether your conceptual definitions come from your own ideas or the literature, you should be able to situate them in terms of other commonly used conceptual definitions.
  • Researchers should acknowledge the limited explanatory power of their definitions for concepts and how oppression can shape what explanations are considered true or scientific.

Think historically about the variables in your research question.

  • How has the conceptual definition of your topic changed over time?
  • What scholars or social forces were responsible for this change?

Take a critical look at your conceptual definitions.

  • How might participants define terms for themselves differently, based on their daily experience?
  • On what cultural assumptions are your conceptual definitions based?
  • Are your conceptual definitions applicable across all cultures that will be represented in your sample?

11.2 Operational definitions

Learning Objectives

  • Define and give an example of indicators and attributes for a variable
  • Apply the three components of an operational definition to a variable
  • Distinguish between levels of measurement for a variable and how those differences relate to measurement
  • Describe the purpose of composite measures like scales and indices

Conceptual definitions are like dictionary definitions. They tell you what a concept means by defining it using other concepts. In this section we will move from the abstract realm (theory) to the real world (measurement). Operationalization is the process by which researchers spell out precisely how a concept will be measured in their study. It involves identifying the specific research procedures we will use to gather data about our concepts. If conceptually defining your terms means looking at theory, how do you operationally define your terms? By looking for indicators of when your variable is present or not, more or less intense, and so forth. Operationalization is probably the most challenging part of quantitative research, but once it’s done, the design and implementation of your study will be straightforward.


Operationalization works by identifying specific indicators that will be taken to represent the ideas we are interested in studying. If we are interested in studying masculinity, then the indicators for that concept might include some of the social roles prescribed to men in society, such as breadwinning or fatherhood. Being a breadwinner or a father might therefore be considered indicators of a person’s masculinity. The extent to which a man fulfills either, or both, of these roles might be understood as clues (or indicators) about the extent to which he is viewed as masculine.

Let’s look at another example of indicators. Each day, Gallup researchers poll 1,000 randomly selected Americans to ask them about their well-being. To measure well-being, Gallup asks these people to respond to questions covering six broad areas: physical health, emotional health, work environment, life evaluation, healthy behaviors, and access to basic necessities. Gallup uses these six factors as indicators of the concept that they are really interested in, which is well-being .

Identifying indicators can be even simpler than the examples described thus far. Political party affiliation is another relatively easy concept for which to identify indicators. If you asked a person what party they voted for in the last national election (or gained access to their voting records), you would get a good indication of their party affiliation. Of course, some voters split tickets between multiple parties when they vote and others swing from party to party each election, so our indicator is not perfect. Indeed, if our study were about political identity as a key concept, operationalizing it solely in terms of who they voted for in the previous election leaves out a lot of information about identity that is relevant to that concept. Nevertheless, it’s a pretty good indicator of political party affiliation.

Choosing indicators is not an arbitrary process. As described earlier, utilizing prior theoretical and empirical work in your area of interest is a great way to identify indicators in a scholarly manner, and your conceptual definitions will point you toward relevant indicators. Empirical work will give you some very specific examples of how the important concepts in an area have been measured in the past and what sorts of indicators have been used. Often, it makes sense to use the same indicators as previous researchers; however, you may find that some previous measures have potential weaknesses that your own study can improve upon.

All of the examples in this chapter have dealt with questions you might ask a research participant on a survey or in a quantitative interview. If you plan to collect data from other sources, such as through direct observation or the analysis of available records, think practically about what the design of your study might look like and how you can collect data on various indicators feasibly. If your study asks about whether the participant regularly changes the oil in their car, you will likely not observe them directly doing so. Instead, you will likely need to rely on a survey question that asks them the frequency with which they change their oil or ask to see their car maintenance records.

  • What indicators are commonly used to measure the variables in your research question?
  • How can you feasibly collect data on these indicators?
  • Are you planning to collect your own data using a questionnaire or interview? Or are you planning to analyze available data like client files or raw data shared from another researcher’s project?

Remember, you need raw data. Your research project cannot rely solely on the results reported by other researchers or the arguments you read in the literature. A literature review is only the first part of a research project, and your review of the literature should inform the indicators you end up choosing when you measure the variables in your research question.

Unlike conceptual definitions, which define a concept using other concepts, an operational definition consists of three components: (1) the variable being measured and its attributes, (2) the measure you will use, and (3) how you plan to interpret the data collected from that measure to draw conclusions about the variable you are measuring.

Step 1: Specifying variables and attributes

The first component, the variable, should be the easiest part. At this point in quantitative research, you should have a research question that has at least one independent and at least one dependent variable. Remember that variables must be able to vary. For example, the United States is not a variable. Country of residence is a variable, as is patriotism. Similarly, if your sample only includes men, gender is a constant in your study, not a variable. A  constant is a characteristic that does not change in your study.

When social scientists measure concepts, they sometimes use the language of variables and attributes. A  variable refers to a quality or quantity that varies across people or situations. Attributes  are the characteristics that make up a variable. For example, the variable hair color would contain attributes like blonde, brown, black, red, gray, etc. A variable’s attributes determine its level of measurement. There are four possible levels of measurement: nominal, ordinal, interval, and ratio. The first two levels of measurement are  categorical , meaning their attributes are categories rather than numbers. The latter two levels of measurement are  continuous , meaning their attributes are numbers.


Levels of measurement

Hair color is an example of a nominal level of measurement. Nominal measures are categorical, and those categories cannot be mathematically ranked. As a brown-haired person (with some gray), I can’t say for sure that brown-haired people are better than blonde-haired people. As with all nominal levels of measurement, there is no ranking order between hair colors; they are simply different. That is what constitutes the nominal level of measurement. Gender and race are also measured at the nominal level.

What attributes are contained in the variable hair color? While blonde, brown, black, and red are common colors, some people may not fit into these categories if we only list these attributes. My wife, who currently has purple hair, wouldn’t fit anywhere. This means that our attributes were not exhaustive. Exhaustiveness means that all possible attributes are listed. We may have to list a lot of colors before we meet the criterion of exhaustiveness. Clearly, there is a point at which exhaustiveness has been reasonably met. If a person insists that their hair color is light burnt sienna, it is not your responsibility to list that as an option. Rather, that person would reasonably be described as brown-haired. Perhaps listing a category for other color would suffice to make our list of colors exhaustive.

What about a person who has multiple hair colors at the same time, such as red and black? They would fall into multiple attributes. This violates the rule of  mutual exclusivity , in which a person cannot fall into two different attributes. Instead of listing all of the possible combinations of colors, perhaps you might include a  multi-color  attribute to describe people with more than one hair color.

Making sure attribute lists are mutually exclusive and exhaustive is about making sure all people are represented in the data record. For many years, the attributes for gender were only male or female. Now, our understanding of gender has evolved to encompass more attributes that better reflect the diversity in the world. Similarly, children of parents from different races were often classified as one race or the other, even if they identified with both cultures. The option for bi-racial or multi-racial on a survey not only more accurately reflects the racial diversity in the real world but also validates and acknowledges people who identify in that manner. If we did not measure race in this way, we would leave the data record empty for people who identify as biracial or multiracial, impairing our search for truth.
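
To make the two rules concrete, here is a minimal sketch (in Python, with a hypothetical attribute list and coding function) of how a researcher might code hair color so that the categories stay exhaustive and mutually exclusive:

```python
# Hypothetical attribute list for the variable "hair color".
# "multi-color" preserves mutual exclusivity for people with several
# colors at once; "other" keeps the list exhaustive for colors we did
# not anticipate (like light burnt sienna or purple).
ATTRIBUTES = {"blonde", "brown", "black", "red", "gray", "multi-color"}

def code_hair_color(response):
    """Map a raw response onto exactly one attribute."""
    if "/" in response:  # e.g., "red/black" reported together
        return "multi-color"
    return response if response in ATTRIBUTES else "other"

print(code_hair_color("purple"))     # other
print(code_hair_color("red/black"))  # multi-color
print(code_hair_color("brown"))      # brown
```

Every possible response lands in exactly one attribute, which is all that exhaustiveness and mutual exclusivity require.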

Unlike nominal-level measures, attributes at the  ordinal  level can be rank ordered. For example, someone’s degree of satisfaction in their romantic relationship can be ordered by rank. That is, you could say you are not at all satisfied, a little satisfied, moderately satisfied, or highly satisfied. Note that even though these have a rank order to them (not at all satisfied is certainly worse than highly satisfied), we cannot calculate a mathematical distance between those attributes. We can simply say that one attribute of an ordinal-level variable is more or less than another attribute.

This can get a little confusing when using rating scales . If you have ever taken a customer satisfaction survey or completed a course evaluation for school, you are familiar with rating scales. “On a scale of 1-5, with 1 being the lowest and 5 being the highest, how likely are you to recommend our company to other people?” That surely sounds familiar. Rating scales use numbers, but only as a shorthand, to indicate what attribute (highly likely, somewhat likely, etc.) the person feels describes them best. You wouldn’t say you are “2” likely to recommend the company, but you would say you are not very likely to recommend the company. Ordinal-level attributes must also be exhaustive and mutually exclusive, as with nominal-level variables.

At the interval level, attributes must also be exhaustive and mutually exclusive, and there is an equal distance between attributes. Interval measures are also continuous, meaning their attributes are numbers rather than categories. IQ scores are interval level, as are temperatures in Fahrenheit and Celsius. Their defining characteristic is that we can say how much more or less one attribute differs from another. We cannot, however, say with certainty what the ratio of one attribute is in comparison to another. For example, it would not make sense to say that a person with an IQ score of 140 has twice the IQ of a person with a score of 70. However, the difference between IQ scores of 80 and 100 is the same as the difference between IQ scores of 120 and 140.

While we cannot say that someone with an IQ of 140 is twice as intelligent as someone with an IQ of 70 because IQ is measured at the interval level, we can say that someone with six siblings has twice as many as someone with three because number of siblings is measured at the ratio level. Finally, at the ratio   level, attributes are mutually exclusive and exhaustive, attributes can be rank ordered, the distance between attributes is equal, and attributes have a true zero point. Thus, with these variables, we can  say what the ratio of one attribute is in comparison to another. Examples of ratio-level variables include age and years of education. We know that a person who is 12 years old is twice as old as someone who is 6 years old. Height measured in meters and weight measured in kilograms are good examples. So are counts of discrete objects or events such as the number of siblings one has or the number of questions a student answers correctly on an exam. The differences between each level of measurement are visualized in Table 11.1.

Table 11.1 Criteria for Different Levels of Measurement

                                    Nominal   Ordinal   Interval   Ratio
Exhaustive                             X         X         X         X
Mutually exclusive                     X         X         X         X
Rank-ordered                                     X         X         X
Equal distance between attributes                          X         X
True zero point                                                      X

Levels of measurement = levels of specificity

We have spent time learning how to determine our data’s level of measurement. Now what? How can we use this information as we measure concepts and develop measurement tools? First, the types of statistical tests we are able to use depend on our data’s level of measurement. With nominal-level measurement, for example, the only available measure of central tendency is the mode. With ordinal-level measurement, the median or mode can be used as indicators of central tendency. Interval and ratio-level measurement are typically considered the most desirable because they permit any measure of central tendency to be computed (i.e., mean, median, or mode). Also, ratio-level measurement is the only level that allows meaningful statements about ratios of scores. The higher the level of measurement, the more complex the statistical tests we are able to conduct. This knowledge may help us decide what kind of data we need to gather, and how.
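
A short sketch of how the level of measurement constrains which measures of central tendency are appropriate (the data here are hypothetical):

```python
from statistics import mean, median, mode

# Hypothetical responses at three levels of measurement.
hair_color   = ["brown", "blonde", "brown", "black"]  # nominal
satisfaction = [1, 3, 2, 3]  # ordinal codes: 1 = low ... 3 = high
num_siblings = [0, 3, 1, 2]  # ratio

# Nominal: only the mode is meaningful.
print(mode(hair_color))      # brown

# Ordinal: the median (position-based) or the mode, but not the mean,
# since distances between ordinal codes are undefined.
print(median(satisfaction))  # 2.5

# Interval/ratio: mean, median, and mode are all meaningful.
print(mean(num_siblings))    # 1.5
```

Nothing in Python stops you from computing `mean(satisfaction)`; the constraint is conceptual, which is exactly why researchers must know their variable’s level before choosing statistics.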

That said, we have to balance this knowledge with the understanding that sometimes collecting data at a higher level of measurement can negatively impact our studies. For instance, providing answers in ranges may make prospective participants feel more comfortable responding to sensitive items. Imagine that you were interested in collecting information on topics such as income, number of sexual partners, or number of times someone used illicit drugs. You would have to think about the sensitivity of these items and determine whether it would make more sense to collect some data at a lower level of measurement (e.g., asking whether they are sexually active or not (nominal) versus their total number of sexual partners (ratio)).

Finally, when analyzing data, researchers sometimes find a need to change a variable’s level of measurement. For example, a few years ago, a student was interested in studying the relationship between mental health and life satisfaction. The student used a variety of measures, including one item that asked for the actual number of mental health symptoms. When analyzing the data, the student examined the mental health symptom variable and noticed two groups: those with zero or one symptom and those with many symptoms. Instead of using the ratio-level data (the actual number of mental health symptoms), she collapsed the cases into two categories, few and many, and used this variable in her analyses. It is important to note that you can move data from a higher level of measurement to a lower level; however, you cannot move from a lower level to a higher level.
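
The recoding described in this example can be sketched as follows; the symptom counts and the cutoff between “few” and “many” are hypothetical:

```python
# Raw ratio-level data: number of mental health symptoms per participant.
symptom_counts = [0, 1, 0, 5, 7, 1, 6, 8]

def collapse(count, cutoff=2):
    """Recode a ratio-level count into an ordinal category."""
    return "few" if count < cutoff else "many"

collapsed = [collapse(c) for c in symptom_counts]
print(collapsed)
# ['few', 'few', 'few', 'many', 'many', 'few', 'many', 'many']

# The reverse move is impossible: "many" cannot recover the raw count,
# which is why you can always move down but never up a level of measurement.
```
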

  • Check that the variables in your research question can vary…and that they are not constants or one of many potential attributes of a variable.
  • Think about the attributes your variables have. Are they categorical or continuous? What level of measurement seems most appropriate?


Step 2: Specifying measures for each variable

Let’s pick a social work research question and walk through the process of operationalizing variables to see how specific we need to get. I’m going to hypothesize that residents of a psychiatric unit who are more depressed are less likely to be satisfied with care. Remember, this would be an inverse relationship: as depression increases, satisfaction decreases. In this question, depression is my independent variable (the cause) and satisfaction with care is my dependent variable (the effect). Now that we have identified our variables, their attributes, and their levels of measurement, we can move on to the second component: the measure itself.

So, how would you measure my key variables: depression and satisfaction? What indicators would you look for? Some students might say that depression could be measured by observing a participant’s body language. They may also say that a depressed person will often express feelings of sadness or hopelessness. In addition, a satisfied person might be happy around service providers and often express gratitude. While these factors may indicate that the variables are present, they lack coherence. What this “measure” is actually saying is, “I know depression and satisfaction when I see them.” While you are likely a decent judge of depression and satisfaction, a research study requires more information about how you plan to measure your variables. Your judgments are subjective, based on your own idiosyncratic experiences with depression and satisfaction; they couldn’t be replicated by another researcher, nor applied consistently to a large group of people. Operationalization requires that you come up with a specific and rigorous measure for determining who is depressed or satisfied.

Finding a good measure for your variable depends on the kind of variable it is. Variables that are directly observable don’t come up very often in my students’ classroom projects, but they might include things like taking someone’s blood pressure or marking attendance or participation in a group. To measure an indirectly observable variable like age, you would probably put a question on a survey that asked, “How old are you?” Measuring a variable like income might require some more thought, though. Are you interested in this person’s individual income or the income of their family unit? This might matter if your participant does not work or is dependent on other family members for income. Do you count income from social welfare programs? Are you interested in their income per month or per year? Even though indirect observables are relatively easy to measure, the measures you use must be clear in what they are asking, and operationalization is all about figuring out the specifics of what you want to know. For more complicated constructs, you will need composite measures that use multiple indicators to measure a single variable.

How you plan to collect your data also influences how you will measure your variables. For social work researchers using secondary data like client records as a data source, you are limited by what information is in the data sources you can access. If your organization uses a given measurement for a mental health outcome, that is the one you will use in your study. Similarly, if you plan to study how long a client was housed after an intervention using client visit records, you are limited by how their caseworker recorded their housing status in the chart. One of the benefits of collecting your own data is being able to select the measures you feel best exemplify your understanding of the topic.

Measuring unidimensional concepts

The previous section mentioned two important considerations: how complicated the variable is and how you plan to collect your data. With these in hand, we can use the level of measurement to further specify how you will measure your variables and consider specialized rating scales developed by social science researchers.

Measurement at each level

Nominal measures assess categorical variables. These measures are used for variables or indicators that have mutually exclusive attributes, but that cannot be rank-ordered. Nominal measures ask about the variable and provide names or labels for different attribute values like social work, counseling, and nursing for the variable profession. Nominal measures are relatively straightforward.

Ordinal measures often use a rating scale, which is an ordered set of responses that participants must choose from. Figure 11.1 shows several examples. The number of response options on a typical rating scale is usually five or seven, though it can range from three to eleven. Five-point scales are best for unipolar scales where only one construct is tested, such as frequency (Never, Rarely, Sometimes, Often, Always). Seven-point scales are best for bipolar scales where there is a dichotomous spectrum, such as liking (Like very much, Like somewhat, Like slightly, Neither like nor dislike, Dislike slightly, Dislike somewhat, Dislike very much). For bipolar questions, it is useful to offer an earlier question that branches respondents into an area of the scale; if asking about liking ice cream, first ask, “Do you generally like or dislike ice cream?” Once the respondent chooses like or dislike, refine their answer by offering the relevant choices from the seven-point scale. Branching improves both reliability and validity (Krosnick & Berent, 1993). [9] Although you often see scales with numerical labels, it is best to present only verbal labels to respondents and convert them to numerical values in the analyses. Avoid partial, lengthy, or overly specific labels. In some cases, the verbal labels can be supplemented with (or even replaced by) meaningful graphics. The last rating scale shown in Figure 11.1 is a visual-analog scale, on which participants make a mark somewhere along a horizontal line to indicate the magnitude of their response.


Interval measures are those whose values are not only rank-ordered but also equidistant from adjacent attributes. An example is the temperature scale (in Fahrenheit or Celsius), where the difference between 30 and 40 degrees Fahrenheit is the same as that between 80 and 90 degrees Fahrenheit. Likewise, if you have a scale that asks for respondents’ annual income using ranges ($0 to $10,000, $10,000 to $20,000, $20,000 to $30,000, and so forth), this is also an interval measure, because the mid-points of each range ($5,000, $15,000, $25,000, etc.) are equidistant from each other. The intelligence quotient (IQ) scale is also an interval measure, because it is designed such that the difference between IQ scores of 100 and 110 is supposed to be the same as between 110 and 120 (although we do not really know whether that is truly the case). Interval measures allow us to examine how much more one attribute is when compared to another, which is not possible with nominal or ordinal measures. You may find researchers who “pretend” (incorrectly) that ordinal rating scales are actually interval measures so that they can use different statistical techniques for analyzing them. As we will discuss later in the chapter, this is a mistake because there is no way to know whether the difference between a 3 and a 4 on a rating scale is the same as the difference between a 2 and a 3. Those numbers are just placeholders for categories.

Ratio measures have all the qualities of nominal, ordinal, and interval scales and, in addition, a “true zero” point (where the value zero implies the lack or non-availability of the underlying construct). Think about how to measure the number of people working in human resources at a social work agency. It could be one, several, or none (if the agency contracts out for those services). Measuring interval and ratio data is relatively easy, as people either select or input a number for their answer. If you ask a person how many eggs they purchased last week, they can simply tell you they purchased a dozen eggs at the store, two at breakfast on Wednesday, or none at all.

Commonly used rating scales in questionnaires

The level of measurement will give you the basic information you need, but social scientists have also developed specialized instruments for use in questionnaires, a common tool in quantitative research. As we mentioned before, if you plan to source your data from client files or previously published results, you are limited to the measures contained in those sources.

Although “Likert scale” is a term colloquially used to refer to almost any rating scale (e.g., a 0-to-10 life satisfaction scale), it has a much more precise meaning. In the 1930s, researcher Rensis Likert (pronounced LICK-ert) created a new approach for measuring people’s attitudes (Likert, 1932). [10] It involves presenting people with several statements, including both favorable and unfavorable statements, about some person, group, or idea. Respondents then express their agreement or disagreement with each statement on a 5-point scale: Strongly Agree, Agree, Neither Agree nor Disagree, Disagree, Strongly Disagree. Numbers are assigned to each response and then summed across all items to produce a score representing the attitude toward the person, group, or idea. For items that are phrased in the opposite direction (e.g., negatively worded statements instead of positively worded statements), reverse coding is used so that the numerical scoring of those statements also runs in the opposite direction. The entire set of items came to be called a Likert scale, as shown in Table 11.2 below.

Unless you are measuring people’s attitude toward something by assessing their level of agreement with several statements about it, it is best to avoid calling it a Likert scale. You are probably just using a rating scale. Likert scales allow for more granularity (more finely tuned response) than yes/no items, including whether respondents are neutral to the statement. Below is an example of how we might use a Likert scale to assess your attitudes about research as you work your way through this textbook.

Table 11.2 Likert scale (response options for each item: Strongly Agree, Agree, Neither Agree nor Disagree, Disagree, Strongly Disagree)
I like research more now than when I started reading this book.
This textbook is easy to use.
I feel confident about how well I understand levels of measurement.
This textbook is helping me plan my research proposal.
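
Scoring a Likert scale like the one above can be sketched as follows; the item keys, the respondent’s answers, and the choice of which item is reverse coded are all hypothetical:

```python
SCALE_MAX = 5  # responses coded 1 = Strongly Disagree ... 5 = Strongly Agree

# One respondent's answers (hypothetical items and values).
responses = {
    "likes_research_more": 4,
    "textbook_easy_to_use": 5,
    "textbook_is_confusing": 2,  # negatively worded item
}

# Negatively worded items are reverse coded so that a higher number
# always means a more favorable attitude.
REVERSE_CODED = {"textbook_is_confusing"}

def item_score(item, value):
    # Reverse coding flips 1<->5 and 2<->4: score = (max + 1) - value.
    return (SCALE_MAX + 1 - value) if item in REVERSE_CODED else value

total = sum(item_score(i, v) for i, v in responses.items())
print(total)  # 4 + 5 + (6 - 2) = 13
```

The summed score, not any single item, represents the respondent’s attitude, which is what distinguishes a true Likert scale from a lone rating-scale item.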

Semantic differential scales are composite (multi-item) scales in which respondents are asked to indicate their opinions or feelings toward a single statement using pairs of adjectives framed as polar opposites. Whereas in the Likert scale above the participant is asked how much they agree or disagree with a statement, in a semantic differential scale the participant is asked to indicate how they feel about a specific item. This makes the semantic differential scale an excellent technique for measuring people’s attitudes or feelings toward objects, events, or behaviors. Table 11.3 is an example of a semantic differential scale that was created to assess participants’ feelings about this textbook.

Table 11.3 Semantic differential scale

             Very much   Somewhat   Neither   Somewhat   Very much
Boring                                                               Exciting
Useless                                                              Useful
Hard                                                                 Easy
Irrelevant                                                           Applicable

The Guttman scale is another composite scale, designed by Louis Guttman, that uses a series of items arranged in increasing order of intensity (least intense to most intense) of the concept. This type of scale allows us to understand the intensity of beliefs or feelings. Each item in the example Guttman scale below has a weight (not indicated on the tool) that varies with the intensity of that item, and the weighted combination of the responses is used as an aggregate measure of an observation.

Example Guttman Scale Items

  • I often felt the material was not engaging                               Yes/No
  • I was often thinking about other things in class                     Yes/No
  • I was often working on other tasks during class                     Yes/No
  • I will work to abolish research from the curriculum              Yes/No

Notice how the items move from lower intensity to higher intensity. A researcher reviews the yes answers and creates a score for each participant.
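
A simple (unweighted) version of this scoring can be sketched as counting the “yes” answers and checking that the response pattern is cumulative; the respondent’s answers shown here are hypothetical:

```python
# One respondent's yes/no answers to the four items above, ordered
# from least to most intense.
answers = [True, True, False, False]

# Unweighted Guttman score: the number of items endorsed.
score = sum(answers)
print(score)  # 2: the respondent endorsed the two least intense items

def is_cumulative(pattern):
    """A valid Guttman pattern has no 'yes' after the first 'no'."""
    return all(earlier or not later
               for earlier, later in zip(pattern, pattern[1:]))

print(is_cumulative(answers))                     # True
print(is_cumulative([True, False, True, False]))  # False: violates ordering
```

A pattern that endorses a more intense item while rejecting a less intense one violates the cumulative structure and signals a problem with either the respondent’s data or the item ordering.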

Composite measures: Scales and indices

Depending on your research design, your measure may be something you put on a survey or pre/post-test that you give to your participants. For a variable like age or income, one well-worded question may suffice. Unfortunately, most variables in the social world are not so simple. Depression and satisfaction are multidimensional concepts. Relying on a single indicator, like a question that asks, “Yes or no, are you depressed?” does not encompass the complexity of depression, including issues with mood, sleeping, eating, relationships, and happiness. There is no easy way to delineate between multidimensional and unidimensional concepts; it’s all in how you think about your variable. Satisfaction could be validly measured using a unidimensional ordinal rating scale. However, if satisfaction were a key variable in our study, we would need a theoretical framework and conceptual definition for it. That means we’d probably have more indicators to ask about, like timeliness, respect, and sensitivity, and we would want our study to say something about what satisfaction truly means in terms of our other key variables. However, if satisfaction is not a key variable in your conceptual framework, it makes sense to operationalize it as a unidimensional concept.

For more complicated measures, researchers use scales and indices (sometimes called indexes), which assess multiple indicators to develop a composite (or total) score. Composite scores provide a much greater understanding of concepts than a single item could. Although we won’t delve too deeply into the process of scale development, we will cover some important topics to help you understand how scales and indices developed by other researchers can be used in your project.

Although scales and indices differ in ways we will discuss later, they have several features in common:

  • Both are ordinal measures of variables.
  • Both can order the units of analysis in terms of specific variables.
  • Both are composite measures .


The previous section discussed how to measure respondents’ responses to predesigned items or indicators belonging to an underlying construct. But how do we create the indicators themselves? The process of creating the indicators is called scaling. More formally, scaling is a branch of measurement that involves the construction of measures by associating qualitative judgments about unobservable constructs with quantitative, measurable metric units. Stevens (1946) [11] said, “Scaling is the assignment of objects to numbers according to a rule.” This process of measuring abstract concepts in concrete terms remains one of the most difficult tasks in empirical social science research.

The outcome of a scaling process is a scale, which is an empirical structure for measuring items or indicators of a given construct. Understand that multidimensional “scales,” as discussed in this section, are a little different from the “rating scales” discussed in the previous section. A rating scale is used to capture a respondent’s reaction to a given item on a questionnaire. For example, an ordinally scaled item captures a value from “strongly disagree” to “strongly agree.” Attaching a rating scale to a statement or instrument is not scaling. Rather, scaling is the formal process of developing scale items, before rating scales can be attached to those items.

If creating your own scale sounds painful, don’t worry! For most multidimensional variables, you would likely be duplicating work that has already been done by other researchers in psychometrics, the branch of science devoted to psychological measurement. You do not need to create a scale for depression because scales such as the Patient Health Questionnaire (PHQ-9), the Center for Epidemiologic Studies Depression Scale (CES-D), and Beck’s Depression Inventory (BDI) have been developed and refined over decades to measure variables like depression. Similarly, scales such as the Patient Satisfaction Questionnaire (PSQ-18) have been developed to measure satisfaction with medical care. As we will discuss in the next section, these scales have been shown to be reliable and valid. While you could create a new scale to measure depression or satisfaction, a rigorous study would pilot test and refine that new scale over time to make sure it measures the concept accurately and consistently. This level of rigor is often unachievable in student research projects because of the cost and time involved in pilot testing and validation, so using existing scales is recommended.

Unfortunately, there is no good one-stop shop for psychometric scales. The Mental Measurements Yearbook provides a searchable database of measures for social science variables, though it is woefully incomplete and often does not contain the full documentation for the scales in its database. You can access it from a university library’s list of databases. If you can’t find anything there, your next stop should be the methods section of the articles in your literature review. The methods section of each article will detail how the researchers measured their variables, and often the results section is instructive for understanding more about measures. In a quantitative study, researchers may have used a scale to measure key variables and will provide a brief description of that scale, its name, and maybe a few example questions. If you need more information, look at the results section and tables discussing the scale to get a better idea of how the measure works. Looking beyond the articles in your literature review, searching Google Scholar using queries like “depression scale” or “satisfaction scale” should also provide some relevant results. For example, when searching for documentation for the Rosenberg Self-Esteem Scale (which we will discuss in the next section), I found a report from researchers investigating acceptance and commitment therapy which details this scale and many others used to assess mental health outcomes. If you find the name of the scale somewhere but cannot find the documentation (all questions and answers plus how to interpret the scale), a general web search with the name of the scale and “.pdf” may bring you to what you need. Or, to get professional help with finding information, always ask a librarian!

Unfortunately, these approaches do not guarantee that you will be able to view the scale itself or get information on how it is interpreted. Many scales cost money to use and may require training to properly administer. You may also find scales that are related to your variable but would need to be slightly modified to match your study’s needs. You could adapt a scale to fit your study; however, changing even small parts of a scale can influence its accuracy and consistency. While it is perfectly acceptable in student projects to adapt a scale without testing it first (time may not allow you to do so), pilot testing is always recommended for adapted scales, and researchers seeking to draw valid conclusions and publish their results must take this additional step.

An index is a composite score derived from aggregating measures of multiple concepts (called components) using a set of rules and formulas. It is different from a scale. Scales also aggregate measures; however, these measures examine different dimensions or the same dimension of a single construct. A well-known example of an index is the consumer price index (CPI), which is computed every month by the Bureau of Labor Statistics of the U.S. Department of Labor. The CPI is a measure of how much consumers have to pay for goods and services (in general) and is divided into eight major categories (food and beverages, housing, apparel, transportation, healthcare, recreation, education and communication, and “other goods and services”), which are further subdivided into more than 200 smaller items. Each month, government employees call all over the country to get the current prices of more than 80,000 items. Using a complicated weighting scheme that takes into account the location and probability of purchase for each item, analysts then combine these prices into an overall index score using a series of formulas and rules.
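The weighting logic behind an index like the CPI can be sketched in a few lines. The categories, weights, and price relatives below are invented for illustration; the real CPI combines tens of thousands of items with far more elaborate formulas:

```python
# Sketch of a weighted price index, loosely modeled on how the CPI
# combines category prices. Weights and price relatives are hypothetical.

def weighted_index(weights, price_relatives):
    """Combine price relatives (current price / base price * 100) into one
    index score using category weights that sum to 1."""
    assert abs(sum(weights.values()) - 1.0) < 1e-9, "weights must sum to 1"
    return sum(weights[cat] * price_relatives[cat] for cat in weights)

weights = {"food": 0.5, "housing": 0.3, "transportation": 0.2}     # hypothetical
relatives = {"food": 100.0, "housing": 110.0, "transportation": 120.0}

print(weighted_index(weights, relatives))  # 107.0
```

The score of 107.0 would mean prices are, on average, 7% above the base period, with housing and transportation pulling the index up more than their raw price changes alone because of the weighting.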

Another example of an index is the Duncan Socioeconomic Index (SEI). This index is used to quantify a person’s socioeconomic status (SES) and is a combination of three concepts: income, education, and occupation. Income is measured in dollars, education in years or degrees achieved, and occupation is classified into categories or levels by status. These very different measures are combined to create an overall SES index score. However, SES index measurement has generated a lot of controversy and disagreement among researchers.

The process of creating an index is similar to that of a scale. First, conceptualize (define) the index and its constituent components. Though this appears simple, there may be a lot of disagreement on what components (concepts/constructs) should be included or excluded from an index. For instance, in the SES index, isn’t income correlated with education and occupation? And if so, should we include one component only or all three components? Reviewing the literature, using theories, and/or interviewing experts or key stakeholders may help resolve this issue. Second, operationalize and measure each component. For instance, how will you categorize occupations, particularly since some occupations may have changed with time (e.g., there were no Web developers before the Internet)? Third, create a rule or formula for calculating the index score. Again, this process may involve a lot of subjectivity, so validating the index score using existing or new data is important.
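The three steps above can be illustrated with a toy SES-like index. The sample data and the “normalize, then average” rule are invented for the sketch; the actual Duncan SEI uses a different scoring method:

```python
# Toy illustration of index construction for an SES-like index.
# All data and the combination rule are hypothetical.

def min_max(value, values):
    """Rescale a value to the 0-1 range relative to a sample."""
    lo, hi = min(values), max(values)
    return (value - lo) / (hi - lo)

# Steps 1-2: three conceptualized and operationalized components.
incomes   = [30_000, 60_000, 90_000]   # dollars
education = [12, 16, 20]               # years of schooling
prestige  = [30, 50, 70]               # occupational prestige rating

# Step 3: a rule for combining them -- here, the mean of the
# min-max-normalized components for one respondent.
def ses_score(income, edu, occ):
    return (min_max(income, incomes)
            + min_max(edu, education)
            + min_max(occ, prestige)) / 3

print(ses_score(60_000, 16, 50))  # 0.5
```

Normalizing first matters because the components live on wildly different scales; without it, income (measured in tens of thousands) would swamp education (measured in years).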

Scale and index development are often taught as their own course in doctoral education, so it is unreasonable to expect to develop a consistently accurate measure within the span of a week or two. Using available indices and scales is recommended for this reason.

Differences between scales and indices

Though indices and scales yield a single numerical score or value representing a concept of interest, they are different in many ways. First, indices often comprise components that are very different from each other (e.g., income, education, and occupation in the SES index) and are measured in different ways. Conversely, scales typically involve a set of similar items that use the same rating scale (such as a five-point Likert scale about customer satisfaction).

Second, indices often combine objectively measurable values such as prices or income, while scales are designed to assess subjective or judgmental constructs such as attitude, prejudice, or self-esteem. Some argue that the sophistication of the scaling methodology makes scales different from indices, while others suggest that indexing methodology can be equally sophisticated. Nevertheless, indices and scales are both essential tools in social science research.

Scales and indices seem like clean, convenient ways to measure different phenomena in social science, but just like with a lot of research, we have to be mindful of the assumptions and biases underneath. What if a scale or an index was developed using only White women as research participants? Is it going to be useful for other groups? It very well might be, but when using a scale or index on a group for whom it hasn’t been tested, it will be very important to evaluate the validity and reliability of the instrument, which we address in the rest of the chapter.

Finally, it’s important to note that while scales and indices are often made up of nominal or ordinal variables, when we combine them into composite scores, we will treat those scores as interval/ratio variables.

  • Look back at your work from the previous section: are your variables unidimensional or multidimensional?
  • Describe the specific measures you will use (actual questions and response options you will use with participants) for each variable in your research question.
  • If you are using a measure developed by another researcher but do not have all of the questions, response options, and instructions needed to implement it, put it on your to-do list to get them.


Step 3: How you will interpret your measures

The final stage of operationalization involves setting the rules for how the measure works and how the researcher should interpret the results. Sometimes, interpreting a measure can be incredibly easy. If you ask someone their age, you’ll probably interpret the results by noting the raw number (e.g., 22) someone provides and whether it is lower or higher than other people’s ages. However, you could also recode that person into an age category (e.g., under 25 or 25–34) or a generational cohort (e.g., Generation Z). Even scales may be simple to interpret. If there is a scale of problem behaviors, one might simply add up the number of behaviors checked off, with a range from 1–5 indicating low risk of delinquent behavior, 6–10 indicating moderate risk, and so on. How you choose to interpret your measures should be guided by how they were designed, how you conceptualize your variables, the data sources you used, and your plan for analyzing your data statistically. Whatever measure you use, you need a set of rules for how to take any valid answer a respondent provides and interpret it in terms of the variable being measured.

For more complicated measures like scales, refer to the information provided by the author for how to interpret the scale. If you can’t find enough information from the scale’s creator, look at how the results of that scale are reported in the results section of research articles. For example, Beck’s Depression Inventory (BDI-II) uses 21 statements to measure depression and respondents rate their level of agreement on a scale of 0-3. The results for each question are added up, and the respondent is put into one of three categories: low levels of depression (1-16), moderate levels of depression (17-30), or severe levels of depression (31 and over).
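The interpretation rule just described, summing item ratings and mapping the total to a category, can be sketched as a small scoring function. The cutoffs follow the passage above; this is purely illustrative, not a clinical implementation of the BDI-II:

```python
# Sketch of scoring a 21-item scale as described in the text: sum the
# 0-3 item ratings, then map the total to a category. Cutoffs follow the
# passage above; illustrative only, not for clinical use.

def score_depression(item_ratings):
    assert len(item_ratings) == 21, "expects 21 item ratings"
    assert all(0 <= r <= 3 for r in item_ratings), "ratings range from 0 to 3"
    total = sum(item_ratings)
    if total <= 16:
        category = "low"
    elif total <= 30:
        category = "moderate"
    else:
        category = "severe"
    return total, category

print(score_depression([1] * 21))  # (21, 'moderate')
```

Note that the interpretation rule lives alongside the raw sum: a respondent’s total of 21 only means something once the rule converts it into “moderate levels of depression.”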

One common mistake is for students to introduce another variable into their operational definition. Your operational definition should mention only one variable: the variable being defined. While your study will certainly draw conclusions about the relationships between variables, that is not what operationalization is. Operationalization specifies what instrument you will use to measure your variable and how you plan to interpret the data collected using that measure.

Operationalization is probably the trickiest component of basic research methods, so please don’t get frustrated if it takes a few drafts and a lot of feedback to get to a workable definition. At the time of this writing, I am in the process of operationalizing the concept of “attitudes towards research methods.” Originally, I thought that I could gauge students’ attitudes toward research methods by looking at their end-of-semester course evaluations. As I became aware of the potential methodological issues with student course evaluations, I opted to use focus groups of students to measure their common beliefs about research. You may recall some of these opinions from Chapter 1 , such as the common beliefs that research is boring, useless, and too difficult. After the focus group, I created a scale based on the opinions I gathered, and I plan to pilot test it with another group of students. After the pilot test, I expect that I will have to revise the scale again before I can implement the measure in a real social work research project. At the time I’m writing this, I’m still not completely done operationalizing this concept.

  • Operationalization involves spelling out precisely how a concept will be measured.
  • Operational definitions must include the variable, the measure, and how you plan to interpret the measure.
  • There are four different levels of measurement: nominal, ordinal, interval, and ratio (in increasing order of specificity).
  • Scales and indices are common ways to collect information and involve using multiple indicators in measurement.
  • A key difference between a scale and an index is that a scale contains multiple indicators for one concept, whereas an index combines multiple concepts (components).
  • Using scales developed and refined by other researchers can improve the rigor of a quantitative study.

Use the research question that you developed in the previous chapters and find a related scale or index that researchers have used. If you have trouble finding the exact phenomenon you want to study, get as close as you can.

  • What is the level of measurement for each item on each tool? Take a second and think about why the tool’s creator decided to include these levels of measurement. Identify any levels of measurement you would change and why.
  • If these tools don’t exist for what you are interested in studying, why do you think that is?

11.3 Measurement quality

  • Define and describe the types of validity and reliability
  • Assess for systematic error

The previous sections provided insight into measuring concepts in social work research. We discussed the importance of identifying concepts and their corresponding indicators as a way to help us operationalize them. In essence, we now understand that when we think about our measurement process, we must be intentional and thoughtful in the choices that we make. This section is all about how to judge the quality of the measures you’ve chosen for the key variables in your research question.

Reliability

First, let’s say we’ve decided to measure alcoholism by asking people to respond to the following question: Have you ever had a problem with alcohol? If we measure alcoholism this way, then it is likely that anyone who identifies as an alcoholic would respond “yes.” This may seem like a good way to identify our group of interest, but think about how you and your peer group might respond to this question. Would participants respond differently after a wild night out, compared to any other night? Could an infrequent drinker’s current headache from last night’s glass of wine influence how they answer the question this morning? How would that same person respond to the question before consuming the wine? In each case, the same person might respond differently to the same question at different points, so it is possible that our measure of alcoholism has a reliability problem. Reliability in measurement is about consistency.

One common problem of reliability with social scientific measures is memory. If we ask research participants to recall some aspect of their own past behavior, we should try to make the recollection process as simple and straightforward for them as possible. Sticking with the topic of alcohol intake, if we ask respondents how much wine, beer, and liquor they’ve consumed each day over the course of the past 3 months, how likely are we to get accurate responses? Unless a person keeps a journal documenting their intake, there will very likely be some inaccuracies in their responses. On the other hand, we might get more accurate responses if we ask a participant how many drinks of any kind they have consumed in the past week.

Reliability can be an issue even when we’re not reliant on others to accurately report their behaviors. Perhaps a researcher is interested in observing how alcohol intake influences interactions in public locations. They may decide to conduct observations at a local pub by noting how many drinks patrons consume and how their behavior changes as their intake changes. What if the researcher has to use the restroom, and the patron next to them takes three shots of tequila during the brief period the researcher is away from their seat? The reliability of this researcher’s measure of alcohol intake depends on their ability to physically observe every instance of patrons consuming drinks. If they are unlikely to be able to observe every such instance, then perhaps their mechanism for measuring this concept is not reliable.

The following subsections describe the types of reliability that are important for you to know about, but keep in mind that you may see other approaches to judging reliability mentioned in the empirical literature.

Test-retest reliability

When researchers measure a construct that they assume to be consistent across time, then the scores they obtain should also be consistent across time. Test-retest reliability is the extent to which this is actually the case. For example, intelligence is generally thought to be consistent across time. A person who is highly intelligent today will be highly intelligent next week. This means that any good measure of intelligence should produce roughly the same scores for this individual next week as it does today. Clearly, a measure that produces highly inconsistent scores over time cannot be a very good measure of a construct that is supposed to be consistent.

Assessing test-retest reliability requires using the measure on a group of people at one time and then using it again on the same group of people at a later time. Unlike an experiment, you aren’t giving participants an intervention; you are trying to establish a reliable baseline of the variable you are measuring. Once you have these two measurements, you then look at the correlation between the two sets of scores. This is typically done by graphing the data in a scatterplot and computing the correlation coefficient. Figure 11.2 shows the correlation between two sets of scores of several university students on the Rosenberg Self-Esteem Scale, administered twice, a week apart. The correlation coefficient for these data is +.95. In general, a test-retest correlation of +.80 or greater is considered to indicate good reliability.

Figure 11.2: A scatterplot with scores at time 1 on the x-axis and scores at time 2 on the y-axis, both ranging from 0 to 30. The dots indicate a strong, positive correlation.

Again, high test-retest correlations make sense when the construct being measured is assumed to be consistent over time, which is the case for intelligence, self-esteem, and the Big Five personality dimensions. But other constructs are not assumed to be stable over time. The very nature of mood, for example, is that it changes. So a measure of mood that produced a low test-retest correlation over a period of a month would not be a cause for concern.
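The correlation step described above can be sketched directly. The two sets of scores below are invented; in practice you would use a stats package such as `scipy.stats.pearsonr`, but the formula itself is short:

```python
# Test-retest reliability as the Pearson correlation between two
# administrations of the same measure. Scores are hypothetical.
import math

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length score lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

time1 = [22, 25, 18, 30, 27, 15]   # hypothetical self-esteem scores, week 1
time2 = [23, 24, 17, 29, 28, 16]   # same respondents, week 2

print(round(pearson(time1, time2), 2))  # 0.98
```

A coefficient of 0.98 would clear the +.80 benchmark mentioned above, so this hypothetical measure would show good test-retest reliability.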

Internal consistency

Another kind of reliability is internal consistency, which is the consistency of people’s responses across the items on a multiple-item measure. In general, all the items on such measures are supposed to reflect the same underlying construct, so people’s scores on those items should be correlated with each other. On the Rosenberg Self-Esteem Scale, people who agree that they are a person of worth should tend to agree that they have a number of good qualities. If people’s responses to the different items are not correlated with each other, then it would no longer make sense to claim that they are all measuring the same underlying construct. This is as true for behavioral and physiological measures as for self-report measures. For example, people might make a series of bets in a simulated game of roulette as a measure of their level of risk seeking. This measure would be internally consistent to the extent that individual participants’ bets were consistently high or low across trials. A statistic known as Cronbach’s alpha provides a way to measure how well each item on a scale relates to the others.
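Cronbach’s alpha has a compact formula: alpha = k/(k-1) × (1 − sum of item variances / variance of total scores), where k is the number of items. The responses below are invented; real analyses would use a stats package (e.g., the `pingouin` library):

```python
# Minimal Cronbach's alpha from first principles. Data are hypothetical.

def variance(xs):
    """Population variance of a list of numbers."""
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs) / len(xs)

def cronbach_alpha(responses):
    """responses: one row per respondent, one column per scale item."""
    k = len(responses[0])
    items = list(zip(*responses))                       # transpose to columns
    item_var = sum(variance(list(col)) for col in items)
    total_var = variance([sum(row) for row in responses])
    return k / (k - 1) * (1 - item_var / total_var)

# Four respondents answering three perfectly consistent items:
data = [[1, 1, 1], [2, 2, 2], [3, 3, 3], [4, 4, 4]]
print(cronbach_alpha(data))  # 1.0
```

The perfectly consistent toy data produce an alpha of 1.0; real scales rarely approach that, and values of roughly .70 or higher are conventionally treated as acceptable internal consistency.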

Interrater reliability

Many behavioral measures involve significant judgment on the part of an observer or a rater. Interrater reliability is the extent to which different observers are consistent in their judgments. For example, if you were interested in measuring university students’ social skills, you could make video recordings of them as they interacted with another student whom they are meeting for the first time. Then you could have two or more observers watch the videos and rate each student’s level of social skills. To the extent that each participant does, in fact, have some level of social skills that can be detected by an attentive observer, different observers’ ratings should be highly correlated with each other.
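For categorical ratings like the social-skills example, simple percent agreement overstates reliability because raters will agree some of the time by chance. Cohen’s kappa, a standard statistic not named in the text, corrects for chance agreement; the ratings below are invented:

```python
# Interrater reliability for categorical judgments via Cohen's kappa.
# Ratings are hypothetical.
from collections import Counter

def cohens_kappa(rater_a, rater_b):
    """Chance-corrected agreement between two raters' categorical ratings."""
    n = len(rater_a)
    p_observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    # Chance agreement: probability both raters pick the same category at random.
    p_chance = sum(freq_a[c] / n * freq_b[c] / n for c in freq_a)
    return (p_observed - p_chance) / (1 - p_chance)

# Two observers rating four students' social skills as high (1) or low (0):
a = [1, 1, 0, 0]
b = [1, 0, 0, 0]
print(cohens_kappa(a, b))  # 0.5
```

Here the raters agree on 3 of 4 students (75%), but after subtracting the 50% agreement expected by chance, kappa is only 0.5, a much more sober estimate of their consistency.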

Validity

Validity, another key element of assessing measurement quality, is the extent to which the scores from a measure represent the variable they are intended to measure. But how do researchers make this judgment? We have already considered one factor that they take into account—reliability. When a measure has good test-retest reliability and internal consistency, researchers should be more confident that the scores represent what they are supposed to. There has to be more to it, however, because a measure can be extremely reliable but have no validity whatsoever. As an absurd example, imagine someone who believes that people’s index finger length reflects their self-esteem and therefore tries to measure self-esteem by holding a ruler up to people’s index fingers. Although this measure would have extremely good test-retest reliability, it would have absolutely no validity. The fact that one person’s index finger is a centimeter longer than another’s would indicate nothing about which one had higher self-esteem.

Discussions of validity usually divide it into several distinct “types.” But a good way to interpret these types is that they are other kinds of evidence—in addition to reliability—that should be taken into account when judging the validity of a measure.

Face validity

Face validity is the extent to which a measurement method appears “on its face” to measure the construct of interest. Most people would expect a self-esteem questionnaire to include items about whether they see themselves as a person of worth and whether they think they have good qualities. So a questionnaire that included these kinds of items would have good face validity. The finger-length method of measuring self-esteem, on the other hand, seems to have nothing to do with self-esteem and therefore has poor face validity. Although face validity can be assessed quantitatively—for example, by having a large sample of people rate a measure in terms of whether it appears to measure what it is intended to—it is usually assessed informally.

Face validity is at best a very weak kind of evidence that a measurement method is measuring what it is supposed to. One reason is that it is based on people’s intuitions about human behavior, which are frequently wrong. It is also the case that many established measures in psychology work quite well despite lacking face validity. The Minnesota Multiphasic Personality Inventory-2 (MMPI-2) measures many personality characteristics and disorders by having people decide whether each of 567 different statements applies to them—where many of the statements do not have any obvious relationship to the construct that they measure. For example, the items “I enjoy detective or mystery stories” and “The sight of blood doesn’t frighten me or make me sick” both measure the suppression of aggression. In this case, it is not the participants’ literal answers to these questions that are of interest, but rather whether the pattern of the participants’ responses to a series of questions matches those of individuals who tend to suppress their aggression.

Content validity

Content validity is the extent to which a measure “covers” the construct of interest. For example, if a researcher conceptually defines test anxiety as involving both sympathetic nervous system activation (leading to nervous feelings) and negative thoughts, then his measure of test anxiety should include items about both nervous feelings and negative thoughts. Or consider that attitudes are usually defined as involving thoughts, feelings, and actions toward something. By this conceptual definition, a person has a positive attitude toward exercise to the extent that they think positive thoughts about exercising, feel good about exercising, and actually exercise. So to have good content validity, a measure of people’s attitudes toward exercise would have to reflect all three of these aspects. Like face validity, content validity is not usually assessed quantitatively. Instead, it is assessed by carefully checking the measurement method against the conceptual definition of the construct.

Criterion validity

Criterion validity is the extent to which people’s scores on a measure are correlated with other variables (known as criteria) that one would expect them to be correlated with. For example, people’s scores on a new measure of test anxiety should be negatively correlated with their performance on an important school exam. If it were found that people’s scores were in fact negatively correlated with their exam performance, then this would be a piece of evidence that these scores really represent people’s test anxiety. But if it were found that people scored equally well on the exam regardless of their test anxiety scores, then this would cast doubt on the validity of the measure.

A criterion can be any variable that one has reason to think should be correlated with the construct being measured, and there will usually be many of them. For example, one would expect test anxiety scores to be negatively correlated with exam performance and course grades and positively correlated with general anxiety and with blood pressure during an exam. Or imagine that a researcher develops a new measure of physical risk taking. People’s scores on this measure should be correlated with their participation in “extreme” activities such as snowboarding and rock climbing, the number of speeding tickets they have received, and even the number of broken bones they have had over the years. When the criterion is measured at the same time as the construct, criterion validity is referred to as concurrent validity ; however, when the criterion is measured at some point in the future (after the construct has been measured), it is referred to as predictive validity (because scores on the measure have “predicted” a future outcome).

Discriminant validity

Discriminant validity, on the other hand, is the extent to which scores on a measure are not correlated with measures of variables that are conceptually distinct. For example, self-esteem is a general attitude toward the self that is fairly stable over time. It is not the same as mood, which is how good or bad one happens to be feeling right now. So people’s scores on a new measure of self-esteem should not be very highly correlated with their moods. If the new measure of self-esteem were highly correlated with a measure of mood, it could be argued that the new measure is not really measuring self-esteem; it is measuring mood instead.

Increasing the reliability and validity of measures

We have reviewed the types of errors and how to evaluate our measures based on reliability and validity considerations. However, what can we do while selecting or creating our tool so that we minimize the potential for error? Many of our options were covered in our discussion about reliability and validity. Nevertheless, the following list provides a quick summary of things that you should do when creating or selecting a measurement tool. While not all of these steps will be feasible in your project, implement those that fit your research context.

Make sure that you engage in a rigorous literature review so that you understand the concept that you are studying. This means understanding the different ways that your concept may manifest itself. This review should include a search for existing instruments. [12]

  • Do you understand all the dimensions of your concept? Do you have a good understanding of the content dimensions of your concept(s)?
  • What instruments exist? How many items are on the existing instruments? Are these instruments appropriate for your population?
  • Are these instruments standardized? Note: If an instrument is standardized, that means it has been rigorously studied and tested.

Consult content experts to review your instrument. This is a good way to check the face validity of your items. Additionally, content experts can also help you understand the content validity. [13]

  • Do you have access to a reasonable number of content experts? If not, how can you locate them?
  • Did you provide a list of critical questions for your content reviewers to use in the reviewing process?

Pilot test your instrument on a sufficient number of people and get detailed feedback. [14] Ask your group to provide feedback on the wording and clarity of items. Keep detailed notes and make adjustments BEFORE you administer your final tool.

  • How many people will you use in your pilot testing?
  • How will you set up your pilot testing so that it mimics the actual process of administering your tool?
  • How will you receive feedback from your pilot testing group? Have you provided a list of questions for your group to think about?

Provide training for anyone collecting data for your project. [15] You should provide those helping you with a written research protocol that explains all of the steps of the project. You should also problem solve and answer any questions that those helping you may have. This will increase the chances that your tool will be administered in a consistent manner.

  • How will you conduct your orientation/training? How long will it be? What modality?
  • How will you select those who will administer your tool? What qualifications do they need?

When writing items, use a higher level of measurement, if possible. [16] This will provide more information, and you can always downgrade to a lower level of measurement later.

  • Have you examined your items and the levels of measurement?
  • Have you thought about whether you need to modify the type of data you are collecting? Specifically, are you asking for information that is too specific (at a higher level of measurement) which may reduce participants’ willingness to participate?

Use multiple indicators for a variable. [17] Think about the number of items that you will include in your tool.

  • Do you have enough items? Enough indicators? The correct indicators?

Conduct an item-by-item assessment of multiple-item measures. [18] When you do this assessment, think about each word and how it changes the meaning of your item.

  • Are there items that are redundant? Do you need to modify, delete, or add items?


Types of error

As you can see, measures never perfectly describe what exists in the real world. Good measures demonstrate validity and reliability but will always have some degree of error. Systematic error (also called bias) causes our measures to consistently output incorrect data in one direction or another on a measure, usually due to an identifiable process. Imagine you created a measure of height, but you didn’t put an option for anyone over six feet tall. If you gave that measure to your local college or university, some of the taller students might not be measured accurately. In fact, you would be under the mistaken impression that the tallest person at your school was six feet tall, when in actuality there are likely people taller than six feet at your school. This error seems innocent, but if you were using that measure to help you build a new building, those people might hit their heads!
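
A ceiling like the six-foot cutoff above is a classic source of systematic error. A minimal sketch, using made-up heights in inches, shows how the bias runs in only one direction:

```python
# Simulated true heights in inches; the flawed measure cannot record
# anything over six feet (72 inches), so taller people are capped at 72.
true_heights = [64, 68, 72, 75, 79]            # two people are over six feet
measured = [min(h, 72) for h in true_heights]  # ceiling at 72 inches

print(max(true_heights))                     # 79: the actual tallest person
print(max(measured))                         # 72: what the flawed measure reports
print(sum(true_heights) / len(true_heights))  # 71.6 (true mean)
print(sum(measured) / len(measured))          # 69.6 (systematically biased low)
```

Every error the ceiling introduces pushes the estimate downward, never upward: that consistent direction is what makes it systematic rather than random.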

A less innocent form of error arises when researchers word questions in a way that might cause participants to think one answer choice is preferable to another. For example, if I were to ask you “Do you think global warming is caused by human activity?” you would probably feel comfortable answering honestly. But what if I asked you “Do you agree with 99% of scientists that global warming is caused by human activity?” Would you feel comfortable saying no, if that’s what you honestly felt? I doubt it. That is an example of a  leading question , a question with wording that influences how a participant responds. We’ll discuss leading questions and other problems in question wording in greater detail in Chapter 12 .

In addition to error created by the researcher, your participants can cause error in measurement. Some people will respond without fully understanding a question, particularly if the question is worded in a confusing way. Let’s consider another potential source of error. If we asked people if they always washed their hands after using the bathroom, would we expect people to be perfectly honest? Polling people about whether they wash their hands after using the bathroom might only elicit what people would like others to think they do, rather than what they actually do. This is an example of social desirability bias, in which participants in a research study want to present themselves in a positive, socially desirable way to the researcher. People in your study will want to seem tolerant, open-minded, and intelligent, but their true feelings may be closed-minded, simple, and biased. Participants may lie in this situation. This occurs often in political polling, which may show greater support for a candidate from a minority race, gender, or political party than actually exists in the electorate.

A related form of bias is called  acquiescence bias , also known as “yea-saying.” It occurs when people say yes to whatever the researcher asks, even when doing so contradicts previous answers. For example, a person might say yes to both “I am a confident leader in group discussions” and “I feel anxious interacting in group discussions.” Those two responses are unlikely to both be true for the same person. Why would someone do this? Similar to social desirability, people want to be agreeable and nice to the researcher asking them questions or they might ignore contradictory feelings when responding to each question. You could interpret this as someone saying “yeah, I guess.” Respondents may also act on cultural reasons, trying to “save face” for themselves or the person asking the questions. Regardless of the reason, the results of your measure don’t match what the person truly feels.
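
One common way researchers screen for yea-saying is to include reverse-coded item pairs and flag respondents who agree with both a statement and its opposite. A hypothetical sketch (the item names and the agreement threshold are invented for illustration):

```python
# Flag possible acquiescence bias: agreement (4 or 5 on a 5-point scale)
# with BOTH items in a reverse-coded pair is contradictory.
def flags_acquiescence(response, pairs, agree_threshold=4):
    """Return the reverse-coded pairs this respondent agreed with on both sides."""
    return [
        (a, b) for a, b in pairs
        if response[a] >= agree_threshold and response[b] >= agree_threshold
    ]

# Hypothetical reverse-coded pair: "confident leader" vs. "anxious in groups"
pairs = [("confident_leader", "anxious_in_groups")]
respondent = {"confident_leader": 5, "anxious_in_groups": 4}
print(flags_acquiescence(respondent, pairs))
# [('confident_leader', 'anxious_in_groups')] -- a contradictory "yes" to both
```

A flagged pair does not prove the respondent was yea-saying, but it tells the researcher which responses deserve a closer look.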

So far, we have discussed sources of error that come from choices made by respondents or researchers. Systematic errors will result in responses that are incorrect in one direction or another. For example, social desirability bias usually means that the number of people who say they will vote for a third party in an election is greater than the number of people who actually vote for that party. Systematic errors such as these can be reduced, but random error can never be eliminated. Unlike systematic error, which biases responses consistently in one direction or another, random error is unpredictable and does not result in scores that are consistently higher or lower on a given measure. Instead, random error is more like statistical noise, which will likely average out across participants.

Random error is present in any measurement. If you’ve ever stepped on a bathroom scale twice and gotten two slightly different results, maybe a difference of a tenth of a pound, then you’ve experienced random error. Maybe you were standing slightly differently or had a fraction of your foot off of the scale the first time. If you were to take enough measures of your weight on the same scale, you’d be able to figure out your true weight. In social science, if you gave someone a scale measuring depression on a day after they lost their job, they would likely score differently than if they had just gotten a promotion and a raise. Even if the person were clinically depressed, our measure is subject to influence by the random occurrences of life. Thus, social scientists speak with humility about our measures. We are reasonably confident that what we found is true, but we must always acknowledge that our measures are only an approximation of reality.
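
The bathroom-scale example can be simulated directly: if each reading is the true weight plus a small random error, individual readings scatter, but their average converges on the true value. A sketch with illustrative numbers:

```python
import random

random.seed(0)  # fixed seed so the illustration is reproducible
true_weight = 150.0

# Each reading = true weight + random error of up to half a pound either way.
readings = [true_weight + random.uniform(-0.5, 0.5) for _ in range(10_000)]

print(min(readings) < true_weight < max(readings))  # True: single readings scatter
print(round(sum(readings) / len(readings), 1))      # ~150.0: the noise averages out
```

This is why repeated measurement helps with random error but not with systematic error: averaging cancels noise that points in no consistent direction, while a bias in one direction survives any number of repetitions.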

Humility is important in scientific measurement, as errors can have real consequences. At the time I’m writing this, my wife and I are expecting our first child. Like most people, we used a pregnancy test from the pharmacy. If the test said my wife was pregnant when she was not pregnant, that would be a false positive . On the other hand, if the test indicated that she was not pregnant when she was in fact pregnant, that would be a  false negative . Even if the test is 99% accurate, that means that one in a hundred women will get an erroneous result when they use a home pregnancy test. For us, a false positive would have been initially exciting, then devastating when we found out we were not having a child. A false negative would have been disappointing at first and then quite shocking when we found out we were indeed having a child. While both false positives and false negatives are not very likely for home pregnancy tests (when taken correctly), measurement error can have consequences for the people being measured.
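
The four possible outcomes of a yes/no test like this can be tabulated with the standard labels. A minimal sketch:

```python
def classify(actually_pregnant, test_says_pregnant):
    """Label one test outcome with the standard confusion-matrix term."""
    if actually_pregnant and test_says_pregnant:
        return "true positive"
    if actually_pregnant and not test_says_pregnant:
        return "false negative"   # pregnant, but the test misses it
    if not actually_pregnant and test_says_pregnant:
        return "false positive"   # not pregnant, but the test says so
    return "true negative"

print(classify(True, False))   # false negative
print(classify(False, True))   # false positive
```

At 99% accuracy, roughly 1 of every 100 tests lands in one of the two error cells, which is exactly the one-in-a-hundred erroneous result described above.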

  • Reliability is a matter of consistency.
  • Validity is a matter of accuracy.
  • There are many types of validity and reliability.
  • Systematic error may arise from the researcher, participant, or measurement instrument.
  • Systematic error biases results in a particular direction, whereas random error can be in any direction.
  • All measures are prone to error and should be interpreted with humility.

Use the measurement tools you located in the previous exercise. Evaluate the reliability and validity of these tools. Hint: You will need to go into the literature to “research” these tools.

  • Provide a clear statement regarding the reliability and validity of these tools. What strengths did you notice? What were the limitations?
  • Think about your target population . Are there changes that need to be made in order for one of these tools to be appropriate for your population?
  • If you decide to create your own tool, how will you assess its validity and reliability?

11.4 Ethical and social justice considerations

  • Identify potential cultural, ethical, and social justice issues in measurement.

With your variables operationalized, it’s time to take a step back and look at how measurement in social science impacts our daily lives. As we will see, how we measure things is shaped by power arrangements inside our society; more insidiously, by establishing what is scientifically true, measures have their own power to influence the world. Just like reification in the conceptual world, how we operationally define concepts can reinforce or fight against oppressive forces.


Data equity

How we decide to measure our variables determines what kind of data we end up with in our research project. Because scientific processes are a part of our sociocultural context, the same biases and oppressions we see in the real world can be manifested or even magnified in research data. Jagadish and colleagues (2021) [19] present four dimensions of data equity that are relevant to consider: representation of non-dominant groups within data sets; how data are collected, analyzed, and combined across datasets; equitable and participatory access to data; and the outcomes associated with the data collection. Historically, researchers have mostly focused on the last of these, measures producing outcomes that are biased in one way or another, and this section reviews many such examples. However, it is important to note that equity must also come from designing measures that respond to questions like:

  • Are groups historically suppressed from the data record represented in the sample?
  • Are equity data gathered by researchers and used to uncover and quantify inequity?
  • Are the data accessible across domains and levels of expertise, and can community members participate in the design, collection, and analysis of the public data record?
  • Are the data collected used to monitor and mitigate inequitable impacts?

So, it’s not just about whether measures work for one population or another. Data equity is about the context in which data are created from how we measure people and things. We agree with these authors that data equity should be considered within the context of automated decision-making systems, and that researchers should recognize the broader literature on the role of administrative systems in creating and reinforcing discrimination. To combat the inequitable processes and outcomes we describe below, researchers must foreground equity as a core component of measurement.

Flawed measures & missing measures

At the end of every semester, students in just about every university classroom in the United States complete similar student evaluations of teaching (SETs). Since every student is likely familiar with these, we can recognize many of the concepts we discussed in the previous sections. There are a number of rating scale questions that ask you to rate the professor, class, and teaching effectiveness on a scale of 1-5. Scores are averaged across students and used to determine the quality of teaching delivered by the faculty member. SET scores are often a principal component of how faculty are reappointed to teaching positions. Would it surprise you to learn that student evaluations of teaching are of questionable quality? If your instructors are assessed with a biased or incomplete measure, how might that impact your education?

Most often, student scores are averaged across questions and reported as a final average. This average is used as one factor, often the most important factor, in a faculty member’s reappointment to teaching roles. We learned in this chapter that rating scales are ordinal, not interval or ratio, and the data are categories, not numbers. Although rating scales use a familiar 1-5 scale, the numbers 1, 2, 3, 4, & 5 are really just helpful labels for categories like “excellent” or “strongly agree.” If we relabeled these categories as letters (A-E) rather than as numbers (1-5), how would you average them?

Averaging ordinal data is methodologically dubious, as the numbers are merely a useful convention. As you will learn in Chapter 14 , taking the median value is what makes the most sense with ordinal data. Median values are also less sensitive to outliers. So, a single student who has strong negative or positive feelings towards the professor could bias the class’s SETs scores higher or lower than what the “average” student in the class would say, particularly for classes with few students or in which fewer students completed evaluations of their teachers.
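
The outlier point is easy to demonstrate: in a small class, one strongly negative rating drags the mean well below what the typical student said, while the median barely moves. A toy example with invented ratings:

```python
from statistics import mean, median

# Ten students rate "teaching effectiveness" on a 1-5 ordinal scale;
# one disgruntled student gives a 1 while everyone else gives a 5.
ratings = [5, 5, 5, 5, 5, 5, 5, 5, 5, 1]

print(mean(ratings))    # 4.6: dragged down by a single outlier
print(median(ratings))  # 5.0: what the typical student said
```

Whether 4.6 versus 5.0 matters depends on how the numbers are used; when reappointment decisions hinge on small differences in averages, a single outlier can carry real weight.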

We care about teaching quality because more effective teachers will produce more knowledgeable and capable students. However, student evaluations of teaching are not particularly good indicators of teaching quality and are not associated with the independently measured learning gains of students (i.e., test scores, final grades) (Uttl et al., 2017). [20] This speaks to the lack of criterion validity. Higher teaching quality should be associated with better learning outcomes for students, but across multiple studies stretching back years, there is no association that cannot be better explained by other factors. To be fair, there are scholars who find that SETs are valid and reliable. For a thorough defense of SETs as well as a historical summary of the literature see Benton & Cashin (2012). [21]

Even though student evaluations of teaching often contain dozens of questions, researchers often find that the questions are so highly interrelated that one concept (or factor, as it is called in a factor analysis) explains a large portion of the variance in teachers’ scores on student evaluations (Clayson, 2018). [22] Personally, based on completing SETs myself, I believe that factor is best conceptualized as student satisfaction, which is obviously worthwhile to measure but conceptually quite different from teaching effectiveness or whether a course achieved its intended outcomes. The lack of a clear operational and conceptual definition for the variable or variables being measured in student evaluations of teaching also speaks to a lack of content validity. Researchers check content validity by comparing the measurement method with the conceptual definition, but without a clear conceptual definition of the concept measured by student evaluations of teaching, it’s not clear how we can know our measure is valid. Indeed, the lack of clarity around what is being measured in teaching evaluations impairs students’ ability to provide reliable and valid evaluations. So, while many researchers argue that class-average SET scores are reliable in that they are consistent over time and across classes, it is unclear what exactly is being measured even if it is consistent (Clayson, 2018). [23]

As a faculty member, there are a number of things I can do to influence my evaluations and disrupt their validity and reliability. Since SET scores are associated with the grades students perceive they will receive (e.g., Boring et al., 2016), [24] guaranteeing everyone a final grade of A in my class would likely increase my SET scores and my chances at tenure and promotion. I could time an email reminder to complete SETs with releasing high grades for a major assignment to boost my evaluation scores. On the other hand, student evaluations might be coincidentally timed with poor grades or difficult assignments that will bias student evaluations downward. Students may also infer I am manipulating them and give me lower SET scores as a result. To maximize my SET scores and chances at promotion, I also need to select which courses I teach carefully. Classes that are more quantitatively oriented generally receive lower ratings than more qualitative and humanities-driven classes, which makes my decision to teach social work research a poor strategy (Uttl & Smibert, 2017). [25] The only manipulative strategy I will admit to using is bringing food (usually cookies or donuts) to class during the period in which students are completing evaluations. Measurement is impacted by context.

As a white cisgender male educator, I am adversely impacted by SETs’ sketchy validity, reliability, and methodology, just like anyone else. But the other flaws with student evaluations actually help me while disadvantaging teachers from oppressed groups. Heffernan (2021) [26] provides a comprehensive overview of the sexism, racism, ableism, and prejudice baked into student evaluations:

“In all studies relating to gender, the analyses indicate that the highest scores are awarded in subjects filled with young, white, male students being taught by white English first language speaking, able-bodied, male academics who are neither too young nor too old (approx. 35–50 years of age), and who the students believe are heterosexual. Most deviations from this scenario in terms of student and academic demographics equates to lower SET scores. These studies thus highlight that white, able-bodied, heterosexual, men of a certain age are not only the least affected, they benefit from the practice. When every demographic group who does not fit this image is significantly disadvantaged by SETs, these processes serve to further enhance the position of the already privileged” (p. 5).

The staggering consistency of studies examining prejudice in SETs has led to some rather superficial reforms, like adding reminders to the written instructions before SETs asking students not to submit racist or sexist responses. Yet, even though we know that SETs are systematically biased against women, people of color, and people with disabilities, the overwhelming majority of universities in the United States continue to use them to evaluate faculty for promotion or reappointment. From a critical perspective, it is worth considering why university administrators continue to use such a biased and flawed instrument. SETs produce data that make it easy to compare faculty to one another and track faculty members over time. Furthermore, they offer students a direct opportunity to voice their concerns and highlight what went well.

As the people with the greatest knowledge about what happened in the classroom and whether it met their expectations, students give their most productive feedback through the open-ended questions on SETs. Personally, I have found focus groups written, facilitated, and analyzed by student researchers to be more insightful than SETs. MSW student activists and leaders may look for ways to evaluate faculty that are more methodologically sound and less systematically biased, creating institutional change by replacing or augmenting traditional SETs in their department. There is very rarely student input on the criteria and methodology for teaching evaluations, yet students are the most impacted by helpful or harmful teaching practices.

Students should fight for better assessment in the classroom because well-designed assessments provide documentation to support more effective teaching practices and discourage unhelpful or discriminatory practices. Flawed assessments like SETs can lead to a lack of information about problems with courses, instructors, or other aspects of the program. Think critically about what data your program uses to gauge its effectiveness. How might you introduce areas of student concern into how your program evaluates itself? Are there issues with food or housing insecurity, mentorship of nontraditional and first-generation students, or other issues that faculty should consider when they evaluate their program? Finally, as you transition into practice, think about how your agency measures its impact and how it privileges or excludes client and community voices in the assessment process.

Let’s consider an example from social work practice. Let’s say you work for a mental health organization that serves youth impacted by community violence. How should you measure the impact of your services on your clients and their community? Schools may be interested in reducing truancy, self-injury, or other behavioral concerns. However, by centering delinquent behaviors in how we measure our impact, we may be inattentive to the role of trauma, family dynamics, and other cognitive and social processes beyond “delinquent behavior.” Indeed, we may bias our interventions by focusing on things that are not as important to clients’ needs. Social workers want to make sure their programs are improving over time, and we rely on our measures to indicate what to change and what to keep. If our measures present a partial or flawed view, we lose our ability to establish and act on scientific truths.

While writing this section, one of the authors wrote this commentary article addressing potential racial bias in social work licensing exams. If you are interested in an example of missing or flawed measures that relates to systems your social work practice is governed by (rather than SETs which govern our practice in higher education) check it out!

You may also be interested in similar arguments against the standard grading scale (A-F), and why grades (numerical, letter, etc.) do not do a good job of measuring learning. Think critically about the role that grades play in your life as a student, your self-concept, and your relationships with teachers. Your test and grade anxiety is due in part to how your learning is measured. Those measurements end up becoming an official record of your scholarship and allow employers or funders to compare you to other scholars. The stakes for measurement are the same for participants in your research study.


Self-reflection and measurement

Student evaluations of teaching are just like any other measure. How we decide to measure what we are researching is influenced by our backgrounds, including our culture, implicit biases, and individual experiences. For me as a middle-class, cisgender white woman, the decisions I make about measurement will probably default to ones that make the most sense to me and others like me, and thus measure characteristics about us most accurately if I don’t think carefully about it. There are major implications for research here because this could affect the validity of my measurements for other populations.

This doesn’t mean that standardized scales or indices, for instance, won’t work for diverse groups of people. What it means is that researchers must not ignore difference in deciding how to measure a variable in their research. Doing so may serve to push already marginalized people further into the margins of academic research and, consequently, social work intervention. Social work researchers, with our strong orientation toward celebrating difference and working for social justice, are obligated to keep this in mind for ourselves and encourage others to think about it in their research, too.

This involves reflecting on what we are measuring, how we are measuring, and why we are measuring. Do we have biases that impacted how we operationalized our concepts? Did we include stakeholders and gatekeepers in the development of our concepts? This can be a way to gain access to vulnerable populations. What feedback did we receive on our measurement process and how was it incorporated into our work? These are all questions we should ask as we are thinking about measurement. Further, engaging in this intentionally reflective process will help us maximize the chances that our measurement will be accurate and as free from bias as possible.

The NASW Code of Ethics discusses social work research and the importance of engaging in practices that do not harm participants. This is especially important considering that many of the topics studied by social workers are those that are disproportionately experienced by marginalized and oppressed populations. Some of these populations have had negative experiences with the research process: historically, their stories have been viewed through lenses that reinforced the dominant culture’s standpoint. Thus, when thinking about measurement in research projects, we must remember that the way in which concepts or constructs are measured will impact how marginalized or oppressed persons are viewed. It is important that social work researchers examine current tools to ensure appropriateness for their population(s). Sometimes this may require researchers to use existing tools. Other times, this may require researchers to adapt existing measures or develop completely new measures in collaboration with community stakeholders. In summary, the measurement protocols selected should be tailored and attentive to the experiences of the communities to be studied.

Unfortunately, social science researchers do not do a great job of sharing their measures in a way that allows social work practitioners and administrators to use them to evaluate the impact of interventions and programs on clients. Few scales are published under an open copyright license that allows other people to view them for free and share them with others. Instead, the best way to find a scale mentioned in an article is often to simply search for it in Google with “.pdf” or “.docx” in the query to see if someone posted a copy online (usually in violation of copyright law). As we discussed in Chapter 4, this is an issue of information privilege, or the structuring impact of oppression and discrimination on groups’ access to and use of scholarly information. As a student at a university with a research library, you can access the Mental Measurements Yearbook to look up scales and indexes that measure client or program outcomes, while researchers unaffiliated with university libraries cannot do so. Similarly, the vast majority of scholarship in social work and allied disciplines does not share measures, data, or other research materials openly, a best practice in open and collaborative science. It is important to underscore these structural barriers to using valid and reliable scales in social work practice. An invalid or unreliable outcome test may cause ineffective or harmful programs to persist or may worsen existing prejudices and oppressions experienced by clients, communities, and practitioners.

But it’s not just about reflecting and identifying problems and biases in our measurement, operationalization, and conceptualization—what are we going to  do about it? Consider this as you move through this book and become a more critical consumer of research. Sometimes there isn’t something you can do in the immediate sense—the literature base at this moment just is what it is. But how does that inform what you will do later?

A place to start: Stop oversimplifying race

We will address many more of the critical issues related to measurement in the next chapter. One way to get started in bringing cultural awareness to scientific measurement is through a critical examination of how we analyze race quantitatively. There are many important methodological objections to how we measure the impact of race. We encourage you to watch Dr. Abigail Sewell’s three-part workshop series called “Nested Models for Critical Studies of Race & Racism” for the Inter-university Consortium for Political and Social Research (ICPSR). She discusses how to operationalize and measure inequality, racism, and intersectionality and critiques researchers’ attempts to oversimplify or overlook racism when we measure concepts in social science. If you are interested in developing your social work research skills further, consider applying for financial support from your university to attend an ICPSR summer seminar like Dr. Sewell’s where you can receive more advanced and specialized training in using research for social change.

  • Part 1: Creating Measures of Supraindividual Racism (2-hour video)
  • Part 2: Evaluating Population Risks of Supraindividual Racism (2-hour video)
  • Part 3: Quantifying Intersectionality (2-hour video)

  • Social work researchers must be attentive to personal and institutional biases in the measurement process that affect marginalized groups.
  • What is measured and how it is measured is shaped by power, and social workers must be critical and self-reflective in their research projects.

Think about your current research question and the tool(s) that you will use to gather data. Even if you haven’t chosen your tools yet, think of some that you have encountered in the literature so far.

  • How does your positionality and experience shape what variables you are choosing to measure and how you measure them?
  • Evaluate the measures in your study for potential biases.
  • If you are using measures developed by another researcher, investigate whether they are valid and reliable in other studies across cultures.

  • Milkie, M. A., & Warner, C. H. (2011). Classroom learning environments and the mental health of first grade children. Journal of Health and Social Behavior, 52, 4–22. ↵
  • Kaplan, A. (1964). The conduct of inquiry: Methodology for behavioral science . San Francisco, CA: Chandler Publishing Company. ↵
  • Earl Babbie offers a more detailed discussion of Kaplan’s work in his text. You can read it in: Babbie, E. (2010). The practice of social research (12th ed.). Belmont, CA: Wadsworth. ↵
  • In this chapter, we will use the terms concept and construct interchangeably. While each term has a distinct meaning in research conceptualization, we do not believe this distinction is important enough to warrant discussion in this chapter. ↵
  • Wong, Y. J., Steinfeldt, J. A., Speight, Q. L., & Hickman, S. J. (2010). Content analysis of Psychology of men & masculinity (2000–2008).  Psychology of Men & Masculinity ,  11 (3), 170. ↵
  • Kimmel, M. (2000).  The  gendered society . New York, NY: Oxford University Press; Kimmel, M. (2008). Masculinity. In W. A. Darity Jr. (Ed.),  International  encyclopedia of the social sciences  (2nd ed., Vol. 5, p. 1–5). Detroit, MI: Macmillan Reference USA ↵
  • Kimmel, M. & Aronson, A. B. (2004).  Men and masculinities: A-J . Denver, CO: ABL-CLIO. ↵
  • Krosnick, J.A. & Berent, M.K. (1993). Comparisons of party identification and policy preferences: The impact of survey question format.  American Journal of Political Science, 27 (3), 941-964. ↵
  • Likert, R. (1932). A technique for the measurement of attitudes.  Archives of Psychology,140 , 1–55. ↵
  • Stevens, S. S. (1946). On the Theory of Scales of Measurement.  Science ,  103 (2684), 677-680. ↵
  • Sullivan G. M. (2011). A primer on the validity of assessment instruments. Journal of graduate medical education, 3 (2), 119–120. doi:10.4300/JGME-D-11-00075.1 ↵
  • Engel, R. & Schutt, R. (2013). The practice of research in social work (3rd. ed.) . Thousand Oaks, CA: SAGE. ↵
  • Engel, R. & Schutt, R. (2013). The practice of research in social work (3rd. ed.). Thousand Oaks, CA: SAGE. ↵
  • Jagadish, H. V., Stoyanovich, J., & Howe, B. (2021). COVID-19 Brings Data Equity Challenges to the Fore. Digital Government: Research and Practice ,  2 (2), 1-7. ↵
  • Uttl, B., White, C. A., & Gonzalez, D. W. (2017). Meta-analysis of faculty's teaching effectiveness: Student evaluation of teaching ratings and student learning are not related. Studies in Educational Evaluation ,  54 , 22-42. ↵
  • Benton, S. L., & Cashin, W. E. (2014). Student ratings of instruction in college and university courses. In Higher education: Handbook of theory and research  (pp. 279-326). Springer, Dordrecht. ↵
  • Clayson, D. E. (2018). Student evaluation of teaching and matters of reliability.  Assessment & Evaluation in Higher Education ,  43 (4), 666-681. ↵
  • Clayson, D. E. (2018). Student evaluation of teaching and matters of reliability. Assessment & Evaluation in Higher Education ,  43 (4), 666-681. ↵
  • Boring, A., Ottoboni, K., & Stark, P. (2016). Student evaluations of teaching (mostly) do not measure teaching effectiveness.  ScienceOpen Research . ↵
  • Uttl, B., & Smibert, D. (2017). Student evaluations of teaching: teaching quantitative courses can be hazardous to one’s career. PeerJ, 5, e3299. ↵
  • Heffernan, T. (2021). Sexism, racism, prejudice, and bias: a literature review and synthesis of research surrounding student evaluations of courses and teaching.  Assessment & Evaluation in Higher Education , 1-11. ↵

The process by which we describe and ascribe meaning to the key facts, concepts, or other phenomena under investigation in a research study.

In measurement, conditions that are easy to identify and verify through direct observation.

In measurement, conditions that are subtle and complex, which we must use existing knowledge and intuition to define.

Conditions that are not directly observable and represent states of being, experiences, and ideas.

A mental image that summarizes a set of similar observations, feelings, or ideas.

developing clear, concise definitions for the key concepts in a research question

concepts that are comprised of multiple elements

concepts that are expected to have a single underlying dimension

assuming that abstract concepts exist in some concrete, tangible way

process by which researchers spell out precisely how a concept will be measured in their study

Clues that demonstrate the presence, intensity, or other aspects of a concept in the real world

unprocessed data that researchers can analyze using quantitative and qualitative methods (e.g., responses to a survey or interview transcripts)

a characteristic that does not change in a study

The characteristics that make up a variable

variables whose values are organized into mutually exclusive groups but whose numerical values cannot be used in mathematical operations.

variables whose values are mutually exclusive and can be used in mathematical operations

The lowest level of measurement; categories cannot be mathematically ranked, though they are exhaustive and mutually exclusive

Exhaustive categories are options for closed-ended questions that allow for every possible response (no respondent should feel that no answer fits them).

Mutually exclusive categories are options for closed-ended questions that do not overlap, so people fit into only one category or another, not both.

Level of measurement that follows nominal level. Has mutually exclusive categories and a hierarchy (rank order), but we cannot calculate a mathematical distance between attributes.

An ordered set of responses that participants must choose from.

A level of measurement that is continuous, can be rank ordered, is exhaustive and mutually exclusive, and for which the distance between attributes is known to be equal, but which has no absolute zero point.

The highest level of measurement, denoted by mutually exclusive categories, a hierarchy (order), values that can be added, subtracted, multiplied, and divided, and the presence of an absolute zero.
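The four levels of measurement can be illustrated in a short sketch. All variables and values here are hypothetical examples, not data from any study:

```python
# Nominal: categories with no rank order; only equality checks make sense.
marital_status = ["single", "married", "divorced"]

# Ordinal: ranked categories, but the distance between ranks is unknown.
agreement = {"disagree": 1, "neutral": 2, "agree": 3}
assert agreement["agree"] > agreement["disagree"]  # ranking is meaningful

# Interval: equal distances between values, but no absolute zero
# (0 degrees Celsius is not "no temperature"), so differences are
# meaningful but ratios are not.
temp_monday, temp_tuesday = 10.0, 20.0
difference = temp_tuesday - temp_monday  # 10 degrees: meaningful

# Ratio: equal distances plus an absolute zero, so ratios are meaningful.
age_a, age_b = 20, 40
assert age_b / age_a == 2  # "twice as old" is a valid statement
```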

measuring people’s attitude toward something by assessing their level of agreement with several statements about it

Composite (multi-item) scales in which respondents are asked to indicate their opinions or feelings toward a single statement using different pairs of adjectives framed as polar opposites.

A composite scale using a series of items arranged in increasing order of intensity of the construct of interest, from least intense to most intense.

measurements of variables based on more than one indicator

An empirical structure for measuring items or indicators of the multiple dimensions of a concept.

a composite score derived from aggregating measures of multiple concepts (called components) using a set of rules and formulas
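Composite scoring of a Likert-style scale can be sketched in a few lines. This is a minimal illustration, not a validated instrument: the four-item scale, the 1-5 response range, and the reverse-coded final item are all assumptions for the example:

```python
# Score a hypothetical 4-item Likert scale (responses from 1 to 5),
# where the last item is negatively worded and must be reverse-coded
# before the items are summed into a composite score.
def reverse_code(value, max_value=5, min_value=1):
    """Flip a response so 1 becomes 5, 2 becomes 4, and so on."""
    return max_value + min_value - value

def composite_score(responses, reverse_items=(3,)):
    """Sum item responses after reverse-coding the indicated item positions."""
    return sum(
        reverse_code(v) if i in reverse_items else v
        for i, v in enumerate(responses)
    )

# A respondent who strongly agrees with the positive items (5) and strongly
# disagrees with the negative item (1) gets the maximum score.
print(composite_score([5, 5, 5, 1]))  # -> 20
```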

The ability of a measurement tool to measure a phenomenon the same way, time after time. Note: Reliability does not imply validity.

The extent to which scores obtained on a scale or other measure are consistent across time

The consistency of people’s responses across the items on a multiple-item measure. Responses about the same underlying construct should be correlated, though not perfectly.

The extent to which different observers are consistent in their assessment or rating of a particular characteristic or item.
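Internal consistency is commonly estimated with Cronbach's alpha. The sketch below uses the standard formula, alpha = k/(k-1) * (1 - sum of item variances / variance of total scores), with hypothetical item responses (rows are respondents, columns are items on the same measure):

```python
# Minimal Cronbach's alpha computation for internal consistency.
def variance(xs):
    """Sample variance (n - 1 in the denominator)."""
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)

def cronbach_alpha(rows):
    k = len(rows[0])                 # number of items
    items = list(zip(*rows))         # transpose to per-item columns
    totals = [sum(r) for r in rows]  # each respondent's total score
    return (k / (k - 1)) * (1 - sum(variance(c) for c in items) / variance(totals))

# Hypothetical data: four respondents answering three items that
# "move together," so alpha should be high.
responses = [
    [4, 5, 4],
    [2, 3, 2],
    [5, 5, 4],
    [1, 2, 1],
]
alpha = cronbach_alpha(responses)
print(round(alpha, 2))  # -> 0.99
```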

The extent to which the scores from a measure represent the variable they are intended to measure.

The extent to which a measurement method appears “on its face” to measure the construct of interest

The extent to which a measure “covers” the construct of interest, i.e., its comprehensiveness in measuring the construct.

The extent to which people’s scores on a measure are correlated with other variables (known as criteria) that one would expect them to be correlated with.

A type of criterion validity. Examines how well a tool provides the same scores as an already existing tool administered at the same point in time.

A type of criterion validity that examines how well your tool predicts a future criterion.

The extent to which scores on a measure are not correlated with measures of variables that are conceptually distinct.
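Criterion validity is typically assessed with a correlation coefficient. The sketch below computes Pearson's r between a hypothetical new screening tool and an established instrument administered at the same time (concurrent validity); both score lists are invented for the example:

```python
# Pearson correlation between two sets of scores; a strong positive r
# supports concurrent validity of the new tool.
def pearson_r(xs, ys):
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)

new_tool = [10, 14, 8, 20, 16]      # hypothetical new screener scores
established = [12, 15, 9, 22, 17]   # established instrument, same clients
print(round(pearson_r(new_tool, established), 2))
```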

(also known as bias) occurs when a measure consistently outputs incorrect data, usually in one direction and due to an identifiable process

When a participant's answer to a question is altered due to the way in which a question is written. In essence, the question leads the participant to answer in a specific way.

Social desirability bias occurs when we create questions that lead respondents to answer in ways that don't reflect their genuine thoughts or feelings to avoid being perceived negatively.

In a measure, when people say yes to whatever the researcher asks, even when doing so contradicts previous answers.

Unpredictable error that does not result in scores that are consistently higher or lower on a given measure but are nevertheless inaccurate.

when a measure indicates the presence of a phenomenon, when in reality it is not present

when a measure does not indicate the presence of a phenomenon, when in reality it is present
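False positives and false negatives can be counted directly when a measure's results are compared against known true statuses. A tiny sketch with invented data:

```python
# Compare a hypothetical screening measure against the actual status.
screen = [True, True, False, False, True]  # measure says phenomenon present
actual = [True, False, False, True, True]  # phenomenon actually present

false_pos = sum(s and not a for s, a in zip(screen, actual))  # flagged but absent
false_neg = sum(a and not s for s, a in zip(screen, actual))  # present but missed
print(false_pos, false_neg)  # -> 1 1
```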

the group of people whose needs your study addresses

The value in the middle when all our values are placed in numerical order. Also called the 50th percentile.

individuals or groups who have an interest in the outcome of the study you conduct

the people or organizations who control access to the population you want to study

Graduate research methods in social work Copyright © 2021 by Matthew DeCarlo, Cory Cummings, Kate Agnelli is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License , except where otherwise noted.



  • Catheleen Jordan, University of Texas at Arlington, and Cynthia Franklin, University of Texas at Austin
  • https://doi.org/10.1093/acrefore/9780199975839.013.24
  • Published online: 11 June 2013
  • This version: 03 September 2013

Assessment is an ongoing process of data collection aimed at identifying client strengths and problems. Early assessment models were based on psychoanalytic theory; however, current assessment is based on brief, evidence-based practice models. Both quantitative and qualitative methods may be used to create an integrative skills approach that links assessment to intervention. Specifically, assessment guides treatment planning, as well as informs intervention selection and monitoring.

  • integrative skills assessment
  • qualitative methods
  • quantitative methods

Updated in this version

Bibliography updated to reflect recent research.

Jordan and Franklin discuss an integrative skills assessment approach; that is, an ongoing process of data collection aimed at understanding clients in the context of their environmental systems (Jordan & Franklin, 2011). Multiple methods should be used to formulate a complete picture of this intricate system; these may include both quantitative and qualitative techniques. Quantitative techniques are methods that allow for operationally defining clients' problems. An example is a scale that gives a numerical score of the client's depression. Qualitative techniques, on the other hand, describe the complexity of clients' problems in more detail. An example of a qualitative measure is a mapping technique such as a genogram. This entry will address assessment as an integrative approach, linking assessment and intervention, and will discuss quantitative and qualitative methods.

An Integrative Skills Assessment Approach

Several practice models have made an important contribution to social work assessment. Jordan and Franklin (2011, pp. 8-35) reviewed early assessment models, including the psychosocial model of Florence Hollis, Gordon Hamilton, and Helen Perlman. The term person-in-environment originated in this approach, and its goal is to determine a client's psychosocial diagnosis. An adaptation of the psychosocial model is the functional approach, which deemphasized history and focused on clients' problem-solving ability. These models were based on psychoanalytic theory early on, and on ego psychology as the model evolved. Specific techniques used included classical psychiatric interviewing, as well as testing, observations, and interpretations. In contrast, today's assessment has been influenced by brief, evidence-based practice models. Contributors include Eileen Gambrill and Richard Stuart, whose behavioral approaches brought the measurement perspective into social work practice. Hudson (1982) developed a clinical assessment system of computerized scales to easily measure clients' inter- and intrapersonal problems. Kevin Corcoran and Joel Fischer published the first volume of their Measures for Clinical Practice in 1987; this book of measures was designed for use in daily clinical work. In 1995, Jordan and Franklin integrated qualitative and quantitative approaches to create what they referred to as an integrative skills assessment approach. An integrative skills assessment model has these characteristics: theoretical and technical eclecticism, a de-emphasis on history, and an emphasis on defining problems and strengths, treatment planning, and outcome monitoring. Building collaborative relationships with clients is an important component of the assessment; qualitative techniques may help with this. Also, collaborative relationships may help the client move successfully from the assessment phase into intervention.

Linking Assessment and Intervention

Today's assessment is an evidence-based approach.

Clinical Decision-Making

Evidence-based approaches assume that the best evidence is used along with critical thinking skills, knowledge of best practices, and client input (Gibbs & Gambrill, 2002; McNeece & Thyer, 2004). Assessment is an ongoing process beginning with problem (and strength) identification using both quantitative and qualitative techniques.

Problem Monitoring

Qualitative data help the practitioner to understand clients' contextual issues and to establish rapport, while quantitative data may be used to monitor clients' problems and strengths. Monitoring may be structured by using a single-subject design approach (Bloom, Fischer, & Orme, 2005). Problems targeted for change are monitored over the course of treatment, usually weekly or even daily. That is, the client completes the same measurement over time so that comparisons may be made to track improvements. These improvements are tracked over the phases of treatment, usually baseline (assessment), treatment, and follow-up. Data are analyzed using a variety of simple statistical procedures. The intervention may be changed if the monitoring reveals that no progress is occurring.
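The monitoring logic above can be sketched as a simple comparison of phase means. The weekly depression scores are hypothetical, and lower scores are assumed to indicate improvement:

```python
# Single-subject design sketch: compare the baseline phase mean to the
# treatment phase mean to see whether the client is improving.
def mean(scores):
    return sum(scores) / len(scores)

baseline = [22, 24, 23, 25]   # assessment phase, before the intervention
treatment = [20, 18, 15, 12]  # scores while the intervention is delivered

improvement = mean(baseline) - mean(treatment)
print(improvement)  # -> 7.25 (positive: scores dropped during treatment)
```

In practice the same comparison would extend to a follow-up phase, and a flat or negative difference would signal that the intervention may need to change.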

Treatment Planning

Moving from assessment to intervention.

Jordan and Franklin (2002) presented an evidence-based framework for treatment planning with families, including the following steps: problem selection, problem definition, goal development, objective construction, intervention creation, and diagnosis determination. Interventions should logically follow from the problems identified at the assessment (beginning) phase, called baseline. The baseline data indicate the extent and severity of the problem, as well as appropriate outcomes or goals. Following an evidence-based approach, practitioners then search the literature for interventions showing the best evidence for solving the client's particular problem.

Quantitative Clinical Assessment Methods

Quantitative assessment methods provide us with a numerical representation of clients' problems or strengths.

Rationale for Including Quantitative Measures in Assessment

Four reasons help us understand the benefit of using quantitative measures in client assessment (Jordan & Franklin, 2011, pp. 73-76). First, understanding, measuring, and monitoring improve the treatment process. This allows treatment to be changed if no progress is seen. Second, use of clinical research methods allows practitioners to contribute to the clinical practice knowledge base. Third, practice evaluation provides the accountability necessary for managed care and external funders. Fourth, today's practice environment requires social workers to possess greater measurement skills to be competitive with similar professionals.

Quantitative Methods of Measuring Client Behavior

Examples of quantitative methods that may be used by practitioners include the following: (a) client self-reporting and monitoring, (b) self-anchored and rating scales, (c) questionnaires, (d) direct behavioral observation, (e) role play and analogue situations, (f) behavioral by-products, (g) psychophysiological measures, (h) goal attainment scaling, (i) standardized measures, and (j) projective measures.

Resources and Guidelines

Guidelines for developing a measurement system for assessment include the following: (a) using multiple methods, (b) developing baseline indicators of client functioning, (c) using repeated measures, and (d) using both global and specific measures. A resource for obtaining quantitative methods is Corcoran and Fischer's Measures for Clinical Practice (2005).

Qualitative Clinical Assessment Methods

Qualitative assessment methods seek to understand the meaning of the client system by using contextual techniques and add an extra level of depth to the clinical assessment.

Rationale for Including Qualitative Measures in Assessment

Qualitative techniques such as biographical narratives, interviews, or experiential exercises seek a holistic understanding of the client. Qualitative assessment measures bring five unique contributions. First is the ability to uncover realities that would be missed when using only quantitative approaches (Jordan & Franklin, 2011, pp. 127-130). For example, a client may be asked to keep a diary to add context to the standardized measurement recording her depression. Second, the standardized instruments used in quantitative assessments have limited usefulness for people of color; qualitative assessments offer open-ended, process-oriented techniques to access clients' cultural scripts and meanings. Third, qualitative assessments promote practitioners' self-awareness and therefore a positive therapeutic alliance. Fourth, the holistic nature of qualitative assessment encourages a reciprocal client-social-worker relationship. Finally, a fifth rationale relates to qualitative techniques' fit with many theoretical and therapeutic perspectives, including family systems, ecosystems, cognitive-constructivist, feminist therapies, and so forth.

Qualitative Methods of Measuring Client Behavior

Qualitative methods include ethnographic interviewing; narrative approaches such as process recording, case studies, and self-characterization; repertory grids; graphic methods; and participant observations.

Validity and reliability are very important in qualitative assessment and rest on the credibility and completeness of the data collected. Questions to ask include the following: Were multiple measurements used? Does the information tell the whole story? Do the conclusions drawn make sense? Are there any unexplained gaps? For a resource, see Deborah Padgett's The Qualitative Research Experience (2003).

Assessment is an ongoing process whereby qualitative and quantitative assessment methods may be used together in data-gathering. The use of multiple methods is necessary to improve the reliability and validity of clinical information. Specifically, qualitative methods may enhance the clinician's understanding of the context within which problems occur, while quantitative methods provide information on the specific nature of the problem. The assessment then informs treatment planning and guides intervention selection.

Future Trends

With the increasing emphasis on evidence-informed practice, assessment is certain to adopt this broader definition as well. Evidence-informed practice is defined by the Institute of Medicine as the consideration of research evidence, clinician expertise, and client values, in addition to contextual variables, in clinical decision-making. Adding contextual variables to the assessment equation gives a richer picture of the client-in-situation, taking us as social workers back to our roots.

  • Bloom, M., Fischer, J., & Orme, J. (2005). Evaluating practice: Guidelines for the accountable professional (5th ed.). Boston: Allyn & Bacon.
  • Corcoran, K., & Fischer, J. (2005). Measures for clinical practice. New York: Free Press.
  • Gibbs, L., & Gambrill, E. (2002). Evidence-based practice: Counterarguments to objections. Research on Social Work Practice, 12(3), 452-476.
  • Hudson, W. (1982). The clinical measurement package. Homewood, IL: Dorsey Press.
  • Jordan, C., & Franklin, C. (1995). Clinical assessment for social workers: Quantitative and qualitative methods. Chicago: Lyceum Books.
  • Jordan, C., & Franklin, C. (2002). Treatment planning with families: An evidence-based approach. In A. Roberts & G. Greene (Eds.), Social worker's desk reference. New York: Oxford University Press.
  • Jordan, C., & Franklin, C. (2011). Clinical assessment for social workers: Quantitative and qualitative methods (3rd ed.). Chicago: Lyceum Books.
  • McNeece, A., & Thyer, B. (2004). Evidence-based practice and social work. Journal of Evidence-Based Social Work, 1(1), 7-24.
  • Padgett, D. (2003). The qualitative research experience. Belmont, CA: Wadsworth.

Further Reading

  • Hoefer, R. , & Jordan, C. (2007). Missing links in evidence-based practice for macro social work. In M. Roberts-DeGennaro (Ed.), Journal of Evidence-Based Social Work.
  • Walter W. Hudson's WALMYR Publishing Co.-Assessment Scales. http://www.walmyr.com/scales.html
  • Psychological Assessment http://www.apa.org/journals/pas/ http://www.guidetopsychology.com/testing.htm

Related Articles

  • Evidence-Based Practice
  • Qualitative Research
  • Quantitative Research

Printed from Encyclopedia of Social Work. Under the terms of the licence agreement, an individual user may print out a single article for personal use (for details see Privacy Policy and Legal Notice).

date: 14 August 2024



Part 3: Using quantitative methods

10. Quantitative sampling

Chapter outline.

  • The sampling process (25 minute read)
  • Sampling approaches for quantitative research (15 minute read)
  • Sample quality (24 minute read)

Content warning: examples contain references to addiction to technology, domestic violence and batterer intervention, cancer, illegal drug use, LGBTQ+ discrimination, binge drinking, intimate partner violence among college students, child abuse, neocolonialism and Western hegemony.

10.1 The sampling process

Learning objectives.

Learners will be able to…

  • Decide where to get your data and who you might need to talk to
  • Evaluate whether it is feasible for you to collect first-hand data from your target population
  • Describe the process of sampling
  • Apply population, sampling frame, and other sampling terminology to sampling people your project’s target population

One of the things that surprised me most as a research methods professor is how much my students struggle with understanding sampling. It is surprising because people engage in sampling all the time. How do you learn whether you like a particular food, like BBQ ribs? You sample them from different restaurants! Obviously, social scientists put a bit more effort and thought into the process than that, but the underlying logic is the same. By sampling a small group of BBQ ribs from different restaurants and liking most of them, you can conclude that when you encounter BBQ ribs again, you will probably like them. You don’t need to eat all of the BBQ ribs in the world to come to that conclusion, just a small sample. [1] Part of the difficulty my students face is learning sampling terminology, which is the focus of this section.


Who is your study about and who should you talk to?

At this point in the research process, you know what your research question is. Our goal in this chapter is to help you understand how to find the people (or documents) you need to study in order to find the answer to your research question. It may be helpful at this point to distinguish between two concepts. Your unit of analysis is the entity that you wish to be able to say something about at the end of your study (probably what you’d consider to be the main focus of your study). Your unit of observation is the entity (or entities) that you actually observe, measure, or collect in the course of trying to learn something about your unit of analysis.

It is often the case that your unit of analysis and unit of observation are the same. For example, we may want to say something about social work students (unit of analysis), so we ask social work students at our university to complete a survey for our study (unit of observation). In this case, we are observing individuals, i.e., students, so we can make conclusions about individuals.

On the other hand, our unit of analysis and observation can differ. We could sample social work students to draw conclusions about organizations or universities. Perhaps we are comparing students at historically Black colleges and universities (HBCUs) and primarily white institutions (PWIs). Even though our sample was made up of individual students from various colleges (our unit of observation), our unit of analysis was the university as an organization. Conclusions we made from individual-level data were used to understand larger organizations.

Similarly, we could adjust our sampling approach to target specific student cohorts. Perhaps we wanted to understand the experiences of Black social work students in PWIs. We could choose either an individual unit of observation by selecting students, or a group unit of observation by studying the National Association of Black Social Workers .

Sometimes the units of analysis and observation differ due to pragmatic reasons. If we wanted to study whether being a social work student impacted family relationships, we may choose to study family members of students in social work programs who could give us information about how they behaved in the home. In this case, we would be observing family members to draw conclusions about individual students.

In sum, there are many potential units of analysis that a social worker might examine, but some of the most common include individuals, groups, and organizations. Table 10.1 details examples identifying the units of observation and analysis in a hypothetical study of student addiction to electronic gadgets.

Table 10.1 Units of analysis and units of observation

Research question: Which students are most likely to be addicted to their electronic gadgets?
  Unit of analysis: Individuals
  Method of data collection: Survey of students on campus
  Unit of observation: Individuals
  Hypothetical finding: New Media majors, men, and students with high socioeconomic status are all more likely than other students to become addicted to their electronic gadgets.

Research question: Do certain types of social clubs have more gadget-addicted members than other sorts of clubs?
  Unit of analysis: Groups
  Method of data collection: Survey of students on campus
  Unit of observation: Individuals
  Hypothetical finding: Clubs with a scholarly focus, such as the social work club and the math club, have more gadget-addicted members than clubs with a social focus, such as the 100-bottles-of-beer-on-the-wall club and the knitting club.

Research question: How do different colleges address the problem of electronic gadget addiction?
  Unit of analysis: Organizations
  Method of data collection: Content analysis of policies
  Unit of observation: Documents
  Hypothetical finding: Campuses without strong computer science programs are more likely than those with such programs to expel students who have been found to have addictions to their electronic gadgets.

Note: Please remember that the findings described here are hypothetical. There is no reason to think that any of the hypothetical findings described here would actually bear out if empirically tested.
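Observing individuals while analyzing groups or organizations amounts to aggregating individual-level data. A minimal sketch with invented survey responses, keeping the HBCU/PWI comparison from earlier in the section:

```python
# Unit of observation: individual students (survey responses).
# Unit of analysis: universities (we aggregate to the school level).
responses = [
    {"university": "HBCU_A", "satisfaction": 4},
    {"university": "HBCU_A", "satisfaction": 5},
    {"university": "PWI_B", "satisfaction": 3},
    {"university": "PWI_B", "satisfaction": 2},
]

by_school = {}
for r in responses:  # observe individuals...
    by_school.setdefault(r["university"], []).append(r["satisfaction"])

# ...then draw conclusions about organizations.
means = {u: sum(v) / len(v) for u, v in by_school.items()}
print(means)  # -> {'HBCU_A': 4.5, 'PWI_B': 2.5}
```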

First-hand vs. second-hand knowledge

Your unit of analysis will be determined by your research question. Specifically, it should relate to your target population. Your unit of observation, on the other hand, is determined largely by the method of data collection you use to answer that research question. Let’s consider a common issue in social work research: understanding the effectiveness of different social work interventions. Who has first-hand knowledge and who has second-hand knowledge? Well, practitioners would have first-hand knowledge about implementing the intervention. For example, they might discuss with you the unique language they use to help clients understand the intervention. Clients, on the other hand, have first-hand knowledge about the impact of those interventions on their lives. If you want to know if an intervention is effective, you need to ask people who have received it!

Unfortunately, student projects run into pragmatic limitations with sampling from client groups. Clients are often diagnosed with severe mental health issues or have other ongoing issues that render them a vulnerable population at greater risk of harm. Asking a person who was recently experiencing suicidal ideation about that experience may interfere with ongoing treatment. Client records are also confidential and cannot be shared with researchers unless clients give explicit permission. Asking one’s own clients to participate in the study creates a dual relationship with the client, as both clinician and researcher, and dual relationships involve conflicting responsibilities and boundaries.

Obviously, studies are done with social work clients all the time. But for student projects in the classroom, it is often required to get second-hand information from a population that is less vulnerable. Students may instead choose to study clinicians and how they perceive the effectiveness of different interventions. While clinicians can provide an informed perspective, they have less knowledge about personally receiving the intervention. In general, researchers prefer to sample the people who have first-hand knowledge about their topic, though feasibility often forces them to analyze second-hand information instead.

Population: Who do you want to study?

In social scientific research, a population is the cluster of people you are most interested in. It is often the “who” that you want to be able to say something about at the end of your study. While populations in research may be rather large, such as “the American people,” they are typically more specific than that. For example, a large study for which the population of interest is the American people will likely specify which American people, such as adults over the age of 18 or citizens or legal permanent residents. Based on your work in Chapter 2, you should have a target population identified in your working question. That might be something like “people with developmental disabilities” or “students in a social work program.”

It is almost impossible for a researcher to gather data from their entire population of interest. This might sound surprising or disappointing until you think about the kinds of research questions that social workers typically ask. For example, let’s say we wish to answer the following question: “How does gender impact attendance in a batterer intervention program?” Would you expect to be able to collect data from all people in batterer intervention programs across all nations from all historical time periods? Unless you plan to make answering this research question your entire life’s work (and then some), I’m guessing your answer is a resounding no. So, what to do? Does not having the time or resources to gather data from every single person of interest mean having to give up your research interest?

Let’s think about who could possibly be in your study.

  • What is your population, the people you want to make conclusions about?
  • Do your unit of analysis and unit of observation differ or are they the same?
  • Can you ethically and practically get first-hand information from the people most knowledgeable about the topic, or will you rely on second-hand information from less vulnerable populations?

Setting: Where will you go to get your data?

While you can’t gather data from everyone, you can find some people from your target population to study. The first rule of sampling is: go where your participants are. You will need to figure out where you will go to get your data. For many student researchers, it is their agency, their peers, their family and friends, or whoever comes across students’ social media posts or emails asking people to participate in their study.

Each setting (agency, social media) limits your reach to only a small segment of your target population who has the opportunity to be a part of your study. This intermediate point between the overall population and the sample of people who actually participate in the researcher’s study is called a sampling frame . A sampling frame is a list of people from which you will draw your sample.

But where do you find a sampling frame? Answering this question is the first step in conducting human subjects research. Social work researchers must think about locations or groups in which your target population gathers or interacts. For example, a study on quality of care in nursing homes may choose a local nursing home because it’s easy to access. The sampling frame could be all of the residents of the nursing home. You would select your participants for your study from the list of residents. Note that this is a real list. That is, an administrator at the nursing home would give you a list with every resident’s name or ID number from which you would select your participants. If you decided to include more nursing homes in your study, then your sampling frame could be all the residents at all the nursing homes who agreed to participate in your study.
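Drawing a sample from a real list like this is straightforward to sketch. The resident IDs and sample size below are hypothetical:

```python
import random

# Simple random sample from a sampling frame: here, a hypothetical
# list of 120 nursing home resident IDs provided by an administrator.
random.seed(42)  # fixed seed only so the example is reproducible
sampling_frame = [f"resident_{i:03d}" for i in range(1, 121)]

# Select 20 residents without replacement; every resident on the list
# has an equal chance of being chosen.
sample = random.sample(sampling_frame, k=20)
print(len(sample), len(set(sample)))  # -> 20 20 (no duplicates)
```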

Let’s consider some more examples. Unlike nursing home patients, cancer survivors do not live in an enclosed location and may no longer receive treatment at a hospital or clinic. For social work researchers to reach participants, they may consider partnering with a support group that services this population. Perhaps there is a support group at a local church survivors may attend. Without a set list of people, your sampling frame would simply be the people who showed up to the support group on the nights you were there. Similarly, if you posted an advertisement in an online peer-support group for people with cancer, your sampling frame is the people in that group.

More challenging still is recruiting people who are homeless, those with very low income, or those who belong to stigmatized groups. For example, a research study by Johnson and Johnson (2014) [2] attempted to learn usage patterns of “bath salts,” or synthetic stimulants that are marketed as “legal highs.” Users of “bath salts” don’t often gather for meetings, and reaching out to individual treatment centers is unlikely to produce enough participants for a study, as the use of bath salts is rare. To reach participants, these researchers ingeniously used online discussion boards in which users of these drugs communicate. Their sampling frame included everyone who participated in the online discussion boards during the time they collected data. Another example might include using a flyer to let people know about your study, in which case your sampling frame would be anyone who walks past your flyer wherever you hang it—usually in a strategic location where you know your population will be.

In conclusion, sampling frames can be a real list of people, like a list of faculty and their ID numbers in a university department, which allows you to clearly identify who is in your study and what chance they have of being selected. However, not all sampling frames allow you to be so specific. It is also important to remember that accessing your sampling frame must be practical and ethical, as we discussed in Chapter 2 and Chapter 6. For studies that present risks to participants, approval from gatekeepers and the university’s institutional review board (IRB) is needed.

Criteria: What characteristics must your participants have/not have?

Your sampling frame is not just everyone in the setting you identified. For example, if you were studying MSW students who are first-generation college students, you might select your university as the setting, but not everyone in your program is a first-generation student. You need to be more specific about which characteristics or attributes individuals either must have or cannot have before they participate in the study. These are known as inclusion and exclusion criteria, respectively.

Inclusion criteria are the characteristics a person must possess in order to be included in your sample. If you were conducting a survey on LGBTQ+ discrimination at your agency, you might want to sample only clients who identify as LGBTQ+. In that case, your inclusion criteria for your sample would be that individuals have to identify as LGBTQ+.

Conversely, exclusion criteria are characteristics that disqualify a person from being included in your sample. In the previous example, identifying as heterosexual or cisgender would be your exclusion criteria, because no person who identifies as heterosexual or cisgender would be included in your sample. Exclusion criteria are often the mirror image of inclusion criteria. However, there may be other criteria by which we want to exclude people from our sample. For example, we may exclude clients who were recently discharged or those who have just begun to receive services.
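As an illustration, inclusion and exclusion criteria act like filters applied to the people in your setting to produce the sampling frame. The Python sketch below is purely hypothetical: the field names, client records, and the four-week service cutoff are invented for illustration, not taken from the text.

```python
# Hypothetical client records; field names and the 4-week cutoff
# are illustrative assumptions, not from the chapter.
clients = [
    {"id": 1, "identifies_lgbtq": True, "weeks_in_service": 12},
    {"id": 2, "identifies_lgbtq": False, "weeks_in_service": 30},
    {"id": 3, "identifies_lgbtq": True, "weeks_in_service": 1},
]

def eligible(client):
    """Apply the example's inclusion and exclusion criteria."""
    meets_inclusion = client["identifies_lgbtq"]          # inclusion: identifies as LGBTQ+
    meets_exclusion = client["weeks_in_service"] < 4      # exclusion: just began services
    return meets_inclusion and not meets_exclusion

# The sampling frame is everyone in the setting who passes both filters.
sampling_frame = [c for c in clients if eligible(c)]
print([c["id"] for c in sampling_frame])  # → [1]
```

Client 2 fails the inclusion criterion and client 3 is screened out by the exclusion criterion, leaving only client 1 in the frame.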


Recruitment: How will you ask people to participate in your study?

Once you have a location and list of people from which to select, all that is left is to reach out to your participants. Recruitment refers to the process by which the researcher informs potential participants about the study and asks them to participate in the research project. Recruitment comes in many different forms. If you have ever received a phone call asking for you to participate in a survey, someone has attempted to recruit you for their study. Perhaps you’ve seen print advertisements on buses, in student centers, or in a newspaper. I’ve received many emails that were passed around my school asking for participants, usually for a graduate student project. As we learn more about specific types of sampling, make sure your recruitment strategy makes sense with your sampling approach. For example, if you put up a flyer in the student health office to recruit student athletes for your study, you may not be targeting your recruitment efforts to settings where your target population is likely to see your recruitment materials.

Recruiting human participants

Sampling is the first point at which you will contact potential study participants. Before you start this process, it is important to make sure you have approval from your university’s institutional review board as well as any gatekeepers at the locations in which you plan to conduct your study. As we discussed in Section 10.1, the first rule of sampling is to go where your participants are. If you are studying domestic violence, reach out to local shelters, advocates, or service agencies. Gatekeepers will be necessary to gain access to your participants. For example, a gatekeeper can forward your recruitment email across their employee email list. Review our discussion of gatekeepers in Chapter 2 before contacting potential participants as part of recruitment.

Recruitment can take many forms. You may show up at a staff meeting to ask for volunteers. You may send a company-wide email. Each step of this process should be vetted by the IRB as well as other stakeholders and gatekeepers. You will also need to set reasonable expectations for how many reminders you will send to the person before moving on. Generally, it is a good idea to give people a little while to respond, though reminders are often accompanied by an increase in participation. Pragmatically, it is a good idea for you to think through each step of the recruitment process and how much time it will take to complete it.

For example, as a graduate student, I conducted a study of state-level disabilities administrators in which I was recruiting a sample of very busy people and had no financial incentives to offer them for participating in my study. It helped for my research team to bring on board a well-known agency as a research partner, allowing them to review and offer suggestions on our survey and interview questions. This collaborative process took time and had to be completed before sampling could start. Once sampling commenced, I pulled contact names from my collaborator’s database and public websites, and set a weekly schedule of email and phone contacts. I would contact the director once via email. Ten days later, I would follow up via email and by leaving a voicemail with their administrative support staff. Ten days after that, I would reach out to state administrators in a different office via email and then again via phone, if needed. The process took months to complete and required a complex Excel tracking document.

Recruitment will also expose your participants to the informed consent information you prepared. For students going through the IRB, there are templates you will have to follow in order to get your study approved. For students whose projects unfold under the supervision of their department rather than the IRB, you should check with your professor on the expectations for getting participant consent. In the aforementioned study, I used our IRB’s template to create a consent form but did not include a signature line. The IRB allowed me to collect my data without a signature, as there was little risk of harm from the study. I reviewed the consent information with each participant before they completed the survey and interview. Only when the participant is totally clear on the purpose, risks and benefits, confidentiality protections, and other information detailed in Chapter 6 can you ethically move forward with including them in your sample.

Sampling available documents

As with sampling humans, sampling documents centers on the question: which documents are most relevant to your research question and will provide first-hand knowledge? Common documents analyzed in student research projects include client files, popular media like films and music lyrics, and policies from service agencies. In a case record review, the student would create inclusion and exclusion criteria based on their research question. Once a suitable sampling frame of potential documents exists, the researcher can use probability or non-probability sampling to select which client files are ultimately analyzed.

Sampling documents must also come with consent and buy-in from stakeholders and gatekeepers. Assuming you have approval to conduct your study and access to the documents you need, the process of recruitment is much easier than in studies sampling humans. There is no informed consent process with documents, though research with confidential health or education records must be done in accordance with privacy laws such as the Health Insurance Portability and Accountability Act and the Family Educational Rights and Privacy Act. Barring any technical or policy obstacles, the gathering of documents should be easier and less time-consuming than sampling humans.

Sample: Who actually participates in your study?

Once you find a sampling frame from which you can recruit your participants and decide which characteristics you will include and exclude, you will recruit people using a specific sampling approach, which we will cover in Section 10.2. At the end, you’re left with the group of people you successfully recruited from your sampling frame to participate in your study: your sample. If you are a participant in a research project—answering survey questions, participating in interviews, etc.—you are part of the sample in that research project.

Visualizing sampling terms

Sampling terms can be a bit daunting at first. However, with some practice, they will become second nature. Let’s walk through an example from a research project of mine. I collected data for a research project on how much it costs to become a licensed clinical social worker (LCSW) in each state. An LCSW credential is necessary to work in private clinical practice, and it allows supervisors in human service organizations to sign off on clinical charts from less credentialed employees and to provide clinical supervision. If you are interested in providing clinical services as a social worker, you should become familiar with the licensing laws in your state.

Moving from population to setting, you should consider access and consent of stakeholders and the representativeness of the setting. In moving from setting to sampling frame, keep in mind your inclusion and exclusion criteria. In moving finally to sample, keep in mind your sampling approach and recruitment strategy.

Using Figure 10.1 as a guide, my population is clearly clinical social workers, as these are the people about whom I want to draw conclusions. The next step inward would be a sampling frame. Unfortunately, there is no list of every licensed clinical social worker in the United States. I could write to each state’s social work licensing board and ask for a list of names and addresses, perhaps even using a Freedom of Information Act request if they were unwilling to share the information. That option sounds time-consuming and has a low likelihood of success. Instead, I tried to figure out a convenient setting in which social workers are likely to congregate. I considered setting up a booth at a National Association of Social Workers (NASW) conference and asking people to participate in my survey. Ultimately, this would prove too costly, and the people who gather at an NASW conference may not be representative of the general population of clinical social workers. I finally discovered the NASW membership email list, which is available to advertisers, including researchers advertising for research projects. While the NASW list does not contain every clinical social worker, its monthly e-newsletter regularly reaches over one hundred thousand social workers, a large proportion of social workers in practice, so the setting was likely to draw a representative sample. To gain access to this setting, I had to provide the gatekeepers with paperwork showing my study had undergone IRB review and submit my measures for approval by the mailing list administrator.

Once I gained access from gatekeepers, my setting became the members of the NASW mailing list. I decided to contact 5,000 potential participants because I knew that people sometimes do not read or respond to email advertisements, and I figured maybe 20% would respond, which would give me around 1,000 responses. Figuring out my sample size was a challenge because I had to balance the costs associated with using the NASW newsletter. As you can see on their pricing page, it would cost money to learn personal information about my potential participants, which I would need later in order to determine whether my sample was representative of the overall population of social workers. For example, I could see if my sample was comparable in race, age, gender, or state of residence to the broader population of social workers by comparing my sample with information about all social workers published by NASW. I presented my options to my external funder as:

  • I could send an email advertisement to a lot of people (5,000), but I would know very little about them and they would get only one advertisement.
  • I could send multiple advertisements to fewer people (1,000) reminding them to participate, but I would also know more about them by purchasing access to personal information.
  • I could send multiple advertisements to fewer people (2,500), but not purchase access to personal information to minimize costs.

In your project, there is no expectation that you purchase access to anything, and if you plan on using email advertisements, consider places that are free to access, like employee or student listservs. At the same time, you will need to consider what you can and cannot know about the people who will potentially be in your study. In my case, I could collect any personal information needed to check representativeness within the survey itself, so we decided to go with option #1. When I sent my email recruiting participants for the study, I specified that I only wanted to hear from social workers who were either currently receiving or had recently received clinical supervision for licensure—my inclusion criteria. This was important because many of the people on the NASW membership list may not be licensed or license-seeking social workers. So, my sampling frame was the set of email addresses on the NASW mailing list belonging to people who fit the inclusion criteria for the study, which I figured would be at least a few thousand people. Unfortunately, only 150 licensed or license-seeking clinical social workers responded to my recruitment email and completed the survey. You will learn in Section 10.3 why this did not make for a very good sample.
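The expected-response arithmetic from this example is worth making explicit, since budgeting recruitment numbers is a step every proposal goes through. The 20% response rate here is, as the text notes, only a guess the researcher made in advance.

```python
# Back-of-the-envelope response estimate from the example above:
# 5,000 email invitations with an ASSUMED 20% response rate.
invitations = 5000
assumed_response_rate = 0.20  # a guess, not a measured quantity

expected_responses = int(invitations * assumed_response_rate)
print(expected_responses)  # → 1000
```

In the actual study only 150 people responded, a reminder that such estimates are planning tools, not guarantees.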

From this example, you can see that sampling is a process. The process flows sequentially from figuring out your target population, to thinking about where to find people from your target population, to figuring out how much information you know about potential participants, and finally to recruiting people from that list to be a part of your sample. Through the sampling process, you must consider where people in your target population are likely to be and how best to get their attention for your study. Sampling can be an easy process, like calling every 100th name from the phone book, or challenging, like standing every day for a few weeks in an area in which people who are homeless gather for shelter. In either case, your goal is to recruit enough people who will participate in your study so you can learn about your population.

What about sampling non-humans?

Many student projects do not involve recruiting and sampling human subjects. Instead, many research projects will sample objects like client charts, movies, or books. The same terms apply, but the process is a bit easier because there are no humans involved. If a research project involves analyzing client files, it is unlikely you will look at every client file that your agency has. You will need to figure out which client files are important to your research question. Perhaps you want to sample clients who have a diagnosis of reactive attachment disorder. You would have to create a list of all clients at your agency (setting) who have reactive attachment disorder (your inclusion criteria) then use your sampling approach (which we will discuss in the next section) to select which client files you will actually analyze for your study (your sample). Recruitment is a lot easier because, well, there’s no one to convince but your gatekeepers, the managers of your agency. However, researchers who publish chart reviews must obtain IRB permission before doing so.

Key Takeaways

  • The first rule of sampling is to go where your participants are. Think about virtual or in-person settings in which your target population gathers. Remember that you may have to engage gatekeepers and stakeholders in accessing many settings, and that you will need to assess the pragmatic challenges and ethical risks and benefits of your study.
  • Consider whether you can sample documents like agency files to answer your research question. Documents are much easier to “recruit” than people!
  • Researchers must consider which characteristics are necessary for people to have (inclusion criteria) or not have (exclusion criteria), as well as how to recruit participants into the sample.
  • Social workers can sample individuals, groups, or organizations.
  • Sometimes the unit of analysis and the unit of observation in the study differ. In student projects, this is often true as target populations may be too vulnerable to expose to research whose potential harms may outweigh the benefits.
  • One’s recruitment method has to match one’s sampling approach, as will be explained in the next chapter.

Once you have identified who may be a part of your study, the next step is to think about where those people gather. Are there in-person locations in your community, or places on the internet, that are easily accessible? List at least one potential setting for your project. For each potential setting, describe:

  • Based on what you know right now, how representative of your population are potential participants in the setting?
  • How much information can you reasonably know about potential participants before you recruit them?
  • Are there gatekeepers and what kinds of concerns might they have?
  • Are there any stakeholders that may be beneficial to bring on board as part of your research team for the project?
  • What interests might stakeholders and gatekeepers bring to the project and would they align with your vision for the project?
  • What ethical issues might you encounter if you sampled people in this setting?

Even though you may not be 100% sure about your setting yet, let’s think about the next steps.

  • For the settings you’ve identified, how might you recruit participants?
  • Identify your inclusion criteria and exclusion criteria, and assess whether you have enough information on whether people in each setting will meet them.

10.2 Sampling approaches for quantitative research

  • Determine whether you will use probability or non-probability sampling, given the strengths and limitations of each specific sampling approach
  • Distinguish between approaches to probability sampling and detail the reasons to use each approach

Sampling in quantitative research projects is done because it is not feasible to study the whole population, and researchers hope to take what we learn about a small group of people (your sample) and apply it to a larger population. There are many ways to approach this process, and they can be grouped into two categories—probability sampling and non-probability sampling. Sampling approaches are inextricably linked with recruitment, and researchers should ensure that their proposal’s recruitment strategy matches the sampling approach.

Probability sampling approaches use a random process, usually a computer program, to select participants from the sampling frame so that everyone has an equal chance of being included. It’s important to note that random means the researcher used a process that is truly random. In a project sampling college students, standing outside of the building in which your social work department is housed and surveying everyone who walks past is not random. Because of the location, you are likely to recruit a disproportionately large number of social work students and fewer from other disciplines. Depending on the time of day, you may recruit more traditional undergraduate students, who take classes during the day, or more graduate students, who take classes in the evenings.

In this example, you are actually using non-probability sampling. Another way to say this is that you are using the most common sampling approach for student projects, availability sampling. Also called convenience sampling, this approach simply recruits people who are convenient or easily available to the researcher. If you have ever been asked by a friend to participate in their research study for a class, or seen an advertisement for a study on a bulletin board or social media, you were being recruited using an availability sampling approach.

There are a number of benefits to the availability sampling approach. First and foremost, it is less costly and time-consuming for the researcher. As long as the person you are attempting to recruit has knowledge of the topic you are studying, the information you get from the sample you recruit will be relevant to your topic (although your sample may not necessarily be representative of a larger population). Availability samples can also be helpful when random sampling isn’t practical. If you are planning to survey students in an LGBTQ+ support group on campus but attendance varies from meeting to meeting, you may show up at a meeting and ask anyone present to participate in your study. A support group with varied membership makes it impossible to have a real list—or sampling frame—from which to randomly select individuals. Availability sampling would help you reach that population.

Availability sampling is appropriate for student and smaller-scale projects, but it comes with significant limitations. The purpose of sampling in quantitative research is to generalize from a small sample to a larger population. Because availability sampling does not use a random process to select participants, the researcher cannot be sure their sample is representative of the population they hope to generalize to. Instead, the recruitment processes may have been structured by other factors that may bias the sample to be different in some way than the overall population.

So, for instance, if we asked social work students about their level of satisfaction with the services at the student health center, and we sampled in the evenings, we would most likely get a biased perspective of the issue. Students taking only night classes are much more likely to commute to school, spend less time on campus, and use fewer campus services. Our results would not represent what all social work students feel about the topic. We might get the impression that no social work student had ever visited the health center, when that is not actually true at all. Sampling bias will be discussed in detail in Section 10.3.


Approaches to probability sampling

What might be a better strategy is getting a list of all email addresses of social work students and randomly selecting email addresses of students to whom you can send your survey. This would be an example of simple random sampling. It’s important to note that you need a real list of people in your sampling frame from which to select your email addresses. For projects where the set of people who could potentially participate is not known to the researcher, probability sampling is not possible. It is also likely that administrators at your school’s registrar would be reluctant to share the list of students’ names and email addresses. Always remember to consider the feasibility and ethical implications of the sampling approach you choose.

Usually, simple random sampling is accomplished by assigning each person, or element, in your sampling frame a number and selecting your participants using a random number generator. You would follow an identical process if you were sampling records or documents as your elements, rather than people. True randomness is difficult to achieve, and it takes complex computational calculations to do so. Although you may think you can select things at random, human-generated randomness is actually quite predictable, as it falls into patterns called heuristics. To truly randomly select elements, researchers must rely on computer-generated help. Many free websites have good pseudo-random number generators; a good example is Random.org, which offers a random number generator that can also randomize lists of participants. Sometimes, researchers instead use a table of numbers that have been generated randomly. There are several possible sources for obtaining a random number table; some statistics and research methods textbooks provide such tables in an appendix.
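A minimal sketch of this procedure in Python, using the standard library’s pseudo-random number generator rather than a website or printed table; the email addresses and frame size are made up for illustration.

```python
import random

# A made-up sampling frame of 100 student email addresses.
sampling_frame = [f"student{i}@university.edu" for i in range(1, 101)]

random.seed(42)  # fix the seed so the draw can be reproduced
# random.sample draws without replacement; every element in the
# frame has an equal chance of being selected.
sample = random.sample(sampling_frame, k=25)

print(len(sample))  # → 25
```

Fixing the seed is a convenience for demonstrations and audits; in a real draw you would simply let the generator run.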

Though simple, this approach to sampling can be tedious, since the researcher must assign a number to each person in the sampling frame. Systematic sampling techniques are somewhat less tedious but offer the benefits of a random sample. As with simple random samples, you must possess a list of everyone in your sampling frame. Once you have that list, to draw a systematic sample you simply select every kth element on the list. But what is k, and where on the list of population elements does one begin the selection process?

Figure 10.2: Diagram showing four people being selected using systematic sampling, starting at number 2 and selecting every third person after that (5, 8, 11)

k is your selection interval, or the distance between the elements you select for inclusion in your study. To begin the selection process, you’ll need to figure out how many elements you wish to include in your sample. Let’s say you want to survey 25 social work students and there are 100 social work students on your campus. In this case, your selection interval, or k, is 4: to get your selection interval, simply divide the total number of population elements by your desired sample size. Systematic sampling starts by randomly selecting a number between 1 and k as a starting point, and then recruiting every kth person. In our example, we might start at number 3 and then select the 7th, 11th, 15th (and so forth) person on our list of email addresses. In Figure 10.2, the researcher starts at number 2 and then selects every third person for inclusion in the sample.
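The procedure just described can be sketched in a few lines of Python. The frame of 100 students and desired sample of 25 mirror the example in the text; the student labels are invented.

```python
import random

# Systematic sampling sketch: 100 students, desired sample of 25.
frame = [f"student{i}" for i in range(1, 101)]
n_desired = 25
k = len(frame) // n_desired  # selection interval: 100 // 25 = 4

random.seed(0)
start = random.randint(1, k)  # random starting position between 1 and k
sample = frame[start - 1::k]  # then take every kth element from there

print(k, len(sample))  # → 4 25
```

Whatever starting point between 1 and k the generator picks, the slice walks the list in steps of k and returns exactly 25 students.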

There is one clear instance in which systematic sampling should not be employed. If your sampling frame has any pattern to it, you could inadvertently introduce bias into your sample by using a systematic sampling strategy. (Bias will be discussed in more depth in Section 10.3.) This is sometimes referred to as the problem of periodicity. Periodicity refers to the tendency for a pattern to occur at regular intervals.

To stray a bit from our example, imagine we were sampling client charts based on the date they entered a health center and recording the reason for their visit. We may expect more admissions for issues related to alcohol consumption on the weekend than we would during the week. The periodicity of alcohol intoxication may bias our sample towards either overrepresenting or underrepresenting this issue, depending on our sampling interval and whether we collected data on a weekday or weekend.

Advanced probability sampling techniques

Returning again to our idea of sampling student email addresses, one of the challenges in our study will be the different types of students. If we are interested in all social work students, it may be helpful to divide our sampling frame, or list of students, into three lists—one for traditional, full-time undergraduate students, another for part-time undergraduate students, and one more for full-time graduate students—and then randomly select from these lists. This is particularly important if we wanted to make sure our sample had the same proportion of each type of student compared with the general population.

This approach is called stratified random sampling . In stratified random sampling, a researcher will divide the study population into relevant subgroups or strata and then draw a sample from each subgroup, or stratum. Strata is the plural of stratum, so it refers to all of the groups while stratum refers to each group. This can be used to make sure your sample has the same proportion of people from each stratum. If, for example, our sample had many more graduate students than undergraduate students, we may draw incorrect conclusions that do not represent what all social work students experience.

Figure: Selecting a proportion of black, grey, and white students from a population into a sample

Generally, the goal of stratified random sampling is to recruit a sample in which all elements of the population are included sufficiently for conclusions to be drawn about them. Usually, the purpose is to create a sample that is identical to the overall population along whatever strata you’ve identified. In our example, these would be graduate and undergraduate students. Stratified random sampling is also useful when a subgroup of interest makes up a relatively small proportion of the overall sample. For example, if your social work program contained relatively few Asian students but you wanted to make sure you recruited enough Asian students to conduct statistical analysis, you could use race to divide people into strata and then disproportionately sample from the Asian students to make sure enough of them were in your sample to draw meaningful conclusions. Statistical tests may require a minimum number of people in each group before they can produce valid results.
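A sketch of proportionate stratified random sampling in Python; the three strata and their sizes below are illustrative assumptions, not figures from the text.

```python
import random

# Made-up population of 100 students tagged with a stratum label.
population = (
    [("undergrad_full_time", i) for i in range(60)]
    + [("undergrad_part_time", i) for i in range(25)]
    + [("grad_full_time", i) for i in range(15)]
)
strata = ("undergrad_full_time", "undergrad_part_time", "grad_full_time")

random.seed(7)
sample_size = 20
sample = []
for stratum in strata:
    members = [p for p in population if p[0] == stratum]
    # Proportionate allocation: the stratum's share of the
    # population times the desired sample size.
    n_from_stratum = round(len(members) / len(population) * sample_size)
    sample.extend(random.sample(members, n_from_stratum))

print(len(sample))  # → 20 (12 + 5 + 3)
```

For disproportionate sampling of a small subgroup, you would simply override `n_from_stratum` for that stratum with a larger number.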

Up to this point in our discussion of probability samples, we’ve assumed that researchers will be able to access a list of population elements in order to create a sampling frame. This, as you might imagine, is not always the case. Let’s say, for example, that you wish to conduct a study of health center usage among students at each social work program in your state. Just imagine trying to create a list of every single social work student in the state. Even if you could find a way to generate such a list, attempting to do so might not be the most practical use of your time or resources. When this is the case, researchers turn to cluster sampling. Cluster sampling occurs when a researcher begins by sampling groups (or clusters) of population elements and then selects elements from within those groups.

Figure: For a population of six clusters of two students each, two clusters were selected for the sample

Let’s work through how we might use cluster sampling. While creating a list of all social work students in your state would be next to impossible, you could easily create a list of all social work programs in your state. Then, you could draw a random sample of social work programs (your cluster) and then draw another random sample of elements (in this case, social work students) from each of the programs you randomly selected from the list of all programs.

Cluster sampling often works in stages. In this example, we sampled in two stages: (1) social work programs, and (2) social work students at each program we selected. However, we could add another stage if it made sense to do so. We could randomly select (1) states in the United States, (2) social work programs in those states, and (3) individual social work students. As you might have guessed, sampling in multiple stages does introduce a greater possibility of error, since each stage is subject to its own sampling problems. But cluster sampling is nevertheless a highly efficient method.
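The two-stage process can be sketched in Python; the program names and student rosters below are invented for illustration.

```python
import random

# Made-up clusters: social work programs and their student rosters.
programs = {
    "Program A": ["a1", "a2", "a3", "a4"],
    "Program B": ["b1", "b2", "b3"],
    "Program C": ["c1", "c2", "c3", "c4", "c5"],
    "Program D": ["d1", "d2"],
}

random.seed(3)
# Stage 1: randomly select clusters (programs).
chosen_programs = random.sample(sorted(programs), 2)
# Stage 2: randomly select elements (students) within each chosen cluster.
sample = []
for program in chosen_programs:
    sample.extend(random.sample(programs[program], 2))

print(len(chosen_programs), len(sample))  # → 2 4
```

A third stage (states, then programs, then students) would just wrap another `random.sample` call around this one, at the cost of additional sampling error at each level.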

Jessica Holt and Wayne Gillespie (2008) [3] used cluster sampling in their study of students’ experiences with violence in intimate relationships. Specifically, the researchers randomly selected 14 classes on their campus and then drew a random sub-sample of students from those classes. But you probably know from your experience with college classes that not all classes are the same size. So, if Holt and Gillespie had simply randomly selected 14 classes and then selected the same number of students from each class to complete their survey, then students in the smaller of those classes would have had a greater chance of being selected for the study than students in the larger classes. Keep in mind, with random sampling the goal is to make sure that each element has the same chance of being selected. When clusters are of different sizes, as in the example of sampling college classes, researchers often use a method called probability proportionate to size (PPS). This means they give clusters different chances of being selected based on their size, so that each element within those clusters winds up having an equal chance of being selected.
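A simplified sketch of the PPS idea follows: larger clusters get a proportionally larger chance of selection, and because a fixed number of students is then drawn from the chosen cluster, the cluster’s size cancels out of each student’s overall selection probability. The class names and enrollments are hypothetical, and this draws a single cluster with weighted selection rather than implementing a full PPS design.

```python
import random

random.seed(7)  # fixed seed so the illustration is reproducible

# Hypothetical classes (clusters) with different enrollments.
classes = {"A": 15, "B": 40, "C": 80, "D": 25, "E": 60}

# Stage 1: select one class with probability proportional to its size.
names = list(classes)
weights = [classes[n] for n in names]
chosen = random.choices(names, weights=weights, k=1)[0]

# Stage 2: sample a fixed number of students from the chosen class.
students_sampled = 10
p_class = classes[chosen] / sum(weights)        # chance this class is picked
p_student = students_sampled / classes[chosen]  # chance within the class
overall = p_class * p_student

# The class size cancels out: every student ends up with the same
# overall probability, students_sampled / total_enrollment.
print(round(overall, 4))  # 0.0455, i.e. 10 / 220, regardless of class
```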

To summarize, probability samples allow a researcher to make conclusions about larger groups. Probability samples require a sampling frame, a list from which elements, usually human beings, can be selected at random. The use of random selection reduces the error and bias present in non-probability samples, which we will discuss in greater detail in section 10.3, though some error will always remain. In relying on a random number table or generator, researchers can more accurately state that their sample represents the population from which it was drawn. This strength is common to all probability sampling approaches summarized in Table 10.2.

Table 10.2 Types of probability samples
  • Simple random: Researcher randomly selects elements from the sampling frame.
  • Systematic: Researcher selects every kth element from the sampling frame.
  • Stratified: Researcher creates subgroups, then randomly selects elements from each subgroup.
  • Cluster: Researcher randomly selects clusters, then randomly selects elements from the selected clusters.

In determining which probability sampling approach makes the most sense for your project, it helps to know more about your population. A simple random sample and a systematic sample are similarly easy to carry out. Both require a list of all elements in your sampling frame. Systematic sampling is slightly easier in that it does not require you to use a random number generator; instead, it uses a sampling interval that is easy to calculate by hand.
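The sampling interval really is simple arithmetic, as this short sketch shows; the frame size and sample size below are arbitrary.

```python
import random

random.seed(3)  # fixed seed so the illustration is reproducible

N = 10_000  # elements in the sampling frame
n = 250     # desired sample size
k = N // n  # sampling interval: take every kth element

# Pick a random starting point within the first interval, then
# step through the frame by the interval.
start = random.randrange(k)
selected_indices = list(range(start, N, k))

print(k, len(selected_indices))  # interval of 40, 250 elements selected
```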

However, the relative simplicity of both approaches is counterbalanced by their lack of sensitivity to characteristics of your population. Stratified samples can better account for periodicity by creating strata that reduce or eliminate its effects. Stratified sampling also ensures that smaller subgroups are included in your sample, thereby making your sample more representative of the overall population. While these benefits are important, creating strata for this purpose requires having information about your population before beginning the sampling process. In our social work student example, we would need to know which students are full-time or part-time, and graduate or undergraduate, in order to make sure our sample contained the same proportions. Would you know whether someone was a graduate student or a part-time student just based on their email address? If the true population parameters are unknown, stratified sampling becomes significantly more challenging.

Common to each of the previous probability sampling approaches is the necessity of using a real list of all elements in your sampling frame. Cluster sampling is different. It allows a researcher to perform probability sampling in cases for which a list of elements is not available or feasible to create. Cluster sampling is also useful for making claims about a larger population (in our previous example, all social work students within a state). However, because sampling occurs at multiple stages in the process (in our previous example, at the university and student levels), sampling error increases. For many researchers, the benefits of cluster sampling outweigh this weakness.

Matching recruitment and sampling approach

Recruitment must match the sampling approach you choose in section 10.2. For many students, that will mean using recruitment techniques most relevant to availability sampling. These may include public postings such as flyers, mass emails, or social media posts. However, these methods would not make sense for a study using probability sampling. Probability sampling requires a list of names or other identifying information so you can use a random process to generate a list of people to recruit into your sample. Posting a flyer or social media message means you don’t know who is looking at the flyer, and thus, your sample could not be randomly drawn. Probability sampling often requires knowing how to contact specific participants. For example, you may do as I did, and contact potential participants via phone and email. Even then, it’s important to note that not everyone you contact will enter your study. We will discuss more about evaluating the quality of your sample in section 10.3.

  • Probability sampling approaches are more accurate when the researcher wants to generalize from a smaller sample to a larger population. However, non-probability sampling approaches are often more feasible. You will have to weigh advantages and disadvantages of each when designing your project.
  • There are many kinds of probability sampling approaches, though each requires that you know some information about the people who could potentially participate in your study.
  • Probability sampling also requires that you assign people within the sampling frame a number and select using a truly random process.

Building on the step-by-step sampling plan from the exercises in section 10.1:

  • Identify one of the sampling approaches listed in this chapter that might be appropriate to answering your question and list the strengths and limitations of it.
  • Describe how you will recruit your participants and how your plan makes sense with the sampling approach you identified.

Examine one of the empirical articles from your literature review.

  • Identify what sampling approach they used and how they carried it out from start to finish.

10.3 Sample quality

  • Assess whether your sampling plan is likely to produce a sample that is representative of the population you want to draw conclusions about
  • Identify the considerations that go into producing a representative sample and determining sample size
  • Distinguish between error and bias in a sample and explain the factors that lead to each

Okay, so you’ve chosen where you’re going to get your data (setting), what characteristics you want and don’t want in your sample (inclusion/exclusion criteria), and how you will select and recruit participants (sampling approach and recruitment). That means you are done, right? (I mean, there’s an entire section here, so probably not.) Even if you make good choices and do everything the way you’re supposed to, you can still draw a poor sample. If you are investigating a research question using quantitative methods, the best choice is some kind of probability sampling, but aside from that, how do you know a good sample from a bad sample? As an example, we’ll use a bad sample I collected as part of a research project that didn’t go so well. Hopefully, your sampling will go much better than mine did, but we can always learn from what didn’t work.


Representativeness

A representative sample is “a sample that looks like the population from which it was selected in all respects that are potentially relevant to the study” (Engel & Schutt, 2011). [4] For my study on how much it costs to get an LCSW in each state, I did not get a sample that looked like the overall population to which I wanted to generalize. My sample had a few states with more than ten responses and most states with no responses. That does not look like the true distribution of social workers across the country. I could have compared my sample against the number of social workers in each state, based on data from the National Association of Social Workers, or against the number of recent clinical MSW graduates reported by the Council on Social Work Education. More than that, I could have checked whether my sample matched the overall population of clinical social workers in gender, race, age, or any other important characteristics. Sadly, it wasn’t even close. So, I wasn’t able to use the data to publish a report.

Critique the representativeness of the sample you are planning to gather.

  • Will the sample of people (or documents) look like the population to which you want to generalize?
  • Specifically, what characteristics are important in determining whether a sample is representative of the population? How do these characteristics relate to your research question?

Consider returning to this question once you have completed the sampling process and evaluate whether the sample in your study was similar to what you designed in this section.

Many of my students erroneously assume that using a probability sampling technique will guarantee a representative sample. This is not true. Engel and Schutt (2011) identify that probability sampling increases the chance of representativeness; however, it does not guarantee that the sample will be representative. If a representative sample is important to your study, it would be best to use a sampling approach that allows you to control the proportion of specific characteristics in your sample. For instance, stratified random sampling allows you to control the distribution of specific variables of interest within your sample. However, that requires knowing information about your participants before you hand them surveys or expose them to an experiment.

In my study, if I wanted to make sure I had a certain number of people from each state (state being the strata), making the proportion of social workers from each state in my sample similar to the overall population, I would need to know which email addresses were from which states. That was not information I had. So, instead I conducted simple random sampling and randomly selected 5,000 of 100,000 email addresses on the NASW list. There was less of a guarantee of representativeness, but whatever variation existed between my sample and the population would be due to random chance. This would not be true for an availability or convenience sample. While these sampling approaches are common for student projects, they come with significant limitations in that variation between the sample and population is due to factors other than chance. We will discuss these non-random differences later in the chapter when we talk about bias. For now, just remember that the representativeness of a sample is helped by using random sampling, though it is not a guarantee.
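Mechanically, the selection step described above amounts to a single call to a random-sampling routine. The sketch below uses placeholder addresses rather than the real NASW list.

```python
import random

random.seed(2024)  # fixed seed so the illustration is reproducible

# Hypothetical sampling frame of 100,000 email addresses (placeholders).
frame = [f"member_{i}@example.org" for i in range(100_000)]

# Simple random sampling: every address has the same 5% chance of
# ending up in the recruitment list, and no address repeats.
recruits = random.sample(frame, 5_000)

print(len(recruits), len(set(recruits)))  # 5000 unique addresses
```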

  • Before you start sampling, do you know enough about your sampling frame to use stratified random sampling, which increases the potential of getting a representative sample?
  • Do you have enough information about your sampling frame to use another probability sampling approach like simple random sampling or cluster sampling?
  • If little information is available on which to select people, are you using availability sampling? Remember that availability sampling is okay if it is the only approach that is feasible for the researcher, but it comes with significant limitations when drawing conclusions about a larger population.

Assessing representativeness should start prior to data collection. I mentioned that I drew my sample from the NASW email list, which (like many organizations’ lists) is sold to companies or researchers who want to advertise to social workers. How representative of my population is my sampling frame? Well, the first question to ask is what proportion of my sampling frame would actually meet my exclusion and inclusion criteria. Since my study focused specifically on clinical social workers, my sampling frame likely included social workers who were not clinical social workers, like macro social workers or social work managers. However, I knew, based on the information from NASW marketers, that many people who received my recruitment email would be clinical social workers or those working towards licensure, so I was satisfied with that. Anyone who didn’t meet my inclusion criteria but opened the survey would be greeted with clear instructions that the survey did not apply to them.

At the same time, I should have assessed whether the demographics of the NASW email list and the demographics of clinical social workers more broadly were similar. Unfortunately, this was not information I could gather. I had to trust that this was likely to be the best sample I could draw and the most representative of all social workers.

  • Before you start, what do you know about your setting and potential participants?
  • Are there likely to be enough people in the setting of your study who meet the inclusion criteria?

You want to avoid throwing out half of the surveys you get back because the respondents aren’t a part of your target population. This is a common error I see in student proposals.

Many of you will sample people from your agency, like clients or staff. Let’s say you work for a children’s mental health agency, and you wanted to study children who have experienced abuse. Walking through the steps here might proceed like this:

  • Think about or ask your coworkers how many of the clients at your agency have experienced this issue. If it’s common, then clients at your agency would probably make a good sampling frame for your study. If not, then you may want to adjust your research question or consider a different agency to sample. You could also redefine your target population so that your sample represents it better. For example, while your agency’s clients may not be representative of all children who have survived abuse, they may be more representative of abuse survivors in your state, region, or county. In this way, you can draw conclusions about a smaller population, rather than everyone in the world who is a victim of child abuse.
  • Think about those characteristics that are important for individuals in your sample to have or not have. Obviously, the variables in your research question are important, but so are the variables related to it. Take a look at the empirical literature on your topic. Are there different demographic characteristics or covariates that are relevant to your topic?
  • All of this assumes that you can actually access information about your sampling frame prior to collecting data. This is a challenge in the real world. Even if you ask around your office about client characteristics, there is no way for you to know for sure until you complete your study whether it was the most representative sampling frame you could find. When in doubt, go with whatever is feasible and address any shortcomings in sampling within the limitations section of your research report. A good project is a done project.
  • While using a probability sampling approach helps with sample representativeness, it does not guarantee it. Due to random variation, samples may differ across important characteristics. If you can feasibly use a probability sampling approach, particularly stratified random sampling, it will help make your sample more representative of the population.
  • Even if you choose a sampling frame that is representative of your population and use a probability sampling approach, there is no guarantee that the sample you are able to collect will be representative. Sometimes, people don’t respond to your recruitment efforts. Other times, random chance will mean people differ on important characteristics from your target population. ¯\_(ツ)_/¯

In agency-based samples, the small size of the pool of potential participants makes it very likely that your sample will not be representative of a broader target population. Sometimes, researchers look for specific outcomes connected with sub-populations for that reason. Not all agency-based research is concerned with representativeness, and it is still worthwhile to pursue research that is relevant to only one location as its purpose is often to improve social work practice.


Sample size

Let’s assume you have found a representative sampling frame, and that you are using one of the probability sampling approaches we reviewed in section 10.2. That should help you recruit a representative sample, but how many people do you need to recruit into your sample? As with many questions about sample quality, students should keep feasibility in mind. The easiest answer I’ve given as a professor is, “as many as you can, without hurting yourself.” While your quantitative research question would likely benefit from hundreds or thousands of respondents, that is not likely to be feasible for a student who is working full-time, interning part-time, and in school full-time. Don’t feel like your study has to be perfect, but make sure you note any limitations in your final report.

To the extent possible, you should gather as many people as you can in your sample who meet your criteria. But why? Let’s think about an example you probably know well. Have you ever watched the TV show Family Feud? Each question the host reads off starts with, “we asked 100 people…” Believe it or not, Family Feud uses simple random sampling to conduct its surveys of the American public. Part of the challenge on Family Feud is that people can usually guess the most popular answers, but the answers that only a few people chose are much harder; they seem bizarre and are more difficult to guess. That’s because 100 people is not a lot of people to sample. Essentially, Family Feud is trying to measure what the answer is for all 327 million people in the United States by asking 100 of them. As a result, the weird and idiosyncratic responses of a few people are likely to remain on the board as answers, and contestants have to guess answers that fewer and fewer people in the sample provided. In a larger sample, the oddball answers would likely fade away and only the most popular answers would be represented on the game show’s board.

In my ill-fated study of clinical social workers, I received 87 complete responses. That is far below the hundred thousand licensed or license-eligible clinical social workers. Moreover, since I wanted to conduct state-by-state estimates, there was no way I had enough people in each state to do so. For student projects, samples of 50-100 participants are more than enough to write a paper (or start a game show), but for projects in the real world with real-world consequences, it is important to recruit the appropriate number of participants. For example, if your agency conducts a community scan of people in your service area on what services they need, the results will inform the direction of your agency, which grants they apply for, who they hire, and its mission for the next several years. Being overly confident in your sample could result in wasted resources for clients.

So what is the right number? Theoretically, we could gradually increase the sample size so that it comes closer and closer to the total size of the population (Bhattacherjee, 2012). [5] But as we’ve talked about, it is not feasible to sample everyone. How do we find that middle ground? To answer this, we need to understand the sampling distribution. Imagine that in your agency’s survey of the community, you took three different probability samples from your community, and for each sample, you measured whether people experienced domestic violence. If each random sample were truly representative of the population, then your rate of domestic violence from the three random samples would be about the same and equal to the true value in the population.

But this is extremely unlikely, given that each random sample will likely constitute a different subset of the population, and hence, the rate of domestic violence you measure may be slightly different from sample to sample. Think about the sample you collect as one point on a distribution of infinite possible samples. Most samples you collect will be close to the population mean, but some will not be. The degree to which they differ is associated with how much the subject you are sampling about varies in the population. In our example, samples will vary based on how much the incidence of domestic violence varies from person to person. The difference between the domestic violence rate we find in a sample and the rate for our overall population is called the sampling error.
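You can see sampling error in action with a quick simulation. The snippet below invents a population of 50,000 people in which the true rate is 12%, draws many random samples, and shows that the estimates cluster around the true value even though individual samples miss it by varying amounts.

```python
import random
import statistics

random.seed(0)  # fixed seed so the illustration is reproducible

# Hypothetical community of 50,000 people; 12% have experienced
# domestic violence (the true population parameter).
population = [1] * 6_000 + [0] * 44_000

# Draw 1,000 random samples of 500 people each and record the
# estimated rate from each sample.
estimates = [statistics.mean(random.sample(population, 500))
             for _ in range(1_000)]

# Individual estimates vary from sample to sample (sampling error),
# but their distribution centers on the true rate of 0.12.
print(round(statistics.mean(estimates), 3))
```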

An easy way to minimize sampling error is to increase the number of participants in your sample, but in actuality, minimizing sampling error relies on a number of factors outside the scope of a basic student project. You can see this online textbook for more examples on sampling distributions or take an advanced methods course at your university, particularly if you are considering becoming a social work researcher. Increasing the number of people in your sample also increases your study’s power, or the odds you will detect a significant relationship between variables when one is truly present in your sample. If you intend to publish the findings of your student project, it is worth using a power analysis to determine the appropriate sample size for your project. You can follow this excellent video series from the Center for Open Science on how to conduct power analyses using free statistics software. A faculty member who teaches research or statistics could check your work. You may be surprised to find out that there is a point at which adding more people to your sample will not make your study any better.
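For a rough sense of what a power analysis produces, the sketch below uses the standard normal-approximation formula for the sample size needed to compare two group means. Dedicated power-analysis software gives more precise answers, and the effect size shown is just an example.

```python
import math
from statistics import NormalDist

def n_per_group(effect_size, alpha=0.05, power=0.80):
    """Approximate per-group sample size for a two-sample comparison
    of means, via the normal-approximation formula:
        n = 2 * ((z_{1 - alpha/2} + z_{power}) / d) ** 2
    where d is the standardized effect size (Cohen's d)."""
    z = NormalDist().inv_cdf
    return math.ceil(2 * ((z(1 - alpha / 2) + z(power)) / effect_size) ** 2)

# A "medium" standardized effect (d = 0.5) at the conventional
# alpha = .05 and 80% power needs roughly 63 people per group;
# a "small" effect (d = 0.2) needs several hundred.
print(n_per_group(0.5), n_per_group(0.2))
```

Notice how quickly the required sample grows as the expected effect shrinks: this is why small student samples can usually detect only large effects.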

Honestly, I did not do a power analysis for my study. Instead, I asked for 5,000 surveys with the hope that 1,000 would come back. Given that only 87 came back, a power analysis conducted after the survey was complete would likely reveal that I did not have enough statistical power to answer my research questions. For your projects, try to get as many respondents as you feasibly can, but don’t worry too much about not reaching the optimal number of people to maximize the power of your study unless your goal is to publish something that is generalizable to a large population.

A final consideration is which statistical test you plan to use to analyze your data. We have not covered statistics yet, though we will provide a brief introduction to basic statistics in this textbook. For now, remember that some statistical tests have a minimum number of people that must be present in the sample in order to conduct the analysis. You will complete a data analysis plan before you begin your project and start sampling, so you can always increase the number of participants you plan to recruit based on what you learn in the next few chapters.

  • How many people can you feasibly sample in the time you have to complete your project?


One of the interesting things about surveying professionals is that sometimes, they email you about what they perceive to be a problem with your study. I got an email from a well-meaning participant in my LCSW study saying that my results were going to be biased! She pointed out that respondents who had been in practice a long time, before clinical supervision was required, would not have paid anything for supervision. This would lead me to draw conclusions that supervision was cheap, when in fact, it was expensive. My email back to her explained that she hit on one of my hypotheses, that social workers in practice for a longer period of time faced fewer costs to becoming licensed. Her email reinforced that I needed to account for the impact of length of practice on the costs of licensure I found across the sample. She was right to be on the lookout for bias in the sample.

One of the key questions you can ask is whether there is something about your process that makes it more likely you will select a certain type of person for your sample, making it less representative of the overall population. In my project, it’s worth thinking more about who is more likely to respond to an email advertisement for a research study. I know that my work email and personal email filter out advertisements, so it’s unlikely I would even see the recruitment for my own study (probably something I should have thought about before using grant funds to sample the NASW email list). Perhaps an older demographic that does not screen advertisements as closely, or those whose NASW account was linked to a personal email with fewer junk filters, would be more likely to respond. To the extent I made conclusions about clinical social workers of all ages based on a sample that was biased towards older social workers, my results would be biased. This is called selection bias, or the degree to which people in my sample systematically differ from the overall population.

Another potential source of bias here is nonresponse bias . Because people do not often respond to email advertisements (no matter how well-written they are), my sample is likely to be representative of people with characteristics that make them more likely to respond. They may have more time on their hands to take surveys and respond to their junk mail. To the extent that the sample is comprised of social workers with a lot of time on their hands (who are those people?) my sample will be biased and not representative of the overall population.

It’s important to note that both bias and error describe how samples differ from the overall population. Error describes random variation between samples, due to chance. Using a random process to recruit participants into a sample means you will have random variation between the sample and the population. Bias creates differences between the sample and population in a specific direction, such as towards those who have time to check their junk mail. Bias may be introduced by the sampling method used or by conscious or unconscious bias introduced by the researcher (Rubin & Babbie, 2017). [6] A researcher might select people who “look like good research participants,” in the process transferring their unconscious biases to their sample. They might exclude people from the sampling frame who “would not do well with the intervention.” Careful researchers can avoid these pitfalls, but unconscious and structural biases can be challenging to root out.

  • Identify potential sources of bias in your sample and brainstorm ways you can minimize them, if possible.

Critical considerations

Think back to your undergraduate degree. Did you ever participate in a research project as part of an introductory psychology or sociology course? Social science researchers on college campuses have a luxury that researchers elsewhere may not share: they have access to a whole bunch of (presumably) willing and able human guinea pigs. But that luxury comes at a cost: sample representativeness. One study of top academic journals in psychology found that over two-thirds (68%) of participants in the studies those journals published came from samples drawn in the United States (Arnett, 2008). [7] Further, the study found that two-thirds of the work derived from US samples published in the Journal of Personality and Social Psychology was based on samples made up entirely of American undergraduate students taking psychology courses.

These findings certainly raise the question: What do we actually learn from social science studies and about whom do we learn it? That is exactly the concern raised by Joseph Henrich and colleagues (Henrich, Heine, & Norenzayan, 2010), [8] authors of the article “The Weirdest People in the World?” In their piece, Henrich and colleagues point out that behavioral scientists very commonly make sweeping claims about human nature based on samples drawn only from WEIRD (Western, Educated, Industrialized, Rich, and Democratic) societies, and often based on even narrower samples, as is the case with many studies relying on samples drawn from college classrooms. As it turns out, robust findings about the nature of human behavior when it comes to fairness, cooperation, visual perception, trust, and other behaviors are based on studies that excluded participants from outside the United States and sometimes excluded anyone outside the college classroom (Begley, 2010). [9] This certainly raises questions about what we really know about human behavior as opposed to US resident or US undergraduate behavior. Of course, not all research findings are based on samples of WEIRD folks like college students. But even then, it would behoove us to pay attention to the population on which studies are based and the claims being made about those to whom the studies apply.

Another thing to keep in mind is that just because a sample may be representative in all respects that a researcher thinks are relevant, there may be relevant aspects that didn’t occur to the researcher when she was drawing her sample. You might not think that a person’s phone would have much to do with their voting preferences, for example. But had pollsters making predictions about the results of the 2008 presidential election not been careful to include both cell phone-only and landline households in their surveys, it is possible that their predictions would have underestimated Barack Obama’s lead over John McCain because Obama was much more popular among cell phone-only users than McCain (Keeter, Dimock, & Christian, 2008). [10] This is another example of bias.


Putting it all together

So how do we know how good our sample is or how good the samples gathered by other researchers are? While there might not be any magic or always-true rules we can apply, there are a couple of things we can keep in mind as we read the claims researchers make about their findings.

First, remember that sample quality is determined only by the sample actually obtained, not by the sampling method itself. A researcher may set out to administer a survey to a representative sample by correctly employing a random sampling approach with impeccable recruitment materials. But if only a handful of the people sampled actually respond to the survey, the researcher cannot claim that the sampling went according to plan.

Another thing to keep in mind, as demonstrated by the preceding discussion, is that researchers may be drawn to talking about implications of their findings as though they apply to some group other than the population actually sampled. Whether the sampling frame does not match the population or the sample and population differ on important criteria, the resulting sampling error can lead to bad science.

We’ve talked previously about the perils of generalizing social science findings from college students in the United States and other Western countries to all cultures in the world, imposing a Western view as the right and correct view of the social world. As consumers of theory and research, it is our responsibility to be attentive to this sort of (likely unintentional) bait and switch. And as researchers, it is our responsibility to make sure that we only draw conclusions from samples that are representative. A larger sample size and probability sampling can improve the representativeness and generalizability of a study’s findings to larger populations, though neither is a guarantee.

Finally, keep in mind that a sample allowing for comparisons of theoretically important concepts or variables is certainly better than one that does not allow for such comparisons. In a study based on a nonrepresentative sample, for example, we can learn about the strength of our social theories by comparing relevant aspects of social processes. We talked about this as theory-testing in Chapter 8 .

At their core, questions about sample quality should address who has been sampled, how they were sampled, and for what purpose they were sampled. Being able to answer those questions will help you better understand, and more responsibly interpret, research results. For your study, keep the following questions in mind.

  • Are your sample size and your sampling approach appropriate for your research question?
  • How much do you know about your sampling frame ahead of time? How will that impact the feasibility of different sampling approaches?
  • What gatekeepers and stakeholders are necessary to engage in order to access your sampling frame?
  • Are there any ethical issues that may make it difficult to sample those who have first-hand knowledge about your topic?
  • Does your sampling frame look like your population along important characteristics? Once you get your data, ask the same question of the sample you successfully recruit.
  • What about your population might make it more difficult or easier to sample?
  • Are there steps in your sampling procedure that may bias your sample to render it not representative of the population?
  • If you want to skip sampling altogether, are there sources of secondary data you can use? Or might you be able to answer your questions by sampling documents or media, rather than people?
  • The sampling plan you implement should have a reasonable likelihood of producing a representative sample. Student projects are given more leeway with nonrepresentative samples, and this limitation should be discussed in the student’s research report.
  • Researchers should conduct a power analysis to determine sample size, though quantitative student projects should endeavor to recruit as many participants as possible. Sample size impacts representativeness of the sample, its power, and which statistical tests can be conducted.
  • The sample you collect is one of an infinite number of potential samples that could have been drawn. To the extent the data in your sample varies from the data in the entire population, it includes some error or bias. Error is the result of random variations. Bias is systematic error that pushes the data in a given direction.
  • Even if you do everything right, there is no guarantee that you will draw a good sample. Flawed samples are okay to use as examples in the classroom, but the results of your research would have limited generalizability beyond your specific participants.
  • Historically, samples were drawn from dominant groups and generalized to all people. This shortcoming is a limitation of some social science literature and should be considered a colonialist scientific practice.
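The power analysis mentioned above is usually run in a statistics package, but the common case of comparing two group means can be approximated by hand with the standard normal distribution. The sketch below is illustrative only: the function name is mine, and the effect size, alpha, and power values are conventional defaults, not recommendations for any particular study.

```python
import math
from statistics import NormalDist

def n_per_group(effect_size, alpha=0.05, power=0.80):
    """Approximate sample size per group for a two-sided, two-sample
    comparison of means (standardized effect size d), using the
    normal approximation to the t-test."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value for alpha
    z_beta = NormalDist().inv_cdf(power)           # value for desired power
    return math.ceil(2 * ((z_alpha + z_beta) / effect_size) ** 2)

# A "medium" standardized effect (d = 0.5) at conventional thresholds
print(n_per_group(0.5))   # roughly 63 participants per group
# A "small" effect (d = 0.2) requires a much larger sample
print(n_per_group(0.2))   # roughly 393 participants per group
```

Note how sharply the required sample grows as the expected effect shrinks; this is why underpowered student projects so often fail to detect real relationships.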

Media Attributions

  • ho-hyou-qEOV3icU_Y4-unsplash © Ho Hyou is licensed under a CC0 (Creative Commons Zero) license
  • young-ethnic-woman-pointing-at-camera-3880943 © Andrea Piacquadio is licensed under a CC0 (Creative Commons Zero) license
  • Figure 9.1 (final)
  • lonso97-_GyB_XcGChs-unsplash © lonso97 is licensed under a CC0 (Creative Commons Zero) license
  • Systematic_sampling © Dan Kernler is licensed under a CC BY (Attribution) license
  • Figure 10.3 Stratified sampling © Dan Kernler is licensed under a CC BY-SA (Attribution ShareAlike) license
  • Cluster_sampling © Dan Kernler is licensed under a CC BY-SA (Attribution ShareAlike) license
  • pine-watt-3_Xwxya43hE-unsplash © pine watt is licensed under a Public Domain license
  • christina-wocintechchat-com-rCyiK4_aaWw-unsplash © Christina @ wocintechchat.com is licensed under a Public Domain license
  • business-2584713_1920 © Gerd Altmann
  • control-3312776_1920 © Gerd Altmann
  • I clearly need a snack.
  • Johnson, P. S., & Johnson, M. W. (2014). Investigation of “bath salts” use patterns within an online sample of users in the United States. Journal of Psychoactive Drugs, 46(5), 369–378.
  • Holt, J. L., & Gillespie, W. (2008). Intergenerational transmission of violence, threatened egoism, and reciprocity: A test of multiple psychosocial factors affecting intimate partner violence. American Journal of Criminal Justice, 33, 252–266.
  • Engel, R. J., & Schutt, R. K. (2011). The practice of research in social work (2nd ed.). Thousand Oaks, CA: SAGE.
  • Bhattacherjee, A. (2012). Social science research: Principles, methods, and practices. Retrieved from: https://scholarcommons.usf.edu/cgi/viewcontent.cgi?article=1002&context=oa_textbooks
  • Rubin, A., & Babbie, E. R. (2017). Research methods for social work (9th ed.). Boston, MA: Cengage.
  • Arnett, J. J. (2008). The neglected 95%: Why American psychology needs to become less American. American Psychologist, 63, 602–614.
  • Henrich, J., Heine, S. J., & Norenzayan, A. (2010). The weirdest people in the world? Behavioral and Brain Sciences, 33, 61–135.
  • Newsweek magazine published an interesting story about Henrich and his colleague’s study: Begley, S. (2010). What’s really human? The trouble with student guinea pigs. Retrieved from http://www.newsweek.com/2010/07/23/what-s-really-human.html
  • Keeter, S., Dimock, M., & Christian, L. (2008). Calling cell phones in ’08 pre-election polls. The Pew Research Center for the People and the Press. Retrieved from http://people-press.org/files/legacy-pdf/cell-phone-commentary.pdf

entity that a researcher wants to say something about at the end of her study (individual, group, or organization)

the entities that a researcher actually observes, measures, or collects in the course of trying to learn something about her unit of analysis (individuals, groups, or organizations)

the larger group of people you want to be able to make conclusions about based on the conclusions you draw from the people in your sample

the list of people from which a researcher will draw her sample

the people or organizations who control access to the population you want to study

an administrative body established to protect the rights and welfare of human research subjects recruited to participate in research activities conducted under the auspices of the institution with which it is affiliated

Inclusion criteria are general requirements a person must possess to be a part of your sample.

characteristics that disqualify a person from being included in a sample

the process by which the researcher informs potential participants about the study and attempts to get them to participate

the group of people you successfully recruit from your sampling frame to participate in your study

sampling approaches for which a person’s likelihood of being selected from the sampling frame is known

sampling approaches for which a person’s likelihood of being selected for membership in the sample is unknown

researcher gathers data from whatever cases happen to be convenient or available

(as in generalization) to make claims about a large population based on a smaller sample of people or items

selecting elements from a list using randomly generated numbers

the units in your sampling frame, usually people or documents

selecting every kth element from your sampling frame

the distance between the elements you select for inclusion in your study

the tendency for a pattern to occur at regular intervals

dividing the study population into subgroups based on a characteristic (or strata) and then drawing a sample from each subgroup

the characteristic by which the sample is divided in stratified random sampling

a sampling approach that begins by sampling groups (or clusters) of population elements and then selects elements from within those groups

in cluster sampling, giving clusters different chances of being selected based on their size so that each element within those clusters has an equal chance of being selected

a sample that looks like the population from which it was selected in all respects that are potentially relevant to the study

the set of all possible samples you could possibly draw for your study

The difference between what you find in a sample and what actually exists in the population from which the sample was drawn.

the odds you will detect a significant relationship between variables when one is truly present in your sample

the degree to which the people in a sample differ from the overall population

the bias that occurs when those who respond to your request to participate in a study are different from those who do not respond
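The probability sampling approaches defined above can be sketched in a few lines with Python's standard library. This is a minimal illustration, assuming a hypothetical sampling frame of 100 numbered elements; in a real study the frame would be a list of people or documents.

```python
import random

random.seed(42)  # fixed seed so the sketch is reproducible
population = list(range(1, 101))  # hypothetical sampling frame of 100 elements

# Simple random sampling: select elements using randomly generated choices
srs = random.sample(population, 10)

# Systematic sampling: every kth element after a random start
k = len(population) // 10          # sampling interval
start = random.randrange(k)        # random starting element
systematic = population[start::k]

# Stratified sampling: split the frame on a characteristic (strata),
# then draw from each subgroup (here, odd vs. even as a stand-in stratum)
strata = {"even": [e for e in population if e % 2 == 0],
          "odd":  [e for e in population if e % 2 == 1]}
stratified = [e for group in strata.values() for e in random.sample(group, 5)]

print(len(srs), len(systematic), len(stratified))  # 10 10 10
```

Note that in the systematic draw, every selected element sits exactly k positions apart, which is why periodicity in the frame can bias the sample.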

Graduate research methods in social work Copyright © 2021 by Matthew DeCarlo, Cory Cummings, Kate Agnelli is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License , except where otherwise noted.


August 12, 2024

One way social work researchers can better understand community needs—and move the field forward

by Matt Shipman, North Carolina State University

Researchers are calling on the social work community to begin incorporating a methodology called "discrete choice experiments" (DCEs) into their research, to better understand the needs and preferences of key stakeholders. This technique is well established in other fields but is rarely used in social work.

The paper, "How to Use Discrete Choice Experiments to Capture Stakeholder Preferences in Social Work Research," is published in the Journal of the Society for Social Work and Research.

"Social workers need to engage with a wide variety of stakeholders, from policy makers to the people who use social services," says Alan Ellis, an associate professor of social work at North Carolina State University and corresponding author of a paper introducing social work researchers to the DCE methodology.

"But social work, as a research discipline, has not identified a standard technique for eliciting the preferences of those stakeholders—even though this is a critical issue," Ellis says.

"Although traditional survey methods can be used to evaluate stakeholder perspectives, the DCE is one of several methodologies that were specifically designed to assess the degree to which people prioritize one thing over another. In this paper, we propose that social work researchers adopt DCEs as a robust tool for capturing stakeholder preferences on any number of issues."

In a DCE, researchers ask participants to complete a series of choice tasks: hypothetical situations in which each participant is presented with alternative scenarios and selects one or more.

"For example, social work researchers may want to know how parents and other caregivers prioritize different aspects of mental health treatment when choosing services for their children," Ellis says. "A DCE can explore this question by presenting scenarios that include different types of mental health care providers, treatment methods, costs, locations and so on. Caregivers' stated choices in these scenarios can provide a lot of information about their priorities."
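The choice tasks described above can be mocked up quickly in code. The sketch below builds a full factorial of hypothetical service profiles and pairs them into a choice task; the attribute names and levels are invented for illustration, and real DCE designs typically use fractional factorial or D-efficient designs rather than random pairing.

```python
import itertools
import random

# Hypothetical attributes for a children's mental health service DCE
attributes = {
    "provider": ["psychologist", "social worker"],
    "cost":     ["$0", "$25 per session"],
    "location": ["clinic", "school"],
}

# Full factorial: every combination of attribute levels is one profile
profiles = [dict(zip(attributes, combo))
            for combo in itertools.product(*attributes.values())]

# A choice task presents alternative profiles; the respondent picks one,
# and their stated choices across many tasks reveal attribute priorities
random.seed(1)
task = random.sample(profiles, 2)
for i, alternative in enumerate(task, start=1):
    print(f"Option {i}: {alternative}")
```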

DCEs were first developed by marketing researchers and are now widely used in fields ranging from transportation to health care.

"We know that DCEs effectively capture preferences on a wide variety of subjects," Ellis says. "We simply want to begin using them more consistently to address issues that are important to stakeholders in social work.

"From a pure research standpoint, having a better understanding of stakeholder needs and preferences can move the field forward by helping us develop better research questions and better studies," says Ellis. "Beyond that, having a better understanding of our clients' preferences and goals will make us better social workers. Adopting DCEs can strengthen the link between social work research and practice—and ground our research, policy, and practice in the values that are important to the people we serve.

"I'm optimistic that DCEs could help us collaborate with stakeholders to effect positive change."

The paper was co-authored by Qiana Cryer-Coupet of Georgia State University, Bridget Weller of Wayne State University, Kirsten Howard and Rakhee Raghunandan of the University of Sydney, and Kathleen Thomas of the University of North Carolina at Chapel Hill.

Provided by North Carolina State University

