Digital Commons @ University of South Florida

Mathematics and Statistics Theses and Dissertations

Theses/Dissertations from 2024

The Effect of Fixed Time Delays on the Synchronization Phase Transition , Shaizat Bakhytzhan

On the Subelliptic and Subparabolic Infinity Laplacian in Grushin-Type Spaces , Zachary Forrest

Utilizing Machine Learning Techniques for Accurate Diagnosis of Breast Cancer and Comprehensive Statistical Analysis of Clinical Data , Myat Ei Ei Phyo

Quandle Rings, Idempotents and Cocycle Invariants of Knots , Dipali Swain

Comparative Analysis of Time Series Models on U.S. Stock and Exchange Rates: Bayesian Estimation of Time Series Error Term Model Versus Machine Learning Approaches , Young Keun Yang

Theses/Dissertations from 2023

Classification of Finite Topological Quandles and Shelves via Posets , Hitakshi Lahrani

Applied Analysis for Learning Architectures , Himanshu Singh

Rational Functions of Degree Five That Permute the Projective Line Over a Finite Field , Christopher Sze

Theses/Dissertations from 2022

New Developments in Statistical Optimal Designs for Physical and Computer Experiments , Damola M. Akinlana

Advances and Applications of Optimal Polynomial Approximants , Raymond Centner

Data-Driven Analytical Predictive Modeling for Pancreatic Cancer, Financial & Social Systems , Aditya Chakraborty

On Simultaneous Similarity of d-tuples of Commuting Square Matrices , Corey Connelly

Symbolic Computation of Lump Solutions to a Combined (2+1)-dimensional Nonlinear Evolution Equation , Jingwei He

Boundary behavior of analytic functions and Approximation Theory , Spyros Pasias

Stability Analysis of Delay-Driven Coupled Cantilevers Using the Lambert W-Function , Daniel Siebel-Cortopassi

A Functional Optimization Approach to Stochastic Process Sampling , Ryan Matthew Thurman

Theses/Dissertations from 2021

Riemann-Hilbert Problems for Nonlocal Reverse-Time Nonlinear Second-order and Fourth-order AKNS Systems of Multiple Components and Exact Soliton Solutions , Alle Adjiri

Zeros of Harmonic Polynomials and Related Applications , Azizah Alrajhi

Combination of Time Series Analysis and Sentiment Analysis for Stock Market Forecasting , Hsiao-Chuan Chou

Uncertainty Quantification in Deep and Statistical Learning with applications in Bio-Medical Image Analysis , K. Ruwani M. Fernando

Data-Driven Analytical Modeling of Multiple Myeloma Cancer, U.S. Crop Production and Monitoring Process , Lohuwa Mamudu

Long-time Asymptotics for mKdV Type Reduced Equations of the AKNS Hierarchy in Weighted L² Sobolev Spaces , Fudong Wang

Online and Adjusted Human Activities Recognition with Statistical Learning , Yanjia Zhang

Theses/Dissertations from 2020

Bayesian Reliability Analysis of The Power Law Process and Statistical Modeling of Computer and Network Vulnerabilities with Cybersecurity Application , Freeh N. Alenezi

Discrete Models and Algorithms for Analyzing DNA Rearrangements , Jasper Braun

Bayesian Reliability Analysis for Optical Media Using Accelerated Degradation Test Data , Kun Bu

On the p(x)-Laplace equation in Carnot groups , Robert D. Freeman

Clustering methods for gene expression data of Oxytricha trifallax , Kyle Houfek

Gradient Boosting for Survival Analysis with Applications in Oncology , Nam Phuong Nguyen

Global and Stochastic Dynamics of Diffusive Hindmarsh-Rose Equations in Neurodynamics , Chi Phan

Restricted Isometric Projections for Differentiable Manifolds and Applications , Vasile Pop

On Some Problems on Polynomial Interpolation in Several Variables , Brian Jon Tuesink

Numerical Study of Gap Distributions in Determinantal Point Process on Low Dimensional Spheres: L-Ensemble of O(n) Model Type for n = 2 and n = 3 , Xiankui Yang

Non-Associative Algebraic Structures in Knot Theory , Emanuele Zappala

Theses/Dissertations from 2019

Field Quantization for Radiative Decay of Plasmons in Finite and Infinite Geometries , Maryam Bagherian

Probabilistic Modeling of Democracy, Corruption, Hemophilia A and Prediabetes Data , A. K. M. Raquibul Bashar

Generalized Derivations of Ternary Lie Algebras and n-BiHom-Lie Algebras , Amine Ben Abdeljelil

Fractional Random Weighted Bootstrapping for Classification on Imbalanced Data with Ensemble Decision Tree Methods , Sean Charles Carter

Hierarchical Self-Assembly and Substitution Rules , Daniel Alejandro Cruz

Statistical Learning of Biomedical Non-Stationary Signals and Quality of Life Modeling , Mahdi Goudarzi

Probabilistic and Statistical Prediction Models for Alzheimer’s Disease and Statistical Analysis of Global Warming , Maryam Ibrahim Habadi

Essays on Time Series and Machine Learning Techniques for Risk Management , Michael Kotarinos

The Systems of Post and Post Algebras: A Demonstration of an Obvious Fact , Daviel Leyva

Reconstruction of Radar Images by Using Spherical Mean and Regular Radon Transforms , Ozan Pirbudak

Analyses of Unorthodox Overlapping Gene Segments in Oxytricha Trifallax , Shannon Stich

An Optimal Medium-Strength Regularity Algorithm for 3-uniform Hypergraphs , John Theado

Power Graphs of Quasigroups , DayVon L. Walker

Theses/Dissertations from 2018

Groups Generated by Automata Arising from Transformations of the Boundaries of Rooted Trees , Elsayed Ahmed

Non-equilibrium Phase Transitions in Interacting Diffusions , Wael Al-Sawai

A Hybrid Dynamic Modeling of Time-to-event Processes and Applications , Emmanuel A. Appiah

Lump Solutions and Riemann-Hilbert Approach to Soliton Equations , Sumayah A. Batwa

Developing a Model to Predict Prevalence of Compulsive Behavior in Individuals with OCD , Lindsay D. Fields

Generalizations of Quandles and their cohomologies , Matthew J. Green

Hamiltonian structures and Riemann-Hilbert problems of integrable systems , Xiang Gu

Optimal Latin Hypercube Designs for Computer Experiments Based on Multiple Objectives , Ruizhe Hou

Human Activity Recognition Based on Transfer Learning , Jinyong Pang

Signal Detection of Adverse Drug Reaction using the Adverse Event Reporting System: Literature Review and Novel Methods , Minh H. Pham

Statistical Analysis and Modeling of Cyber Security and Health Sciences , Nawa Raj Pokhrel

Machine Learning Methods for Network Intrusion Detection and Intrusion Prevention Systems , Zheni Svetoslavova Stefanova

Orthogonal Polynomials With Respect to the Measure Supported Over the Whole Complex Plane , Meng Yang

Theses/Dissertations from 2017

Modeling in Finance and Insurance With Lévy-Itô Driven Dynamic Processes under Semi Markov-type Switching Regimes and Time Domains , Patrick Armand Assonken Tonfack

Prevalence of Typical Images in High School Geometry Textbooks , Megan N. Cannon

On Extending Hansel's Theorem to Hypergraphs , Gregory Sutton Churchill

Contributions to Quandle Theory: A Study of f-Quandles, Extensions, and Cohomology , Indu Rasika U. Churchill

Linear Extremal Problems in the Hardy Space H^p for 0 < p < 1 , Robert Christopher Connelly

Statistical Analysis and Modeling of Ovarian and Breast Cancer , Muditha V. Devamitta Perera

Statistical Analysis and Modeling of Stomach Cancer Data , Chao Gao

Structural Analysis of Poloidal and Toroidal Plasmons and Fields of Multilayer Nanorings , Kumar Vijay Garapati

Dynamics of Multicultural Social Networks , Kristina B. Hilton

Cybersecurity: Stochastic Analysis and Modelling of Vulnerabilities to Determine the Network Security and Attackers Behavior , Pubudu Kalpani Kaluarachchi

Generalized D-Kaup-Newell integrable systems and their integrable couplings and Darboux transformations , Morgan Ashley McAnally

Patterns in Words Related to DNA Rearrangements , Lukas Nabergall

Time Series Online Empirical Bayesian Kernel Density Segmentation: Applications in Real Time Activity Recognition Using Smartphone Accelerometer , Shuang Na

Schreier Graphs of Thompson's Group T , Allen Pennington

Cybersecurity: Probabilistic Behavior of Vulnerability and Life Cycle , Sasith Maduranga Rajasooriya

Bayesian Artificial Neural Networks in Health and Cybersecurity , Hansapani Sarasepa Rodrigo

Real-time Classification of Biomedical Signals, Parkinson’s Analytical Model , Abolfazl Saghafi

Lump, complexiton and algebro-geometric solutions to soliton equations , Yuan Zhou

Theses/Dissertations from 2016

A Statistical Analysis of Hurricanes in the Atlantic Basin and Sinkholes in Florida , Joy Marie D'andrea

Statistical Analysis of a Risk Factor in Finance and Environmental Models for Belize , Sherlene Enriquez-Savery

Putnam's Inequality and Analytic Content in the Bergman Space , Matthew Fleeman

On the Number of Colors in Quandle Knot Colorings , Jeremy William Kerr

Statistical Modeling of Carbon Dioxide and Cluster Analysis of Time Dependent Information: Lag Target Time Series Clustering, Multi-Factor Time Series Clustering, and Multi-Level Time Series Clustering , Doo Young Kim

Some Results Concerning Permutation Polynomials over Finite Fields , Stephen Lappano

Hamiltonian Formulations and Symmetry Constraints of Soliton Hierarchies of (1+1)-Dimensional Nonlinear Evolution Equations , Solomon Manukure

Modeling and Survival Analysis of Breast Cancer: A Statistical, Artificial Neural Network, and Decision Tree Approach , Venkateswara Rao Mudunuru

Generalized Phase Retrieval: Isometries in Vector Spaces , Josiah Park

Leonard Systems and their Friends , Jonathan Spiewak

Resonant Solutions to (3+1)-dimensional Bilinear Differential Equations , Yue Sun

Statistical Analysis and Modeling Health Data: A Longitudinal Study , Bhikhari Prasad Tharu

Global Attractors and Random Attractors of Reaction-Diffusion Systems , Junyi Tu

Time Dependent Kernel Density Estimation: A New Parameter Estimation Algorithm, Applications in Time Series Classification and Clustering , Xing Wang

On Spectral Properties of Single Layer Potentials , Seyed Zoalroshd

Theses/Dissertations from 2015

Analysis of Rheumatoid Arthritis Data using Logistic Regression and Penalized Approach , Wei Chen

Active Tile Self-assembly and Simulations of Computational Systems , Daria Karpenko

Nearest Neighbor Foreign Exchange Rate Forecasting with Mahalanobis Distance , Vindya Kumari Pathirana

Statistical Learning with Artificial Neural Network Applied to Health and Environmental Data , Taysseer Sharaf

Radial Versus Othogonal and Minimal Projections onto Hyperplanes in l_4^3 , Richard Alan Warner

Ensemble Learning Method on Machine Maintenance Data , Xiaochuang Zhao

Theses/Dissertations from 2014

Properties of Graphs Used to Model DNA Recombination , Ryan Arredondo

The Beginner's Guide to Statistical Analysis | 5 Steps & Examples

Statistical analysis means investigating trends, patterns, and relationships using quantitative data. It is an important research tool used by scientists, governments, businesses, and other organizations.

To draw valid conclusions, statistical analysis requires careful planning from the very start of the research process. You need to specify your hypotheses and make decisions about your research design, sample size, and sampling procedure.

After collecting data from your sample, you can organize and summarize the data using descriptive statistics. Then, you can use inferential statistics to formally test hypotheses and make estimates about the population. Finally, you can interpret and generalize your findings.

This article is a practical introduction to statistical analysis for students and researchers. We’ll walk you through the steps using two research examples. The first investigates a potential cause-and-effect relationship, while the second investigates a potential correlation between variables.

Table of contents

  • Step 1: Write your hypotheses and plan your research design
  • Step 2: Collect data from a sample
  • Step 3: Summarize your data with descriptive statistics
  • Step 4: Test hypotheses or make estimates with inferential statistics
  • Step 5: Interpret your results
  • Other interesting articles

To collect valid data for statistical analysis, you first need to specify your hypotheses and plan out your research design.

Writing statistical hypotheses

The goal of research is often to investigate a relationship between variables within a population. You start with a prediction, and use statistical analysis to test that prediction.

A statistical hypothesis is a formal way of writing a prediction about a population. Every research prediction is rephrased into null and alternative hypotheses that can be tested using sample data.

While the null hypothesis always predicts no effect or no relationship between variables, the alternative hypothesis states your research prediction of an effect or relationship.

  • Null hypothesis: A 5-minute meditation exercise will have no effect on math test scores in teenagers.
  • Alternative hypothesis: A 5-minute meditation exercise will improve math test scores in teenagers.
  • Null hypothesis: Parental income and GPA have no relationship with each other in college students.
  • Alternative hypothesis: Parental income and GPA are positively correlated in college students.

Planning your research design

A research design is your overall strategy for data collection and analysis. It determines the statistical tests you can use to test your hypothesis later on.

First, decide whether your research will use a descriptive, correlational, or experimental design. Experiments directly influence variables, whereas descriptive and correlational studies only measure variables.

  • In an experimental design, you can assess a cause-and-effect relationship (e.g., the effect of meditation on test scores) using statistical tests of comparison or regression.
  • In a correlational design, you can explore relationships between variables (e.g., parental income and GPA) without any assumption of causality using correlation coefficients and significance tests.
  • In a descriptive design, you can study the characteristics of a population or phenomenon (e.g., the prevalence of anxiety in U.S. college students) using statistical tests to draw inferences from sample data.

Your research design also concerns whether you’ll compare participants at the group level or individual level, or both.

  • In a between-subjects design, you compare the group-level outcomes of participants who have been exposed to different treatments (e.g., those who performed a meditation exercise vs those who didn’t).
  • In a within-subjects design, you compare repeated measures from participants who have participated in all treatments of a study (e.g., scores from before and after performing a meditation exercise).
  • In a mixed (factorial) design, one variable is altered between subjects and another is altered within subjects (e.g., pretest and posttest scores from participants who either did or didn’t do a meditation exercise).

Example: Experimental research design

First, you’ll take baseline test scores from participants. Then, your participants will undergo a 5-minute meditation exercise. Finally, you’ll record participants’ scores from a second math test.

In this experiment, the independent variable is the 5-minute meditation exercise, and the dependent variable is the math test score from before and after the intervention.

Example: Correlational research design

In a correlational study, you test whether there is a relationship between parental income and GPA in graduating college students. To collect your data, you will ask participants to fill in a survey and self-report their parents’ incomes and their own GPA.

Measuring variables

When planning a research design, you should operationalize your variables and decide exactly how you will measure them.

For statistical analysis, it’s important to consider the level of measurement of your variables, which tells you what kind of data they contain:

  • Categorical data represents groupings. These may be nominal (e.g., gender) or ordinal (e.g., level of language ability).
  • Quantitative data represents amounts. These may be on an interval scale (e.g., test score) or a ratio scale (e.g., age).

Many variables can be measured at different levels of precision. For example, age data can be quantitative (8 years old) or categorical (young). If a variable is coded numerically (e.g., level of agreement from 1–5), it doesn’t automatically mean that it’s quantitative instead of categorical.

Identifying the measurement level is important for choosing appropriate statistics and hypothesis tests. For example, you can calculate a mean score with quantitative data, but not with categorical data.

In a research study, along with measures of your variables of interest, you’ll often collect data on relevant participant characteristics.

Variable               Type of data
Age                    Quantitative (ratio)
Gender                 Categorical (nominal)
Race or ethnicity      Categorical (nominal)
Baseline test scores   Quantitative (interval)
Final test scores      Quantitative (interval)
Parental income        Quantitative (ratio)
GPA                    Quantitative (interval)

In most cases, it’s too difficult or expensive to collect data from every member of the population you’re interested in studying. Instead, you’ll collect data from a sample.

Statistical analysis allows you to apply your findings beyond your own sample as long as you use appropriate sampling procedures. You should aim for a sample that is representative of the population.

Sampling for statistical analysis

There are two main approaches to selecting a sample.

  • Probability sampling: every member of the population has a chance of being selected for the study through random selection.
  • Non-probability sampling: some members of the population are more likely than others to be selected for the study because of criteria such as convenience or voluntary self-selection.
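
As a minimal sketch of probability sampling, the snippet below draws a simple random sample from a sampling frame. The frame of 1,000 student IDs, the sample size of 30, and the fixed seed are all hypothetical values chosen for illustration; none of them come from the studies described in this article.

```python
import random

# Hypothetical sampling frame: one ID per member of the population.
population = [f"student_{i:04d}" for i in range(1, 1001)]

def simple_random_sample(frame, k, seed=None):
    """Draw k distinct units; every unit has the same chance of selection."""
    rng = random.Random(seed)  # a fixed seed makes the draw reproducible
    return rng.sample(frame, k)

sample = simple_random_sample(population, k=30, seed=42)
print(len(sample), sample[:3])
```

A convenience or volunteer sample has no equivalent of this step, because the researcher does not control who ends up in the sample.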

In theory, for highly generalizable findings, you should use a probability sampling method. Random selection reduces several types of research bias, like sampling bias, and helps ensure that data from your sample is actually typical of the population. Parametric tests can be used to make strong statistical inferences when data are collected using probability sampling.

But in practice, it’s rarely possible to gather the ideal sample. While non-probability samples are more likely to be at risk for biases like self-selection bias, they are much easier to recruit and collect data from. Non-parametric tests are more appropriate for non-probability samples, but they result in weaker inferences about the population.

If you want to use parametric tests for non-probability samples, you have to make the case that:

  • your sample is representative of the population you’re generalizing your findings to.
  • your sample lacks systematic bias.

Keep in mind that external validity means that you can only generalize your conclusions to others who share the characteristics of your sample. For instance, results from Western, Educated, Industrialized, Rich and Democratic samples (e.g., college students in the US) aren’t automatically applicable to all non-WEIRD populations.

If you apply parametric tests to data from non-probability samples, be sure to elaborate on the limitations of how far your results can be generalized in your discussion section.

Create an appropriate sampling procedure

Based on the resources available for your research, decide on how you’ll recruit participants.

  • Will you have resources to advertise your study widely, including outside of your university setting?
  • Will you have the means to recruit a diverse sample that represents a broad population?
  • Do you have time to contact and follow up with members of hard-to-reach groups?

Example: Sampling (experimental study)

Your participants are self-selected by their schools. Although you’re using a non-probability sample, you aim for a diverse and representative sample.

Example: Sampling (correlational study)

Your main population of interest is male college students in the US. Using social media advertising, you recruit senior-year male college students from a smaller subpopulation: seven universities in the Boston area.

Calculate sufficient sample size

Before recruiting participants, decide on your sample size either by looking at other studies in your field or using statistics. A sample that’s too small may be unrepresentative of the population, while a sample that’s too large will be more costly than necessary.

There are many sample size calculators online. Different formulas are used depending on whether you have subgroups or how rigorous your study should be (e.g., in clinical research). As a rule of thumb, aim for at least 30 units per subgroup.

To use these calculators, you have to understand and input these key components:

  • Significance level (alpha): the risk of rejecting a true null hypothesis that you are willing to take, usually set at 5%.
  • Statistical power: the probability of your study detecting an effect of a certain size if there is one, usually 80% or higher.
  • Expected effect size: a standardized indication of how large the expected result of your study will be, usually based on other similar studies.
  • Population standard deviation: an estimate of the population parameter based on a previous study or a pilot study of your own.
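
To show how these components combine, here is a rough sketch of one common calculation: the normal-approximation sample size for comparing the means of two independent groups with a two-sided test. The function name and the example effect size of 0.5 are assumptions for illustration, and this is only one of the formulas that online calculators might use.

```python
import math
from statistics import NormalDist

def n_per_group(effect_size, alpha=0.05, power=0.80):
    """Approximate n per group for a two-sided, two-group comparison of means.

    Normal-approximation formula: n = 2 * ((z_(1-alpha/2) + z_power) / d)^2,
    where d is the expected standardized effect size (Cohen's d).
    """
    z = NormalDist()  # standard normal distribution
    z_alpha = z.inv_cdf(1 - alpha / 2)  # critical value for the significance level
    z_power = z.inv_cdf(power)          # quantile matching the desired power
    return math.ceil(2 * ((z_alpha + z_power) / effect_size) ** 2)

print(n_per_group(0.5))  # medium expected effect, 5% alpha, 80% power
```

Smaller expected effects push the required sample size up quickly, because the effect size is squared in the denominator. Calculators based on the t distribution typically return slightly larger values than this approximation.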

Once you’ve collected all of your data, you can inspect them and calculate descriptive statistics that summarize them.

Inspect your data

There are various ways to inspect your data, including the following:

  • Organizing data from each variable in frequency distribution tables.
  • Displaying data from a key variable in a bar chart to view the distribution of responses.
  • Visualizing the relationship between two variables using a scatter plot.

By visualizing your data in tables and graphs, you can assess whether your data follow a skewed or normal distribution and whether there are any outliers or missing data.
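
A frequency distribution table can be built in a few lines. The sketch below uses made-up responses on a 1–5 agreement scale (the data and the function name are hypothetical) and prints a count, a percentage, and a crude text bar for each value.

```python
from collections import Counter

# Hypothetical survey responses on a 1-5 agreement scale.
responses = [4, 5, 3, 4, 2, 4, 5, 3, 4, 1, 3, 4]

def frequency_table(values):
    """Return (value, count, percent) rows sorted by value."""
    counts = Counter(values)
    total = len(values)
    return [(v, counts[v], 100 * counts[v] / total) for v in sorted(counts)]

for value, count, percent in frequency_table(responses):
    # A row of '#' marks doubles as a rough text bar chart.
    print(f"{value}  {count:2d}  {percent:5.1f}%  {'#' * count}")
```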

A normal distribution means that your data are symmetrically distributed around a center where most values lie, with the values tapering off at the tail ends.

[Figure: mean, median, mode, and standard deviation in a normal distribution]

In contrast, a skewed distribution is asymmetric and has more values on one end than the other. The shape of the distribution is important to keep in mind because only some descriptive statistics should be used with skewed distributions.

Extreme outliers can also produce misleading statistics, so you may need a systematic approach to dealing with these values.
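
One widely used systematic approach, shown here as an illustration and not prescribed by this article, is the 1.5 × IQR rule: flag any value that lies more than 1.5 interquartile ranges below the first quartile or above the third. The scores below are hypothetical.

```python
from statistics import quantiles

def iqr_outliers(values, k=1.5):
    """Flag values more than k * IQR below Q1 or above Q3."""
    q1, _, q3 = quantiles(values, n=4)  # quartile cut points
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]

scores = [61, 64, 66, 68, 69, 70, 71, 73, 75, 120]  # hypothetical test scores
print(iqr_outliers(scores))
```

Flagging a value is not the same as deleting it. Whether to remove, transform, or keep an outlier depends on whether it reflects a data entry error or a genuine observation.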

Calculate measures of central tendency

Measures of central tendency describe where most of the values in a data set lie. Three main measures of central tendency are often reported:

  • Mode: the most popular response or value in the data set.
  • Median: the value in the exact middle of the data set when ordered from low to high.
  • Mean: the sum of all values divided by the number of values.

However, depending on the shape of the distribution and level of measurement, only one or two of these measures may be appropriate. For example, many demographic characteristics can only be described using the mode or proportions, while a variable like reaction time may not have a mode at all.
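
The three measures can be computed with Python's standard `statistics` module. In this small sketch, both data sets are invented for illustration; note that only the mode is meaningful for the nominal variable.

```python
from statistics import mean, median, multimode

scores = [62, 68, 68, 71, 75, 75, 75, 80, 94]  # hypothetical quantitative data
print(mean(scores))       # sum of the values divided by their count
print(median(scores))     # middle value of the ordered data
print(multimode(scores))  # most frequent value(s), returned as a list

majors = ["math", "biology", "math", "history"]  # hypothetical nominal data
print(multimode(majors))  # a mean or median would be meaningless here
```

`multimode` returns every value tied for the highest frequency, which is convenient for data with more than one mode.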

Calculate measures of variability

Measures of variability tell you how spread out the values in a data set are. Four main measures of variability are often reported:

  • Range: the highest value minus the lowest value of the data set.
  • Interquartile range: the range of the middle half of the data set.
  • Standard deviation: the average distance between each value in your data set and the mean.
  • Variance: the square of the standard deviation.

Once again, the shape of the distribution and level of measurement should guide your choice of variability statistics. The interquartile range is the best measure for skewed distributions, while standard deviation and variance provide the best information for normal distributions.
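
The four measures can be sketched with the standard library as follows. The data and the helper name `variability` are hypothetical. `stdev` and `variance` use the sample (n − 1) formulas, and quartiles can differ slightly between software packages because several interpolation conventions exist.

```python
from statistics import quantiles, stdev, variance

def variability(values):
    """Return the range, interquartile range, sample SD, and sample variance."""
    q1, _, q3 = quantiles(values, n=4)  # quartile cut points
    return {
        "range": max(values) - min(values),
        "iqr": q3 - q1,
        "sd": stdev(values),
        "variance": variance(values),
    }

stats = variability([2, 4, 4, 4, 5, 5, 7, 9])  # hypothetical data
for name, value in stats.items():
    print(f"{name}: {value:.2f}")
```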

Using your table, you should check whether the units of the descriptive statistics are comparable for pretest and posttest scores. For example, are the variance levels similar across the groups? Are there any extreme values? If there are, you may need to identify and remove extreme outliers in your data set or transform your data before performing a statistical test.

                     Pretest scores   Posttest scores
Mean                 68.44            75.25
Standard deviation   9.43             9.88
Variance             88.96            97.96
Range                36.25            45.12
N                    30

From this table, we can see that the mean score increased after the meditation exercise, and the variances of the two scores are comparable. Next, we can perform a statistical test to find out if this improvement in test scores is statistically significant in the population.

Example: Descriptive statistics (correlational study)

After collecting data from 653 students, you tabulate descriptive statistics for annual parental income and GPA.

It’s important to check whether you have a broad range of data points. If you don’t, your data may be skewed towards some groups more than others (e.g., high academic achievers), and only limited inferences can be made about a relationship.

                     Parental income (USD)   GPA
Mean                 62,100                  3.12
Standard deviation   15,000                  0.45
Variance             225,000,000             0.16
Range                8,000–378,000           2.64–4.00
N                    653

A number that describes a sample is called a statistic, while a number describing a population is called a parameter. Using inferential statistics, you can make conclusions about population parameters based on sample statistics.

Researchers often use two main methods (simultaneously) to make inferences in statistics.

  • Estimation: calculating population parameters based on sample statistics.
  • Hypothesis testing: a formal process for testing research predictions about the population using samples.

You can make two types of estimates of population parameters from sample statistics:

  • A point estimate: a value that represents your best guess of the exact parameter.
  • An interval estimate: a range of values that represent your best guess of where the parameter lies.

If your aim is to infer and report population characteristics from sample data, it’s best to use both point and interval estimates in your paper.

You can consider a sample statistic a point estimate for the population parameter when you have a representative sample (e.g., in a wide public opinion poll, the proportion of a sample that supports the current government is taken as the population proportion of government supporters).

There’s always error involved in estimation, so you should also provide a confidence interval as an interval estimate to show the variability around a point estimate.

A confidence interval uses the standard error and the z score from the standard normal distribution to convey where you’d generally expect to find the population parameter most of the time.
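
As a worked sketch, the function below builds a z-based confidence interval for a mean as the point estimate plus or minus z times the standard error. The inputs are the posttest mean (75.25), standard deviation (9.88), and sample size (30) from the experimental example's descriptive table; the function name is an assumption for illustration.

```python
import math
from statistics import NormalDist

def mean_confidence_interval(mean, sd, n, level=0.95):
    """z-based confidence interval for a mean: mean +/- z * (sd / sqrt(n))."""
    se = sd / math.sqrt(n)                     # standard error of the mean
    z = NormalDist().inv_cdf(0.5 + level / 2)  # about 1.96 for a 95% level
    return mean - z * se, mean + z * se

low, high = mean_confidence_interval(75.25, 9.88, 30)
print(f"{low:.2f} to {high:.2f}")
```

With these inputs the interval runs from roughly 71.7 to 78.8. For a sample this small, an interval based on the t distribution would be slightly wider.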

Hypothesis testing

Using data from a sample, you can test hypotheses about relationships between variables in the population. Hypothesis testing starts with the assumption that the null hypothesis is true in the population, and you use statistical tests to assess whether the null hypothesis can be rejected or not.

Statistical tests determine where your sample data would lie on an expected distribution of sample data if the null hypothesis were true. These tests give two main outputs:

  • A test statistic tells you how much your data differs from the null hypothesis of the test.
  • A p value tells you the probability of obtaining results at least as extreme as yours if the null hypothesis is actually true in the population.

Statistical tests come in three main varieties:

  • Comparison tests assess group differences in outcomes.
  • Regression tests assess cause-and-effect relationships between variables.
  • Correlation tests assess relationships between variables without assuming causation.

Your choice of statistical test depends on your research questions, research design, sampling method, and data characteristics.

Parametric tests

Parametric tests make powerful inferences about the population based on sample data. But to use them, some assumptions must be met, and only some types of variables can be used. If your data violate these assumptions, you can perform appropriate data transformations or use alternative non-parametric tests instead.

A regression models the extent to which changes in a predictor variable result in changes in the outcome variable(s).

  • A simple linear regression includes one predictor variable and one outcome variable.
  • A multiple linear regression includes two or more predictor variables and one outcome variable.
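
A simple linear regression can be fitted by ordinary least squares in a few lines. In this minimal sketch the data (hours of study and test scores) and the function name are hypothetical; the fitted line is score = intercept + slope × hours.

```python
def simple_linear_regression(x, y):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    # Slope: covariation of x and y divided by the variation in x.
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

hours = [1, 2, 3, 4, 5]        # hypothetical predictor values
score = [52, 55, 61, 64, 70]   # hypothetical outcome values
slope, intercept = simple_linear_regression(hours, score)
print(f"score = {intercept:.1f} + {slope:.1f} * hours")
```

The slope is the estimated change in the outcome for a one-unit change in the predictor. On its own it describes an association in the sample, not proof of a causal effect.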

Comparison tests usually compare the means of groups. These may be the means of different groups within a sample (e.g., a treatment and control group), the means of one sample group taken at different times (e.g., pretest and posttest scores), or a sample mean and a population mean.

  • A t test is for exactly 1 or 2 groups when the sample is small (30 or less).
  • A z test is for exactly 1 or 2 groups when the sample is large.
  • An ANOVA is for 3 or more groups.
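
The rule of thumb in this list can be written as a tiny helper. This is only a sketch of the simplified guidance above, with a hypothetical function name; in practice the choice also depends on the test's assumptions and on whether the population standard deviation is known.

```python
def choose_comparison_test(n_groups, sample_size):
    """Pick a comparison test using the rule of thumb from the list above."""
    if n_groups < 1:
        raise ValueError("n_groups must be at least 1")
    if n_groups >= 3:
        return "ANOVA"
    # One or two groups: small samples (30 or less) use t, larger ones use z.
    return "t test" if sample_size <= 30 else "z test"

print(choose_comparison_test(n_groups=2, sample_size=30))
print(choose_comparison_test(n_groups=2, sample_size=653))
print(choose_comparison_test(n_groups=4, sample_size=120))
```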

The z and t tests have subtypes based on the number and types of samples and the hypotheses:

  • If you have only one sample that you want to compare to a population mean, use a one-sample test.
  • If you have paired measurements (within-subjects design), use a dependent (paired) samples test.
  • If you have completely separate measurements from two unmatched groups (between-subjects design), use an independent (unpaired) samples test.
  • If you expect a difference between groups in a specific direction, use a one-tailed test.
  • If you don’t have any expectations for the direction of a difference between groups, use a two-tailed test.

The only parametric correlation test is Pearson’s r. The correlation coefficient (r) tells you the strength of a linear relationship between two quantitative variables.

However, to test whether the correlation in the sample is strong enough to be important in the population, you also need to perform a significance test of the correlation coefficient, usually a t test, to obtain a p value. This test uses your sample size to calculate how much the correlation coefficient differs from zero in the population.
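
The significance test for a correlation converts r into a t statistic, t = r·√(n − 2) / √(1 − r²), with n − 2 degrees of freedom. The sketch below is illustrative: r = 0.12 is an assumed value that the text does not state, combined with n = 653 from the correlational example. Because the degrees of freedom are large, the p value is approximated with the standard normal distribution instead of the exact t distribution.

```python
import math
from statistics import NormalDist

def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def correlation_t(r, n):
    """t statistic for H0: population correlation is zero (df = n - 2)."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)

# Assumed r = 0.12 (hypothetical) with the example's n = 653.
t_value = correlation_t(0.12, 653)
# With df this large the t distribution is close to standard normal,
# so a one-tailed p value can be approximated from the normal CDF.
p_approx = 1 - NormalDist().cdf(t_value)
print(round(t_value, 2), round(p_approx, 4))
```

Under that assumption the t value lands close to the 3.08 reported in the correlational example. This shows how even a weak correlation can be statistically significant when the sample is large.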

You use a dependent-samples, one-tailed t test to assess whether the meditation exercise significantly improved math test scores. The test gives you:

  • a t value (test statistic) of 3.00
  • a p value of 0.0028
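
For readers who want to see where such a t value comes from, here is a minimal sketch of the dependent-samples t statistic: the mean of the paired differences divided by its standard error. The six pairs of scores are invented and are not the study's data. The p value would then be read from a t distribution with n − 1 degrees of freedom, which requires a statistics library or a t table.

```python
import math
from statistics import mean, stdev

def paired_t(before, after):
    """Return (t statistic, degrees of freedom) for paired measurements."""
    diffs = [a - b for a, b in zip(after, before)]  # per-participant change
    n = len(diffs)
    se = stdev(diffs) / math.sqrt(n)  # standard error of the mean difference
    return mean(diffs) / se, n - 1

before = [60, 72, 65, 70, 58, 77]  # hypothetical pretest scores
after = [66, 75, 64, 78, 63, 80]   # hypothetical posttest scores
t_value, df = paired_t(before, after)
print(f"t({df}) = {t_value:.2f}")
```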

Although Pearson’s r is a test statistic, it doesn’t tell you anything about how significant the correlation is in the population. You also need to test whether this sample correlation coefficient is large enough to demonstrate a correlation in the population.

A t test can also determine how significantly a correlation coefficient differs from zero based on sample size. Since you expect a positive correlation between parental income and GPA, you use a one-sample, one-tailed t test. The t test gives you:

  • a t value of 3.08
  • a p value of 0.001

The final step of statistical analysis is interpreting your results.

Statistical significance

In hypothesis testing, statistical significance is the main criterion for forming conclusions. You compare your p value to a set significance level (usually 0.05) to decide whether your results are statistically significant or non-significant.

Statistically significant results are considered unlikely to have arisen solely due to chance. There is only a very low chance of such a result occurring if the null hypothesis is true in the population.

This means that you believe the meditation intervention, rather than random factors, directly caused the increase in test scores.

Example: Interpret your results (correlational study)

You compare your p value of 0.001 to your significance threshold of 0.05. With a p value under this threshold, you can reject the null hypothesis. This indicates a statistically significant correlation between parental income and GPA in male college students.

Note that correlation doesn’t always mean causation, because there are often many underlying factors contributing to a complex variable like GPA. Even if one variable is related to another, this may be because of a third variable influencing both of them, or indirect links between the two variables.

Effect size

A statistically significant result doesn’t necessarily mean that there are important real life applications or clinical outcomes for a finding.

In contrast, the effect size indicates the practical significance of your results. It’s important to report effect sizes along with your inferential statistics for a complete picture of your results. You should also report interval estimates of effect sizes if you’re writing an APA style paper .

With a Cohen’s d of 0.72, there’s medium to high practical significance to your finding that the meditation exercise improved test scores.

Example: Effect size (correlational study)

To determine the effect size of the correlation coefficient, you compare your Pearson’s r value to Cohen’s effect size criteria.
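To make the two effect sizes concrete, here is a minimal sketch. It assumes the pooled-standard-deviation form of Cohen's d (other variants exist, especially for paired designs) and Cohen's conventional cut-offs for r of 0.1, 0.3 and 0.5. The function names and data are hypothetical.

```python
import math
import statistics


def cohens_d(group_a, group_b):
    """Cohen's d using the pooled standard deviation of two groups."""
    n_a, n_b = len(group_a), len(group_b)
    var_a = statistics.variance(group_a)
    var_b = statistics.variance(group_b)
    pooled_sd = math.sqrt(((n_a - 1) * var_a + (n_b - 1) * var_b) / (n_a + n_b - 2))
    return (statistics.mean(group_a) - statistics.mean(group_b)) / pooled_sd


def label_pearson_r(r):
    """Label |r| using Cohen's conventional thresholds (0.1, 0.3, 0.5)."""
    size = abs(r)
    if size >= 0.5:
        return "large"
    if size >= 0.3:
        return "medium"
    if size >= 0.1:
        return "small"
    return "below the small threshold"


# Hypothetical data, for illustration only
print(round(cohens_d([2, 4, 6], [1, 3, 5]), 2))
print(label_pearson_r(0.35))
```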

Decision errors

Type I and Type II errors are mistakes made in research conclusions. A Type I error means rejecting the null hypothesis when it’s actually true, while a Type II error means failing to reject the null hypothesis when it’s false.

You can aim to minimize the risk of these errors by selecting an optimal significance level and ensuring high power . However, there’s a trade-off between the two errors, so a fine balance is necessary.
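The trade-off can be seen numerically. The sketch below assumes a one-sided z test with a known standard deviation, a deliberately simple case; the effect size and sample size are hypothetical.

```python
import math
from statistics import NormalDist


def one_sided_z_power(alpha, effect_size, n):
    """Power of a one-sided z test for a standardized effect with n observations."""
    z = NormalDist()
    critical_value = z.inv_cdf(1 - alpha)
    return 1 - z.cdf(critical_value - effect_size * math.sqrt(n))


# Hypothetical study: standardized effect 0.5, 25 participants
for alpha in (0.10, 0.05, 0.01):
    power = one_sided_z_power(alpha, effect_size=0.5, n=25)
    print(f"alpha = {alpha:.2f}  Type II risk = {1 - power:.3f}  power = {power:.3f}")
```

Lowering the significance level reduces the Type I error risk but, with everything else fixed, raises the Type II error risk.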

Frequentist versus Bayesian statistics

Traditionally, frequentist statistics emphasizes null hypothesis significance testing and always starts with the assumption of a true null hypothesis.

However, Bayesian statistics has grown in popularity as an alternative approach in the last few decades. In this approach, you use previous research to continually update your hypotheses based on your expectations and observations.

The Bayes factor compares the relative strength of evidence for the null versus the alternative hypothesis rather than making a conclusion about rejecting the null hypothesis or not.
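As one small, self-contained illustration of a Bayes factor, the sketch below compares a point null hypothesis (success probability of 0.5) against an alternative with a uniform prior on the success probability, for binomial data. This setup is an assumption chosen because it has a closed form; it is not tied to the examples in this article.

```python
from math import comb


def bayes_factor_10(successes, trials):
    """Bayes factor for H1 (uniform prior on the rate) over H0 (rate = 0.5)."""
    likelihood_h0 = comb(trials, successes) * 0.5 ** trials
    # Averaging the binomial likelihood over a uniform prior gives 1 / (trials + 1)
    marginal_h1 = 1 / (trials + 1)
    return marginal_h1 / likelihood_h0


# Hypothetical outcomes out of 10 trials
for k in (5, 8, 10):
    print(k, round(bayes_factor_10(k, 10), 2))
```

Values above 1 favor the alternative hypothesis and values below 1 favor the null hypothesis.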

Scholars' Mine

Mathematics and Statistics Masters Theses

Theses from 2024

A new proper orthogonal decomposition method with second difference quotients for the wave equation , Andrew Calvin Janes

The deep bsde method , Daniel Kovach

Theses from 2023

THE APPLICATION OF STATISTICAL MODELING TO IDENTIFY GENETIC ASSOCIATIONS WITH MILD TRAUMATIC BRAIN INJURY OUTCOMES , Caroline Schott

META-ANALYSIS OF MESENCHYMAL STEM CELL GENE EXPRESSION MICROARRAY DATA FROM OBESE AND NON-OBESE PATIENTS , Dakota William Shields

Theses from 2022

Continuous and discrete models for optimal harvesting in fisheries , Nagham Abbas Al Qubbanchee

Several problems in nonlinear Schrödinger equations , Tim Van Hoose

Theses from 2020

Decoupled finite element methods for general steady two-dimensional Boussinesq equations , Lioba Boveleth

Quantifying effects of sleep deprivation on cognitive performance , Quang Nghia Le

The application of machine learning models in the concussion diagnosis process , Sujit Subhash

Theses from 2019

Less is more: Beating the market with recurrent reinforcement learning , Louis Kurt Bernhard Steinmeister

Theses from 2018

Models for high dimensional spatially correlated risks and application to thunderstorm loss data in Texas , Tobias Merk

An investigation of the influence of the 2007-2009 recession on the day of the week effect for the S&P 500 and its sectors , Marcel Alwin Trick

Theses from 2017

The pantograph equation in quantum calculus , Thomas Griebel

Comparing region level testing methods for differential DNA methylation analysis , Arnold Albert Harder

A review of random matrix theory with an application to biological data , Jesse Aaron Marks

Family-based association studies of autism in boys via facial-feature clusters , Luke Andrew Settles

Theses from 2016

Pricing of geometric Asian options in general affine stochastic volatility models , Johannes Ruppert

On the double chain ladder for reserve estimation with bootstrap applications , Larissa Schoepf

Theses from 2015

Some combinatorial applications of Sage, an open source program , Jessica Ruth Chowning

Day of the week effect in returns and volatility of the S&P 500 sector indices , Juan Liu

Application of loglinear models to claims triangle runoff data , Netanya Lee Martin

Theses from 2014

Adaptive wavelet discretization of tensor products in H-Tucker format , Mazen Ali

An iterative algorithm for variational data assimilation problems , Xin Shen

Statistical analysis of sleep patterns in Drosophila melanogaster , Luyang Wang

Theses from 2013

Statistical analysis of microarray data in sleep deprivation , Stephanie Marie Berhorst

Immersed finite element method for interface problems with algebraic multigrid solver , Wenqiang Feng

Theses from 2012

Abel dynamic equations of the first and second kind , Sabrina Heike Streipert

Lattice residuability , Philip Theodore Thiem

Theses from 2011

A time series approach to electric load modelling , Matthias Benjamin Noller

Theses from 2010

Closed-form solutions to discrete-time portfolio optimization problems , Mathias Christian Goeggel

Inverse limits with upper semi-continuous set valued bonding functions: an example , Christopher David Jacobsen

Theses from 2009

The analogue of the iterated logarithm for quantum difference equations , Karl Friedrich Ulrich

Theses from 2008

Modeling particulate matter emissions indices at the Hartsfield-Jackson Atlanta International Airport , Lu Gan

The dynamic multiplier-accelerator model in economics , Julius Severi Heim

Dynamic equations with piecewise continuous argument , Christian Keller

Theses from 2007

Ostrowski and Grüss inequalities on time scales , Thomas Matthews

The Black-Scholes equation in quantum calculus , Christian Müttel

Computerized proofs of hypergeometric identities: Methods, advances, and limitations , Paul Nathaniel Runnion

Screening for noise variables , Lisa Trautwein

Theses from 2006

Distance function applications of object comparison in artificial vision systems , Christina Michelle Ayres

Sensitivity analysis on the relationship between alcohol abuse or dependence and wages , Tim Jensen

Sensitivity analysis on the relationship between alcohol abuse or dependence and annual hours worked , Stefan Koerner

Endogeneity bias and two-stage least squares: a simulation study , Xujun Wang

Theses from 2005

Local compactness of the hyperspace of connected subsets , Robbie A. Beane

A sequential approach to supersaturated design , Angela Marie Jugan

Tests for gene-treatment interaction in microarray data analysis , Wanrong Yin

Theses from 2003

Pricing of European options , Dirk Rohmeder

Prediction intervals for the binomial distribution with dependent trials , Florian Sebastian Rueck

Theses from 2002

The use of a Marakov dependent Bernoulli process to model the relationship between employment status and drug use , Kathrin Koetting

Theses from 2000

Inverse limits on [0,1] using sequences of piecewise linear unimodal bonding maps , Brian Edward Raines

Theses from 1998

A two-stage step-stress accelerated life testing scheme , Phyllis E. Pound Singer

Theses from 1997

Some properties of hereditarily indecomposable chainable continua , Thomas John Kacvinsky

Theses from 1996

The Axiom of Choice, well-ordering property, Continuum Hypothesis, and other meta-mathematical considerations , Daniel Collins

Theses from 1994

Approximate distributional results for tolerance limits and confidence limits on reliability based on the maximum likelihood estimators for the logistic distribution , Teriann Collins

Theses from 1986

Investigating the output angular acceleration extrema of the planar four bar mechanism , Matthew H. Koebbe

Theses from 1984

Approximating distributions in order restricted inference : the simple tree ordering , Tuan Anh Tran

Theses from 1982

Goodness-of-fit for the Weibull distribution with unknown parameters and censored sampling. , Michael Edward Aho

Theses from 1979

On L convergence of Fourier series. , William O. Bray

Theses from 1977

Characterizations of inner product spaces. , John Lee Roy Williams

Theses from 1975

A study of several substitution ciphers using mathematical models. , Wanda Louise Garner

Theses from 1974

Models for molecular vibration , Allan Bruce Capps

The completions of local rings and their modules. , Christopher Scott Taber

Linear geometry , Phyllis L. Thomas

Theses from 1971

Integrability of the sums of the trigonometric series $\frac{1}{2}a_0 + \sum_{n=1}^{\infty} a_n \cos n\Theta$ and $\sum_{n=1}^{\infty} a_n \sin n\Theta$ , John William Garrett

Inclusion theorems for boundary value problems for delay differential equations , Leon M. Hall

Theses from 1965

A study of certain conservative sets for parameters in the linear statistical model , Roger Alan Chapin

Comparison of methods to select a probability model , Howard Lyndal Colburn

Latent class analysis and information retrieval , George Loyd Jensen

Linear and quadratic programming with more than one objective function , William John Lodholz

Tschebyscheff fitting with polynomials and nonlinear functions , George F. Luffel

Theses from 1964

The effect of matrix condition in the solution of a system of linear algebraic equations. , Herbert R. Alcorn

Estimation and tabulation of bias coefficients for regression analysis in incompletely specified linear models. , Harry Kerry Edwards

A study of a method for selecting the best of two or more mathematical models , August J. Garver

A study of methods for estimating parameters in the model $y(t) = A_1 e^{-p_1 t} + A_2 e^{-p_2 t} + \epsilon$ , Gerald Nicholas Haas

A parameter perturbation procedure for obtaining a solution to systems of nonlinear equations. , James Carlton Helm

A study of stability of numerical solution for parabolic partial differential equations. , Tsang-Chi Huang

A numerical study of Van Der Pol's nonlinear differential equation for various values of the parameter E. , Charles C. Limbaugh

A study on estimating parameters restricted by linear inequalities , William Lawrence May

Minimization of Boolean functions. , Don Laroy Rogier

A method to give the best linear combination of order statistics to estimate the mean of any symmetric population , Robert M. Smith

On a numerical solution of Dirichlet type problems with singularity on the boundary. , Randall Loran Yoakum

Theses from 1963

A study of methods for estimating parameters in rational polynomial models , Thomas B. Baird

Investigation of measures of ill-conditioning , Thomas D. Calton

A numerical approach to a Sturm-Liouville type problem with variable coefficients and its application to heat transfer and temperature prediction in the lower atmosphere. , Troyce Don Jones

A study of methods for determining confidence intervals for the mean of a normal distribution with unknown varience by comparison of average lengths , Karl Richard Kneile

Stability properties of various predictor corrector methods for solving ordinary differential equations numerically. , Charles Edward Leslie

Mathematical techniques in the solution of boundary value problems. , Vincent Paul Pusateri

A modified algorithm for Henrici's solution of y'' = f(x, y) , Frank Garnett Walters

Theses from 1962

An investigation of Lehmer's method for finding the roots of polynomial equations using the Royal-McBee LGP-30 , James W. Joiner

Theses from 1931

The spinning top , Aaron Jefferson Miles

DigitalCommons@University of Nebraska - Lincoln

Statistics, Department of

Department of Statistics: Dissertations, Theses, and Student Work

Measuring Jury Perception of Explainable Machine Learning and Demonstrative Evidence , Rachel Rogers

Examining the Effect of Word Embeddings and Preprocessing Methods on Fake News Detection , Jessica Hauschild

Exploring Experimental Design and Multivariate Analysis Techniques for Evaluating Community Structure of Bacteria in Microbiome Data , Kelsey Karnik

Human Perception of Exponentially Increasing Data Displayed on a Log Scale Evaluated Through Experimental Graphics Tasks , Emily Robinson

Factors Influencing Student Outcomes in a Large, Online Simulation-Based Introductory Statistics Course , Ella M. Burnham

Comparing Machine Learning Techniques with State-of-the-Art Parametric Prediction Models for Predicting Soybean Traits , Susweta Ray

Using Stability to Select a Shrinkage Method , Dean Dustin

Statistical Methodology to Establish a Benchmark for Evaluating Antimicrobial Resistance Genes through Real Time PCR assay , Enakshy Dutta

Group Testing Identification: Objective Functions, Implementation, and Multiplex Assays , Brianna D. Hitt

Community Impact on the Home Advantage within NCAA Men's Basketball , Erin O'Donnell

Optimal Design for a Causal Structure , Zaher Kmail

Role of Misclassification Estimates in Estimating Disease Prevalence and a Non-Linear Approach to Study Synchrony Using Heart Rate Variability in Chickens , Dola Pathak

A Characterization of a Value Added Model and a New Multi-Stage Model For Estimating Teacher Effects Within Small School Systems , Julie M. Garai

Methods to Account for Breed Composition in a Bayesian GWAS Method which Utilizes Haplotype Clusters , Danielle F. Wilson-Wells

Beta-Binomial Kriging: A New Approach to Modeling Spatially Correlated Proportions , Aimee Schwab

Simulations of a New Response-Adaptive Biased Coin Design , Aleksandra Stein

MODELING THE DYNAMIC PROCESSES OF CHALLENGE AND RECOVERY (STRESS AND STRAIN) OVER TIME , Fan Yang

A New Approach to Modeling Multivariate Time Series on Multiple Temporal Scales , Tucker Zeleny

A Reduced Bias Method of Estimating Variance Components in Generalized Linear Mixed Models , Elizabeth A. Claassen

NEW STATISTICAL METHODS FOR ANALYSIS OF HISTORICAL DATA FROM WILDLIFE POPULATIONS , Trevor Hefley

Informative Retesting for Hierarchical Group Testing , Michael S. Black

A Test for Detecting Changes in Closed Networks Based on the Number of Communications Between Nodes , Christopher S. Wichman

GROUP TESTING REGRESSION MODELS , Boan Zhang

A Comparison of Spatial Prediction Techniques Using Both Hard and Soft Data , Megan L. Liedtke Tesar

STUDYING THE HANDLING OF HEAT STRESSED CATTLE USING THE ADDITIVE BI-LOGISTIC MODEL TO FIT BODY TEMPERATURE , Fan Yang

Estimating Teacher Effects Using Value-Added Models , Jennifer L. Green

SEQUENCE COMPARISON AND STOCHASTIC MODEL BASED ON MULTI-ORDER MARKOV MODELS , Xiang Fang

DETECTING DIFFERENTIALLY EXPRESSED GENES WHILE CONTROLLING THE FALSE DISCOVERY RATE FOR MICROARRAY DATA , SHUO JIAO

Spatial Clustering Using the Likelihood Function , April Kerby

FULLY EXPONENTIAL LAPLACE APPROXIMATION EM ALGORITHM FOR NONLINEAR MIXED EFFECTS MODELS , Meijian Zhou

HCA Healthc J Med, v.1(2); 2020, PMC10324782

Introduction to Research Statistical Analysis: An Overview of the Basics

Christian Vandever

1 HCA Healthcare Graduate Medical Education

Description

This article covers many statistical ideas essential to research statistical analysis. Sample size is explained through the concepts of statistical significance level and power. Variable types and definitions are included to clarify what is needed to interpret the analysis. Categorical and quantitative variable types are defined, as well as response and predictor variables. Statistical tests described include t-tests, ANOVA and chi-square tests. Multiple regression is also explored for both logistic and linear regression. Finally, the most common statistics produced by these methods are explored.

Introduction

Statistical analysis is necessary for any research project seeking to make quantitative conclusions. The following is a primer for research-based statistical analysis. It is intended to be a high-level overview of appropriate statistical testing, without diving too deep into any specific methodology. Some of the information is more applicable to retrospective projects, where analysis is performed on data that has already been collected, but most of it will be suitable for any type of research. This primer is meant to help the reader understand research results in coordination with a statistician, not to perform the actual analysis. Analysis is commonly performed using statistical programming software such as R, SAS or SPSS. These allow for analysis to be replicated while minimizing the risk of error. Resources are listed later for those working on analysis without a statistician.

After coming up with a hypothesis for a study, including any variables to be used, one of the first steps is to think about the patient population to which the question applies. Results are only relevant to the population that the underlying data represents. Since it is impractical to include everyone with a certain condition, a subset of the population of interest should be taken. This subset should be large enough to have power, which means there is enough data to detect a real effect and to accurately reflect the study’s population.

The first statistics of interest are related to significance level and power: alpha and beta. Alpha (α) is the significance level and the probability of a type I error, the rejection of the null hypothesis when it is true. The null hypothesis is generally that there is no difference between the groups compared. A type I error is also known as a false positive. An example would be an analysis that finds one medication statistically better than another when in reality there is no difference in efficacy between the two. Beta (β) is the probability of a type II error, the failure to reject the null hypothesis when it is actually false. A type II error is also known as a false negative. This occurs when the analysis finds no difference between two medications when in reality one works better than the other. Power is defined as 1-β and should be calculated prior to running any sort of statistical testing. Ideally, alpha should be as small as possible while power should be as large as possible. Power generally increases with a larger sample size, but so do cost and the effect of any bias in the study design. Additionally, as the sample size gets bigger, the chance of a statistically significant result goes up, even when the underlying differences are too small to matter practically. Power calculators therefore take the magnitude of the effect into account, so that the study is sized to detect differences large enough to have an actual impact. The calculators take inputs like the mean, effect size and desired power, and output the minimum sample size required for analysis. Effect size is calculated using statistical information on the variables of interest. If that information is not available, most tests have commonly used values for small, medium or large effect sizes.
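A rough sketch of what a power calculator does for a two-group comparison of means is given below. It assumes the common normal-approximation formula n = 2((z_{1-α/2} + z_{1-β}) / d)² per group, where d is the standardized effect size; dedicated software typically uses the t distribution and may return slightly larger values. The function name and defaults are hypothetical.

```python
import math
from statistics import NormalDist


def n_per_group(effect_size, alpha=0.05, power=0.80):
    """Approximate sample size per group for a two-sided, two-sample comparison."""
    if effect_size <= 0:
        raise ValueError("effect_size must be positive")
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)  # critical value for the significance level
    z_beta = z.inv_cdf(power)           # quantile matching the desired power (1 - beta)
    return math.ceil(2 * ((z_alpha + z_beta) / effect_size) ** 2)


# Commonly used small, medium and large standardized effect sizes
for d in (0.2, 0.5, 0.8):
    print(d, n_per_group(d))
```

Smaller effects require markedly larger samples to reach the same power.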

When the desired patient population is decided, the next step is to define the variables previously chosen to be included. Variables come in different types that determine which statistical methods are appropriate and useful. One way variables can be split is into categorical and quantitative variables (Table 1). Categorical variables place patients into groups, such as gender, race and smoking status. Quantitative variables measure or count some quantity of interest. Common quantitative variables in research include age and weight. An important note is that there can often be a choice of whether to treat a variable as quantitative or categorical. For example, in a study looking at body mass index (BMI), BMI could be defined as a quantitative variable or as a categorical variable, with each patient’s BMI listed as a category (underweight, normal, overweight, and obese) rather than the numeric value. The decision whether a variable is quantitative or categorical will affect what conclusions can be made when interpreting results from statistical tests. Keep in mind that since quantitative variables are treated on a continuous scale, it would be inappropriate to transform a variable like which medication was given into a quantitative variable with values 1, 2 and 3.
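The two points about BMI and medication coding can be sketched as follows. The BMI cut-offs of 18.5, 25 and 30 are the widely used WHO thresholds and are an assumption here, since the article does not state them; the function names are hypothetical.

```python
def bmi_category(bmi):
    """Convert a numeric BMI into a category (assumed WHO cut-offs)."""
    if bmi < 18.5:
        return "underweight"
    if bmi < 25:
        return "normal"
    if bmi < 30:
        return "overweight"
    return "obese"


def indicator_columns(medications, reference):
    """Code a categorical variable as 0/1 indicators instead of 1, 2, 3."""
    levels = sorted(set(medications) - {reference})
    return [{f"is_{level}": int(m == level) for level in levels} for m in medications]


print([bmi_category(b) for b in (17.0, 22.4, 27.9, 31.5)])
print(indicator_columns(["apixaban", "rivaroxaban", "dabigatran"], reference="apixaban"))
```

Indicator coding avoids implying that one medication is numerically "more" than another.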

Table 1. Categorical vs. Quantitative Variables

Categorical Variables | Quantitative Variables
Categorize patients into discrete groups | Continuous values that measure a variable
Patient categories are mutually exclusive | For time-based studies, there would be a new variable for each measurement at each time
Examples: race, smoking status, demographic group | Examples: age, weight, heart rate, white blood cell count

Both of these types of variables can also be split into response and predictor variables (Table 2). Predictor variables are explanatory, or independent, variables that help explain changes in a response variable. Conversely, response variables are outcome, or dependent, variables whose changes can be partially explained by the predictor variables.

Table 2. Response vs. Predictor Variables

Response Variables | Predictor Variables
Outcome variables | Explanatory variables
Should be the result of the predictor variables | Should help explain changes in the response variables
One variable per statistical test | Can be multiple variables that may have an impact on the response variable
Can be categorical or quantitative | Can be categorical or quantitative

Choosing the correct statistical test depends on the question being answered and on the types of variables being compared. Some common statistical tests include t-tests, ANOVA and chi-square tests.

T-tests compare whether there are differences in a quantitative variable between two values of a categorical variable. For example, a t-test could be used to compare the length of stay for knee replacement surgery patients between those who took apixaban and those who took rivaroxaban, and to examine whether there is a statistically significant difference between the two groups. The t-test will output a p-value, a number between zero and one, which represents the probability of seeing groups at least as different as they are in the data if there were actually no difference between them. A value closer to zero suggests that the difference, in this case in length of stay, is more statistically significant than a value closer to one. Prior to collecting the data, set a significance level, the previously defined alpha. Alpha is typically set at 0.05, but is commonly reduced in order to limit the chance of a type I error, or false positive. Going back to the example above, if alpha is set at 0.05 and the analysis gives a p-value of 0.039, then a statistically significant difference in length of stay is observed between apixaban and rivaroxaban patients. If the analysis gives a p-value of 0.91, then there is no statistical evidence of a difference in length of stay between the two medications. Other statistical summaries or methods examine how big the difference might be. These are known as post-hoc analyses, since they are performed after the original test to provide additional context to the results.
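The comparison and decision rule described above can be sketched as follows. The sketch assumes Welch's form of the two-sample t statistic and, to stay within the Python standard library, approximates the p-value with the normal distribution, which is only reasonable for large samples; statistical software would use the t distribution. The length-of-stay values are hypothetical.

```python
import math
from statistics import NormalDist, mean, variance


def welch_t(group_a, group_b):
    """Two-sample t statistic that does not assume equal variances."""
    standard_error = math.sqrt(
        variance(group_a) / len(group_a) + variance(group_b) / len(group_b)
    )
    return (mean(group_a) - mean(group_b)) / standard_error


def approx_two_sided_p(t_value):
    """Large-sample (normal) approximation to the two-sided p-value."""
    return 2 * (1 - NormalDist().cdf(abs(t_value)))


def is_significant(p_value, alpha=0.05):
    """Apply the decision rule: significant when p is below alpha."""
    return p_value < alpha


# Hypothetical length-of-stay data in days, for illustration only
apixaban = [3, 4, 2, 5, 3, 4, 3, 2, 4, 3]
rivaroxaban = [4, 5, 3, 5, 4, 6, 4, 3, 5, 4]
t_value = welch_t(apixaban, rivaroxaban)
print(round(t_value, 2), is_significant(approx_two_sided_p(t_value)))

# The two p-values from the text, compared with alpha = 0.05
print(is_significant(0.039), is_significant(0.91))
```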

Analysis of variance, or ANOVA, tests for mean differences in a quantitative variable between values of a categorical variable, typically with three or more values to distinguish it from a t-test. ANOVA could add patients given dabigatran to the previous population and evaluate whether the length of stay was significantly different across the three medications. If the p-value is lower than the designated significance level, then the hypothesis that length of stay was the same across the three medications is rejected. Summaries and post-hoc tests could also be performed to look at the differences in length of stay and at which individual medications differed significantly from the others.

A chi-square test examines the association between two categorical variables. An example would be to consider whether the rate of having a post-operative bleed is the same across patients provided with apixaban, rivaroxaban and dabigatran. A chi-square test can compute a p-value determining whether the bleeding rates were significantly different or not. Post-hoc tests could then give the bleeding rate for each medication, as well as a breakdown of which specific medications may have significantly different bleeding rates from each other.
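The test statistics behind these two tests can be sketched with the standard library alone. The data and counts below are hypothetical. The chi-square p-value helper relies on the fact that, for exactly 2 degrees of freedom (three medications by two outcomes), the upper-tail probability is exp(-x/2); other degrees of freedom, and the ANOVA p-value, would need statistical software.

```python
import math
from statistics import mean


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a one-way ANOVA."""
    all_values = [x for g in groups for x in g]
    grand_mean = mean(all_values)
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    return (ss_between / df_between) / (ss_within / df_within), df_between, df_within


def chi_square_statistic(table):
    """Return (statistic, df) for a contingency table given as rows of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / total
            stat += (observed - expected) ** 2 / expected
    return stat, (len(table) - 1) * (len(table[0]) - 1)


def chi_square_p_value_df2(stat):
    """Upper-tail probability of a chi-square variable with exactly 2 df."""
    return math.exp(-stat / 2)


# Hypothetical length of stay (days) for three medications
print(one_way_anova_f([[1, 2, 3], [2, 3, 4], [3, 4, 5]]))

# Hypothetical counts: rows are medications, columns are (bleed, no bleed)
stat, df = chi_square_statistic([[10, 90], [20, 80], [15, 85]])
print(round(stat, 2), df, round(chi_square_p_value_df2(stat), 3))
```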

A slightly more advanced way of examining a question can come through multiple regression. Regression allows more predictor variables to be analyzed and can act as a control when looking at associations between variables. Common control variables are age, sex and any comorbidities likely to affect the outcome variable that are not closely related to the other explanatory variables. Control variables can be especially important in reducing the effect of bias in a retrospective population. Since retrospective data was not built with the research question in mind, it is important to eliminate threats to the validity of the analysis. Testing that controls for confounding variables, such as regression, is often more valuable with retrospective data because it can ease these concerns. The two main types of regression are linear and logistic. Linear regression is used to predict differences in a quantitative, continuous response variable, such as length of stay. Logistic regression predicts differences in a dichotomous, categorical response variable, such as 90-day readmission. So whether the outcome variable is categorical or quantitative, regression can be appropriate. An example for each of these types could be found in two similar cases. For both examples define the predictor variables as age, gender and anticoagulant usage. In the first, use the predictor variables in a linear regression to evaluate their individual effects on length of stay, a quantitative variable. For the second, use the same predictor variables in a logistic regression to evaluate their individual effects on whether the patient had a 90-day readmission, a dichotomous categorical variable. Analysis can compute a p-value for each included predictor variable to determine whether they are significantly associated. 
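To give a feel for what the two kinds of regression produce, the sketch below fits a least-squares line with a single predictor (a simplification of multiple regression, which needs matrix routines from statistical software) and shows how logistic regression coefficients are usually read as odds ratios and predicted probabilities. All data, coefficients and function names are hypothetical.

```python
import math
from statistics import mean


def simple_linear_regression(x, y):
    """Least-squares slope and intercept for one predictor."""
    x_bar, y_bar = mean(x), mean(y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar


def odds_ratio(logistic_coefficient):
    """Odds ratio for a one-unit increase in a logistic regression predictor."""
    return math.exp(logistic_coefficient)


def predicted_probability(intercept, coefficients, values):
    """Predicted probability of the outcome from a fitted logistic model."""
    log_odds = intercept + sum(b * v for b, v in zip(coefficients, values))
    return 1 / (1 + math.exp(-log_odds))


# Hypothetical ages and lengths of stay (days)
ages = [55, 60, 65, 70, 75]
stays = [2.0, 2.5, 3.0, 3.5, 4.0]
print(simple_linear_regression(ages, stays))

# Hypothetical fitted logistic model for 90-day readmission: age and anticoagulant use
print(round(odds_ratio(0.04), 3))
print(round(predicted_probability(-4.0, [0.04, 0.5], [70, 1]), 3))
```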
Each statistical test in this article generates a test statistic, from which a p-value is computed: the probability of obtaining results at least as extreme as those observed if there were no association between the compared variables. These results often come with coefficients, which give the degree of the association and the degree to which one variable changes with another. Most tests, including all listed in this article, also have confidence intervals, which give a range of plausible values for the estimate at a specified level of confidence. Even if these tests do not give statistically significant results, the results are still important. Not reporting statistically insignificant findings creates a bias in research. If only significant results are reported, an idea can be tested enough times that a statistically significant result is eventually reached by chance, even though there is no true effect. In some cases with very large sample sizes, p-values will almost always be significant. In this case the effect size is critical, as even the smallest, meaningless differences can be found to be statistically significant.
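
The point about very large samples can be shown with a short sketch. It assumes a two-sided, two-sample z-test with a known, shared standard deviation, which is a simplification of the t-test; the numbers and function names are invented for illustration. The same tiny difference in mean length of stay is tested at two sample sizes, alongside a standardized effect size (Cohen's d).

```python
import math


def two_sample_z_p_value(mean_diff, sd, n_per_group):
    """Two-sided p-value for a difference in means, equal group sizes."""
    standard_error = sd * math.sqrt(2.0 / n_per_group)
    z = mean_diff / standard_error
    # For a standard normal Z, P(|Z| >= |z|) equals erfc(|z| / sqrt(2)).
    return math.erfc(abs(z) / math.sqrt(2.0))


def cohens_d(mean_diff, sd):
    """Standardized effect size: the mean difference in SD units."""
    return mean_diff / sd


diff_days, sd_days = 0.05, 2.0  # invented: a 0.05-day difference, SD of 2 days

p_small = two_sample_z_p_value(diff_days, sd_days, n_per_group=100)
p_large = two_sample_z_p_value(diff_days, sd_days, n_per_group=1_000_000)
effect = cohens_d(diff_days, sd_days)

print(f"n=100 per group:       p = {p_small:.3f}")
print(f"n=1,000,000 per group: p = {p_large:.3g}")
print(f"Cohen's d in both cases: {effect:.3f}")
```

The effect size is identical in both runs; only the sample size changes the p-value, which is why the effect size should be reported alongside it.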

These variables and tests are just some things to keep in mind before, during and after the analysis process in order to make sure that the statistical reports are supporting the questions being answered. The patient population, types of variables and statistical tests are all important things to consider in the process of statistical analysis. Any results are only as useful as the process used to obtain them. This primer can be used as a reference to help ensure appropriate statistical analysis.

Alpha (α): the significance level; the probability of a type I error (a false positive)
Analysis of variance (ANOVA): test for mean differences in a quantitative variable across the levels of a categorical variable, typically one with three or more levels, which distinguishes it from a t-test
Beta (β): the probability of a type II error (a false negative)
Categorical variable: variable that places patients into groups, such as gender, race or smoking status
Chi-square test: test that examines the association between two categorical variables
Confidence interval: a range of plausible values for an estimate at a specified level of confidence, 95% for example
Control variables: variables likely to affect the outcome variable that are not closely related to the other explanatory variables
Hypothesis: the idea being tested by statistical analysis
Linear regression: regression used to predict differences in a quantitative, continuous response variable, such as length of stay
Logistic regression: regression used to predict differences in a dichotomous, categorical response variable, such as 90-day readmission
Multiple regression: regression utilizing more than one predictor variable
Null hypothesis: the hypothesis that there is no difference or association for the variable(s) being tested
Patient population: the population the data is collected to represent
Post-hoc analysis: analysis performed after the original test to provide additional context to the results
Power: 1 − β, the probability of avoiding a type II error (a false negative)
Predictor variable: explanatory, or independent, variable that helps explain changes in a response variable
p-value: a value between zero and one giving the probability of obtaining results at least as extreme as those observed if the null hypothesis is true; usually compared against a significance level to judge statistical significance
Quantitative variable: variable measuring or counting some quantity of interest
Response variable: outcome, or dependent, variable whose changes can be partially explained by the predictor variables
Retrospective study: a study using previously existing data that was not originally collected for the purposes of the study
Sample size: the number of patients or observations used for the study
Significance level: alpha, the probability of a type I error; compared against the p-value to determine statistical significance
Statistical analysis: analysis of data using statistical testing to examine a research hypothesis
Statistical testing: testing used to examine the validity of a hypothesis using statistical calculations
Statistical significance: the decision whether to reject the null hypothesis, based on whether the p-value is below a predetermined significance level
T-test: test of whether a quantitative variable differs between two values of a categorical variable
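
The relationship between alpha, beta, power and sample size in the glossary can be sketched numerically. The example assumes a two-sided, two-sample z-test with equal group sizes and a standardized effect size, which is a normal approximation and not an exact t-test calculation; the function name and inputs are illustrative only.

```python
from statistics import NormalDist


def power_two_sample_z(effect_size, n_per_group, alpha=0.05):
    """Approximate power of a two-sided, two-sample z-test.

    effect_size is the standardized mean difference (Cohen's d).
    """
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha / 2)       # rejection threshold
    shift = effect_size * (n_per_group / 2) ** 0.5    # expected z under the alternative
    # Probability that the test statistic lands in either rejection region.
    return std_normal.cdf(shift - z_crit) + std_normal.cdf(-shift - z_crit)


power = power_two_sample_z(effect_size=0.5, n_per_group=64)
beta = 1 - power  # probability of a type II error

print(f"power = {power:.3f}, beta = {beta:.3f}")
```

With no true effect (an effect size of zero) the same function returns alpha, the false positive rate.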

Funding Statement

This research was supported (in whole or in part) by HCA Healthcare and/or an HCA Healthcare affiliated entity.

Conflicts of Interest

The author declares he has no conflicts of interest.

Christian Vandever is an employee of HCA Healthcare Graduate Medical Education, an organization affiliated with the journal’s publisher.

The views expressed in this publication represent those of the author(s) and do not necessarily represent the official views of HCA Healthcare or any of its affiliated entities.

Statistical Methods in Theses: Guidelines and Explanations

Signed August 2018 Naseem Al-Aidroos, PhD, Christopher Fiacconi, PhD, Deborah Powell, PhD, Harvey Marmurek, PhD, Ian Newby-Clark, PhD, Jeffrey Spence, PhD, David Stanley, PhD, Lana Trick, PhD

Version: 2.00

This document is an organizational aid, and workbook, for students. We encourage students to take this document to meetings with their advisor and committee. This guide should enhance a committee’s ability to assess key areas of a student’s work. 

In recent years a number of well-known and apparently well-established findings have failed to replicate, resulting in what is commonly referred to as the replication crisis. The APA Publication Manual 6th Edition notes that “The essence of the scientific method involves observations that can be repeated and verified by others.” (p. 12). However, a systematic investigation of the replicability of psychology findings published in Science revealed that over half of psychology findings do not replicate (see a related commentary in Nature). Even more disturbing, a Bayesian reanalysis of the reproducibility project showed that 64% of studies had sample sizes so small that strong evidence for or against the null or alternative hypotheses did not exist. Indeed, Morey and Lakens (2016) concluded that most of psychology is statistically unfalsifiable due to small sample sizes and correspondingly low power (see article). Our discipline’s reputation is suffering. News of the replication crisis has reached the popular press (e.g., The Atlantic, The Economist, Slate, Last Week Tonight).

An increasing number of psychologists have responded by promoting new research standards that involve open science and the elimination of Questionable Research Practices. The open science perspective is made manifest in the Transparency and Openness Promotion (TOP) guidelines for journal publications. These guidelines were adopted some time ago by the Association for Psychological Science. More recently, the guidelines were adopted by American Psychological Association journals (see details) and journals published by Elsevier (see details). It appears likely that, in the very near future, most journals in psychology will be using an open science approach. We strongly advise readers to take a moment to inspect the TOP Guidelines Summary Table.

A key aspect of open science and the TOP guidelines is the sharing of data associated with published research (with respect to medical research, see point #35 in the World Medical Association Declaration of Helsinki). This practice is viewed widely as highly important. Indeed, open science is recommended by all G7 science ministers. All Tri-Agency grants must include a data-management plan that includes plans for sharing: “research data resulting from agency funding should normally be preserved in a publicly accessible, secure and curated repository or other platform for discovery and reuse by others.” Moreover, a 2017 editorial published in the New England Journal of Medicine announced that the International Committee of Medical Journal Editors believes there is “an ethical obligation to responsibly share data.” As of this writing, 60% of highly ranked psychology journals require or encourage data sharing.

The increasing importance of demonstrating that findings are replicable is reflected in calls to make replication a requirement for the promotion of faculty (see details in Nature), and experts in open science are now refereeing applications for tenure and promotion (see details at the Center for Open Science and this article). Most dramatically, in one instance, a paper resulting from a dissertation was retracted due to misleading findings attributable to Questionable Research Practices. Subsequent to the retraction, the Ohio State University’s Board of Trustees unanimously revoked the PhD of the graduate student who wrote the dissertation (see details). Thus, the academic environment is changing and it is important to work toward using new best practices in lieu of older practices, many of which are synonymous with Questionable Research Practices. Doing so should help you avoid later career regrets and subsequent public mea culpas. One way to achieve your research objectives in this new academic environment is to incorporate replications into your research. Replications are becoming more common and there are even websites dedicated to helping students conduct replications (e.g., Psychological Science Accelerator) and indexing the success of replications (e.g., Curate Science). You might even consider conducting a replication for your thesis (subject to committee approval).

As an early-career researcher, you should be aware of the changing academic environment. Senior principal investigators may be reluctant to engage in open science (see this student perspective in a blog post and podcast), and research on resistance to data sharing indicates that one barrier is that researchers do not feel they know how to share data online. This document is an educational aid and resource that gives students an introduction to participating in open science and online data sharing.

Guidelines and Explanations

In light of the changes in psychology, faculty members who teach statistics/methods have reviewed the literature and generated this guide for graduate students. The guide is intended to enhance the quality of student theses by facilitating their engagement in open and transparent research practices and by helping them avoid Questionable Research Practices, many of which are now deemed unethical and covered in the ethics section of textbooks.

This document is an informational tool.

How to Start

To follow best practices, start with these steps:

  • Get an Open Science account. Registration at osf.io is easy!
  • If conducting confirmatory hypothesis testing for your thesis, pre-register your hypotheses (see Section 1-Hypothesizing). The Open Science Framework website has helpful tutorials and guides to get you going.
  • Also, pre-register your data analysis plan. Pre-registration typically includes how and when you will stop collecting data, how you will deal with violations of statistical assumptions and points of influence (“outliers”), the specific measures you will use, and the analyses you will use to test each hypothesis, possibly including the analysis script. Again, there is a lot of help available for this.
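
One way to make a data analysis plan concrete before collecting data is to write its decisions down as a fixed structure that the analysis script later reads. The sketch below is only an illustration of that idea: the class `AnalysisPlan`, its fields and the example values are hypothetical, and it is not an OSF format or a required template.

```python
from dataclasses import dataclass
from statistics import mean, stdev


@dataclass(frozen=True)  # frozen: the plan cannot be changed after creation
class AnalysisPlan:
    target_n: int            # stopping rule: stop at this many participants
    outlier_z_cutoff: float  # rule for points of influence ("outliers")
    measure: str             # the specific measure to be analyzed
    confirmatory_test: str   # the analysis used to test the hypothesis


def should_stop(plan, n_collected):
    """Apply the pre-specified stopping rule."""
    return n_collected >= plan.target_n


def flag_outliers(plan, values):
    """Flag values further from the mean than the pre-specified cutoff (in SDs)."""
    m, s = mean(values), stdev(values)
    return [abs(v - m) / s > plan.outlier_z_cutoff for v in values]


# Hypothetical plan, fixed before any data are collected.
plan = AnalysisPlan(
    target_n=120,
    outlier_z_cutoff=3.0,
    measure="reaction_time_ms",
    confirmatory_test="independent-samples t-test",
)

print(should_stop(plan, 80))
```

Because the stopping and outlier rules are fixed in advance, the analysis cannot quietly drift toward whichever choice yields a significant result.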

Exploratory and Confirmatory Research Are Both of Value, But Do Not Confuse the Two

We note that this document largely concerns confirmatory research (i.e., testing hypotheses). We by no means intend to devalue exploratory research. Indeed, it is one of the primary ways that hypotheses are generated for (possible) confirmation. Instead, we emphasize that you should explicitly indicate, both in your writing and in your preregistration plan, which of your analyses are exploratory and which are confirmatory. Please note also that if you are engaged in exploratory research, then Null Hypothesis Significance Testing (NHST) should probably be avoided (see rationale in Gigerenzer (2004) and Wagenmakers et al. (2012)).

This document is structured around the stages of thesis work: hypothesizing, design, data collection, analyses, and reporting – consistent with the headings used by Wicherts et al. (2016). We also list the Questionable Research Practices associated with each stage and provide suggestions for avoiding them. We strongly advise going through all of these sections during thesis/dissertation proposal meetings because a priori decisions need to be made prior to data collection (including analysis decisions).

To help to ensure that the student has informed the committee about key decisions at each stage, there are check boxes at the end of each section.

How to Use This Document in a Proposal Meeting

  • Print off a copy of this document and take it to the proposal meeting.
  • During the meeting, use the document to seek assistance from faculty to address potential problems.
  • Revisit responses to issues raised by this document (especially the Analysis and Reporting Stages) when you are seeking approval to proceed to defense.

Consultation and Help Line

Note that the Center for Open Science now has a help line (for individual researchers and labs) you can call for help with open science issues. They also have training workshops. Please see their website for details.

  • Hypothesizing
  • Data Collection

PhD Dissertations

2024
Title Author Supervisor
Statistical Learning and Modeling with Graphs and Networks
Estimation and Inference of Optimal Policies
2023
Statistical Methods for the Analysis and Prediction of Hierarchical Time Series Data with Applications to Demography
Exponential Family Models for Rich Preference Ranking Data
Bayesian methods for variable selection
Statistical methods for genomic sequencing data
Mixture models to fit heavy-tailed, heterogeneous or sparse data
Addressing double dipping through selective inference and data thinning
Methods for the Statistical Analysis of Preferences, with Applications to Social Science Data
Estimating subnational health and demographic indicators using complex survey data
Inference and Estimation for Network Data
Interpretation and Validation for unsupervised learning
2022
Likelihood-based haplotype frequency modeling using variable-order Markov chains
Statistical Divergences for Learning and Inference: Limit Laws and Non-Asymptotic Bounds
Missing Data Methods for Observational Health Dataset
Methods, Models, and Interpretations for Spatial-Temporal Public Health Applications
Statistical Methods for Clustering and High Dimensional Time Series Analysis
Causal Structure Learning in High Dimensions
Geometric algorithms for interpretable manifold learning
2021
Improving Uncertainty Quantification and Visualization for Spatiotemporal Earthquake Rate Models for the Pacific Northwest
Statistical modeling of long memory and uncontrolled effects in neural recordings
Distribution-free consistent tests of independence via marginal and multivariate ranks
Causality, Fairness, and Information in Peer Review
Subnational Estimation of Period Child Mortality in a Low and Middle Income Countries Context
Progress in nonparametric minimax estimation and high dimensional hypothesis testing
Likelihood Analysis of Causal Models
Bayesian Models in Population Projections and Climate Change Forecast
2020
Statistical Methods for Adaptive Immune Receptor Repertoire Analysis and Comparison
Statistical Methods for Geospatial Modeling with Stratified Cluster Survey Data
Representation Learning for Partitioning Problems
Estimation and Inference in Changepoint Models
Space-Time Contour Models for Sea Ice Forecasting
Non-Gaussian Graphical Models: Estimation with Score Matching and Causal Discovery under Zero-Inflation
Scalable Learning in Latent State Sequence Models
2019
Latent Variable Models for Prediction & Inference with Proxy Network Measures
Bayesian Hierarchical Models and Moment Bounds for High-Dimensional Time Series
Estimation and testing under shape constraints
Inferring network structure from partially observed graphs
Fitting Stochastic Epidemic Models to Multiple Data Types
Realized genome sharing in random effects models for quantitative genetic traits
Large-Scale B Cell Receptor Sequence Analysis Using Phylogenetics and Machine Learning
Statistical Methods for Manifold Recovery and C^(1, 1) Regression on Manifolds
2018
Topics in Statistics and Convex Geometry: Rounding, Sampling, and Interpolation
Estimation and Testing Following Model Selection
Topics on Least Squares Estimation
Discovering Interaction in Multivariate Time Series
Nonparametric inference on monotone functions, with applications to observational studies
Model-Based Penalized Regression
Bayesian Methods for Graphical Models with Limited Data
Parameter Identification and Assessment of Independence in Multivariate Statistical Modeling
Preferential sampling and model checking in phylodynamic inference
Linear Structural Equation Models with Non-Gaussian Errors: Estimation and Discovery
Coevolution Regression and Composite Likelihood Estimation for Social Networks
2017
"Applications of Robust Statistical Methods in Quantitative Finance"
"Scalable Manifold Learning and Related Topics"
"Topics in Graph Clustering"
"Methods for Estimation and Inference for High-Dimensional Models"
"Scalable Methods for the Inference of Identity by Descent"
2016
"Space-Time Smoothing Models for Surveillance and Complex Survey Data"
"Testing Independence in High Dimensions & Identifiability of Graphical Models"
"Likelihood-Based Inference for Partially Observed Multi-Type Markov Branching Processes"
"Bayesian Methods for Inferring Gene Regulatory Networks"
"Finite Sampling Exponential Bounds"
"Finite Population Inference for Causal Parameters"
"Projection and Estimation of International Migration"
"Statistical Hurdle Models for Single Cell Gene Expression: Differential Expression and Graphical Modeling"
2015
"Theory and Methods for Tensor Data"
"Discrete-Time Threshold Regression for Survival Data with Time-Dependent Covariates"
"Degeneracy, Duration, and Co-Evolution: Extending Exponential Random Graph Models (ERGM) for Social Network Analysis"
"The Likelihood Pivot: Performing Inference with Confidence"
"Lord's Paradox and Targeted Interventions: The Case of Special Education"
"Bayesian Modeling of a High Resolution Housing Price Index"
"Phylogenetic Stochastic Mapping"
2014
"Monte Carlo Estimation of Identity by Descent in Populations"
"Bayesian Spatial and Temporal Methods for Public Health Data"
"Functional Quantitative Genetics and the Missing Heritability Problem"
"Predictive Modeling of Cholera Outbreaks in Bangladesh"
"Gravimetric Anomaly Detection Using Compressed Sensing"
"R-Squared Inference Under Non-Normal Error"
2013
"Learning and Manifolds: Leveraging the Intrinsic Geometry"
"An Algorithmic Framework for High Dimensional Regression with Dependent Variables"
"Bayesian Population Reconstruction: A Method for Estimating Age- and Sex-Specific Vital Rates and Population Counts with Uncertainty from Fragmentary Data"
"Bayesian Nonparametric Inference of Effective Population Size Trajectories from Genomic Data"
"Modeling Heterogeneity Within and Between Matrices and Arrays"
"Shape-Constrained Inference for Concave-Transformed Densities and their Modes"
"Statistical Inference Using Kronecker Structured Covariance"
2012
"Bayesian Modeling For Multivariate Mixed Outcomes With Applications To Cognitive Testing Data"
"Tests for Differences between Least Squares and Robust Regression Parameter Estimates and Related Topics"
"Bayesian Modeling of Health Data in Space and Time"
"Coordinate-Free Exponential Families on Contingency Tables"
2011
"Seeing the trees through the forest; a competition model for growth and mortality"
"Bayesian Inference of Exponential-family Random Graph Models for Social Networks"
"Statistical Models for Estimating and Predicting HIV/AIDS Epidemics"
"Modeling the Game of Soccer Using Potential Functions"
"Parametrizations of Discrete Graphical Models"
"A Bayesian Surveillance System for Detecting Clusters of Non-Infectious Diseases"
"Statistical Approaches to Analyze Mass Spectrometry Data"
2010
"Multivariate Geostatistics and Geostatistical Model Averaging"
"Covariance estimation in the Presence of Diverse Types of Data"
"Portfolio Optimization with Tail Risk Measures and Non-Normal Returns"
"Convex analysis methods in shape constrained estimation."
"Estimating social contact networks to improve epidemic simulation models"
2009
"Models for Heterogeneity in Heterosexual Partnership Networks"
"A comparison of alternative methodologies for estimation of HIV incidence"
"Bayesian Model Averaging and Multivariate Conditional Independence Structures"
"Conditional tests for localizing trait genes"
"Combining and Evaluating Probabilistic Forecasts"
"Probabilistic weather forecasting using Bayesian model averaging"
"Statistical Analysis of Portfolio Risk and Performance Measures: the Influence Function Approach"
"Factor Model Monte Carlo Methods for General Fund-of-Funds Portfolio Management"
"Statistical Models for Social Network Data and Processes"
2008
"Inference from partially-observed network data"
"Models and Inference of Transmission of DNA Methylation Patterns in Mammalian Somatic Cells"
"Estimates and projections of the total fertility rate"
"Nonparametric estimation of multivariate monotone densities"
"Learning transcriptional regulatory networks from the integration of heterogeneous high-throughput data"
"Extensions of Latent Class Transition Models with Application to Chronic Disability Survey Data"
"Statistical Solutions to Some Problems in Medical Imaging"
"Statistical methods for peptide and protein identification using mass spectrometry"
2007
"Statistical Methodology for Longitudinal Social Network Data"
"Probabilistic weather forecasting with spatial dependence"
"Wavelet variance analysis for time series and random fields"
"Bayesian hierarchical curve registration"
"'Up-and-Down' and the Percentile-Finding Problem"
2006
"An efficient and flexible model for patterns of population genetic variation"
"Learning in Spectral Clustering"
"Variable selection and other extensions of the mixture model clustering framework"
"Algorithms for Estimating the Cluster Tree of a Density"
"Likelihood inference for population structure, using the coalescent"
"Exploring rates and patterns of variability in gene conversion and crossover in the human genome"
"Alleviating ecological bias in generalized linear models and optimal design with subsample data"
"Nonparametric estimation for current status data with competing risks"
"Goodness-of-fit statistics based on phi-divergences"
2005
"Alternative models for estimating genetic maps from pedigree data"
"Allele-sharing methods for linkage detection using extended pedigrees"
"Robust estimation of factor models in finance"
"Using the structure of d-connecting paths as a qualitative measure of the strength of dependence"
"Alternative estimators of wavelet variance"
"Bayesian robust analysis of gene expression microarray data"
2004
"Maximum likelihood estimation in Gaussian AMP chain graph models and Gaussian ancestral graph models"
"Nonparametric estimation of a k-monotone density: A new asymptotic distribution theory"
2003
"The genetic structure of related recombinant lines"
"Joint relationship inference from three or more individuals in the presence of genotyping error"
"Personal characteristics and covariate measurement error in disease risk estimation"
"Model based and hybrid clustering of large datasets"
2002
"Practical importance sampling methods for finite mixture models and multiple imputation"
"Applying graphical models to partially observed data-generating processes"
"Generalized linear mixed models: development and comparison of different estimation methods"
2001
"Bayesian inference for deterministic simulation models for environmental assessment"
"Modeling recessive lethals: An explanation for excess sharing in siblings"
"Estimation with bivariate interval censored data"
"Latent models for cross-covariance"
2000
"Wavelet-based estimation for trend contaminated long memory processes"
"Global covariance modeling: A deformation approach to anisotropy"
"Likelihood inference for parametric models of dispersal"
"Bayesian inference in hidden stochastic population processes"
"Logic regression and statistical issues related to the protein folding problem"
"Likelihood ratio inference in regular and non-regular problems"
"Estimating the association between airborne particulate matter and elderly mortality in Seattle, Washington using Bayesian Model Averaging"
"Nonhomogeneous hidden Markov models for downscaling synoptic atmospheric patterns to precipitation amounts"
"Detecting and extracting complex patterns from images and realizations of spatial point processes"
"A model selection approach to partially linear regression"
1999
"Generalization of boosting algorithms and applications of Bayesian inference for massive datasets"
"Bayesian inference for noninvertible deterministic simulation models, with application to bowhead whale assessment"
"Monte Carlo likelihood calculation for identity by descent data"
"Fast automatic unsupervised image segmentation and curve detection in spatial point processes"
"Semiparametric inference based on estimating equations in regressions models for two phase outcome dependent sampling"
"Capture-recapture estimation of bowhead whale population size using photo-identification data"
"Lifetime and disease onset distributions from incomplete observations"
"Statistical approaches to distinct value estimation"
1998
"Application of ridge regression for improved estimation of parameters in compartmental models"
"Bayesian modeling of highly structured systems using Markov chain Monte Carlo"
"Assessing nonstationary time series using wavelets"
"Lattice conditional independence models for incomplete multivariate data and for seemingly unrelated regressions"
"Estimation for counting processes with incomplete data"
"Regularization techniques for linear regression with a large set of carriers"
"Large sample theory for pseudo maximum likelihood estimates in semiparametric models"
"Additive mixture models for multichannel image data"
1997
"Phylogenies via conditional independence modeling"
"Bayesian model averaging in censored survival models"
"Bayesian information retrieval"
"Statistical inference for partially observed markov population processes"
"Tools for the advancement of undergraduate statistics education"
"A new learning procedure in acyclic directed graphs"
1996
"Variability estimation in linear inverse problems"
"Inference in a discrete parameter space"
"Bootstrapping functional m-estimators"
1995
"Estimation of heterogeneous space-time covariance"
"Semiparametric estimation of major gene and random environmental effects for age of onset"
"Statistical analysis of biological monitoring data: State-space models for species compositions"
1994
"Spatial applications of Markov chain Monte Carlo for bayesian inference"
"Accounting for model uncertainty in linear regression"
"Robust estimation in point processes"
"Multilevel modeling of discrete event history data using Markov chain Monte Carlo methods"
"Estimation in regression models with interval censoring"
1993
"A Bayesian framework and importance sampling methods for synthesizing multiple sources of evidence and uncertainty linked by a complex mechanistic model"
"State-space modeling of salmon migration and Monte Carlo Alternatives to the Kalman filter"
"The Poisson clumping heuristic and the survival of genome in small pedigrees"
"Markov chain Monte Carlo estimates of probabilities on complex structures"
"A class of stochastic models for relating synoptic atmospheric patterns to local hydrologic phenomena"
1992
"Auxiliary and missing covariate problems in failure time regression analysis"
"A high order hidden markov model"
"Bayesian methods for the analysis of misclassified or incomplete multivariate discrete data"
1991
"General-weights bootstrap of the empirical process"
"The weighted likelihood bootstrap and an algorithm for prepivoting"
1990
"Modelling agricultural field trials in the presence of outliers and fertility jumps"
"Modeling and bootstrapping for non-gaussian time series"
"Genetic restoration on complex pedigrees"
"Incorporating covariates into a beta-binomial model with applications to medicare policy: A Bayes/empirical Bayes approach"
"Likelihood and exponential families"
1989
"Classical inference in spatial statistics"
"Estimation of mixing and mixed distributions"
1988
"Exploratory methods for censored data"
"Aspects of robust analysis in designed experiments"
"Diagnostics for time series models"
"Constrained cluster analysis and image understanding"
1987
"Time series models for continuous proportions"
"The data viewer: A program for graphical data analysis"
"Additive principal components: A method for estimating additive constraints with small variance from multivariate data"
"Kullback-Leibler estimation of probability measures with an application to clustering"
1986
"Estimation for infinite variance autoregressive processes"
"A computer system for Monte Carlo experimentation"
1985
"Robust estimation for the errors-in-variables model"
"Robust statistics on compact metric spaces"
"Weak convergence and a law of the iterated logarithm for processes indexed by points in a metric space"
1983
"The statistics of long memory processes"

Statistics PhD theses

2015 onwards.

Abdulrafiu Babatunde Odunuga
Philip Maybank
Natalie Dimier
Chintu Desai | Statistical study designs for phase III pharmacogenetic clinical trials
Frank Owusu-Ansah | Methodology for joint modelling of spatial variation and competition effects in the analysis of varietal selection trials
Supada Charoensawat | A likelihood approach based upon the proportional hazards model for SROC modelling in meta-analysis of diagnostic studies
Pianpool Kirdwichai | A nonparametric regression approach to the analysis of genomewide association studies
Reynaldo Martina | DStat thesis: Challenges in modelling pharmacogenetic data: Investigating biomarker and clinical response simultaneously for optimal dose prediction
Rungruttikarn Moungmai | Family-based genetic association studies in a likelihood framework
Michael Dunbar | Multiple hydro-ecological stressor interactions assessed using statistical models
Osama Abdulhey | Alcohol consumption and mortality from all and specific causes: the J-hypothesis. A systematic review and meta-analysis of current and historical evidence
Rattana Lerdsuwansri | Generalisation of the Lincoln-Peterson approach to non-binary source variables
Krisana Lanumteang | Estimation of the size of a target population using Capture-Recapture methods based upon multiple sources and continuous time experiments
Rainer-Georg Göldner | Investigation of new single locus and multivariate methods for the analysis of genetic association studies
Isak Neema | Survey and monitoring crimes in Namibia through the likelihood based cluster analysis
Mercedes Andrade Bejarano | Monthly average temperature modelling for Valle del Cauca (Colombia)
Robert Mastrodomenico | Statistical analysis of genetic association studies
Ruth Butler | DStat thesis: An exploration of the statistical consequences of sub-sampling for species identification
Carmen Ybarra Moncada | Multivariate methods with application to spectroscopy
Alun Bedding | The Bayesian analysis of dose titration to effect in Phase II clinical trials in order to design Phase III
Timothy Montague | Adaptive designs for bioequivalence trials
Magnus Kjaer | Clinical trials of cytostatic agents with repeated measurements: using the regression coefficients as response
Kamziah Abd Kudus | Survival analysis models for interval censored data with application to an plantation spacing trial
Isobel Barnes | Point estimation after a sequential clinical trial
Ben Carter | Statistical methodology for the analysis of microarray data
Joanna Burke | Regularised regression in QTL mapping
Alexandre M F G da Silva | Methods for the analysis of multivariate lifetime data with frailty
Harsukhjit Deo Analysis of a Quantitative Trait Locus for twin data using univariate and multivariate linear mixed effects models
Kim Bolland The design and analysis of neurological trials yielding repeated ordinal data
Fazil Baksh Sequential tests of association with applications in genetic epidemiology
Martyn Byng A statistical model for locating regulatory regions in novel DNA sequences
Rob Deardon Representation bias in field trials for airborne plant pathogens
Marian Hamshere Statistical aspects of objects generated by dynamic processes at sea, detected by remote sensing techniques
Mike Branson The analysis of survival data in which patients switch treatments
Christoph Lang Generalised estimating equation methods in statistical genetics
V R P Putcha Random effects in survival analysis
Robin Fletcher Statistical inversion of surface parameters from ATSR-2 satellite observations
Seth Ohemeng-Dapaah Methods for analysis and interpretation of genotype by environment interaction
Emmanuelle Vincent Sequential designs for clinical trials involving multiple treatments
Pi Wen Tsai Three-level designs robust to model uncertainty
Jo Farebrother Statistical design and analysis of factorial combination drug trials
Mark Lennon Design and analysis of multiple site large plot field experiments
Norberto Lavorenti Fitting models in a bivariate analysis of intercrops
Bernard North Contributions to survival analysis
Karen Ayres Measuring genetic correlations within and between loci, with implications for disequilibrium mapping and forensic identification
Andrew Morris Transmission tests of linkage and association using samples of nuclear families with at least one affected child
Julian Higgins Exploiting information in random effects meta-analysis
Mohammed Inayat Khan Improving precision of agricultural field experiments in Pakistan
Luzia Trinca Blocking response surface designs
Phil Bowtell Non-linear functional relationships
Louise Burt Statistical modelling of volcanic hazards
Helen Millns The application of statistical methods to the analysis of diet and coronary heart disease in Scotland
Dominic Neary Methods of analysis for ordinal repeated measures data
Graham Pursey Shape location and classification with reference to fungal spores
Nigel Stallard Increasing efficiency in the design and analysis of animal toxicology studies
Katarzyna Stepniewska Some variable selection problems in medical research


A Handbook of Statistical Analyses using SPSS

Sam Winchester

Related Papers

Hariono Lee


Agus Kurniawan

Muhammad Ali

Efthymia Nikita

These notes are a basic guide to statistics using SPSS and are primarily written for undergraduate students, as well as for graduates and scholars with limited statistical knowledge. Explanations of basic statistical concepts are provided, along with an introduction to confidence intervals and a brief description of each statistical test. However, the notes focus on applied statistics. In other words, emphasis is placed on understanding which statistical test should be used for different types of data and different types of problems under examination. Both descriptive and inferential statistics are covered and a step-by-step description of how to run the most widely used univariate and multivariate tests is provided.

Teaching Sociology

Karen Grace-Martin

Daniel Dolezal

Redwanur Rahman Chowdhury

Tutorial and Guidelines for doing SPSS Analysis, for Dissertation, Thesis, Research and Business Analysis

KRIPARAJ KUNNATH

Data analysis using statistical package for social sciences

BlueLotus JM


Bala Subramanian R

babik hassan Adam

The American Statistician

Hayat Malik

Edward T Vieira, Jr.

Jamie DeCoster

Behavior Research Methods, Instruments, & Computers

Brooke Feeney

lablanjut sttpln

FRIDAY CHRISTOPHER

Dira Andinnie

International Research Journal of MMC

Dr. Lok R A J Sharma

Psychology Press eBooks

sanford braver

Dolores Frias-Navarro

Arif Riska Nurcahyo

Michael DeCesare

Mark Elliot

OCEM Journal of Management, Technology & Social Sciences

Sarad Kafle

International Journal of Trend in Scientific Research and Development

Jamez Hetfield

صادق العكيلي

Berlyn Hakim

Comparison of Life Cycle Analysis Methodologies and Practical Applications in Textile Development



Neuromarketing and big data analysis of banking firms’ website interfaces and performance.


1. Introduction

2. Literature Background

2.1. Banking Firms, Digital Marketing, and User Engagement

2.2. Metrics and KPIs of Friendly Website User Interface (UI)

2.3. Neuromarketing and Big Data Analysis Implications on Website Interface and Performance

2.4. Hypotheses Development

3. Materials and Methods

3.1. Methodological Concept

  • The research started with the collection of data on website customers and digital marketing activities from banking firm websites. Each website’s user behavioral data (pages per visit, bounce rate, time on site, etc.) were sourced from the Semrush platform [61], which enables the extraction of big data from corporate webpages.
  • The next step involved statistical analysis using descriptive statistics, correlation, and linear regression. The coefficients obtained indicate the impact of banking firms’ website customer data on their digital marketing and interface performance metrics, including purchase conversion, display ads, organic traffic, and bounce rate.
  • After the statistical analysis, a hybrid model (HM) combining agent-based modeling (ABM) and System Dynamics (SD) was used for the simulation. The hybrid model was built in AnyLogic (version 8.9.1) [62] and simulates the relationships between the study’s dependent and independent variables over 360 days. It aims to represent the dynamic interaction between banking firms’ website interface metrics and the key metrics of their digital marketing strategies.
  • The final stage applied a neuromarketing approach to gain deeper insights from 26 participants who viewed the websites of the selected banking firms. Participants were instructed to search and observe the selected banking firm websites, and the financial products and services they provide, within 20 s. Eye-tracking and heatmap analysis were conducted using the SeeSo Web Analysis platform (Eyedid SDK) [63]. This qualitative stage seeks to extract additional information about the participants’ onsite activity and engagement.
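The statistical step in the list above (descriptive statistics, correlation, and linear regression) can be sketched in plain Java. This is a minimal illustration, not the authors' analysis: the class name `WebMetricsStats`, its helper methods, and every data value are hypothetical, and a one-predictor fit stands in for whatever regression specification the study actually used.

```java
import java.util.Arrays;

/**
 * Illustrative sketch only: descriptive statistics, Pearson correlation, and a
 * one-predictor least-squares fit. All data values below are made up.
 */
public class WebMetricsStats {

    /** Arithmetic mean. */
    static double mean(double[] v) {
        return Arrays.stream(v).average().orElse(Double.NaN);
    }

    /** Sample standard deviation (n - 1 in the denominator). */
    static double sampleStd(double[] v) {
        double m = mean(v);
        double ss = 0.0;
        for (double x : v) {
            ss += (x - m) * (x - m);
        }
        return Math.sqrt(ss / (v.length - 1));
    }

    /** Pearson correlation coefficient between two equally long series. */
    static double pearson(double[] x, double[] y) {
        double mx = mean(x), my = mean(y);
        double sxy = 0.0, sxx = 0.0, syy = 0.0;
        for (int i = 0; i < x.length; i++) {
            sxy += (x[i] - mx) * (y[i] - my);
            sxx += (x[i] - mx) * (x[i] - mx);
            syy += (y[i] - my) * (y[i] - my);
        }
        return sxy / Math.sqrt(sxx * syy);
    }

    /** Ordinary least squares for y = intercept + slope * x; returns {intercept, slope}. */
    static double[] fitLine(double[] x, double[] y) {
        double mx = mean(x), my = mean(y);
        double sxy = 0.0, sxx = 0.0;
        for (int i = 0; i < x.length; i++) {
            sxy += (x[i] - mx) * (y[i] - my);
            sxx += (x[i] - mx) * (x[i] - mx);
        }
        double slope = sxy / sxx;
        return new double[] {my - slope * mx, slope};
    }

    public static void main(String[] args) {
        // Hypothetical per-website observations, for illustration only.
        double[] pagesPerVisit = {2.1, 3.4, 2.8, 4.0, 3.1};
        double[] bounceRate = {0.62, 0.41, 0.55, 0.35, 0.48};

        double[] fit = fitLine(pagesPerVisit, bounceRate);
        System.out.printf("mean=%.3f sd=%.3f r=%.3f%n",
                mean(bounceRate), sampleStd(bounceRate), pearson(pagesPerVisit, bounceRate));
        System.out.printf("bounceRate ~ %.3f + %.3f * pagesPerVisit%n", fit[0], fit[1]);
    }
}
```

The fitted slope plays the role of the coefficients mentioned in the second step: it is the estimated change in the dependent metric per unit change in the predictor.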

3.2. Fuzzy Cognitive Mapping (FCM) Framework

3.3. Research Sample

4.1. Statistical Analysis

4.2. Simulation Model

4.3. Neuromarketing Applications

5. Discussion

6. Conclusions

6.1. Theoretical, Practical, and Managerial Implications

6.2. Future Work and Limitations

Author Contributions

Data Availability Statement

Conflicts of Interest

Java Code of AnyLogic Simulation
@AnyLogicInternalCodegenAPI
 private void enterState(statechart_state self, boolean _destination) {
  switch( self ) {
   case Potential_Bank_Customers:
     logToDBEnterState(statechart, self);
    // (Simple state (not composite))
    statechart.setActiveState_xjal(Potential_Bank_Customers);
    transition1.start();
    transition2.start();
    return;
   case Return_Visitors:
     logToDBEnterState(statechart, self);
    // (Simple state (not composite))
    statechart.setActiveState_xjal(Return_Visitors);
    {
return_Visitors++;

pages_per_Visit = normal(0.97, 3.43);

visit_Duration = normal(128.25/60, 519.40/60);

referral_Domains = normal(794.22, 51181.91);

email_Sources = normal(300170.77, 184876.14);
}
    transition3.start();
    transition5.start();
    return;
   case Bounce_Rate:
     logToDBEnterState(statechart, self);
    // (Simple state (not composite))
    statechart.setActiveState_xjal(Bounce_Rate);
    {
bounce_Rate = organic_Traffic*(1.045) + paid_Costs*(0.025) + referral_Domains*(0.334) + email_Sources*(-0.043);
}
    transition.start();
    return;
   case Visitors_To_Traffic:
     logToDBEnterState(statechart, self);
    // (Simple state (not composite))
    statechart.setActiveState_xjal(Visitors_To_Traffic);
    transition7.start();
    transition8.start();
    return;
   case Organic_Traffic:
     logToDBEnterState(statechart, self);
    // (Simple state (not composite))
    statechart.setActiveState_xjal(Organic_Traffic);
    {
organic_Costs = normal(5822486.64, 37155781.98);

organic_Traffic = paid_Costs*(-0.024) + referral_Domains*(-0.319) + email_Sources*(0.041);
}
    transition13.start();
    return;
   case Display_Ads:
     logToDBEnterState(statechart, self);
    // (Simple state (not composite))
    statechart.setActiveState_xjal(Display_Ads);
    {
display_Ads = paid_Costs*(0.198) + referral_Domains*(-0.065) + email_Sources*(-0.135);
}
    transition10.start();
    transition11.start();
    return;
   case Purchase_Convertion:
     logToDBEnterState(statechart, self);
    // (Simple state (not composite))
    statechart.setActiveState_xjal(Purchase_Convertion);
    {
purchase_Convertion = organic_Costs*(-1.670) + paid_Costs*(-1.369) + referral_Domains*(1.696) + email_Sources*(0.167);
}
    transition9.start();
    return;
   case Paid_Traffic:
     logToDBEnterState(statechart, self);
    // (Simple state (not composite))
    statechart.setActiveState_xjal(Paid_Traffic);
    {
paid_Costs = normal(406005.96, 1514463.27);

paid_Traffic = normal(666.9666, 3378.9857);
}
    transition14.start();
    return;
   case New_Visitors:
     logToDBEnterState(statechart, self);
    // (Simple state (not composite))
    statechart.setActiveState_xjal(New_Visitors);
    {
new_Visitors++;

pages_per_Visit = normal(0.97, 3.43);

visit_Duration = normal(128.25/60, 519.40/60);

referral_Domains = normal(794.22, 51181.91);

email_Sources = normal(300170.77, 184876.14);
}
    transition4.start();
    transition6.start();
    return;
   default:
    return;
  }
 }
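The generated method above depends on the AnyLogic runtime, so it cannot be run on its own. The following standalone sketch shows how one of its state actions (the Bounce_Rate linear combination) could be evaluated outside AnyLogic. It is an illustration under stated assumptions: the class `BounceRateSketch` and its helpers are hypothetical, `java.util.Random` stands in for AnyLogic's random number stream, and the two-argument `normal` is assumed to take the standard deviation first and the mean second, which should be checked against the AnyLogic documentation.

```java
import java.util.Random;

/** Illustrative stand-in for the Bounce_Rate state action; not AnyLogic code. */
public class BounceRateSketch {

    /**
     * Draws one normally distributed value.
     * Assumption: the argument order (sigma, mean) mirrors the listing's normal(...) calls.
     */
    static double normal(Random rng, double sigma, double mean) {
        return mean + sigma * rng.nextGaussian();
    }

    /** Linear combination with the coefficients of the Bounce_Rate state in the listing. */
    static double bounceRate(double organicTraffic, double paidCosts,
                             double referralDomains, double emailSources) {
        return organicTraffic * 1.045
                + paidCosts * 0.025
                + referralDomains * 0.334
                + emailSources * -0.043;
    }

    public static void main(String[] args) {
        Random rng = new Random(42); // fixed seed so repeated runs draw the same values

        // Inputs drawn with the parameters that appear in the listing.
        double referralDomains = normal(rng, 794.22, 51181.91);
        double emailSources = normal(rng, 300170.77, 184876.14);
        double paidCosts = normal(rng, 406005.96, 1514463.27);

        // Organic_Traffic state action from the listing.
        double organicTraffic = paidCosts * -0.024
                + referralDomains * -0.319
                + emailSources * 0.041;

        System.out.printf("bounce_Rate = %.2f%n",
                bounceRate(organicTraffic, paidCosts, referralDomains, emailSources));
    }
}
```

Keeping the coefficients in one pure method makes each regression equation easy to check by hand before it is wired into a statechart.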
  • Hennig-Thurau, T.; Malthouse, E.C.; Friege, C.; Gensler, S.; Lobschat, L.; Rangaswamy, A.; Skiera, B. The impact of new media on customer relationships. J. Serv. Res. 2010 , 13 , 311–330. [ Google Scholar ] [ CrossRef ]
  • Broby, D. Financial technology and the future of banking. Financ. Innov. 2021 , 7 , 1–19. [ Google Scholar ] [ CrossRef ]
  • Ding, Q.; He, W. Digital transformation, monetary policy and risk-taking of banks. Financ. Res. Lett. 2023 , 55 , 103986. [ Google Scholar ] [ CrossRef ]
  • Shukla, S. Analyzing customer engagement through e-CRM: The role of relationship marketing in the era of digital banking in Varanasi banks. J. Commer. Econ. Comput. Sci. 2021 , 7 , 57–65. [ Google Scholar ]
  • Hendriyani, C.; Raharja, S.J. Analysis building customer engagement through eCRM in the era of digital banking in Indonesia. Int. J. Econ. Policy Emerg. Econ. 2018 , 11 , 479–486. [ Google Scholar ]
  • Vivek, S.D.; Beatty, S.E.; Morgan, R.M. Customer engagement: Exploring customer relationships beyond purchase. J. Mark. Theory Pract. 2012 , 20 , 122–146. [ Google Scholar ] [ CrossRef ]
  • Lee, D.; Hosanagar, K.; Nair, H.S. Advertising content and consumer engagement on social media: Evidence from Facebook. Manag. Sci. 2018 , 64 , 5105–5131. [ Google Scholar ] [ CrossRef ]
  • Lin, K.-Y.; Lu, H.-P. Why people use social networking sites: An empirical study integrating network externalities and motivation theory. Comput. Hum. Behav. 2011 , 27 , 1152–1161. [ Google Scholar ] [ CrossRef ]
  • Lee, M.; Wang, Y.R.; Huang, C.F. Design and development of a friendly user interface for building construction traceability system. Microsyst. Technol. 2021 , 27 , 1773–1785. [ Google Scholar ] [ CrossRef ]
  • Faghih, B.; Azadehfar, M.; Katebi, S. User interface design for E-learning software. Int. J. Soft Comput. Softw. Eng. 2014 , 3 , 786–794. [ Google Scholar ] [ CrossRef ]
  • Cheng, S.; Yang, Y.; Xiu, L.; Yu, G. Effects of prior experience on the user experience of news aggregation app’s features—Evidence from a behavioral experiment. Int. J. Hum.-Comput. Interact. 2022 , 39 , 1271–1279. [ Google Scholar ] [ CrossRef ]
  • Nielsen, J.; Norman, D. The Definition of User Experience (UX) ; Nielsen Norman Group N N/g.: Fremont, CA, USA, 2018; Available online: https://www.nngroup.com/articles/definition-user-experience/ (accessed on 20 June 2024).
  • He, W.; Hung, J.-L.; Liu, L. Impact of big data analytics on banking: A case study. J. Enterp. Inf. Manag. 2023 , 36 , 459–479. [ Google Scholar ] [ CrossRef ]
  • Kalaganis, F.P.; Georgiadis, K.; Oikonomou, V.P.; Laskaris, N.A.; Nikolopoulos, S.; Kompatsiaris, I. Unlocking the Subconscious Consumer Bias: A Survey on the Past, Present, and Future of Hybrid EEG Schemes in Neuromarketing. Front. Neuroergonomics 2021 , 2 , 672982. [ Google Scholar ] [ CrossRef ]
  • Walker, P.R. How Does Website Design in the e-Banking Sector Affect Customer Attitudes and Behaviour? Ph.D. Thesis, University of Northumbria, Newcastle upon Tyne, UK, 2021. Available online: https://nrl.northumbria.ac.uk/id/eprint/5849/7/walker.philip_phd_(VOLUME_1of2).pdf (accessed on 12 June 2024).
  • Manser Payne, E.H.; Peltier, J.; Barger, V.A. Enhancing the value co-creation process: Artificial intelligence and mobile banking service platforms. J. Res. Interact. Mark. 2021 , 15 , 68–85. [ Google Scholar ] [ CrossRef ]
  • Diener, F.; Špacek, M. Digital transformation in banking: A managerial perspective on barriers to change. Sustainability 2021 , 13 , 2032. [ Google Scholar ] [ CrossRef ]
  • Khattak, M.A.; Ali, M.; Azmi, W.; Rizvi, S.A.R. Digital transformation, diversification and stability: What do we know about banks? Econ. Anal. Policy 2023 , 78 , 122–132. [ Google Scholar ] [ CrossRef ]
  • Giannakis-Bompolis, C.; Boutsouki, C. Customer Relationship Management in the Era of Social Web and Social Customer: An Investigation of Customer Engagement in the Greek Retail Banking Sector. Procedia Soc. Behav. Sci. 2014 , 148 , 67–78. [ Google Scholar ] [ CrossRef ]
  • Mogaji, E. Redefining banks in the digital era: A typology of banks and their research, managerial and policy implications. Int. J. Bank Mark. 2023 , 41 , 1899–1918. [ Google Scholar ] [ CrossRef ]
  • Salvi, A.; Petruzzella, F.; Raimo, N.; Vitolla, F. Transparency in the digitalization choices and the cost of equity capital. Qual. Res. Financ. Mark. 2023 , 15 , 630–646. [ Google Scholar ] [ CrossRef ]
  • Carmona, J.; Cruz, C. Banks’ social media goals and strategies. J. Bus. Res. 2018 , 91 , 31–41. [ Google Scholar ] [ CrossRef ]
  • Kosiba, J.P.; Boateng, H.; Okoe, A.F.; Hinson, R. Trust and customer engagement in the banking sector in Ghana. Serv. Ind. J. 2018 , 40 , 960–973. [ Google Scholar ] [ CrossRef ]
  • Del Sarto, N.; Bocchialini, E.; Gai, L.; Ielasi, F. Digital banking: How social media is shaping the game. Qual. Res. Financ. Mark. 2024 . ahead of print . [ Google Scholar ] [ CrossRef ]
  • Sakas, D.P.; Giannakopoulos, N.T.; Trivellas, P. Exploring affiliate marketing’s impact on customers’ brand engagement and vulnerability in the online banking service sector. Int. J. Bank Mark. 2023 , 42 , 1282–1312. [ Google Scholar ] [ CrossRef ]
  • Sakas, D.P.; Giannakopoulos, N.T.; Terzi, M.C.; Kamperos, I.D.G.; Kanellos, N. What is the connection between Fintechs’ video marketing and their vulnerable customers’ brand engagement during crises? Int. J. Bank Mark. 2023 , 42 , 1313–1347. [ Google Scholar ] [ CrossRef ]
  • Mbama, C.I.; Ezepue, P.O. Digital banking, customer experience and bank financial performance: UK customers’ perceptions. Int. J. Bank Mark. 2018 , 36 , 230–255. [ Google Scholar ] [ CrossRef ]
  • Khandelwal, R.; Kapoor, D. The Use of Digital Tools for Customer Engagement in the Financial Services Sector. In Revolutionizing Customer-Centric Banking through ICT ; IGI Global: Hershey, PA, USA, 2024; pp. 29–55. [ Google Scholar ]
  • Islam, J.U.; Shahid, S.; Rasool, A.; Rahman, Z.; Khan, I.; Rather, R.A. Impact of website attributes on customer engagement in banking: A solicitation of stimulus-organism-response theory. Int. J. Bank Mark. 2020 , 38 , 1279–1303. [ Google Scholar ] [ CrossRef ]
  • Lestari, D.M.; Hardianto, D.; Hidayanto, A.N. Analysis of user experience quality on responsive web design from its informative perspective. Int. J. Softw. Eng. Appl. 2014 , 8 , 53–62. [ Google Scholar ] [ CrossRef ]
  • Almeida, F.; Monteiro, J. Approaches and principles for UX web experiences: A case study approach. Int. J. Inf. Technol. Web Eng. 2017 , 12 , 49–65. [ Google Scholar ] [ CrossRef ]
  • Walsh, T.A.; Kapfhammer, G.M.; McMinn, P. Automated layout failure detection for responsive web pages without an explicit oracle. In Proceedings of the 26th ACM SIGSOFT International Symposium on Software Testing and Analysis, Santa Barbara, CA, USA, 10–14 July 2017. [ Google Scholar ] [ CrossRef ]
  • Rogers, Y.; Sharp, H.; Preece, J. Interaction Design: Beyond Human-Computer Interaction , 6th ed.; John Wiley & Sons Ltd.: New York, NY, USA, 2023. [ Google Scholar ]
  • ISO9241-11 ; Ergonomics of Human-System Interaction–Part 11: Usability for Definition and Concept. ISO: Geneva, Switzerland, 2018.
  • Hussain, I.; Khan, I.A.; Jadoon, W.; Jadoon, R.N.; Khan, A.N.; Shafi, M. Touch or click friendly: Towards adaptive user interfaces for complex applications. PLoS ONE 2024 , 19 , e0297056. [ Google Scholar ] [ CrossRef ] [ PubMed ]
  • Kim, S.; Cho, D. Technology Trends for UX/UI of Smart Contents. Korea Contents Assoc. Rev. 2016 , 14 , 29–33. [ Google Scholar ] [ CrossRef ]
  • Joo, H.S. A Study on UI/UX and Understanding of Computer Major Students. Int. J. Adv. Smart Converg. 2017 , 6 , 26–32. [ Google Scholar ]
  • Von Saucken, C.; Michailidou, I.; Lindemann, U. How to Design Experiences: Macro UX versus Micro UX Approach. Lect. Notes Comput. Sci. 2013 , 8015 , 130–139. [ Google Scholar ]
  • Instatus. Our Comprehensive List of Website Performance Metrics to Monitor. 2024. Available online: https://instatus.com/blog/website-performance-metrics (accessed on 20 June 2024).
  • Levrini, G.R.; Jeffman dos Santos, M. The influence of Price on purchase intentions: Comparative study between cognitive, sensory, and neurophysiological experiments. Behav. Sci. 2021 , 11 , 16. [ Google Scholar ] [ CrossRef ] [ PubMed ]
  • Gabriel, D.; Merat, E.; Jeudy, A.; Cambos, S.; Chabin, T.; Giustiniani, J.; Haffen, E. Emotional effects induced by the application of a cosmetic product: A real-time electrophysiological evaluation. Appl. Sci. 2021 , 11 , 4766. [ Google Scholar ] [ CrossRef ]
  • Filipović, F.; Baljak, L.; Naumović, T.; Labus, A.; Bogdanović, Z. Developing a web application for recognizing emotions in neuromarketing. In Marketing and Smart Technologies ; Springer: Berlin/Heidelberg, Germany, 2020; pp. 297–308. [ Google Scholar ]
  • Lee, N.; Broderick, A.J.; Chamberlain, L. What is ‘neuromarketing’? A discussion and agenda for future research. Int. J. Psychophysiol. 2007 , 63 , 199–204. [ Google Scholar ] [ CrossRef ] [ PubMed ]
  • Rawnaque, F.; Rahman, K.; Anwar, S.; Vaidyanathan, R.; Chau, T.; Sarker, F.; Mamun, K. Technological advancements and opportunities in Neuromarketing: A systematic review. Brain Inform. 2020 , 7 , 10. [ Google Scholar ] [ CrossRef ] [ PubMed ]
  • Ariely, D.; Berns, G. Neuromarketing: The hope and hype of neuroimaging in business. Nat. Rev. Neurosci. 2010 , 11 , 284–292. [ Google Scholar ] [ CrossRef ]
  • Sousa, J. Neuromarketing and Big Data Analytics for Strategic Consumer Engagement: Emerging Research and Opportunities ; IGI Global: Hershey, PA, USA, 2017. [ Google Scholar ] [ CrossRef ]
  • Šola, H.M.; Qureshi, F.H.; Khawaja, S. Exploring the Untapped Potential of Neuromarketing in Online Learning: Implications and Challenges for the Higher Education Sector in Europe. Behav. Sci. 2024 , 14 , 80. [ Google Scholar ] [ CrossRef ]
  • Berčík, J.; Neomániová, K.; Gálová, J. Using neuromarketing to understand user experience with the website (UX) and interface (UI) of a selected company. In The Poprad Economic and Management Forum 2021, Conference Proceedings from International Scientific Conference, Poprad, Slovak Republic, 14 October 2021 ; Madzík, P., Janošková, M., Eds.; VERBUM: Ružomberok, Slovakia, 2021; pp. 246–254. [ Google Scholar ]
  • Golnar-Nik, P.; Farashi, S.; Safari, M. The application of EEG power for the prediction and interpretation of consumer decision-making: A neuromarketing study. Physiol. Behav. 2019 , 207 , 90–98. [ Google Scholar ] [ CrossRef ]
  • Uygun, Y.; Oguz, R.F.; Olmezogullari, E.; Aktas, M.S. On the Large-scale Graph Data Processing for User Interface Testing in Big Data Science Projects. In Proceedings of the 2020 IEEE International Conference on Big Data (Big Data), Atlanta, GA, USA, 10–13 December 2020; pp. 2049–2056. [ Google Scholar ] [ CrossRef ]
  • Li, L.; Zhang, J. Research and Analysis of an Enterprise E-Commerce Marketing System under the Big Data Environment. J. Organ. End User Comput. 2021 , 33 , 1–19. [ Google Scholar ] [ CrossRef ]
  • Sakas, D.P.; Giannakopoulos, N.T.; Terzi, M.C.; Kanellos, N.; Liontakis, A. Digital Transformation Management of Supply Chain Firms Based on Big Data from DeFi Social Media Profiles. Electronics 2023 , 12 , 4219. [ Google Scholar ] [ CrossRef ]
  • Bala, M.; Verma, D. A Critical Review of Digital Marketing. Int. J. Manag. IT Eng. 2018 , 8 , 321–339. Available online: https://ssrn.com/abstract=3545505 (accessed on 20 July 2024).
  • Pongpaew, W.; Speece, M.; Tiangsoongnern, L. Social presence and customer brand engagement on Facebook brand pages. J. Prod. Brand Manag. 2017 , 26 , 262–281. [ Google Scholar ] [ CrossRef ]
  • Chaffey, D.; Ellis-Chadwick, F. Digital Marketing ; Pearson: London, UK, 2019. [ Google Scholar ]
  • Dodson, I. The Art of Digital Marketing: The Definitive Guide to Creating Strategic, Targeted, and Measurable Online Campaigns ; John Wiley & Sons: New York, NY, USA, 2016. [ Google Scholar ]
  • Chawla, Y.; Chodak, G. Social media marketing for businesses: Organic promotions of web-links on Facebook. J. Bus. Res. 2021 , 135 , 49–65. [ Google Scholar ] [ CrossRef ]
  • McIlwain, C.D. Algorithmic Discrimination: A Framework and Approach to Auditing & Measuring the Impact of Race-Targeted Digital Advertising. PolicyLink Rep. 2023 , 1–50. [ Google Scholar ] [ CrossRef ]
  • Mladenović, D.; Rajapakse, A.; Kožuljević, N.; Shukla, Y. Search engine optimization (SEO) for digital marketers: Exploring determinants of online search visibility for blood bank service. Online Inf. Rev. 2023 , 47 , 661–679. [ Google Scholar ] [ CrossRef ]
  • Wedel, M.; Kannan, P.K. Marketing analytics for data-rich environments. J. Mark. 2016 , 80 , 97–121. [ Google Scholar ] [ CrossRef ]
  • Semrush. 2024. Available online: https://www.semrush.com/ (accessed on 12 April 2024).
  • Anylogic. 2024. Available online: https://www.anylogic.com/ (accessed on 12 April 2024).
  • SeeSo Web Analysis (Eyedid SDK). 2024. Available online: https://sdk.eyedid.ai/ (accessed on 20 April 2024).
  • MentalModeler. 2024. Available online: https://dev.mentalmodeler.com/ (accessed on 10 April 2024).
  • Migkos, S.P.; Sakas, D.P.; Giannakopoulos, N.T.; Konteos, G.; Metsiou, A. Analyzing Greece 2010 Memorandum’s Impact on Macroeconomic and Financial Figures through FCM. Economies 2022 , 10 , 178. [ Google Scholar ] [ CrossRef ]
  • Mpelogianni, V.; Groumpos, P.P. Re-approaching fuzzy cognitive maps to increase the knowledge of a system. AI Soc. 2018 , 33 , 175–188. [ Google Scholar ] [ CrossRef ]
  • Forbes India. The 10 Largest Banks in the World in 2024. 2024. Available online: https://www.forbesindia.com/article/explainers/the-10-largest-banks-in-the-world/86967/1 (accessed on 6 January 2024).
  • Nugroho, S.; Uehara, T. Systematic Review of Agent-Based and System Dynamics Models for Social-Ecological System Case Studies. Systems 2023 , 11 , 530. [ Google Scholar ] [ CrossRef ]
  • McGarraghy, S.; Olafsdottir, G.; Kazakov, R.; Huber, É.; Loveluck, W.; Gudbrandsdottir, I.Y.; Čechura, L.; Esposito, G.; Samoggia, A.; Aubert, P.-M.; et al. Conceptual System Dynamics and Agent-Based Modelling Simulation of Interorganisational Fairness in Food Value Chains: Research Agenda and Case Studies. Agriculture 2022 , 12 , 280. [ Google Scholar ] [ CrossRef ]


Variable | Mean | Min | Max | Std. Deviation | Skewness | Kurtosis
Organic Traffic | 9,868,004.17 | 9,486,121.00 | 10,700,067.60 | 351,366.56 | 1.342 | 1.651
Organic Keywords | 987,820.46 | 889,059.20 | 1,193,079.60 | 76,418.52 | 1.592 | 1.851
Organic Traffic Costs | 37,155,781.98 | 28,929,891.40 | 44,660,727.20 | 5,822,486.64 | −0.188 | −1.627
Paid Traffic | 337,898.57 | 232,588.80 | 487,373.40 | 66,696.66 | 0.396 | 1.333
Paid Keywords | 6510.47 | 1815.20 | 9700.60 | 2624.74 | −0.757 | −0.580
Paid Traffic Costs | 1,514,463.27 | 992,316.60 | 2,491,839.60 | 406,005.96 | 0.998 | 1.667
Email Sources | 184,876.14 | 0.00 | 720,314.00 | 300,170.77 | 1.379 | 0.219
Display Ads | 4199.57 | 0.00 | 20,892.00 | 7636.02 | 1.982 | 1.927
Purchase Conversion | 7.71 | 7.00 | 8.00 | 0.49 | −1.230 | −0.840
Referral Domains | 51,181.91 | 49,694.40 | 52,457.40 | 794.22 | −0.360 | −0.317
Visit Duration | 519.40 | 368.00 | 737.00 | 128.25 | 0.658 | −0.174
Bounce Rate | 0.45 | 0.42 | 0.49 | 0.02 | 0.606 | −1.361
Pages per Visit | 3.43 | 2.00 | 5.00 | 0.97 | 0.277 | 0.042
New Visitors | 15,149,188.40 | 14,150,098.00 | 16,212,804.00 | 801,388.14 | 0.025 | −1.625
Returning Visitors | 47,056,175.89 | 44,705,979.00 | 51,410,725.00 | 2,301,015.96 | 1.103 | 1.599

Variable | Organic Traffic | Organic Traffic Costs | Paid Keywords | Paid Traffic Costs | Email Sources | Display Ads | Purchase Conversion | Referral Domains | Visit Duration | Bounce Rate | Pages per Visit | New Visitors | Return Visitors
Organic Traffic | 1 | 0.604 * | 0.264 | 0.037 | 0.174 | −0.013 | 0.619 | 0.545 | 0.529 | 0.905 ** | 0.068 | 0.796 * | 0.469
Organic Traffic Costs | 0.604 * | 1 | 0.037 | 0.000 | 0.607 | 0.413 | 0.206 | 0.830 ** | 0.124 | 0.242 | 0.657 | 0.489 | 0.628
Paid Traffic | −0.122 | −0.052 | 0.533 | 0.889 ** | −0.220 | −0.304 | −0.521 | 0.249 | −0.705 | −0.298 | −0.022 | −0.587 | −0.539
Paid Traffic Costs | 0.037 | 0.000 | 0.379 | 1 | −0.371 | −0.315 | −0.547 | 0.241 | −0.549 | −0.193 | −0.070 | −0.458 | −0.524
Email Sources | 0.174 | 0.607 | −0.257 | −0.371 | 1 | 0.590 | 0.344 | 0.424 | 0.145 | 0.002 | 0.709 | 0.356 | 0.698
Display Ads | −0.013 | 0.413 | −0.456 | −0.315 | 0.590 | 1 | 0.160 | 0.299 | 0.635 | −0.316 | 0.843 * | 0.554 | 0.857 *
Purchase Conversion | 0.619 | 0.206 | −0.555 | −0.547 | 0.344 | 0.160 | 1 | 0.175 | 0.224 | 0.600 | 0.300 | 0.539 | 0.485
Referral Domains | 0.545 | 0.830 ** | 0.249 | 0.241 | 0.424 | 0.299 | 0.175 | 1 | −0.223 | 0.179 | 0.737 * | 0.269 | 0.394
Visit Duration | 0.529 | 0.124 | −0.748 | −0.549 | 0.145 | 0.635 | 0.224 | −0.223 | 1 | 0.163 | 0.309 | 0.804 * | 0.717
Bounce Rate | 0.905 ** | 0.242 | −0.542 | −0.193 | 0.002 | −0.316 | 0.600 | 0.179 | 0.163 | 1 | −0.051 | 0.581 | 0.192
Pages per Visit | 0.068 | 0.657 | −0.410 | −0.070 | 0.709 | 0.843 * | 0.300 | 0.737 * | 0.309 | −0.051 | 1 | 0.558 | 0.830 *
New Visitors | 0.796 * | 0.489 | −0.904 ** | −0.458 | 0.356 | 0.554 | 0.539 | 0.269 | 0.804 * | 0.581 | 0.558 | 1 | 0.856 *
Returning Visitors | 0.469 | 0.628 | −0.773 * | −0.524 | 0.698 | 0.857 * | 0.485 | 0.394 | 0.717 | 0.192 | 0.830 * | 0.856 * | 1

Variables | Standardized Coefficient | R | F | p-Value
Organic Traffic Costs | −1.670 | 1.000 | - | 0.000 **
Paid Traffic Costs | −1.369 | | | 0.000 **
Referral Domains | 1.696 | | | 0.000 **
Email Sources | 0.167 | | | 0.000 **

Variables | Standardized Coefficient | R | F | p-Value
Paid Traffic Costs | 0.198 | 1.000 | - | 0.000 **
Referral Domains | −0.065 | | | 0.000 **
Email Sources | −0.135 | | | 0.000 **

Variables | Standardized Coefficient | R | F | p-Value
Paid Traffic Costs | −0.024 | 1.000 | - | 0.000 **
Referral Domains | −0.319 | | | 0.000 **
Email Sources | 0.041 | | | 0.000 **

Variables | Standardized Coefficient | R | F | p-Value
Paid Traffic Costs | 0.025 | | | 0.000 **
Referral Domains | 0.334 | | | 0.000 **
Email Sources | −0.043 | | | 0.000 **


Giannakopoulos, N.T.; Sakas, D.P.; Migkos, S.P. Neuromarketing and Big Data Analysis of Banking Firms’ Website Interfaces and Performance. Electronics 2024 , 13 , 3256. https://doi.org/10.3390/electronics13163256


  • Systematic Review
  • Open access
  • Published: 16 August 2024

Rotavirus infections and their genotype distribution pre- and post-vaccine introduction in Ethiopia: a systemic review and meta-analysis

  • Wagi Tosisa 1 ,
  • Belay Tafa Regassa 1 ,
  • Daniel Eshetu 4 ,
  • Asnake Ararsa Irenso 5 ,
  • Andargachew Mulu 2 &
  • Gadissa Bedada Hundie 3  

BMC Infectious Diseases volume 24, Article number: 836 (2024)


Rotavirus infections are a significant cause of severe diarrhea and related illness and death in children under five worldwide. Despite the global introduction of rotavirus vaccines over a decade ago, rotavirus infections still cause many deaths annually, mainly in low-income countries, including Ethiopia, and need special attention. This systematic review and meta-analysis aimed to comprehensively explore the positive proportion of rotavirus in the pre- and post-vaccine introduction periods and the genotype distribution in children under five with diarrhea in Ethiopia.

The review followed the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) 2020 guidelines. Database sources included PubMed, Scopus, EMBASE, and Epistemonikos, focusing on studies published before November 30, 2023. The search targeted rotavirus infection and genotype distribution in Ethiopia before and after the introduction of the Rota vaccine. Data were managed using EndNote 2020 software and stored in an Excel 2010 sheet. A random-effects model was used to determine the pooled estimate of the rotavirus infection rate with 95% confidence intervals. The Q and I² statistics were used to assess study heterogeneity, and a funnel plot (Egger test) was used to determine the possibility of publication bias.

The analysis included data from nine studies conducted in different regions of Ethiopia. The overall prevalence of rotavirus infection was significant, with a prevalence rate of approximately 22.63% (1362/6039). The most common genotypes identified before the Rota vaccine introduction were G1, G2, G3, G12, P[4], P[6], P[8], P[9], and P[10]. Meanwhile, the G3 and P[8] genotypes were particularly prevalent after the Rota vaccine introduction. These findings highlight the importance of implementing preventive measures, such as vaccination, to reduce the burden of rotavirus infection in this population. The identified genotypes provide valuable insights for vaccine development and targeted interventions.

This study contributes to the evidence base for public health interventions and strategies to reduce the impact of rotavirus infection in children under five in Ethiopia. Despite the rollout of the Rota vaccination in Ethiopia, rotavirus heterogeneity is still high, and thus, enhancing vaccination and immunization is essential.


Introduction

Rotavirus infection detrimentally affects the childhood population, mostly in low-income countries, by inducing severe diarrhea that leads to hospitalizations and fatalities [ 1 , 2 ]. It is a significant contributor to childhood morbidity and mortality on a global scale and still results in > 500,000 deaths annually worldwide and > 200,000 deaths in low-income countries [ 3 ]. The frequency of diarrhea diminishes with age [ 4 , 5 ]. Several risk factors are associated with rotavirus infection, including contaminated water supplies, malnutrition, and the presence of household members with gastroenteritis [ 5 ]. Breastfeeding, by contrast, has been identified as protective against rotavirus-induced gastroenteritis. Children with rotavirus infection exhibit diminished levels of micronutrients such as ferritin and vitamin B12, rendering them more susceptible to allergic ailments [ 6 ]. Vaccination against rotavirus plays a pivotal role in preventing infection and its subsequent complications. Consequently, public health initiatives that promote potable water consumption, effective personal hygiene practices, exclusive breastfeeding, and vaccination are highly recommended to mitigate the impact of rotavirus infection in developing countries [ 7 ].

In Ethiopia, where infectious diseases pose a significant public health challenge, rotavirus infection has been a notable contributor to childhood illness and death [ 8 ]. Understanding the historical context of rotavirus infection in Ethiopia involves considering its impact on child health, healthcare infrastructure, and the burden it places on families and communities. Numerous studies in Ethiopia showed no seasonal variation in diarrhea associated with enterotoxigenic enterobacteria, rotavirus, and the two parasites Giardia lamblia and Entamoeba histolytica. One study detected rotavirus in 27.8% of the patients with diarrhea and the two parasites in 6.85%. Rotavirus was most prevalent in the 7–12-month age group and enterotoxigenic enterobacteria during the second year of life, while parasites increased continuously with age [ 9 ].

Rotaviruses are non-enveloped viruses with a complex genome of 11 segments of double-stranded RNA (dsRNA) and are classified into 32 G genotypes and 47 P genotypes. In Ethiopia, the distribution of rotavirus genotypes in children under five is diverse, with several dominant genotypes identified. The most common genotypes include G1, G2, G3, G12, P[4], P[6], P[8], P[9], and P[10]. The most prevalent combinations are G12P[8], G3P[6], G1P[8], and G3P[8]. These genotypes comprise a significant proportion of rotavirus strains in Ethiopia. The prevalence of rotavirus infection among children under five in Ethiopia is approximately 23%. Among G genotypes, G3 is prevalent, accounting for 27.1% of cases; among P genotypes, P[8] accounts for 49%. The G8 genotype, which is more commonly found in cattle, has also been reported in Ethiopia, although at a lower frequency [ 10 ].

The rotavirus vaccine was introduced in Ethiopia, without age restriction, on November 13, 2013, according to the WHO/SAGE recommendation [ 11 ]. In Ethiopia, fully vaccinated children aged between 15 and 23 months are those who have received one dose of Bacille Calmette Guerin (BCG), three doses of PCV (pneumococcal conjugate vaccine), three doses of pentavalent, three doses of OPV, two doses of Rota, and two doses of measles vaccine, as documented by vaccination card plus mother's history [ 12 ]. According to a study in Ethiopia, no rotavirus was detected in children who had received the rotavirus vaccine, which suggests that the vaccine has a protective benefit [ 13 ].

The importance of examining rotavirus in Ethiopia resides in its implications for public health and the potential for well-informed interventions. The infection caused by the rotavirus disproportionately affects children in settings with limited resources, resulting in heightened costs for healthcare, economic burdens, and reduced productivity. Experts and policymakers can identify more susceptible populations, formulate targeted interventions, and enhance healthcare strategies by delving into the prevalence, impact, and risk factors associated with Rotavirus in Ethiopia. Furthermore, comprehending the distribution of genotypes can contribute to the development and effectiveness of vaccines.

Hence, this study aimed to comprehensively explore the positive proportion of rotavirus at pre- and post-vaccine introduction periods and genotype distribution in children under five years old with diarrhea in Ethiopia.

Materials and methods

Study protocol

The review was performed using the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) 2020 guidelines [ 14 ]. The STROBE checklist and the additional STROME-ID items were used to evaluate the reporting quality of records [ 15 ]. This systematic review and meta-analysis protocol was registered at the PROSPERO International Prospective Register of Systematic Reviews on November 24, 2023, as PROSPERO 2023 ID: CRD42023481674.

Searching strategies and information sources

We searched PubMed, Scopus, EMBASE, and the Epistemonikos database, along with additional sources such as Google Scholar and Addis Ababa University Electronic Thesis and Dissertation, for all published and unpublished articles dated before November 30, 2023. The database searches were based on keywords, medical subject headings (MeSH) terms, and other terms such as epidemiology, risk factors, genotype distribution, and vaccination status of rotavirus infection in the pre-vaccine and post-vaccine introduction periods in Ethiopia, which are available in Annex S1 Supplementary Document ( Annex file 1 ). The articles were searched and downloaded from the databases on November 24–30, 2023, with language restricted to English.

For the search strings in PubMed, Epistemonikos, EMBASE, and Scopus, we used the terms "rotavirus infection," "pre-vaccine," "post-vaccine," "epidemiology," "risk factors," "genotypes," "vaccination status" [MeSH], and "Ethiopia," combined with the Boolean operators "AND" and "OR" as appropriate for each database ( Annex file 2 ).
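As an illustration of how such terms can be combined, the sketch below assembles one Boolean string from the terms quoted above. The grouping (topic terms joined with OR, then joined with AND to the core term and the country) is an assumption made for illustration; the authors' actual per-database strings are in Annex file 2 and are not reproduced here. The helper name `build_query` is hypothetical.

```python
# Terms quoted in the search-strategy paragraph above.
CORE_TERM = "rotavirus infection"
TOPIC_TERMS = [
    "pre-vaccine",
    "post-vaccine",
    "epidemiology",
    "risk factors",
    "genotypes",
    "vaccination status",
]
COUNTRY = "Ethiopia"


def build_query(core, topics, country):
    """Join topic terms with OR, then AND them with the core term and country.

    This grouping is an assumption; the review's own strings may differ.
    """
    topic_block = " OR ".join(f'"{term}"' for term in topics)
    return f'"{core}" AND ({topic_block}) AND "{country}"'


query = build_query(CORE_TERM, TOPIC_TERMS, COUNTRY)
print(query)
```

Field tags such as [MeSH] differ between databases, so a string like this would still need adapting to each database's syntax.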

Study selection

Inclusion criteria

Eligible studies were original research, including cross-sectional, cohort, case-control, and randomized controlled trials (RCTs), addressing the human population under five years old (0–59 months) in Ethiopia. Studies were included if they reported epidemiological data on rotavirus infection in under-five children (incidence and prevalence), risk factors associated with rotavirus infection specifically in the under-five age group, the genotype distribution of rotavirus strains in under-five children, or the vaccination status of under-five children, and if they were published in English or had an English translation available.

Exclusion criteria

Studies focusing exclusively on populations above five years of age; review articles, editorials, commentaries, letters, and conference abstracts without full-text availability; and studies with a high risk of bias or poor methodological quality were excluded. In addition, articles published after November 30, 2023, and studies without relevant epidemiological data on rotavirus infection, risk factors, genotypes, or vaccination status in the under-five age group were excluded.

Quality assessment

The quality of the records was assessed according to the Joanna Briggs Institute’s (JBI) 2017 checklist. The included records were critically appraised using the predetermined criteria or checklist. In addition, each article was appraised using the JBI assessment tool for observational studies, particularly the analytical cross-sectional study design. Studies with a score of at least 5 were considered to have acceptable methodological quality and were included in the review [ 16 ].

On the other hand, STROME-ID (Strengthening the Reporting of Molecular Epidemiology for Infectious Diseases) was also used to investigate the quality of each selected record. This statement is an extension of the 22-item STROBE statement with 20 additional elements aiming to promote transparency, clarity, and comparability of scientific reporting, specifically in Molecular Epidemiology for Infectious Diseases studies [ 15 ].

Data extraction

Three reviewers (WT, BT, and DE) independently screened the titles and abstracts of potentially eligible records. The full texts of selected records were assessed against the inclusion criteria, and the three reviewers extracted data separately using the prepared data extraction tool (Table  1 ). The extracted data included the first author’s name and publication year, study area, population/sample, sample size, laboratory diagnostic methods, and total rotavirus-isolated genotypes. The extracted data were transferred to an Excel sheet. The results were then cross-checked, with a “checked” value of “1” for discrepancies. Any discrepancies between reviewers were discussed and resolved through consensus.

Statistical analysis

The data extracted from the records were stored in an Excel 2010 sheet. Descriptive statistics and bar graphs were prepared from the extracted data in Excel 2010 and Jamovi 2.3.5. Comprehensive Meta-Analysis V3.exe software was installed on a personal computer, and a shortcut to it was created in the Windows Start menu. Next, the specific columns containing the study names, sample sizes, and outcome variables were transferred to the Comprehensive Meta-Analysis columns for analysis.

A random-effects model was used to determine the pooled rotavirus infection proportion, and the possibility of publication bias was assessed, using the Comprehensive Meta-Analysis V3.exe software.

Operational definition

Epidemiology

Determining the prevalence and distribution of rotavirus infection in the pre- and post-vaccine introduction periods. It includes analyzing trends, affected age groups, and regional variations to provide a thorough understanding of the disease burden.

Risk factors

Identifying the socio-economic, environmental, and topographic factors that contribute to the vulnerability of specific populations to rotavirus infection. This analysis can guide targeted public health interventions to reduce risk and improve overall child health.

Genotype distribution

Examining the genetic diversity of Rotavirus strains circulating in Ethiopia before and after vaccine introduction. Understanding genotype distribution is crucial for monitoring strain evolution, assessing vaccine efficacy, and adapting vaccination strategies if necessary.

Vaccination status

Evaluating the impact and effectiveness of Rotavirus vaccination programs in Ethiopia. It involves assessing coverage rates, identifying challenges in vaccine distribution and administration, and determining the overall success in reducing the burden of Rotavirus-related morbidity and mortality among children.

By addressing these essential aspects, this review aims to contribute evidence that can inform policy, improve healthcare practices, and ultimately reduce the impact of rotavirus infection on child health in Ethiopia.

Selection process and characterization of the included studies

A systematic review and meta-analysis was conducted to estimate the incidence of rotavirus infection in Ethiopia pre- and post-vaccine introduction. A total of 479 research articles were retrieved from PubMed, EMBASE, Scopus, and Epistemonikos. After excluding 52 duplicate articles, 427 records remained. Among these, 233 were considered irrelevant and excluded for lacking laboratory diagnosis of rotavirus, and 46 were excluded for being review or systematic review articles. After screening of titles, abstracts, and full texts, 148 articles met the inclusion criteria for the systematic review and meta-analysis. These 148 articles were then thoroughly evaluated, and only nine records were included in the final systematic review and meta-analysis (Fig.  1 ).
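The record counts in the selection paragraph can be checked with a short tally. This is only an arithmetic sketch; it assumes that the 233 records lacking laboratory diagnosis and the 233 records described as irrelevant are the same set, since that is the reading under which the counts add up. The variable names are hypothetical.

```python
# Counts as reported in the selection-process paragraph.
records_identified = 479
duplicates_removed = 52
records_after_dedup = records_identified - duplicates_removed

# Assumption: the "233 irrelevant" and "233 lacking laboratory diagnosis"
# records are one and the same group, counted once.
excluded_no_lab_diagnosis = 233
excluded_reviews = 46
records_assessed = records_after_dedup - excluded_no_lab_diagnosis - excluded_reviews

studies_included = 9
excluded_at_full_evaluation = records_assessed - studies_included

print(records_after_dedup, records_assessed, excluded_at_full_evaluation)
```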

The introduction of rotavirus vaccines in Ethiopia in 2013 was followed by a reduction in the incidence of rotavirus infections, a leading cause of severe diarrhea in children. A stratified data analysis from various study years reveals a substantial reduction in the positivity proportion of rotavirus cases after the vaccine’s introduction. Before the vaccine’s introduction, the positivity proportion was relatively high, with values ranging from 0.181 to 0.277. After the vaccine was introduced, there was a notable decline in the positivity proportion, particularly in 2018, when it dropped to as low as 0.044. This reduction indicates that the rotavirus vaccination program has had a marked impact on the burden of rotavirus infections among children in Ethiopia, and it underscores the need for continued monitoring and evaluation of the vaccination program.

The stratified analysis of rotavirus infections in Ethiopia provides valuable insights into the effectiveness of the rotavirus vaccine introduced in 2013. Before the vaccine rollout, data from various years of study indicated a concerning prevalence of rotavirus infections among children. For instance, in 1981, the positivity proportion was 0.277; in 2013, it was 0.210. These figures highlight the significant burden of rotavirus before vaccination efforts.

Following the vaccine’s introduction, the data show a marked decline in the positivity proportion of rotavirus infections. In 2018, the positivity proportion was 0.197 in one study and 0.044 in another, indicating a substantial reduction in the number of positive cases relative to the sample size (Table  1 and Fig.  2 ). This decline suggests that the rotavirus vaccination program has effectively reduced the incidence of severe rotavirus infections among children, contributing to improved health outcomes.

However, it is essential to note that the data has fluctuations. There were instances of the positivity proportion in the post-vaccine years not following the expected trend. For example, in 2014, the positivity proportion was recorded at 0.345, higher than some pre-vaccine years. This variability may be attributed to changes in surveillance practices, the emergence of different rotavirus strains, or variations in vaccination coverage. These fluctuations remind us of the challenges and uncertainties we must navigate in public health.

The data extracted for the systematic review and meta-analysis studies provide valuable insights into the prevalence and characteristics of rotavirus infection with diarrhea in children under five in Ethiopia. As summarized in Table  1 , the majority of the research articles were from Addis Ababa (6/10); cross-sectional studies by design (7/10), studies conducted before the introduction of the rotavirus vaccine in Ethiopia (6/10), and most studies (8/10) used the EIA/ELISA method for the determination of rotavirus infection and only two used RT-PCR.

A study by Abebe A et al. conducted in 2013 in Addis Ababa examined the positive proportion of rotavirus before the vaccine introduction. This pre-vaccine sentinel surveillance study had a larger sample size of 1841, of which 388 cases (21%) were rotavirus-positive. The lab method used in this study was EIA. The genotype distribution of rotavirus was identified as G1p (20%), G12p (17%), and G3p (15%) (Table  1 ; Fig.  3 ).

Another study by Abebe A. et al., published in 2018, focused on the post-vaccine period in Addis Ababa. This sentinel surveillance study had a sample size of 815, of which 161 cases were rotavirus-positive. The genotype distribution in this study revealed G9P (19%) in 2014 and G3P and G2P (19% each) in 2015, indicating a fluctuating trend in rotavirus infections across different regions and years (Table  1 ).

The data is categorized by location: Addis Ababa, Awassa, Gondar and Bahir Dar, Jimma, and Wegera. Addis Ababa shows the highest number of cases, with peaks in 1981 (267 cases), 2013 (388 cases), and 2018 (349 cases). Awassa reported 44 cases in 2013. Gondar and Bahir Dar have significant cases in 2018 (113 cases). Jimma shows lower numbers, with 41 cases in 2004. Wegera presents minimal cases, with a notable report of 10 cases (Fig.  4 ).

The majority of studies used cross-sectional designs. Stintzing G. et al. conducted a survey in 1981 in Addis Ababa with a sample size of 962. The number of rotavirus infections reported in this study was 267, and the lab method used was immunoelectron microscopy (IEM). Getahun et al. conducted a survey in 2014 in Addis Ababa, with a sample size of 246, and found 85 rotavirus infection cases. The lab method used in this study was EIA. Additional cross-sectional studies were conducted by Abebe et al. in 1995 and Bizuneh T et al. in 2004, both in Addis Ababa. Abebe et al.'s study included a sample size of 358, with a frequency of 65 rotavirus infection cases. The lab method used was ELISA. Bizuneh T et al.'s study included a smaller sample size of 154, with 41 rotavirus infection cases. The lab method used in this study was also ELISA.

Two more cross-sectional studies were conducted after the vaccine introduction. Feleke H et al. conducted a study in 2018 in Wegera woreda, Amhara region, with a sample size of 225, and reported a frequency of 10 rotavirus infection cases. The lab method used was ELISA. Similarly, Gelaw A et al. conducted a study in 2018 in the Amhara region with a sample size of 450 and reported a frequency of 113 rotavirus infection cases. The lab method used in this study was reverse transcription polymerase chain reaction (RT-PCR), and the identified genotype distribution included G3P[8], G2P[4], G9P[8], G12P[8], and G3P[6].

Lastly, Yassin et al. conducted a cross-sectional study in Awassa in 2013, with a sample size of 200, and reported a frequency of 44 rotavirus infection cases. The lab method used in this study was RT-PCR, and the genotype distribution included G3P[6] (48%), G1P[8] (27%), and G2P[4] (7%).
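The positivity proportions quoted in the text can be recomputed from the case counts and sample sizes given in the study descriptions above. The sketch below does this for the nine explicitly described data sets; the short labels are hypothetical shorthand, and truncating to three decimals (rather than rounding) is an assumption chosen because it matches the values quoted in the text, such as 0.277 and 0.210.

```python
import math

# (label, rotavirus-positive cases, sample size), transcribed from the
# study descriptions above. Labels are shorthand, not citation keys.
STUDIES = [
    ("Stintzing 1981", 267, 962),
    ("Abebe 1995", 65, 358),
    ("Bizuneh 2004", 41, 154),
    ("Abebe 2013", 388, 1841),
    ("Yassin 2013", 44, 200),
    ("Getahun 2014", 85, 246),
    ("Abebe 2018", 161, 815),
    ("Feleke 2018", 10, 225),
    ("Gelaw 2018", 113, 450),
]


def truncated_proportion(cases, sample_size, decimals=3):
    """Return cases / sample_size truncated (not rounded) to `decimals` places."""
    scale = 10 ** decimals
    return math.floor(cases / sample_size * scale) / scale


proportions = {
    label: truncated_proportion(cases, n) for label, cases, n in STUDIES
}

for label, value in proportions.items():
    print(f"{label}: {value:.3f}")
```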

This literature review examines the impact of rotavirus vaccination on the distribution of rotavirus genotypes in Ethiopia. The study findings indicate a notable shift in rotavirus genotype distribution in Ethiopia after the vaccine was introduced in 2013. Pre-vaccine periods showed the prevalence of genotypes like G1p, G12p, G3p, and G2p, whereas post-vaccination samples displayed a shift to G9P, G3P, G2P, and G12P, with G3P[8] being the most common. This shift highlights the impact of vaccination on altering the predominant rotavirus genotypes.

The data we present here is a scientific observation and a call to action. It underscores the significant effect of vaccination on rotavirus infection patterns. The transition from pre-vaccine to post-vaccination periods demonstrates changes in the circulating genotypes, indicating the vaccine’s effectiveness in influencing rotavirus genotype distribution in Ethiopia. These findings directly affect public health officials and researchers’ efforts to combat the rotavirus.

The studies used various laboratory methods, such as EIA, ELISA, and RT-PCR, to detect rotavirus and identify its genotypes. This diversity of methods supports a broad understanding of rotavirus epidemiology and genotype distribution in different regions of Ethiopia over time.

By analyzing data from 1981 to 2018, the study captures temporal trends in rotavirus genotype distribution, showcasing the evolution of prevalent genotypes before and after the vaccine’s introduction. This longitudinal perspective offers valuable insights into Ethiopia’s changing landscape of rotavirus infections.

Overall, the studies above provide essential data on the positive proportion of rotavirus, vaccination status, lab methods used, and genotype distribution in children under five years of age with diarrhea in various regions of Ethiopia. The findings contribute to our understanding of the epidemiology and characteristics of rotavirus infection in this population, which can inform public health interventions and strategies for prevention and control.

In summary, the review addresses the impact of rotavirus infection, disease trends, and the effectiveness of interventions such as vaccines in reducing the incidence of rotavirus-related illness in Ethiopia.

Meta-analysis

The systematic review and meta-analysis aimed to investigate the frequency, risk factors, genotype distribution by vaccination status, and age distribution of rotavirus infection with diarrhea in Ethiopia’s children under five. The analysis included data from 9 studies, and both fixed and random effects models were used to estimate the effect size (Fig.  5 ).

In the fixed effects model, the overall effect size for Rotavirus infection in children under five with diarrhea was estimated to be 0.231 (95% CI: 0.221–0.242). This indicates a moderate prevalence of Rotavirus infection in this population. The null hypothesis test showed a significant association between Rotavirus infection and diarrhea in children under five (Z-value: -38.70, p-value: 2.18E-13).

Heterogeneity analysis revealed high heterogeneity among the included studies (Q-value: 79.35, df: 9, p-value: 8.16E-02, I-squared: 88.66%). This suggests that the variation in effect sizes across the studies is not solely due to chance. The estimated tau-squared value (0.286) indicates substantial heterogeneity beyond what would be expected by chance alone.
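As a check on the reported heterogeneity, the I-squared value follows directly from the Q statistic and its degrees of freedom under the standard definition:

```latex
I^2 = \frac{Q - \mathrm{df}}{Q} \times 100\%
    = \frac{79.35 - 9}{79.35} \times 100\%
    \approx 88.66\%
```

This agrees with the reported I-squared of 88.66%. The 9 degrees of freedom correspond to 10 data sets (df = k − 1).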

Given the high heterogeneity observed among the included studies in the analysis of rotavirus infection in children under five, a fixed-effects model may not be appropriate. Fixed-effects models assume that the true effect size is the same across all studies, which may not be accurate given the substantial variation in effect sizes observed. In the presence of high heterogeneity, a random-effects model is more suitable because it accounts for both within-study and between-study variability, and so better represents the underlying effect size across studies.
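To make the difference between the two models concrete, the following is a minimal sketch of inverse-variance pooling on the logit scale with a DerSimonian-Laird estimate of the between-study variance. It is not the Comprehensive Meta-Analysis implementation, and the choice of the logit transform and the DerSimonian-Laird estimator are assumptions. The input counts are the nine data sets described in the text, whereas the review analyzed 10 data sets, so the output is not expected to reproduce the reported estimates exactly.

```python
import math

# (label, rotavirus-positive cases, sample size), transcribed from the
# study descriptions in this review. Labels are shorthand only.
STUDIES = [
    ("Stintzing 1981", 267, 962),
    ("Abebe 1995", 65, 358),
    ("Bizuneh 2004", 41, 154),
    ("Abebe 2013", 388, 1841),
    ("Yassin 2013", 44, 200),
    ("Getahun 2014", 85, 246),
    ("Abebe 2018", 161, 815),
    ("Feleke 2018", 10, 225),
    ("Gelaw 2018", 113, 450),
]


def inv_logit(x):
    """Map a logit back to a proportion."""
    return 1.0 / (1.0 + math.exp(-x))


def pool_proportions(studies, z_crit=1.96):
    """Pool proportions on the logit scale under fixed and random effects."""
    # Logit-transformed proportions and their approximate variances.
    y = [math.log(x / (n - x)) for _, x, n in studies]
    v = [1.0 / x + 1.0 / (n - x) for _, x, n in studies]

    # Fixed-effect model: weights are inverse within-study variances.
    w = [1.0 / vi for vi in v]
    sum_w = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, y)) / sum_w

    # Cochran's Q, I-squared, and the DerSimonian-Laird tau-squared.
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, y))
    df = len(studies) - 1
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w
    tau2 = max(0.0, (q - df) / c)
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0

    # Random-effects model: add tau-squared to every study variance.
    w_re = [1.0 / (vi + tau2) for vi in v]
    sum_w_re = sum(w_re)
    random_est = sum(wi * yi for wi, yi in zip(w_re, y)) / sum_w_re
    se_re = math.sqrt(1.0 / sum_w_re)

    return {
        "fixed": inv_logit(fixed),
        "random": inv_logit(random_est),
        "random_ci": (
            inv_logit(random_est - z_crit * se_re),
            inv_logit(random_est + z_crit * se_re),
        ),
        "q": q,
        "df": df,
        "tau2": tau2,
        "i2": i2,
    }


result = pool_proportions(STUDIES)
print(result)
```

Because the random-effects weights include the between-study variance, small studies gain relative weight and the confidence interval widens, which is the behavior the paragraph above describes.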

In the random effects model, the estimated effect size for rotavirus infection in children under five with diarrhea was slightly lower at 0.221 (95% CI: 0.189–0.257). Although the random effects model did not provide a p-value for the null hypothesis test, it can be inferred that there is still a significant association between rotavirus infection and diarrhea in this population (Fig.  5 ).

The findings suggest that rotavirus infection is a common cause of diarrhea in children under five in Ethiopia. The prevalence of rotavirus infection is moderate, indicating a need for preventive measures such as vaccination. The high heterogeneity observed among the studies highlights the need for further investigation into the factors contributing to the variation in effect sizes.

The results also indicate that the effect size estimates vary depending on the model used. While the fixed effects model assumes a common effect size across all studies, the random effects model accounts for both within-study and between-study variability. Therefore, the random effects model may provide a more conservative estimate of the true effect size.

Overall, this systematic review and meta-analysis provide valuable insights into the epidemiology of rotavirus infection in children under five with diarrhea in Ethiopia. The findings emphasize the importance of implementing effective vaccination strategies and identifying specific risk factors associated with rotavirus infection in this population. Further research is needed to understand the sources of heterogeneity better and to inform targeted interventions for preventing and controlling rotavirus infection in Ethiopia.

Publication bias

The meta-analysis incorporated data from 9 studies, categorized into 10 pre-vaccine and post-vaccine data sets, yielding a z-value of -36.19785 and a corresponding 2-tailed p-value below 0.00001. The Classic Fail-safe N for this meta-analysis is 3401, which means that 3401 additional ‘null’ studies would need to be located and included for the combined 2-tailed p-value to exceed 0.050. In other words, for every observed data set there would need to be 340.1 missing studies for the effect to be nullified, suggesting the results are highly robust.
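The fail-safe N and the per-study ratio quoted above can be reproduced with Rosenthal's classic formula, N = (ΣZ / z_crit)² − k. This assumes that the reported z-value is a Stouffer combined Z over k = 10 data sets and that a two-tailed 5% criterion was used; both are assumptions about the software, not statements from the article. The function name is illustrative.

```python
import math
from statistics import NormalDist


def fail_safe_n(combined_z: float, k: int, alpha: float = 0.05) -> int:
    """Rosenthal's fail-safe N from a Stouffer combined Z over k studies.

    Stouffer's method gives Z = sum(z_i) / sqrt(k), so sum(z_i) = Z * sqrt(k).
    Adding N null studies (z = 0) lowers the combined Z to sum(z_i) / sqrt(k + N);
    solving for the N at which it reaches the critical value gives the formula below.
    """
    z_crit = NormalDist().inv_cdf(1.0 - alpha / 2.0)  # two-tailed criterion
    sum_z = abs(combined_z) * math.sqrt(k)
    return math.ceil((sum_z / z_crit) ** 2 - k)


n_fs = fail_safe_n(-36.19785, k=10)   # z-value and k as reported in the text
ratio = n_fs / 10                     # missing null studies per observed data set
print(n_fs, ratio)
```

Under these assumptions the sketch returns the reported fail-safe N of 3401 and the ratio of 340.1.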

Egger’s Test of the Intercept was also performed to assess publication bias. The intercept (B0) was -1.61283, with a 95% confidence interval ranging from -7.10118 to 3.87552. The t-value was 0.67765 with 8 degrees of freedom. The 1-tailed p-value was 0.25855, and the 2-tailed p-value was 0.51711. These p-values are not statistically significant, indicating no significant evidence of publication bias according to Egger’s test.
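The reported Egger statistics are internally consistent, which can be checked by recovering the standard error of the intercept from its confidence interval and dividing. The sketch below hard-codes the two-sided 95% critical value of Student's t with 8 degrees of freedom (about 2.306) and compares absolute values, since the text reports the t-value without a sign. The variable names are illustrative.

```python
# Two-sided 95% critical value of Student's t distribution with 8 degrees of freedom
T_CRIT_DF8 = 2.306004

# Egger intercept and its 95% confidence interval, as reported in the text
intercept = -1.61283
ci_lower, ci_upper = -7.10118, 3.87552

# The CI is intercept +/- t_crit * SE, so SE is the half-width divided by t_crit
se_intercept = (ci_upper - ci_lower) / (2.0 * T_CRIT_DF8)

# The text reports the magnitude of t; the sign follows the negative intercept
t_value = abs(intercept) / se_intercept
print(f"SE = {se_intercept:.5f}, |t| = {t_value:.5f}")
```

The recovered |t| matches the reported 0.67765 to within rounding.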

Furthermore, Duval and Tweedie’s Trim and Fill method was applied. Under the fixed effect model, the combined studies’ point estimate and 95% confidence interval were 0.23134 (0.22070, 0.24233). Using Trim and Fill, these values remained unchanged. Similarly, under the random effects model, the point estimate and 95% confidence interval were 0.22147 (0.18947, 0.25717), and these values also remained unchanged after applying Trim and Fill. This consistency suggests there is no significant publication bias affecting the results.

The funnel plot, which displays the standard error by logit event rate, is likely to show a symmetrical distribution given the results from Egger’s Test and the Trim and Fill method (Fig.  6 ).

Overall, the meta-analysis demonstrates a robust effect with minimal evidence of publication bias. The high fail-safe N indicates the results are reliable and not easily nullified by potential missing studies.

Rotavirus infection poses a significant public health concern in low-income countries with poor socio-economic conditions and a lack of adequate sanitation, hygiene, and vaccination coverage. This systematic review and meta-analysis estimated the prevalence of rotavirus infection among children under five with diarrhea in Ethiopia at 22.6%, highlighting the need for preventive measures such as vaccination. The review data included cases from hospitalized children, community-based studies, and outpatient departments. However, data on mild cases (community-based studies and outpatient departments) are fewer, which suggests that the findings may only partially represent the broader spectrum of rotavirus infection severity in the population. This limitation should be considered when interpreting the results and their implications for public health strategies (Table 1; Fig. 3).

Hospitalized children from 2013 onwards were not exclusively vaccinated cohorts; the data include pre- and post-vaccine cases. The data primarily cover rotavirus infection cases, sample sizes, vaccination status (pre- and post-vaccine), and genotype distributions, without specific details on vaccination cohorts beyond 2013. The identified genotypes, including G1, G2, G3, G12, P[4], P[6], P[8], P[9], and P[10], provide insights into the circulating strains of rotavirus in Ethiopia. Some surveillance studies have reported similar results [17, 18].

Analysis of genotype variations in Rotavirus infections pre- and post-vaccination in Ethiopia

Pre-vaccination period

Before introducing the rotavirus vaccine on June 13, 2013, the predominant genotypes identified in various studies were G1, G12, G3, and G2. For instance, in Addis Ababa, sentinel surveillance data from 2013 showed the prevalence of G1P (20%), G12P (17%), and G3P (15%). In Awassa, the genotypes G3P[6] (48%), G1P[8] (27%), and G2P[4] (7%) were common.

Post-vaccination period

The most striking observation was the significant shift in genotype distribution following the introduction of the vaccine. The previously dominant genotypes G1, G12, G3, and G2 were replaced by G9P, G3P, G2P, and G12P. For instance, in Addis Ababa, the post-vaccination data from 2014 to 2015 revealed a marked change in the prevalence of genotypes. In 2014, G9P emerged as the most common genotype (19%), but in 2015, G3P and G2P (19% each) took the lead. Similarly, in Gondar and Bahir Dar, the genotypes G3P[8], G2P[4], G9P[8], G12P[8], and G3P[6] were prevalent in 2018.

Year-by-year genotype distribution:

2011–2013 (Pre-Vaccine): In Addis Ababa, G12P was prevalent in 2011 (36%) and 2012 (27%), while G2P was dominant in 2013 (35%).

2014–2015 (Post-Vaccine): The post-vaccination period in Addis Ababa saw a dynamic change in genotype prevalence. In 2014, G9P was the most common genotype (19%), but in 2015, G3P and G2P emerged as equally prevalent (19% each), marking a significant shift in just a year.

2018 (Post-Vaccine): In Gondar and Bahir Dar, the genotypes G3P[8], G2P[4], G9P[8], G12P[8], and G3P[6] were identified.

In summary, the introduction of the rotavirus vaccine in Ethiopia has had a profound impact on the genotype distribution of rotavirus infections. The pre-vaccination period was characterized by the dominance of genotypes G1, G12, G3, and G2, while the post-vaccination period witnessed a shift towards genotypes G9P, G3P, G2P, and G12P, with G3P[8] being notably common. This shift underscores the significant role of vaccination in shaping the epidemiology of rotavirus genotypes in the region (Fig.  3 ).

This systematic review and meta-analysis provide essential insights into the prevalence, genotype distribution, and risk factors of rotavirus infection in children under five in Ethiopia. The identified genotypes, including G1, G2, G3, G12, P[4], P[6], P[8], P[9], and P[10], highlight the diversity of rotavirus strains circulating in Ethiopia. Other studies in different regions have also reported these genotypes, emphasizing their global significance [17, 19].

The G3 genotype was particularly prevalent among the G types, accounting for 27.1% of cases, while P[8] was the most prevalent P type at 49%. These findings are consistent with previous studies that have reported the predominance of G3 and P[8] genotypes in rotavirus infections [20, 21].

The prevalence of specific genotypes can have implications for vaccine development and effectiveness. For instance, the G12P[8] combination, identified as one of the most common genotypes in this study, has been associated with reduced vaccine effectiveness in some settings [22]. Therefore, monitoring the prevalence and distribution of genotypes is crucial for informing vaccine strategies and ensuring their optimal impact.

This systematic review and meta-analysis provide valuable insights into the prevalence, genotype distribution, and risk factors of rotavirus infection in children under five in Ethiopia. The findings underscore the importance of vaccination and ongoing surveillance to reduce this population’s rotavirus-related morbidity and mortality burden [ 23 , 24 ].

However, this study has some limitations. The high heterogeneity observed among the included studies suggests variation in study design, population characteristics, and diagnostic methods. Additionally, the limited number of studies available for inclusion may only partially represent the population of Ethiopia.

Limitations

One limitation of this review is that the included studies provide insufficient data on the genotypes responsible for the mechanisms of rotavirus infection. In addition, the studies were confined to a few locations, such as Addis Ababa and Awassa, so they do not reveal the actual distribution of rotavirus infection across Ethiopia and the findings may not be generalizable. Another limitation is the reliance on different laboratory methods, such as EIA, ELISA, and RT-PCR, across studies, which may introduce variability in the results. Additionally, the sample sizes varied between studies, which could affect the statistical power and precision of the results. A larger sample size generally yields more precise estimates, providing greater confidence in conclusions regarding rotavirus infection and vaccine efficacy.
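The effect of sample size on precision can be made concrete with a small sketch. It computes the half-width of a simple Wald 95% confidence interval for a proportion at several hypothetical sample sizes, using a prevalence of 0.23 (roughly the pooled value reported in this review). The Wald interval is chosen only for illustration and is not the method used in the meta-analysis.

```python
import math


def wald_half_width(p: float, n: int, z: float = 1.96) -> float:
    """Half-width of the Wald confidence interval for a proportion p with sample size n."""
    return z * math.sqrt(p * (1.0 - p) / n)


# Hypothetical sample sizes; the prevalence is an illustrative value
for n in (100, 1000, 10000):
    print(f"n = {n:>5}: 0.23 +/- {wald_half_width(0.23, n):.4f}")
```

The half-width shrinks with the square root of the sample size, so a tenfold larger study narrows the interval by a factor of about 3.16.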

Conclusion and recommendations

In conclusion, the systematic review and meta-analysis of rotavirus infection in children under five in Ethiopia revealed a moderate prevalence of rotavirus infection in this population. The stratified analysis indicates that the introduction of rotavirus vaccines in Ethiopia has significantly reduced the positivity proportion of rotavirus infections, particularly in the years following the vaccine’s implementation. However, this success should not lead to complacency. Continued monitoring and evaluation of the vaccination program are important to ensure its ongoing effectiveness and to address emerging challenges in controlling rotavirus infections. This is a shared responsibility of the public health community, whose active participation is crucial to maintaining the success of the vaccination program.

The genotype distribution was diverse, with G1, G2, G3, G12, P[4], P[6], P[8], P[9], and P[10] being the most common genotypes identified. The pooled prevalence of rotavirus infection was approximately 23%, with the G3 and P[8] genotypes being particularly prevalent. The findings underscore the importance of implementing preventive measures, such as vaccination, to reduce the burden of rotavirus infection in children under five years of age in Ethiopia. The identified genotypes provide valuable insights for vaccine development and targeted interventions. Water supply contamination, inadequate sanitation, and poor water quality can contribute to the spread of rotavirus infection. Contaminated water sources can harbor the virus, leading to potential infections when the water is consumed by children or used for hygiene. Proper water treatment and sanitation practices are essential to prevent the transmission of rotavirus and other waterborne diseases.

Further studies are needed to address additional risk factors associated with rotavirus infection, to better understand the observed heterogeneity among the included studies, and to quantify the burden of rotavirus infection, especially among children under five. Overall, this study contributes to the evidence base for public health interventions and strategies to reduce the impact of rotavirus infection in Ethiopia.

figure 1

PRISMA 2020 flow diagram for updated systematic reviews, which included searches of database registers

figure 2

The positivity proportion of rotavirus cases in the pre-vaccine and post-vaccine study periods in Ethiopia

figure 3

Genotype distribution of rotavirus cases among community, hospitalized, and outpatient children during the pre- and post-vaccine periods

figure 4

Number of rotavirus infection cases in different study areas by publication year, from 1981 to 2018

figure 5

Rotavirus Infection in Pre- and Post-Vaccine Introduction in Ethiopia

figure 6

Forest plot

Data availability

The datasets used and analyzed in the study are available from the corresponding author on reasonable request.

Abbreviations

EIA: Enzyme Immunoassay

ELISA: Enzyme-Linked Immunosorbent Assay

Immunoelectro-osmophoresis

JBI: Joanna Briggs Institute

RT-PCR: Real-time Polymerase Chain Reaction

Ali KB, Gadzama GB, Zailani SB, Mohammed Y, Daggash BB, Yakubu YM, et al. The risk factors Associated with Rotavirus Gastroenteritis among children under five years at University of Maiduguri Teaching Hospital, Borno State, Nigeria. Niger J Med. 2022;31(1):35–40.

Kraay ANM, Chaney DM, Deshpande A, Pitzer VE, Lopman BA. Predicting indirect effects of rotavirus vaccination programs on rotavirus mortality among children in 112 countries. npj Vaccines. 2023;8(1):32.

Slotboom DEF, Peeters D, Groeneweg S, van Rijn-Klink A, Jacobs E, Schoenaker MHD, et al. Neurologic complications of Rotavirus infections in Children. Pediatr Infect Dis J. 2023;42(7):533–6.

Basaran MK, Dogan C, Sursal A, Ozdener F. Effect of Rotavirus infection on serum micronutrients and atopy in children. J Pediatr Infect Dis. 2022;17(03):137–42.

Nirmal K, Gangar S. Rotaviral diseases and their implications. Viral Outbreaks-Global Trends and Perspectives: IntechOpen; 2023.

Prameela K, Vijaya L. The importance of breastfeeding in rotaviral diarrhoeas. Malaysian J Nutr. 2012;18(1).

Hallowell BD, Tate J, Parashar U. An overview of rotavirus vaccination programs in developing countries. Expert Rev Vaccines. 2020;19(6):529–37.

Moges F, Endris M, Mulu A, Tessema B, Belyhun Y, Shiferaw Y, et al. The growing challenges of antibacterial drug resistance in Ethiopia. J Global Antimicrob Resist. 2014;2(3):148–54.

Stintzing G, Back E, Tufvesson B, Johnsson T, Wadström T, Habte D. Seasonal fluctuations in the occurrence of enterotoxigenic bacteria and rotavirus in paediatric diarrhoea in Addis Ababa. Bull World Health Organ. 1981;59(1):67–73.

Atalell KA, Liyew AM, Alene KA. Spatial distribution of rotavirus immunization coverage in Ethiopia: a geospatial analysis using the bayesian approach. BMC Infect Dis. 2022;22(1):830.

Mandomando I, Mumba M, Nsiari-muzeyi Biey J, Kipese Paluku G, Weldegebriel G, Mwenda JM. Implementation of the World Health Organization recommendation on the use of rotavirus vaccine without age restriction by African countries. Vaccine. 2021;39(23):3111–9.

Miretu DG, Asfaw ZA, Addis SG. Impact of COVID-19 pandemic on vaccination coverage among children aged 15 to 23 months at Dessie town, Northeast Ethiopia, 2020. Hum Vaccines Immunotherapeutics. 2021;17(8):2427–36.

Feleke H, Medhin G, Abebe A, Beyene B, Kloos H, Asrat D. Enteric pathogens and associated risk factors among under-five children with and without diarrhea in Wegera district, northwestern Ethiopia. Pan Afr Med J. 2018;29.

Page MJ, McKenzie JE, Bossuyt PM, Boutron I, Hoffmann TC, Mulrow CD, et al. The PRISMA 2020 statement: an updated guideline for reporting systematic reviews. Int J Surg. 2021;88:105906.

Field N, Cohen T, Struelens MJ, Palm D, Cookson B, Glynn JR, et al. Strengthening the reporting of Molecular Epidemiology for Infectious diseases (STROME-ID): an extension of the STROBE statement. Lancet Infect Dis. 2014;14(4):341–52.

Briggs J. Critical Appraisal Tools: Checklist for Quasi-experimental studies. Joanna Briggs Inst. 2017:1–7.

Matthijnssens J, Bilcke J, Ciarlet M, Martella V, Bányai K, Rahman M, et al. Rotavirus disease and vaccination: impact on genotype diversity. Future Microbiol. 2009;4(10):1303–16.

Costa PS, Cardoso DD, Grisi SJ, Silva PA, Fiaccadori F, Souza MB, et al. Rotavirus a infections and reinfections: genotyping and vaccine implications. Jornal De Pediatria. 2004;80:119–22.

Tate JE, Patel MM, Steele AD, Gentsch JR, Payne DC, Cortese MM, et al. Global impact of rotavirus vaccines. Expert Rev Vaccines. 2010;9(4):395–407.

Mwenda JM, Ntoto KM, Abebe A, Enweronu-Laryea C, Amina I, Mchomvu J, et al. Burden and epidemiology of rotavirus diarrhea in selected African countries: preliminary results from the African Rotavirus Surveillance Network. J Infect Dis. 2010;202(Supplement1):S5–11.

Enweronu-Laryea CC, Sagoe KW, Damanka S, Lartey B, Armah GE. Rotavirus genotypes associated with childhood severe acute diarrhoea in southern Ghana: a cross-sectional study. Virol J. 2013;10:1–6.

Payne DC, Boom JA, Staat MA, Edwards KM, Szilagyi PG, Klein EJ, et al. Effectiveness of pentavalent and monovalent rotavirus vaccines in concurrent use among US children < 5 years of age, 2009–2011. Clin Infect Dis. 2013;57(1):13–20.

Vesikari T, Karvonen A, Ferrante SA, Ciarlet M. Efficacy of the pentavalent rotavirus vaccine, RotaTeq ® , in Finnish infants up to 3 years of age: the Finnish extension study. Eur J Pediatrics. 2010;169:1379–86.

Ruiz-Palacios GM, Pérez-Schael I, Velázquez FR, Abate H, Breuer T, Clemens SC, et al. Safety and efficacy of an attenuated vaccine against severe rotavirus gastroenteritis. N Engl J Med. 2006;354(1):11–22.

Acknowledgements

Ambo University and the Armauer Hansen Research Institute provided the research team with infrastructure and access to information technology.

Author information

Authors and affiliations.

Department of Medical Laboratory Sciences, College of Medical and Health Sciences, Ambo University, P. O. Box 19, Ambo, Ethiopia

Wagi Tosisa & Belay Tafa Regassa

Armauer Hansen Research Institute, Addis Ababa, Ethiopia

Andargachew Mulu

St. Paul’s Hospital Millennium Medical College, Addis Ababa, Ethiopia

Gadissa Bedada Hundie

Yirgalem Medical College Yirgalem, Yirgalem, Ethiopia

Daniel Eshetu

Department of Public Health, College of Medical and Health Sciences, Ambo University, Ambo, Ethiopia

Asnake Ararsa Irenso

Contributions

WT outlined the review work, developed the protocol, oversaw the research process, and facilitated communication and manuscript submission; WT, BT, and DE worked on screening, extracting, data quality checking, analysis, and writing.

Corresponding author

Correspondence to Wagi Tosisa .

Ethics declarations

Ethics approval and consent to participate.

Not applicable.

Consent for publication

Competing interests.

The authors affirm that they do not have any competing interests.

Additional information

Publisher’s note.

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary Material 1

Rights and permissions.

Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/ .

About this article

Cite this article.

Tosisa, W., Regassa, B.T., Eshetu, D. et al. Rotavirus infections and their genotype distribution pre- and post-vaccine introduction in Ethiopia: a systemic review and meta-analysis. BMC Infect Dis 24 , 836 (2024). https://doi.org/10.1186/s12879-024-09754-7

Received : 20 February 2024

Accepted : 13 August 2024

Published : 16 August 2024

DOI : https://doi.org/10.1186/s12879-024-09754-7

  • Rotavirus infection
  • Pre-vaccine
  • Post-vaccine

BMC Infectious Diseases

ISSN: 1471-2334

statistical analysis thesis pdf

COMMENTS

  1. Mathematics and Statistics Theses and Dissertations

    PDF. Statistical Learning of Biomedical Non-Stationary Signals and Quality of Life Modeling, Mahdi Goudarzi. PDF. Probabilistic and Statistical Prediction Models for Alzheimer's Disease and Statistical Analysis of Global Warming, Maryam Ibrahim Habadi. PDF. Essays on Time Series and Machine Learning Techniques for Risk Management, Michael ...

  2. PDF COMPARISON OF METHODS OF ANALYSIS FOR PRETEST AND POSTTEST DATA by

    2.2 METHODS OF DATA ANALYSIS. The most appropriate method to analyze Pretest-Posttest data is highly debated. According to Bonate (2000), a method sensitive to the validity of its assumptions may result in inaccurate P-values and false conclusions, while a test with low power is likely to result in a.

  3. The Beginner's Guide to Statistical Analysis

    Table of contents. Step 1: Write your hypotheses and plan your research design. Step 2: Collect data from a sample. Step 3: Summarize your data with descriptive statistics. Step 4: Test hypotheses or make estimates with inferential statistics.

  4. (PDF) An Overview of Statistical Data Analysis

    August 21, 2019. Abstract. The use of statistical software in academia and enterprises has been evolving over the last years. More often than not, students, professors ...

  5. PDF STATISTICAL METHODS FOR META-ANALYSIS

    This thesis deals with several important statistical issues in systematic reviews and meta-analyses, such as assessing heterogeneity in the presence of outliers, quantifying publication bias, and simultaneously synthesizing

  6. (Pdf) Statistical Analysis With Spss for Research

    STATISTICAL ANALYSIS WITH SPSS FOR RESEARCH. January 2017. Edition: First Edition. Publisher: ECRTD Publication. Editor: European Center for Research Training and Development. ISBN ...

  7. PDF Guideline to Writing a Master's Thesis in Statistics

    A master's thesis is an independent scientific work and is meant to prepare students for future professional or academic work. Largely, the thesis is expected to be similar to papers published in statistical journals. It is not set in stone exactly how the thesis should be organized. The following outline should however be followed. Title Page

  8. (PDF) A Really Simple Guide to Quantitative Data Analysis

    It is important to know what kind of data you are planning to collect or analyse as this will affect your analysis method. A 12 step approach to quantitative data analysis. Step 1: Start with ...

  9. Mathematics and Statistics Masters Theses

    PDF. An investigation of the influence of the 2007-2009 recession on the day of the week effect for the S&P 500 and its sectors, Marcel Alwin Trick. Theses from 2017 PDF. The pantograph equation in quantum calculus, Thomas Griebel. PDF. Comparing region level testing methods for differential DNA methylation analysis, Arnold Albert Harder. PDF

  10. Statistics, Department of

    PDF. NEW STATISTICAL METHODS FOR ANALYSIS OF HISTORICAL DATA FROM WILDLIFE POPULATIONS, Trevor Hefley. 2013 PDF. Informative Retesting for Hierarchical Group Testing, Michael S. Black. PDF. A Test for Detecting Changes in Closed Networks Based on the Number of Communications Between Nodes, Christopher S. Wichman. 2012 PDF

  11. PDF INTRODUCTION TO SPSS FOR STATISTICAL ANALYSIS

    • Descriptive statistics provide a summary of your data • Purpose of looking at descriptive statistics: (1) Check whether valid data are loaded properly E.g., unexpected values (e.g., 999, -2) in "Age" variable (range 10-80) (2) Explore data E.g., potential group differences, associations between variables (3) Sample description

  12. Introduction to Research Statistical Analysis: An Overview of the

    Introduction. Statistical analysis is necessary for any research project seeking to make quantitative conclusions. The following is a primer for research-based statistical analysis. It is intended to be a high-level overview of appropriate statistical testing, while not diving too deep into any specific methodology.

  13. Statistical Methods in Theses: Guidelines and Explanations

    Guidelines and Explanations. In light of the changes in psychology, faculty members who teach statistics/methods have reviewed the literature and generated this guide for graduate students. The guide is intended to enhance the quality of student theses by facilitating their engagement in open and transparent research practices and by helping ...

  14. PDF Chapter 10. Experimental Design: Statistical Analysis of Data Purpose

    Now, if we divide the frequency with which a given mean was obtained by the total number of sample means (36), we obtain the probability of selecting that mean (last column in Table 10.5). Thus, eight different samples of n = 2 would yield a mean equal to 3.0. The probability of selecting that mean is 8/36 = 0.222.

  15. Different Types of Data Analysis; Data Analysis Methods and Techniques

    Mechanistic Analysis, Statistical Analysis. I. DATA ANALYSIS AND DATA PREPARATION Data analysis is simply the process of converting the gathered data to meaningful information. Different techniques such as modeling to reach trends, relationships, and therefore conclusions to address the decision-making process are employed in this process ...

  16. PDF CUSTOMER SATISFACTION SURVEY, RESULT ANALYSIS AND UTILIZATION ...

    Statistical data is processed with Tixel and Qualtrics. In addition to comparative, correlation and cross tabulation analysis, this research is using quadrant analysis and net promoter score to analyse the quantitative data. The research also includes qualitative analysis of open-ended questions.

  17. PhD Dissertations

    Alex Luedtke, Lalit Kumar Jain. Statistical Learning and Modeling with Graphs and Networks. Jerry Wei. Yen-Chi Chen, Tyler Mccormick. 2023. Title. Author. Supervisor. Statistical Methods for the Analysis and Prediction of Hierarchical Time Series Data with Applications to Demography.

  18. PDF Study Design and Statistical Analysis


  19. (PDF) Chapter 3 Research Design and Methodology

    Research Design and Methodology. Chapter 3 consists of three parts: (1) Purpose of the study and research design, (2) Methods, and (3) Statistical data analysis procedure. Part one, Purpose of ...

  20. A Handbook of Statistical Analyses using SPSS

    Once located, the examples will aid a user to effectively navigate the SPSS GUI in order to conduct the statistical analysis required.2 A Handbook of Statistical Analyses using Stata is in its 3rd edition. The text has been updated to Stata 8, enriched with three new chapters, includes new graphical features of Stata 8, and emphasizes diagnostics.

  21. Statistics PhD theses

    DStat thesis: An exploration of the statistical consequences of sub-sampling for species identification. Carmen Ybarra Moncada. Multivariate methods with application to spectroscopy. Alun Bedding. The Bayesian analysis of dose titration to effect in Phase II clinical trials in order to design Phase III.

  22. A Handbook of Statistical Analyses using SPSS

    These notes are a basic guide to statistics using SPSS and are primarily written for undergraduate students, as well as for graduates and scholars with limited statistical knowledge. Explanations of basic statistical concepts are provided, along with an introduction to confidence intervals and a brief description of each statistical test.

  23. Comparison of Life Cycle Analysis Methodologies and Practical

    Master's thesis, Harvard University Division of Continuing Education. Abstract Life Cycle Assessments are cradle-to-grave systems studies that are created from a variety of inputs that include the process of defining the study goals, scope and boundaries, data input sources and quality requirements, methodologies and assumptions, and allocation ...

  24. ISLE

    Papers and Grants involving ISLE. Philipp Burckhardt, Rebecca Nugent & Christopher R. Genovese (2020) Teaching Statistical Concepts and Modern Data Analysis with a Computing-Integrated Learning Environment, Journal of Statistics Education, DOI: 10.1080/10691898.2020.1854637 Integrating a Statistical Learning Environment into the Writing Classroom.

  25. Neuromarketing and Big Data Analysis of Banking Firms' Website ...

    In today's competitive digital landscape, banking firms must leverage qualitative and quantitative analysis to enhance their website interfaces, ensuring they meet user needs and expectations. By combining detailed user feedback with data-driven insights, banks can create more intuitive and engaging online experiences, ultimately driving customer satisfaction and loyalty. Thus, the need for ...

  26. (PDF) Basic statistical tools in research and data analysis

    The statistical analysis gives meaning to the meaningless numbers, thereby breathing life into a lifeless data. ... Use text/rtf/doc/pdf files. Do not zip the files. Limit the file . size to 1 MB ...

  27. PDF Syllabus for M.Sc. (Master of Science) program in Statistics Department

    STAT 5208 Statistical Data Analysis-VIII(Lab) 50 1.5 STAT 5209 Thesis 150 4.5 STAT 5210 Report 50 1.5 STAT 5211 Viva-Voce 50 1.5 . 3 STAT 5101 ... Statistical Analysis of Reliability and Life Testing Models, Theory and Methods, 2nd Edition. Marcel Dekker, New York. 2. Balakrishnan, N.(ED.) (1995). Recent Advances in Life- Testing and ...

  28. PDF Dear Harvard Students, CfA, located at 60 Garden Street opposite the

    Open for Junior Thesis (AY98), Senior Thesis (AY 99), Semester Research for Credit (AY91R), Paid semester work for work-study eligible students, Paid semester work for non-work-study eligible students, Summer research paid by advisor, Summer Research with external stipend . The Kovac Group's Cosmic Microwave Background (CMB) Lab at the CfA works

  29. Rotavirus infections and their genotype distribution pre- and post

    Background Rotavirus infections are a significant cause of severe diarrhea and related illness and death in children under five worldwide. Despite the global introduction of vaccinations for rotavirus over a decade ago, rotavirus infections still result in high deaths annually, mainly in low-income countries, including Ethiopia, and need special attention. This system review and meta-analysis ...