
Korean J Anesthesiol. 2018 Apr;71(2).

Introduction to systematic review and meta-analysis

1 Department of Anesthesiology and Pain Medicine, Inje University Seoul Paik Hospital, Seoul, Korea

2 Department of Anesthesiology and Pain Medicine, Chung-Ang University College of Medicine, Seoul, Korea

Systematic reviews and meta-analyses present results by combining and analyzing data from different studies conducted on similar research topics. In recent years, systematic reviews and meta-analyses have been actively performed in various fields including anesthesiology. These research methods are powerful tools that can overcome the difficulties in performing large-scale randomized controlled trials. However, the inclusion of biased studies, or of studies whose quality of evidence has been improperly assessed, can yield misleading results. Therefore, various guidelines have been suggested for conducting systematic reviews and meta-analyses to help standardize them and improve their quality. Nonetheless, accepting the conclusions of meta-analyses without properly understanding them can be dangerous. This article therefore provides clinicians with an accessible introduction to performing and understanding meta-analyses.

Introduction

A systematic review collects all possible studies related to a given topic and design, and reviews and analyzes their results [ 1 ]. During the systematic review process, the quality of the studies is evaluated, and a statistical meta-analysis of the study results is conducted on the basis of their quality. A meta-analysis is a valid, objective, and scientific method of analyzing and combining different results. To obtain more reliable results, a meta-analysis is usually conducted on randomized controlled trials (RCTs), which have a high level of evidence [ 2 ] (Fig. 1). Since 1999, various papers have presented guidelines for reporting meta-analyses of RCTs. Following the Quality of Reporting of Meta-analyses (QUOROM) statement [ 3 ] and the appearance of registers such as the Cochrane Library’s Methodology Register, a large number of systematic literature reviews have been registered. In 2009, the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) statement [ 4 ] was published, and it greatly helped standardize and improve the quality of systematic reviews and meta-analyses [ 5 ].

Fig. 1. Levels of evidence.

In anesthesiology, the importance of systematic reviews and meta-analyses has been highlighted, and they provide diagnostic and therapeutic value to various areas, including not only perioperative management but also intensive care and outpatient anesthesia [6–13]. Systematic reviews and meta-analyses include various topics, such as comparing various treatments of postoperative nausea and vomiting [ 14 , 15 ], comparing general anesthesia and regional anesthesia [ 16 – 18 ], comparing airway maintenance devices [ 8 , 19 ], comparing various methods of postoperative pain control (e.g., patient-controlled analgesia pumps, nerve block, or analgesics) [ 20 – 23 ], comparing the precision of various monitoring instruments [ 7 ], and meta-analysis of dose-response in various drugs [ 12 ].

Thus, literature reviews and meta-analyses are being conducted in diverse medical fields, and the aim of highlighting their importance is to help extract accurate, good-quality data from the flood of data being produced. However, a lack of understanding of systematic reviews and meta-analyses can lead to incorrect outcomes being derived from the review and analysis processes, and readers who indiscriminately accept the results of the many published meta-analyses may be misled. Therefore, in this review, we aim to describe the contents and methods of systematic reviews and meta-analyses in a way that is easy to understand for their future authors and readers.

Study Planning

It is easy to confuse systematic reviews and meta-analyses. A systematic review is an objective, reproducible method to find answers to a certain research question, by collecting all available studies related to that question and reviewing and analyzing their results. A meta-analysis differs from a systematic review in that it uses statistical methods on estimates from two or more different studies to form a pooled estimate [ 1 ]. Following a systematic review, if it is not possible to form a pooled estimate, it can be published as is without progressing to a meta-analysis; however, if it is possible to form a pooled estimate from the extracted data, a meta-analysis can be attempted. Systematic reviews and meta-analyses usually proceed according to the flowchart presented in Fig. 2 . We explain each of the stages below.

Fig. 2. Flowchart illustrating a systematic review.

Formulating research questions

A systematic review attempts to gather all available empirical research by using clearly defined, systematic methods to obtain answers to a specific question. A meta-analysis is the statistical process of analyzing and combining results from several similar studies. Here, the definition of the word “similar” is not made clear, but when selecting a topic for the meta-analysis, it is essential to ensure that the different studies present data that can be combined. If the studies contain data on the same topic that can be combined, a meta-analysis can even be performed using data from only two studies. However, study selection via a systematic review is a precondition for performing a meta-analysis, and it is important to clearly define the Population, Intervention, Comparison, Outcomes (PICO) parameters that are central to evidence-based research. In addition, the research topic should be selected on the basis of logical evidence, and it is important to select a topic that is familiar to readers but for which the evidence has not yet been clearly confirmed [ 24 ].
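To make the PICO structure concrete, here is a minimal sketch in Python; every field value is hypothetical and serves only to show how a research question maps onto the four PICO parameters.

```python
# Hypothetical PICO specification for a meta-analysis protocol.
# All field values are illustrative, not taken from the article.
pico = {
    "population": "adults undergoing general anesthesia",
    "intervention": "prophylactic antiemetic X",
    "comparison": "placebo",
    "outcomes": ["postoperative nausea", "postoperative vomiting"],
}

# The research question follows directly from the four fields:
question = (
    f"In {pico['population']}, does {pico['intervention']} "
    f"compared with {pico['comparison']} reduce "
    f"{' and '.join(pico['outcomes'])}?"
)
print(question)
```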

Protocols and registration

In systematic reviews, prior registration of a detailed research plan is very important. To make the research process transparent, the primary and secondary outcomes and the methods are set in advance; if the methods change, other researchers and readers are informed of when, how, and why. Many studies are registered with an organization like PROSPERO ( http://www.crd.york.ac.uk/PROSPERO/ ), and the registration number is recorded when reporting the study, in order to share the protocol as it was planned.

Defining inclusion and exclusion criteria

Information is included on the study design, patient characteristics, publication status (published or unpublished), language used, and research period. If there is a discrepancy between the number of patients included in the study and the number of patients included in the analysis, this needs to be clearly explained while describing the patient characteristics, to avoid confusing the reader.

Literature search and study selection

In order to secure a proper basis for evidence-based research, it is essential to perform a broad search that includes as many studies as possible that meet the inclusion and exclusion criteria. Typically, the three bibliographic databases Medline, Embase, and the Cochrane Central Register of Controlled Trials (CENTRAL) are used. In domestic studies, the Korean databases KoreaMed, KMBASE, and RISS4U may be included. Effort is required to identify not only published studies but also abstracts, ongoing studies, and studies awaiting publication. Among the studies retrieved in the search, the researchers remove duplicates, select studies that meet the inclusion/exclusion criteria based on the abstracts, and then make the final selection based on the full text. To maintain transparency and objectivity throughout this process, study selection is conducted independently by at least two investigators. When opinions are inconsistent, the disagreement is resolved through debate or by a third reviewer. The methods for this process also need to be planned in advance. It is essential to ensure the reproducibility of the literature selection process [ 25 ].

Quality of evidence

However well planned the systematic review or meta-analysis is, if the quality of evidence in the included studies is low, the quality of the meta-analysis decreases and incorrect results can be obtained [ 26 ]. Even when using randomized studies with a high quality of evidence, evaluating the quality of evidence precisely helps determine the strength of recommendations in the meta-analysis. One method of evaluating the quality of evidence in non-randomized studies is the Newcastle-Ottawa Scale, provided by the Ottawa Hospital Research Institute 1) . However, we are mostly focusing on meta-analyses that use randomized studies.

If the Grading of Recommendations, Assessment, Development and Evaluations (GRADE) system ( http://www.gradeworkinggroup.org/ ) is used, the quality of evidence is evaluated on the basis of the study limitations, inaccuracies, incompleteness of outcome data, indirectness of evidence, and risk of publication bias, and this is used to determine the strength of recommendations [ 27 ]. As shown in Table 1 , the study limitations are evaluated using the “risk of bias” method proposed by Cochrane 2) . This method classifies bias in randomized studies as “low,” “high,” or “unclear” on the basis of the presence or absence of six processes (random sequence generation, allocation concealment, blinding participants or investigators, incomplete outcome data, selective reporting, and other biases) [ 28 ].

Table 1. The Cochrane Collaboration’s Tool for Assessing the Risk of Bias [ 28 ]

Data extraction

Two different investigators extract data based on the objectives and form of the study; thereafter, the extracted data are reviewed. Since the size and format of each variable differ, the size and format of the outcomes also differ, and slight changes may be required when combining the data [ 29 ]. If there are differences in the size and format of the outcome variables that make it difficult to combine the data, such as the use of different evaluation instruments or different evaluation timepoints, the analysis may be limited to a systematic review. The investigators resolve differences of opinion by debate, and if they fail to reach a consensus, a third reviewer is consulted.

Data Analysis

The aim of a meta-analysis is to derive a conclusion with greater power and accuracy than could be achieved in individual studies. Therefore, before analysis, it is crucial to evaluate the direction of effect, size of effect, homogeneity of effects among studies, and strength of evidence [ 30 ]. Thereafter, the data are reviewed qualitatively and quantitatively. If it is determined that the different research outcomes cannot be combined, all the results and characteristics of the individual studies are displayed in a table or in a descriptive form; this is referred to as a qualitative review. A meta-analysis is a quantitative review, in which the clinical effectiveness is evaluated by calculating the weighted pooled estimate for the interventions in at least two separate studies.

The pooled estimate is the outcome of the meta-analysis, and is typically presented in a forest plot (Figs. 3 and 4). The black squares in the forest plot are the odds ratios (ORs), and the horizontal lines the 95% confidence intervals, for each study. The area of each square represents the weight that study receives in the meta-analysis. The black diamond represents the OR and 95% confidence interval calculated across all the included studies. The bold vertical line represents a lack of therapeutic effect (OR = 1); if a confidence interval includes OR = 1, no significant difference was found between the treatment and control groups.

Fig. 3. Forest plot analyzed by two different models using the same data. (A) Fixed-effect model. (B) Random-effect model. The figure depicts individual trials as filled squares with the relative sample size and the solid line as the 95% confidence interval of the difference. The diamond shape indicates the pooled estimate and uncertainty for the combined effect. The vertical line indicates no treatment effect (OR = 1); if a confidence interval includes 1, the result shows no evidence of a difference between the treatment and control groups.

Fig. 4. Forest plot representing homogeneous data.
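For readers who wish to reproduce this kind of figure, the following is a minimal forest-plot sketch in Python with matplotlib; the study names, odds ratios, confidence intervals, and weights are all invented for illustration.

```python
# Minimal forest-plot sketch; all study data below are hypothetical.
import matplotlib.pyplot as plt
import numpy as np

studies = ["Study A", "Study B", "Study C"]
or_est = np.array([0.80, 0.65, 1.10])    # odds ratios (point estimates)
ci_low = np.array([0.55, 0.40, 0.70])    # lower 95% CI limits
ci_high = np.array([1.15, 1.05, 1.75])   # upper 95% CI limits
weights = np.array([0.5, 0.3, 0.2])      # relative meta-analysis weights

fig, ax = plt.subplots()
y = np.arange(len(studies))[::-1]

# Horizontal lines: 95% confidence intervals.
ax.hlines(y, ci_low, ci_high, color="black")
# Squares sized by weight: the study point estimates.
ax.scatter(or_est, y, s=400 * weights, marker="s", color="black")

ax.axvline(1.0, color="black", linewidth=1)  # OR = 1: no therapeutic effect
ax.set_xscale("log")                         # ORs are conventionally on a log scale
ax.set_yticks(y)
ax.set_yticklabels(studies)
ax.set_xlabel("Odds ratio (log scale)")
plt.show()
```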

Dichotomous variables and continuous variables

In data analysis, outcome variables can be considered broadly in terms of dichotomous variables and continuous variables. When combining data from continuous variables, the mean difference (MD) and standardized mean difference (SMD) are used ( Table 2 ).

Table 2. Summary of Meta-analysis Methods Available in RevMan [ 28 ]

The MD is the absolute difference in mean values between the groups, and the SMD is the mean difference between groups divided by the standard deviation. When results are presented in the same units, the MD can be used, but when results are presented in different units, the SMD should be used. When the MD is used, the combined units must be shown. A value of “0” for the MD or SMD indicates that the effects of the new treatment method and the existing treatment method are the same. A value lower than “0” means the new treatment method is less effective than the existing method, and a value greater than “0” means the new treatment is more effective than the existing method.
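As a minimal sketch with invented group summaries, the MD and SMD can be computed as follows; the pooled standard deviation shown is one common choice (as in Cohen’s d), not the only one.

```python
# Sketch: mean difference (MD) and standardized mean difference (SMD).
# Group summary statistics below are hypothetical.
import math

n1, mean1, sd1 = 40, 52.0, 10.0   # intervention group
n2, mean2, sd2 = 42, 47.5, 11.0   # control group

md = mean1 - mean2                # absolute difference, in the outcome's units

# Pooled SD as in Cohen's d; Hedges' g would add a small-sample correction.
sd_pooled = math.sqrt(((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / (n1 + n2 - 2))
smd = md / sd_pooled              # unitless, comparable across instruments

print(f"MD = {md:.2f}, SMD = {smd:.2f}")
```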

When combining data for dichotomous variables, the OR, risk ratio (RR), or risk difference (RD) can be used. The RR and RD can be used for RCTs, quasi-experimental studies, or cohort studies, and the OR can be used for other case-control studies or cross-sectional studies. However, because the OR is difficult to interpret, using the RR and RD, if possible, is recommended. If the outcome variable is a dichotomous variable, it can be presented as the number needed to treat (NNT), which is the minimum number of patients who need to be treated in the intervention group, compared to the control group, for a given event to occur in at least one patient. Based on Table 3 , in an RCT, if x is the probability of the event occurring in the control group and y is the probability of the event occurring in the intervention group, then x = c/(c + d), y = a/(a + b), and the absolute risk reduction (ARR) = x − y. NNT can be obtained as the reciprocal, 1/ARR.

Table 3. Calculation of the Number Needed to Treat from a Dichotomous (2 × 2) Table
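Using the 2 × 2 notation just described (a, b for the intervention group and c, d for the control group), a minimal sketch of the ARR and NNT calculation, with hypothetical counts:

```python
# Sketch: absolute risk reduction (ARR) and number needed to treat (NNT).
# Hypothetical counts: a/b = events/non-events in the intervention group,
# c/d = events/non-events in the control group.
a, b = 12, 88
c, d = 24, 76

y = a / (a + b)    # event probability in the intervention group
x = c / (c + d)    # event probability in the control group

arr = x - y        # absolute risk reduction
nnt = 1 / arr      # number needed to treat (round up in practice)

print(f"ARR = {arr:.3f}, NNT = {nnt:.1f}")   # ARR = 0.120, NNT = 8.3
```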

Fixed-effect models and random-effect models

In order to analyze effect size, two types of models can be used: a fixed-effect model or a random-effect model. A fixed-effect model assumes that the effect of treatment is the same, and that variation between results in different studies is due to random error. Thus, a fixed-effect model can be used when the studies are considered to have the same design and methodology, or when the variability in results within a study is small, and the variance is thought to be due to random error. Three common methods are used for weighted estimation in a fixed-effect model: 1) inverse variance-weighted estimation 3) , 2) Mantel-Haenszel estimation 4) , and 3) Peto estimation 5) .
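A minimal sketch of inverse variance-weighted (fixed-effect) pooling follows; the per-study effects and standard errors are invented, and ratio measures such as ORs are pooled on the log scale.

```python
# Sketch: fixed-effect meta-analysis with inverse-variance weights.
# Per-study log(OR) estimates and standard errors are hypothetical.
import math

effects = [-0.22, -0.43, 0.10]   # log odds ratios
ses = [0.18, 0.25, 0.30]         # standard errors

weights = [1 / se**2 for se in ses]   # w_i = 1 / variance_i
pooled = sum(w * e for w, e in zip(weights, effects)) / sum(weights)
pooled_se = math.sqrt(1 / sum(weights))

lo, hi = pooled - 1.96 * pooled_se, pooled + 1.96 * pooled_se
print(f"Pooled OR = {math.exp(pooled):.2f} "
      f"(95% CI {math.exp(lo):.2f} to {math.exp(hi):.2f})")
```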

A random-effect model assumes heterogeneity between the studies being combined, and these models are used when the studies are assumed different, even if a heterogeneity test does not show a significant result. Unlike a fixed-effect model, a random-effect model assumes that the size of the effect of treatment differs among studies. Thus, differences in variation among studies are thought to be due to not only random error but also between-study variability in results. Therefore, weight does not decrease greatly for studies with a small number of patients. Among methods for weighted estimation in a random-effect model, the DerSimonian and Laird method 6) is mostly used for dichotomous variables, as the simplest method, while inverse variance-weighted estimation is used for continuous variables, as with fixed-effect models. These four methods are all used in Review Manager software (The Cochrane Collaboration, UK), and are described in a study by Deeks et al. [ 31 ] ( Table 2 ). However, when the number of studies included in the analysis is less than 10, the Hartung-Knapp-Sidik-Jonkman method 7) can better reduce the risk of type 1 error than does the DerSimonian and Laird method [ 32 ].
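Continuing with the same hypothetical inputs, a sketch of the DerSimonian and Laird estimate of the between-study variance (tau²) and the resulting random-effects pooled estimate:

```python
# Sketch: DerSimonian-Laird random-effects pooling (hypothetical inputs).
import math

effects = [-0.22, -0.43, 0.10]
ses = [0.18, 0.25, 0.30]

w = [1 / se**2 for se in ses]                       # fixed-effect weights
fe = sum(wi * e for wi, e in zip(w, effects)) / sum(w)

# Cochran's Q and the DL estimate of tau^2 (truncated at zero).
q = sum(wi * (e - fe)**2 for wi, e in zip(w, effects))
df = len(effects) - 1
c = sum(w) - sum(wi**2 for wi in w) / sum(w)
tau2 = max(0.0, (q - df) / c)

# Random-effects weights add tau^2 to each study's variance, so small
# studies are down-weighted less than under the fixed-effect model.
w_re = [1 / (se**2 + tau2) for se in ses]
re = sum(wi * e for wi, e in zip(w_re, effects)) / sum(w_re)
re_se = math.sqrt(1 / sum(w_re))

print(f"tau^2 = {tau2:.3f}, pooled log(OR) = {re:.3f} (SE {re_se:.3f})")
```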

Fig. 3 shows the results of analyzing outcome data using a fixed-effect model (A) and a random-effect model (B). As shown in Fig. 3 , while the results from large studies are weighted more heavily in the fixed-effect model, studies are given relatively similar weights irrespective of study size in the random-effect model. Although identical data were being analyzed, as shown in Fig. 3 , the significant result in the fixed-effect model was no longer significant in the random-effect model. One representative example of the small study effect in a random-effect model is the meta-analysis by Li et al. [ 33 ]. In a large-scale study, intravenous injection of magnesium was unrelated to acute myocardial infarction, but in the random-effect model, which included numerous small studies, the small study effect resulted in an association being found between intravenous injection of magnesium and myocardial infarction. This small study effect can be controlled for by using a sensitivity analysis, which is performed to examine the contribution of each of the included studies to the final meta-analysis result. In particular, when heterogeneity is suspected in the study methods or results, by changing certain data or analytical methods, this method makes it possible to verify whether the changes affect the robustness of the results, and to examine the causes of such effects [ 34 ].

Heterogeneity

A homogeneity test examines whether the variation in effect sizes across studies is greater than would be expected to occur naturally through sampling error alone; in other words, it tests whether the effect sizes calculated from the several studies are the same. Three approaches can be used to assess homogeneity: 1) the forest plot, 2) Cochran’s Q test (chi-squared), and 3) the Higgins I² statistic. In the forest plot, as shown in Fig. 4, greater overlap between the confidence intervals indicates greater homogeneity. For the Q statistic, when the P value of the chi-squared test, calculated from the forest plot in Fig. 4, is less than 0.1, the studies are considered to show statistical heterogeneity and a random-effect model can be used. Finally, I² can be used [ 35 ].

I², calculated as I² = 100% × (Q − df)/Q (where Q is Cochran’s Q statistic and df its degrees of freedom), returns a value between 0 and 100%. A value less than 25% is considered to show strong homogeneity, a value of around 50% is moderate, and a value greater than 75% indicates strong heterogeneity.
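A minimal sketch of Cochran’s Q test and I², reusing the same hypothetical effects and standard errors as above; the chi-squared P value comes from scipy.

```python
# Sketch: Cochran's Q test and Higgins I^2 (hypothetical inputs).
from scipy.stats import chi2

effects = [-0.22, -0.43, 0.10]
ses = [0.18, 0.25, 0.30]

w = [1 / se**2 for se in ses]
fe = sum(wi * e for wi, e in zip(w, effects)) / sum(w)

q = sum(wi * (e - fe)**2 for wi, e in zip(w, effects))
df = len(effects) - 1
p_value = chi2.sf(q, df)              # P < 0.1 suggests heterogeneity
i2 = max(0.0, (q - df) / q) * 100     # I^2 = (Q - df)/Q as a percentage

print(f"Q = {q:.2f}, P = {p_value:.3f}, I^2 = {i2:.0f}%")
```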

Even when the data cannot be shown to be homogeneous, a fixed-effect model can be used, ignoring the heterogeneity, and all the study results can be presented individually, without combining them. However, in many cases, a random-effect model is applied, as described above, and a subgroup analysis or meta-regression analysis is performed to explain the heterogeneity. In a subgroup analysis, the data are divided into subgroups that are expected to be homogeneous, and these subgroups are analyzed. This needs to be planned in the predetermined protocol before starting the meta-analysis. A meta-regression analysis is similar to a normal regression analysis, except that the heterogeneity between studies is modeled. This process involves performing a regression analysis of the pooled estimate on covariates at the study level, and so it is usually not considered when the number of studies is less than 10. Here, univariate and multivariate regression analyses can both be considered.

Publication bias

Publication bias is the most common type of reporting bias in meta-analyses. It refers to the distortion of meta-analysis outcomes caused by the higher likelihood of publication for statistically significant studies than for non-significant ones. To test for the presence of publication bias, a funnel plot can be used first (Fig. 5). Studies are plotted on a scatter plot with effect size on the x-axis and precision or total sample size on the y-axis. If the points form an upside-down funnel shape, with a broad base that narrows towards the top of the plot, this indicates the absence of publication bias (Fig. 5A) [ 29 , 36 ]. On the other hand, if the plot shows an asymmetric shape, with no points on one side of the graph, publication bias can be suspected (Fig. 5B). Second, to test publication bias statistically, Begg and Mazumdar’s rank correlation test 8) [ 37 ] or Egger’s test 9) [ 29 ] can be used. If publication bias is detected, the trim-and-fill method 10) can be used to correct the bias [ 38 ]. Fig. 6 displays results that showed publication bias in Egger’s test and were then corrected using the trim-and-fill method in Comprehensive Meta-Analysis software (Biostat, USA).
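A minimal sketch of Egger’s test, following footnote 9: regress the standard normal deviate (effect/SE) on precision (1/SE) and test whether the intercept differs from zero. The effects and standard errors below are hypothetical.

```python
# Sketch: Egger's regression test for funnel-plot asymmetry.
# Per-study effects and standard errors are hypothetical.
import numpy as np
import statsmodels.api as sm

effects = np.array([-0.22, -0.43, 0.10, -0.35, -0.15, -0.50])
ses = np.array([0.18, 0.25, 0.30, 0.28, 0.12, 0.33])

snd = effects / ses        # standard normal deviates
precision = 1 / ses

X = sm.add_constant(precision)
fit = sm.OLS(snd, X).fit()

# An intercept far from zero suggests asymmetry (possible publication bias).
intercept, p = fit.params[0], fit.pvalues[0]
print(f"Egger intercept = {intercept:.2f}, P = {p:.3f}")
```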

Fig. 5. Funnel plot showing the effect size on the x-axis and sample size on the y-axis as a scatter plot. (A) Funnel plot without publication bias: the individual points are broader at the bottom and narrower at the top. (B) Funnel plot with publication bias: the individual points are located asymmetrically.

Fig. 6. Funnel plot adjusted using the trim-and-fill method. White circles: comparisons included. Black circles: imputed comparisons added by the trim-and-fill method. White diamond: pooled observed log risk ratio. Black diamond: pooled imputed log risk ratio.

Result Presentation

When reporting the results of a systematic review or meta-analysis, the analytical content and methods should be described in detail. First, a flowchart is displayed with the literature search and selection process according to the inclusion/exclusion criteria. Second, a table is shown with the characteristics of the included studies. A table should also be included with information related to the quality of evidence, such as GRADE ( Table 4 ). Third, the results of data analysis are shown in a forest plot and funnel plot. Fourth, if the results use dichotomous data, the NNT values can be reported, as described above.

Table 4. The GRADE Evidence Quality for Each Outcome

N: number of studies, ROB: risk of bias, PON: postoperative nausea, POV: postoperative vomiting, PONV: postoperative nausea and vomiting, CI: confidence interval, RR: risk ratio, AR: absolute risk.

When Review Manager software (The Cochrane Collaboration, UK) is used for the analysis, two types of P values are given. The first is the P value from the z-test, which tests the null hypothesis that the intervention has no effect. The second P value is from the chi-squared test, which tests the null hypothesis for a lack of heterogeneity. The statistical result for the intervention effect, which is generally considered the most important result in meta-analyses, is the z-test P value.

A common mistake when reporting results is, given a z-test P value greater than 0.05, to state that there was “no statistical significance” or “no difference.” When evaluating statistical significance in a meta-analysis, a P value lower than 0.05 can be explained as “a significant difference in the effects of the two treatment methods.” However, the P value may appear non-significant whether or not there is a difference between the two treatment methods. In such a situation, it is better to state that “there was no strong evidence for an effect,” and to present the P value and confidence intervals. Another common mistake is to think that a smaller P value indicates a more significant effect. In meta-analyses of large-scale studies, the P value is affected more by the number of studies and patients included than by the significance of the results; therefore, care should be taken when interpreting the results of a meta-analysis.

When performing a systematic literature review or meta-analysis, if the quality of the studies is not properly evaluated or if proper methodology is not strictly applied, the results can be biased and the outcomes can be incorrect. However, when systematic reviews and meta-analyses are properly implemented, they can yield powerful results that could usually only be achieved using large-scale RCTs, which are difficult to perform in individual studies. As our understanding of evidence-based medicine increases and its importance is better appreciated, the number of systematic reviews and meta-analyses will keep increasing. However, indiscriminate acceptance of the results of all these meta-analyses can be dangerous, and hence, we recommend that their results be interpreted critically, on the basis of an accurate understanding.

1) http://www.ohri.ca .

2) http://methods.cochrane.org/bias/assessing-risk-bias-included-studies .

3) The inverse variance-weighted estimation method is useful if the number of studies is small with large sample sizes.

4) The Mantel-Haenszel estimation method is useful if the number of studies is large with small sample sizes.

5) The Peto estimation method is useful if the event rate is low or one of the two groups shows zero incidence.

6) The most popular and simplest statistical method used in Review Manager and Comprehensive Meta-analysis software.

7) An alternative random-effect model meta-analysis that has more adequate error rates than the common DerSimonian and Laird method, especially when the number of studies is small. However, even with the Hartung-Knapp-Sidik-Jonkman method, when there are fewer than five studies with very unequal sizes, extra caution is needed.

8) The Begg and Mazumdar rank correlation test uses the correlation between the ranks of effect sizes and the ranks of their variances [ 37 ].

9) The degree of funnel plot asymmetry as measured by the intercept from the regression of standard normal deviates against precision [ 29 ].

10) If there are more small studies on one side, we expect the suppression of studies on the other side. Trimming yields the adjusted effect size and reduces the variance of the effects by adding the original studies back into the analysis as a mirror image of each study.


Meta-Analysis – Guide with Definition, Steps & Examples

Published by Owen Ingram on April 26, 2023; revised on April 26, 2023

“A meta-analysis is a formal, epidemiological, quantitative study design that uses statistical methods to generalise the findings of the selected independent studies.”

Meta-analysis and systematic review are two of the most trusted strategies for synthesising research evidence. When researchers start looking for the best available evidence concerning their research work, they are advised to begin at the top of the evidence pyramid. Evidence in the form of meta-analyses or systematic reviews addressing important questions is significant in academics because it informs decision-making.

What is Meta-Analysis  

Meta-analysis estimates the overall effect of an intervention by systematically synthesising, or merging, the results of individual independent research studies. Meta-analysis isn’t only about reaching a wider population by combining several smaller studies. It involves systematic methods for evaluating inconsistencies in participants and findings (known as heterogeneity) and for checking how sensitive the conclusions are to the chosen systematic review protocol.

When Should you Conduct a Meta-Analysis?

Meta-analysis has become a widely used research method in medical sciences and other fields for several reasons. The technique involves summarising the results of independent studies identified through a systematic review.

The Cochrane Handbook explains that “an important step in a systematic review is the thoughtful consideration of whether it is appropriate to combine the numerical results of all, or perhaps some, of the studies. Such a meta-analysis yields an overall statistic (together with its confidence interval) that summarizes the effectiveness of an experimental intervention compared with a comparator intervention” (section 10.2).

A researcher or a practitioner should choose meta-analysis when the following outcomes are desirable. 

To generate new hypotheses or settle controversies arising from conflicting research studies. Meta-analysis makes it possible to quantify and evaluate variable results and to identify the extent of conflict in the literature.

To find research gaps left unfilled and address questions not posed by individual studies. Primary research studies involve specific types of participants and interventions. A review of these studies with variable characteristics and methodologies can allow the researcher to gauge the consistency of findings across a wider range of participants and interventions. With the help of meta-analysis, the reasons for differences in the effect can also be explored. 

To provide convincing evidence. Estimating the effects with a larger sample size and interventions can provide convincing evidence. Many academic studies are based on a very small dataset, so the estimated intervention effects in isolation are not fully reliable.

Elements of a Meta-Analysis

Deeks et al. (2019), Haidich (2010), and Grant & Booth (2009) explored the characteristics, strengths, and weaknesses of meta-analysis. These are briefly explained below.

Characteristics: 

  • A systematic review must be completed before conducting the meta-analysis because it provides a summary of the findings of the individual studies synthesised. 
  • You can only conduct a meta-analysis by synthesising studies in a systematic review. 
  • The studies selected for statistical analysis for the purpose of meta-analysis should be similar in terms of comparison, intervention, and population. 

Strengths: 

  • A meta-analysis takes place after the systematic review. The end product is a comprehensive quantitative analysis that is complex but reliable. 
  • It gives more weight and value to existing studies that do not hold practical value on their own. 
  • Policy-makers and academicians cannot base their decisions on individual research studies. Meta-analysis provides them with a comprehensive and robust analysis of the evidence on which to base informed decisions. 

Criticisms: 

  • The meta-analysis uses studies exploring similar topics. Finding similar studies for the meta-analysis can be challenging.
  • If the individual studies are biased, or if biases related to reporting or specific research methodologies are present, the results of the meta-analysis could be misleading.

Steps of Conducting the Meta-Analysis 

The process of conducting the meta-analysis has remained a topic of debate among researchers and scientists. However, the following 5-step process is widely accepted. 

Step 1: Research Question

The first step in conducting clinical research involves identifying a research question and proposing a hypothesis. The potential clinical significance of the research question is then explained, and the study design and analytical plan are justified.

Step 2: Systematic Review 

The purpose of a systematic review (SR) is to address a research question by identifying all relevant studies that meet the required quality standards for inclusion. While established journals typically serve as the primary source for identified studies, it is important to also consider unpublished data to avoid publication bias or the exclusion of studies with negative results.

While some meta-analyses may limit their focus to randomized controlled trials (RCTs) for the sake of obtaining the highest quality evidence, other experimental and quasi-experimental studies may be included if they meet the specific inclusion/exclusion criteria established for the review.

Step 3: Data Extraction

After selecting studies for the meta-analysis, researchers extract summary data or outcomes, as well as sample sizes and measures of data variability for both intervention and control groups. The choice of outcome measures depends on the research question and the type of study, and may include numerical or categorical measures.

For instance, numerical means may be used to report differences in scores on a questionnaire or changes in a measurement, such as blood pressure. In contrast, risk measures like odds ratios (OR) or relative risks (RR) are typically used to report differences in the probability of belonging to one category or another, such as vaginal birth versus cesarean birth.

Step 4: Standardisation and Weighting Studies

After gathering all the required data, the fourth step involves computing suitable summary measures from each study for further examination. These measures are typically referred to as Effect Sizes and indicate the difference in average scores between the control and intervention groups. For instance, it could be the variation in blood pressure changes between study participants who used drug X and those who used a placebo.

Since the units of measurement often differ across the included studies, standardization is necessary to create comparable effect size estimates. Standardization is accomplished by determining, for each study, the average score for the intervention group, subtracting the average score for the control group, and dividing the result by the relevant measure of variability in that dataset.

In some cases, the results of certain studies must carry more significance than others. Larger studies, as measured by their sample sizes, are deemed to produce more precise estimates of effect size than smaller studies. Additionally, studies with less variability in data, such as smaller standard deviation or narrower confidence intervals, are typically regarded as higher quality in study design. A weighting statistic that aims to incorporate both of these factors, known as inverse variance, is commonly employed.

Step 5: Absolute Effect Estimation

The ultimate step in conducting a meta-analysis is to choose and utilize an appropriate model for comparing Effect Sizes among diverse studies. Two popular models for this purpose are the Fixed Effects and Random Effects models. The Fixed Effects model relies on the premise that each study is evaluating a common treatment effect, implying that all studies would have estimated the same Effect Size if sample variability were equal across all studies.

Conversely, the Random Effects model posits that the true treatment effects in individual studies may vary from each other, and endeavors to consider this additional source of interstudy variation in Effect Sizes. The existence and magnitude of this latter variability is usually evaluated within the meta-analysis through a test for ‘heterogeneity.’

Forest Plot

The results of a meta-analysis are often visually presented using a “Forest Plot”. This type of plot displays, for each study included in the analysis, a horizontal line that indicates the standardized Effect Size estimate and 95% confidence interval for the risk ratio used. Figure A provides an example of a hypothetical Forest Plot in which drug X reduces the risk of death in all three studies.

However, the first study was larger than the other two, and as a result, the estimates for the smaller studies were not statistically significant. This is indicated by the lines emanating from their boxes crossing the value of 1. The size of the boxes represents the relative weights assigned to each study by the meta-analysis. The combined estimate of the drug’s effect, represented by the diamond, provides a more precise estimate, with the diamond indicating both the combined risk ratio estimate and the 95% confidence interval limits.


Figure-A: Hypothetical Forest Plot

Relevance to Practice and Research 

  Evidence Based Nursing commentaries often include recently published systematic reviews and meta-analyses, as they can provide new insights and strengthen recommendations for effective healthcare practices. Additionally, they can identify gaps or limitations in current evidence and guide future research directions.

The quality of the data available for synthesis is a critical factor in the strength of conclusions drawn from meta-analyses, and this is influenced by the quality of individual studies and the systematic review itself. However, meta-analysis cannot overcome issues related to underpowered or poorly designed studies.

Therefore, clinicians may still encounter situations where the evidence is weak or uncertain, and where higher-quality research is required to improve clinical decision-making. While such findings can be frustrating, they remain important for informing practice and highlighting the need for further research to fill gaps in the evidence base.

Methods and Assumptions in Meta-Analysis 

Ensuring the credibility of findings is imperative in all types of research, including meta-analyses. To validate the outcomes of a meta-analysis, the researcher must confirm that the research techniques used were accurate in measuring the intended variables. Typically, researchers establish the validity of a meta-analysis by testing the outcomes for homogeneity or the degree of similarity between the results of the combined studies.

Homogeneity is preferred in meta-analyses as it allows the data to be combined without needing adjustments to suit the study’s requirements. To determine homogeneity, researchers assess heterogeneity, the opposite of homogeneity. Two widely used statistical methods for evaluating heterogeneity in research results are Cochran’s Q and the I² (I-squared) index.

Difference Between Meta-Analysis and Systematic Reviews

Meta-analysis and systematic reviews are both research methods used to synthesise evidence from multiple studies on a particular topic. However, there are some key differences between the two.

Systematic reviews involve a comprehensive and structured approach to identifying, selecting, and critically appraising all available evidence relevant to a specific research question. This process involves searching multiple databases, screening the identified studies for relevance and quality, and summarizing the findings in a narrative report.

Meta-analysis, on the other hand, involves using statistical methods to combine and analyze the data from multiple studies, with the aim of producing a quantitative summary of the overall effect size. Meta-analysis requires the studies to be similar enough in terms of their design, methodology, and outcome measures to allow for meaningful comparison and analysis.

Therefore, systematic reviews are broader in scope and summarize the findings of all studies on a topic, while meta-analyses are more focused on producing a quantitative estimate of the effect size of an intervention across multiple studies that meet certain criteria. In some cases, a systematic review may be conducted without a meta-analysis if the studies are too diverse or the quality of the data is not sufficient to allow for statistical pooling.

Software Packages For Meta-Analysis

Meta-analysis can be done through software packages, including free and paid options. One of the most commonly used software packages for meta-analysis is RevMan by the Cochrane Collaboration.

Assessing the Quality of Meta-Analysis 

Assessing the quality of a meta-analysis involves evaluating the methods used to conduct the analysis and the quality of the studies included. Here are some key factors to consider:

  • Study selection: The studies included in the meta-analysis should be relevant to the research question and meet predetermined criteria for quality.
  • Search strategy: The search strategy should be comprehensive and transparent, including databases and search terms used to identify relevant studies.
  • Study quality assessment: The quality of included studies should be assessed using appropriate tools, and this assessment should be reported in the meta-analysis.
  • Data extraction: The data extraction process should be systematic and clearly reported, including any discrepancies that arose.
  • Analysis methods: The meta-analysis should use appropriate statistical methods to combine the results of the included studies, and these methods should be transparently reported.
  • Publication bias: The potential for publication bias should be assessed and reported in the meta-analysis, including any efforts to identify and include unpublished studies.
  • Interpretation of results: The results should be interpreted in the context of the study limitations and the overall quality of the evidence.
  • Sensitivity analysis: Sensitivity analysis should be conducted to evaluate the impact of study quality, inclusion criteria, and other factors on the overall results.

Overall, a high-quality meta-analysis should be transparent in its methods and should clearly report the limitations of the included studies and the overall quality of the evidence.


Examples of Meta-Analysis

  • Stanley, T.D. & Jarrell, S.B. (1989). “Meta-regression analysis: a quantitative method of literature surveys”. Journal of Economic Surveys, 3(2), 161–170.
  • Datta, D.K., Pinches, G.E. & Narayanan, V.K. (1992). “Factors influencing wealth creation from mergers and acquisitions: a meta-analysis”. Strategic Management Journal, 13, 67–84.
  • Glass, G. (1983). “Synthesising empirical research: meta-analysis”. In S.A. Ward & L.J. Reed (Eds.), Knowledge Structure and Use: Implications for Synthesis and Interpretation. Philadelphia: Temple University Press.
  • Wolf, F.M. (1986). Meta-Analysis: Quantitative Methods for Research Synthesis. Sage University Paper No. 59.
  • Hunter, J.E., Schmidt, F.L. & Jackson, G.B. (1982). Meta-Analysis: Cumulating Research Findings Across Studies. Beverly Hills, CA: Sage.

Frequently Asked Questions

What is a meta-analysis in research?

Meta-analysis is a statistical method used to combine results from multiple studies on a specific topic. By pooling data from various sources, meta-analysis can provide a more precise estimate of the effect size of a treatment or intervention and identify areas for future research.

Why is meta-analysis important?

Meta-analysis is important because it combines and summarizes results from multiple studies to provide a more precise and reliable estimate of the effect of a treatment or intervention. This helps clinicians and policymakers make evidence-based decisions and identify areas for further research.

What is an example of a meta-analysis?

An example is a meta-analysis of studies evaluating the effect of physical exercise on depression in adults. Researchers gathered data from 49 studies involving a total of 2669 participants. The studies used different types of exercise and measures of depression, which made it difficult to compare the results.

Through meta-analysis, the researchers calculated an overall effect size and determined that exercise was associated with a statistically significant reduction in depression symptoms. The study also identified that moderate-intensity aerobic exercise, performed three to five times per week, was the most effective. The meta-analysis provided a more comprehensive understanding of the impact of exercise on depression than any single study could provide.

What is the definition of meta-analysis in clinical research?

Meta-analysis in clinical research is a statistical technique that combines data from multiple independent studies on a particular topic to generate a summary or “meta” estimate of the effect of a particular intervention or exposure.

This type of analysis allows researchers to synthesise the results of multiple studies, potentially increasing the statistical power and providing more precise estimates of treatment effects. Meta-analyses are commonly used in clinical research to evaluate the effectiveness and safety of medical interventions and to inform clinical practice guidelines.

Is meta-analysis qualitative or quantitative?

Meta-analysis is a quantitative method used to combine and analyze data from multiple studies. It involves the statistical synthesis of results from individual studies to obtain a pooled estimate of the effect size of a particular intervention or treatment. Therefore, meta-analysis is considered a quantitative approach to research synthesis.


What is Meta-Analysis? Definition, Research & Examples

Appinio Research · 01.02.2024


Are you looking to harness the power of data and uncover meaningful insights from a multitude of research studies? In a world overflowing with information, meta-analysis emerges as a guiding light, offering a systematic and quantitative approach to distilling knowledge from a sea of research.

This guide will demystify the art and science of meta-analysis, walking you through the process, from defining your research question to interpreting the results. Whether you're an academic researcher, a policymaker, or a curious mind eager to explore the depths of data, this guide will equip you with the tools and understanding needed to undertake robust and impactful meta-analyses.

What is a Meta-Analysis?

Meta-analysis is a quantitative research method that involves the systematic synthesis and statistical analysis of data from multiple individual studies on a particular topic or research question. It aims to provide a comprehensive and robust summary of existing evidence by pooling the results of these studies, often leading to more precise and generalizable conclusions.

The primary purpose of meta-analysis is to:

  • Quantify Effect Sizes:  Determine the magnitude and direction of an effect or relationship across studies.
  • Evaluate Consistency:  Assess the consistency of findings among studies and identify sources of heterogeneity.
  • Enhance Statistical Power:  Increase the statistical power to detect significant effects by combining data from multiple studies.
  • Generalize Results:  Provide more generalizable results by analyzing a more extensive and diverse sample of participants or contexts.
  • Examine Subgroup Effects:  Explore whether the effect varies across different subgroups or study characteristics.

Importance of Meta-Analysis

Meta-analysis plays a crucial role in scientific research and evidence-based decision-making. Here are key reasons why meta-analysis is highly valuable:

  • Enhanced Precision:  By pooling data from multiple studies, meta-analysis provides a more precise estimate of the effect size, reducing the impact of random variation.
  • Increased Statistical Power:  The combination of numerous studies enhances statistical power, making it easier to detect small but meaningful effects.
  • Resolution of Inconsistencies:  Meta-analysis can help resolve conflicting findings in the literature by systematically analyzing and synthesizing evidence.
  • Identification of Moderators:  It allows for the identification of factors that may moderate the effect, helping to understand when and for whom interventions or treatments are most effective.
  • Evidence-Based Decision-Making:  Policymakers, clinicians, and researchers use meta-analysis to inform evidence-based decision-making, leading to more informed choices in healthcare , education, and other fields.
  • Efficient Use of Resources:  Meta-analysis can guide future research by identifying gaps in knowledge, reducing duplication of efforts, and directing resources to areas with the most significant potential impact.

Types of Research Questions Addressed

Meta-analysis can address a wide range of research questions across various disciplines. Some common types of research questions that meta-analysis can tackle include:

  • Treatment Efficacy:  Does a specific medical treatment, therapy, or intervention have a significant impact on patient outcomes or symptoms?
  • Intervention Effectiveness:  How effective are educational programs, training methods, or interventions in improving learning outcomes or skills?
  • Risk Factors and Associations:  What are the associations between specific risk factors, such as smoking or diet, and the likelihood of developing certain diseases or conditions?
  • Impact of Policies:  What is the effect of government policies, regulations, or interventions on social, economic, or environmental outcomes?
  • Psychological Constructs:  How do psychological constructs, such as self-esteem, anxiety, or motivation, influence behavior or mental health outcomes?
  • Comparative Effectiveness:  Which of two or more competing interventions or treatments is more effective for a particular condition or population?
  • Dose-Response Relationships:  Is there a dose-response relationship between exposure to a substance or treatment and the likelihood or severity of an outcome?

Meta-analysis is a versatile tool that can provide valuable insights into a wide array of research questions, making it an indispensable method in evidence synthesis and knowledge advancement.

Meta-Analysis vs. Systematic Review

In evidence synthesis and research aggregation, meta-analysis and systematic reviews are two commonly used methods, each serving distinct purposes while sharing some similarities. Let's explore the differences and similarities between these two approaches.

Meta-Analysis

  • Purpose:  Meta-analysis is a statistical technique used to combine and analyze quantitative data from multiple individual studies that address the same research question. The primary aim of meta-analysis is to provide a single summary effect size that quantifies the magnitude and direction of an effect or relationship across studies.
  • Data Synthesis:  In meta-analysis, researchers extract and analyze numerical data, such as means, standard deviations, correlation coefficients, or odds ratios, from each study. These effect size estimates are then combined using statistical methods to generate an overall effect size and associated confidence interval.
  • Quantitative:  Meta-analysis is inherently quantitative, focusing on numerical data and statistical analyses to derive a single effect size estimate.
  • Main Outcome:  The main outcome of a meta-analysis is the summary effect size, which provides a quantitative estimate of the research question's answer.

Systematic Review

  • Purpose:  A systematic review is a comprehensive and structured overview of the available evidence on a specific research question. While systematic reviews may include meta-analysis, their primary goal is to provide a thorough and unbiased summary of the existing literature.
  • Data Synthesis:  Systematic reviews involve a meticulous process of literature search, study selection, data extraction, and quality assessment. Researchers may narratively synthesize the findings, providing a qualitative summary of the evidence.
  • Qualitative:  Systematic reviews are often qualitative in nature, summarizing and synthesizing findings in a narrative format. They do not always involve statistical analysis .
  • Main Outcome:  The primary outcome of a systematic review is a comprehensive narrative summary of the existing evidence. While some systematic reviews include meta-analyses, not all do so.

Key Differences

  • Nature of Data:  Meta-analysis primarily deals with quantitative data and statistical analysis , while systematic reviews encompass both quantitative and qualitative data, often presenting findings in a narrative format.
  • Focus on Effect Size:  Meta-analysis focuses on deriving a single, quantitative effect size estimate, whereas systematic reviews emphasize providing a comprehensive overview of the literature, including study characteristics, methodologies, and key findings.
  • Synthesis Approach:  Meta-analysis is a quantitative synthesis method, while systematic reviews may use both quantitative and qualitative synthesis approaches.

Commonalities

  • Structured Process:  Both meta-analyses and systematic reviews follow a structured and systematic process for literature search, study selection, data extraction, and quality assessment.
  • Evidence-Based:  Both approaches aim to provide evidence-based answers to specific research questions, offering valuable insights for decision-making in various fields.
  • Transparency:  Both meta-analyses and systematic reviews prioritize transparency and rigor in their methodologies to minimize bias and enhance the reliability of their findings.

While meta-analysis and systematic reviews share the overarching goal of synthesizing research evidence, they differ in their approach and main outcomes. Meta-analysis is quantitative, focusing on effect sizes, while systematic reviews provide comprehensive overviews, utilizing both quantitative and qualitative data to summarize the literature. Depending on the research question and available data, one or both of these methods may be employed to provide valuable insights for evidence-based decision-making.

How to Conduct a Meta-Analysis?

Planning a meta-analysis is a critical phase that lays the groundwork for a successful and meaningful study. We will explore each component of the planning process in more detail, ensuring you have a solid foundation before diving into data analysis.

How to Formulate Research Questions?

Your research questions are the guiding compass of your meta-analysis. They should be precise and tailored to the topic you're investigating. To craft effective research questions:

  • Clearly Define the Problem:  Start by identifying the specific problem or topic you want to address through meta-analysis.
  • Specify Key Variables:  Determine the essential variables or factors you'll examine in the included studies.
  • Frame Hypotheses:  If applicable, create clear hypotheses that your meta-analysis will test.

For example, if you're studying the impact of a specific intervention on patient outcomes, your research question might be: "What is the effect of Intervention X on Patient Outcome Y in published clinical trials?"

Eligibility Criteria

Eligibility criteria define the boundaries of your meta-analysis. By establishing clear criteria, you ensure that the studies you include are relevant and contribute to your research objectives. Key considerations for eligibility criteria include:

  • Study Types:  Decide which types of studies will be considered (e.g., randomized controlled trials, cohort studies, case-control studies).
  • Publication Time Frame:  Specify the publication date range for included studies.
  • Language:  Determine whether studies in languages other than your primary language will be included.
  • Geographic Region:  If relevant, define any geographic restrictions.

Your eligibility criteria should strike a balance between inclusivity and relevance. Excluding certain studies based on valid criteria ensures the quality and relevance of the data you analyze.

Search Strategy

A robust search strategy is fundamental to identifying all relevant studies. To create an effective search strategy:

  • Select Databases:  Choose appropriate databases that cover your research area (e.g., PubMed, Scopus, Web of Science).
  • Keywords and Search Terms:  Develop a comprehensive list of relevant keywords and search terms related to your research questions.
  • Search Filters:  Utilize search filters and Boolean operators (AND, OR) to refine your search queries.
  • Manual Searches:  Consider conducting hand-searches of key journals and reviewing the reference lists of relevant studies for additional sources.

Remember that the goal is to cast a wide net while maintaining precision to capture all relevant studies.
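As a concrete illustration, a Boolean query can be scripted and sent to PubMed directly from R. This is a minimal sketch using the rentrez package; the terms, field tags, and date range are placeholders to adapt to your own topic.

library(rentrez)  # client for NCBI's E-utilities; assumes the package is installed

# Hypothetical query: Intervention X and Outcome Y, 2015-2024
query <- paste(
  '("intervention X"[Title/Abstract] OR "treatment X"[Title/Abstract])',
  'AND "outcome Y"[Title/Abstract]',
  'AND ("2015"[PDAT] : "2024"[PDAT])'
)

hits <- entrez_search(db = "pubmed", term = query, retmax = 500)
hits$count  # total number of matching records
hits$ids    # PubMed IDs of the records to screen

The same query string can be pasted into PubMed's web interface; scripting it simply makes the search reproducible and easy to re-run before submission.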

Data Extraction

Data extraction is the process of systematically collecting information from each selected study. It involves retrieving key data points, including:

  • Study Characteristics:  Author(s), publication year, study design, sample size, duration, and location.
  • Outcome Data:  Effect sizes, standard errors, confidence intervals, p-values, and any other relevant statistics.
  • Methodological Details:  Information on study quality, risk of bias, and potential sources of heterogeneity.

Creating a standardized data extraction form is essential to ensure consistency and accuracy throughout this phase. Spreadsheet software, such as Microsoft Excel, is commonly used for data extraction.
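If you prefer a scripted workflow over a spreadsheet, the same form can be defined once in R so that every study is recorded with identical fields. The column names below are illustrative only:

# One row is appended per study (or per effect size) during extraction
extraction_form <- data.frame(
  study_id     = character(),  # e.g., "Smith2020"
  year         = integer(),
  design       = character(),  # e.g., "RCT", "cohort"
  n_trt        = integer(),    # sample size, treatment arm
  n_ctl        = integer(),    # sample size, control arm
  mean_trt     = numeric(),    # outcome summaries used later for effect sizes
  sd_trt       = numeric(),
  mean_ctl     = numeric(),
  sd_ctl       = numeric(),
  risk_of_bias = character()   # e.g., "low", "some concerns", "high"
)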

Quality Assessment

Assessing the quality of included studies is crucial to determine their reliability and potential impact on your meta-analysis. Various quality assessment tools and checklists are available, depending on the study design. Some commonly used tools include:

  • Newcastle-Ottawa Scale:  Used for assessing the quality of non-randomized studies (e.g., cohort, case-control studies).
  • Cochrane Risk of Bias Tool:  Designed for evaluating randomized controlled trials.

Quality assessment typically involves evaluating aspects such as study design, sample size, data collection methods, and potential biases. This step helps you weigh the contribution of each study to the overall analysis.

How to Conduct a Literature Review?

Conducting a thorough literature review is a critical step in the meta-analysis process. We will explore the essential components of a literature review, from designing a comprehensive search strategy to establishing clear inclusion and exclusion criteria and, finally, the study selection process.

Comprehensive Search

To ensure the success of your meta-analysis, it's imperative to cast a wide net when searching for relevant studies. A comprehensive search strategy involves:

  • Selecting Relevant Databases:  Identify databases that cover your research area comprehensively, such as PubMed, Scopus, Web of Science, or specialized databases specific to your field.
  • Creating a Keyword List:  Develop a list of relevant keywords and search terms related to your research questions. Think broadly and consider synonyms, acronyms, and variations.
  • Using Boolean Operators:  Utilize Boolean operators (AND, OR) to combine keywords effectively and refine your search.
  • Applying Filters:  Employ search filters (e.g., publication date range, study type) to narrow down results based on your eligibility criteria.

Remember that the goal is to leave no relevant stone unturned, as missing key studies can introduce bias into your meta-analysis.

Inclusion and Exclusion Criteria

Clearly defined inclusion and exclusion criteria are the gatekeepers of your meta-analysis. These criteria ensure that the studies you include meet your research objectives and maintain the quality of your analysis. Consider the following factors when establishing criteria:

  • Study Types:  Determine which types of studies are eligible for inclusion (e.g., randomized controlled trials, observational studies, case reports).
  • Publication Time Frame:  Specify the time frame within which studies must have been published.
  • Language:  Decide whether studies in languages other than your primary language will be included or excluded.
  • Geographic Region:  If applicable, define any geographic restrictions.
  • Relevance to Research Questions:  Ensure that selected studies align with your research questions and objectives.

Your inclusion and exclusion criteria should strike a balance between inclusivity and relevance. Rigorous criteria help maintain the quality and applicability of the studies included in your meta-analysis.

Study Selection Process

The study selection process involves systematically screening and evaluating each potential study to determine whether it meets your predefined inclusion criteria. Here's a step-by-step guide:

  • Screen Titles and Abstracts:  Begin by reviewing the titles and abstracts of the retrieved studies. Exclude studies that clearly do not meet your inclusion criteria.
  • Full-Text Assessment:  Assess the full text of potentially relevant studies to confirm their eligibility. Pay attention to study design, sample size, and other specific criteria.
  • Data Extraction:  For studies that meet your criteria, extract the necessary data, including study characteristics, effect sizes, and other relevant information.
  • Record Exclusions:  Keep a record of the reasons for excluding studies. This transparency is crucial for the reproducibility of your meta-analysis.
  • Resolve Discrepancies:  If multiple reviewers are involved, resolve any disagreements through discussion or a third-party arbitrator.

Maintaining a clear and organized record of your study selection process is essential for transparency and reproducibility. Software tools like EndNote or Covidence can facilitate the screening and data extraction process.

By following these systematic steps in conducting a literature review, you ensure that your meta-analysis is built on a solid foundation of relevant and high-quality studies.

Data Extraction and Management

As you progress in your meta-analysis journey, the data extraction and management phase becomes paramount. We will delve deeper into the critical aspects of this phase, including the data collection process, data coding and transformation, and how to handle missing data effectively.

Data Collection Process

The data collection process is the heart of your meta-analysis, where you systematically extract essential information from each selected study. To ensure accuracy and consistency:

  • Create a Data Extraction Form:  Develop a standardized data extraction form that includes all the necessary fields for collecting relevant data. This form should align with your research questions and inclusion criteria.
  • Data Extractors:  Assign one or more reviewers to extract data from the selected studies. Ensure they are familiar with the form and the specific data points to collect.
  • Double-Check Accuracy:  Implement a verification process where a second reviewer cross-checks a random sample of data extractions to identify discrepancies or errors.
  • Extract All Relevant Information:  Collect data on study characteristics, participant demographics, outcome measures, effect sizes, confidence intervals, and any additional information required for your analysis.
  • Maintain Consistency:  Use clear guidelines and definitions for data extraction to ensure uniformity across studies.

Data Coding and Transformation

After data collection, you may need to code and transform the extracted data to ensure uniformity and compatibility across studies. This process involves:

  • Coding Categorical Variables:  If studies report data differently, code categorical variables to a single scheme. For example, ensure that categories like "male" and "female" are coded the same way across studies.
  • Standardizing Units of Measurement:  Convert all measurements to a common unit if studies use different measurement units. For instance, if one study reports height in inches and another in centimeters, standardize to one unit for comparability.
  • Calculating Effect Sizes:  Calculate effect sizes and their standard errors or variances if they are not directly reported in the studies. Common effect size measures include Cohen's d, odds ratio (OR), and hazard ratio (HR).
  • Data Transformation:  Transform data if necessary to meet assumptions of statistical tests. Common transformations include log transformation for skewed data or arcsine transformation for proportions.
  • Variance Stabilization:  Consider variance-stabilizing transformations where appropriate, such as the Freeman-Tukey double arcsine transformation for proportions, to make sampling variances comparable across studies.

The goal of data coding and transformation is to make sure that data from different studies are compatible and can be effectively synthesized during the analysis phase. Spreadsheet software like Excel or statistical software like R can be used for these tasks.
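As a sketch of how this looks in practice, the metafor package in R computes common effect sizes and their sampling variances directly from extracted summary statistics; the data frame and its values here are invented for illustration:

library(metafor)

dat <- data.frame(
  study    = c("A", "B", "C"),
  mean_trt = c(12.1, 10.4, 15.3), sd_trt = c(3.1, 2.8, 4.0), n_trt = c(40, 55, 30),
  mean_ctl = c(10.2,  9.9, 13.8), sd_ctl = c(3.4, 3.0, 3.7), n_ctl = c(38, 60, 29)
)

# Standardized mean difference (bias-corrected, i.e., Hedges' g) with variance:
dat <- escalc(measure = "SMD", m1i = mean_trt, sd1i = sd_trt, n1i = n_trt,
              m2i = mean_ctl, sd2i = sd_ctl, n2i = n_ctl, data = dat)
dat$yi  # effect sizes
dat$vi  # sampling variances

# For a ratio-of-means question, measure = "ROM" gives the log response ratio instead.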

Handling Missing Data

Missing data is a common challenge in meta-analysis, and how you handle it can impact the validity and precision of your results. Strategies for handling missing data include:

  • Contact Authors:  If feasible, contact the authors of the original studies to request missing data or clarifications.
  • Imputation:  Consider using appropriate imputation methods to estimate missing values, but exercise caution and report the imputation methods used.
  • Sensitivity Analysis:  Conduct sensitivity analyses to assess the impact of missing data on your results by comparing the main analysis to alternative scenarios.

Remember that transparency in reporting how you handled missing data is crucial for the credibility of your meta-analysis.

By following these steps in data extraction and management, you will ensure the integrity and reliability of your meta-analysis dataset.

Meta-Analysis Example

Meta-analysis is a versatile research method that can be applied to various fields and disciplines, providing valuable insights by synthesizing existing evidence.

Example: Analyzing the Impact of Advertising Campaigns on Sales

Background:  A market research agency is tasked with assessing the effectiveness of advertising campaigns on sales outcomes for a range of consumer products. They have access to multiple studies and reports conducted by different companies, each analyzing the impact of advertising on sales revenue.

Meta-Analysis Approach:

  • Study Selection:  Identify relevant studies that meet specific inclusion criteria, such as the type of advertising campaign (e.g., TV commercials, social media ads), the products examined, and the sales metrics assessed.
  • Data Extraction:  Collect data from each study, including details about the advertising campaign (e.g., budget, duration), sales data (e.g., revenue, units sold), and any reported effect sizes or correlations.
  • Effect Size Calculation:  Calculate effect sizes (e.g., correlation coefficients) based on the data provided in each study, quantifying the strength and direction of the relationship between advertising and sales.
  • Data Synthesis:  Employ meta-analysis techniques to combine the effect sizes from the selected studies. Compute a summary effect size and its confidence interval to estimate the overall impact of advertising on sales.
  • Publication Bias Assessment:  Use funnel plots and statistical tests to assess the potential presence of publication bias, ensuring that the meta-analysis results are not unduly influenced by selective reporting.

Findings:  Through meta-analysis, the market research agency discovers that advertising campaigns have a statistically significant and positive impact on sales across various product categories. The findings provide evidence for the effectiveness of advertising efforts and assist companies in making data-driven decisions regarding their marketing strategies.
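A minimal sketch of the synthesis step in R, assuming the agency extracted a correlation coefficient and sample size from each report (all values invented):

library(metafor)

ads <- data.frame(report = c("A", "B", "C", "D"),
                  ri = c(0.32, 0.18, 0.41, 0.25),  # ad-sales correlations
                  ni = c(120, 300, 85, 210))       # sample sizes

ads <- escalc(measure = "ZCOR", ri = ri, ni = ni, data = ads)  # Fisher's z
res <- rma(yi, vi, data = ads)            # random-effects meta-analysis
predict(res, transf = transf.ztor)        # summary effect back-transformed to r
funnel(res)                               # visual check for publication bias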

This example illustrates how meta-analysis can be applied outside clinical research, such as a market research agency evaluating the impact of advertising campaigns. By systematically synthesizing existing evidence, meta-analysis empowers decision-makers with valuable insights for informed choices and evidence-based strategies.

How to Assess Study Quality and Bias?

Ensuring the quality and reliability of the studies included in your meta-analysis is essential for drawing accurate conclusions. We'll show you how you can assess study quality using specific tools, evaluate potential bias, and address publication bias.

Quality Assessment Tools

Quality assessment tools provide structured frameworks for evaluating the methodological rigor of each included study. The choice of tool depends on the study design. Here are some commonly used quality assessment tools:

For Randomized Controlled Trials (RCTs):

  • Cochrane Risk of Bias Tool:  This tool assesses the risk of bias in RCTs based on six domains: random sequence generation, allocation concealment, blinding of participants and personnel, blinding of outcome assessment, incomplete outcome data, and selective reporting.
  • Jadad Scale:  A simpler tool specifically for RCTs, the Jadad Scale focuses on randomization, blinding, and the handling of withdrawals and dropouts.

For Observational Studies:

  • Newcastle-Ottawa Scale (NOS):  The NOS assesses the quality of cohort and case-control studies based on three categories: selection, comparability, and outcome.
  • ROBINS-I:  Designed for non-randomized studies of interventions, the Risk of Bias in Non-randomized Studies of Interventions tool evaluates bias in domains such as confounding, selection bias, and measurement bias.
  • MINORS:  The Methodological Index for Non-Randomized Studies (MINORS) assesses non-comparative studies and includes items related to study design, reporting, and statistical analysis.

Bias Assessment

Evaluating potential sources of bias is crucial to understanding the limitations of the included studies. Common sources of bias include:

  • Selection Bias:  Occurs when the selection of participants is not random or representative of the target population.
  • Performance Bias:  Arises when participants or researchers are aware of the treatment or intervention status, potentially influencing outcomes.
  • Detection Bias:  Occurs when outcome assessors are not blinded to the treatment groups.
  • Attrition Bias:  Results from incomplete data or differential loss to follow-up between treatment groups.
  • Reporting Bias:  Involves selective reporting of outcomes, where only positive or statistically significant results are published.

To assess bias, reviewers often use the quality assessment tools mentioned earlier, which include domains related to bias, or they may specifically address bias concerns in the narrative synthesis.

We'll move on to the core of meta-analysis: data synthesis. We'll explore different effect size measures, fixed-effect versus random-effects models, and techniques for assessing and addressing heterogeneity among studies.

Data Synthesis

Now that you've gathered data from multiple studies and assessed their quality, it's time to synthesize this information effectively.

Effect Size Measures

Effect size measures quantify the magnitude of the relationship or difference you're investigating in your meta-analysis. The choice of effect size measure depends on your research question and the type of data provided by the included studies. Here are some commonly used effect size measures:

Continuous Outcome Data:

  • Cohen's d:  Measures the standardized mean difference between two groups. It's suitable for continuous outcome variables.
  • Hedges' g:  Similar to Cohen's d but incorporates a correction factor for small sample sizes.

Binary Outcome Data:

  • Odds Ratio (OR):  Used for dichotomous outcomes, such as success/failure or presence/absence.
  • Risk Ratio (RR):  The ratio of event probabilities between two groups; often preferred when the outcome is common, since the OR only approximates the RR for rare outcomes.
  • Risk Difference (RD):  Measures the absolute difference in event rates between two groups.

Time-to-Event Data:

  • Hazard Ratio (HR):  Used in survival analysis to assess the risk of an event occurring over time.

Selecting the appropriate effect size measure depends on the nature of your data and the research question. When effect sizes are not directly reported in the studies, you may need to calculate them using available data, such as means, standard deviations, and sample sizes.

Formula for Cohen's d:

d = (Mean of Group A - Mean of Group B) / Pooled Standard Deviation
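Translated into a small R function (a sketch; the numbers in the example call are made up):

cohens_d <- function(m1, sd1, n1, m2, sd2, n2) {
  # pooled standard deviation across the two groups
  sp <- sqrt(((n1 - 1) * sd1^2 + (n2 - 1) * sd2^2) / (n1 + n2 - 2))
  (m1 - m2) / sp
}

cohens_d(m1 = 25.1, sd1 = 4.2, n1 = 40, m2 = 22.3, sd2 = 4.8, n2 = 38)
# ~ 0.62, a moderate standardized difference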

Fixed-Effect vs. Random-Effects Models

In meta-analysis, you can choose between fixed-effect and random-effects models to combine the results of individual studies:

Fixed-Effect Model:

  • Assumes that all included studies share a common true effect size.
  • Accounts only for within-study variability (sampling error).
  • Appropriate when studies are very similar or when there's minimal heterogeneity.

Random-Effects Model:

  • Acknowledges that there may be variability in effect sizes across studies.
  • Accounts for both within-study variability (sampling error) and between-study variability (real differences between studies).
  • More conservative and applicable when there's substantial heterogeneity.

The choice between these models should be guided by the degree of heterogeneity observed among the included studies. If heterogeneity is significant, the random-effects model is often preferred, as it provides a more robust estimate of the overall effect.
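In metafor, the two models differ only in the method argument. A sketch, assuming dat already holds effect sizes yi and sampling variances vi (as produced by escalc() earlier):

library(metafor)

fe <- rma(yi, vi, data = dat, method = "EE")    # equal-effects (fixed-/common-effect) model
re <- rma(yi, vi, data = dat, method = "REML")  # random-effects model; tau^2 via REML

summary(re)  # pooled estimate, tau^2, Q test, and I^2 in one output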

Forest Plots

Forest plots are graphical representations commonly used in meta-analysis to display the results of individual studies along with the combined summary estimate. Key components of a forest plot include:

  • Vertical Line:  Represents the null effect (e.g., no difference or no effect).
  • Squares and Horizontal Lines:  Each square marks an individual study's effect size estimate (often sized by the study's weight), and the horizontal line through it represents that study's confidence interval.
  • Diamond:  Represents the summary effect size estimate, with its width indicating the confidence interval around the summary estimate.
  • Study Names:  Listed on the left side of the plot, identifying each study.

Forest plots help visualize the distribution of effect sizes across studies and provide insights into the consistency and direction of the findings.
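With a fitted metafor model, the forest plot is a single call; addpred = TRUE adds the 95% prediction interval beneath the summary diamond (continuing the hypothetical re model from above):

forest(re, addpred = TRUE, slab = dat$study, header = "Study")
predict(re)  # the same numbers in tabular form: estimate, 95% CI, 95% PI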

Heterogeneity Assessment

Heterogeneity refers to the variability in effect sizes among the included studies. It's important to assess and understand heterogeneity as it can impact the interpretation of your meta-analysis results. Standard methods for assessing heterogeneity include:

  • Cochran's Q Test:  A statistical test that assesses whether there is significant heterogeneity among the effect sizes of the included studies.
  • I² Statistic:  A measure that quantifies the proportion of total variation in effect sizes that is due to heterogeneity. I² values range from 0% to 100%, with higher values indicating greater heterogeneity.

Assessing heterogeneity is crucial because it informs your choice of meta-analysis model (fixed-effect vs. random-effects) and whether subgroup analyses or sensitivity analyses are warranted to explore potential sources of heterogeneity.
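Both statistics are computed automatically by metafor and can be extracted from the fitted object (again the hypothetical re model):

re$QE    # Cochran's Q statistic
re$QEp   # p-value of the Q test
re$I2    # I^2 as a percentage
re$tau2  # absolute between-study variance
confint(re)  # confidence intervals for tau^2, I^2, and H^2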

How to Interpret Meta-Analysis Results?

With the data synthesis complete, it's time to make sense of the results of your meta-analysis.

Meta-Analytic Summary

The meta-analytic summary is the culmination of your efforts in data synthesis. It provides a consolidated estimate of the effect size and its confidence interval, combining the results of all included studies. To interpret the meta-analytic summary effectively:

  • Effect Size Estimate:  Understand the primary effect size estimate, such as Cohen's d, odds ratio, or hazard ratio, and its associated confidence interval.
  • Significance:  Determine whether the summary effect size is statistically significant. This is indicated when the confidence interval does not include the null value (e.g., 0 for Cohen's d or 1 for odds ratio).
  • Magnitude:  Assess the magnitude of the effect size. Is it large, moderate, or small, and what are the practical implications of this magnitude?
  • Direction:  Consider the direction of the effect. Is it in the hypothesized direction, or does it contradict the expected outcome?
  • Clinical or Practical Significance:  Reflect on the clinical or practical significance of the findings. Does the effect size have real-world implications?
  • Consistency:  Evaluate the consistency of the findings across studies. Are most studies in agreement with the summary effect size estimate, or are there outliers?

Subgroup Analyses

Subgroup analyses allow you to explore whether the effect size varies across different subgroups of studies or participants. This can help identify potential sources of heterogeneity or assess whether the intervention's effect differs based on specific characteristics. Steps for conducting subgroup analyses:

  • Define Subgroups:  Clearly define the subgroups you want to investigate based on relevant study characteristics (e.g., age groups, study design, intervention type).
  • Analyze Subgroups:  Calculate separate summary effect sizes for each subgroup and compare them to the overall summary effect.
  • Assess Heterogeneity:  Evaluate whether subgroup differences are statistically significant. If so, this suggests that the effect size varies significantly among subgroups.
  • Interpretation:  Interpret the subgroup findings in the context of your research question. Are there meaningful differences in the effect across subgroups? What might explain these differences?

Subgroup analyses can provide valuable insights into the factors influencing the overall effect size and help tailor recommendations for specific populations or conditions.
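Rather than running separate meta-analyses per subgroup, the subgroup can be entered as a categorical moderator, which also yields a formal test of subgroup differences. A sketch with a hypothetical design column in dat:

sub <- rma(yi, vi, mods = ~ factor(design), data = dat)

sub$QM   # omnibus moderator test: do subgroups differ?
sub$QMp  # its p-value
summary(sub)  # subgroup contrasts relative to the reference level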

Sensitivity Analyses

Sensitivity analyses are conducted to assess the robustness of your meta-analysis results by exploring how different choices or assumptions might affect the findings. Common sensitivity analyses include:

  • Exclusion of Low-Quality Studies:  Repeating the meta-analysis after excluding studies with low quality or a high risk of bias.
  • Changing Effect Size Measure:  Re-running the analysis using a different effect size measure to assess whether the choice of measure significantly impacts the results.
  • Publication Bias Adjustment:  Applying methods like the trim-and-fill procedure to adjust for potential publication bias.
  • Subsample Analysis:  Analyzing a subset of studies based on specific criteria or characteristics to investigate their impact on the summary effect.

Sensitivity analyses help assess the robustness and reliability of your meta-analysis results, providing a more comprehensive understanding of the potential influence of various factors.
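Several of these checks are one-liners in metafor (a sketch continuing with re; note that trim-and-fill applies to models without moderators):

leave1out(re)        # summary effect re-estimated omitting one study at a time
influence(re)        # influence diagnostics flagging unduly influential studies

taf <- trimfill(re)  # trim-and-fill adjustment for funnel-plot asymmetry
taf                  # adjusted summary estimate with imputed studies
funnel(taf)          # funnel plot showing the imputed (filled) studies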

Reporting and Publication

The final stages of your meta-analysis involve preparing your findings for publication.

Manuscript Preparation

When preparing your meta-analysis manuscript, consider the following:

  • Structured Format:  Organize your manuscript following a structured format, including sections such as introduction, methods, results, discussion, and conclusions.
  • Clarity and Conciseness:  Write your findings clearly and concisely, avoiding jargon or overly technical language. Use tables and figures to enhance clarity.
  • Transparent Methods:  Provide detailed descriptions of your methods, including eligibility criteria, search strategy, data extraction, and statistical analysis.
  • Incorporate Tables and Figures:  Present your meta-analysis results using tables and forest plots to visually convey key findings.
  • Interpretation:  Interpret the implications of your findings, discussing the clinical or practical significance and limitations.

Transparent Reporting Guidelines

Adhering to transparent reporting guidelines ensures that your meta-analysis is transparent, reproducible, and credible. Some widely recognized guidelines include:

  • PRISMA (Preferred Reporting Items for Systematic Reviews and Meta-Analyses):  PRISMA provides a checklist and flow diagram for reporting systematic reviews and meta-analyses, enhancing transparency and rigor.
  • MOOSE (Meta-analysis of Observational Studies in Epidemiology):  MOOSE guidelines are designed for meta-analyses of observational studies and provide a framework for transparent reporting.
  • ROBINS-I:  If your meta-analysis involves non-randomized studies, use the Risk Of Bias In Non-randomized Studies of Interventions (ROBINS-I) tool to assess and transparently report risk of bias; strictly speaking, it is a bias assessment tool rather than a reporting guideline.

Adhering to these guidelines enhances the quality of your research and aids readers and reviewers in assessing the rigor of your study.

PRISMA Statement

The PRISMA statement is a valuable resource for conducting and reporting systematic reviews and meta-analyses. Key elements of PRISMA include:

  • Title:  Clearly indicate that your paper is a systematic review or meta-analysis.
  • Structured Abstract:  Provide a structured summary of your study, including objectives, methods, results, and conclusions.
  • Transparent Reporting:  Follow the PRISMA checklist, which covers items such as the rationale, eligibility criteria, search strategy, data extraction, and risk of bias assessment.
  • Flow Diagram:  Include a flow diagram illustrating the study selection process.

By adhering to the PRISMA statement, you enhance the transparency and credibility of your meta-analysis, facilitating its acceptance for publication and aiding readers in evaluating the quality of your research.

Conclusion for Meta-Analysis

Meta-analysis is a powerful tool that allows you to combine and analyze data from multiple studies to find meaningful patterns and make informed decisions. It helps you see the bigger picture and draw more accurate conclusions than individual studies alone. Whether you're in healthcare, education, business, or any other field, the principles of meta-analysis can be applied to enhance your research and decision-making processes. Remember that conducting a successful meta-analysis requires careful planning, attention to detail, and transparency in reporting. By following the steps outlined in this guide, you can embark on your own meta-analysis journey with confidence, contributing to the advancement of knowledge and evidence-based practices in your area of interest.



Quantitative evidence synthesis: a practical guide on meta-analysis, meta-regression, and publication bias tests for environmental sciences

Shinichi Nakagawa, Yefeng Yang, Erin L. Macartney, Rebecca Spake & Malgorzata Lagisz

Environmental Evidence, volume 12, Article number: 8 (2023). Published 24 April 2023 (Methodology, Open Access).


Meta-analysis is a quantitative way of synthesizing results from multiple studies to obtain reliable evidence of an intervention or phenomenon. Indeed, an increasing number of meta-analyses are conducted in environmental sciences, and resulting meta-analytic evidence is often used in environmental policies and decision-making. We conducted a survey of recent meta-analyses in environmental sciences and found poor standards of current meta-analytic practice and reporting. For example, only ~ 40% of the 73 reviewed meta-analyses reported heterogeneity (variation among effect sizes beyond sampling error), and publication bias was assessed in fewer than half. Furthermore, although almost all the meta-analyses had multiple effect sizes originating from the same studies, non-independence among effect sizes was considered in only half of the meta-analyses. To improve the implementation of meta-analysis in environmental sciences, we here outline practical guidance for conducting a meta-analysis in environmental sciences. We describe the key concepts of effect size and meta-analysis and detail procedures for fitting multilevel meta-analysis and meta-regression models and performing associated publication bias tests. We demonstrate a clear need for environmental scientists to embrace multilevel meta-analytic models, which explicitly model dependence among effect sizes, rather than the commonly used random-effects models. Further, we discuss how reporting and visual presentations of meta-analytic results can be much improved by following reporting guidelines such as PRISMA-EcoEvo (Preferred Reporting Items for Systematic Reviews and Meta-Analyses for Ecology and Evolutionary Biology). This paper, along with the accompanying online tutorial, serves as a practical guide on conducting a complete set of meta-analytic procedures (i.e., meta-analysis, heterogeneity quantification, meta-regression, publication bias tests and sensitivity analysis) and also as a gateway to more advanced, yet appropriate, methods.

Evidence synthesis is an essential part of science. The method of systematic review provides the most trusted and unbiased way to achieve the synthesis of evidence [ 1 , 2 , 3 ]. Systematic reviews often include a quantitative summary of studies on the topic of interest, referred to as a meta-analysis (for discussion on the definitions of ‘meta-analysis’, see [ 4 ]). The term meta-analysis can also mean a set of statistical techniques for quantitative data synthesis. The methodologies of the meta-analysis were initially developed and applied in medical and social sciences. However, meta-analytic methods are now used in many other fields, including environmental sciences [ 5 , 6 , 7 ]. In environmental sciences, the outcomes of meta-analyses (within systematic reviews) have been used to inform environmental and related policies (see [ 8 ]). Therefore, the reliability of meta-analytic results in environmental sciences is important beyond mere academic interests; indeed, incorrect results could lead to ineffective or sometimes harmful environmental policies [ 8 ].

As in medical and social sciences, environmental scientists frequently use traditional meta-analytic models, namely fixed-effect and random-effects models [ 9 , 10 ]. However, we contend that such models in their original formulation are no longer useful and are often incorrectly used, leading to unreliable estimates and errors. This is mainly because the traditional models assume independence among effect sizes, yet almost all primary research papers include more than one effect size, and this non-independence is often not considered (e.g., [ 11 , 12 , 13 ]). Furthermore, previous reviews of published meta-analyses in environmental sciences (hereafter, 'environmental meta-analyses') have demonstrated that less than half report or investigate heterogeneity (inconsistency) among effect sizes [ 14 , 15 , 16 ]. Many environmental meta-analyses also do not present any sensitivity analysis, for example, for publication bias (i.e., statistically significant effects being more likely to be published, making collated data unreliable; [ 17 , 18 ]). These issues may have arisen for several reasons, for example, the absence of clear guidelines for conducting the statistical part of meta-analyses in environmental sciences, and the rapid development of meta-analytic methods. Taken together, the field urgently requires a practical guide for implementing correct meta-analyses and associated procedures (e.g., heterogeneity analysis, meta-regression, and publication bias tests; cf. [ 19 ]).

To assist environmental scientists in conducting meta-analyses, the aims of this paper are five-fold. First, we provide an overview of the processes involved in a meta-analysis while introducing some key concepts. Second, after introducing the main types of effect size measures, we mathematically describe the two commonly used traditional meta-analytic models, demonstrate their utility, and introduce a practical, multilevel meta-analytic model for environmental sciences that appropriately handles non-independence among effect sizes. Third, we show how to quantify heterogeneity (i.e., consistencies among effect sizes and/or studies) using this model, and then explain such heterogeneity using meta-regression. Fourth, we show how to test for publication bias in a meta-analysis and describe other common types of sensitivity analysis. Fifth, we cover other technical issues relevant to environmental sciences (e.g., scale and phylogenetic dependence) as well as some advanced meta-analytic techniques. In addition, these five aims (sections) are interspersed with two more sections, named ‘Notes’ on: (1) visualisation and interpretation; and (2) reporting and archiving. Some of these sections are accompanied by results from a survey of 73 environmental meta-analyses published between 2019 and 2021; survey results depict current practices and highlight associated problems (for the method of the survey, see Additional file 1 ). Importantly, we provide easy-to-follow implementations of much of what is described below, using the R package, metafor [ 20 ] and other R packages at the webpage ( https://itchyshin.github.io/Meta-analysis_tutorial/ ), which also connects the reader to the wealth of online information on meta-analysis (note that we also provide this tutorial as Additional file 2 ; see also [ 21 ]).

Overview with key concepts

Statistically speaking, we have three general objectives when conducting a meta-analysis [ 12 ]: (1) estimating an overall mean, (2) quantifying consistency (heterogeneity) between studies, and (3) explaining the heterogeneity (see Table 1 for the definitions of the terms in italics). A notable feature of a meta-analysis is that an overall mean is estimated by taking the sampling variance of each effect size into account: a study (effect size) with a low sampling variance (usually based on a larger sample size) is assigned more weight in estimating the overall mean than one with a high sampling variance (usually based on a smaller sample size). However, an overall mean estimate itself is often not informative, because one can obtain the same overall mean estimate in different ways. For example, we may get an overall estimate of zero if all studies have zero effects with no heterogeneity. In contrast, we might also obtain a zero mean across studies that have highly variable effects (e.g., ranging from strongly positive to strongly negative), signifying high heterogeneity. Therefore, quantifying indicators of heterogeneity is an essential part of a meta-analysis, necessary for interpreting the overall mean appropriately. Once we observe non-zero heterogeneity among effect sizes, our job is to explain this variation by running meta-regression models and, at the same time, to quantify how much variation is accounted for (often quantified as R²). In addition, it is important to conduct an extra set of analyses, often referred to as publication bias tests, which are a type of sensitivity analysis [ 11 ], to check the robustness of meta-analytic results.

Choosing an effect size measure

In this section, we introduce different kinds of 'effect size measures' or 'effect measures'. In the literature, the term 'effect size' typically refers to the magnitude or strength of an effect of interest or its biological interpretation (e.g., environmental significance). Effect sizes can be quantified using a range of measures (for details, see [ 22 ]). In our survey of environmental meta-analyses (Additional file 1 ), the two most commonly used effect size measures are the logarithm of the response ratio, lnRR ([ 23 ]; also known as the ratio of means; [ 24 ]), and the standardized mean difference, SMD (often referred to as Hedges' g or Cohen's d [ 25 , 26 ]). These are followed by proportion (%) and Fisher's z-transformation of correlation, Zr. These four effect measures fit neatly into three categories: (1) single-group measures (a statistical summary from one group; e.g., proportion), (2) comparative measures (comparing two groups; e.g., SMD and lnRR), and (3) association measures (relationships between two variables; e.g., Zr). Table 2 summarizes effect measures that are common or potentially useful for environmental scientists. It is important to note that any measure with a sampling variance can become an 'effect size'. The main reason why SMD, lnRR, Zr, and proportion are popular effect measures is that they are unitless, whereas a meta-analysis of means, or mean differences, can only be conducted when all effect sizes have the same unit (e.g., cm, kg).

Table 2 also includes effect measures that are likely to be unfamiliar to environmental scientists; these are effect sizes that characterise differences in the observed variability between samples (i.e., lnSD, lnCV, lnVR and lnCVR; [ 27 , 28 ]) rather than central tendencies (averages). These dispersion-based effect measures can provide us with extra insights alongside average-based effect measures. Although the literature survey showed none of these were used in our sample, these effect sizes have been used in many fields, including agriculture (e.g., [ 29 ]), ecology (e.g., [ 30 ]), evolutionary biology (e.g., [ 31 ]), psychology (e.g., [ 32 ]), education (e.g., [ 33 ]), psychiatry (e.g., [ 34 ]), and neuroscience (e.g., [ 35 ]). Perhaps it is not difficult to think of an environmental intervention that can affect not only the mean but also the variance of measurements taken on a group of individuals or a set of plots. For example, environmental stressors such as pesticides and eutrophication are likely to increase variability in biological systems because stress accentuates individual differences in environmental responses (e.g., [ 36 , 37 ]). Such ideas are yet to be tested meta-analytically (cf. [ 38 , 39 ]).

Choosing a meta-analytic model

Fixed-effect and random-effects models.

Two traditional meta-analytic models are called the 'fixed-effect' model and the 'random-effects' model. The former assumes that all effect sizes (from different studies) come from one population (i.e., they have one true overall mean), while the latter does not make this assumption (i.e., each study has a different overall mean, or heterogeneity exists among studies; see below for more). The fixed-effect model, which should probably be more correctly referred to as the 'common-effect' model, can be written as [ 9 , 10 , 40 ]:

\({z}_{j}={\beta }_{0}+{m}_{j}, \quad {m}_{j}\sim \mathcal{N}\left(0, {v}_{j}\right),\)  (Eq. 1)

where the intercept, \({\beta }_{0}\), is the overall mean, \({z}_{j}\) (the response/dependent variable) is the effect size from the j-th study (j = 1, 2, …, \({N}_{study}\); in this model, \({N}_{study}\) = the number of studies = the number of effect sizes), and \({m}_{j}\) is the sampling error, which is normally distributed with a mean of 0 and the 'study-specific' sampling variance, \({v}_{j}\) (see also Fig. 1A).

Fig. 1  Visualisation of the three statistical models of meta-analysis: (A) a fixed-effect model (1-level), (B) a random-effects model (2-level), and (C) a multilevel model (3-level); see the text for symbol definitions.

The overall mean needs to be estimated, and it is often estimated as the weighted average with weights \({w}_{j}=1/{v}_{j}\) (i.e., the inverse-variance approach). An important, but sometimes untenable, assumption of meta-analysis is that the sampling variance is known. In fact, we estimate sampling variances using formulas, as in Table 2, meaning that \({v}_{j}\) is substituted by a sampling variance estimate (see also section ' Scale dependence '). Of relevance, the use of the inverse-variance approach has recently been criticized, especially for SMD and lnRR [ 41 , 42 ], and we note that the inverse-variance approach using the formulas in Table 2 is one of several different weighting approaches used in meta-analysis (e.g., for adjusted sampling-variance weighting, see [ 43 , 44 ]; for sample-size-based weighting, see [ 41 , 42 , 45 , 46 ]). Importantly, the fixed-effect model assumes that the only source of variation in effect sizes (\({z}_{j}\)) is the effect due to sampling variance (which is inversely proportional to the sample size, n; Table 2).

Similarly, the random-effects model can be expressed as:

\({z}_{j}={\beta }_{0}+{u}_{j}+{m}_{j}, \quad {u}_{j}\sim \mathcal{N}\left(0, {\tau }^{2}\right), \quad {m}_{j}\sim \mathcal{N}\left(0, {v}_{j}\right),\)  (Eq. 2)

where \({u}_{j}\) is the j-th study effect, which is normally distributed with a mean of 0 and the between-study variance, \({\tau }^{2}\) (for different estimation methods, see [ 47 , 48 , 49 , 50 ]), and the other notations are the same as in Eq. 1 (Fig. 1B). Here, the overall mean can be estimated as the weighted average with weights \({w}_{j}=1/\left({\tau }^{2}+{v}_{j}\right)\) (note that the different weighting approaches mentioned above are applicable to the random-effects model, and some of them to the multilevel model introduced below). The model assumes each study has its own mean, \({\beta }_{0}+{u}_{j}\), and (in)consistencies among studies (effect sizes) are indicated by \({\tau }^{2}\). When \({\tau }^{2}\) is 0 (or not statistically different from 0), the random-effects model reduces to the fixed-effect model (cf. Eqs. 1 and 2). Given that no studies in environmental sciences are conducted in the same manner, or even at exactly the same place and time, we should expect different studies to have different means. Therefore, in almost all cases in the environmental sciences, the random-effects model is the more 'realistic' model [ 9 , 10 , 40 ]. Accordingly, most environmental meta-analyses in our survey (68.5%; 50 out of 73 studies) used the random-effects model, while only 2.7% (2 of 73 studies) used the fixed-effect model (Additional file 1 ).

Multilevel meta-analytic models

Although we have introduced the random-effects model as being more realistic than the fixed-effect model (Eq. 2), we argue that the random-effects model is rather limited and impractical for the environmental sciences. This is because random-effects models, like fixed-effect models, assume all effect sizes (\({z}_{j}\)) to be independent. However, when multiple effect sizes are obtained from a study, these effect sizes are dependent (for more details, see the next section on non-independence). Indeed, our survey showed that this type of non-independence among effect sizes occurred in almost all datasets used in environmental meta-analyses (97.3%; 71 out of 73 studies, with two studies being unclear, so effectively 100%; Additional file 1 ). Therefore, we propose the simplest and most practical meta-analytic model for environmental sciences as [ 13 , 40 ] (see also [ 51 , 52 ]):

\({z}_{i}={\beta }_{0}+{u}_{j\left[i\right]}+{e}_{i}+{m}_{i}, \quad {u}_{j}\sim \mathcal{N}\left(0, {\tau }^{2}\right), \quad {e}_{i}\sim \mathcal{N}\left(0, {\sigma }^{2}\right), \quad {m}_{i}\sim \mathcal{N}\left(0, {v}_{i}\right),\)  (Eq. 3)

where we explicitly recognize that \({N}_{effect}\) (i = 1, 2, …, \({N}_{effect}\)) > \({N}_{study}\) (j = 1, 2, …, \({N}_{study}\)) and, therefore, we now have the study effect (between-study effect), \({u}_{j\left[i\right]}\) (for the j-th study and i-th effect size), and the effect-size-level (within-study) effect, \({e}_{i}\) (for the i-th effect size), with the between-study variance, \({\tau }^{2}\), and within-study variance, \({\sigma }^{2}\), respectively; the other notations are the same as above. We note that this model (Eq. 3) is an extension of the random-effects model (Eq. 2), and we refer to it as the multilevel/hierarchical model (used in 7 out of 73 studies: 9.6% [Additional file 1 ]; note that Eq. 3 is also known as a three-level meta-analytic model; Fig. 1C). Environmental scientists who are familiar with (generalised) linear mixed models may recognize \({u}_{j}\) (the study effect) as the effect of a random factor associated with a variance component, \({\tau }^{2}\) [ 53 ]; likewise, \({e}_{i}\) and \({m}_{i}\) can be seen as parts of random factors, associated with \({\sigma }^{2}\) and \({v}_{i}\) (the former is comparable to the residuals, while the latter is the sampling variance specific to a given effect size).

It seems that many researchers are aware of the issue of non-independence, and so they often use the average effect size per study or choose one effect size per study (at least 28.8%, 21 out of 73 environmental meta-analyses; Additional file 1 ). However, as we have discussed elsewhere [ 13 , 40 ], such averaging or selection of one effect size per study dramatically reduces our ability to investigate environmental drivers of variation among effect sizes [ 13 ]. Therefore, we strongly support the use of the multilevel model. Nevertheless, this proposed multilevel model, as formulated in Eq. 3, does not usually deal with the issue of non-independence completely, which we elaborate on in the next section.
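In practice, the multilevel model in Eq. 3 can be fitted with metafor's rma.mv() by nesting effect sizes within studies. This is a minimal sketch; study and the per-row es_id are assumed columns of your effect-size data frame (yi and vi as produced by escalc()):

library(metafor)

dat$es_id <- seq_len(nrow(dat))  # unique identifier for each effect size

mlm <- rma.mv(yi, vi,
              random = ~ 1 | study/es_id,  # study effect + effect-size-level effect
              data = dat)
summary(mlm)  # sigma^2.1 = between-study variance (tau^2); sigma^2.2 = within-study (sigma^2)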

Non-independence among effect sizes and among sampling errors

When you have multiple effect sizes from a study, there are two broad types and three cases of non-independence (cf. [ 11 , 12 ]): (1) effect sizes are calculated from different cohorts of individuals (or groups of plots) within a study (Fig. 2A, referred to as 'shared study identity'), and (2) effect sizes are calculated from the same cohort of individuals (or group of plots; Fig. 2B, referred to as 'shared measurements'), or partially from the same individuals or plots, more concretely, sharing individuals and plots from the control group (Fig. 2C, referred to as 'shared control group'). The first type of non-independence induces dependence among effect sizes but not among sampling variances, while the second type leads to non-independence among sampling variances. Many datasets, if not almost all, will have a combination of these three cases (or will be even more complex; see the section " Complex non-independence "). Failing to deal with these non-independences will inflate Type I error (note that the overall estimate, \({b}_{0}\), is unlikely to be biased, but the standard error of \({b}_{0}\), se(\({b}_{0}\)), will be underestimated; this is also true for all other regression coefficients, e.g., \({b}_{1}\); see Table 1). The multilevel model (as in Eq. 3) only takes care of non-independence due to the shared study identity, but neither shared measurements nor shared control groups.

Fig. 2  Visualisation of the three types of non-independence among effect sizes: (A) due to shared study identities (effect sizes from the same study), (B) due to shared measurements (effect sizes come from the same group of individuals/plots but are based on different types of measurements), and (C) due to shared control (effect sizes are calculated using the same control group and multiple treatment groups; see the text for more details).

There are two practical ways to deal with non-independence among sampling variances. The first is to explicitly model such dependence using a variance–covariance (VCV) matrix (used in 6 out of 73 studies: 8.2%; Additional file 1 ). Imagine a simple scenario with a dataset of three effect sizes from two studies, where the two effect sizes from the first study are calculated (partially) from the same cohort of individuals (Fig. 2B); in such a case, the sampling variance effect, \({m}_{i}\), as in Eq. 3, should be written as:

\({m}_{i}\sim \mathcal{N}\left(0, \mathbf{M}\right), \quad \mathbf{M}=\left[\begin{array}{ccc}{v}_{1\left[1\right]}& \rho \sqrt{{v}_{1\left[1\right]}{v}_{1\left[2\right]}}& 0\\ \rho \sqrt{{v}_{1\left[2\right]}{v}_{1\left[1\right]}}& {v}_{1\left[2\right]}& 0\\ 0& 0& {v}_{2\left[3\right]}\end{array}\right],\)  (Eq. 4)

where M is the VCV matrix with the sampling variances \({v}_{1\left[1\right]}\) (study 1, effect size 1), \({v}_{1\left[2\right]}\) (study 1, effect size 2), and \({v}_{2\left[3\right]}\) (study 2, effect size 3) on its diagonal, and the sampling covariance, \(\rho \sqrt{{v}_{1\left[1\right]}{v}_{1\left[2\right]}}=\rho \sqrt{{v}_{1\left[2\right]}{v}_{1\left[1\right]}}\), in its off-diagonal elements, where \(\rho \) is the correlation between the two sampling errors due to shared samples (individuals/plots). Once this VCV matrix is incorporated into the multilevel model (Eq. 3), all the types of non-independence shown in Fig. 2 are taken care of. Table 3 shows formulas for the sampling variance and covariance of the four common effect sizes (SMD, lnRR, proportion, and Zr). For comparative effect measures (Table 2), exact covariances can be calculated under the case of 'shared control group' (see [ 54 , 55 ]). But this is not feasible in most circumstances because we usually do not know what \(\rho \) should be. Some have suggested fixing this value at 0.5 (e.g., [ 11 ]) or 0.8 (e.g., [ 56 ]); the latter is the more conservative assumption. Alternatively, one can run both and use one for the main analysis and the other for sensitivity analysis (for more, see the ' Conducting sensitivity analysis and critical appraisal ' section).

The second method overcomes this very issue of an unknown \(\rho \) by approximating the average dependence among sampling variances (and effect sizes) from the data and incorporating this dependence when estimating standard errors (used in only 1 out of 73 studies; Additional file 1 ). This method is known as 'robust variance estimation' (RVE), and the original estimator was proposed by Hedges and colleagues in 2010 [ 57 ]. Meta-analysis using RVE is relatively new, and the method has been applied to multilevel meta-analytic models only recently [ 58 ]. Note that the random-effects model (Eq. 2) with RVE could correctly model both types of non-independence. However, we do not recommend the use of RVE with Eq. 2 because, as we will show later, estimating \({\sigma }^{2}\) as well as \({\tau }^{2}\) constitutes an important part of understanding and gaining more insights from one's data. We do not yet have a definite recommendation on which method to use to account for non-independence among sampling errors (the VCV matrix or RVE), because no simulation work has evaluated these approaches in the context of multilevel meta-analysis so far [ 13 , 58 ]. For now, one could use both VCV matrices and RVE in the same model [ 58 ] (see also [ 21 ]).
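Both strategies are available in newer versions of metafor: vcalc() constructs the VCV matrix M under an assumed \(\rho \), and robust() applies RVE on top of the fitted multilevel model. A sketch (rho = 0.5 is an assumption to be varied in sensitivity analyses):

# VCV matrix with assumed correlation rho among effect sizes within a study
V <- vcalc(vi, cluster = study, obs = es_id, rho = 0.5, data = dat)

mlm_v <- rma.mv(yi, V, random = ~ 1 | study/es_id, data = dat)

# Cluster-robust (RVE) standard errors, clustering by study
robust(mlm_v, cluster = dat$study)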

Quantifying and explaining heterogeneity

Measuring consistencies with heterogeneity.

As mentioned earlier, quantifying heterogeneity among effect sizes is an essential component of any meta-analysis. Yet, our survey showed that only 28 out of 73 environmental meta-analyses (38.4%; Additional file 1 ) report at least one index of heterogeneity (e.g., \({\tau }^{2}\), Q, and I²). Conventionally, the presence of heterogeneity is tested with Cochran's Q test. However, Q (often noted as \({Q}_{T}\) or \({Q}_{total}\)) and its associated p value are not particularly informative: the test does not tell us about the extent of heterogeneity (e.g., [ 10 ]), only whether heterogeneity is zero or not (when p < 0.05). Therefore, for environmental scientists, we recommend two common ways of quantifying heterogeneity from a meta-analytic model: an absolute heterogeneity measure (i.e., the variance components \({\tau }^{2}\) and \({\sigma }^{2}\)) and a relative heterogeneity measure (i.e., I²; see also the " Notes on visualisation and interpretation " section for another way of quantifying and visualising heterogeneity at the same time, using prediction intervals; see also [ 59 ]). We have already covered the absolute measures (Eqs. 2 and 3), so here we explain I², which ranges from 0 to 1 (for some caveats of I², see [ 60 , 61 ]). The heterogeneity measure I² for the random-effects model (Eq. 2) can be written as:

\({I}^{2}=\frac{{\tau }^{2}}{{\tau }^{2}+\overline{v}},\)  (Eq. 5)

where \(\overline{v}\) is referred to as the typical sampling variance (originally called the 'within-study' variance, as in Eq. 2; note that in this formulation, the within-study effect and the effect of sampling error are confounded; see [ 62 , 63 ]; see also [ 64 ]), and the other notations are as above. The typical sampling variance can be estimated as (cf. [ 62 ]):

\(\overline{v}=\frac{\left({N}_{study}-1\right)\sum {w}_{j}}{{\left(\sum {w}_{j}\right)}^{2}-\sum {w}_{j}^{2}}, \quad {w}_{j}=1/{v}_{j}.\)  (Eq. 6)

As you can see from Eq. 5, we can interpret I² as the relative variation due to differences between studies (between-study variance), or the relative variation not due to sampling variance.

By seeing I² as a type of intraclass correlation (also known as repeatability [ 65 ]), we can generalize I² to multilevel models. In the case of Eq. 3 ([ 40 , 66 ]; see also [ 52 ]), we have:

\({I}_{total}^{2}=\frac{{\tau }^{2}+{\sigma }^{2}}{{\tau }^{2}+{\sigma }^{2}+\overline{v}}.\)  (Eq. 7)

Because we can have two more I² values, Eq. 7 is written as \({I}_{total}^{2}\); the other two are \({I}_{study}^{2}\) and \({I}_{effect}^{2}\), respectively:

\({I}_{study}^{2}=\frac{{\tau }^{2}}{{\tau }^{2}+{\sigma }^{2}+\overline{v}},\)  (Eq. 8)

\({I}_{effect}^{2}=\frac{{\sigma }^{2}}{{\tau }^{2}+{\sigma }^{2}+\overline{v}}.\)  (Eq. 9)

\({I}_{total}^{2}\) represents the relative variance due to differences both between and within studies (between- and within-study variance), or the relative variation not due to sampling variance, while \({I}_{study}^{2}\) is the relative variation due to differences between studies, and \({I}_{effect}^{2}\) is the relative variation due to differences within studies (Fig. 3A). Once heterogeneity is quantified (note that almost all data will have non-zero heterogeneity; an earlier meta-meta-analysis suggests that in ecology, I² is on average close to 90% [ 66 ]), it is time to fit a meta-regression model to explain the heterogeneity. Notably, the magnitudes of \({I}_{study}^{2}\) (and \({\tau }^{2}\)) and \({I}_{effect}^{2}\) (and \({\sigma }^{2}\)) can already inform you at which level a predictor variable (usually referred to as a 'moderator') is likely to be important, as we explain in the next section.

Fig. 3  Visualisation of variation (heterogeneity) partitioned into different variance components: (A) quantifying different types of I² from a multilevel (3-level) model (see Fig. 1C), and (B) variance explained, R², by moderators. Note that different levels of variance are explained depending on the level to which a moderator belongs (study level or effect-size level).
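Eqs. 6-9 translate into a few lines of R applied to a fitted rma.mv model. This sketch assumes the two-level random structure study/es_id used earlier, so that sigma2[1] and sigma2[2] are the between- and within-study variances, and that the sampling (co)variances sit on the diagonal of the model's V matrix:

i2_ml <- function(model) {
  vi    <- diag(as.matrix(model$V))  # sampling variances
  w     <- 1 / vi
  v_bar <- (model$k - 1) * sum(w) / (sum(w)^2 - sum(w^2))  # typical sampling variance (Eq. 6)
  denom <- sum(model$sigma2) + v_bar
  c(I2_total  = sum(model$sigma2) / denom,  # Eq. 7
    I2_study  = model$sigma2[1]   / denom,  # Eq. 8
    I2_effect = model$sigma2[2]   / denom)  # Eq. 9
}

i2_ml(mlm)  # proportions; multiply by 100 for percentages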

Explaining variance with meta-regression

We can extend the multilevel model (Eq. 3) to a meta-regression model with one moderator (also known as a predictor, independent, or explanatory variable, or a fixed factor), as below:

\({z}_{i}={\beta }_{0}+{\beta }_{1}{x}_{1j\left[i\right]}+{u}_{j\left[i\right]}+{e}_{i}+{m}_{i},\)  (Eq. 10)

where \({\beta }_{1}\) is the slope of the moderator (\({x}_{1}\)), and \({x}_{1j\left[i\right]}\) denotes the value of \({x}_{1}\) corresponding to the j-th study (and the i-th effect size). Eq. 10 (meta-regression) is comparable to the simplest regression, with an intercept (\({\beta }_{0}\)) and a slope (\({\beta }_{1}\)). Notably, \({x}_{1j\left[i\right]}\) differs between studies and, therefore, will mainly explain the variance component \({\tau }^{2}\) (which relates to \({I}_{study}^{2}\)). On the other hand, a moderator denoted \({x}_{1i}\) would vary within studies, i.e., at the level of effect sizes, and would therefore explain \({\sigma }^{2}\) (relating to \({I}_{effect}^{2}\)). Consequently, when \({\tau }^{2}\) (\({I}_{study}^{2}\)) or \({\sigma }^{2}\) (\({I}_{effect}^{2}\)) is close to zero, there is little point in fitting a moderator at the level of studies or effect sizes, respectively.

As in multiple regression, we can have multiple (multi-moderator) meta-regression, which can be written as:
\[ {z}_{i}={\beta }_{0}+\sum_{h=1}^{q}{\beta }_{h}{x}_{h\left[i\right]}+{u}_{j\left[i\right]}+{e}_{i}+{m}_{i} \tag{11} \]
where \(\sum_{h=1}^{q}{\beta }_{h}{x}_{h\left[i\right]}\) denotes the sum of all the moderator effects, with q being the number of slopes (starting with h = 1). We note that q is not necessarily the number of moderators. This is because when we have a categorical moderator with more than two levels (e.g., methods A, B & C), which is common, the fixed-effect part of the formula is \({\beta }_{0}+{\beta }_{1}{x}_{1}+{\beta }_{2}{x}_{2}\), where x_1 and x_2 are 'dummy' variables coding whether the ith effect size belongs to, for example, method B or C, with \({\beta }_{1}\) and \({\beta }_{2}\) being the contrasts between A and B and between A and C, respectively (for more explanation of dummy variables, see our tutorial page [https://itchyshin.github.io/Meta-analysis_tutorial/]; see also [67, 68]). Traditionally, researchers conduct separate meta-analyses for different groups (known as 'sub-group analysis'), but we prefer a meta-regression approach with a categorical variable, which is statistically more powerful [40] (a sketch of fitting such a model follows below). Also, importantly, what can be used as a moderator is very flexible, including, for example, individual/plot characteristics (e.g., age, location), environmental factors (e.g., temperature), methodological differences between studies (e.g., randomization), and bibliometric information (e.g., publication year; see more in the section 'Checking for publication bias and robustness'). Note that moderators should be decided and listed a priori in the meta-analysis plan (i.e., a review protocol or pre-registration).
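As an illustration of the meta-regression approach just described, the sketch below fits a categorical moderator in a multilevel metafor model, again using the bundled dat.curtis1998 data ('fungrp', a plant functional-group variable, stands in for any grouping of interest):

```r
library(metafor)

dat <- escalc(measure = "ROM", m1i = m1i, sd1i = sd1i, n1i = n1i,
              m2i = m2i, sd2i = sd2i, n2i = n2i, data = dat.curtis1998)

# Categorical moderator: metafor dummy-codes the levels automatically,
# with the first level as the reference (contrasts, as in the text)
res_mr <- rma.mv(yi, vi, mods = ~ fungrp,
                 random = list(~ 1 | paper, ~ 1 | id), data = dat)
summary(res_mr)   # includes the omnibus moderator test (QM)

# Dropping the intercept re-parameterises the model so each coefficient
# is a group mean rather than a contrast against the reference level
res_means <- rma.mv(yi, vi, mods = ~ fungrp - 1,
                    random = list(~ 1 | paper, ~ 1 | id), data = dat)
```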

As with the overall meta-analysis, the Q test (Q_m or Q_moderator) is often used to test the significance of the moderator(s). To complement this test, we can also quantify the variance explained by the moderator(s) using R². Using Eq. (11), we can define R² as:
\[ {R}_{marginal}^{2}=\frac{{f}^{2}}{{f}^{2}+{\tau }^{2}+{\sigma }^{2}} \tag{12} \]
where R² is known as marginal R² (sensu [69, 70]; cf. [71]), \({f}^{2}\) is the variance due to the moderator(s), and \({(f}^{2}+{\tau }^{2}+{\sigma }^{2})\) here equals \(({\tau }^{2}+{\sigma }^{2})\) in Eq. 7, as \({f}^{2}\) 'absorbs' variance from \({\tau }^{2}\) and/or \({\sigma }^{2}\). We illustrate the similarities and differences in Fig. 3B, where we denote the part of \({f}^{2}\) originating from \({\tau }^{2}\) as \({f}_{study}^{2}\) and the part originating from \({\sigma }^{2}\) as \({f}_{effect}^{2}\). In a multiple meta-regression model, we often want to find the 'best' or an adequate set of predictors (i.e., moderators). R² can potentially help such a model selection process, although methods based on information criteria (such as the Akaike information criterion, AIC) may be preferable. Model selection based on information criteria is beyond the scope of this paper, so we refer the reader to relevant articles (e.g., [72, 73]) and show an example of this procedure in our online tutorial (https://itchyshin.github.io/Meta-analysis_tutorial/).
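The marginal R² just defined can be computed by hand from a fitted metafor model; the sketch below does so under the assumption that the variance of the fixed-effect fitted values is a reasonable stand-in for \({f}^{2}\) (the orchaRd package also provides r2_ml() for this purpose):

```r
library(metafor)

dat <- escalc(measure = "ROM", m1i = m1i, sd1i = sd1i, n1i = n1i,
              m2i = m2i, sd2i = sd2i, n2i = n2i, data = dat.curtis1998)
res_mr <- rma.mv(yi, vi, mods = ~ fungrp,
                 random = list(~ 1 | paper, ~ 1 | id), data = dat)

# f2: variance attributable to the moderator(s), taken as the variance
# of the fixed-effect fitted values (sensu marginal R2 [69, 70])
f2 <- var(fitted(res_mr))
R2_marginal <- f2 / (f2 + sum(res_mr$sigma2))   # Eq. 12
```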

Notes on visualisation and interpretation

Visualisation and interpretation of results is an essential part of a meta-analysis [74, 75]. Traditionally, a forest plot is used to display the value and 95% confidence interval (CI) of each effect size, together with the overall effect and its 95% CI (the diamond symbol is often used, as shown in Fig. 4A). More recently, adding a 95% prediction interval (PI) to the overall estimate has been strongly recommended, because the 95% PI shows the range in which an effect size from a new study would be predicted to fall, assuming no sampling error [76]. Here, we think that examining the formulas for the 95% CI and PI of the overall mean (from Eq. 3) is illuminating:
\[ 95\% \, CI = \widehat{\mu } \pm {t}_{df\left[\alpha =0.05\right]} \, SE\left[\widehat{\mu }\right] \tag{14} \]
\[ 95\% \, PI = \widehat{\mu } \pm {t}_{df\left[\alpha =0.05\right]} \sqrt{{\tau }^{2}+{\sigma }^{2}+SE{\left[\widehat{\mu }\right]}^{2}} \tag{15} \]
where \({t}_{df\left[\alpha =0.05\right]}\) denotes the t value with df degrees of freedom at the 97.5th percentile (i.e., \(\alpha =0.05\), two-tailed) and the other notations are as above. In meta-analysis it has been conventional to use the z value of 1.96 instead of \({t}_{df\left[\alpha =0.05\right]}\), but simulation studies have shown that using the t value rather than the z value reduces Type I errors under many scenarios, and it is therefore recommended (e.g., [13, 77]). It is also worth noting that by plotting 95% PIs we can visualise heterogeneity, as Eq. 15 includes \({\tau }^{2}\) and \({\sigma }^{2}\).
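In metafor, both intervals can be obtained directly; a minimal sketch, with t-based inference requested via test = "t", matching the recommendation above:

```r
library(metafor)

dat <- escalc(measure = "ROM", m1i = m1i, sd1i = sd1i, n1i = n1i,
              m2i = m2i, sd2i = sd2i, n2i = n2i, data = dat.curtis1998)

# test = "t" requests t-based rather than z-based inference
res <- rma.mv(yi, vi, random = list(~ 1 | paper, ~ 1 | id),
              data = dat, test = "t")

# predict() returns the overall mean with its 95% CI (ci.lb, ci.ub) and
# 95% PI (pi.lb, pi.ub); the PI is wider because it adds tau2 and sigma2
predict(res)
```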

Figure 4. Different types of plots useful for a meta-analysis, using data from Midolo et al. [133]: (A) a typical forest plot with the overall mean shown as a diamond at the bottom (20 effect sizes from 20 studies), (B) a caterpillar plot (100 effect sizes from 24 studies), (C) an orchard plot for a categorical moderator with seven levels (all effect sizes), and (D) a bubble plot for a continuous moderator. Note that the first two show only confidence intervals, while the latter two also show prediction intervals (see the text for more details)

A forest plot can quickly become illegible as the number of studies (effect sizes) grows, so other ways of visualising the distribution of effect sizes have been suggested. Some have proposed the 'caterpillar' plot, a compact version of the forest plot (Fig. 4B; e.g., [78]). Here we recommend the 'orchard' plot, as it can present results across different groups (i.e., the result of a meta-regression with a categorical variable), as shown in Fig. 4C [78]. For visualising a continuous variable, we suggest a 'bubble' plot, shown in Fig. 4D. Visualisation not only helps us interpret meta-analytic results, but can also reveal features we might not see in statistical output, such as influential data points and outliers that could threaten the robustness of our results.
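A brief sketch of these plot types in R: forest() and regplot() are core metafor functions (regplot() in recent metafor versions), while orchard and caterpillar plots are implemented in the orchaRd package [78]; we treat dat.curtis1998's 'time' variable as a continuous moderator purely for illustration.

```r
library(metafor)

dat <- escalc(measure = "ROM", m1i = m1i, sd1i = sd1i, n1i = n1i,
              m2i = m2i, sd2i = sd2i, n2i = n2i, data = dat.curtis1998)
res <- rma.mv(yi, vi, random = list(~ 1 | paper, ~ 1 | id), data = dat)

forest(res)   # classic forest plot; legible only for modest numbers of effects

# Bubble plot for a continuous moderator (point sizes scale with precision)
res_time <- rma.mv(yi, vi, mods = ~ time,
                   random = list(~ 1 | paper, ~ 1 | id), data = dat)
regplot(res_time)
```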

Checking for publication bias and robustness

Detecting and correcting for publication bias

Checking for, and adjusting for, publication bias is necessary to ensure the validity of meta-analytic inferences [79]. However, our survey showed that almost half of the environmental meta-analyses (46.6%; 34 out of 73 studies; Additional file 1) neither tested nor corrected for publication bias (cf. [14, 15, 16]). The most popular methods used were: (1) graphical checks using funnel plots (26 studies; 35.6%), (2) regression-based tests such as Egger's regression (18 studies; 24.7%), (3) fail-safe number tests (12 studies; 16.4%), and (4) trim-and-fill tests (10 studies; 13.7%). We recently showed that, with the exception of funnel plots, these methods are unsuitable for datasets with non-independent effect sizes [80] (for an example of a funnel plot, see Fig. 5A), because, like the fixed-effect and random-effects models, they cannot deal with non-independence. Here, we therefore introduce only a two-step method for multilevel models that can both detect and correct for publication bias [80] (originally proposed by [81, 82]), more specifically the 'small-study effect', where the effect size from a small-sample study can be much larger in magnitude than the 'true' effect [83, 84]. This method is a simple extension of Egger's regression [85], which can be easily implemented via Eq. 10:
\[ {z}_{i}={\beta }_{0}+{\beta }_{1}\sqrt{\frac{1}{{\widetilde{n}}_{i}}}+{u}_{j\left[i\right]}+{e}_{i}+{m}_{i} \tag{16} \]
\[ {z}_{i}={\beta }_{0}+{\beta }_{1}\frac{1}{{\widetilde{n}}_{i}}+{u}_{j\left[i\right]}+{e}_{i}+{m}_{i} \tag{17} \]
where \({\widetilde{n}}_{i}\) is the effective sample size; for Zr and proportions it is simply n_i, while for SMD and lnRR it is \({n}_{iC}{n}_{iT}/\left({n}_{iC}+{n}_{iT}\right)\), as in Table 2. When \({\beta }_{1}\) is significant, we conclude that a small-study effect exists (in terms of a funnel plot, this is equivalent to significant funnel asymmetry). We then fit Eq. 17 and examine its intercept \({\beta }_{0}\), which is a bias-corrected overall estimate [note that \({\beta }_{0}\) in Eq. (16) provides less accurate estimates when a non-zero overall effect exists [81, 82]; Fig. 5B]. An intuitive explanation of why \({\beta }_{0}\) (Eq. 17) is the 'bias-corrected' estimate is that the intercept corresponds to \(1/\widetilde{{n}_{i}}=0\) (or \(\widetilde{{n}_{i}}=\infty \)); in other words, \({\beta }_{0}\) is the estimated overall effect for a very large (infinite) sample size. Of note, fully appropriate bias correction requires a selection-model-based approach, although such an approach is not yet available for multilevel meta-analytic models [80].
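The two-step procedure maps directly onto two metafor calls; a sketch, computing the effective sample size for lnRR as in Table 2 (n1i and n2i stand in for the control and treatment sample sizes of the example dataset):

```r
library(metafor)

dat <- escalc(measure = "ROM", m1i = m1i, sd1i = sd1i, n1i = n1i,
              m2i = m2i, sd2i = sd2i, n2i = n2i, data = dat.curtis1998)

# 'Effective' sample size for lnRR (Table 2): nC * nT / (nC + nT)
dat$n_eff <- with(dat, (n1i * n2i) / (n1i + n2i))

# Step 1 (Eq. 16): a significant slope signals a small-study effect
step1 <- rma.mv(yi, vi, mods = ~ sqrt(1 / n_eff),
                random = list(~ 1 | paper, ~ 1 | id), data = dat)

# Step 2 (Eq. 17): refit with 1/n_eff; the intercept is the
# bias-corrected overall estimate
step2 <- rma.mv(yi, vi, mods = ~ I(1 / n_eff),
                random = list(~ 1 | paper, ~ 1 | id), data = dat)
coef(step2)["intrcpt"]
```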

Figure 5. Different types of plots for publication bias tests: (A) a funnel plot of model residuals, with the white funnel showing the region of statistical non-significance (30 effect sizes from 30 studies; note that we used the inverse of the standard errors for the y-axis, although for some effect size measures, sample size or 'effective' sample size may be more appropriate), (B) a bubble plot visualising a multilevel meta-regression testing for the small-study effect (the slope was non-significant: b = 0.120, 95% CI = [− 0.095, 0.334]; all effect sizes used), and (C) a bubble plot visualising a multilevel meta-regression testing for the decline effect (the slope was non-significant: b = 0.003, 95% CI = [− 0.002, 0.008])

Conveniently, this framework can be extended to test for another type of publication bias, known as time-lag bias or the decline effect, where effect sizes tend to move closer to zero over time because larger or statistically significant effects are published more quickly than smaller or non-significant ones [86, 87]. A decline effect can be tested statistically by adding year to Eq. (3):
\[ {z}_{i}={\beta }_{0}+{\beta }_{1}c\left(yea{r}_{j\left[i\right]}\right)+{u}_{j\left[i\right]}+{e}_{i}+{m}_{i} \tag{18} \]
where \(c\left(yea{r}_{j\left[i\right]}\right)\) is the mean-centred publication year for study j (and effect size i); this centring makes the intercept \({\beta }_{0}\) meaningful, representing the overall effect estimate at the mean publication year (see [68]). When the slope differs significantly from 0, we conclude that there is a decline effect (or time-lag bias; Fig. 5C).
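A sketch of this decline-effect test, assuming a data frame 'dat' that already holds effect sizes (yi, vi), study and effect-size identifiers ('paper', 'id'), and a hypothetical 'year' column with each study's publication year:

```r
library(metafor)

# 'year' is a hypothetical column holding each study's publication year
dat$year_c <- dat$year - mean(dat$year)   # mean-centre: intercept = effect at mean year

decline <- rma.mv(yi, vi, mods = ~ year_c,
                  random = list(~ 1 | paper, ~ 1 | id), data = dat)
summary(decline)   # a slope clearly different from 0 suggests a decline effect
```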

However, there may be confounding moderators, which need to be modelled together. Indeed, Egger's regression (Eqs. 16 and 17) is known to detect funnel asymmetry reliably only when there is little heterogeneity; this means that we need to model \(\sqrt{1/{\widetilde{n}}_{i}}\) together with other moderators that account for heterogeneity. Given this, we should probably use a multiple meta-regression model, as below:
\[ {z}_{i}={\beta }_{0}+{\beta }_{1}\sqrt{\frac{1}{{\widetilde{n}}_{i}}}+{\beta }_{2}c\left(yea{r}_{j\left[i\right]}\right)+\sum_{h=3}^{q}{\beta }_{h}{x}_{h\left[i\right]}+{u}_{j\left[i\right]}+{e}_{i}+{m}_{i} \tag{19} \]
where \(\sum_{h=3}^{q}{\beta }_{h}{x}_{h\left[i\right]}\) is the sum of the moderator effects other than the small-study effect and the decline effect, and the other notations are as above (for more details see [80]). We need to consider carefully which moderators should enter Eq. 19 (e.g., fitting all moderators, or using an AIC-based model selection method; see [72, 73]). Of relevance, when running complex models, some model parameters cannot be estimated well; that is, they are not 'identifiable' [88]. This applies especially to the variance components (the random-effect part) rather than the regression coefficients (the fixed-effect part). It is therefore advisable to check that all model parameters are identifiable, which can be done with the profile function in metafor (for an example, see our tutorial webpage [https://itchyshin.github.io/Meta-analysis_tutorial/]).
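A sketch of this all-in-one model, reusing the 'n_eff' and 'year_c' variables from the earlier sketches ('fungrp' stands in for any other moderator), followed by the identifiability check:

```r
library(metafor)

# Eq. 19-style model: small-study term, decline-effect term, plus other moderators
full <- rma.mv(yi, vi,
               mods = ~ sqrt(1 / n_eff) + year_c + fungrp,
               random = list(~ 1 | paper, ~ 1 | id), data = dat)

# Profile likelihoods of the variance components should peak at their
# estimates; flat profiles indicate non-identifiable parameters
profile(full)
```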

Conducting sensitivity analysis and critical appraisal

Sensitivity analysis explores the robustness of meta-analytic results by running a different set of analyses from the original and comparing the results (note that some consider publication bias tests part of sensitivity analysis; [11]). For example, we might want to assess how robust the results are to the presence of influential studies, to the choice of method for addressing non-independence, or to the weighting of effect sizes. Unfortunately, in our survey, only 37% of environmental meta-analyses (27 out of 73) conducted a sensitivity analysis (Additional file 1). There are two general and interrelated ways to conduct sensitivity analyses [73, 89, 90]. The first is to remove influential studies (e.g., outliers) and re-run the meta-analytic and meta-regression models. We can also systematically leave out each effect size in turn and run a series of meta-analytic models to see whether any of the resulting overall estimates differ from the rest; this method, known as 'leave-one-out', is considered less subjective and is therefore recommended (a sketch follows below).
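metafor's built-in leave1out() covers standard rma() models, so for a multilevel rma.mv() model the leave-one-out procedure can be scripted manually; a sketch:

```r
library(metafor)

dat <- escalc(measure = "ROM", m1i = m1i, sd1i = sd1i, n1i = n1i,
              m2i = m2i, sd2i = sd2i, n2i = n2i, data = dat.curtis1998)

# Refit the multilevel model once per effect size, dropping that effect size
loo <- sapply(seq_len(nrow(dat)), function(i) {
  fit <- rma.mv(yi, vi, random = list(~ 1 | paper, ~ 1 | id), data = dat[-i, ])
  as.numeric(fit$beta)[1]   # overall estimate without effect size i
})
range(loo)   # how far can any single effect size move the overall estimate?
```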

The second approach to sensitivity analysis is subset analysis, in which a certain group of effect sizes (studies) is excluded and the models are re-run without it. For example, one may want to run an analysis without studies that did not randomize samples. Yet, as mentioned earlier, we recommend using meta-regression (Eq. 13) with a categorical variable coding randomization status ('randomized' or 'not randomized') to test the influence of such moderators statistically (both options are sketched below). It is important to note that such tests for risk of bias (or study quality) can be considered a way of quantitatively evaluating the importance of study features noted at the critical appraisal stage, which is an essential part of any systematic review (see [11, 91]). In other words, we can use meta-regression or subset analysis to conduct critical appraisal quantitatively, using (study-level) moderators that code, for example, blinding, randomization, and selective reporting. Despite the importance of critical appraisal [91], only 4 of 73 environmental meta-analyses (5.6%) in our survey assessed the risk of bias in each included study (i.e., evaluated each primary study in terms of the internal validity of its design and reporting; Additional file 1). We emphasize that critically appraising each paper, or checking it for risk of bias, is an extremely important topic, and critical appraisal is not restricted to quantitative synthesis. Therefore, we do not cover it further in this paper (for more, see [92, 93]).
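Both options look like this in metafor, assuming a hypothetical study-level column 'randomized' ("yes"/"no") in 'dat':

```r
library(metafor)

# Subset analysis: refit using only the randomized studies
res_sub <- rma.mv(yi, vi, random = list(~ 1 | paper, ~ 1 | id),
                  data = dat, subset = (randomized == "yes"))

# Preferred: meta-regression testing randomization status as a moderator
res_rob <- rma.mv(yi, vi, mods = ~ randomized,
                  random = list(~ 1 | paper, ~ 1 | id), data = dat)
```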

Notes on transparent reporting and open archiving

For environmental systematic reviews and maps, there are reporting guidelines known as ROSES (RepOrting standards for Systematic Evidence Syntheses in environmental research [94]) and a synthesis assessment checklist, the Collaboration for Environmental Evidence Synthesis Appraisal Tool (CEESAT; [95]). However, these guidelines are somewhat limited with respect to reporting quantitative synthesis, because they cover only a few core items. They are complemented by the Preferred Reporting Items for Systematic Reviews and Meta-Analyses for Ecology and Evolutionary Biology (PRISMA-EcoEvo; [96]; cf. [97, 98]), which provides an extended set of reporting items covering what we have described above. Items 20–24 of PRISMA-EcoEvo are the most relevant, outlining what should be reported in the Methods section: (i) sample sizes and study characteristics, (ii) meta-analysis, (iii) heterogeneity, (iv) meta-regression, and (v) outcomes of publication bias tests and sensitivity analysis (see Table 4). Our survey, like earlier surveys, suggests there is ample room for improvement in current practice ([14, 15, 16]). Incidentally, the orchard plot aligns well with Item 20, as this plot type shows both the number of effect sizes and the number of studies per group (Fig. 4C). Further, our survey of environmental meta-analyses highlighted poor standards of data openness (24 studies sharing data: 32.9%) and code sharing (7 of those 24 studies: 29.2%; Additional file 1). Environmental scientists must archive their data and analysis code in accordance with the FAIR principles (Findable, Accessible, Interoperable, and Reusable [99]), using dedicated repositories such as Dryad, FigShare, Open Science Framework (OSF), or Zenodo (cf. [100, 101]), preferably not on publishers' webpages (as paywalls may block access). However, archiving itself is not enough: the data require metadata (detailed descriptions), and the code also needs to be FAIR [102, 103].

Other relevant and advanced issues

Scale dependence

The issue of scale dependence is a unique yet widespread problem in environmental sciences (see [7, 104]); our literature survey indicated that three-quarters of environmental meta-analyses (56 out of 73 studies) drew inferences that are potentially vulnerable to scale dependence [105]. For example, in studies comparing group means of biodiversity measures, measures such as species richness vary as a function of the scale (size) of the sampling unit. When the unit of replication is a plot (not an individual animal or plant), the areal size of the plot (e.g., 100 cm² or 1 km²) affects both the precision and the accuracy of effect size estimates (e.g., lnRR and SMD). In general, a study with larger plots might estimate species richness differences more accurately, but less precisely, than a study with smaller plots and greater replication. Lower replication means that sampling variance estimates are likely to be misestimated, and the study with larger plots will generally receive less weight than the study with smaller plots, owing to its higher sampling variance. Inaccurate variance estimates in little-replicated ecological studies are known to cause an accumulating bias in precision-weighted meta-analysis, requiring correction [43]. To assess the potential for scale dependence, analysts are advised to test for covariation among plot size, replication, variances, and effect sizes [104] (a sketch follows below). If such covariation is detected, analysts should use an effect size measure that is less sensitive to scale dependence (lnRR), and could use plot size as a moderator in meta-regression, or alternatively consider running an unweighted model ([7]; note that only 12%, 9 out of 73 studies, accounted for sampling area in some way; Additional file 1).
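A rough sketch of such a screen, assuming a hypothetical 'plot_size' column giving the areal size of the sampling unit, alongside yi, vi, and per-group sample sizes n1i/n2i:

```r
library(metafor)

# Screen for covariation among scale, replication, variance, and effect size
cor.test(log(dat$plot_size), dat$yi)             # effect size vs scale
cor.test(log(dat$plot_size), dat$vi)             # sampling variance vs scale
cor.test(log(dat$plot_size), dat$n1i + dat$n2i)  # replication vs scale

# If covariation is detected, plot size can be fitted as a moderator
res_scale <- rma.mv(yi, vi, mods = ~ log(plot_size),
                    random = list(~ 1 | paper, ~ 1 | id), data = dat)
```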

Missing data

In many fields, meta-analytic data almost always include missing values (see [106–108]). Broadly, there are two types of missing data in meta-analyses [109, 110]: (1) missing standard deviations or sample sizes associated with means, preventing effect size calculation (Table 2), and (2) missing moderator values. There are several solutions for both types. The best, and the first to try, is contacting the authors. If this fails, we can potentially 'impute' the missing data. Single imputation methods exploiting the strong correlation between standard deviations and means (the mean–variance relationship) are available, but single imputation can inflate Type I error [106, 107] (see also [43]) because the uncertainty of the imputation itself is not modelled. In contrast, multiple imputation, which creates multiple versions of the imputed dataset, incorporates this uncertainty. Indeed, multiple imputation is a preferred and proven solution for missing data in effect sizes and moderators [109, 110], although correct implementation can be challenging (see [110]; a rough sketch follows below). What we now need is an automated pipeline merging meta-analysis and multiple imputation that accounts for imputation uncertainty, although this may be challenging for complex meta-analytic models. Fortunately, for lnRR there is a series of new methods that perform better than the conventional approach and can handle missing SDs [44]; note that these methods do not deal with missing moderators. Therefore, where applicable, we recommend these new methods until an easy-to-implement multiple imputation workflow arrives.
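A rough sketch of one multiple-imputation workflow, combining the mice package with metafor and pooling by Rubin's rules; 'dat0' is a hypothetical data frame of raw summary statistics with some SDs missing, and a dedicated, validated pipeline (see [110]) should be preferred where available:

```r
library(metafor)
library(mice)

# Impute missing summary statistics (e.g., sd1i/sd2i) m times
imp <- mice(dat0[, c("m1i", "sd1i", "n1i", "m2i", "sd2i", "n2i")],
            m = 20, printFlag = FALSE)

# Recompute effect sizes and refit the model on each completed dataset
fits <- lapply(seq_len(imp$m), function(i) {
  d <- cbind(dat0[, c("paper", "id")], complete(imp, i))
  d <- escalc(measure = "ROM", m1i = m1i, sd1i = sd1i, n1i = n1i,
              m2i = m2i, sd2i = sd2i, n2i = n2i, data = d)
  rma.mv(yi, vi, random = list(~ 1 | paper, ~ 1 | id), data = d)
})

# Rubin's rules: total variance = mean within-imputation variance
# + (1 + 1/m) * between-imputation variance of the estimates
est <- sapply(fits, function(f) as.numeric(f$beta)[1])
se2 <- sapply(fits, function(f) f$se[1]^2)
pooled_est <- mean(est)
pooled_se  <- sqrt(mean(se2) + (1 + 1/imp$m) * var(est))
```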

Complex non-independence

So far, we have dealt only with models that include study identity as a clustering/grouping (random) factor. However, many datasets are more complex, with additional clustering variables beyond study identity. For instance, an environmental meta-analysis may well contain data from multiple species. Such a situation creates an interesting form of dependence among effect sizes, known as phylogenetic relatedness, where closely related species are more likely to show similar effect sizes than distantly related ones (e.g., mice vs. rats compared with mice vs. sparrows). Our multilevel framework is flexible and can accommodate phylogenetic relatedness. A phylogenetic multilevel meta-analytic model can be written as [40, 111, 112]:
\[ {z}_{i}=\mu +{a}_{k\left[i\right]}+{s}_{k\left[i\right]}+{u}_{j\left[i\right]}+{e}_{i}+{m}_{i} \tag{20} \]
where \({a}_{k\left[i\right]}\) is the phylogenetic (species) effect for the kth species (effect size i; N_effect (i = 1, 2,…, N_effect) > N_study (j = 1, 2,…, N_study) > N_species (k = 1, 2,…, N_species)), normally distributed with variance \({\omega }^{2}{\text{A}}\), where \({\omega }^{2}\) is the phylogenetic variance and A is a correlation matrix coding how closely related the species are; \({s}_{k\left[i\right]}\) is the non-phylogenetic (species) effect for the kth species (effect size i), normally distributed with variance \({\gamma }^{2}\) (the non-phylogenetic variance); and the other notations are as above. It is important to realize that A explicitly models relatedness among species, and we need to provide this correlation matrix, using a distance relationship usually derived from a molecular-based phylogenetic tree (for more details, see [40, 111, 112]). Some may think that the non-phylogenetic term (\({s}_{k\left[i\right]}\)) is unnecessary or redundant, because \({s}_{k\left[i\right]}\) and the phylogenetic term (\({a}_{k\left[i\right]}\)) both model variance at the species level. However, a recent simulation demonstrated that omitting the non-phylogenetic term (\({s}_{k\left[i\right]}\)) will often inflate the phylogenetic variance \({\omega }^{2}\), leading to the incorrect conclusion that there is a strong phylogenetic signal (as shown in [112]). The non-phylogenetic variance (\({\gamma }^{2}\)) arises from, for example, ecological similarities among species (herbivores vs. carnivores, or arboreal vs. ground-living) that are not due to phylogeny [40].
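A sketch of fitting this model in metafor, with the correlation matrix A derived via the ape package; 'tree' (a phylo object covering all species in 'dat') and the 'species' column are placeholders here:

```r
library(metafor)
library(ape)

A <- vcv(tree, corr = TRUE)   # phylogenetic correlation matrix from the tree

# Duplicate the species column: one copy carries the phylogenetic effect
# (linked to A via the R argument), the other the non-phylogenetic effect
dat$phylo <- dat$species
res_phylo <- rma.mv(yi, vi,
                    random = list(~ 1 | phylo, ~ 1 | species,
                                  ~ 1 | paper, ~ 1 | id),
                    R = list(phylo = A), data = dat)
```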

As with phylogenetic relatedness, effect sizes arising from geographically closer locations are likely to be more correlated [113]. Statistically, spatial correlation can be modelled in a manner analogous to phylogenetic relatedness (i.e., instead of a phylogenetic correlation matrix A, we fit a spatial correlation matrix). For example, Maire and colleagues [114] used a meta-analytic model with spatial autocorrelation to investigate temporal trends of fish communities in the French river network. A similar argument can be made for temporal correlation, although in many cases temporal correlation can be handled, albeit less accurately, as a special case of 'shared measurements', as in Fig. 2. The key idea to take away is that one can model different, if not all, types of non-independence as random factor(s) in a multilevel model.

Advanced techniques

Here we touch upon five advanced meta-analytic techniques with potential utility for environmental sciences, providing relevant references so that interested readers can learn more about these topics. The first is the meta-analysis of magnitudes, or absolute effect sizes, where researchers are interested in deviations from 0 rather than in the directionality of the effect [115]. For example, Cohen and colleagues [116] investigated absolute values of phenological responses, as they were concerned with the magnitude of changes in phenology rather than their direction.

The second method is the meta-analysis of interactions, where the focus is on synthesizing the interaction effect of, usually, a 2 × 2 factorial design (e.g., the effect of two simultaneous environmental stressors [54, 117, 118]; see also [119]). Recently, Siviter and colleagues [120] showed that agrochemicals interact synergistically (i.e., non-additively) to increase bee mortality; that is, two agrochemicals together caused more mortality than the sum of the mortalities caused by each alone.

Third, network meta-analysis has been used heavily in the medical sciences; it typically compares different treatments against a placebo and ranks them by effectiveness [121]. The first 'environmental' network meta-analysis, as far as we know, investigated the effectiveness of ecosystem services among different land types [122].

Fourth, multivariate meta-analysis models two or more different types of effect sizes jointly, estimating the pair-wise correlations between them. The benefit of this approach is known as 'borrowing of strength': the errors of the fixed effects (moderators; e.g., b_0 and b_1) can be reduced when the different types of effect sizes are correlated (i.e., se(b_0) and se(b_1) can be smaller [123]). For example, lnRR (differences in means) and lnVR (differences in SDs) can be modelled together (cf. [124]).

Fifth, as with network meta-analysis, there has been a surge in the use of 'individual participant data' ('IPD') meta-analysis in the medical sciences [125, 126]. The idea of IPD meta-analysis is simple: rather than using the summary statistics reported in papers (sample means and variances), we directly use the raw data from all studies. We can either model the raw data in one complex multilevel (hierarchical) model (the one-step method) or calculate statistics for each study and then meta-analyse them (the two-step method; both usually give the same results). Study-level random effects can be incorporated to allow the response variable of interest to vary among studies, with overall effects corresponding to fixed, population-level estimates. The use of IPD, or 'full-data analyses', has also surged in ecology, aided by open-science policies that encourage the archiving of raw data alongside articles, and by initiatives that synthesise raw data (e.g., PREDICTS [127], BioTIME [128]). In health disciplines, such meta-analyses are considered the 'gold standard' [129], owing to their potential for resolving issues of study-specific designs and confounding variation, although it remains unclear whether and how they might resolve issues such as scale dependence in environmental meta-analyses [104, 130].

Conclusions

In this article, we have attempted to describe the most practical ways to conduct quantitative synthesis, including meta-analysis, meta-regression, and publication bias tests. In addition, our survey of 73 recent environmental meta-analyses shows that there is much to be improved in meta-analytic practice and reporting. Such improvements are urgently required, especially given the potential influence that environmental meta-analyses can have on policies and decision-making [8]. Meta-analysts have often called for better reporting of primary research (e.g., [131, 132]); now is the time to raise the standards of reporting in meta-analyses themselves. We hope our contribution will help catalyse a turning point towards better practice in quantitative synthesis in environmental sciences. We remind the reader that most of what we have described is implemented in the R environment on our tutorial webpage, so researchers can readily use the proposed models and techniques (https://itchyshin.github.io/Meta-analysis_tutorial/). Finally, meta-analytic techniques are always developing and improving. It is entirely possible that our proposed models and related methods will become dated in the future, just as the traditional fixed-effect and random-effects models already are. We must therefore remain open-minded about new ways of conducting quantitative research synthesis in environmental sciences.

Availability of data and materials

All data and material are provided as additional files.

Higgins JP, Thomas JE, Chandler JE, Cumpston ME, Li TE, Page MJ, Welch VA. Cochrane handbook for systematic reviews of interventions. 2nd ed. Chichester: Wiley; 2019.


Cooper HM, Hedges LV, Valentine JC. The handbook of research synthesis and meta-analysis. 3rd ed. New York: Russell Sage Foundation; 2019.


Schmid CH, Stijnen TE, White IE. Handbook of meta-analysis. 1st ed. Boca Raton: CRC; 2021.

Vetter D, Rucker G, Storch I. Meta-analysis: a need for well-defined usage in ecology and conservation biology. Ecosphere. 2013;4(6):1.


Koricheva J, Gurevitch J, Mengersen K, editors. Handbook of meta-analysis in ecology and evolution. Princeton: Princeton University Press; 2017.

Gurevitch J, Koricheva J, Nakagawa S, Stewart G. Meta-analysis and the science of research synthesis. Nature. 2018;555(7695):175–82.


Spake R, Doncaster CP. Use of meta-analysis in forest biodiversity research: key challenges and considerations. Forest Ecol Manag. 2017;400:429–37.

Bilotta GS, Milner AM, Boyd I. On the use of systematic reviews to inform environmental policies. Environ Sci Policy. 2014;42:67–77.

Hedges LV, Vevea JL. Fixed- and random-effects models in meta-analysis. Psychol Methods. 1998;3(4):486–504.

Borenstein M, Hedges LV, Higgins JPT, Rothstein H. Introduction to meta-analysis. 2nd ed. Chichester: Wiley; 2021.

Noble DWA, Lagisz M, Odea RE, Nakagawa S. Nonindependence and sensitivity analyses in ecological and evolutionary meta-analyses. Mol Ecol. 2017;26(9):2410–25.

Nakagawa S, Noble DWA, Senior AM, Lagisz M. Meta-evaluation of meta-analysis: ten appraisal questions for biologists. Bmc Biol. 2017;15:1.

Nakagawa S, Senior AM, Viechtbauer W, Noble DWA. An assessment of statistical methods for nonindependent data in ecological meta-analyses: comment. Ecology. 2022;103(1): e03490.

Romanelli JP, Meli P, Naves RP, Alves MC, Rodrigues RR. Reliability of evidence-review methods in restoration ecology. Conserv Biol. 2021;35(1):142–54.

Koricheva J, Gurevitch J. Uses and misuses of meta-analysis in plant ecology. J Ecol. 2014;102(4):828–44.

O’Leary BC, Kvist K, Bayliss HR, Derroire G, Healey JR, Hughes K, Kleinschroth F, Sciberras M, Woodcock P, Pullin AS. The reliability of evidence review methodology in environmental science and conservation. Environ Sci Policy. 2016;64:75–82.

Rosenthal R. The “file drawer problem” and tolerance for null results. Psychol Bull. 1979;86(3):638–41.

Nakagawa S, Lagisz M, Jennions MD, Koricheva J, Noble DWA, Parker TH, Sánchez-Tójar A, Yang Y, O’Dea RE. Methods for testing publication bias in ecological and evolutionary meta-analyses. Methods Ecol Evol. 2022;13(1):4–21.

Cheung MWL. A guide to conducting a meta-analysis with non-independent effect sizes. Neuropsychol Rev. 2019;29(4):387–96.

Viechtbauer W. Conducting meta-analyses in R with the metafor package. J Stat Softw. 2010;36(3):1–48.

Yang Y, Macleod M, Pan J, Lagisz M, Nakagawa S. Advanced methods and implementations for the meta-analyses of animal models: current practices and future recommendations. Neurosci Biobehav Rev. 2022. https://doi.org/10.1016/j.neubiorev.2022.105016:105016 .

Nakagawa S, Cuthill IC. Effect size, confidence interval and statistical significance: a practical guide for biologists. Biol Rev. 2007;82(4):591–605.

Hedges LV, Gurevitch J, Curtis PS. The meta-analysis of response ratios in experimental ecology. Ecology. 1999;80(4):1150–6.

Friedrich JO, Adhikari NKJ, Beyene J. The ratio of means method as an alternative to mean differences for analyzing continuous outcome variables in meta-analysis: A simulation study. BMC Med Res Methodol. 2008;8:5.

Hedges L, Olkin I. Statistical methods for meta-analysis. New York: Academic Press; 1985.

Cohen J. Statistical power analysis for the behavioral sciences. 2nd ed. Hillsdale: Lawrence Erlbaum; 1988.

Senior AM, Viechtbauer W, Nakagawa S. Revisiting and expanding the meta-analysis of variation: the log coefficient of variation ratio. Res Synth Methods. 2020;11(4):553–67.

Nakagawa S, Poulin R, Mengersen K, Reinhold K, Engqvist L, Lagisz M, Senior AM. Meta-analysis of variation: ecological and evolutionary applications and beyond. Methods Ecol Evol. 2015;6(2):143–52.

Knapp S, van der Heijden MGA. A global meta-analysis of yield stability in organic and conservation agriculture. Nat Commun. 2018;9:3632.

Porturas LD, Anneberg TJ, Cure AE, Wang SP, Althoff DM, Segraves KA. A meta-analysis of whole genome duplication and the effects on flowering traits in plants. Am J Bot. 2019;106(3):469–76.

Janicke T, Morrow EH. Operational sex ratio predicts the opportunity and direction of sexual selection across animals. Ecol Lett. 2018;21(3):384–91.

Chamberlain R, Brunswick N, Siev J, McManus IC. Meta-analytic findings reveal lower means but higher variances in visuospatial ability in dyslexia. Brit J Psychol. 2018;109(4):897–916.

O’Dea RE, Lagisz M, Jennions MD, Nakagawa S. Gender differences in individual variation in academic grades fail to fit expected patterns for STEM. Nat Commun. 2018;9:3777.

Brugger SP, Angelescu I, Abi-Dargham A, Mizrahi R, Shahrezaei V, Howes OD. Heterogeneity of striatal dopamine function in schizophrenia: meta-analysis of variance. Biol Psychiat. 2020;87(3):215–24.

Usui T, Macleod MR, McCann SK, Senior AM, Nakagawa S. Meta-analysis of variation suggests that embracing variability improves both replicability and generalizability in preclinical research. Plos Biol. 2021;19(5): e3001009.

Hoffmann AA, Merila J. Heritable variation and evolution under favourable and unfavourable conditions. Trends Ecol Evol. 1999;14(3):96–101.

Wood CW, Brodie ED 3rd. Environmental effects on the structure of the G-matrix. Evolution. 2015;69(11):2927–40.

Hillebrand H, Donohue I, Harpole WS, Hodapp D, Kucera M, Lewandowska AM, Merder J, Montoya JM, Freund JA. Thresholds for ecological responses to global change do not emerge from empirical data. Nat Ecol Evol. 2020;4(11):1502.

Yang YF, Hillebrand H, Lagisz M, Cleasby I, Nakagawa S. Low statistical power and overestimated anthropogenic impacts, exacerbated by publication bias, dominate field studies in global change biology. Global Change Biol. 2022;28(3):969–89.

Nakagawa S, Santos ESA. Methodological issues and advances in biological meta-analysis. Evol Ecol. 2012;26(5):1253–74.

Bakbergenuly I, Hoaglin DC, Kulinskaya E. Estimation in meta-analyses of response ratios. BMC Med Res Methodol. 2020;20(1):1.

Bakbergenuly I, Hoaglin DC, Kulinskaya E. Estimation in meta-analyses of mean difference and standardized mean difference. Stat Med. 2020;39(2):171–91.

Doncaster CP, Spake R. Correction for bias in meta-analysis of little-replicated studies. Methods Ecol Evol. 2018;9(3):634–44.

Nakagawa S, Noble DW, Lagisz M, Spake R, Viechtbauer W, Senior AM. A robust and readily implementable method for the meta-analysis of response ratios with and without missing standard deviations. Ecol Lett. 2023;26(2):232–44

Hamman EA, Pappalardo P, Bence JR, Peacor SD, Osenberg CW. Bias in meta-analyses using Hedges’ d. Ecosphere. 2018;9(9): e02419.

Bakbergenuly I, Hoaglin DC, Kulinskaya E. On the Q statistic with constant weights for standardized mean difference. Brit J Math Stat Psy. 2022;75(3):444–65.

DerSimonian R, Kacker R. Random-effects model for meta-analysis of clinical trials: an update. Contemp Clin Trials. 2007;28(2):105–14.

Veroniki AA, Jackson D, Viechtbauer W, Bender R, Bowden J, Knapp G, Kuss O, Higgins JPT, Langan D, Salanti G. Methods to estimate the between-study variance and its uncertainty in meta-analysis. Res Synth Methods. 2016;7(1):55–79.

Langan D, Higgins JPT, Simmonds M. Comparative performance of heterogeneity variance estimators in meta-analysis: a review of simulation studies. Res Synth Methods. 2017;8(2):181–98.

Panityakul T, Bumrungsup C, Knapp G. On estimating residual heterogeneity in random-effects meta-regression: a comparative study. J Stat Theory Appl. 2013;12(3):253–65.

Bishop J, Nakagawa S. Quantifying crop pollinator dependence and its heterogeneity using multi-level meta-analysis. J Appl Ecol. 2021;58(5):1030–42.

Cheung MWL. Modeling dependent effect sizes with three-level meta-analyses: a structural equation modeling approach. Psychol Methods. 2014;19(2):211–29.

Bolker BM, Brooks ME, Clark CJ, Geange SW, Poulsen JR, Stevens MHH, White JSS. Generalized linear mixed models: a practical guide for ecology and evolution. Trends Ecol Evol. 2009;24(3):127–35.

Lajeunesse MJ. On the meta-analysis of response ratios for studies with correlated and multi-group designs. Ecology. 2011;92(11):2049–55.

Gleser LJ, Olkin I. Stochastically dependent effect sizes. In: Cooper H, Hedges LV, Valentine JC, editors. The handbook of research synthesis and meta-analysis. New York: Russell Sage Foundation; 2009.

Tipton E, Pustejovsky JE. Small-sample adjustments for tests of moderators and model fit using robust variance estimation in meta-regression. J Educ Behav Stat. 2015;40(6):604–34.

Hedges LV, Tipton E, Johnson MC. Robust variance estimation in meta-regression with dependent effect size estimates (vol 1, pg 39, 2010). Res Synth Methods. 2010;1(2):164–5.

Pustejovsky JE, Tipton E. Meta-analysis with robust variance estimation: expanding the range of working models. Prev Sci. 2021. https://doi.org/10.1007/s11121-021-01246-3 .

Cairns M, Prendergast LA. On ratio measures of heterogeneity for meta-analyses. Res Synth Methods. 2022;13(1):28–47.

Borenstein M, Higgins JPT, Hedges LV, Rothstein HR. Basics of meta-analysis: I2 is not an absolute measure of heterogeneity. Res Synth Methods. 2017;8(1):5–18.

Hoaglin DC. Practical challenges of I-2 as a measure of heterogeneity. Res Synth Methods. 2017;8(3):254–254.

Higgins JPT, Thompson SG. Quantifying heterogeneity in a meta-analysis. Stat Med. 2002;21(11):1539–58.

Higgins JPT, Thompson SG, Deeks JJ, Altman DG. Measuring inconsistency in meta-analyses. Brit Med J. 2003;327(7414):557–60.

Xiong CJ, Miller JP, Morris JC. Measuring study-specific heterogeneity in meta-analysis: application to an antecedent biomarker study of Alzheimer’s disease. Stat Biopharm Res. 2010;2(3):300–9.

Nakagawa S, Schielzeth H. Repeatability for Gaussian and non-Gaussian data: a practical guide for biologists. Biol Rev. 2010;85(4):935–56.

Senior AM, Grueber CE, Kamiya T, Lagisz M, O’Dwyer K, Santos ESA, Nakagawa S. Heterogeneity in ecological and evolutionary meta-analyses: its magnitude and implications. Ecology. 2016;97(12):3293–9.

Gelman A, Hill J. Data analysis using regression and multilevel/hierarchical models. Cambridge: Cambridge University Press; 2007.

Schielzeth H. Simple means to improve the interpretability of regression coefficients. Methods Ecol Evol. 2010;1(2):103–13.

Nakagawa S, Schielzeth H. A general and simple method for obtaining R2 from generalized linear mixed-effects models. Methods Ecol Evol. 2013;4(2):133–42.

Nakagawa S, Johnson PCD, Schielzeth H. The coefficient of determination R-2 and intra-class correlation coefficient from generalized linear mixed-effects models revisited and expanded. J R Soc Interface. 2017;14(134):20170213.

Aloe AM, Becker BJ, Pigott TD. An alternative to R-2 for assessing linear models of effect size. Res Synth Methods. 2010;1(3–4):272–83.

Cinar O, Umbanhowar J, Hoeksema JD, Viechtbauer W. Using information-theoretic approaches for model selection in meta-analysis. Res Synth Methods. 2021. https://doi.org/10.1002/jrsm.1489 .

Viechtbauer W. Model checking in meta-analysis. In: Schmid CH, Stijnen T, White IR, editors. Handbook of meta-analysis. Boca Raton: CRC; 2021.

Anzures-Cabrera J, Higgins JPT. Graphical displays for meta-analysis: An overview with suggestions for practice. Res Synth Methods. 2010;1(1):66–80.

Kossmeier M, Tran US, Voracek M. Charting the landscape of graphical displays for meta-analysis and systematic reviews: a comprehensive review, taxonomy, and feature analysis. Bmc Med Res Methodol. 2020;20(1):1.

IntHout J, Ioannidis JPA, Rovers MM, Goeman JJ. Plea for routinely presenting prediction intervals in meta-analysis. BMJ Open. 2016;6(7): e010247.

Moeyaert M, Ugille M, Beretvas SN, Ferron J, Bunuan R, Van den Noortgate W. Methods for dealing with multiple outcomes in meta-analysis a comparison between averaging effect sizes, robust variance estimation and multilevel meta-analysis. Int J Soc Res Methodol. 2017;20:559.

Nakagawa S, Lagisz M, O’Dea RE, Rutkowska J, Yang YF, Noble DWA, Senior AM. The orchard plot: cultivating a forest plot for use in ecology, evolution, and beyond. Res Synth Methods. 2021;12(1):4–12.

Rothstein H, Sutton AJ, Borenstein M. Publication bias in meta-analysis : prevention, assessment and adjustments. Hoboken: Wiley; 2005.

Nakagawa S, Lagisz M, Jennions MD, Koricheva J, Noble DWA, Parker TH, Sanchez-Tojar A, Yang YF, O’Dea RE. Methods for testing publication bias in ecological and evolutionary meta-analyses. Methods Ecol Evol. 2022;13(1):4–21.

Stanley TD, Doucouliagos H. Meta-regression analysis in economics and business. New York: Routledge; 2012.

Stanley TD, Doucouliagos H. Meta-regression approximations to reduce publication selection bias. Res Synth Methods. 2014;5(1):60–78.

Sterne JAC, Becker BJ, Egger M. The funnel plot. In: Rothstein H, Sutton AJ, Borenstein M, editors. Publication bias in meta-analysis: prevention, assessment and adjustments. Chichester: Wiley; 2005. p. 75–98.

Sterne JAC, Sutton AJ, Ioannidis JPA, Terrin N, Jones DR, Lau J, Carpenter J, Rucker G, Harbord RM, Schmid CH, et al. Recommendations for examining and interpreting funnel plot asymmetry in meta-analyses of randomised controlled trials. Br Med J. 2011;343:4002.

Egger M, Smith GD, Schneider M, Minder C. Bias in meta-analysis detected by a simple, graphical test. Brit Med J. 1997;315(7109):629–34.

Jennions MD, Moller AP. Relationships fade with time: a meta-analysis of temporal trends in publication in ecology and evolution. P Roy Soc B-Biol Sci. 2002;269(1486):43–8.

Koricheva J, Kulinskaya E. Temporal instability of evidence base: a threat to policy making? Trends Ecol Evol. 2019;34(10):895–902.

Raue A, Kreutz C, Maiwald T, Bachmann J, Schilling M, Klingmuller U, Timmer J. Structural and practical identifiability analysis of partially observed dynamical models by exploiting the profile likelihood. Bioinformatics. 2009;25(15):1923–9.

Matsushima Y, Noma H, Yamada T, Furukawa TA. Influence diagnostics and outlier detection for meta-analysis of diagnostic test accuracy. Res Synth Methods. 2020;11(2):237–47.

Viechtbauer W, Cheung MWL. Outlier and influence diagnostics for meta-analysis. Res Synth Methods. 2010;1(2):112–25.

Haddaway NR, Macura B. The role of reporting standards in producing robust literature reviews comment. Nat Clim Change. 2018;8(6):444–7.

Frampton G, Whaley P, Bennett M, Bilotta G, Dorne JLCM, Eales J, James K, Kohl C, Land M, Livoreil B, et al. Principles and framework for assessing the risk of bias for studies included in comparative quantitative environmental systematic reviews. Environ Evid. 2022;11(1):12.

Stanhope J, Weinstein P. Critical appraisal in ecology: what tools are available, and what is being used in systematic reviews? Res Synth Methods. 2022. https://doi.org/10.1002/jrsm.1609 .

Haddaway NR, Macura B, Whaley P, Pullin AS. ROSES RepOrting standards for systematic evidence syntheses: pro forma, flow-diagram and descriptive summary of the plan and conduct of environmental systematic reviews and systematic maps. Environ Evid. 2018;7(1):1.

Woodcock P, Pullin AS, Kaiser MJ. Evaluating and improving the reliability of evidence syntheses in conservation and environmental science: a methodology. Biol Conserv. 2014;176:54–62.

O’Dea RE, Lagisz M, Jennions MD, Koricheva J, Noble DWA, Parker TH, Gurevitch J, Page MJ, Stewart G, Moher D, et al. Preferred reporting items for systematic reviews and meta-analyses in ecology and evolutionary biology: a PRISMA extension. Biol Rev. 2021;96(5):1695–722.

Moher D, Liberati A, Tetzlaff J, Altman DG. Preferred reporting items for systematic reviews and meta-analyses: the PRISMA statement. Plos Med. 2009;6(7):e1000097.

Page MJ, McKenzie JE, Bossuyt PM, Boutron I, Hoffmann TC, Mulrow CD, Shamseer L, Tetzlaff JM, Akl EA, Brennan SE, et al. The PRISMA 2020 statement: an updated guideline for reporting systematic reviews. Plos Med. 2021;18(3): e1003583.

Wilkinson MD, Dumontier M, Aalbersberg IJ, Appleton G, Axton M, Baak A, Blomberg N, Boiten JW, Santos LBD, Bourne PE, et al. Comment: the FAIR guiding principles for scientific data management and stewardship. Sci Data. 2016;3: 160018.

Culina A, Baglioni M, Crowther TW, Visser ME, Woutersen-Windhouwer S, Manghi P. Navigating the unfolding open data landscape in ecology and evolution. Nat Ecol Evol. 2018;2(3):420–6.

Roche DG, Lanfear R, Binning SA, Haff TM, Schwanz LE, Cain KE, Kokko H, Jennions MD, Kruuk LE. Troubleshooting public data archiving: suggestions to increase participation. Plos Biol. 2014;12(1): e1001779.

Roche DG, Kruuk LEB, Lanfear R, Binning SA. Public data archiving in ecology and evolution: how well are we doing? Plos Biol. 2015;13(11): e1002295.

Culina A, van den Berg I, Evans S, Sanchez-Tojar A. Low availability of code in ecology: a call for urgent action. Plos Biol. 2020;18(7): e3000763.

Spake R, Mori AS, Beckmann M, Martin PA, Christie AP, Duguid MC, Doncaster CP. Implications of scale dependence for cross-study syntheses of biodiversity differences. Ecol Lett. 2021;24(2):374–90.

Osenberg CW, Sarnelle O, Cooper SD. Effect size in ecological experiments: the application of biological models in meta-analysis. Am Nat. 1997;150(6):798–812.

Noble DWA, Nakagawa S. Planned missing data designs and methods: options for strengthening inference, increasing research efficiency and improving animal welfare in ecological and evolutionary research. Evol Appl. 2021;14(8):1958–68.

Nakagawa S, Freckleton RP. Missing inaction: the dangers of ignoring missing data. Trends Ecol Evol. 2008;23(11):592–6.

Mavridis D, Chaimani A, Efthimiou O, Leucht S, Salanti G. Addressing missing outcome data in meta-analysis. Evid-Based Ment Health. 2014;17(3):85.

Ellington EH, Bastille-Rousseau G, Austin C, Landolt KN, Pond BA, Rees EE, Robar N, Murray DL. Using multiple imputation to estimate missing data in meta-regression. Methods Ecol Evol. 2015;6(2):153–63.

Kambach S, Bruelheide H, Gerstner K, Gurevitch J, Beckmann M, Seppelt R. Consequences of multiple imputation of missing standard deviations and sample sizes in meta-analysis. Ecol Evol. 2020;10(20):11699–712.

Hadfield JD, Nakagawa S. General quantitative genetic methods for comparative biology: phylogenies, taxonomies and multi-trait models for continuous and categorical characters. J Evol Biol. 2010;23(3):494–508.

Cinar O, Nakagawa S, Viechtbauer W. Phylogenetic multilevel meta-analysis: a simulation study on the importance of modelling the phylogeny. Methods Ecol Evol. 2021. https://doi.org/10.1111/2041-210X.13760 .

Ives AR, Zhu J. Statistics for correlated data: phylogenies, space, and time. Ecol Appl. 2006;16(1):20–32.

Maire A, Thierry E, Viechtbauer W, Daufresne M. Poleward shift in large-river fish communities detected with a novel meta-analysis framework. Freshwater Biol. 2019;64(6):1143–56.

Morrissey MB. Meta-analysis of magnitudes, differences and variation in evolutionary parameters. J Evol Biol. 2016;29(10):1882–904.

Cohen JM, Lajeunesse MJ, Rohr JR. A global synthesis of animal phenological responses to climate change. Nat Clim Change. 2018;8(3):224.

Gurevitch J, Morrison JA, Hedges LV. The interaction between competition and predation: a meta-analysis of field experiments. Am Nat. 2000;155(4):435–53.

Macartney EL, Lagisz M, Nakagawa S. The relative benefits of environmental enrichment on learning and memory are greater when stressed: a meta-analysis of interactions in rodents. Neurosci Biobehav R. 2022. https://doi.org/10.1016/j.neubiorev.2022.104554 .

Spake R, Bowler DE, Callaghan CT, Blowes SA, Doncaster CP, Antão LH, Nakagawa S, McElreath R, Chase JM. Understanding ‘it depends’ in ecology: a guide to hypothesising, visualising and interpreting statistical interactions. Biol Rev. 2023. https://doi.org/10.1111/brv.12939 .

Siviter H, Bailes EJ, Martin CD, Oliver TR, Koricheva J, Leadbeater E, Brown MJF. Agrochemicals interact synergistically to increase bee mortality. Nature. 2021;596(7872):389.

Salanti G, Schmid CH. Research synthesis methods special issue on network meta-analysis: introduction from the editors. Res Synth Methods. 2012;3(2):69–70.

Gomez-Creutzberg C, Lagisz M, Nakagawa S, Brockerhoff EG, Tylianakis JM. Consistent trade-offs in ecosystem services between land covers with different production intensities. Biol Rev. 2021;96(5):1989–2008.

Jackson D, White IR, Price M, Copas J, Riley RD. Borrowing of strength and study weights in multivariate and network meta-analysis. Stat Methods Med Res. 2017;26(6):2853–68.

Sanchez-Tojar A, Moran NP, O’Dea RE, Reinhold K, Nakagawa S. Illustrating the importance of meta-analysing variances alongside means in ecology and evolution. J Evol Biol. 2020;33(9):1216–23.

Riley RD, Lambert PC, Abo-Zaid G. Meta-analysis of individual participant data: rationale, conduct, and reporting. BMJ. 2010;340:c221.

Riley RD, Tierney JF, Stewart LA. Individual participant data meta-analysis : a handbook for healthcare research. 1st ed. Hoboken: Wiley; 2021.

Hudson LN, Newbold T, Contu S, Hill SLL, Lysenko I, De Palma A, Phillips HRP, Alhusseini TI, Bedford FE, Bennett DJ, et al. The database of the PREDICTS (projecting responses of ecological diversity in changing terrestrial systems) project. Ecol Evol. 2017;7(1):145–88.

Dornelas M, Antao LH, Moyes F, Bates AE, Magurran AE, Adam D, Akhmetzhanova AA, Appeltans W, Arcos JM, Arnold H, et al. BioTIME: a database of biodiversity time series for the anthropocene. Glob Ecol Biogeogr. 2018;27(7):760–86.

Mengersen K, Gurevitch J, Schmid CH. Meta-analysis of primary data. In: Koricheva J, Gurevitch J, Mengersen K, editors. Handbook of meta-analysis in ecology and evolution. Princeton: Princeton University Press; 2013. p. 300–12.

Spake R, O’Dea RE, Nakagawa S, Doncaster CP, Ryo M, Callaghan CT, Bullock JM. Improving quantitative synthesis to achieve generality in ecology. Nat Ecol Evol. 2022;6(12):1818–28.

Gerstner K, Moreno-Mateos D, Gurevitch J, Beckmann M, Kambach S, Jones HP, Seppelt R. Will your paper be used in a meta-analysis? Make the reach of your research broader and longer lasting. Methods Ecol Evol. 2017;8(6):777–84.

Haddaway NR. A call for better reporting of conservation research data for use in meta-analyses. Conserv Biol. 2015;29(4):1242–5.

Midolo G, De Frenne P, Holzel N, Wellstein C. Global patterns of intraspecific leaf trait responses to elevation. Global Change Biol. 2019;25(7):2485–98.

White IR, Schmid CH, Stijnen T. Choice of effect measure and issues in extracting outcome data. In: Schmid CH, Stijnen T, White IR, editors. Handbook of meta-analysis. Boca Raton: CRC; 2021.

Lajeunesse MJ. Bias and correction for the log response ratio in ecological meta-analysis. Ecology. 2015;96(8):2056–63.


Acknowledgements

SN, ELM, and ML were supported by the ARC (Australian Research Council) Discovery grant (DP200100367), and SN, YY, and ML by the ARC Discovery grant (DP210100812). YY was also supported by the National Natural Science Foundation of China (32102597). A part of this research was conducted while visiting the Okinawa Institute of Science and Technology (OIST) through the Theoretical Sciences Visiting Program (TSVP) to SN.

Australian Research Council Discovery grant (DP200100367); Australian Research Council Discovery grant (DP210100812); The National Natural Science Foundation of China (32102597).

Author information

Authors and affiliations.

Evolution & Ecology Research Centre and School of Biological, Earth and Environmental Sciences, University of New South Wales, Sydney, NSW, 2052, Australia

Shinichi Nakagawa, Yefeng Yang, Erin L. Macartney & Malgorzata Lagisz

Theoretical Sciences Visiting Program, Okinawa Institute of Science and Technology Graduate University, Onna, 904-0495, Japan

Shinichi Nakagawa

School of Biological Sciences, Whiteknights Campus, University of Reading, Reading, RG6 6AS, UK

Rebecca Spake


Contributions

SN was commissioned to write this article so he assembled a team of co-authors. SN discussed the idea with YY, ELM, RS and ML, and all of them contributed to the design of this review. ML led the survey working with YY and ELM, while YY led the creation of the accompanying webpage working with RS. SN supervised all aspects of this work and wrote the first draft, which was commented on, edited, and therefore, significantly improved by the other co-authors. All authors read and approved the final manuscript.

Corresponding authors

Correspondence to Shinichi Nakagawa or Yefeng Yang .

Ethics declarations

Ethics approval and consent to participate.

Not applicable.

Consent for publication

The authors provide consent for publication.

Competing interests

The authors report no competing interests.

Additional information

Publisher's note.

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

Additional file 1:

The survey of meta-analyses in environmental sciences.

Additional file 2:

The hands-on R tutorial.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/ . The Creative Commons Public Domain Dedication waiver ( http://creativecommons.org/publicdomain/zero/1.0/ ) applies to the data made available in this article, unless otherwise stated in a credit line to the data.


About this article

Cite this article.

Nakagawa, S., Yang, Y., Macartney, E.L. et al. Quantitative evidence synthesis: a practical guide on meta-analysis, meta-regression, and publication bias tests for environmental sciences. Environ Evid 12 , 8 (2023). https://doi.org/10.1186/s13750-023-00301-6


Received : 13 January 2023

Accepted : 23 March 2023

Published : 24 April 2023

DOI : https://doi.org/10.1186/s13750-023-00301-6


Keywords

  • Hierarchical models
  • Robust variance estimation
  • Spatial dependency
  • Variance–covariance matrix
  • Meta-analysis of variance
  • Network meta-analysis
  • Multivariate meta-analysis


  • Philosophy of Perception
  • Philosophy of Science
  • Philosophy of Action
  • Philosophy of Law
  • Philosophy of Religion
  • Philosophy of Mathematics and Logic
  • Practical Ethics
  • Social and Political Philosophy
  • Browse content in Religion
  • Biblical Studies
  • Christianity
  • East Asian Religions
  • History of Religion
  • Judaism and Jewish Studies
  • Qumran Studies
  • Religion and Education
  • Religion and Health
  • Religion and Politics
  • Religion and Science
  • Religion and Law
  • Religion and Art, Literature, and Music
  • Religious Studies
  • Browse content in Society and Culture
  • Cookery, Food, and Drink
  • Cultural Studies
  • Customs and Traditions
  • Ethical Issues and Debates
  • Hobbies, Games, Arts and Crafts
  • Lifestyle, Home, and Garden
  • Natural world, Country Life, and Pets
  • Popular Beliefs and Controversial Knowledge
  • Sports and Outdoor Recreation
  • Technology and Society
  • Travel and Holiday
  • Visual Culture
  • Browse content in Law
  • Arbitration
  • Browse content in Company and Commercial Law
  • Commercial Law
  • Company Law
  • Browse content in Comparative Law
  • Systems of Law
  • Competition Law
  • Browse content in Constitutional and Administrative Law
  • Government Powers
  • Judicial Review
  • Local Government Law
  • Military and Defence Law
  • Parliamentary and Legislative Practice
  • Construction Law
  • Contract Law
  • Browse content in Criminal Law
  • Criminal Procedure
  • Criminal Evidence Law
  • Sentencing and Punishment
  • Employment and Labour Law
  • Environment and Energy Law
  • Browse content in Financial Law
  • Banking Law
  • Insolvency Law
  • History of Law
  • Human Rights and Immigration
  • Intellectual Property Law
  • Browse content in International Law
  • Private International Law and Conflict of Laws
  • Public International Law
  • IT and Communications Law
  • Jurisprudence and Philosophy of Law
  • Law and Politics
  • Law and Society
  • Browse content in Legal System and Practice
  • Courts and Procedure
  • Legal Skills and Practice
  • Primary Sources of Law
  • Regulation of Legal Profession
  • Medical and Healthcare Law
  • Browse content in Policing
  • Criminal Investigation and Detection
  • Police and Security Services
  • Police Procedure and Law
  • Police Regional Planning
  • Browse content in Property Law
  • Personal Property Law
  • Study and Revision
  • Terrorism and National Security Law
  • Browse content in Trusts Law
  • Wills and Probate or Succession
  • Browse content in Medicine and Health
  • Browse content in Allied Health Professions
  • Arts Therapies
  • Clinical Science
  • Dietetics and Nutrition
  • Occupational Therapy
  • Operating Department Practice
  • Physiotherapy
  • Radiography
  • Speech and Language Therapy
  • Browse content in Anaesthetics
  • General Anaesthesia
  • Neuroanaesthesia
  • Clinical Neuroscience
  • Browse content in Clinical Medicine
  • Acute Medicine
  • Cardiovascular Medicine
  • Clinical Genetics
  • Clinical Pharmacology and Therapeutics
  • Dermatology
  • Endocrinology and Diabetes
  • Gastroenterology
  • Genito-urinary Medicine
  • Geriatric Medicine
  • Infectious Diseases
  • Medical Toxicology
  • Medical Oncology
  • Pain Medicine
  • Palliative Medicine
  • Rehabilitation Medicine
  • Respiratory Medicine and Pulmonology
  • Rheumatology
  • Sleep Medicine
  • Sports and Exercise Medicine
  • Community Medical Services
  • Critical Care
  • Emergency Medicine
  • Forensic Medicine
  • Haematology
  • History of Medicine
  • Browse content in Medical Skills
  • Clinical Skills
  • Communication Skills
  • Nursing Skills
  • Surgical Skills
  • Browse content in Medical Dentistry
  • Oral and Maxillofacial Surgery
  • Paediatric Dentistry
  • Restorative Dentistry and Orthodontics
  • Surgical Dentistry
  • Medical Ethics
  • Medical Statistics and Methodology
  • Browse content in Neurology
  • Clinical Neurophysiology
  • Neuropathology
  • Nursing Studies
  • Browse content in Obstetrics and Gynaecology
  • Gynaecology
  • Occupational Medicine
  • Ophthalmology
  • Otolaryngology (ENT)
  • Browse content in Paediatrics
  • Neonatology
  • Browse content in Pathology
  • Chemical Pathology
  • Clinical Cytogenetics and Molecular Genetics
  • Histopathology
  • Medical Microbiology and Virology
  • Patient Education and Information
  • Browse content in Pharmacology
  • Psychopharmacology
  • Browse content in Popular Health
  • Caring for Others
  • Complementary and Alternative Medicine
  • Self-help and Personal Development
  • Browse content in Preclinical Medicine
  • Cell Biology
  • Molecular Biology and Genetics
  • Reproduction, Growth and Development
  • Primary Care
  • Professional Development in Medicine
  • Browse content in Psychiatry
  • Addiction Medicine
  • Child and Adolescent Psychiatry
  • Forensic Psychiatry
  • Learning Disabilities
  • Old Age Psychiatry
  • Psychotherapy
  • Browse content in Public Health and Epidemiology
  • Epidemiology
  • Public Health
  • Browse content in Radiology
  • Clinical Radiology
  • Interventional Radiology
  • Nuclear Medicine
  • Radiation Oncology
  • Reproductive Medicine
  • Browse content in Surgery
  • Cardiothoracic Surgery
  • Gastro-intestinal and Colorectal Surgery
  • General Surgery
  • Neurosurgery
  • Paediatric Surgery
  • Peri-operative Care
  • Plastic and Reconstructive Surgery
  • Surgical Oncology
  • Transplant Surgery
  • Trauma and Orthopaedic Surgery
  • Vascular Surgery
  • Browse content in Science and Mathematics
  • Browse content in Biological Sciences
  • Aquatic Biology
  • Biochemistry
  • Bioinformatics and Computational Biology
  • Developmental Biology
  • Ecology and Conservation
  • Evolutionary Biology
  • Genetics and Genomics
  • Microbiology
  • Molecular and Cell Biology
  • Natural History
  • Plant Sciences and Forestry
  • Research Methods in Life Sciences
  • Structural Biology
  • Systems Biology
  • Zoology and Animal Sciences
  • Browse content in Chemistry
  • Analytical Chemistry
  • Computational Chemistry
  • Crystallography
  • Environmental Chemistry
  • Industrial Chemistry
  • Inorganic Chemistry
  • Materials Chemistry
  • Medicinal Chemistry
  • Mineralogy and Gems
  • Organic Chemistry
  • Physical Chemistry
  • Polymer Chemistry
  • Study and Communication Skills in Chemistry
  • Theoretical Chemistry
  • Browse content in Computer Science
  • Artificial Intelligence
  • Computer Architecture and Logic Design
  • Game Studies
  • Human-Computer Interaction
  • Mathematical Theory of Computation
  • Programming Languages
  • Software Engineering
  • Systems Analysis and Design
  • Virtual Reality
  • Browse content in Computing
  • Business Applications
  • Computer Security
  • Computer Games
  • Computer Networking and Communications
  • Digital Lifestyle
  • Graphical and Digital Media Applications
  • Operating Systems
  • Browse content in Earth Sciences and Geography
  • Atmospheric Sciences
  • Environmental Geography
  • Geology and the Lithosphere
  • Maps and Map-making
  • Meteorology and Climatology
  • Oceanography and Hydrology
  • Palaeontology
  • Physical Geography and Topography
  • Regional Geography
  • Soil Science
  • Urban Geography
  • Browse content in Engineering and Technology
  • Agriculture and Farming
  • Biological Engineering
  • Civil Engineering, Surveying, and Building
  • Electronics and Communications Engineering
  • Energy Technology
  • Engineering (General)
  • Environmental Science, Engineering, and Technology
  • History of Engineering and Technology
  • Mechanical Engineering and Materials
  • Technology of Industrial Chemistry
  • Transport Technology and Trades
  • Browse content in Environmental Science
  • Applied Ecology (Environmental Science)
  • Conservation of the Environment (Environmental Science)
  • Environmental Sustainability
  • Environmentalist Thought and Ideology (Environmental Science)
  • Management of Land and Natural Resources (Environmental Science)
  • Natural Disasters (Environmental Science)
  • Nuclear Issues (Environmental Science)
  • Pollution and Threats to the Environment (Environmental Science)
  • Social Impact of Environmental Issues (Environmental Science)
  • History of Science and Technology
  • Browse content in Materials Science
  • Ceramics and Glasses
  • Composite Materials
  • Metals, Alloying, and Corrosion
  • Nanotechnology
  • Browse content in Mathematics
  • Applied Mathematics
  • Biomathematics and Statistics
  • History of Mathematics
  • Mathematical Education
  • Mathematical Finance
  • Mathematical Analysis
  • Numerical and Computational Mathematics
  • Probability and Statistics
  • Pure Mathematics
  • Browse content in Neuroscience
  • Cognition and Behavioural Neuroscience
  • Development of the Nervous System
  • Disorders of the Nervous System
  • History of Neuroscience
  • Invertebrate Neurobiology
  • Molecular and Cellular Systems
  • Neuroendocrinology and Autonomic Nervous System
  • Neuroscientific Techniques
  • Sensory and Motor Systems
  • Browse content in Physics
  • Astronomy and Astrophysics
  • Atomic, Molecular, and Optical Physics
  • Biological and Medical Physics
  • Classical Mechanics
  • Computational Physics
  • Condensed Matter Physics
  • Electromagnetism, Optics, and Acoustics
  • History of Physics
  • Mathematical and Statistical Physics
  • Measurement Science
  • Nuclear Physics
  • Particles and Fields
  • Plasma Physics
  • Quantum Physics
  • Relativity and Gravitation
  • Semiconductor and Mesoscopic Physics
  • Browse content in Psychology
  • Affective Sciences
  • Clinical Psychology
  • Cognitive Psychology
  • Cognitive Neuroscience
  • Criminal and Forensic Psychology
  • Developmental Psychology
  • Educational Psychology
  • Evolutionary Psychology
  • Health Psychology
  • History and Systems in Psychology
  • Music Psychology
  • Neuropsychology
  • Organizational Psychology
  • Psychological Assessment and Testing
  • Psychology of Human-Technology Interaction
  • Psychology Professional Development and Training
  • Research Methods in Psychology
  • Social Psychology
  • Browse content in Social Sciences
  • Browse content in Anthropology
  • Anthropology of Religion
  • Human Evolution
  • Medical Anthropology
  • Physical Anthropology
  • Regional Anthropology
  • Social and Cultural Anthropology
  • Theory and Practice of Anthropology
  • Browse content in Business and Management
  • Business Ethics
  • Business Strategy
  • Business History
  • Business and Technology
  • Business and Government
  • Business and the Environment
  • Comparative Management
  • Corporate Governance
  • Corporate Social Responsibility
  • Entrepreneurship
  • Health Management
  • Human Resource Management
  • Industrial and Employment Relations
  • Industry Studies
  • Information and Communication Technologies
  • International Business
  • Knowledge Management
  • Management and Management Techniques
  • Operations Management
  • Organizational Theory and Behaviour
  • Pensions and Pension Management
  • Public and Nonprofit Management
  • Strategic Management
  • Supply Chain Management
  • Browse content in Criminology and Criminal Justice
  • Criminal Justice
  • Criminology
  • Forms of Crime
  • International and Comparative Criminology
  • Youth Violence and Juvenile Justice
  • Development Studies
  • Browse content in Economics
  • Agricultural, Environmental, and Natural Resource Economics
  • Asian Economics
  • Behavioural Finance
  • Behavioural Economics and Neuroeconomics
  • Econometrics and Mathematical Economics
  • Economic History
  • Economic Systems
  • Economic Methodology
  • Economic Development and Growth
  • Financial Markets
  • Financial Institutions and Services
  • General Economics and Teaching
  • Health, Education, and Welfare
  • History of Economic Thought
  • International Economics
  • Labour and Demographic Economics
  • Law and Economics
  • Macroeconomics and Monetary Economics
  • Microeconomics
  • Public Economics
  • Urban, Rural, and Regional Economics
  • Welfare Economics
  • Browse content in Education
  • Adult Education and Continuous Learning
  • Care and Counselling of Students
  • Early Childhood and Elementary Education
  • Educational Equipment and Technology
  • Educational Strategies and Policy
  • Higher and Further Education
  • Organization and Management of Education
  • Philosophy and Theory of Education
  • Schools Studies
  • Secondary Education
  • Teaching of a Specific Subject
  • Teaching of Specific Groups and Special Educational Needs
  • Teaching Skills and Techniques
  • Browse content in Environment
  • Applied Ecology (Social Science)
  • Climate Change
  • Conservation of the Environment (Social Science)
  • Environmentalist Thought and Ideology (Social Science)
  • Natural Disasters (Environment)
  • Social Impact of Environmental Issues (Social Science)
  • Browse content in Human Geography
  • Cultural Geography
  • Economic Geography
  • Political Geography
  • Browse content in Interdisciplinary Studies
  • Communication Studies
  • Museums, Libraries, and Information Sciences
  • Browse content in Politics
  • African Politics
  • Asian Politics
  • Chinese Politics
  • Comparative Politics
  • Conflict Politics
  • Elections and Electoral Studies
  • Environmental Politics
  • European Union
  • Foreign Policy
  • Gender and Politics
  • Human Rights and Politics
  • Indian Politics
  • International Relations
  • International Organization (Politics)
  • International Political Economy
  • Irish Politics
  • Latin American Politics
  • Middle Eastern Politics
  • Political Behaviour
  • Political Economy
  • Political Institutions
  • Political Methodology
  • Political Communication
  • Political Philosophy
  • Political Sociology
  • Political Theory
  • Politics and Law
  • Public Policy
  • Public Administration
  • Quantitative Political Methodology
  • Regional Political Studies
  • Russian Politics
  • Security Studies
  • State and Local Government
  • UK Politics
  • US Politics
  • Browse content in Regional and Area Studies
  • African Studies
  • Asian Studies
  • East Asian Studies
  • Japanese Studies
  • Latin American Studies
  • Middle Eastern Studies
  • Native American Studies
  • Scottish Studies
  • Browse content in Research and Information
  • Research Methods
  • Browse content in Social Work
  • Addictions and Substance Misuse
  • Adoption and Fostering
  • Care of the Elderly
  • Child and Adolescent Social Work
  • Couple and Family Social Work
  • Developmental and Physical Disabilities Social Work
  • Direct Practice and Clinical Social Work
  • Emergency Services
  • Human Behaviour and the Social Environment
  • International and Global Issues in Social Work
  • Mental and Behavioural Health
  • Social Justice and Human Rights
  • Social Policy and Advocacy
  • Social Work and Crime and Justice
  • Social Work Macro Practice
  • Social Work Practice Settings
  • Social Work Research and Evidence-based Practice
  • Welfare and Benefit Systems
  • Browse content in Sociology
  • Childhood Studies
  • Community Development
  • Comparative and Historical Sociology
  • Economic Sociology
  • Gender and Sexuality
  • Gerontology and Ageing
  • Health, Illness, and Medicine
  • Marriage and the Family
  • Migration Studies
  • Occupations, Professions, and Work
  • Organizations
  • Population and Demography
  • Race and Ethnicity
  • Social Theory
  • Social Movements and Social Change
  • Social Research and Statistics
  • Social Stratification, Inequality, and Mobility
  • Sociology of Religion
  • Sociology of Education
  • Sport and Leisure
  • Urban and Rural Studies
  • Browse content in Warfare and Defence
  • Defence Strategy, Planning, and Research
  • Land Forces and Warfare
  • Military Administration
  • Military Life and Institutions
  • Naval Forces and Warfare
  • Other Warfare and Defence Issues
  • Peace Studies and Conflict Resolution
  • Weapons and Equipment

The Oxford Handbook of Quantitative Methods in Psychology: Vol. 2: Statistical Analysis


30 Meta-Analysis and Quantitative Research Synthesis

Noel A. Card, Family Studies and Human Development, University of Arizona, Tucson, AZ

Deborah M. Casper, Family Studies and Human Development, University of Arizona, Tucson, AZ

  • Published: 01 October 2013

Meta-analysis is an increasingly common method of quantitatively synthesizing research results, with substantial advantages over traditional (i.e., qualitative or narrative) methods of literature review. This chapter is an overview of meta-analysis that provides the foundational knowledge necessary to understand the goals of meta-analysis and the process of conducting a meta-analysis, from the initial formulation of research questions through the interpretation of results. The chapter provides insights into the types of research questions that can and cannot be answered through meta-analysis as well as more practical information on the practices of meta-analysis. Finally, the chapter concludes with some advanced topics intended to alert readers to further possibilities available through meta-analysis.

Introduction to Meta-analysis

Meta-analysis, also referred to as quantitative research synthesis, is a systematic approach to quantitatively synthesizing empirical literature. By combining and comparing research results, meta-analysis is used to advance theory, resolve conflicts within a discipline, and identify directions for future research (Cooper & Hedges, 2009). We begin by describing what meta-analysis is and what it is not.

Basic Terminology

It is important to provide a foundation of basic terminology on which to build a more technical and advanced understanding of meta-analysis. First, we draw the distinction between meta-analysis and primary and secondary analysis. The second distinction we draw is between quantitative research synthesis and qualitative literature review.

Glass (1976) defined primary-, secondary-, and meta-analysis as the analysis of data in an original study, the re-analysis of data previously explored in an effort to answer new questions or existing questions in a new way, and the quantitative analysis of results from multiple studies, respectively. A notable distinction between meta-analysis and primary or secondary analysis involves the unit of analysis. In primary and secondary analyses, the units of analysis are most often the individual participants. In contrast, the units of analysis in a meta-analysis are the studies themselves or, more accurately, the effect sizes (defined below) of these studies.

A second foundational feature to consider is the distinction between quantitative research synthesis and qualitative literature review. Although both approaches are valuable to the advancement of knowledge, they differ with regard to focus and methodology. The focus of meta-analysis is on the integration of research outcomes, specifically in terms of effect sizes. In contrast, the focus of a qualitative literature review can be on research outcomes (although typically not focusing on effect sizes) but can also be on theoretical perspectives or typical practices in research. In terms of methods, scientists utilizing meta-analytic methodologies quantitatively synthesize findings to draw conclusions based on statistical principles. In contrast, scholars who conduct a qualitative literature review subjectively interpret and integrate research. Not considered in this chapter are other methodologies that fall between these two approaches on the taxonomy of literature review (for a more comprehensive review, see Card, 2012; Cooper, 1988).

As previously acknowledged, both quantitative research synthesis and qualitative literature review merit recognition for their respective contributions to the advancement of scientific knowledge. Quantitative literature reviews were developed to overcome many of the limitations of qualitative literature reviews, and we will highlight the advantages of quantitative literature reviews below. However, it is worth noting that quantitative research synthesis has also faced criticisms (Chalmers, Hedges, & Cooper, 2002). Following are some highlights in the history of meta-analysis (for a more thorough historical account, see Chalmers, Hedges, & Cooper, 2002; Hedges, 1992; Hunt, 1997; Olkin, 1990).

A Brief History

Research synthesis methodology can be traced as far back as 1904, when Karl Pearson integrated five studies examining the association between inoculation for typhoid fever and mortality (see Olkin, 1990). By the 1970s, at least three independent groups had started to combine results from multiple studies (Glass, 1976; Rosenthal & Rubin, 1978; Schmidt & Hunter, 1977), but the most influential work was Mary Smith and Gene Glass’s (1977) “meta-analysis” of psychotherapy, which was both ground-breaking and controversial. Smith and Glass’s (1977) meta-analysis sparked considerable controversy and debate as to the legitimacy of not only the findings but of the methodology itself (Eysenck, 1978). It is worth noting, however, that some have suggested the controversy surrounding Smith and Glass’s (1977) meta-analysis had much more to do with the results than the methodology (Card, 2012).

Following the somewhat turbulent introduction of meta-analysis into the social sciences, the 1980s offered significant contributions. These contributions came from both the advancement and dissemination of knowledge of meta-analytic techniques by way of published books describing the approach, as well as through the publication of research utilizing the methods (Glass, McGaw, & Smith, 1981; Hedges & Olkin, 1985; Hunter, Schmidt, & Jackson, 1982; Rosenthal, 1984). Since its introduction into the social sciences in the 1970s, meta-analysis has become increasingly visible and has made considerable contributions to numerous bodies of scholarly research (see Cochran, 1937; Hunter, Schmidt, & Hunter, 1979; Pearson, 1904; Rosenthal & Rubin, 1978; Glass & Smith, 1979; Smith & Glass, 1977).

Research Synthesis in the Social Sciences

Glass (1976) brought the need for meta-analysis to the forefront in a presidential address. It is not uncommon to observe conflicting findings across studies (Cooper & Hedges, 2009). These inconsistencies lead to confusion and impede progress in social science (as well as in the so-called hard sciences; Hedges, 1987). Quantitative research synthesis is a powerful approach that addresses this problem through the systematic integration of results from multiple studies that often individually report conflicting results.

Chapter Overview

The following chapter is an overview of meta-analysis that provides the foundational knowledge necessary to understand the goals of meta-analysis and the process of conducting a meta-analysis, from the initial formulation of research questions through the interpretation of results. The chapter provides insights into the types of research questions that can and cannot be answered through meta-analysis as well as more practical information on the practices of meta-analysis. Finally, we conclude the chapter with some advanced topics intended to alert readers to further possibilities available through meta-analysis. To begin, we consider the types of questions that can and cannot be answered through meta-analysis.

Problem Formulation

Questions That Can and Cannot Be Answered Through Meta-Analysis

One of the first things to consider when conducting scientific research is the question for which you seek an answer; meta-analysis is no exception. A primary purpose for conducting a meta-analytic review is to integrate findings across multiple studies; however, not all questions are suitable for this type of synthesis. Hundreds, or sometimes thousands, of individual research reports potentially exist on any given topic; therefore, after an initial search of the literature, it is important to narrow the focus, identify goals, and articulate concise research questions that can be answered by conducting a tractable meta-analysis. A common misconception by those unfamiliar with meta-analysis is that an entire discipline or phenomenon can be “meta-analyzed” (Card, 2012). Because of the infinite number of questions that could be asked (many of which could be answered using meta-analysis), this sort of goal is too broad. Rather, a more appropriate approach to quantitative research synthesis is to identify a narrowly focused goal or set of goals and corresponding research questions.

Identifying Goals and Research Questions

Cooper’s (1988) taxonomy of literature reviews identified multiple goals for meta-analysis. These include integration, theory development, and the identification of central issues within a discipline. We consider each of these goals in turn.

Integration. There are two general approaches to integrating research findings in meta-analysis: combining and comparing studies. The approach of combining studies is used to integrate effect sizes from multiple primary studies in an effort to estimate an overall, typical effect size. Inferences about this mean effect size can then be made by way of significance testing and/or confidence intervals. A second approach commonly used to integrate findings involves comparing studies. Also known as moderator analysis (addressed in more detail below), comparisons can be made across studies when a particular effect size is hypothesized to vary systematically with one or more of the coded study characteristics. Analyses to address each of these two approaches to integration are described below.

Theory Development. A second goal of meta-analysis involves the development of theory. Meta-analysis can be used quite effectively and efficiently toward this end. If associations between variables that have been meta-analytically combined are weak, then this might indicate that a theory positing stronger relations of the constructs in question should be abandoned or modified (Schmidt, 1992). If, on the other hand, associations are strong, then this may be an indication that the phenomenon under investigation is moving toward a more integrated theory. Ideally, meta-analyses can be used to evaluate competing theories that make different predictions about the associations studied. Either way, meta-analysis is a powerful tool that can be used toward the advancement of theory within the social sciences.

Identification of Central Issues. A final goal has to do with identifying central issues within a discipline or phenomenon. The exhaustive review of empirical findings can aid in the process of identifying key issues within a discipline, such as whether there is inadequate study of certain types of samples or methodologies. The statistical techniques of meta-analysis can address inconsistencies in the findings, attempting to predict these inconsistencies with coded study characteristics (i.e., moderator analyses). Both of these contributions are important to the process of identifying directions for future research and the advancement of knowledge.

Critiques of Meta-Analysis

Earlier, we described how the controversial nature of one of the earliest meta-analyses ( Smith & Glass, 1977 ) drew criticism not only of their findings but also of the technique of meta-analysis itself. Although these critiques have largely been rebuffed, they are still occasionally applied. Among the most common criticisms of meta-analysis are: (1) the “file drawer” problem; (2) the apples and oranges problem; (3) garbage in and garbage out; (4) the level of expertise required of the meta-analyst; and (5) the potential lack of qualitative finesse.

The “file drawer” problem. The “file drawer” problem, also known as the threat of publication bias, is based on the notion that significant results get published and nonsignificant findings get relegated to the “file drawer,” resulting in the potential for publication bias in meta-analysis (Rosenthal, 1979). To answer this criticism, however, meta-analysts typically employ both systematic and exhaustive search strategies to obtain published and unpublished reports in an effort to minimize this threat. In addition, there is an extensive collection of statistical procedures in meta-analysis that can be used to probe the existence, extent, and likely impact of publication bias (Rothstein, Sutton, & Borenstein, 2005).

The apples and oranges problem. The apples and oranges problem describes the potential process of combining such a diverse range of studies that the aggregated results are meaningless. For example, if a meta-analyst attempted to investigate the predictors of childhood internalizing problems by including studies focusing on depression, anxiety, and social withdrawal, then it could be argued that the aggregation of results across this diverse range of problems is meaningless. This critique, in our opinion, is conceptual rather than methodological: Did the scientist using meta-analytic techniques define a sampling frame of studies within which it is useful to combine results? Fortunately, meta-analytic reviews can use both (1) combination to estimate mean results and (2) comparison to evaluate whether studies with certain features differ. Put differently, meta-analysis allows for both general and specific results. Returning to the example of a meta-analyst investigating the predictors of child psychopathology, it might be useful to present results of both (1) predictors of general internalizing problems, and (2) comparisons of the distinct predictors of depression, anxiety, and social withdrawal.

Garbage in and garbage out. Garbage in, garbage out describes the practice of including poor-quality research reports in a meta-analysis, which results in only poor-quality conclusions. Although this critique is valid in some situations, we believe a more nuanced consideration of “garbage” is needed before it is used as a critique of a particular meta-analysis. In the next section, we provide this consideration by discussing how the limits of primary research place limits on the conclusions that can be drawn from meta-analysis of that research.

The level of expertise required of the meta-analyst. A common misconception is that meta-analysis requires advanced statistical expertise. We would argue that with basic methodological and quantitative training, such as is usually obtained in the first year of graduate school, many scientists could readily learn the basic techniques (through an introductory course or book on meta-analysis) to conduct a sound meta-analytic review.

The potential lack of qualitative finesse. A final criticism that has been raised is that meta-analysis lacks the “qualitative finesse” of a qualitative review. Perhaps tellingly, a definition of qualitative finesse is generally lacking when this critique is made, but it seems that this critique implies that a meta-analyst has not thought carefully and critically about the nuances of the studies and collection of studies. There certainly exist meta-analyses where this critique seems relevant—just as there exist primary quantitative studies in which careful thought seems lacking. The solution to this critique is not to abandon meta-analytic techniques, however, just as the solution to thoughtless primary studies is not to abandon statistical analyses of these data. Rather, this critique makes clear that meta-analysis—like any other methodological approach—is a tool to aid careful thinking, rather than a replacement for it.

Limits of Primary Research and Meta-Analysis

It is also important to recognize that the conclusions of a meta-analytic review must be tempered by the quality of the empirical research comprising this review. Many of the threats to drawing conclusions in primary research are likely to translate to meta-analysis as well. Perhaps the most salient threats involve flaws in the study design, sampling procedures, methodological artifacts, and statistical power.

Study design. The design of primary studies guides the types of conclusions that can be drawn from them; similarly, the design of the studies included in a meta-analysis guides the types of conclusions that can be drawn from it. Experimental designs, although powerful in their ability to permit inferences of causality, often do not share the same ecological validity as correlational designs; conversely, correlational designs do not permit inferences of causality. It follows that any limitation existing within the primary studies also exists within the meta-analyses that encompass those studies.

Sampling. Another limitation of primary studies is that it is difficult to support inferences generalizable beyond the sampling frame. When a sample is drawn from a homogeneous population, inferences can be made only for a limited set of individuals. Similarly, findings from a meta-analysis can only be generalized to populations within the sampling frame of the included studies; however, the collection of primary studies within a meta-analysis is likely to be more heterogeneous than one single primary study if it includes studies that are collectively diverse in their samples, even if each study sample is homogeneous.

Methodological artifacts. Both primary research and meta-analysis are subject to methodological shortcomings. Although it is difficult to describe all of the characteristics that make up a high-quality study, it is possible to identify those artifacts that likely lower the quality of the design. In primary studies, methodological issues need to be addressed prior to data collection. In contrast, meta-analysis can address these methodological artifacts in one of two ways. The first way is to compare (through moderator analyses) whether studies with different methodological features actually yield different findings. Second, for some artifacts (e.g., measurement unreliability) described near the end of this chapter, corrections can be made that allow for the analysis of effect sizes free of these artifacts. Artifact correction is rarely performed in primary research (with the exception of latent variable modeling to correct for unreliability) but is more commonly considered in meta-analyses.

Statistical power. Another limitation of much primary research is low statistical power (Maxwell, 2004). Statistical power, the probability of detecting an effect that truly exists, is often unacceptably low in primary research studies. This low power leads to the incorrect conclusion in primary studies that an effect does not exist (despite cautions against “accepting” the null hypothesis). Fortunately, meta-analysis is usually less affected by the inadequate power of primary studies because it combines a potentially large number of studies, thus resulting in greater statistical power.

Strengths of Meta-Analysis

As outlined above, there are limits to meta-analysis; however, meta-analysis should be recognized for its considerable strengths. We next briefly describe three of the most important of these: (1) a systematic and disciplined review process; (2) a way of combining and comparing large amounts of data; and (3) sophisticated reporting of findings (Lipsey & Wilson, 2001).

Systematic and disciplined review process. First, systematic procedures must be followed to conduct a comprehensive literature search, to consistently code comparable characteristics and effect sizes from studies, and to ensure the accuracy of combining results from multiple reports into one effect size. The processes of searching the literature, identifying studies, coding, and analyzing results have received tremendous attention in the literature on meta-analysis methodology, in contrast to most other forms of literature review. Although this work requires discipline, diligent attention to detail, and meticulous documentation on the part of the meta-analyst, when these procedures are followed, a large amount of data can be combined and compared, and the outcome is likely to be a significant contribution to the field.

Combining and comparing large amounts of data. Perhaps one of the greatest strengths of meta-analytic techniques is the ability to combine and compare large amounts of data that would otherwise be impossible to integrate in a meaningful way. It would assuredly exceed the capacity of almost any scholar to combine such large amounts of data and draw meaningful conclusions without quantitative literature review techniques. A related strength is the sophistication with which the findings are reported.

Sophisticated reporting of findings. Meta-analysis also offers sophistication in the way findings are reported. Unlike qualitative literature reviews, which derive and report conclusions and interpretations in a narrative format, meta-analysis uses statistical techniques to yield quantified conclusions. Meta-analysts commonly take advantage of visual tools such as stem-and-leaf plots, funnel plots, and tables of effect sizes in reporting their findings.

Searching the Literature

Defining a Sampling Frame

As in primary research, a sampling frame must be considered in meta-analysis. However, the unit of analysis in a meta-analysis is the study itself, as compared to the individual participants in most primary studies. If we are to make inferences about the population of studies of interest, it is necessary to define the population a priori by articulating a set of criteria for the types of studies included versus excluded from this sampling frame.

Identifying Inclusion and Exclusion Criteria

As mentioned, the inclusion and exclusion criteria define the sampling frame of a meta-analysis. Establishing clear and explicit criteria will help guide the search process, a consideration particularly important if multiple individuals are working on the project. A second reason for identifying clear criteria is that it will help define the population of interest to which generalizations can be made. A final reason that clear criteria are necessary has to do with the ideas of transparency and replication. As with the sampling in well-conducted and well-reported primary studies, each decision and subsequent procedure utilized in the literature search of a meta-analysis must be transparent and replicable. Some of the more common search techniques and sources of information are described next.

Search Techniques and Identifying Resources

Many techniques have been used quite successfully toward the goal of searching the literature and identifying relevant resources. Two important concepts related to the literature search are recall and precision (see White, 2009). Recall is the percentage of all relevant studies that actually exist that are retrieved by your search. Precision is the percentage of retrieved studies that meet the inclusion criteria for the meta-analysis. For example, if 80 eligible studies exist and a search returns 100 reports, 60 of which are eligible, recall is 60/80 = 75% and precision is 60/100 = 60%. The ideal literature search strategy provides both high recall and high precision, although the reality is that decisions that improve recall often lower precision and vice versa.

By using multiple methods of searching for literature, meta-analysts strive to maximize recall without imposing impractical detriments on precision. The most commonly used techniques include searching electronic databases using keywords, bibliographical reference volumes, unpublished works and other outlets (described below), conference presentations, funding agency lists, and research registries, as well as backward searches, forward searches, and personal communications with colleagues.

Electronic databases. Electronic databases are probably one of the most helpful tools for conducting literature searches developed in the past decades. Electronic database searches can now identify in a matter of hours or days as much of the relevant literature as would have taken weeks or months a few decades earlier (not to mention that these searches can be done from the comfort of one’s office rather than within the confines of a library). Most disciplines have electronic databases that serve primarily that particular discipline (e.g., PsycINFO for psychology, Medline for medicine, ERIC for education, etc.). With these and similar databases, the meta-analyst identifies the most relevant combination of keywords, wildcard marks (e.g., * ), and logical statements (e.g., and, or, not), and voluminous amounts of literature are quickly searched for matches. The electronic database is perhaps the most fruitful place to begin and is currently the primary tool used to search the literature.
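For instance, a hypothetical PsycINFO-style query for a meta-analysis of adolescent aggression (the specific terms here are illustrative, not prescriptive) might combine keywords, wildcards, and logical statements as follows:

    (adolescen* OR teen*) AND (aggress* OR bully*) NOT (review OR commentary)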

Despite their advantages, it is worth mentioning a few cautions regarding electronic databases. First, an electronic search must not be used exclusively, because much relevant work is not indexed in these databases. For example, many unpublished works might not be retrieved through electronic databases. Second, as mentioned previously, each discipline relies on one primary electronic database; therefore, multiple databases must be considered in your search. Third, electronic databases produce studies that match the keyword searches, but it is not possible to know what has been excluded. Using other search strategies and investigating why studies found by these strategies were not identified in the electronic database search is necessary to avoid unnecessary (and potentially embarrassing) omission of studies from a meta-analysis.

Bibliographical reference volumes. A method of locating relevant literature that was common as little as a decade ago is to search bibliographical reference volumes. These volumes are printed collections containing essentially the same information as electronic databases. Although these reference volumes are being phased out of circulation, you may find them useful if relevant literature was published some time ago (especially if the electronic databases have not yet incorporated this older literature).

Unpublished works. One of the challenges of meta-analysis has to do with publication bias (see Rothstein et al., 2005). If significant findings are more likely to be published than nonsignificant findings (which presumably have smaller effect sizes), then the exclusion of unpublished studies from a meta-analysis can be problematic. To balance this potential problem, the meta-analyst should make deliberate efforts to find and obtain unpublished studies. Some possible places to find such studies include conference program books, funding agency lists, and research registries.

Backward searches. Another technique commonly used in meta-analysis is the backward search. Once relevant reports are retrieved, it is recommended that the researcher thoroughly read each report and identify additional articles cited within these reports. This strategy is called a “backward” search because it proceeds backward in time from obtained studies toward previous studies.

Forward searches. A complementary procedure, known as the forward search, involves searching for additional studies that have cited the relevant studies included in your meta-analysis (“forward” because the search proceeds from older studies to newer studies citing these previous works). To conduct this type of search, special databases (e.g., Social Science Citation Index) are used.

Personal communication with researchers in the field. A final search technique involves personal communication with researchers in the field. It will be especially helpful to communicate with researchers in your field (those who will likely read your work) in an effort to locate resources that somehow escaped your comprehensive search efforts. An effective yet efficient way to do this is to simply email researchers in your field, let them know what type of meta-analysis you are conducting, and ask if they would be willing to peruse your reference list to see if there are any glaring oversights.

Coding Study Characteristics

In a meta-analysis, study characteristics are systematically coded for two reasons. First, this coded information is presented to describe the collective field being reviewed. For example, do studies primarily rely on White college students, or are the samples more diverse (either within or across studies)? Do studies rely on the same measures or types of measures, or has the phenomenon been studied using multiple measures?

A second reason for systematically coding study characteristics is for use as potential predictors of variation in effect sizes across studies (i.e., moderators, as described below in section titled Moderator Analyses). In other words, does variation across studies in the coded study characteristics co-occur with differences in results (i.e., effect sizes) from these studies? Ultimately, the decision of what study characteristics should be coded derives from the meta-analysts’ substantive understanding of the field. There are at least three general types of study features that are commonly considered: characteristics of the sample, the methodology, and the source.

Coding Sample Characteristics

Sample characteristics include any descriptions of the study samples that might systematically covary with study results (i.e., effect sizes). Some meta-analyses will include codes for the sampling procedures, such as whether the study used a representative sample or a convenience sample (e.g., college students), or whether the sample was selected from some specific setting, such as clinical treatment settings, schools, or prisons. Nearly all meta-analyses code various demographic features of the sample, such as the ethnic composition, proportion of the sample that is male or female, and the average age of participants in the sample.

Coding Methodological Characteristics

Potential methodological characteristics for coding include both design and measurement features. At a broad level, a meta-analyst might code broad types of designs, such as experimental, quasi-experimental, and single-subject ABAB studies. It might also be useful to code at more narrow levels, such as the type of control group used within experimental treatment studies (e.g., no contact, attention only, treatment as usual). Similarly, the types of measures used could be coded as either broad (e.g., parent vs. child reports) or narrow (e.g., CBCL vs. BASC parent reports). In practice, most meta-analysts will code methodological features at both broad and narrow levels, first considering broad-level features as predictors of variability in effect sizes, and then using more narrow-level features if there exists unexplained variation in results within these broad features.

Coding Source Characteristics

Source characteristics include features of the report or author that might plausibly be related to study findings. The most commonly coded source characteristic is whether the study was published, which is often used to evaluate potential publication bias. The year of publication (or presentation, for unpublished works) is often used as a proxy for the historical time in which the study was conducted. If the year predicts differences in effect sizes, then this may be evidence of historical change in the phenomenon over time. Other source characteristics, such as characteristics of the researcher (e.g., gender, ethnicity, discipline), are less commonly coded but are possibilities. For example, some meta-analyses of gender differences have coded the gender of the first author to evaluate the possibility that the researchers’ presumed biases may somehow impact the results found (e.g., Card, Stucky, Sawalani, & Little, 2008).

Coding Effect Sizes

As mentioned, study results in meta-analysis are represented as effect sizes. To be useful in meta-analysis, a potential effect size needs to meet four criteria. First, it needs to quantify the direction and magnitude of a phenomenon of interest. Second, it needs to be comparable across studies that use different sample sizes and scales of measurement. Third, it needs to be either consistently reported in the studies included in the meta-analysis or computable from commonly reported results. Fourth, it is necessary that the meta-analyst can compute its standard error, which is used for weighting studies in subsequent meta-analytic combination and comparison.

The three effect sizes most commonly used in meta-analyses all index associations between two variables. The correlation coefficient (typically denoted as r) quantifies associations between two continuous variables. The standardized mean differences are a family of effect sizes (we will focus on Hedges’ g) that quantify associations between a dichotomous (group) variable and a continuous variable. The odds ratio (denoted as either o or OR) is a useful and commonly used index for associations between two dichotomous variables (Fleiss, 1994). We next describe these three indexes of effect size: the correlation coefficient, the standardized mean difference, and the OR. After describing each of these effect size indexes, we will describe how they are computed from results commonly reported in empirical reports.

Correlation Coefficient

Correlation coefficients represent associations between two variables on a standardized scale from −1 to +1. Correlations near 0 denote the absence of association between the two variables. Positive values indicate that scores on one variable tend to be similar to scores on the other (relatively high scores on one variable tend to occur with relatively high scores on the other, and low scores with low scores), whereas negative values indicate the opposite (high scores with low scores). The correlation coefficient has the advantage of being widely recognized by scientists in diverse fields. A commonly applied suggestion is that r ≈ ±0.10 is considered small, r ≈ ±0.30 is considered medium, and r ≈ ±0.50 is considered large; however, disciplines and fields differ in their evaluations of what constitutes small or large correlations, and researchers should not be dogmatic in applying these benchmarks.

Although r has many advantages as an effect size, it has the undesirable property for meta-analysis of having sample estimates that are skewed around the population value. For this reason, meta-analysts should transform r to Fisher’s Z_r prior to analysis using the following equation:

$Z_r = \frac{1}{2}\ln\left(\frac{1+r}{1-r}\right)$ (1)

Although Z_r has desirable properties for meta-analytic combination and comparison, it is not very interpretable by most readers. Therefore, meta-analysts back-transform results in the Z_r metric (e.g., the mean effect size) to r for reporting using the following equation:

$r = \frac{e^{2Z_r} - 1}{e^{2Z_r} + 1}$ (2)

As mentioned earlier, and as will be described in greater detail below, it is necessary to compute the standard error of the effect size estimate (here, Z_r) for use in weighting studies in the meta-analysis. The standard error of Z_r, $SE_{Z_r}$, is a simple function of the study sample size:

$SE_{Z_r} = \frac{1}{\sqrt{n - 3}}$ (3)
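As a concrete illustration of Equations 1 through 3, the following Python sketch combines correlations from three primary studies. The correlations and sample sizes are hypothetical, and a simple fixed-effect inverse-variance weighting is assumed:

    import numpy as np

    # Hypothetical correlations and sample sizes from three primary studies
    r = np.array([0.25, 0.40, 0.10])
    n = np.array([50, 120, 80])

    # Fisher's Z_r transformation (Equation 1)
    z = 0.5 * np.log((1 + r) / (1 - r))

    # Standard error of each Z_r (Equation 3) and inverse-variance weights
    se = 1 / np.sqrt(n - 3)
    w = 1 / se**2

    # Weighted mean effect size in the Z_r metric (fixed-effect model)
    z_mean = np.sum(w * z) / np.sum(w)

    # Back-transform the mean to r for reporting (Equation 2)
    r_mean = (np.exp(2 * z_mean) - 1) / (np.exp(2 * z_mean) + 1)

    print(round(float(r_mean), 3))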

Standardized Mean Differences

There exist several standardized mean differences, which index associations between a dichotomous “group” variable and a continuous variable. Each of these standardized mean differences indexes the direction and magnitude of differences between two groups in standard deviation units. We begin with one of the more common of these indices, Hedges’ g, which is defined as:

$g = \frac{\bar{X}_1 - \bar{X}_2}{s_{pooled}}, \quad s_{pooled} = \sqrt{\frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}}$ (4)

The numerator of this equation contains the difference between the means of two groups (groups 1 and 2) and will yield a positive value if group 1 has a higher mean than group 2 or a negative value if group 2 has a higher mean than group 1. Although it is arbitrary which group is designated 1 or 2, this designation must be consistent across all studies coded for a meta-analysis.

If all studies in a meta-analysis use the same measure, or else different measures with the same scale, then the numerator of this equation alone would suffice as an effect size for meta-analysis (this is the unstandardized mean difference). However, the more common situation is that different scales are used across different studies, and in this situation it would make no sense to attempt to combine these unstandardized mean differences across studies. To illustrate, if one study comparing treatment to control groups measured an outcome on a 1 to 100 scale and found a 10-point difference, whereas another study measured the outcome on a 0 to 5 scale and found a 2-point difference, then there would be no way of knowing which—if either—study had a larger effect. To make these differences comparable across studies, it is necessary to standardize them in some way, typically by dividing the mean difference by a standard deviation.

As seen in Equation 4 above, the standard deviation in the divisor for g is the pooled (i.e., combined across the two groups) estimate of the population standard deviation. Other variants within the standardized mean difference family of effect sizes use different divisors. For example, the index d uses the pooled sample standard deviation, and a less commonly used index, g_Glass (also denoted as Glass’s Δ), uses the estimated population standard deviation for one group (the group that you believe provides a more accurate estimate of the population standard deviation, such as the control group if you believe that treatment impacts the standard deviation). The latter index (g_Glass) is less preferred because it cannot be computed from some commonly reported statistics (e.g., t tests), and it is a poorer estimate if the standard deviations are, in fact, comparable across groups (Hedges & Olkin, 1985).

In this chapter, we focus our attention primarily on g , and we will describe the computation of g from commonly reported results below. Like other standardized mean differences, g has a value of 0 when the groups do not differ (i.e., no association between the dichotomous group variable and the continuous variable), and positive or negative values depending on which group has a higher mean. Unlike r, g is not bounded at 1, but can have values greater than ±1.0 if the groups differ by more than one standard deviation.

Although g is a preferred index of standardized mean differences, it exhibits a slight bias when estimated from small samples (e.g., sample sizes less than 20). To correct for this bias, it is common to apply the following correction:

$$g_{corrected} = g\left(1 - \frac{3}{4(n_1+n_2)-9}\right) \qquad (5)$$

As with any effect size used in meta-analysis, it is necessary to compute the standard error of estimates of g for weighting during meta-analytic combination. The standard error of g is more precisely estimated using the sample sizes from both groups under consideration (i.e., $n_1$ and $n_2$ for groups 1 and 2, respectively) using the left portion of Equation 6, but it can be reasonably estimated using the overall sample size ($N_{Total}$; right portion of Equation 6) when exact group sizes are unknown but approximately equal (no more than a 3-to-1 discrepancy in group sizes; Card, 2012; Rosenthal, 1991):

$$SE_g = \sqrt{\frac{n_1+n_2}{n_1 n_2} + \frac{g^2}{2(n_1+n_2)}} \approx \sqrt{\frac{4}{N_{Total}}\left(1 + \frac{g^2}{8}\right)} \qquad (6)$$

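A short Python sketch of these computations, assuming group means, standard deviations, and sizes are available (names and values are illustrative):

```python
import math

def hedges_g(mean1, mean2, sd1, sd2, n1, n2):
    """Standardized mean difference g: mean difference over pooled SD (Equation 4)."""
    pooled_sd = math.sqrt(((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2)
                          / (n1 + n2 - 2))
    return (mean1 - mean2) / pooled_sd

def correct_g(g, n1, n2):
    """Small-sample bias correction for g (Equation 5)."""
    return g * (1 - 3.0 / (4 * (n1 + n2) - 9))

def se_g(g, n1, n2):
    """Exact standard error of g (left portion of Equation 6)."""
    return math.sqrt((n1 + n2) / (n1 * n2) + g**2 / (2 * (n1 + n2)))

g = hedges_g(10.5, 9.0, 3.0, 3.2, 12, 14)   # positive: group 1 above group 2
print(correct_g(g, 12, 14), se_g(g, 12, 14))
```
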
Odds Ratios

The odds ratio, denoted as either o or OR, is a useful index of associations between two dichotomous variables. Although readers might be familiar with other indices of two-variable associations, such as the rate (also known as risk) ratio or the phi coefficient, the OR is advantageous because it is not affected by differences in the base rates of dichotomous variables across studies and can be computed from a wider range of study designs (see Fleiss, 1994). The OR is estimated from 2 × 2 contingency tables by dividing the product of cell frequencies on the major diagonal (i.e., frequencies in the cells where values of the two variables are both 0, $n_{00}$, or both 1, $n_{11}$) by the product of cell frequencies off the diagonal (i.e., frequencies in the cells where the two variables have different values, $n_{10}$ and $n_{01}$):

$$o = \frac{n_{00}\,n_{11}}{n_{01}\,n_{10}} \qquad (7)$$

The OR has a rather different scale than either r or g. A value of 1.0 represents no association between the dichotomous variables, values from 1 down to 0 represent negative associations, and values from 1 to infinity represent positive associations. Given this scale, o is obviously skewed; therefore, a log transformation is applied to o when it is included in a meta-analysis: ln(o). The standard error of this log-transformed odds ratio is a function of the number of participants in each cell of the 2 × 2 contingency table:

$$SE_{\ln(o)} = \sqrt{\frac{1}{n_{00}} + \frac{1}{n_{01}} + \frac{1}{n_{10}} + \frac{1}{n_{11}}} \qquad (8)$$

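As a sketch, the log odds ratio and its standard error can be computed from the four cell counts as follows (the example table is hypothetical):

```python
import math

def log_odds_ratio(n00, n01, n10, n11):
    """ln(o) and its standard error from a 2 x 2 table (Equations 7 and 8)."""
    log_or = math.log((n00 * n11) / (n01 * n10))
    se = math.sqrt(1 / n00 + 1 / n01 + 1 / n10 + 1 / n11)
    return log_or, se

# 40 of 100 treated and 25 of 100 control participants showing the outcome:
print(log_odds_ratio(n00=75, n01=25, n10=60, n11=40))  # OR = 2.0, ln(o) = 0.693
```
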
Computing Effect Sizes From Commonly Reported Data

Ideally, all studies that you want to include in a meta-analysis will have effect sizes reported, and it is a fairly straightforward matter to simply record these. Unfortunately, many studies do not report effect sizes (despite many calls for this reporting; e.g., Wilkinson et al., 1999 ), and it is necessary to compute effect sizes from a wide variety of information reported in studies. Although it is not possible to consider all possibilities here, we next describe a few of the more common situations. Table 30.1 summarizes equations for computing r and g in these situations (note that it is typically necessary to reconstruct contingency tables from reported data to compute the odds ratio; see   Fleiss, 1994 ).

It is common for studies to report group comparisons in the form of either the (independent samples) t-test or the results of a two-group (i.e., 1 df) analysis of variance (ANOVA). This occurs either because the study focused on a truly dichotomous grouping variable (in which case the desired effect size is a standardized mean difference such as g) or because the study authors artificially dichotomized one of the continuous variables (in which case the desired effect size is r). In these cases, either r or g can be computed from the t statistic or F ratio as shown in Table 30.1. For the F ratio, it is critical that the result comes from a two-group (i.e., 1 df) comparison (for discussion of computing effect sizes from > 1 df F ratios, see Rosenthal, Rosnow, & Rubin, 2000). When computing g (but not r), a more precise estimate can be made if the two group sizes are known; otherwise, it is necessary to use the approximations shown to the right of Table 30.1 (e.g., in the first row for g, the exact formula is on the left and the approximation is on the right).

An alternate situation is that the study has performed repeated-measures comparisons (e.g., pretreatment vs. posttreatment) and reported results of dependent, or repeated-measures, t -tests, or F ratios. The equations for computing r from these results are identical to those for computing from independent samples tests; however, for g , the equations differ for independent versus dependent sample statistics, as seen in Table 30.1 .

A third possibility is that the study authors represent both variables that constitute your effect size of interest as dichotomous variables. The study might report the 1 df χ2 value for this contingency, or data from which the contingency table (and thus the χ2 value) can be constructed. In this situation, r and g are computed from this χ2 value and the sample size (N). As with the F ratio, it is important to keep in mind that this equation applies only to 1 df χ2 values (i.e., 2 × 2 contingency tables).

The last situation we will discuss is when the authors report none of the above statistics but do report a significance level (i.e., p ). Here, you can compute the one-tail standard normal deviate, Z , associated with this significance level (e.g., Z = 1.645 for p = 0.05) and then use the equations of Table 30.1 to compute r or g . These formulas are used when an exact significance level is reported (e.g., p = 0.027); if they are applied to ranges (e.g., p < 0.05), then they provide only a lower-bound estimate of the actual effect size.
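
The sketch below illustrates a few of these conversions in Python. It follows the standard conversion formulas rather than reproducing Table 30.1 itself, and the example statistics are invented:

```python
from math import sqrt
from statistics import NormalDist

def r_from_t(t, df):
    """r from an independent-samples t (df = n1 + n2 - 2); sign follows t."""
    return t / sqrt(t**2 + df)

def g_from_t(t, n1, n2):
    """g from an independent-samples t using exact group sizes."""
    return t * sqrt((n1 + n2) / (n1 * n2))

def r_from_chi2(chi2, n, sign=1):
    """r from a 1-df chi-square and total sample size N."""
    return sign * sqrt(chi2 / n)

def r_from_p(p_one_tailed, n, sign=1):
    """r via the standard normal deviate Z for an exact one-tailed p."""
    z = NormalDist().inv_cdf(1 - p_one_tailed)   # Z = 1.645 for p = .05
    return sign * z / sqrt(n)

print(r_from_t(2.5, 48), g_from_t(2.5, 25, 25))
print(r_from_chi2(6.2, 150), r_from_p(0.027, 80))
```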

Although we have certainly not covered all possible situations, these represent some of the most common situations you are likely to encounter when coding effect sizes for a meta-analysis. For details of these and other situations in which you might code effect sizes, see   Card (2012 ) or Lipsey and Wilson (2001 ).

Analysis of Mean Effect Sizes and Heterogeneity

After coding study characteristics and effect sizes from all studies included in a meta-analysis, it is possible to statistically combine and compare results across studies. In this section, we describe a method (fixed effects) of computing a mean effect size and making inferences about this mean. We then describe a test of heterogeneity that informs whether the between-study variability in effect sizes is greater than expectable by sampling fluctuation alone. Finally, we describe an alternative approach to computing mean effect sizes (random effects) that accounts for between-study variability.

Fixed-Effects Means

One of the primary goals of meta-analytic combination of effect sizes from multiple studies is to estimate an average effect size that exists in the literature and then to make inferences about this average effect size in the form of statistical significance and/or confidence intervals. Before describing how to estimate and make inferences about a mean effect size, we briefly describe the concept of weighting.

Weighting in Meta-Analysis. Nearly all analyses of effect sizes in meta-analysis (and all that we describe here) apply weights to studies. These weights are meant to index the degree of precision in each study's estimate of the population effect size, such that studies with more precise estimates receive greater weight in the analyses than studies with less precise estimates. The most straightforward weight is the inverse of the variance of a study's estimate of the population effect size. In other words, the weight of study i is the inverse of the squared standard error from that study:

$$w_i = \frac{1}{SE_i^2} \qquad (9)$$

As described above, the standard error of a study largely depends on the sample size (and for g , the effect size itself), such that studies with large samples have smaller standard errors than studies with small samples. Therefore, studies with large samples have larger weights than studies with smaller samples.

Fixed-Effects Mean Effect Sizes. After computing weights for each study using the equation above, estimating the mean effect size ($\overline{ES}$) across studies is a relatively simple matter of computing the weighted mean of effect sizes across all studies:

$$\overline{ES} = \frac{\sum_i w_i\,ES_i}{\sum_i w_i} \qquad (10)$$

This value represents the estimate of a single effect size in the population based on information combined from all studies included in the meta-analysis. Because it is often useful to draw inferential conclusions, the standard error of this estimate is computed using the equation:

$$SE_{\overline{ES}} = \sqrt{\frac{1}{\sum_i w_i}} \qquad (11)$$

This standard error can then be used to compute either statistical significance or confidence intervals. For determining statistical significance, the mean effect size is divided by the standard error, and the resulting ratio is evaluated as a standard normal deviate (i.e., Z -test, with, e.g., values larger than ±1.96 having p < 0.05). For computing confidence intervals, the standard error is multiplied by the standard normal deviate associated with the desired confidence interval (e.g., Z = 1.96 for a 95% confidence interval), and this product is then subtracted from and added to the mean effect size to identify the lower- and upper-bounds of the confidence interval.

If the effect size chosen for the meta-analysis (i.e., r, g , or o ) was transformed prior to analyses (e.g., r to Z r ), then the mean effect size and boundaries of its confidence interval will be in this transformed metric. It is usually more meaningful to back-transform these values to their original metrics for reporting.
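
Putting Equations 9 through 11 together, a minimal fixed-effects combination might look like the following Python sketch (the effect sizes and standard errors are illustrative):

```python
from math import sqrt

def fixed_effects(effect_sizes, standard_errors):
    """Fixed-effects mean, its SE, a 95% CI, and the Z test (Equations 9-11)."""
    w = [1 / se**2 for se in standard_errors]                        # Equation 9
    mean = sum(wi * es for wi, es in zip(w, effect_sizes)) / sum(w)  # Equation 10
    se_mean = sqrt(1 / sum(w))                                       # Equation 11
    ci = (mean - 1.96 * se_mean, mean + 1.96 * se_mean)
    z = mean / se_mean          # |Z| > 1.96 implies p < .05, two-tailed
    return mean, se_mean, ci, z

# Fisher's Z_r effect sizes and standard errors from three hypothetical studies:
print(fixed_effects([0.31, 0.42, 0.18], [0.10, 0.15, 0.08]))
```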

Heterogeneity

In addition to estimating a mean effect size, meta-analysts evaluate the variability of effect sizes across studies. Some degree of variability in effect sizes across studies is always expectable; the fact that different studies relied on different samples results in somewhat different estimates of effect sizes because of sampling variability. In situations where effect sizes differ by an amount expectable due to sampling variability, the studies are considered homogeneous with respect to their population effect sizes. However, if effect sizes vary across studies more than expected by sampling fluctuation alone, then they are considered heterogeneous (or varying) with respect to their population effect sizes.

It is common to perform a statistical test to evaluate heterogeneity. In this test, the null hypothesis is of homogeneity, or no variability, in population effect sizes across studies (i.e., any variability in sample effect sizes is caused by sampling variability), whereas the alternative hypothesis is of heterogeneity, or variability, in population effect sizes across studies (i.e., variability in sample effect sizes that is not accounted for by sampling variability alone). The result of this test is denoted by Q:

$$Q = \sum_i w_i\left(ES_i - \overline{ES}\right)^2 = \sum_i w_i ES_i^2 - \frac{\left(\sum_i w_i ES_i\right)^2}{\sum_i w_i} \qquad (12)$$

The statistical significance of this Q is evaluated against a χ2 distribution with df = number of studies − 1. You will note that this equation has two forms. The left portion of Equation 12 is the definitional equation, which makes clear that the squared deviation of each study i's effect size from the overall mean effect size is weighted and summed across studies. Therefore, small deviations from the mean contribute to small values of Q (homogeneity), whereas large deviations from the mean contribute to large values of Q (heterogeneity). The right portion of Equation 12 is an algebraic rearrangement that simplifies computation (i.e., a computational formula).
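
A computational sketch of Q in Python, using the right portion of Equation 12 (inputs are the same illustrative values used earlier):

```python
def q_statistic(effect_sizes, standard_errors):
    """Heterogeneity Q via the computational form (right portion of Equation 12).
    Compare against a chi-square distribution with df = k - 1."""
    w = [1 / se**2 for se in standard_errors]
    sum_w = sum(w)
    sum_wes = sum(wi * es for wi, es in zip(w, effect_sizes))
    sum_wes2 = sum(wi * es**2 for wi, es in zip(w, effect_sizes))
    return sum_wes2 - sum_wes**2 / sum_w

print(q_statistic([0.31, 0.42, 0.18], [0.10, 0.15, 0.08]))  # df = 2
```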

Results of this test have implications for subsequent analyses. Specifically, a conclusion of homogeneity (more properly, a failure to conclude heterogeneity) suggests that the fixed-effects mean described above is an acceptable way to summarize effect sizes, and this conclusion may contraindicate moderator analyses (described below). In contrast, a conclusion of heterogeneity implies that the fixed-effects mean is not an appropriate way to summarize effect sizes; rather, a random-effects model (described in the next section) should be used. Further, a conclusion of heterogeneity indicates that moderator analyses (described below) may help explain this between-study variance (i.e., heterogeneity). It is worth noting that the result of this heterogeneity test is not the sole basis for deciding to use random-effects models or to conduct moderator analyses, and meta-analysts often base these decisions on conceptual rather than empirical grounds (see Card, 2012; Hedges & Vevea, 1998).

Random-Effects Means

Estimation of means via a random-effects model relies on a different conceptual model and analytic approach than estimation via a fixed-effects model. We describe this conceptual model and estimation procedures next.

Conceptualization of Random-Effects Means. Previously, when we described estimation of a fixed-effects mean, we described a single population effect size. In contrast, a random-effects model assumes that there is a normal distribution of population effect sizes. This distribution of population effect sizes has a mean, which we estimate as described next. However, it also has a degree of spread, which can be indexed by the standard deviation (or variance) of effect sizes at the population level. To explicate the assumptions in equation form, the fixed- and random-effects models assume that the effect sizes observed in study i ($ES_i$) are a function of the following, respectively:

$$ES_i = \theta + \varepsilon_i \qquad (13)$$
$$ES_i = \mu + \xi_i + \varepsilon_i \qquad (14)$$

In both Equation 13 (fixed effects) and Equation 14 (random effects), the effect size in study i partly results from the sampling fluctuation of that study ($\varepsilon_i$). In the fixed-effects model, this sampling fluctuation is around a single population effect size ($\theta$). In contrast, the random-effects model specifies that the population effect size is a function of both a mean population effect size ($\mu$) and the deviation of the population effect size of study i from this mean ($\xi_i$). Although it is impossible to know the sampling fluctuation and the population deviation from a single study, it is possible to estimate the respective variances of each across studies.

Estimating Between-Study Population Variance. We described above the heterogeneity test, indexed by Q, which is a statistical test of whether variability in observed effect sizes across studies could be accounted for by sampling variability alone (i.e., the null hypothesis of homogeneity) or is greater than expected from sampling variability (i.e., the alternate hypothesis of heterogeneity). To estimate the between-study population variance τ2 in effect sizes, we evaluate how much greater Q is than the value expected under the null hypothesis of homogeneity (i.e., sampling variance alone):

$$\tau^2 = \frac{Q - (k-1)}{\sum_i w_i - \dfrac{\sum_i w_i^2}{\sum_i w_i}} \qquad (15)$$

Note that this equation is used only if Q ≥ k − 1 to avoid negative variance estimates (if Q < k − 1, τ 2 = 0). Although this equation is not intuitively obvious, consideration of the numerator helps clarify. Recall that large values of Q result when studies have effect sizes with large deviations from the mean effect size and that under the null hypothesis of homogeneity, Q is expected to equal the number of studies ( k ) minus 1. To the extent that Q is much larger than this expected value, the numerator of this equation will be large, implying large population between-study variability. In contrast, if Q is not much higher than the expected value under homogeneity, then the population between-study variability will be near zero.

Estimating Random-Effects Means. If studies have a sizable amount of randomly distributed between-study variance in their population effect sizes, then each is a less precise estimate of the mean population effect size. In other words, each contains more uncertainty as information for estimating this value. To capture this uncertainty, or lower precision, analyses under the random-effects model use a different weight than those of the fixed-effects model. Specifically, the random-effects weight, denoted as w* (or sometimes $w_{RE}$), for study i is the inverse of the sum of this between-study variance (τ2) and the sampling variance for that study (i.e., the squared standard error, $SE_i^2$):

$$w_i^* = \frac{1}{\tau^2 + SE_i^2} \qquad (16)$$

This random-effects weight will be smaller than the comparable fixed-effects weight, with the discrepancy increasing with greater between-study variance. These random-effects weights are simply used in the equations above to estimate a random-effects mean effect size (Equation 10), as well as a standard error for this mean (Equation 11) for inferential tests.
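
The random-effects computations can be sketched as follows; this is a self-contained illustration of Equations 15 and 16 combined with the earlier mean and standard error formulas, not a substitute for dedicated meta-analysis software:

```python
from math import sqrt

def random_effects(effect_sizes, standard_errors):
    """tau^2 (Equation 15), random-effects weights (Equation 16), and the
    random-effects mean and SE (Equations 10 and 11 with w* in place of w)."""
    k = len(effect_sizes)
    w = [1 / se**2 for se in standard_errors]
    q = (sum(wi * es**2 for wi, es in zip(w, effect_sizes))
         - sum(wi * es for wi, es in zip(w, effect_sizes))**2 / sum(w))
    c = sum(w) - sum(wi**2 for wi in w) / sum(w)
    tau2 = (q - (k - 1)) / c if q >= k - 1 else 0.0   # truncate at zero
    w_star = [1 / (tau2 + se**2) for se in standard_errors]
    mean = sum(wi * es for wi, es in zip(w_star, effect_sizes)) / sum(w_star)
    return tau2, mean, sqrt(1 / sum(w_star))

print(random_effects([0.31, 0.42, 0.18], [0.10, 0.15, 0.08]))
```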

Moderator Analyses

Moderator analyses are another approach to managing heterogeneity in effect sizes (Hedges & Pigott, 2004), but here the focus is on explaining (versus simply modeling as random) this between-study variance. These analyses use coded study characteristics to predict effect sizes; they are called "moderator" analyses because they evaluate whether the effect size (a two-variable association) differs depending on the level of a third, moderator variable (the study characteristic). Because it is often of primary interest to understand whether the association between two variables differs based on the level of a third variable, moderator analyses identifying the study characteristics that lead to higher or lower effect sizes are very commonly performed in meta-analyses. In this section, we briefly consider two types of moderators (i.e., categorical and continuous) along with the procedures used to investigate them in meta-analysis (i.e., an adapted ANOVA procedure and a multiple regression procedure, respectively).

Single Categorical Moderator

A categorical variable is any variable on which participants, observations, or, in the case of meta-analysis, studies can be distinctly classified. Testing categorical moderators in meta-analysis involves comparing the mean effects of groups of studies classified by their status on some categorical variable.

Evaluating the Significance of a Categorical Moderator. Categorical moderator analysis in meta-analysis is similar to ANOVA in primary research. In the context of primary research, ANOVA partitions variability among individuals into variability between and within groups of individuals. Similarly, in meta-analysis, the ANOVA procedure is used to partition between-study heterogeneity into heterogeneity that exists between and within groups of studies. Earlier (Equation 12), we provided the equation for quantifying heterogeneity as Q; we now provide this equation again, specifying that it quantifies the total heterogeneity among studies:

$$Q_{Total} = \sum_i w_i\left(ES_i - \overline{ES}\right)^2$$

This $Q_{Total}$ refers to the heterogeneity that exists across all studies. It can be partitioned into between-group ($Q_{Between}$) and within-group ($Q_{Within}$) components by the fact that $Q_{Total} = Q_{Between} + Q_{Within}$. It is simpler to compute $Q_{Within}$ than $Q_{Between}$, so it is common to subtract the former from the total heterogeneity to obtain the between-group heterogeneity. Within each group of studies g, the heterogeneity can be estimated among just the studies in that group:

$$Q_g = \sum_{i \in g} w_i\left(ES_i - \overline{ES}_g\right)^2$$

Then, these estimates of heterogeneity within each group can be summed across groups to yield the within-group heterogeneity:

$$Q_{Within} = \sum_g Q_g$$

As stated above, testing categorical moderators within an ANOVA framework is done by separating the total heterogeneity ($Q_{Total}$) into between-group ($Q_{Between}$) and within-group ($Q_{Within}$) heterogeneity. Therefore, after computing the total heterogeneity ($Q_{Total}$) and the within-group heterogeneity ($Q_{Within}$), you simply subtract the within-group heterogeneity from the total heterogeneity to find $Q_{Between}$. This value is evaluated against a χ2 distribution with df = number of groups − 1. If it is statistically significant, then this is evidence that the level of the categorical moderator predicts variability in effect sizes; that is, evidence of moderation.
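
A sketch of this partitioning in Python, with one grouping label per study (labels and values are invented):

```python
def weighted_q(effect_sizes, standard_errors):
    """Q for a set of studies (computational form of Equation 12)."""
    w = [1 / se**2 for se in standard_errors]
    swe = sum(wi * es for wi, es in zip(w, effect_sizes))
    swe2 = sum(wi * es**2 for wi, es in zip(w, effect_sizes))
    return swe2 - swe**2 / sum(w)

def q_between(effect_sizes, standard_errors, groups):
    """Q_Between = Q_Total - Q_Within; df = number of groups - 1."""
    q_total = weighted_q(effect_sizes, standard_errors)
    q_within = sum(
        weighted_q([es for es, g in zip(effect_sizes, groups) if g == label],
                   [se for se, g in zip(standard_errors, groups) if g == label])
        for label in set(groups))
    return q_total - q_within

print(q_between([0.45, 0.38, 0.12, 0.08], [0.10, 0.12, 0.09, 0.11],
                ["experimental", "experimental",
                 "observational", "observational"]))
```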

Single Continuous Moderator

A continuous study characteristic is one that is measured on a scale that can potentially take on an infinite, or at least large, number of values. In meta-analysis, a continuous moderator is a coded study characteristic (e.g., sample age, SES) that varies along a continuum of values and is hypothesized to predict effect sizes.

Similarly to the use of an adapted ANOVA procedure in the evaluation of categorical moderators in meta-analysis, we use an adapted multiple regression procedure for the evaluation of continuous moderators in meta-analysis ( Hedges & Pigott, 2004 ). This adaptation to the evaluation of a continuous moderator involves a weighted regression of the effect sizes (dependent variable) onto the continuous moderator.

To evaluate potential moderation by a continuous moderator within a multiple regression framework, we regress the effect sizes onto the hypothesized continuous moderator using a standard regression equation, $Z_{ES} = B_0 + B_1(\text{Study Characteristic}) + e$, using w as the (weighted least squares) weight. From the results, we are interested in the sum of squares of the regression model (which is the heterogeneity accounted for by the linear regression model, $Q_{Regression}$, evaluated against a chi-square distribution with df = number of predictors), and sometimes the residual sum of squares (which is $Q_{Residual}$, or the heterogeneity not explained by the study characteristic). The unstandardized regression coefficient indicates how the effect size changes per unit change in the continuous moderator. The standard error of this coefficient is inaccurate in standard regression output and must be adjusted by dividing it by the square root of the $MS_{Residual}$.

The statistical significance of the predictor can also be evaluated by dividing the regression coefficient ($B_1$) by the adjusted standard error, evaluated against the standard two-tailed Z distribution. Interpretation of moderation with continuous variables is not as straightforward as with categorical moderators; it is necessary to compute implied effect sizes at different levels of the continuous moderator.
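
A sketch of this weighted regression in Python using numpy. Note that the adjusted standard errors reduce to the square roots of the diagonal of (X'WX)^-1, which is exactly what dividing the conventional weighted-least-squares standard error by the square root of MS_Residual produces:

```python
import numpy as np

def continuous_moderator(effect_sizes, standard_errors, moderator):
    """Weighted (w = 1/SE^2) regression of effect sizes on a moderator,
    returning the coefficients, their adjusted SEs, the Z test for B1,
    and the residual heterogeneity Q_Residual."""
    y = np.asarray(effect_sizes, dtype=float)
    w = 1.0 / np.asarray(standard_errors, dtype=float) ** 2
    X = np.column_stack([np.ones_like(y), np.asarray(moderator, dtype=float)])
    xtwx = X.T @ (w[:, None] * X)
    b = np.linalg.solve(xtwx, X.T @ (w * y))        # B0, B1
    # Adjusted SEs: sqrt of the diagonal of (X'WX)^-1 (equivalently, the
    # conventional WLS standard error divided by sqrt(MS_Residual)).
    se_adj = np.sqrt(np.diag(np.linalg.inv(xtwx)))
    z = b[1] / se_adj[1]                            # two-tailed Z test
    resid = y - X @ b
    q_residual = float(resid @ (w * resid))         # unexplained heterogeneity
    return b, se_adj, z, q_residual

b, se_adj, z, q_res = continuous_moderator(
    [0.45, 0.38, 0.12, 0.08], [0.10, 0.12, 0.09, 0.11],
    moderator=[14.0, 12.5, 9.0, 8.0])               # e.g., mean sample age
print(b, se_adj, z, q_res)
```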

Multiple Regression to Analyze Categorical Moderators

Thus far we have considered moderation by a single categorical variable within an ANOVA framework and by a continuous variable within a regression framework. Next, we address categorical moderators within a multiple regression framework. Before doing so, it is useful to consider how the analyses we have described to this point fit within this general multiple regression framework.

The Empty Model. By empty model, we are referring to a model that includes only an intercept (a constant value of 1 for all cases) as a predictor. A weighted regression of effect sizes predicted only by a constant is often useful for an initial analysis of the mean effect size and to evaluate heterogeneity of these effect sizes across studies. The following equation, weighted by w, accomplishes this:

$$Z_{ES} = B_0 + e$$

In this empty model, the intercept regression coefficient is the mean effect size, and the sum of squares of the residual is the heterogeneity.

Use of Dummy Variables to Analyze Categorical Moderators. To evaluate categorical moderators in this meta-regression framework, dummy variables can be used to represent group membership. Here, we select a reference group, whose studies are assigned the value 0 on all of the dummy codes, and each dummy variable then represents the difference of another group relative to this reference group. The effect size is regressed onto the dummy variables, weighted by the inverse variance weight w, with the following equation (for m + 1 groups):

$$Z_{ES} = B_0 + B_1 D_1 + B_2 D_2 + \dots + B_m D_m + e$$

The results of this regression are interpreted as above. The $Q_{Regression}$ is equivalent to the $Q_{Between}$ of the ANOVA framework and is used to determine whether there is categorical moderation. To identify the particular groups that differ from the reference group, the regression coefficients for the dummy variables are considered. Again, the standard errors of these coefficients are inaccurate and need to be adjusted as described above.
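
A brief sketch of the dummy-coding approach, reusing the hypothetical continuous_moderator() helper from the previous section (labels and values are again invented):

```python
groups = ["experimental", "experimental", "observational", "observational"]
# "observational" is the reference group (0 on the dummy code):
dummy = [1.0 if g == "experimental" else 0.0 for g in groups]

b, se_adj, z, q_res = continuous_moderator(
    [0.45, 0.38, 0.12, 0.08], [0.10, 0.12, 0.09, 0.11], moderator=dummy)
print(b[1], z)  # B1: difference of "experimental" studies from the reference
```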

Multiple Moderators. A multiple regression framework can also be used to evaluate multiple categorical and/or continuous predictors in meta-analysis. Here, it is likely of interest to consider both the overall regression model ($Q_{Regression}$) as well as the results of particular predictors. The former is evaluated by interpreting the model sum of squares against a χ2 distribution with df = number of predictors (excluding the intercept). The latter is evaluated by dividing the regression coefficients by their adjusted standard errors.

Limitations to Interpretation of Moderators

Clearly, moderator analyses can enhance the conclusions drawn from meta-analysis, but there are some limitations that must also be considered. The first is multicollinearity among meta-analytic moderators. It is likely that some moderator variables will be correlated; this can be assessed by regressing each moderator onto the set of other moderators, using the same weights used in the moderator analyses. The second limitation is the possibility that uncoded variables are confounding the association and moderation of the variables that are coded. The best defense against confounding variables is to code as many variables as possible. Finally, it is important to be confident that the literature included in your synthesis adequately covers the range of potential moderator values; this is best assessed by plotting the included studies at the various levels of the moderator.

Advanced Topics

Given the existence of meta-analytic techniques over several decades, and their widespread use during this time, it is not surprising that there exists a rich literature on meta-analytic techniques. Although space has precluded us from discussing all of these topics, we next briefly describe a few of the more advanced topics important in this field.

Alternative Effect Sizes

The three effect sizes described in this chapter (i.e., r, g , and o ) quantify two-variable associations and are the most commonly used effect sizes for meta-analysis. However, there exist many other possibilities that might be considered.

Single Variable Effect Sizes. In some cases, it may be valuable to meta-analytically combine and/or compare information about single variables rather than two-variable associations. Central tendency can be indexed by the mean for continuous data or by the proportion for dichotomous variables; both of these effect sizes can be used in meta-analyses (see Lipsey & Wilson, 2001). It is also possible to use standard deviations or variances as effect sizes for meta-analysis to draw conclusions about interindividual differences. Meta-analytic combination of means and variances requires that the same measure, or else different measures on the same scale, be used across all studies.

Meaningful Metric. The effect sizes we have described are all in some standardized metric. However, there may be instances when the scales of the variables comprising the effect size are themselves meaningful, and it is therefore useful to use unstandardized effect sizes. Meta-analysis of such effect sizes was described in a special section of the journal Psychological Methods (see Becker, 2003).

Multivariate Effect Sizes. Many research questions go beyond two-variable associations to consider multivariate effect sizes, such as whether X uniquely predicts Y above and beyond Z. It is statistically possible to meta-analytically combine and compare multivariate effect sizes, such as regression coefficients or partial/semipartial correlations, to address such associations among X, Y, and Z. However, it is typically not possible in practice to use multivariate effect sizes for meta-analyses. The primary reason is that their use would require that the same multivariate analyses be performed and reported across all studies in the meta-analysis. For example, it would be necessary for every included study to report the regression of Y on X controlling for Z; studies that failed to control for Z, that instead controlled for W, or that controlled for both Z and W could not be included. A more tractable alternative to the meta-analysis of multivariate effect sizes is to perform multivariate meta-analysis of bivariate effect sizes, which we briefly describe below.

Artifact Corrections

Artifacts are study imperfections that lead to biases—typically underestimations—of effect sizes. For example, it is well known that unreliability of a measure attenuates (i.e., reduces) the magnitude of observed associations that this variable has with others relative to what would have been found with a perfectly measured variable. In addition to measurement unreliability, other artifacts include imperfect validity of measures, artificial dichotomization of continuous variables, and range restriction of the sample on a variable included in the effect size (direct range restriction) or another variable closely related to a variable in the effect size (indirect range restriction).

The general approach to correcting for artifacts is to compute a correction factor for each artifact using one of a variety of equations (see Hunter & Schmidt, 2004). For example, one of the more straightforward corrections is for unreliability of a measure of X (where $r_{xx}$ is the reliability of X):

$$a = \sqrt{r_{xx}}$$

Each of the artifact corrections yields a correction factor; these factors are then multiplied together to yield an overall artifact multiplier (a). This artifact multiplier is then used to estimate an adjusted effect size from the observed effect size, indexing what the effect size would likely have been if the artifacts (study imperfections) had not existed:

$$ES_{adjusted} = \frac{ES_{observed}}{a}$$

This estimation of artifact-free effect sizes from observed effect sizes is unbiased (i.e., it will not consistently over- or underestimate the true effect size), but it is not entirely precise. In other words, the artifact correction introduces additional uncertainty into the effect size estimate that must be considered in the meta-analysis. Specifically, the standard error of the effect size, which can be thought of as representing imprecision in the estimate of the effect size, is adjusted by the same artifact multiplier to account for this additional uncertainty:

$$SE_{adjusted} = \frac{SE_{observed}}{a}$$

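A minimal sketch of a single-artifact correction in Python (unreliability of a measure of X only; corrections for additional artifacts would be multiplied into a):

```python
from math import sqrt

def correct_unreliability(r_observed, se_observed, r_xx):
    """Disattenuate an observed r for unreliability of X and carry the
    artifact multiplier a = sqrt(r_xx) through to the standard error."""
    a = sqrt(r_xx)                   # artifact multiplier for this artifact
    return r_observed / a, se_observed / a

# Observed r = .30 (SE = .10) measured with reliability r_xx = .80:
print(correct_unreliability(0.30, 0.10, 0.80))   # approx (0.335, 0.112)
```
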
Multivariate Meta-Analysis

Multivariate meta-analysis is a relatively new and underdeveloped approach, but one that has great potential for use. Because the approach is fairly complex, and there is not general agreement on what techniques are best in different situations, we describe this approach in fairly general terms, referring interested readers to Becker (2009 ) or Cheung and Chan (2005 ).

The key idea of multivariate meta-analysis is to meta-analytically combine multiple bivariate effect sizes, which are then used as sufficient statistics for multivariate analyses. For example, to fit a model in which variable X is regressed on variables Y and Z , you would perform three meta-analyses of the three correlations ( r XY , r XZ , and r YZ ), and this matrix of meta-analytically combined correlations would then be used to estimate the multiple regression parameters.

Although the logic of this approach is reasonably simple, the application is much more complex. Challenges include how one handles the likely possibility that different studies provide different effect sizes, what the effective sample size is for the multivariate model when different studies inform different correlations, how (or even whether) to test for and potentially model between-study heterogeneity, and how to perform moderator analyses. Answers to these challenges have not been entirely agreed upon even by quantitative experts, making it difficult for those wishing to apply these models to answer substantive research questions. However, these models offer an extremely valuable potential for extending meta-analytic techniques to answer richer research questions than two-variable associations that are the typical focus of meta-analyses.

Although we have been able to provide only a brief overview of meta-analysis in this chapter, we hope that the opportunities of this methodology are clear. Given the overwhelming and increasing quantity of empirical research in most fields, techniques for best synthesizing the existing research are a critical tool in advancing our understanding.

References

Becker, B. J. (2003). Introduction to the special section on metric in meta-analysis. Psychological Methods, 8, 403–405.

Becker, B. J. (2009). Model-based meta-analysis. In H. Cooper, L. V. Hedges, & J. C. Valentine (Eds.), The handbook of research synthesis and meta-analysis (2nd ed., pp. 377–395). New York: Russell Sage Foundation.

Card, N. A. (2012). Meta-analysis: Quantitative synthesis of social science research. New York: Guilford.

Card, N. A., Stucky, B. D., Sawalani, G. M., & Little, T. D. (2008). Direct and indirect aggression during childhood and adolescence: A meta-analytic review of gender differences, intercorrelations, and relations to maladjustment. Child Development, 79, 1185–1229.

Chalmers, I., Hedges, L. V., & Cooper, H. (2002). A brief history of research synthesis. Evaluation and the Health Professions, 25, 12–37.

Cheung, M. W. L., & Chan, W. (2005). Meta-analytic structural equation modeling: A two-stage approach. Psychological Methods, 10, 40–64.

Cochran, W. G. (1937). Problems arising in the analysis of a series of similar experiments. Journal of the Royal Statistical Society, 4(Suppl.), 102–118.

Cooper, H. M. (1988). Organizing knowledge syntheses: A taxonomy of literature reviews. Knowledge in Society, 1, 104–126.

Cooper, H. M., & Hedges, L. V. (2009). Research synthesis as a scientific process. In H. Cooper, L. V. Hedges, & J. C. Valentine (Eds.), The handbook of research synthesis and meta-analysis (2nd ed., pp. 3–16). New York: Russell Sage Foundation.

Eysenck, H. J. (1978). An exercise in mega-silliness. American Psychologist, 33, 517.

Fleiss, J. H. (1994). Measures of effect size for categorical data. In H. Cooper & L. V. Hedges (Eds.), The handbook of research synthesis (pp. 245–260). New York: Russell Sage Foundation.

Glass, G. V. (1976). The critical state of the critical review article. Quarterly Review of Biology, 50th Anniversary Special Issue (1926–1976), 415–418.

Glass, G. V., McGraw, B., & Smith, M. L. (1981). Meta-analysis in social research. Thousand Oaks, CA: Sage.

Glass, G. V., & Smith, M. (1979). Meta-analysis of the relationship between class size and achievement. Educational Evaluation and Policy Analysis, 1, 2–16.

Hedges, L. V. (1987). How hard is hard science, how soft is soft science? The empirical cumulativeness of research. American Psychologist, 42, 443–455.

Hedges, L. V. (1992). Meta-analysis. Journal of Educational Statistics, 17, 279–296.

Hedges, L. V., & Olkin, I. (1985). Statistical methods for meta-analysis. San Diego, CA: Academic Press.

Hedges, L. V., & Pigott, T. D. (2004). The power of statistical tests for moderators in meta-analysis. Psychological Methods, 9, 426–445.

Hedges, L. V., & Vevea, J. L. (1998). Fixed- and random-effects models in meta-analysis. Psychological Methods, 3, 486–504.

Hunt, M. (1997). How science takes stock: The story of meta-analysis. New York: Russell Sage Foundation.

Hunter, J. E., & Schmidt, F. L. (2004). Methods of meta-analysis: Correcting error and bias in research findings (2nd ed.). Thousand Oaks, CA: Sage.

Hunter, J. E., Schmidt, F. L., & Hunter, R. (1979). Differential validity of employment tests by race: A comprehensive review and analysis. Psychological Bulletin, 86, 721–735.

Hunter, J. E., Schmidt, F. L., & Jackson, G. B. (1982). Meta-analysis: Cumulating research findings across studies. Beverly Hills, CA: Sage.

Lipsey, M. W., & Wilson, D. B. (2001). Practical meta-analysis. Thousand Oaks, CA: Sage.

Maxwell, S. E. (2004). The persistence of underpowered studies in psychological research: Causes, consequences, and remedies. Psychological Methods, 9, 147–163.

Olkin, I. (1990). History and goals. In K. W. Wachter & M. L. Straf (Eds.), The future of meta-analysis (pp. 3–10). New York: Russell Sage Foundation.

Pearson, K. (1904). Report on certain enteric fever inoculation statistics. British Medical Journal, 2, 1243–1246.

Rosenthal, R. (1979). The "file drawer problem" and tolerance for null results. Psychological Bulletin, 86, 638–641.

Rosenthal, R. (1984). Meta-analytic procedures for social research. Beverly Hills, CA: Sage.

Rosenthal, R. (1991). Meta-analytic procedures for social research (rev. ed.). Newbury Park, CA: Sage.

Rosenthal, R., Rosnow, R. L., & Rubin, D. B. (2000). Contrasts and effect sizes in behavioral research: A correlational approach. New York: Cambridge University Press.

Rosenthal, R., & Rubin, D. (1978). Interpersonal expectancy effects: The first 345 studies. Behavioral and Brain Sciences, 3, 377–415.

Rothstein, H. R., Sutton, A. J., & Borenstein, M. (Eds.) (2005). Publication bias in meta-analysis: Prevention, assessment and adjustments. Hoboken, NJ: Wiley.

Schmidt, F. L. (1992). What do data really mean? Research findings, meta-analysis, and cumulative knowledge in psychology. American Psychologist, 47, 1173–1181.

Schmidt, F. L., & Hunter, J. E. (1977). Development of a general solution to the problem of validity generalization. Journal of Applied Psychology, 62, 529–540.

Smith, M. L., & Glass, G. V. (1977). Meta-analysis of psychotherapy outcome studies. American Psychologist, 32, 752–760.

White, H. D. (2009). Scientific communication and literature retrieval. In H. Cooper, L. V. Hedges, & J. C. Valentine (Eds.), The handbook of research synthesis and meta-analysis (2nd ed., pp. 51–71). New York: Russell Sage Foundation.

Wilkinson, L., & The Task Force on Statistical Inference (1999). Statistical methods in psychology journals: Guidelines and explanations. American Psychologist, 54, 594–604.


Meta-analysis and the science of research synthesis

Jessica Gurevitch, Julia Koricheva, Shinichi Nakagawa & Gavin Stewart

Nature 555, 175–182 (2018). Published: 08 March 2018

Meta-analysis is the quantitative, scientific synthesis of research results. Since the term and modern approaches to research synthesis were first introduced in the 1970s, meta-analysis has had a revolutionary effect in many scientific fields, helping to establish evidence-based practice and to resolve seemingly contradictory research outcomes. At the same time, its implementation has engendered criticism and controversy, in some cases general and others specific to particular disciplines. Here we take the opportunity provided by the recent fortieth anniversary of meta-analysis to reflect on the accomplishments, limitations, recent advances and directions for future developments in the field of research synthesis.



Acknowledgements

We dedicate this Review to the memory of Ingram Olkin and William Shadish, founding members of the Society for Research Synthesis Methodology who made tremendous contributions to the development of meta-analysis and research synthesis and to the supervision of generations of students. We thank L. Lagisz for help in preparing the figures. We are grateful to the Center for Open Science and the Laura and John Arnold Foundation for hosting and funding a workshop, which was the origination of this article. S.N. is supported by Australian Research Council Future Fellowship (FT130100268). J.G. acknowledges funding from the US National Science Foundation (ABI 1262402).

Author information

Authors and Affiliations

Department of Ecology and Evolution, Stony Brook University, Stony Brook, 11794-5245, New York, USA

Jessica Gurevitch

School of Biological Sciences, Royal Holloway University of London, Egham, TW20 0EX, Surrey, UK

Julia Koricheva

Evolution and Ecology Research Centre and School of Biological, Earth and Environmental Sciences, University of New South Wales, Sydney, 2052, New South Wales, Australia

Shinichi Nakagawa

Diabetes and Metabolism Division, Garvan Institute of Medical Research, 384 Victoria Street, Darlinghurst, Sydney, 2010, New South Wales, Australia

School of Natural and Environmental Sciences, Newcastle University, Newcastle upon Tyne, NE1 7RU, UK

Gavin Stewart


Contributions

All authors contributed equally in designing the study and writing the manuscript, and so are listed alphabetically.

Corresponding authors

Correspondence to Jessica Gurevitch, Julia Koricheva, Shinichi Nakagawa, or Gavin Stewart.

Ethics declarations

Competing interests

The authors declare no competing financial interests.

Additional information

Reviewer Information Nature thanks D. Altman, M. Lajeunesse, D. Moher and G. Romero for their contribution to the peer review of this work.



About this article

Cite this article

Gurevitch, J., Koricheva, J., Nakagawa, S. et al. Meta-analysis and the science of research synthesis. Nature 555, 175–182 (2018). https://doi.org/10.1038/nature25753


Received: 04 March 2017

Accepted: 12 January 2018

Published: 08 March 2018

Issue Date: 08 March 2018

DOI: https://doi.org/10.1038/nature25753


Study Design 101: Meta-Analysis


A subset of systematic reviews; a method for systematically combining pertinent qualitative and quantitative study data from several selected studies to develop a single conclusion that has greater statistical power. This conclusion is statistically stronger than the analysis of any single study, due to increased numbers of subjects, greater diversity among subjects, or accumulated effects and results.
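The statistical gain can be made concrete with the standard fixed-effect inverse-variance pooling formulas (a minimal textbook sketch, not specific to this guide; \(\hat{\theta}_i\) denotes the effect estimate of study \(i\) and \(\hat{\sigma}_i^2\) its variance):

```latex
\hat{\theta}_{\mathrm{pooled}} = \frac{\sum_{i=1}^{k} w_i\,\hat{\theta}_i}{\sum_{i=1}^{k} w_i},
\qquad w_i = \frac{1}{\hat{\sigma}_i^{2}},
\qquad \mathrm{Var}\big(\hat{\theta}_{\mathrm{pooled}}\big) = \frac{1}{\sum_{i=1}^{k} w_i}
```

Because the pooled variance is the reciprocal of the summed weights, every added study shrinks it, which is the source of the greater statistical power described above.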

Meta-analysis would be used for the following purposes:

  • To establish statistical significance with studies that have conflicting results
  • To develop a more correct estimate of effect magnitude
  • To provide a more complex analysis of harms, safety data, and benefits
  • To examine subgroups with individual numbers that are not statistically significant

If the individual studies are randomized controlled trials (RCTs), combining the results of several selected RCTs constitutes the highest level of evidence on the evidence hierarchy, followed by systematic reviews, which analyze all available studies on a topic.

Advantages

  • Greater statistical power
  • Confirmatory data analysis
  • Greater ability to extrapolate to general population affected
  • Considered an evidence-based resource

Disadvantages

  • Difficult and time consuming to identify appropriate studies
  • Not all studies provide adequate data for inclusion and analysis
  • Requires advanced statistical techniques
  • Heterogeneity of study populations

Design pitfalls to look out for

The studies pooled for review should be similar in type (i.e. all randomized controlled trials).

Are the studies being reviewed all the same type of study or are they a mixture of different types?

The analysis should include published and unpublished results to avoid publication bias.

Does the meta-analysis include any appropriate relevant studies that may have had negative outcomes?

Fictitious Example

Do individuals who wear sunscreen have fewer cases of melanoma than those who do not wear sunscreen? A MEDLINE search was conducted using the terms melanoma, sunscreening agents, and zinc oxide, resulting in 8 randomized controlled studies, each with between 100 and 120 subjects. All of the studies showed a positive effect between wearing sunscreen and reducing the likelihood of melanoma. The subjects from all eight studies (total: 860 subjects) were pooled and statistically analyzed to determine the effect of the relationship between wearing sunscreen and melanoma. This meta-analysis showed a 50% reduction in melanoma diagnosis among sunscreen-wearers.

Real-life Examples

Goyal, A., Elminawy, M., Kerezoudis, P., Lu, V., Yolcu, Y., Alvi, M., & Bydon, M. (2019). Impact of obesity on outcomes following lumbar spine surgery: A systematic review and meta-analysis. Clinical Neurology and Neurosurgery, 177 , 27-36. https://doi.org/10.1016/j.clineuro.2018.12.012

This meta-analysis was interested in determining whether obesity affects the outcome of spinal surgery. Some previous studies have shown higher perioperative morbidity in patients with obesity while other studies have not shown this effect. This study looked at surgical outcomes including "blood loss, operative time, length of stay, complication and reoperation rates and functional outcomes" between patients with and without obesity. A meta-analysis of 32 studies (23,415 patients) was conducted. There were no significant differences for patients undergoing minimally invasive surgery, but patients with obesity who had open surgery had experienced higher blood loss and longer operative times (not clinically meaningful) as well as higher complication and reoperation rates. Further research is needed to explore this issue in patients with morbid obesity.

Nakamura, A., van Der Waerden, J., Melchior, M., Bolze, C., El-Khoury, F., & Pryor, L. (2019). Physical activity during pregnancy and postpartum depression: Systematic review and meta-analysis. Journal of Affective Disorders, 246 , 29-41. https://doi.org/10.1016/j.jad.2018.12.009

This meta-analysis explored whether physical activity during pregnancy prevents postpartum depression. Seventeen studies were included (93,676 women) and analysis showed a "significant reduction in postpartum depression scores in women who were physically active during their pregnancies when compared with inactive women." Possible limitations or moderators of this effect include intensity and frequency of physical activity, type of physical activity, and timepoint in pregnancy (e.g. trimester).

Related Terms

Systematic Review

A document often written by a panel that provides a comprehensive review of all relevant studies on a particular clinical or health-related topic/question.

Publication Bias

A phenomenon in which studies with positive results have a better chance of being published, are published earlier, and are published in journals with higher impact factors. Therefore, conclusions based exclusively on published studies can be misleading.

Now test yourself!

1. A Meta-Analysis pools together the sample populations from different studies, such as Randomized Controlled Trials, into one statistical analysis and treats them as one large sample population with one conclusion.

a) True b) False

2. One potential design pitfall of Meta-Analyses that is important to pay attention to is:

a) Whether it is evidence-based.
b) If the authors combined studies with conflicting results.
c) If the authors appropriately combined studies so they did not compare apples and oranges.
d) If the authors used only quantitative data.

Open Access | Published: 01 August 2019

A step by step guide for conducting a systematic review and meta-analysis with simulation data

  • Gehad Mohamed Tawfik 1 , 2 ,
  • Kadek Agus Surya Dila 2 , 3 ,
  • Muawia Yousif Fadlelmola Mohamed 2 , 4 ,
  • Dao Ngoc Hien Tam 2 , 5 ,
  • Nguyen Dang Kien 2 , 6 ,
  • Ali Mahmoud Ahmed 2 , 7 &
  • Nguyen Tien Huy 8 , 9 , 10  

Tropical Medicine and Health, volume 47, article number 46 (2019)


The abundance of studies relating to tropical medicine and health has increased strikingly over the last few decades. In this field, a well-conducted systematic review and meta-analysis (SR/MA) is considered a feasible solution for keeping clinicians abreast of current evidence-based medicine. Understanding the steps of an SR/MA is of paramount importance before conducting one, yet it is not easy, because researchers face many obstacles along the way. To address these hindrances, this methodology study provides a step-by-step approach, aimed mainly at beginners and junior researchers in tropical medicine and other health care fields, to properly conducting an SR/MA; every step described here reflects our experience and expertise combined with well-known and accepted international guidance.

We suggest that all steps of the SR/MA be performed independently by 2–3 reviewers, with discussion to resolve disagreements, to ensure data quality and accuracy.

SR/MA steps include development of the research question, forming criteria, building the search strategy, searching databases, protocol registration, title and abstract screening, full-text screening, manual searching, data extraction, quality assessment, data checking, statistical analysis, double data checking, and manuscript writing.

Introduction

The number of studies published in the biomedical literature, especially in tropical medicine and health, has increased strikingly over the last few decades. This massive abundance of literature makes clinical medicine increasingly complex, and knowledge from multiple studies is often needed to inform a particular clinical decision. However, the available studies are often heterogeneous in design, operational quality, and population under study, and may handle the research question in different ways, which adds to the complexity of synthesizing evidence and conclusions [1].

Systematic reviews and meta-analyses (SR/MAs) carry a high level of evidence, as represented by the evidence-based pyramid. Therefore, a well-conducted SR/MA is considered a feasible solution for keeping health clinicians abreast of contemporary evidence-based medicine.

Unlike a systematic review, an unsystematic narrative review tends to be descriptive: the authors often select articles based on their own point of view, which leads to poor quality. A systematic review, on the other hand, is defined as a review that uses a systematic method to summarize evidence on a question with a detailed and comprehensive plan of study. Furthermore, despite the increasing number of guidelines for conducting a systematic review effectively, the basic steps start from framing the question, then identifying relevant work (developing criteria and searching for articles), appraising the quality of included studies, summarizing the evidence, and interpreting the results [2, 3]. However, these apparently simple steps are not easy to achieve in practice: a researcher can struggle with many problems for which no detailed guidance exists.

Conducting an SR/MA in tropical medicine and health may be difficult, especially for young researchers, so understanding its essential steps is crucial. To overcome the obstacles involved, we provide a flow diagram (Fig. 1) that illustrates the stages of SR/MA studies in detail, step by step. This methodology study aims to give beginners and junior researchers, in tropical medicine and other health care fields, a step-by-step approach to properly and succinctly conducting an SR/MA; every step described here reflects our experience and expertise combined with well-known and accepted international guidance.

Figure 1. Detailed flow diagram guideline for systematic review and meta-analysis steps. Note: the star icon refers to “2–3 reviewers screen independently.”

Methods and results

Detailed steps for conducting any systematic review and meta-analysis.

We searched the methods reported in published SR/MAs in tropical medicine and other health care fields, together with published guidelines such as the Cochrane handbook [4], to collect the best low-bias method for each step of SR/MA conduction. Furthermore, we drew on the guidelines that we apply in our own SR/MA studies. We combined these methods to produce a detailed flow diagram that shows how the SR/MA steps are conducted.

Any SR/MA must follow the widely accepted Preferred Reporting Items for Systematic Reviews and Meta-Analyses statement (PRISMA checklist 2009) (Additional file 5: Table S1) [5].

We illustrate our methods with an explanatory simulation example on the topic of “evaluating the safety of Ebola vaccine,” since Ebola is a very rare but fatal tropical disease. All the methods explained here follow internationally accepted standards, supplemented by our compiled experience in conducting SRs. This SR is being conducted by a team of researchers in our research group, prompted by the 2013–2016 Ebola outbreak in Africa, which resulted in significant mortality and morbidity. Since there are many published and ongoing trials assessing the safety of Ebola vaccines, we thought this would provide a great opportunity to tackle this hotly debated issue. Moreover, Ebola has flared up again: a new fatal outbreak has been ongoing in the Democratic Republic of the Congo since August 2018, infecting more than 1,000 people according to the World Health Organization and killing 629 so far. It is thus considered the second-worst Ebola outbreak after the one in West Africa in 2014, which infected more than 26,000 people and killed about 11,300 over its course.

Research question and objectives

Like other study designs, the research question of an SR/MA should be feasible, interesting, novel, ethical, and relevant; therefore, a clear, logical, and well-defined research question should be formulated. Usually, one of two common tools is used: PICO or SPIDER. PICO (Population, Intervention, Comparison, Outcome) is used mostly in quantitative evidence synthesis, and it has been demonstrated that PICO is more sensitive than the more specific SPIDER approach [6]. SPIDER (Sample, Phenomenon of Interest, Design, Evaluation, Research type) was proposed as a search method for qualitative and mixed-methods research.

We recommend a combined approach, using either or both of the SPIDER and PICO tools, to build a comprehensive search, depending on time and resource limitations. Applied to our assumed research topic, which is of a qualitative nature, the SPIDER approach would be the more valid choice.

PICO is usually used for systematic reviews and meta-analyses of clinical trials. For observational studies (without an intervention or comparator), as in many tropical-medicine and epidemiological questions, it is often enough to use P (patient) and O (outcome) alone to formulate a research question. We must indicate the population (P) clearly, then the intervention (I) or exposure. Next, the indicated intervention is compared (C) with other interventions, e.g., placebo. Finally, we clarify the relevant outcomes (O).

To facilitate comprehension, we choose Ebola virus disease (EVD) as an example. Vaccines for EVD are currently being developed and are in phase I, II, and III clinical trials; we want to know whether such a vaccine is safe and can induce sufficient immunogenicity in its recipients.

An example of an SR/MA research question based on PICO for this issue is as follows: What are the safety and immunogenicity of Ebola vaccine in humans? (P: healthy human subjects; I: vaccination; C: placebo; O: safety or adverse effects)

Preliminary research and idea validation

We recommend a preliminary search to identify relevant articles, ensure the validity of the proposed idea, avoid duplicating previously addressed questions, and ensure that enough articles exist for the analysis. Moreover, the theme should focus on relevant and important health care issues, consider global needs and values, reflect the current science, and be consistent with the adopted review methods. Gaining familiarity with, and a deep understanding of, the study field through relevant videos and discussions is of paramount importance for better retrieval of results. If we skip this step, we risk having to cancel the study upon discovering a similar previously published one, thereby wasting time on a problem that has already been tackled.

To do this, we can start with a simple search in PubMed or Google Scholar using the search terms Ebola AND vaccine. In doing so, we identify a systematic review and meta-analysis of determinant factors influencing antibody response to Ebola vaccination in non-human primates and humans [7], a relevant paper to read for deeper insight and for identifying gaps that allow better formulation of our research question or purpose. We can still conduct a systematic review and meta-analysis of the Ebola vaccine because we evaluate a different outcome (safety) and a different population (humans only).

Inclusion and exclusion criteria

Eligibility criteria are based on the PICO approach, study design, and date. Exclusion criteria are mostly unrelated articles, duplicates, unavailable full texts, or abstract-only papers. These exclusions should be stated in advance to protect the researcher from bias. The inclusion criteria would be articles with the target patients and investigated interventions, or comparisons between two studied interventions; briefly, articles containing information that answers our research question. Most importantly, that information should be clear and sufficient, whether positive or negative, to answer the question.

For the topic we have chosen, the inclusion criteria can be: (1) any clinical trial evaluating the safety of Ebola vaccine and (2) no restriction regarding country, patient age, race, gender, publication language, or date. The exclusion criteria are as follows: (1) studies of Ebola vaccine in non-human subjects or in vitro; (2) studies whose data cannot be reliably extracted, or with duplicate or overlapping data; (3) abstract-only papers, such as preceding abstracts, conference papers, editorials, author responses, theses, and books; (4) articles without an available full text; and (5) case reports, case series, and systematic review studies. The PRISMA flow diagram template used in SR/MA studies can be found in Fig. 2.

Figure 2. PRISMA flow diagram of studies’ screening and selection.

Search strategy

A standard search strategy is built in PubMed and then modified for each specific database to get the most relevant results. The basic search strategy is built from the research question formulation (i.e., PICO or PICOS). Search strategies are constructed to include free-text terms (e.g., in the title and abstract) and any appropriate subject indexing (e.g., MeSH) expected to retrieve eligible studies, with the help of an expert in the review topic or an information specialist. Additionally, we advise against including terms for the outcomes, as doing so might prevent the database from retrieving eligible studies in which the outcome is not mentioned explicitly.

The search term is improved through trial searches, looking for further relevant terms within each concept in the retrieved papers. To search for clinical trials, we can use these descriptors in PubMed: “clinical trial”[Publication Type] OR “clinical trials as topic”[MeSH terms] OR “clinical trial”[All Fields]. After some rounds of trial and refinement, we formulate the final PubMed search term as follows: (ebola OR ebola virus OR ebola virus disease OR EVD) AND (vaccine OR vaccination OR vaccinated OR immunization) AND (“clinical trial”[Publication Type] OR “clinical trials as topic”[MeSH Terms] OR “clinical trial”[All Fields]). Because studies on this topic are limited, we do not include outcome terms (safety and immunogenicity) in the search term, so as to capture more studies.
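As an aside, the final PubMed query above can also be run programmatically; the following is a minimal sketch (not part of the authors' workflow), assuming the rentrez package is installed:

```r
library(rentrez)

# The final PubMed search term formulated above
query <- paste0(
  '(ebola OR ebola virus OR ebola virus disease OR EVD) AND ',
  '(vaccine OR vaccination OR vaccinated OR immunization) AND ',
  '("clinical trial"[Publication Type] OR ',
  '"clinical trials as topic"[MeSH Terms] OR "clinical trial"[All Fields])'
)

res <- entrez_search(db = "pubmed", term = query, retmax = 500)
res$count      # total number of matching records
head(res$ids)  # PubMed IDs of retrieved records, ready for export
```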

Search databases, import all results to a library, and export to an Excel sheet

According to the AMSTAR guidelines, at least two databases must be searched in an SR/MA [8], but the more databases searched, the higher the yield and the more accurate and comprehensive the results. The ordering of databases depends mostly on the review question; for a study of clinical trials, you will rely mostly on Cochrane, mRCTs, or the International Clinical Trials Registry Platform (ICTRP). Here, we propose 12 databases (PubMed, Scopus, Web of Science, EMBASE, GHL, VHL, Cochrane, Google Scholar, ClinicalTrials.gov, mRCTs, POPLINE, and SIGLE), which together cover almost all published articles in tropical medicine and other health-related fields. Among these, POPLINE focuses on reproductive health, so researchers should choose databases relevant to the research topic. Some databases do not support Boolean operators or quotation marks, and others have their own particular search syntax; therefore, the initial search terms must be modified for each database to get useful results. Manipulation guides for each online database search are presented in Additional file 5: Table S2, and the detailed search strategy for each database in Additional file 5: Table S3. The search term created in PubMed needs customization to the specific characteristics of each database. An example of a Google Scholar advanced search for our topic is as follows:

With all of the words: ebola virus

With at least one of the words: vaccine vaccination vaccinated immunization

Where my words occur: in the title of the article

With all of the words: EVD

Finally, all records are collected into one EndNote library to delete duplicates, and then exported to an Excel sheet. Using the duplicate-removal function with two settings is mandatory: all references that have (1) the same title and author and were published in the same year, or (2) the same title and author and were published in the same journal, are deleted. The references remaining after this step are exported to an Excel file with the essential information for screening: the authors’ names, publication year, journal, DOI, URL link, and abstract.
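The two-pass duplicate removal just described can also be reproduced outside EndNote; here is a minimal sketch, assuming the exported sheet was saved as a CSV with (hypothetical) columns title, authors, year, and journal:

```r
recs <- read.csv("search_results.csv", stringsAsFactors = FALSE)

# Pass 1: same title, author, and publication year
key_year    <- tolower(paste(recs$title, recs$authors, recs$year))
# Pass 2: same title, author, and journal
key_journal <- tolower(paste(recs$title, recs$authors, recs$journal))

recs_unique <- recs[!(duplicated(key_year) | duplicated(key_journal)), ]
nrow(recs) - nrow(recs_unique)  # number of duplicate records removed
```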

Protocol writing and registration

Protocol registration at an early stage guarantees transparency in the research process and protects against duplication. The protocol is also documented proof of the team’s plan of action, research question, eligibility criteria, intervention/exposure, quality assessment, and pre-analysis plan. We recommend that researchers send it to the principal investigator (PI) for revision and then upload it to a registry site. Many registry sites are available for SR/MAs, such as those of the Cochrane and Campbell collaborations; however, we recommend registering the protocol in PROSPERO, as it is easier. The layout of a protocol template according to PROSPERO can be found in Additional file 5: File S1.

Title and abstract screening

Decisions to select retrieved articles for further assessment are based on the eligibility criteria, to minimize the chance of including irrelevant articles. According to the Cochrane guidance, two reviewers must perform this step, but for beginners and junior researchers this might be tiresome; thus, based on our experience, we propose that at least three reviewers work independently to reduce the chance of error, particularly in teams with a large number of authors, to add scrutiny and ensure proper conduct. Quality is usually better with three reviewers than with two: two reviewers may simply disagree and be unable to decide, whereas a third opinion breaks the tie. Here are some examples of systematic reviews conducted following the same strategy (by different groups of researchers in our research group) and published successfully, featuring ideas relevant to tropical medicine and disease [9, 10, 11].

In this step, duplicates are removed manually whenever the reviewers identify them. When there is doubt about an article, the team should be inclusive rather than exclusive until the team leader or PI makes a decision after discussion and consensus. All excluded records should be given exclusion reasons.

Full text downloading and screening

Many search engines provide free links to full-text articles. When a full text cannot be found, we can search research websites such as ResearchGate, which offers the option of requesting the full text directly from the authors, explore the archives of the relevant journals, or contact the PI to purchase it if available. Similarly, 2–3 reviewers work independently to decide on the inclusion of full texts according to the eligibility criteria, reporting the reasons for any exclusion. Any disagreement is resolved by discussion, which produces the final decision.

Manual search

One has to exhaust all possibilities to reduce bias by performing explicit hand-searching to retrieve reports that may have been dropped by the initial search [12]. We apply five methods of manual searching: searching the reference lists of included studies/reviews, contacting authors, contacting experts, and following related articles and cited articles in PubMed and Google Scholar.

We describe here three consecutive methods to increase and refine the yield of manual searching: first, searching the reference lists of included articles; second, citation tracking, in which the reviewers track all the articles that cite each included article, which may involve electronic database searching; and third, similarly, following all “related to” or “similar” articles. Each of these methods can be performed by 2–3 independent reviewers, and every possibly relevant article must undergo the same scrutiny against the inclusion criteria as the records yielded from the electronic databases, i.e., title/abstract and full-text screening.

We propose independent reviewing by assigning each team member a “tag” and a distinct method, compiling all results at the end to compare differences and discuss them, thereby maximizing retrieval and minimizing bias. The number of articles included through manual searching should be stated before they are added to the overall included records.

Data extraction and quality assessment

This step entails collecting data from the included full texts in a structured extraction Excel sheet that has been pilot-tested on a few random studies. We recommend extracting both adjusted and non-adjusted data, because doing so preserves the most fully adjusted confounding information for later pooling in the analysis [13]. Extraction should be executed by 2–3 independent reviewers. The sheet is typically organized into study and patient characteristics, outcomes, and the quality assessment (QA) tool.

Data presented in graphs should be extracted with software tools such as WebPlotDigitizer [14]. Most of the equations that can be used during extraction, prior to analysis, to estimate the standard deviation (SD) from other variables are provided in Additional file 5: File S2, with references to Hozo et al. [15], Wan et al. [16], and Van Rijkom et al. [17]. A variety of QA tools are available, depending on the design: the Cochrane ROB-2 tool for randomized controlled trials [18], presented in Additional file 1: Figure S1 and Additional file 2: Figure S2 (from previously published data [19]); the NIH tool for observational and cross-sectional studies [20]; the ROBINS-I tool for non-randomized trials [21]; the QUADAS-2 tool for diagnostic studies; the QUIPS tool for prognostic studies; the CARE tool for case reports; and ToxRtool for in vivo and in vitro studies. We recommend that 2–3 reviewers independently assess study quality and add the result to the data extraction form before inclusion in the analysis, to reduce the risk of bias. With the NIH tool for observational (cohort and cross-sectional) studies, as in this Ebola case, reviewers rate each of 14 items dichotomously: yes, no, or not applicable. An overall score is calculated by summing the items, with yes scoring one and no or NA scoring zero. Each paper is then classified by its score as a poorly (0–5), fairly (6–9), or well (10–14) conducted study.
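The NIH-tool scoring rule just described is easy to automate; the helper below is hypothetical (not from the paper) but encodes the same thresholds:

```r
# items: character vector of 14 ratings ("yes", "no", or "NA")
# score: 1 point per "yes"; 0-5 = poor, 6-9 = fair, 10-14 = good
nih_rating <- function(items) {
  stopifnot(length(items) == 14)
  score <- sum(items == "yes")
  rating <- cut(score, breaks = c(-1, 5, 9, 14),
                labels = c("poor", "fair", "good"))
  list(score = score, rating = as.character(rating))
}

nih_rating(c(rep("yes", 8), rep("no", 4), rep("NA", 2)))  # score 8 -> "fair"
```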

In the Ebola case example above, the authors can extract the following information: authors’ names, country of patients, year of publication, study design (case report, cohort study, clinical trial, or RCT), sample size, time point after Ebola infection, follow-up interval after vaccination, efficacy, safety, adverse effects after vaccination, and the QA sheet (Additional file 6: Data S1).

Data checking

Because human error and bias are expected, we recommend a data-checking step in which every included article is compared against its counterpart in the extraction sheet using evidence photos, to detect mistakes in the data. We advise assigning articles to 2–3 independent reviewers, ideally not the ones who extracted them. When resources are limited, each reviewer is assigned an article different from the one he or she extracted in the previous stage.

Statistical analysis

Investigators use different methods for combining and summarizing the findings of included studies. Before analysis there is an important data-cleaning step, in which the analyst organizes the extraction sheet into a form readable by the analytical software. The analysis is of two types, qualitative and quantitative. Qualitative analysis mostly describes the data of SR studies, while quantitative analysis consists of two main types: MA and network meta-analysis (NMA). Subgroup, sensitivity, and cumulative analyses and meta-regression are appropriate for testing whether the results are consistent, investigating the effect of certain confounders on the outcome, and finding the best predictors. Publication bias should be assessed to investigate the presence of missing studies, which can affect the summary.

To illustrate a basic meta-analysis, we provide imaginary data for the research question on Ebola vaccine safety (in terms of adverse events 14 days after injection) and immunogenicity (rise in Ebola virus antibody geometric mean titer 6 months after injection). Assume that, after searching and data extraction, we decided to evaluate the safety and immunogenicity of Ebola vaccine “A.” Other Ebola vaccines were not meta-analyzed because of the limited number of studies (they are instead included in the narrative review). The imaginary data for the vaccine safety meta-analysis can be accessed in Additional file 7: Data S2. For the meta-analysis we can use free software such as RevMan [22] or the R package meta [23]; in this example, we use the R package meta. Its tutorial is available in the “General Package for Meta-Analysis” PDF [23], and the R code and guidance for the meta-analysis performed here can be found in Additional file 5: File S3.

For the analysis, we assume that the studies are heterogeneous in nature; therefore, we choose a random-effects model. We analyzed the safety of Ebola vaccine A. From the data table, we can see some adverse events occurring after intramuscular injection of vaccine A. Suppose six studies fulfill our inclusion criteria; we can then run a random-effects meta-analysis in the R meta package for each adverse event extracted from the studies, for example, arthralgia.
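A minimal sketch of such an analysis with the meta package is shown below; the event counts are invented for illustration only (the paper's simulated data are in Additional file 7: Data S2):

```r
library(meta)

dat <- data.frame(
  study   = c("A", "B", "C", "D", "E", "F"),
  ev.vacc = c(12, 8, 15, 6, 20, 9),  n.vacc = c(100, 80, 120, 60, 150, 90),
  ev.plac = c(11, 9, 14, 5, 19, 8),  n.plac = c(100, 80, 120, 60, 150, 90)
)

# Pool the odds ratios of arthralgia across the six studies
m <- metabin(ev.vacc, n.vacc, ev.plac, n.plac,
             studlab = study, data = dat, sm = "OR")
summary(m)  # pooled OR under fixed- and random-effects models
forest(m)   # draws a forest plot like Fig. 3
```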

From the results shown in Additional file 3: Figure S3, the odds ratio (OR) for arthralgia is 1.06 (95% CI 0.79–1.42), p = 0.71, which means there is no association between the intramuscular injection of Ebola vaccine A and arthralgia: the OR is close to one, and the p value is not significant (> 0.05).

In a meta-analysis, we can also visualize the results in a forest plot; an example from the simulated analysis is shown in Fig. 3.

Figure 3. Random-effects model forest plot for the comparison of vaccine A versus placebo.

The forest plot shows the six studies (A to F) and their respective ORs (95% CIs). The green box represents the effect size (here, the OR) of each study; the bigger the box, the more the study is weighted (i.e., the bigger its sample size). The blue diamond represents the pooled OR of the six studies. The diamond crosses the vertical line OR = 1, indicating that the association is not significant, which is confirmed by the 95% confidence interval including one and the p value > 0.05.

For heterogeneity, I² = 0%, meaning no heterogeneity is detected and the studies are relatively homogeneous (rare in real studies). To evaluate publication bias for the meta-analysis of arthralgia, we can use the metabias function from the R meta package (Additional file 4: Figure S4) together with a funnel plot. The results are shown in Fig. 4: the p value associated with this test is 0.74, indicating symmetry of the funnel plot, which we can confirm visually.

Figure 4. Publication bias funnel plot for the comparison of vaccine A versus placebo.

Looking at the funnel plot, the number of studies on the left and right sides of the plot is the same; therefore, the plot is symmetric, indicating that no publication bias is detected.
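Continuing with the hypothetical metabin object m from the sketch above, the asymmetry test and funnel plot can be produced as follows (metabias requires at least ten studies by default, so k.min is relaxed here for the six-study example):

```r
metabias(m, method.bias = "linreg", k.min = 6)  # Egger-type linear regression test
funnel(m)                                       # funnel plot as in Fig. 4
```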

Sensitivity analysis is a procedure used to discover how removing one study at a time from the MA influences the significance of the result. If all included studies have p values < 0.05, removing any single study will not change the significant association. Sensitivity analysis is only performed when there is a significant association; since the p value of our MA is 0.71 (greater than 0.05), it is not needed for this case example. Conversely, when significance rests on only a couple of studies, removing either of them can result in a loss of significance.
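One common way to implement this leave-one-out procedure is metainf from the same meta package (again using the hypothetical object m):

```r
metainf(m, pooled = "random")  # re-pools the OR k times, omitting one study at a time
```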

Double data checking

For more assurance of the quality of results, the analyzed data should be rechecked against the full-text data using evidence photos, allowing a clear check by the PI of the study.

Manuscript writing, revision, and submission to a journal

The manuscript is written in four scientific sections: introduction, methods, results, and discussion, usually with a conclusion. Preparing a table of study and patient characteristics is a mandatory step; a template can be found in Additional file 5: Table S4.

After finishing the manuscript, the characteristics table, and the PRISMA flow diagram, the team should send them to the PI for thorough revision and respond to his or her comments, and finally choose a suitable journal for the manuscript, one that fits the field and has a reasonable impact factor. Attention must be paid to the journal’s author guidelines before submitting the manuscript.

Discussion

The role of evidence-based medicine in biomedical research is growing rapidly, and SR/MAs are increasingly common in the medical literature. This paper has sought to provide a comprehensive approach that enables reviewers to produce high-quality SR/MAs. We hope that readers have gained general knowledge of how to conduct an SR/MA and the confidence to perform one, although this kind of study requires complex steps compared with narrative reviews.

Beyond the basic steps of conducting an MA, there are advanced steps applied for specific purposes. One is meta-regression, which is performed to investigate the association between any confounder and the results of the MA. Furthermore, there are other types beyond the standard MA, such as NMA and mega MA. In NMA, we investigate differences between several comparisons when there are not enough data for a standard meta-analysis; it uses both direct and indirect comparisons to conclude which competitor is best. Mega MA, or MA of individual patients, summarizes the results of independent studies using their individual subject data. Because a more detailed analysis is possible, it is useful for repeated-measures and time-to-event analyses, and it can support analysis of variance and multiple regression; however, it requires a homogeneous dataset and is time-consuming to conduct [24].

Conclusions

Systematic review/meta-analysis steps include development of the research question and its validation, forming criteria, building the search strategy, searching databases, importing all results into a library and exporting to an Excel sheet, protocol writing and registration, title and abstract screening, full-text screening, manual searching, data extraction and quality assessment, data checking, statistical analysis, double data checking, and manuscript writing, revision, and submission to a journal.

Availability of data and materials

Not applicable.

Abbreviations

NMA: Network meta-analysis

PI: Principal investigator

PICO: Population, Intervention, Comparison, Outcome

PRISMA: Preferred Reporting Items for Systematic Review and Meta-analysis statement

QA: Quality assessment

SPIDER: Sample, Phenomenon of Interest, Design, Evaluation, Research type

SR/MA: Systematic review and meta-analysis

References

Bello A, Wiebe N, Garg A, Tonelli M. Evidence-based decision-making 2: systematic reviews and meta-analysis. Methods Mol Biol (Clifton, NJ). 2015;1281:397–416.


Khan KS, Kunz R, Kleijnen J, Antes G. Five steps to conducting a systematic review. J R Soc Med. 2003;96(3):118–21.

Rys P, Wladysiuk M, Skrzekowska-Baran I, Malecki MT. Review articles, systematic reviews and meta-analyses: which can be trusted? Polskie Archiwum Medycyny Wewnetrznej. 2009;119(3):148–56.


Higgins JPT, Green S. Cochrane Handbook for Systematic Reviews of Interventions Version 5.1.0 [updated March 2011]. 2011.

Moher D, Liberati A, Tetzlaff J, Altman DG. Preferred reporting items for systematic reviews and meta-analyses: the PRISMA statement. BMJ. 2009;339:b2535.

Methley AM, Campbell S, Chew-Graham C, McNally R, Cheraghi-Sohi S. PICO, PICOS and SPIDER: a comparison study of specificity and sensitivity in three search tools for qualitative systematic reviews. BMC Health Serv Res. 2014;14:579.

Gross L, Lhomme E, Pasin C, Richert L, Thiebaut R. Ebola vaccine development: systematic review of pre-clinical and clinical studies, and meta-analysis of determinants of antibody response variability after vaccination. Int J Infect Dis. 2018;74:83–96.


Shea BJ, Reeves BC, Wells G, Thuku M, Hamel C, Moran J, ... Henry DA. AMSTAR 2: a critical appraisal tool for systematic reviews that include randomised or non-randomised studies of healthcare interventions, or both. BMJ. 2017;358:j4008.

Giang HTN, Banno K, Minh LHN, Trinh LT, Loc LT, Eltobgy A, et al. Dengue hemophagocytic syndrome: a systematic review and meta-analysis on epidemiology, clinical signs, outcomes, and risk factors. Rev Med Virol. 2018;28(6):e2005.

Morra ME, Altibi AMA, Iqtadar S, Minh LHN, Elawady SS, Hallab A, et al. Definitions for warning signs and signs of severe dengue according to the WHO 2009 classification: systematic review of literature. Rev Med Virol. 2018;28(4):e1979.

Morra ME, Van Thanh L, Kamel MG, Ghazy AA, Altibi AMA, Dat LM, et al. Clinical outcomes of current medical approaches for Middle East respiratory syndrome: a systematic review and meta-analysis. Rev Med Virol. 2018;28(3):e1977.

Vassar M, Atakpo P, Kash MJ. Manual search approaches used by systematic reviewers in dermatology. Journal of the Medical Library Association: JMLA. 2016;104(4):302.

Naunheim MR, Remenschneider AK, Scangas GA, Bunting GW, Deschler DG. The effect of initial tracheoesophageal voice prosthesis size on postoperative complications and voice outcomes. Ann Otol Rhinol Laryngol. 2016;125(6):478–84.

Rohatgi A. WebPlotDigitizer. 2014.

Hozo SP, Djulbegovic B, Hozo I. Estimating the mean and variance from the median, range, and the size of a sample. BMC Med Res Methodol. 2005;5(1):13.

Wan X, Wang W, Liu J, Tong T. Estimating the sample mean and standard deviation from the sample size, median, range and/or interquartile range. BMC Med Res Methodol. 2014;14(1):135.

Van Rijkom HM, Truin GJ, Van’t Hof MA. A meta-analysis of clinical studies on the caries-inhibiting effect of fluoride gel treatment. Carries Res. 1998;32(2):83–92.

Higgins JP, Altman DG, Gotzsche PC, Juni P, Moher D, Oxman AD, et al. The Cochrane Collaboration's tool for assessing risk of bias in randomised trials. BMJ. 2011;343:d5928.

Tawfik GM, Tieu TM, Ghozy S, Makram OM, Samuel P, Abdelaal A, et al. Speech efficacy, safety and factors affecting lifetime of voice prostheses in patients with laryngeal cancer: a systematic review and network meta-analysis of randomized controlled trials. J Clin Oncol. 2018;36(15_suppl):e18031.

Wannemuehler TJ, Lobo BC, Johnson JD, Deig CR, Ting JY, Gregory RL. Vibratory stimulus reduces in vitro biofilm formation on tracheoesophageal voice prostheses. Laryngoscope. 2016;126(12):2752–7.

Sterne JAC, Hernán MA, Reeves BC, Savović J, Berkman ND, Viswanathan M, et al. ROBINS-I: a tool for assessing risk of bias in non-randomised studies of interventions. BMJ. 2016;355.

The Cochrane Collaboration. Review Manager (RevMan). Version 5.0. Copenhagen: The Nordic Cochrane Centre, The Cochrane Collaboration; 2008.

Schwarzer G. meta: an R package for meta-analysis. R News. 2007;7(3):40–45.


Simms LLH. Meta-analysis versus mega-analysis: is there a difference? Oral budesonide for the maintenance of remission in Crohn’s disease: Faculty of Graduate Studies, University of Western Ontario; 1998.


Acknowledgements

This study was conducted (in part) at the Joint Usage/Research Center on Tropical Disease, Institute of Tropical Medicine, Nagasaki University, Japan.

Author information

Authors and Affiliations

Faculty of Medicine, Ain Shams University, Cairo, Egypt

Gehad Mohamed Tawfik

Online research Club http://www.onlineresearchclub.org/

Gehad Mohamed Tawfik, Kadek Agus Surya Dila, Muawia Yousif Fadlelmola Mohamed, Dao Ngoc Hien Tam, Nguyen Dang Kien & Ali Mahmoud Ahmed

Pratama Giri Emas Hospital, Singaraja-Amlapura street, Giri Emas village, Sawan subdistrict, Singaraja City, Buleleng, Bali, 81171, Indonesia

Kadek Agus Surya Dila

Faculty of Medicine, University of Khartoum, Khartoum, Sudan

Muawia Yousif Fadlelmola Mohamed

Nanogen Pharmaceutical Biotechnology Joint Stock Company, Ho Chi Minh City, Vietnam

Dao Ngoc Hien Tam

Department of Obstetrics and Gynecology, Thai Binh University of Medicine and Pharmacy, Thai Binh, Vietnam

Nguyen Dang Kien

Faculty of Medicine, Al-Azhar University, Cairo, Egypt

Ali Mahmoud Ahmed

Evidence Based Medicine Research Group & Faculty of Applied Sciences, Ton Duc Thang University, Ho Chi Minh City, 70000, Vietnam

Nguyen Tien Huy

Faculty of Applied Sciences, Ton Duc Thang University, Ho Chi Minh City, 70000, Vietnam

Department of Clinical Product Development, Institute of Tropical Medicine (NEKKEN), Leading Graduate School Program, and Graduate School of Biomedical Sciences, Nagasaki University, 1-12-4 Sakamoto, Nagasaki, 852-8523, Japan


Contributions

NTH and GMT were responsible for the idea and its design. The figure was done by GMT. All authors contributed to the manuscript writing and approval of the final version.

Corresponding author

Correspondence to Nguyen Tien Huy.

Ethics declarations

Competing interests

The authors declare that they have no competing interests.


Additional files

Additional file 1:

Figure S1. Risk of bias assessment graph of included randomized controlled trials. (TIF 20 kb)

Additional file 2:

Figure S2. Risk of bias assessment summary. (TIF 69 kb)

Additional file 3:

Figure S3. Arthralgia results of random effect meta-analysis using R meta package. (TIF 20 kb)

Additional file 4:

Figure S4. Arthralgia linear regression test of funnel plot asymmetry using R meta package. (TIF 13 kb)

Additional file 5:

Table S1. PRISMA 2009 Checklist. Table S2. Manipulation guides for online database searches. Table S3. Detailed search strategy for twelve database searches. Table S4. Baseline characteristics of the patients in the included studies. File S1. PROSPERO protocol template file. File S2. Extraction equations that can be used prior to analysis to get missed variables. File S3. R codes and its guidance for meta-analysis done for comparison between EBOLA vaccine A and placebo. (DOCX 49 kb)

Additional file 6:

Data S1. Extraction and quality assessment data sheets for EBOLA case example. (XLSX 1368 kb)

Additional file 7:

Data S2. Imaginary data for EBOLA case example. (XLSX 10 kb)


About this article

Cite this article

Tawfik, G.M., Dila, K.A.S., Mohamed, M.Y.F. et al. A step by step guide for conducting a systematic review and meta-analysis with simulation data. Trop Med Health 47, 46 (2019). https://doi.org/10.1186/s41182-019-0165-6


Received: 30 January 2019

Accepted: 24 May 2019

Published: 01 August 2019

DOI: https://doi.org/10.1186/s41182-019-0165-6


Introduction to Quantitative Meta-Analysis

Length: Four Days
Instructor: Tasha Beretvas
Software Demonstrations: R
Lifetime Access: No expirations

Student: $792 Professional: $1032

Introduction to Quantitative Meta-Analysis is a four-day workshop focused on the statistical techniques used to conduct quantitative meta-analyses. Quantitative meta-analysis allows synthesis of results from primary studies investigating relations among common variables. The procedure entails first capturing effect sizes that numerically describe the relationship among relevant variables in each primary study. Primary study characteristics can then be investigated as sources of variability in effect size estimates through the use of meta-analytic moderator analyses.

In this workshop, we will learn how to calculate the most common types of effect sizes (the standardized mean difference, correlation coefficient and log-odds ratio) given the different kinds of descriptive and inferential statistics that are reported. We will also learn how to average the effect size estimates across primary studies and how to conduct moderator analyses. Meta-analytic data are complicated and we will cover how best to handle some of the methodological complexities that are encountered. We will also learn how to assess and correct for potential publication bias.
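To give a flavor of what the demonstrations cover, here is a minimal sketch of that workflow using the metafor package, whose escalc and rma functions appear in the chapter outline below; all summary statistics and the dose moderator are invented for illustration:

```r
library(metafor)

dat <- data.frame(
  m1i = c(5.2, 4.8, 6.1), sd1i = c(1.1, 0.9, 1.3), n1i = c(40, 55, 30),  # treatment
  m2i = c(4.6, 4.5, 5.0), sd2i = c(1.0, 1.0, 1.2), n2i = c(40, 50, 30),  # control
  dose = c(10, 20, 30)                                                   # moderator
)

# Standardized mean difference (Hedges' g) and its sampling variance
dat <- escalc(measure = "SMD", m1i = m1i, sd1i = sd1i, n1i = n1i,
              m2i = m2i, sd2i = sd2i, n2i = n2i, data = dat)

rma(yi, vi, data = dat)                 # random-effects pooled SMD
rma(yi, vi, mods = ~ dose, data = dat)  # mixed-effects meta-regression on dose
```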

Tasha Beretvas, Ph.D.

Tasha Beretvas is the senior vice provost of faculty affairs and the John L. and Elizabeth G. Hill Centennial professor of quantitative methods in the Educational Psychology department at the University of Texas at Austin. Tasha’s research focuses on the application and evaluation of statistical models in social, behavioral and health sciences research.


The goal of the workshop is to cover the core skills needed to conduct a quantitative meta-analysis. While a good portion of the content has to include statistical formulas and models, the teaching will include demonstrations, explanations and interpretations to connect real data and methodological dilemmas with concepts, formulas and models. Because it is impossible to cover every methodological dilemma and every data or research-question challenge, the understanding and skills learned in this class are intended to generalize to new scenarios that learners encounter in their future applied meta-analyses.

This workshop is designed for graduate students, post-doctoral fellows, faculty, and research scientists from the behavioral, social, and health sciences. It is recommended that participants have a working knowledge of the general multiple regression model. Relevant core statistical concepts will be reviewed at the beginning of the workshop. Participants who would benefit from a more in-depth refresher on linear regression prior to attending may wish to watch our (no cost) Linear Regression Playlist on YouTube.

Chapter 1. Introduction and Review
1.1 Introduction to Quantitative Meta-Analysis
1.2 Core Statistics Review

Chapter 2. Calculating the Standardized Mean Difference (the core formulas are sketched after this outline)
2.1 Calculating the Standardized Mean Difference and its Variance
2.2 Transforming Inferential Statistics to Obtain the SMD
2.3 Calculating the SMD from repeated measures design data

Chapter 3. Pooling Standardized Mean Differences
3.1 Fixed-Effects Pooling of Effect Size Estimates
3.2 Random-Effects Pooling of Effect Size Estimates
3.3 Demonstration of Pooling SMDs by hand and with escalc

Chapter 4. Meta-Regression Models
4.1 Regression and Meta-Regression
4.2 Using rma to estimate fixed- & mixed-effects meta-regression models
4.3 Meta-Regression Models: An Example

Chapter 5. Handling Within-Study Dependence in SMDs
5.1 Dependent Effects Introduction
5.2 Using GLS to handle within-study dependence & moderation
5.3 Robust Variance Estimation of Meta-Regression Models

Chapter 6. Missing Data and Publication Bias
6.1 Assessing Publication Bias
6.2 Correcting for Publication Bias

Chapter 7. Meta-Analysis of Correlations
7.1 Pooling Correlation Estimates
7.2 Handling Within-Study Dependence in Correlation Estimates
7.3 Testing publication bias in meta-analysis of Correlations

Chapter 8. Meta-Analysis of Treatment Effects on Dichotomous Outcomes
8.1 Categorical Effect Size Measures
8.2 Meta-Regression of Categorical Effect Sizes
8.3 Final Hurrah of Equations and Stuff
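For orientation, the core Chapter 2 quantities can be written out as follows (a standard textbook summary, not the workshop's exact notation; \(\bar{X}\), \(s\), and \(n\) denote group means, standard deviations, and sample sizes):

```latex
d = \frac{\bar{X}_1 - \bar{X}_2}{s_p},
\qquad s_p = \sqrt{\frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}},
\qquad \widehat{\mathrm{Var}}(d) \approx \frac{n_1 + n_2}{n_1 n_2} + \frac{d^2}{2(n_1 + n_2)}
```

Hedges' small-sample correction multiplies d by J = 1 - 3/(4(n1 + n2) - 9) to give the bias-corrected g, which is what escalc(measure = "SMD") returns.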

Demonstrations of analyses are presented using the R software program because it is the software most rapidly and frequently updated with the latest methodological innovations in meta-analysis. Where possible, supplemental materials demonstrating the use of SPSS statistical software are provided. Note that R can be downloaded for free. While it is helpful to have some familiarity with R, it is not necessary: the lectures, which constitute the majority of the workshop, are software-independent. Note that while code will be shared for all analyses demonstrated in the workshop, the majority of the pedagogy will focus on the concepts and content rather than the use of software.

Introduction to Quantitative Meta-Analysis is a four-day workshop originally taught live via Zoom by Tasha Beretvas. Daily lectures were held from 9:00 a.m. to 5:00 p.m. with morning, lunch, and afternoon breaks. Sessions consisted of comprehensive lectures, detailed presentation of real-data examples with live demonstrations in R, responses to participant questions, and general discussion.

Self-paced participants receive lifetime access to all course materials, including complete course notes, lecture recordings, software demonstration notes, and data and code for all examples. You can revisit these materials any time you like, without worrying about expiration dates. Pretty awesome, huh?

Full recordings of all lectures and software demonstrations are provided. You can log in to your account to access these recordings (select My Workshops, then click on the corresponding workshop tile).

Please see the sample videos and materials we have posted for a subset of classes. Each class provides unique content, but the format and style are similar across classes.

We offer reduced-price registrations for undergraduate and graduate students who are actively enrolled in a recognized bachelor's, master's or doctoral training program. No application is necessary to qualify for the student tuition rates; simply choose the student rate when beginning the registration process at the top of the page. Confirmation of student status may be requested at a later time.

Dr. Beretvas is the epitome of teaching excellence – she is extremely knowledgeable in her area, is able to explain complex concepts to learners with a variety of backgrounds, and clearly cares about teaching.

I was very, very impressed with Dr. Beretvas' dedication to teaching us. This class really covered an enormous amount, and she provided incredibly detailed and accessible material with the online format.

I cannot say enough good things about Tasha's teaching and this course. Everything was well organized and clearly presented.

Tasha goes to great lengths to make information accessible to both quantitative-methods and non-quantitative-methods students. Her notes are detailed and organized and her preparation for each class is extensive. It is a joy to learn from someone as enthusiastic and skilled as Tasha!

Tasha was a super fun and entertaining instructor. I loved the content and the way she broke down concepts in ways that were easy to understand.

Dr. Beretvas continues to be one of the best professors I have ever had. She was always very engaging during her lectures.

Dr. Beretvas was very knowledgeable about course content. She was very thorough and made learning difficult concepts manageable and fun. 



Quantitative Research Methods

Meta-Analysis

A meta-analysis uses statistical methods to synthesize the results of multiple studies, often by calculating a weighted average of effect sizes.  Before embarking on a meta-analysis, make sure you are familiar with reviews, in particular systematic reviews.  
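
To make the weighted average concrete, the sketch below performs fixed-effect inverse-variance pooling in base R; the effect sizes and sampling variances are invented for illustration.

  # Fixed-effect inverse-variance pooling by hand (invented data)
  yi <- c(0.35, 0.12, 0.50, 0.28, 0.41)       # effect sizes from five studies
  vi <- c(0.020, 0.015, 0.040, 0.010, 0.030)  # their sampling variances

  w      <- 1 / vi                        # inverse-variance weights
  pooled <- sum(w * yi) / sum(w)          # weighted average effect size
  se     <- sqrt(1 / sum(w))              # standard error of the pooled estimate
  ci     <- pooled + c(-1.96, 1.96) * se  # 95% confidence interval

  round(c(estimate = pooled, lower = ci[1], upper = ci[2]), 3)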

  • Meta-analysis in medical research, a Hippokratia article by A. B. Haidich.
  • Meta-Analysis: Recent Developments in Quantitative Methods for Literature Reviews, an Annual Review of Psychology article by R. Rosenthal and M. R. DiMatteo.
  • Analyzing Data for Meta-analysis, a chapter from the Cochrane Review Handbook.
  • A typology of reviews, a Health Information & Libraries Journal article by M. J. Grant and A. Booth.

Forest Plots

A forest plot is a type of graph used in meta-analyses that displays the results of multiple studies next to each other.
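
For example, assuming the metafor package and the same kind of invented inputs as above, a forest plot can be drawn directly from a fitted model:

  # Forest plot of five invented effect sizes (requires metafor)
  library(metafor)
  yi <- c(0.35, 0.12, 0.50, 0.28, 0.41)
  vi <- c(0.020, 0.015, 0.040, 0.010, 0.030)

  res <- rma(yi, vi, method = "REML")          # random-effects model
  forest(res, slab = paste("Study", 1:5),      # one row per study, pooled diamond at bottom
         xlab = "Standardized Mean Difference")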




Meta-Analysis for Nonprofit Research: Synthesizing Quantitative Evidence for Knowledge Advancement

  • Research Papers
  • Published: 29 June 2022
  • Volume 34, pages 734–746 (2023)


  • ChiaKo Hung (ORCID: orcid.org/0000-0001-6598-6024)
  • Jiahuan Lu


The past two decades have witnessed massive growth in the amount of quantitative research in nonprofit studies. Despite the large number of studies, their findings have not always been consistent and cumulative. The diverse and competing findings constitute a barrier to offering clear, coherent knowledge for both research and practice. To further advance nonprofit studies, some have called for meta-analysis to synthesize inconsistent findings. Although meta-analysis has been increasingly used in nonprofit studies in the past decade, many researchers are still not familiar with the method. This article therefore introduces meta-analysis to nonprofit scholars and, through an example demonstration, provides general guidelines for nonprofit scholars with a background in statistical methods to conduct meta-analyses, with a focus on the various judgment calls made throughout the research process. This article could help nonprofit scholars who are interested in using meta-analysis to address unsolved research questions in the nonprofit literature.



Notes

In this manuscript, we use nonprofit studies to refer to the studies on voluntary actions, nonprofit organizations, and civil society.

We also searched some other related journals in nonprofit studies including International Review on Public and Nonprofit Marketing , Journal of Nonprofit & Public Sector Marketing , Journal of Philanthropy and Marketing , Journal of Nonprofit Education and Leadership , Nonprofit Policy Forum , and only found one meta-analysis (Xu & Huang, 2020 ).

This article introduces the basic steps and important judgment calls for nonprofit scholars who are interested in conducting meta-analysis. The discussion in this section is not exhaustive. Interested readers should refer to meta-analysis textbooks listed in the references.

Ringquist (2013, pp. 121-124) provides detailed information about choosing appropriate statistics to estimate effect sizes.

Before analyzing effect sizes, researchers need to choose between a fixed-effects and a random-effects framework. Two approaches can inform this decision: the Q test and the I² statistic. Researchers can use the Q test to identify excess variance in a sample of effect sizes, and the I² statistic to assess the magnitude of the variability in effect sizes that is not attributable to sampling error. Ringquist (2013, pp. 121-124) offers detailed information on conducting the Q test and calculating the I² statistic, along with criteria for choosing between the two frameworks. Generally, random-effects models are more widely used in social science, since it is less reasonable to assume that all studies share one common effect.
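
To illustrate (a sketch with invented data): when a model is fit with metafor's rma function, the Q statistic and I² are available from the fitted object.

  # Heterogeneity diagnostics from a fitted rma model (invented data)
  library(metafor)
  yi <- c(0.35, 0.12, 0.50, 0.28, 0.41)
  vi <- c(0.020, 0.015, 0.040, 0.010, 0.030)

  res <- rma(yi, vi, method = "REML")
  res$QE    # Cochran's Q: tests for variability beyond sampling error
  res$QEp   # p-value of the Q test
  res$I2    # I^2: share of total variability not attributable to sampling error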

The inverse variance weight is a function of sample size: effect sizes from studies with larger samples receive more weight.
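
For instance, the weights applied by a fixed-effects fit can be inspected directly (a sketch with invented data; metafor's weights() function returns them as percentages):

  # Inspecting inverse-variance weights (invented data; requires metafor)
  library(metafor)
  yi <- c(0.35, 0.12, 0.50, 0.28, 0.41)
  vi <- c(0.020, 0.015, 0.040, 0.010, 0.030)  # vi shrinks as sample size grows

  res <- rma(yi, vi, method = "FE")
  weights(res)  # percentage weights; study 4 (smallest vi) receives the most weight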

Some effects may lie an abnormal distance from the other effects; these are outliers. We suggest that researchers either exclude them before combining effect sizes or conduct robustness checks by reporting average effect sizes with and without those outliers.
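
A simple version of such a robustness check is a leave-one-out analysis, sketched below with invented data in which the fifth effect lies far from the rest:

  # Leave-one-out sensitivity analysis (invented data; requires metafor)
  library(metafor)
  yi <- c(0.35, 0.12, 0.50, 0.28, 1.60)   # the fifth effect is an apparent outlier
  vi <- c(0.020, 0.015, 0.040, 0.010, 0.030)

  res <- rma(yi, vi, method = "REML")
  leave1out(res)  # pooled estimate re-computed with each study omitted in turn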

References

Blom, R., Kruyen, P. M., Van der Heijden, B. I., & Van Thiel, S. (2020). One HRM fits all? A meta-analysis of the effects of HRM practices in the public, semipublic, and private sector. Review of Public Personnel Administration, 40(1), 3–35.


Borenstein, M. (2009). Effect sizes for continuous data. In H. Cooper, L. V. Hedges, & J. C. Valentine (Eds.), The handbook of research synthesis and meta-analysis (2nd ed., pp. 221–236). Russell Sage.


Chalmers, I., Hedges, L. V., & Cooper, H. (2002). A brief history of research synthesis. Evaluation & the Health Professions, 25(1), 12–37.

Cheung, M. W. L. (2015). Meta-analysis: A structural equation modeling approach. John Wiley & Sons.

Chapman, C. M., Hornsey, M. J., & Gillespie, N. (2021). To what extent is trust a prerequisite for charitable giving? A systematic review and meta-analysis. Nonprofit and Voluntary Sector Quarterly, 50 , 1274–1303. https://doi.org/10.1177/08997640211003250

Cooper, H. (2017). Research synthesis and meta-analysis: A step-by-step approach . Sage.

Daniel, J. L., & Kim, M. (2018). The scale of mission-embeddedness as a nonprofit revenue classification tool: Different earned revenue types, different performance effects. Administration & Society, 50 (7), 947–972.

De Wit, A., & Bekkers, R. (2017). Government support and charitable donations: A meta-analysis of the crowding-out hypothesis. Journal of Public Administration Research and Theory, 27 (2), 301–319.

Fleiss, J. L., & Berlin, J. A. (2009). Effect sizes for dichotomous data. In H. Cooper, L. V. Hedges, & J. C. Valentine (Eds.), The handbook of research synthesis and meta-analysis (2nd ed., pp. 237–254). Russell Sage.

Geyskens, I., Krishnan, R., Steenkamp, J. B. E., & Cunha, P. V. (2009). A review and evaluation of meta-analysis practices in management research. Journal of Management, 35 (2), 393–419.

Glass, G. V. (1976). Primary, secondary, and meta-analysis of research. Educational Researcher, 5 (10), 3–8.

Gurevitch, J., Koricheva, J., Nakagawa, S., & Stewart, G. (2018). Meta-analysis and the science of research synthesis. Nature, 555 (7695), 175–182.

Hedges, L., & Olkin, I. (1985). Statistical methods for meta-analysis . Academic Press.

Hung, C. (2020). Commercialization and nonprofit donations: A meta-analytic assessment and extension. Nonprofit Management and Leadership . https://doi.org/10.1002/nml.21435

Hung, C., & Hager, M. A. (2019). The impact of revenue diversification on nonprofit financial health: A meta-analysis. Nonprofit and Voluntary Sector Quarterly, 48 (1), 5–27.

Hunt, M. (1997). How science takes stock: The story of meta-analysis . Russell Sage Foundation.

Jackson, S. K., Guerrero, S., & Appe, S. (2014). The state of nonprofit and philanthropic studies doctoral education. Nonprofit and Voluntary Sector Quarterly, 43 (5), 795–811.

Kim, M. (2017). The relationship of nonprofits’ financial health to program outcomes: Empirical evidence from nonprofit arts organizations. Nonprofit and Voluntary Sector Quarterly, 46 (3), 525–548.

Kulik, J. A., & Kulik, C. L. C. (1989). Meta-analysis in education. International Journal of Educational Research, 13 (3), 221–340.

Lipsey, M. W., & Wilson, D. B. (2001). Practical meta-analysis . Sage.

Lu, J. (2016). The philanthropic consequence of government grants to nonprofit organizations: A meta-analysis. Nonprofit Management and Leadership, 26 (4), 381–400.

Lu, J. (2017). Does population heterogeneity really matter to nonprofit sector size? Revisiting Weisbrod’s demand heterogeneity hypothesis. VOLUNTAS: International Journal of Voluntary and Nonprofit Organizations , 1–27.

Lu, J. (2018). Organizational antecedents of nonprofit engagement in policy advocacy: A meta-analytical review. Nonprofit and Voluntary Sector Quarterly, 47 (4_suppl), 177S-203S.

Lu, J., & Xu, C. (2018). Complementary or supplementary? The relationship between government size and nonprofit sector size. VOLUNTAS: International Journal of Voluntary and Nonprofit Organizations, 29 (3), 454–469.

Lu, J., Lin, W., & Wang, Q. (2019). Does a more diversified revenue structure lead to greater financial capacity and less vulnerability in nonprofit organizations? A bibliometric and meta-analysis. VOLUNTAS: International Journal of Voluntary and Nonprofit Organizations, 30 (3), 593–609.

Ma, J., & Konrath, S. (2018). A century of nonprofit studies: Scaling the knowledge of the field. VOLUNTAS: International Journal of Voluntary and Nonprofit Organizations, 29 (6), 1139–1158.

Pearson, K. (1904). Report on certain enteric fever inoculation statistics. BMJ, 3 , 1243–1246.

Pfeffer, J. (1993). Barriers to the advance of organizational science: Paradigm development as a dependent variable. Academy of Management Review, 18 (4), 599–620.

Reed, J. G., & Baxter, P. M. (2009). Using reference databases. In H. Cooper, L. V. Hedges, & J. C. Valentine (Eds.), The handbook of research synthesis and meta-analysis (2nd ed., pp. 73–101). Russell Sage Foundation.

Ringquist, E. (2013). Meta-analysis for public management and policy . John Wiley & Sons.

Rosenthal, R., & DiMatteo, M. (2001). Meta-analysis: Recent developments in quantitative methods for literature reviews. Annual Review of Psychology, 52, 59–82.

Rothstein, H. R., Sutton, A. J., & Borenstein, M. (Eds.). (2006). Publication bias in meta-analysis: Prevention, assessment and adjustments . John Wiley & Sons.

Schmidt, F. L. (1992). What do data really mean? Research findings, meta-analysis, and cumulative knowledge in psychology. American Psychologist, 47 (10), 1173–1181.

Schmidt, F. L., & Hunter, J. E. (2015). Methods of meta-analysis: Correcting error and bias in research findings . Sage.

Shadish, W. R., & Lecy, J. D. (2015). The meta-analytic big bang. Research Synthesis Methods, 6 (3), 246–264.

Shoham, A., Ruvio, A., Vigoda-Gadot, E., & Schwabsky, N. (2006). Market orientations in the nonprofit and voluntary sector: A meta-analysis of their relationships with organizational performance. Nonprofit and Voluntary Sector Quarterly, 35 (3), 453–476.

Smith, M. L., & Glass, G. V. (1977). Meta-analysis of psychotherapy outcome studies. American Psychologist, 32 (9), 752–760.

Stanley, T. D., & Jarrell, S. B. (2005). Meta-regression analysis: A quantitative method of literature surveys. Journal of Economic Surveys, 19 (3), 299–308.

Sutton, A. J. (2009). Publication bias. In H. Cooper, L. V. Hedges, & J. C. Valentine (Eds.), The handbook of research synthesis and meta-analysis (2nd ed., pp. 435–452). Russell Sage Foundation.

Sutton, A. J., Abrams, K. R., Jones, D. R., Jones, D. R., Sheldon, T. A., & Song, F. (2000). Methods for meta-analysis in medical research . Wiley.

Thompson, S. G., & Higgins, J. P. (2002). How should meta-regression analyses be undertaken and interpreted? Statistics in Medicine, 21 (11), 1559–1573.

Tipton, E., Pustejovsky, J. E., & Ahmadi, H. (2019). A history of meta-regression: Technical, conceptual, and practical developments between 1974 and 2018. Research Synthesis Methods, 10 (2), 161–179.

Tranfield, D., Denyer, D., & Smart, P. (2003). Towards a methodology for developing evidence-informed management knowledge by means of systematic review. British Journal of Management, 14 (3), 207–222.

Willems, J., Boenigk, S., & Jegers, M. (2014). Seven trade-offs in measuring nonprofit performance and effectiveness. VOLUNTAS: International Journal of Voluntary and Nonprofit Organizations, 25(6), 1648–1670.

Xu, J., & Huang, G. (2020). The relative effectiveness of gain-framed and loss-framed messages in charity advertising: Meta-analytic evidence and implications. International Journal of Nonprofit and Voluntary Sector Marketing, 25(4), e1675.

Funding

This study was not funded by any organization.

Author information

Authors and Affiliations

ChiaKo Hung: Public Administration Program, University of Hawaiʻi at Mānoa, 2424 Maile Way, Saunders Hall 631, Honolulu, HI 96822, USA

Jiahuan Lu: School of Public Affairs and Administration, Rutgers University-Newark, 111 Washington Street, Newark, NJ 07102, USA

Corresponding author

Correspondence to ChiaKo Hung .

Ethics declarations

Conflict of Interest

The authors declare that they have no conflict of interest.

Human and Animal Rights

This study does not involve human participants or animals.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendix 1. Published Meta-Analyses in Three Leading Nonprofit Studies Journals


About this article

Hung, C., Lu, J. Meta-Analysis for Nonprofit Research: Synthesizing Quantitative Evidence for Knowledge Advancement. Voluntas 34, 734–746 (2023). https://doi.org/10.1007/s11266-022-00505-3


Accepted: 23 May 2022

Published: 29 June 2022

Issue Date: August 2023

DOI: https://doi.org/10.1007/s11266-022-00505-3


Keywords

  • Meta-analysis
  • Research synthesis
  • Nonprofit studies
  • Quantitative research methods
