How to Do a Systematic Review: A Best Practice Guide for Conducting and Reporting Narrative Reviews, Meta-Analyses, and Meta-Syntheses

Affiliations

  • 1 Behavioural Science Centre, Stirling Management School, University of Stirling, Stirling FK9 4LA, United Kingdom; email: [email protected].
  • 2 Department of Psychological and Behavioural Science, London School of Economics and Political Science, London WC2A 2AE, United Kingdom.
  • 3 Department of Statistics, Northwestern University, Evanston, Illinois 60208, USA; email: [email protected].
  • PMID: 30089228
  • DOI: 10.1146/annurev-psych-010418-102803

Systematic reviews are characterized by a methodical and replicable methodology and presentation. They involve a comprehensive search to locate all relevant published and unpublished work on a subject; a systematic integration of search results; and a critique of the extent, nature, and quality of evidence in relation to a particular research question. The best reviews synthesize studies to draw broad theoretical conclusions about what a literature means, linking theory to evidence and evidence to theory. This guide describes how to plan, conduct, organize, and present a systematic review of quantitative (meta-analysis) or qualitative (narrative review, meta-synthesis) information. We outline core standards and principles and describe commonly encountered problems. Although this guide targets psychological scientists, its high level of abstraction makes it potentially relevant to any subject area or discipline. We argue that systematic reviews are a key methodology for clarifying whether and how research findings replicate and for explaining possible inconsistencies, and we call for researchers to conduct systematic reviews to help elucidate whether there is a replication crisis.

Keywords: evidence; guide; meta-analysis; meta-synthesis; narrative; systematic review; theory.




Systematic reviews of the literature: an introduction to current methods


Romina Brignardello-Petersen, Nancy Santesso, Gordon H Guyatt. Systematic reviews of the literature: an introduction to current methods. American Journal of Epidemiology, 2024, kwae232. https://doi.org/10.1093/aje/kwae232


Systematic reviews are a type of evidence synthesis in which authors develop explicit eligibility criteria, collect all the available studies that meet these criteria, and summarize results using reproducible methods that minimize biases and errors. Systematic reviews serve different purposes and use a different methodology from other types of evidence synthesis, which include narrative reviews, scoping reviews, and overviews of reviews. Systematic reviews can address questions regarding effects of interventions or exposures, diagnostic properties of tests, and prevalence or prognosis of diseases. All rigorous systematic reviews have common processes that include: 1) determining the question and eligibility criteria, including a priori specification of subgroup hypotheses; 2) searching for evidence and selecting studies; 3) abstracting data and assessing risk of bias of the included studies; 4) summarizing the data for each outcome of interest, whenever possible using meta-analyses; and 5) assessing the certainty of the evidence and drawing conclusions. There are several tools that can guide and facilitate the systematic review process, but methodological and content expertise are always necessary.



Cochrane Training

Chapter 14: Completing ‘Summary of findings’ tables and grading the certainty of the evidence

Holger J Schünemann, Julian PT Higgins, Gunn E Vist, Paul Glasziou, Elie A Akl, Nicole Skoetz, Gordon H Guyatt; on behalf of the Cochrane GRADEing Methods Group (formerly Applicability and Recommendations Methods Group) and the Cochrane Statistical Methods Group

Key Points:

  • A ‘Summary of findings’ table for a given comparison of interventions provides key information concerning the magnitudes of relative and absolute effects of the interventions examined, the amount of available evidence and the certainty (or quality) of available evidence.
  • ‘Summary of findings’ tables include a row for each important outcome (up to a maximum of seven). Accepted formats of ‘Summary of findings’ tables and interactive ‘Summary of findings’ tables can be produced using GRADE’s software GRADEpro GDT.
  • Cochrane has adopted the GRADE approach (Grading of Recommendations Assessment, Development and Evaluation) for assessing certainty (or quality) of a body of evidence.
  • The GRADE approach specifies four levels of the certainty for a body of evidence for a given outcome: high, moderate, low and very low.
  • GRADE assessments of certainty are determined through consideration of five domains: risk of bias, inconsistency, indirectness, imprecision and publication bias. For evidence from non-randomized studies and rarely randomized studies, assessments can then be upgraded through consideration of three further domains.

Cite this chapter as: Schünemann HJ, Higgins JPT, Vist GE, Glasziou P, Akl EA, Skoetz N, Guyatt GH. Chapter 14: Completing ‘Summary of findings’ tables and grading the certainty of the evidence. In: Higgins JPT, Thomas J, Chandler J, Cumpston M, Li T, Page MJ, Welch VA (editors). Cochrane Handbook for Systematic Reviews of Interventions version 6.4 (updated August 2023). Cochrane, 2023. Available from www.training.cochrane.org/handbook .

14.1 ‘Summary of findings’ tables

14.1.1 Introduction to ‘Summary of findings’ tables

‘Summary of findings’ tables present the main findings of a review in a transparent, structured and simple tabular format. In particular, they provide key information concerning the certainty or quality of evidence (i.e. the confidence or certainty in the range of an effect estimate or an association), the magnitude of effect of the interventions examined, and the sum of available data on the main outcomes. Cochrane Reviews should incorporate ‘Summary of findings’ tables during planning and publication, and should have at least one key ‘Summary of findings’ table representing the most important comparisons. Some reviews may include more than one ‘Summary of findings’ table, for example if the review addresses more than one major comparison, or includes substantially different populations that require separate tables (e.g. because the effects differ or it is important to show results separately). In the Cochrane Database of Systematic Reviews (CDSR),  all ‘Summary of findings’ tables for a review appear at the beginning, before the Background section.

14.1.2 Selecting outcomes for ‘Summary of findings’ tables

Planning for the ‘Summary of findings’ table starts early in the systematic review, with the selection of the outcomes to be included in: (i) the review; and (ii) the ‘Summary of findings’ table. This is a crucial step, and one that review authors need to address carefully.

To ensure production of optimally useful information, Cochrane Reviews begin by developing a review question and by listing all main outcomes that are important to patients and other decision makers (see Chapter 2 and Chapter 3 ). The GRADE approach to assessing the certainty of the evidence (see Section 14.2 ) defines and operationalizes a rating process that helps separate outcomes into those that are critical, important or not important for decision making. Consultation and feedback on the review protocol, including from consumers and other decision makers, can enhance this process.

Critical outcomes are likely to include clearly important endpoints; typical examples include mortality and major morbidity (such as strokes and myocardial infarction). However, they may also represent frequent minor and rare major side effects, symptoms, quality of life, burdens associated with treatment, and resource issues (costs). Burdens represent the impact of healthcare workload on patient function and well-being, and include the demands of adhering to an intervention that patients or caregivers (e.g. family) may dislike, such as having to undergo more frequent tests, or the restrictions on lifestyle that certain interventions require (Spencer-Bonilla et al 2017).

Frequently, when formulating questions that include all patient-important outcomes for decision making, review authors will confront reports of studies that have not included all these outcomes. This is particularly true for adverse outcomes. For instance, randomized trials might contribute evidence on intended effects, and on frequent, relatively minor side effects, but not report on rare adverse outcomes such as suicide attempts. Chapter 19 discusses strategies for addressing adverse effects. To obtain data for all important outcomes it may be necessary to examine the results of non-randomized studies (see Chapter 24 ). Cochrane, in collaboration with others, has developed guidance for review authors to support their decision about when to look for and include non-randomized studies (Schünemann et al 2013).

If a review includes only randomized trials, these trials may not address all important outcomes and it may therefore not be possible to address these outcomes within the constraints of the review. Review authors should acknowledge these limitations and make them transparent to readers. Review authors are encouraged to include non-randomized studies to examine rare or long-term adverse effects that may not adequately be studied in randomized trials. This raises the possibility that harm outcomes may come from studies in which participants differ from those in studies used in the analysis of benefit. Review authors will then need to consider how much such differences are likely to impact on the findings, and this will influence the certainty of evidence because of concerns about indirectness related to the population (see Section 14.2.2 ).

Non-randomized studies can provide important information not only when randomized trials do not report on an outcome or randomized trials suffer from indirectness, but also when the evidence from randomized trials is rated as very low and non-randomized studies provide evidence of higher certainty. Further discussion of these issues appears also in Chapter 24 .

14.1.3 General template for ‘Summary of findings’ tables

Several alternative standard versions of ‘Summary of findings’ tables have been developed to ensure consistency and ease of use across reviews, inclusion of the most important information needed by decision makers, and optimal presentation (see examples at Figures 14.1.a and 14.1.b ). These formats are supported by research that focused on improved understanding of the information they intend to convey (Carrasco-Labra et al 2016, Langendam et al 2016, Santesso et al 2016). They are available through GRADE’s official software package developed to support the GRADE approach: GRADEpro GDT (www.gradepro.org).

Standard Cochrane ‘Summary of findings’ tables include the following elements using one of the accepted formats. Further guidance on each of these is provided in Section 14.1.6 .

  • A brief description of the population and setting addressed by the available evidence (which may be slightly different to or narrower than those defined by the review question).
  • A brief description of the comparison addressed in the ‘Summary of findings’ table, including both the experimental and comparison interventions.
  • A list of the most critical and/or important health outcomes, both desirable and undesirable, limited to seven or fewer outcomes.
  • A measure of the typical burden of each outcome (e.g. illustrative risk, or illustrative mean, on comparator intervention).
  • The absolute and relative magnitude of effect measured for each outcome (if both are appropriate).
  • The numbers of participants and studies contributing to the analysis of each outcome.
  • A GRADE assessment of the overall certainty of the body of evidence for each outcome (which may vary by outcome).
  • Space for comments.
  • Explanations (formerly known as footnotes).

Ideally, ‘Summary of findings’ tables are supported by more detailed tables (known as ‘evidence profiles’) to which the review may be linked, which provide more detailed explanations. Evidence profiles include the same important health outcomes, and provide greater detail than ‘Summary of findings’ tables of both of the individual considerations feeding into the grading of certainty and of the results of the studies (Guyatt et al 2011a). They ensure that a structured approach is used to rating the certainty of evidence. Although they are rarely published in Cochrane Reviews, evidence profiles are often used, for example, by guideline developers in considering the certainty of the evidence to support guideline recommendations. Review authors will find it easier to develop the ‘Summary of findings’ table by completing the rating of the certainty of evidence in the evidence profile first in GRADEpro GDT. They can then automatically convert this to one of the ‘Summary of findings’ formats in GRADEpro GDT, including an interactive ‘Summary of findings’ for publication.

As a measure of the magnitude of effect for dichotomous outcomes, the ‘Summary of findings’ table should provide a relative measure of effect (e.g. risk ratio, odds ratio, hazard ratio) and measures of absolute risk. For other types of data, an absolute measure alone (such as a difference in means for continuous data) might be sufficient. It is important that the magnitude of effect is presented in a meaningful way, which may require some transformation of the result of a meta-analysis (see also Chapter 15, Section 15.4 and Section 15.5). Reviews with more than one main comparison should include a separate ‘Summary of findings’ table for each comparison.

Figure 14.1.a provides an example of a ‘Summary of findings’ table. Figure 14.1.b provides an alternative format that may further facilitate users’ understanding and interpretation of the review’s findings. Evidence evaluating different formats suggests that the ‘Summary of findings’ table should include a risk difference as a measure of the absolute effect, and authors should preferably use a format that includes a risk difference.

A detailed description of the contents of a ‘Summary of findings’ table appears in Section 14.1.6 .

Figure 14.1.a Example of a ‘Summary of findings’ table

Summary of findings: compression stockings compared with no stockings

Patient or population: anyone taking a long flight (lasting more than 6 hours)
Settings: international air travel
Intervention: compression stockings
Comparison: without stockings

Columns in the original table: outcomes; assumed risk and corresponding risk* (95% CI); relative effect (95% CI); number of participants (studies); certainty of the evidence (GRADE); comments.

  • Symptomatic deep vein thrombosis (DVT): relative effect not estimable; 2821 participants (9 studies); 0 participants developed symptomatic DVT in these studies.
  • Symptomless DVT: RR 0.10 (95% CI 0.04 to 0.26); 2637 participants (9 studies); certainty ⊕⊕⊕⊕ (high).
  • Superficial vein thrombosis: relative effect (95% CI 0.18 to 1.13); 1804 participants (8 studies); certainty ⊕⊕⊕◯ (moderate).
  • Oedema (post-flight values measured on a scale from 0, no oedema, to 10, maximum oedema): mean difference between intervention and control groups (95% CI –4.9 to –4.5); 1246 participants (6 studies); certainty ⊕⊕◯◯ (low).
  • Pulmonary embolus: relative effect not estimable; 2821 participants (9 studies); 0 participants developed pulmonary embolus in these studies.
  • Death: relative effect not estimable; 2821 participants (9 studies); 0 participants died in these studies.
  • Adverse effects: relative effect not estimable; 1182 participants (4 studies); the tolerability of the stockings was described as very good, with no complaints of side effects in 4 studies.

*The basis for the assumed risk is provided in footnotes. The corresponding risk (and its 95% confidence interval) is based on the assumed risk in the comparison group and the relative effect of the intervention (and its 95% CI).

CI: confidence interval; RR: risk ratio; GRADE: GRADE Working Group grades of evidence (see explanations).

a All the stockings in the nine studies included in this review were below-knee compression stockings. In four studies the compression strength was 20 mmHg to 30 mmHg at the ankle. It was 10 mmHg to 20 mmHg in the other four studies. Stockings come in different sizes. If a stocking is too tight around the knee it can prevent essential venous return causing the blood to pool around the knee. Compression stockings should be fitted properly. A stocking that is too tight could cut into the skin on a long flight and potentially cause ulceration and increased risk of DVT. Some stockings can be slightly thicker than normal leg covering and can be potentially restrictive with tight foot wear. It is a good idea to wear stockings around the house prior to travel to ensure a good, comfortable fit. Participants put their stockings on two to three hours before the flight in most of the studies. The availability and cost of stockings can vary.

b Two studies recruited high risk participants defined as those with previous episodes of DVT, coagulation disorders, severe obesity, limited mobility due to bone or joint problems, neoplastic disease within the previous two years, large varicose veins or, in one of the studies, participants taller than 190 cm and heavier than 90 kg. The incidence for the seven studies that excluded high risk participants was 1.45% and the incidence for the two studies that recruited high-risk participants (with at least one risk factor) was 2.43%. We have used 10 and 30 per 1000 to express different risk strata, respectively.

c The confidence interval crosses no difference and does not rule out a small increase.

d The measurement of oedema was not validated (indirectness of the outcome) or blinded to the intervention (risk of bias).

e If there are very few or no events and the number of participants is large, judgement about the certainty of evidence (particularly judgements about imprecision) may be based on the absolute effect. Here the certainty rating may be considered ‘high’ if the outcome was appropriately assessed and the event, in fact, did not occur in 2821 studied participants.

f None of the other studies reported adverse effects, apart from four cases of superficial vein thrombosis in varicose veins in the knee region that were compressed by the upper edge of the stocking in one study.

Figure 14.1.b Example of alternative ‘Summary of findings’ table

Patient or population: children given antibiotics
Settings: inpatients and outpatients
Intervention: probiotics
Comparison: no probiotics

  • Incidence of diarrhoea, children < 5 years (follow-up: 10 days to 3 months): RR 0.41 (95% CI 0.29 to 0.55); 1474 participants (7 studies); certainty ⊕⊕⊕⊝ (moderate, due to risk of bias). Probably decreases the incidence of diarrhoea.
  • Incidence of diarrhoea, children > 5 years (follow-up: 10 days to 3 months): relative effect (95% CI 0.53 to 1.21); 624 participants (4 studies); certainty ⊕⊕⊝⊝ (low, due to risk of bias and imprecision). May decrease the incidence of diarrhoea.
  • Adverse events (follow-up: 10 to 44 days): absolute difference (95% CI 1 fewer to 2 more); 1575 participants (11 studies); certainty ⊕⊕⊝⊝ (low, due to risk of bias and inconsistency). There may be little or no difference in adverse events.
  • Duration of diarrhoea (follow-up: 10 days to 3 months): mean difference relative to the mean duration of diarrhoea without probiotics (95% CI 1.18 to 0.02 fewer days); 897 participants (5 studies); certainty ⊕⊕⊝⊝ (low, due to imprecision and inconsistency). May decrease the duration of diarrhoea.
  • Stools per day (follow-up: 10 days to 3 months): mean difference relative to the mean stools per day without probiotics (95% CI 0.6 to 0 fewer); 425 participants (4 studies); certainty ⊕⊕⊝⊝ (low, due to imprecision and inconsistency). There may be little or no difference in stools per day.

*The basis for the assumed risk (e.g. the median control group risk across studies) is provided in footnotes. The corresponding risk (and its 95% confidence interval) is based on the assumed risk in the comparison group and the relative effect of the intervention (and its 95% CI). CI: confidence interval; RR: risk ratio.

Control group risk estimates come from pooled estimates of control groups. Relative effect based on available case analysis

High risk of bias due to high loss to follow-up.

Imprecision due to few events and confidence intervals include appreciable benefit or harm.

Side effects: rash, nausea, flatulence, vomiting, increased phlegm, chest pain, constipation, taste disturbance and low appetite.

Risks were calculated from pooled risk differences.

High risk of bias. Only 11 of 16 trials reported on adverse events, suggesting a selective reporting bias.

Serious inconsistency. Numerous probiotic agents and doses were evaluated amongst a relatively small number of trials, limiting our ability to draw conclusions on the safety of the many probiotics agents and doses administered.

Serious unexplained inconsistency (large heterogeneity, I² = 79%, P = 0.04; point estimates and confidence intervals vary considerably).

Serious imprecision. The upper bound of 0.02 fewer days of diarrhoea is not considered patient important.

Serious unexplained inconsistency (large heterogeneity, I² = 78%, P = 0.05; point estimates and confidence intervals vary considerably).

Serious imprecision. The 95% confidence interval includes no effect and lower bound of 0.60 stools per day is of questionable patient importance.

14.1.4 Producing ‘Summary of findings’ tables

The GRADE Working Group’s software, GRADEpro GDT ( www.gradepro.org ), including GRADE’s interactive handbook, is available to assist review authors in the preparation of ‘Summary of findings’ tables. GRADEpro can use data on the comparator group risk and the effect estimate (entered by the review authors or imported from files generated in RevMan) to produce the relative effects and absolute risks associated with experimental interventions. In addition, it leads the user through the process of a GRADE assessment, and produces a table that can be used as a standalone table in a review (including by direct import into software such as RevMan or integration with RevMan Web), or an interactive ‘Summary of findings’ table (see help resources in GRADEpro).

14.1.5 Statistical considerations in ‘Summary of findings’ tables

14.1.5.1 Dichotomous outcomes

‘Summary of findings’ tables should include both absolute and relative measures of effect for dichotomous outcomes. Risk ratios, odds ratios and risk differences are different ways of comparing two groups with dichotomous outcome data (see Chapter 6, Section 6.4.1 ). Furthermore, there are two distinct risk ratios, depending on which event (e.g. ‘yes’ or ‘no’) is the focus of the analysis (see Chapter 6, Section 6.4.1.5 ). In the presence of a non-zero intervention effect, any variation across studies in the comparator group risks (i.e. variation in the risk of the event occurring without the intervention of interest, for example in different populations) makes it impossible for more than one of these measures to be truly the same in every study.

It has long been assumed in epidemiology that relative measures of effect are more consistent than absolute measures of effect from one scenario to another. There is empirical evidence to support this assumption (Engels et al 2000, Deeks and Altman 2001, Furukawa et al 2002). For this reason, meta-analyses should generally use either a risk ratio or an odds ratio as a measure of effect (see Chapter 10, Section 10.4.3 ). Correspondingly, a single estimate of relative effect is likely to be a more appropriate summary than a single estimate of absolute effect. If a relative effect is indeed consistent across studies, then different comparator group risks will have different implications for absolute benefit. For instance, if the risk ratio is consistently 0.75, then the experimental intervention would reduce a comparator group risk of 80% to 60% in the intervention group (an absolute risk reduction of 20 percentage points), but would also reduce a comparator group risk of 20% to 15% in the intervention group (an absolute risk reduction of 5 percentage points).
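The arithmetic in this example can be sketched in a few lines of Python (a minimal illustration; the function name is mine, not from the Handbook):

```python
# A constant risk ratio implies different absolute risk reductions
# at different comparator group risks.

def absolute_risk_reduction(comparator_risk: float, risk_ratio: float) -> float:
    """Absolute risk reduction implied by a constant risk ratio."""
    return comparator_risk - risk_ratio * comparator_risk

rr = 0.75  # consistent relative effect across settings
print(round(absolute_risk_reduction(0.80, rr), 2))  # 0.2  (20 percentage points)
print(round(absolute_risk_reduction(0.20, rr), 2))  # 0.05 (5 percentage points)
```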

‘Summary of findings’ tables are built around the assumption of a consistent relative effect. It is therefore important to consider the implications of this effect for different comparator group risks (these can be derived or estimated from a number of sources, see Section 14.1.6.3 ), which may require an assessment of the certainty of evidence for prognostic evidence (Spencer et al 2012, Iorio et al 2015). For any comparator group risk, it is possible to estimate a corresponding intervention group risk (i.e. the absolute risk with the intervention) from the meta-analytic risk ratio or odds ratio. Note that the numbers provided in the ‘Corresponding risk’ column are specific to the ‘risks’ in the adjacent column.

For the meta-analytic risk ratio (RR) and assumed comparator risk (ACR) the corresponding intervention risk is obtained as:

Corresponding intervention risk, per 1000 = 1000 × ACR × RR

As an example, in Figure 14.1.a , the meta-analytic risk ratio for symptomless deep vein thrombosis (DVT) is RR = 0.10 (95% CI 0.04 to 0.26). Assuming a comparator risk of ACR = 10 per 1000 = 0.01, we obtain:

Corresponding intervention risk, per 1000 = 1000 × 0.01 × 0.10 = 1 per 1000

For the meta-analytic odds ratio (OR) and assumed comparator risk, ACR, the corresponding intervention risk is obtained as:

Corresponding intervention risk, per 1000 = 1000 × (OR × ACR) / (1 − ACR + OR × ACR)

Upper and lower confidence limits for the corresponding intervention risk are obtained by replacing RR or OR by their upper and lower confidence limits, respectively (e.g. replacing 0.10 with 0.04, then with 0.26, in the example). Such confidence intervals do not incorporate uncertainty in the assumed comparator risks.
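The two conversions can be sketched as follows (a minimal illustration; the function names are mine, not from the Handbook or GRADEpro, and ACR is expressed as a proportion):

```python
# Corresponding intervention risk from a relative effect and an
# assumed comparator risk (ACR).

def risk_from_rr(acr: float, rr: float) -> float:
    """Corresponding intervention risk implied by a risk ratio."""
    return rr * acr

def risk_from_or(acr: float, odds_ratio: float) -> float:
    """Corresponding intervention risk implied by an odds ratio."""
    return (odds_ratio * acr) / (1 - acr + odds_ratio * acr)

# Symptomless DVT example: RR = 0.10 (95% CI 0.04 to 0.26), ACR = 10 per 1000.
# Substituting the CI limits for the RR gives the CI of the corresponding risk
# (this does not incorporate uncertainty in the ACR itself).
acr = 0.01
for rr in (0.10, 0.04, 0.26):
    print(round(1000 * risk_from_rr(acr, rr), 1), "per 1000")
```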

When dealing with risk ratios, it is critical that the same definition of ‘event’ is used as was used for the meta-analysis. For example, if the meta-analysis focused on ‘death’ (as opposed to survival) as the event, then corresponding risks in the ‘Summary of findings’ table must also refer to ‘death’.
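A hedged numeric illustration of this point (the numbers are invented, not from the Handbook):

```python
# With 40% deaths in the comparator group and 30% in the intervention
# group, the risk ratio for 'death' and the risk ratio for 'survival'
# are different quantities.

control_death, intervention_death = 0.40, 0.30

rr_death = intervention_death / control_death                 # 0.75
rr_survival = (1 - intervention_death) / (1 - control_death)  # ~1.17

print(round(rr_death, 2), round(rr_survival, 2))
```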

In (rare) circumstances in which there is clear rationale to assume a consistent risk difference in the meta-analysis, in principle it is possible to present this for relevant ‘assumed risks’ and their corresponding risks, and to present the corresponding (different) relative effects for each assumed risk.

The risk difference expresses the difference between the ACR and the corresponding intervention risk (or the difference between the experimental and the comparator intervention).

For the meta-analytic risk ratio (RR) and assumed comparator risk (ACR) the corresponding risk difference is obtained as (note that risks can also be expressed using percentage or percentage points):

Risk difference = ACR − (RR × ACR) = ACR × (1 − RR)

As an example, in Figure 14.1.b the meta-analytic risk ratio is 0.41 (95% CI 0.29 to 0.55) for diarrhoea in children less than 5 years of age. Assuming a comparator group risk of 22.3% we obtain:

Risk difference = 0.223 × (1 − 0.41) = 0.132, i.e. 13.2 percentage points (132 fewer per 1000)

For the meta-analytic odds ratio (OR) and assumed comparator risk (ACR) the absolute risk difference is obtained as (percentage points):

Risk difference = ACR − (OR × ACR) / (1 − ACR + OR × ACR)

Upper and lower confidence limits for the absolute risk difference are obtained by re-running the calculation above while replacing RR or OR by their upper and lower confidence limits, respectively (e.g. replacing 0.41 with 0.29, then with 0.55, in the example). Such confidence intervals do not incorporate uncertainty in the assumed comparator risks.
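The risk-difference calculations can be sketched in the same way (again, the helper names are ours and purely illustrative):

```python
def rd_from_rr(acr, rr):
    """Risk difference (comparator minus intervention) from a risk ratio:
    RD = ACR - RR x ACR = ACR x (1 - RR)."""
    return acr * (1 - rr)

def rd_from_or(acr, odds_ratio):
    """Risk difference from an odds ratio:
    RD = ACR - (OR x ACR) / (1 - ACR + OR x ACR)."""
    return acr - (odds_ratio * acr) / (1 - acr + odds_ratio * acr)

# Figure 14.1.b example: diarrhoea in children under 5 years,
# RR = 0.41 (95% CI 0.29 to 0.55), assumed comparator risk 22.3%.
acr = 0.223
print(round(rd_from_rr(acr, 0.41) * 100, 1))  # 13.2 percentage points
# Confidence limits: substitute the RR confidence limits.
print(round(rd_from_rr(acr, 0.29) * 100, 1))  # 15.8
print(round(rd_from_rr(acr, 0.55) * 100, 1))  # 10.0
```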

14.1.5.2 Time-to-event outcomes

Time-to-event outcomes measure whether and when a particular event (e.g. death) occurs (van Dalen et al 2007). The impact of the experimental intervention relative to the comparison group on time-to-event outcomes is usually measured using a hazard ratio (HR) (see Chapter 6, Section 6.8.1 ).

A hazard ratio expresses a relative effect estimate. It may be used in various ways to obtain absolute risks and other interpretable quantities for a specific population. Here we describe how to re-express hazard ratios in terms of: (i) absolute risk of event-free survival within a particular period of time; (ii) absolute risk of an event within a particular period of time; and (iii) median time to the event. All methods are built on an assumption of consistent relative effects (i.e. that the hazard ratio does not vary over time).

(i) Absolute risk of event-free survival within a particular period of time Event-free survival (e.g. overall survival) is commonly reported by individual studies. To obtain absolute effects for time-to-event outcomes measured as event-free survival, the summary HR can be used in conjunction with an assumed proportion of patients who are event-free in the comparator group (Tierney et al 2007). This proportion of patients will be specific to a period of time of observation. However, it is not strictly necessary to specify this period of time. For instance, a proportion of 50% of event-free patients might apply to patients with a high event rate observed over 1 year, or to patients with a low event rate observed over 2 years.

Corresponding intervention risk of event-free survival = ACR^HR = exp(HR × ln(ACR))

As an example, suppose the meta-analytic hazard ratio is 0.42 (95% CI 0.25 to 0.72). Assuming a comparator group risk of event-free survival (e.g. for overall survival, people being alive) at 2 years of ACR = 900 per 1000 = 0.9, we obtain:

Corresponding intervention risk of event-free survival = 0.9^0.42 ≈ 0.956

so that 956 per 1000 people will be alive with the experimental intervention at 2 years. The derivation of the risk should be explained in a comment or footnote.

(ii) Absolute risk of an event within a particular period of time To obtain this absolute effect, again the summary HR can be used (Tierney et al 2007):

Corresponding intervention risk of an event = 1 − (1 − ACR)^HR

In the example, suppose we assume a comparator group risk of events (e.g. for mortality, people being dead) at 2 years of ACR = 100 per 1000 = 0.1. We obtain:

Corresponding intervention risk of an event = 1 − (1 − 0.1)^0.42 = 1 − 0.956 ≈ 0.044

so that 44 per 1000 people will be dead with the experimental intervention at 2 years.

(iii) Median time to the event Instead of absolute numbers, the time to the event in the intervention and comparison groups can be expressed as median survival time in months or years. To obtain median survival time the pooled HR can be applied to an assumed median survival time in the comparator group (Tierney et al 2007):

Corresponding median survival time = median survival time in comparator group / HR

In the example, assuming a comparator group median survival time of 80 months, we obtain:

Corresponding median survival time = 80 / 0.42 ≈ 190 months

For all three of these options for re-expressing results of time-to-event analyses, upper and lower confidence limits for the corresponding intervention risk are obtained by replacing HR by its upper and lower confidence limits, respectively (e.g. replacing 0.42 with 0.25, then with 0.72, in the example). Again, as for dichotomous outcomes, such confidence intervals do not incorporate uncertainty in the assumed comparator group risks. This is of special concern for long-term survival with a low or moderate mortality rate and a corresponding high number of censored patients (i.e. a low number of patients at risk and a high censoring rate).
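The three re-expressions above can be sketched as small helpers (assuming, as in the text, a constant hazard ratio; the function names are ours):

```python
def event_free_survival(acr_survival, hr):
    """(i) Intervention-group probability of event-free survival:
    S_int = S_comp ** HR."""
    return acr_survival ** hr

def event_risk(acr_event, hr):
    """(ii) Intervention-group risk of the event:
    1 - (1 - ACR) ** HR."""
    return 1 - (1 - acr_event) ** hr

def median_time_to_event(comparator_median, hr):
    """(iii) Intervention-group median time to event:
    comparator median / HR."""
    return comparator_median / hr

hr = 0.42  # 95% CI 0.25 to 0.72; substitute the limits for a CI
print(round(event_free_survival(0.9, hr), 3))   # about 0.957 alive at 2 years
print(round(event_risk(0.1, hr), 3))            # about 0.043 dead at 2 years
print(round(median_time_to_event(80, hr)))      # about 190 months
```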

14.1.6 Detailed contents of a ‘Summary of findings’ table

14.1.6.1 Table title and header

The title of each ‘Summary of findings’ table should specify the healthcare question, framed in terms of the population and making it clear exactly what comparison of interventions is made. In Figure 14.1.a, the population is people taking long aeroplane flights, the intervention is compression stockings, and the control is no compression stockings.

The first rows of each ‘Summary of findings’ table should provide the following ‘header’ information:

Patients or population This further clarifies the population (and possibly the subpopulations) of interest and ideally the magnitude of risk of the most crucial adverse outcome at which an intervention is directed. For instance, people on a long-haul flight may be at different risks for DVT; those using selective serotonin reuptake inhibitors (SSRIs) might be at different risk for side effects; while those with atrial fibrillation may be at low (< 1%), moderate (1% to 4%) or high (> 4%) yearly risk of stroke.

Setting This should state any specific characteristics of the settings of the healthcare question that might limit the applicability of the summary of findings to other settings (e.g. primary care in Europe and North America).

Intervention The experimental intervention.

Comparison The comparator intervention (including no specific intervention).

14.1.6.2 Outcomes

The rows of a ‘Summary of findings’ table should include all desirable and undesirable health outcomes (listed in order of importance) that are essential for decision making, up to a maximum of seven outcomes. If there are more outcomes in the review, review authors will need to omit the less important outcomes from the table; the decision about which outcomes are critical or important to the review should be made during protocol development (see Chapter 3). Review authors should provide time frames for the measurement of the outcomes (e.g. 90 days or 12 months) and the type of instrument scores (e.g. ranging from 0 to 100).

Note that review authors should include the pre-specified critical and important outcomes in the table whether data are available or not. However, they should be alert to the possibility that the importance of an outcome (e.g. a serious adverse effect) may only become known after the protocol was written or the analysis was carried out, and should take appropriate actions to include these in the ‘Summary of findings’ table.

The ‘Summary of findings’ table can include effects in subgroups of the population for different comparator risks and effect sizes separately. For instance, in Figure 14.1.b effects are presented for children younger and older than 5 years separately. Review authors may also opt to produce separate ‘Summary of findings’ tables for different populations.

Review authors should include serious adverse events, but it might be possible to combine minor adverse events as a single outcome, and describe this in an explanatory footnote (note that it is not appropriate to add events together unless they are independent, that is, a participant who has experienced one adverse event has an unaffected chance of experiencing the other adverse event).

Outcomes measured at multiple time points represent a particular problem. In general, to keep the table simple, review authors should present multiple time points only for outcomes critical to decision making, where either the result or the decision made are likely to vary over time. The remainder should be presented at a common time point where possible.

Review authors can present continuous outcome measures in the ‘Summary of findings’ table and should endeavour to make these interpretable to the target audience. This requires that the units are clear and readily interpretable, for example, days of pain, or frequency of headache, and the name and scale of any measurement tools used should be stated (e.g. a Visual Analogue Scale, ranging from 0 to 100). However, many measurement instruments are not readily interpretable by non-specialist clinicians or patients, for example, points on a Beck Depression Inventory or quality of life score. For these, a more interpretable presentation might involve converting a continuous to a dichotomous outcome, such as >50% improvement (see Chapter 15, Section 15.5 ).

14.1.6.3 Best estimate of risk with comparator intervention

Review authors should provide up to three typical risks for participants receiving the comparator intervention. For dichotomous outcomes, we recommend that these be presented in the form of the number of people experiencing the event per 100 or 1000 people (natural frequency) depending on the frequency of the outcome. For continuous outcomes, this would be stated as a mean or median value of the outcome measured.

Estimated or assumed comparator intervention risks could be based on assessments of typical risks in different patient groups derived from the review itself, individual representative studies in the review, or risks derived from a systematic review of prognosis studies or other sources of evidence, which may in turn require an assessment of the certainty of the prognostic evidence (Spencer et al 2012, Iorio et al 2015). Ideally, risks would reflect groups that clinicians can easily identify on the basis of their presenting features.

An explanatory footnote should specify the source or rationale for each comparator group risk, including the time period to which it corresponds where appropriate. In Figure 14.1.a , clinicians can easily differentiate individuals with risk factors for deep venous thrombosis from those without. If there is known to be little variation in baseline risk then review authors may use the median comparator group risk across studies. If typical risks are not known, an option is to choose the risk from the included studies, providing the second highest for a high and the second lowest for a low risk population.
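The heuristics above (median risk across studies when baseline risk varies little; otherwise the second lowest and second highest study risks as low- and high-risk scenarios) can be sketched as follows. This is a hypothetical helper for illustration, not Cochrane software:

```python
import statistics

def comparator_risk_scenarios(study_risks):
    """Sketch of the Section 14.1.6.3 heuristics: report the median
    comparator group risk across studies, plus the second lowest and
    second highest study risks as low- and high-risk scenarios.
    Whether baseline risk varies enough to warrant the extra scenarios
    remains a judgement."""
    risks = sorted(study_risks)
    return {
        "median": statistics.median(risks),
        "low_risk": risks[1],    # second lowest study risk
        "high_risk": risks[-2],  # second highest study risk
    }

# Five hypothetical comparator group risks from included studies.
print(comparator_risk_scenarios([0.02, 0.05, 0.08, 0.12, 0.20]))
# median 0.08, low-risk scenario 0.05, high-risk scenario 0.12
```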

14.1.6.4 Risk with intervention

For dichotomous outcomes, review authors should provide a corresponding absolute risk for each comparator group risk, along with a confidence interval. This absolute risk with the (experimental) intervention will usually be derived from the meta-analysis result presented in the relative effect column (see Section 14.1.6.6 ). Formulae are provided in Section 14.1.5 . Review authors should present the absolute effect in the same format as the risks with comparator intervention (see Section 14.1.6.3 ), for example as the number of people experiencing the event per 1000 people.

For continuous outcomes, a difference in means or standardized difference in means should be presented with its confidence interval. These will typically be obtained directly from a meta-analysis. Explanatory text should be used to clarify the meaning, as in Figures 14.1.a and 14.1.b .

14.1.6.5 Risk difference

For dichotomous outcomes, the risk difference can be provided using one of the ‘Summary of findings’ table formats as an additional option (see Figure 14.1.b ). This risk difference expresses the difference between the experimental and comparator intervention and will usually be derived from the meta-analysis result presented in the relative effect column (see Section 14.1.6.6 ). Formulae are provided in Section 14.1.5 . Review authors should present the risk difference in the same format as assumed and corresponding risks with comparator intervention (see Section 14.1.6.3 ); for example, as the number of people experiencing the event per 1000 people or as percentage points if the assumed and corresponding risks are expressed in percentage.

For continuous outcomes, if the ‘Summary of findings’ table includes this option, the mean difference can be presented here and the ‘corresponding risk’ column left blank (see Figure 14.1.b ).

14.1.6.6 Relative effect (95% CI)

The relative effect will typically be a risk ratio or odds ratio (or occasionally a hazard ratio) with its accompanying 95% confidence interval, obtained from a meta-analysis performed on the basis of the same effect measure. Risk ratios and odds ratios are similar when the comparator intervention risks are low and effects are small, but may differ considerably when comparator group risks increase. The meta-analysis may involve an assumption of either fixed or random effects, depending on what the review authors consider appropriate, and implying that the relative effect is either an estimate of the effect of the intervention, or an estimate of the average effect of the intervention across studies, respectively.

14.1.6.7 Number of participants (studies)

This column should include the number of participants assessed in the included studies for each outcome and the corresponding number of studies that contributed these participants.

14.1.6.8 Certainty of the evidence (GRADE)

Review authors should comment on the certainty of the evidence (also known as quality of the body of evidence or confidence in the effect estimates). Review authors should use the specific evidence grading system developed by the GRADE Working Group (Atkins et al 2004, Guyatt et al 2008, Guyatt et al 2011a), which is described in detail in Section 14.2 . The GRADE approach categorizes the certainty in a body of evidence as ‘high’, ‘moderate’, ‘low’ or ‘very low’ by outcome. This is a result of judgement, but the judgement process operates within a transparent structure. As an example, the certainty would be ‘high’ if the summary were of several randomized trials with low risk of bias, but the rating of certainty becomes lower if there are concerns about risk of bias, inconsistency, indirectness, imprecision or publication bias. Judgements other than of ‘high’ certainty should be made transparent using explanatory footnotes or the ‘Comments’ column in the ‘Summary of findings’ table (see Section 14.1.6.10 ).

14.1.6.9 Comments

The aim of the ‘Comments’ field is to help interpret the information or data identified in the row. For example, a comment may address the validity of the outcome measure or the presence of variables that are associated with the magnitude of effect. Important caveats about the results should be flagged here. Not all rows will need comments, and it is best to leave a blank if there is nothing warranting a comment.

14.1.6.10 Explanations

Detailed explanations should be included as footnotes to support the judgements in the ‘Summary of findings’ table, such as the overall GRADE assessment. The explanations should describe the rationale for important aspects of the content. Table 14.1.a lists guidance for useful explanations. Explanations should be concise, informative, relevant, easy to understand and accurate. If explanations cannot be sufficiently described in footnotes, review authors should provide further details of the issues in the Results and Discussion sections of the review.

Table 14.1.a Guidance for providing useful explanations in ‘Summary of findings’ (SoF) tables. Adapted from Santesso et al (2016)

For example, when explaining inconsistency, refer to the heterogeneity statistics (I², Chi², Tau²), the overlap of confidence intervals, or the similarity of point estimates; when describing the magnitude of heterogeneity, describe it as considerable, substantial, moderate or not important.

14.2 Assessing the certainty or quality of a body of evidence

14.2.1 The GRADE approach

The Grades of Recommendation, Assessment, Development and Evaluation Working Group (GRADE Working Group) has developed a system for grading the certainty of evidence (Schünemann et al 2003, Atkins et al 2004, Schünemann et al 2006, Guyatt et al 2008, Guyatt et al 2011a). Over 100 organizations including the World Health Organization (WHO), the American College of Physicians, the American Society of Hematology (ASH), the Canadian Agency for Drugs and Technologies in Health (CADTH) and the National Institute for Health and Care Excellence (NICE) in the UK have adopted the GRADE system ( www.gradeworkinggroup.org ).

Cochrane has also formally adopted this approach, and all Cochrane Reviews should use GRADE to evaluate the certainty of evidence for important outcomes (see MECIR Box 14.2.a ).

MECIR Box 14.2.a Relevant expectations for conduct of intervention reviews

Assessing the certainty of the body of evidence ( )

GRADE is the most widely used approach for summarizing confidence in effects of interventions by outcome across studies. It is preferable to use the online GRADEpro tool, and to use it as described in the help system of the software. This should help to ensure that author teams are accessing the same information to inform their judgements. Ideally, two people working independently should assess the certainty of the body of evidence and reach a consensus view on any downgrading decisions. The five GRADE considerations should be addressed irrespective of whether the review includes a ‘Summary of findings’ table. It is helpful to draw on this information in the Discussion, in the Authors’ conclusions and to convey the certainty in the evidence in the Abstract and Plain language summary.

Justifying assessments of the certainty of the body of evidence ( )

The adoption of a structured approach ensures transparency in formulating an interpretation of the evidence, and the result is more informative to the user.

For systematic reviews, the GRADE approach defines the certainty of a body of evidence as the extent to which one can be confident that an estimate of effect or association is close to the quantity of specific interest. Assessing the certainty of a body of evidence involves consideration of within- and across-study risk of bias (limitations in study design and execution or methodological quality), inconsistency (or heterogeneity), indirectness of evidence, imprecision of the effect estimates and risk of publication bias (see Section 14.2.2 ), as well as domains that may increase our confidence in the effect estimate (as described in Section 14.2.3 ). The GRADE system entails an assessment of the certainty of a body of evidence for each individual outcome. Judgements about the domains that determine the certainty of evidence should be described in the results or discussion section and as part of the ‘Summary of findings’ table.

The GRADE approach specifies four levels of certainty ( Figure 14.2.a ). For interventions, including diagnostic and other tests that are evaluated as interventions (Schünemann et al 2008b, Schünemann et al 2008a, Balshem et al 2011, Schünemann et al 2012), the starting point for rating the certainty of evidence is categorized into two types:

  • randomized trials; and
  • non-randomized studies of interventions (NRSI), including observational studies (including but not limited to cohort studies, case-control studies, cross-sectional studies, case series and case reports, although not all of these designs are usually included in Cochrane Reviews).

There are many instances in which review authors rely on information from NRSI, in particular to evaluate potential harms (see Chapter 24 ). In addition, review authors can obtain relevant data from both randomized trials and NRSI, with each type of evidence complementing the other (Schünemann et al 2013).

In GRADE, a body of evidence from randomized trials begins with a high-certainty rating while a body of evidence from NRSI begins with a low-certainty rating. The lower rating with NRSI is the result of the potential bias induced by the lack of randomization (i.e. confounding and selection bias).

However, when using the new Risk Of Bias In Non-randomized Studies of Interventions (ROBINS-I) tool (Sterne et al 2016), an assessment tool that covers the risk of bias due to lack of randomization, all studies may start as high certainty of the evidence (Schünemann et al 2018). The approach of starting all study designs (including NRSI) as high certainty does not conflict with the initial GRADE approach of starting the rating of NRSI as low certainty evidence. This is because a body of evidence from NRSI should generally be downgraded by two levels due to the inherent risk of bias associated with the lack of randomization, namely confounding and selection bias. Not downgrading NRSI from high to low certainty needs transparent and detailed justification for what mitigates concerns about confounding and selection bias (Schünemann et al 2018). Very few examples of where not rating down by two levels is appropriate currently exist.

The highest certainty rating applies to a body of evidence when there are no concerns about any of the GRADE factors listed in Figure 14.2.a. Review authors often downgrade evidence to moderate, low or even very low certainty evidence, depending on the presence of the five factors in Figure 14.2.a. Usually, the certainty rating will fall by one level for each factor, up to a maximum of three levels for all factors. If there are very severe problems for any one domain (e.g. when assessing risk of bias, all studies were unconcealed, unblinded and lost over 50% of their patients to follow-up), evidence may fall by two levels due to that factor alone. It is not possible to rate lower than ‘very low certainty’ evidence.
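GRADE ratings are structured judgements, not a computation, but the level arithmetic described above can be sketched (a hypothetical helper, assuming the standard four levels and a floor at ‘very low’):

```python
LEVELS = ["very low", "low", "moderate", "high"]

def certainty(start, down=0, up=0):
    """start: 'high' (randomized trials) or 'low' (NRSI under the initial
    GRADE approach). down/up: total levels moved across the downgrading
    and upgrading factors. The result is clamped to the four levels."""
    idx = LEVELS.index(start) - down + up
    return LEVELS[max(0, min(idx, len(LEVELS) - 1))]

print(certainty("high", down=1))  # moderate
print(certainty("high", down=5))  # very low (cannot rate lower)
print(certainty("low", up=1))     # moderate (e.g. a large effect)
```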

Review authors will generally grade evidence from sound non-randomized studies as low certainty, even if ROBINS-I is used. If, however, such studies yield large effects and there is no obvious bias explaining those effects, review authors may rate the evidence as moderate or – if the effect is large enough – even as high certainty ( Figure 14.2.a ). The very low certainty level is appropriate for, but is not limited to, studies with critical problems and unsystematic clinical observations (e.g. case series or case reports).

Figure 14.2.a Levels of the certainty of a body of evidence in the GRADE approach. *Upgrading criteria are usually applicable to non-randomized studies only (but exceptions exist).

High certainty: ⊕⊕⊕⊕
Moderate certainty: ⊕⊕⊕◯
Low certainty: ⊕⊕◯◯
Very low certainty: ⊕◯◯◯

14.2.2 Domains that can lead to decreasing the certainty level of a body of evidence   

We now describe in more detail the five reasons (or domains) for downgrading the certainty of a body of evidence for a specific outcome. In each case, if no reason is found for downgrading the evidence, it should be classified as ‘no limitation or not serious’ (not important enough to warrant downgrading). If a reason is found for downgrading the evidence, it should be classified as ‘serious’ (downgrading the certainty rating by one level) or ‘very serious’ (downgrading it by two levels). For non-randomized studies assessed with ROBINS-I, downgrading by three levels should be classified as ‘extremely serious’.

(1) Risk of bias or limitations in the detailed design and implementation

Our confidence in an estimate of effect decreases if studies suffer from major limitations that are likely to result in a biased assessment of the intervention effect. For randomized trials, these methodological limitations include failure to generate a random sequence, lack of allocation sequence concealment, lack of blinding (particularly with subjective outcomes that are highly susceptible to biased assessment), a large loss to follow-up or selective reporting of outcomes. Chapter 8 provides a discussion of study-level assessments of risk of bias in the context of a Cochrane Review, and proposes an approach to assessing the risk of bias for an outcome across studies as ‘Low’ risk of bias, ‘Some concerns’ and ‘High’ risk of bias for randomized trials. Levels of ‘Low’, ‘Moderate’, ‘Serious’ and ‘Critical’ risk of bias arise for non-randomized studies assessed with ROBINS-I (Chapter 25). These assessments should feed directly into this GRADE domain. In particular, ‘Low’ risk of bias would indicate ‘no limitation’; ‘Some concerns’ would indicate either ‘no limitation’ or ‘serious limitation’; and ‘High’ risk of bias would indicate either ‘serious limitation’ or ‘very serious limitation’. ‘Critical’ risk of bias on ROBINS-I would indicate extremely serious limitations in GRADE. Review authors should use their judgement to decide between alternative categories, depending on the likely magnitude of the potential biases.

Every study addressing a particular outcome will differ, to some degree, in the risk of bias. Review authors should make an overall judgement on whether the certainty of evidence for an outcome warrants downgrading on the basis of study limitations. The assessment of study limitations should apply to the studies contributing to the results in the ‘Summary of findings’ table, rather than to all studies that could potentially be included in the analysis. We have argued in Chapter 7, Section 7.6.2 , that the primary analysis should be restricted to studies at low (or low and unclear) risk of bias where possible.

Table 14.2.a presents the judgements that must be made in going from assessments of the risk of bias to judgements about study limitations for each outcome included in a ‘Summary of findings’ table. A rating of high certainty evidence can be achieved only when most evidence comes from studies that met the criteria for low risk of bias. For example, of the 22 studies addressing the impact of beta-blockers on mortality in patients with heart failure, most probably or certainly used concealed allocation of the sequence, all blinded at least some key groups and follow-up of randomized patients was almost complete (Brophy et al 2001). The certainty of evidence might be downgraded by one level when most of the evidence comes from individual studies either with a crucial limitation for one item, or with some limitations for multiple items. An example of very serious limitations, warranting downgrading by two levels, is provided by evidence on surgery versus conservative treatment in the management of patients with lumbar disc prolapse (Gibson and Waddell 2007). We are uncertain of the benefit of surgery in reducing symptoms after one year or longer, because the one study included in the analysis had inadequate concealment of the allocation sequence and the outcome was assessed using a crude rating by the surgeon without blinding.

(2) Unexplained heterogeneity or inconsistency of results

When studies yield widely differing estimates of effect (heterogeneity or variability in results), investigators should look for robust explanations for that heterogeneity. For instance, drugs may have larger relative effects in sicker populations or when given in larger doses. A detailed discussion of heterogeneity and its investigation is provided in Chapter 10, Section 10.10 and Section 10.11 . If an important modifier exists, with good evidence that important outcomes are different in different subgroups (which would ideally be pre-specified), then a separate ‘Summary of findings’ table may be considered for a separate population. For instance, a separate ‘Summary of findings’ table would be used for carotid endarterectomy in symptomatic patients with high grade stenosis (70% to 99%) in which the intervention is, in the hands of the right surgeons, beneficial, and another (if review authors considered it relevant) for asymptomatic patients with low grade stenosis (less than 30%) in which surgery appears harmful (Orrapin and Rerkasem 2017). When heterogeneity exists and affects the interpretation of results, but review authors are unable to identify a plausible explanation with the data available, the certainty of the evidence decreases.

(3) Indirectness of evidence

Two types of indirectness are relevant. First, a review comparing the effectiveness of alternative interventions (say A and B) may find that randomized trials are available, but they have compared A with placebo and B with placebo. Thus, the evidence is restricted to indirect comparisons between A and B. Where indirect comparisons are undertaken within a network meta-analysis context, GRADE for network meta-analysis should be used (see Chapter 11, Section 11.5 ).

Second, a review may find randomized trials that meet eligibility criteria but address a restricted version of the main review question in terms of population, intervention, comparator or outcomes. For example, suppose that in a review addressing an intervention for secondary prevention of coronary heart disease, most identified studies happened to be in people who also had diabetes. Then the evidence may be regarded as indirect in relation to the broader question of interest because the population is primarily related to people with diabetes. The opposite scenario can equally apply: a review addressing the effect of a preventive strategy for coronary heart disease in people with diabetes may consider studies in people without diabetes to provide relevant, albeit indirect, evidence. This would be particularly likely if investigators had conducted few if any randomized trials in the target population (e.g. people with diabetes). Other sources of indirectness may arise from interventions studied (e.g. if in all included studies a technical intervention was implemented by expert, highly trained specialists in specialist centres, then evidence on the effects of the intervention outside these centres may be indirect), comparators used (e.g. if the comparator groups received an intervention that is less effective than standard treatment in most settings) and outcomes assessed (e.g. indirectness due to surrogate outcomes when data on patient-important outcomes are not available, or when investigators seek data on quality of life but only symptoms are reported). Review authors should make judgements transparent when they believe downgrading is justified, based on differences in anticipated effects in the group of primary interest. Review authors may be aided and increase transparency of their judgements about indirectness if they use Table 14.2.b available in the GRADEpro GDT software (Schünemann et al 2013).

(4) Imprecision of results

When studies include few participants or few events, and thus have wide confidence intervals, review authors can lower their rating of the certainty of the evidence. The confidence intervals included in the ‘Summary of findings’ table will provide readers with information that allows them to make, to some extent, their own rating of precision. Review authors can use a calculation of the optimal information size (OIS) or review information size (RIS), similar to sample size calculations, to make judgements about imprecision (Guyatt et al 2011b, Schünemann 2016). The OIS or RIS is calculated on the basis of the number of participants required for an adequately powered individual study. If the 95% confidence interval excludes a risk ratio (RR) of 1.0, and the total number of events or patients exceeds the OIS criterion, precision is adequate. If the 95% CI includes appreciable benefit or harm (an RR of under 0.75 or over 1.25 is often suggested as a very rough guide) downgrading for imprecision may be appropriate even if OIS criteria are met (Guyatt et al 2011b, Schünemann 2016).
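The rough imprecision checks described above can be sketched as follows (a hypothetical helper using the stated rules of thumb; a real GRADE judgement weighs more considerations than these):

```python
def imprecision_concern(ci_low, ci_high, n_total, ois):
    """For a ratio measure (RR/OR) with 95% CI (ci_low, ci_high):
    precision is adequate when the CI excludes 1.0 and the number of
    events or participants meets the optimal information size (OIS).
    A CI stretching from appreciable benefit (<0.75) to appreciable
    harm (>1.25) suggests downgrading even when the OIS is met."""
    excludes_null = ci_high < 1.0 or ci_low > 1.0
    meets_ois = n_total >= ois
    spans_appreciable = ci_low < 0.75 and ci_high > 1.25
    if spans_appreciable:
        return True
    return not (excludes_null and meets_ois)

# Illustrative numbers only (the OIS of 2000 is an assumption).
print(imprecision_concern(0.04, 0.26, 2637, 2000))  # False: precise
print(imprecision_concern(0.70, 1.30, 5000, 2000))  # True: CI spans benefit and harm
print(imprecision_concern(0.80, 1.10, 400, 2000))   # True: small, CI crosses 1.0
```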

(5) High probability of publication bias

The certainty of evidence level may be downgraded if investigators fail to report studies on the basis of results (typically those that show no effect: publication bias) or outcomes (typically those that may be harmful or for which no effect was observed: selective outcome non-reporting bias). Selective reporting of outcomes from among multiple outcomes measured is assessed at the study level as part of the assessment of risk of bias (see Chapter 8, Section 8.7), so for the studies contributing to the outcome in the ‘Summary of findings’ table this is addressed by domain 1 above (limitations in the design and implementation). If a large number of studies included in the review do not contribute to an outcome, or if there is evidence of publication bias, the certainty of the evidence may be downgraded. Chapter 13 provides a detailed discussion of reporting biases, including publication bias, and how it may be tackled in a Cochrane Review. A prototypical situation that may elicit suspicion of publication bias is when published evidence includes a number of small studies, all of which are industry-funded (Bhandari et al 2004). For example, 14 studies of flavonoids in patients with haemorrhoids have shown apparent large benefits, but enrolled a total of only 1432 patients (i.e. each study enrolled relatively few patients) (Alonso-Coello et al 2006). The heavy involvement of sponsors in most of these studies raises questions of whether unpublished studies that suggest no benefit exist (publication bias).

A particular body of evidence can suffer from problems associated with more than one of the five factors listed here, and the greater the problems, the lower the certainty of evidence rating that should result. One could imagine a situation in which randomized trials were available, but all or virtually all of these limitations would be present, and in serious form. A very low certainty of evidence rating would result.
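The cumulative downgrading described here (and the occasional upgrading covered in Section 14.2.3) amounts to simple level arithmetic. A minimal sketch, assuming the standard starting points — randomized trials start at high certainty, non-randomized studies at low — with names of our own choosing, not an official GRADE algorithm:

```python
LEVELS = ["very low", "low", "moderate", "high"]

def overall_certainty(randomized, downgrades, upgrades=0):
    """Combine per-domain judgements into an overall certainty level.

    randomized: True if the body of evidence is randomized trials
    downgrades: total levels subtracted across the five domains
    upgrades: total levels added (large effect, dose-response, confounding)
    """
    start = LEVELS.index("high") if randomized else LEVELS.index("low")
    # Clamp to the scale: certainty cannot fall below 'very low' or exceed 'high'
    final = max(0, min(len(LEVELS) - 1, start - downgrades + upgrades))
    return LEVELS[final]
```

So randomized trials with, say, serious limitations in three domains end at very low certainty, matching the situation described above.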

Table 14.2.a Further guidelines for domain 1 (of 5) in a GRADE assessment: going from assessments of risk of bias in studies to judgements about study limitations for main outcomes across studies

Low risk of bias (most information is from results at low risk of bias; plausible bias is unlikely to seriously alter the results):

  • No apparent limitations: no serious limitations, do not downgrade.

Some concerns (most information is from results at low risk of bias or with some concerns; plausible bias raises some doubt about the results):

  • Potential limitations are unlikely to lower confidence in the estimate of effect: no serious limitations, do not downgrade.

  • Potential limitations are likely to lower confidence in the estimate of effect: serious limitations, downgrade one level.

High risk of bias (the proportion of information from results at high risk of bias is sufficient to affect the interpretation of results; plausible bias seriously weakens confidence in the results):

  • Crucial limitation for one criterion, or some limitations for multiple criteria, sufficient to lower confidence in the estimate of effect: serious limitations, downgrade one level.

  • Crucial limitation for one or more criteria sufficient to substantially lower confidence in the estimate of effect: very serious limitations, downgrade two levels.

Table 14.2.b Judgements about indirectness by outcome (available in GRADEpro GDT)

 

Each element of the question is judged against the options Yes / Probably yes / Probably no / No:

  • Population:

  • Intervention:

  • Comparator:

  • Direct comparison:

Final judgement about indirectness across domains:

 

14.2.3 Domains that may lead to increasing the certainty level of a body of evidence

Although NRSI and downgraded randomized trials will generally yield a low rating for certainty of evidence, there will be unusual circumstances in which review authors could ‘upgrade’ such evidence to moderate or even high certainty ( Table 14.3.a ).

  • Large effects On rare occasions when methodologically well-done observational studies yield large, consistent and precise estimates of the magnitude of an intervention effect, one may be particularly confident in the results. A large estimated effect (e.g. RR >2 or RR <0.5) in the absence of plausible confounders, or a very large effect (e.g. RR >5 or RR <0.2) in studies with no major threats to validity, might qualify for this. In these situations, while the NRSI may possibly have provided an over-estimate of the true effect, the weak study design may not explain all of the apparent observed benefit. Thus, despite reservations based on the observational study design, review authors are confident that the effect exists. The magnitude of the effect in these studies may move the assigned certainty of evidence from low to moderate (if the effect is large in the absence of other methodological limitations). For example, a meta-analysis of observational studies showed that bicycle helmets reduce the risk of head injuries in cyclists by a large margin (odds ratio (OR) 0.31, 95% CI 0.26 to 0.37) (Thompson et al 2000). This large effect, in the absence of obvious bias that could create the association, suggests a rating of moderate-certainty evidence.  Note : GRADE guidance suggests the possibility of rating up one level for a large effect if the relative effect is greater than 2.0. However, if the point estimate of the relative effect is greater than 2.0, but the confidence interval is appreciably below 2.0, then some hesitation would be appropriate in the decision to rate up for a large effect. Another situation allows inference of a strong association without a formal comparative study. Consider the question of the impact of routine colonoscopy versus no screening for colon cancer on the rate of perforation associated with colonoscopy. 
Here, a large series of representative patients undergoing colonoscopy may provide high certainty evidence about the risk of perforation associated with colonoscopy. When the risk of the event among patients receiving the relevant comparator is known to be near 0 (i.e. we are certain that the incidence of spontaneous colon perforation in patients not undergoing colonoscopy is extremely low), case series or cohort studies of representative patients can provide high certainty evidence of adverse effects associated with an intervention, thereby allowing us to infer a strong association from even a limited number of events.
  • Dose-response The presence of a dose-response gradient may increase our confidence in the findings of observational studies and thereby enhance the assigned certainty of evidence. For example, our confidence in the result of observational studies that show an increased risk of bleeding in patients who have supratherapeutic anticoagulation levels is increased by the observation that there is a dose-response gradient between the length of time needed for blood to clot (as measured by the international normalized ratio (INR)) and an increased risk of bleeding (Levine et al 2004). A systematic review of NRSI investigating the effect of cyclooxygenase-2 inhibitors on cardiovascular events found that the summary estimate (RR) with rofecoxib was 1.33 (95% CI 1.00 to 1.79) with doses less than 25 mg/d, and 2.19 (95% CI 1.64 to 2.91) with doses more than 25 mg/d. Although residual confounding is likely to exist in the NRSI that address this issue, the existence of a dose-response gradient and the large apparent effect of higher doses of rofecoxib markedly increase our strength of inference that the association cannot be explained by residual confounding, and is therefore likely to be both causal and, at high levels of exposure, substantial.  Note : GRADE guidance suggests the possibility of rating up one level for a large effect if the relative effect is greater than 2.0. Here, the fact that the point estimate of the relative effect is greater than 2.0, but the confidence interval is appreciably below 2.0, might make some hesitate in the decision to rate up for a large effect.
  • Plausible confounding On occasion, all plausible biases from randomized or non-randomized studies may be working to under-estimate an apparent intervention effect. For example, if only sicker patients receive an experimental intervention or exposure, yet they still fare better, it is likely that the actual intervention or exposure effect is larger than the data suggest. For instance, a rigorous systematic review of observational studies including a total of 38 million patients demonstrated higher death rates in private for-profit versus private not-for-profit hospitals (Devereaux et al 2002). One possible bias relates to different disease severity in patients in the two hospital types. It is likely, however, that patients in the not-for-profit hospitals were sicker than those in the for-profit hospitals. Thus, to the extent that residual confounding existed, it would bias results against the not-for-profit hospitals. The second likely bias was the possibility that higher numbers of patients with excellent private insurance coverage could lead to a hospital having more resources and a spill-over effect that would benefit those without such coverage. Since for-profit hospitals are likely to admit a larger proportion of such well-insured patients than not-for-profit hospitals, the bias is once again against the not-for-profit hospitals. Since the plausible biases would all diminish the demonstrated intervention effect, one might consider the evidence from these observational studies as moderate rather than low certainty. A parallel situation exists when observational studies have failed to demonstrate an association, but all plausible biases would have increased an intervention effect. This situation will usually arise in the exploration of apparent harmful effects. For example, because the hypoglycaemic drug phenformin causes lactic acidosis, the related agent metformin was under suspicion for the same toxicity. 
Nevertheless, very large observational studies have failed to demonstrate an association (Salpeter et al 2007). Given the likelihood that clinicians would be more alert to lactic acidosis in the presence of the agent and over-report its occurrence, one might consider this moderate, or even high certainty, evidence refuting a causal relationship between typical therapeutic doses of metformin and lactic acidosis.
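The "large effects" thresholds described at the start of this section (RR >2 or <0.5 for a large effect, RR >5 or <0.2 for a very large effect, with hesitation when the confidence interval extends appreciably below 2.0) can be sketched as a small helper. This is purely illustrative; the return strings and the exact CI rule are our own simplification:

```python
def large_effect_upgrade(rr, ci_near_null):
    """Illustrative sketch of GRADE's 'large effect' upgrading thresholds.

    rr: point estimate of the relative effect on the harmful-direction
        scale (for protective effects use 1/RR, so OR 0.31 becomes ~3.2)
    ci_near_null: the 95% CI bound closer to 1.0, on the same scale
    """
    if rr > 5:
        return "consider upgrading two levels (very large effect)"
    if rr > 2:
        if ci_near_null < 2:
            return "upgrade with hesitation: CI extends appreciably below 2.0"
        return "consider upgrading one level (large effect)"
    return "no upgrade for magnitude of effect"
```

For the bicycle-helmet example (OR 0.31, 95% CI 0.26 to 0.37), the inverted effect is about 3.2 with the near-null CI bound at about 2.7, so the sketch suggests upgrading one level, consistent with the moderate-certainty rating discussed above.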

14.3 Describing the assessment of the certainty of a body of evidence using the GRADE framework

Review authors should report the grading of the certainty of evidence in the Results section for each outcome for which this has been performed, providing the rationale for downgrading or upgrading the evidence, and referring to the ‘Summary of findings’ table where applicable.

Table 14.3.a provides a framework and examples for how review authors can justify their judgements about the certainty of evidence in each domain. These justifications should also be included in explanatory notes to the ‘Summary of findings’ table (see Section 14.1.6.10 ).

Chapter 15, Section 15.6 , describes in more detail how the overall GRADE assessment across all domains can be used to draw conclusions about the effects of the intervention, as well as providing implications for future research.

Table 14.3.a Framework for describing the certainty of evidence and justifying downgrading or upgrading

  • Risk of bias: Describe the risk of bias based on the criteria used in the risk-of-bias table.
Example: Downgraded because, of 10 randomized trials, five did not blind patients and caretakers.

  • Inconsistency: Describe the degree of inconsistency by outcome using one or more indicators (e.g. the I² statistic and its P value, confidence interval overlap, differences in point estimates, between-study variance).
Example: Not downgraded because the proportion of the variability in effect estimates that is due to true heterogeneity rather than chance is not important (I² = 0%).

  • Indirectness: Describe whether the majority of studies address the PICO: were they similar to the question posed?
Example: Downgraded because the included studies were restricted to patients with advanced cancer.

  • Imprecision: Describe the number of events and the width of the confidence intervals.
Example: The confidence intervals for the effect on mortality are consistent with both appreciable benefit and appreciable harm, so we lowered the certainty.

  • Publication bias: Describe the possible degree of publication bias.
Example 1: The funnel plot of 14 randomized trials indicated that there were several small studies that showed a small positive effect, but small studies that showed no effect or harm may have been unpublished. The certainty of the evidence was lowered.
Example 2: There are only three small positive studies; it appears that studies showing no effect or harm have not been published, and there is also a for-profit interest in the intervention. The certainty of the evidence was lowered.

  • Large effect: Describe the magnitude of the effect and the width of the associated confidence intervals.
Example: Upgraded because the RR is large: 0.3 (95% CI 0.2 to 0.4), with a sufficient number of events to be precise.

  • Dose-response: Describe whether the studies show a clear relation between higher exposure levels and increases in the outcome (e.g. lung cancer).
Example: Upgraded because the dose-response relation shows a relative risk increase of 10% in never smokers, 15% in smokers of 10 pack-years and 20% in smokers of 15 pack-years.

  • Plausible confounding: Describe which opposing plausible biases and confounders may not have been considered.
Example: The estimate of effect is not controlled for the following possible confounders: smoking and degree of education; however, the distribution of these factors in the studies is likely to lead to an under-estimate of the true effect. The certainty of the evidence was increased.
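The inconsistency examples in Table 14.3.a quote the I² statistic. For readers unfamiliar with it, a minimal sketch of how I² is derived from Cochran's Q using inverse-variance weights (illustrative code, not Cochrane software):

```python
def i_squared(estimates, std_errors):
    """Cochran's Q and the I² statistic for a set of study effect
    estimates (e.g. log risk ratios) with their standard errors.

    I² is the proportion of variability in effect estimates due to
    true heterogeneity rather than chance: max(0, (Q - df) / Q) * 100.
    """
    weights = [1.0 / se ** 2 for se in std_errors]
    pooled = sum(w * y for w, y in zip(weights, estimates)) / sum(weights)
    q = sum(w * (y - pooled) ** 2 for w, y in zip(weights, estimates))
    df = len(estimates) - 1
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    return q, i2
```

When the study estimates are essentially identical, Q is near zero and I² is 0%, the "not important" heterogeneity case in the table.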

14.4 Chapter information

Authors: Holger J Schünemann, Julian PT Higgins, Gunn E Vist, Paul Glasziou, Elie A Akl, Nicole Skoetz, Gordon H Guyatt; on behalf of the Cochrane GRADEing Methods Group (formerly Applicability and Recommendations Methods Group) and the Cochrane Statistical Methods Group

Acknowledgements: Andrew D Oxman contributed to earlier versions. Professor Penny Hawe contributed to the text on adverse effects in earlier versions. Jon Deeks provided helpful contributions on an earlier version of this chapter. For details of previous authors and editors of the Handbook , please refer to the Preface.

Funding: This work was in part supported by funding from the Michael G DeGroote Cochrane Canada Centre and the Ontario Ministry of Health.

14.5 References

Alonso-Coello P, Zhou Q, Martinez-Zapata MJ, Mills E, Heels-Ansdell D, Johanson JF, Guyatt G. Meta-analysis of flavonoids for the treatment of haemorrhoids. British Journal of Surgery 2006; 93 : 909-920.

Atkins D, Best D, Briss PA, Eccles M, Falck-Ytter Y, Flottorp S, Guyatt GH, Harbour RT, Haugh MC, Henry D, Hill S, Jaeschke R, Leng G, Liberati A, Magrini N, Mason J, Middleton P, Mrukowicz J, O'Connell D, Oxman AD, Phillips B, Schünemann HJ, Edejer TT, Varonen H, Vist GE, Williams JW, Jr., Zaza S. Grading quality of evidence and strength of recommendations. BMJ 2004; 328 : 1490.

Balshem H, Helfand M, Schünemann HJ, Oxman AD, Kunz R, Brozek J, Vist GE, Falck-Ytter Y, Meerpohl J, Norris S, Guyatt GH. GRADE guidelines: 3. Rating the quality of evidence. Journal of Clinical Epidemiology 2011; 64 : 401-406.

Bhandari M, Busse JW, Jackowski D, Montori VM, Schünemann H, Sprague S, Mears D, Schemitsch EH, Heels-Ansdell D, Devereaux PJ. Association between industry funding and statistically significant pro-industry findings in medical and surgical randomized trials. Canadian Medical Association Journal 2004; 170 : 477-480.

Brophy JM, Joseph L, Rouleau JL. Beta-blockers in congestive heart failure. A Bayesian meta-analysis. Annals of Internal Medicine 2001; 134 : 550-560.

Carrasco-Labra A, Brignardello-Petersen R, Santesso N, Neumann I, Mustafa RA, Mbuagbaw L, Etxeandia Ikobaltzeta I, De Stio C, McCullagh LJ, Alonso-Coello P, Meerpohl JJ, Vandvik PO, Brozek JL, Akl EA, Bossuyt P, Churchill R, Glenton C, Rosenbaum S, Tugwell P, Welch V, Garner P, Guyatt G, Schünemann HJ. Improving GRADE evidence tables part 1: a randomized trial shows improved understanding of content in summary of findings tables with a new format. Journal of Clinical Epidemiology 2016; 74 : 7-18.

Deeks JJ, Altman DG. Effect measures for meta-analysis of trials with binary outcomes. In: Egger M, Davey Smith G, Altman DG, editors. Systematic Reviews in Health Care: Meta-analysis in Context . 2nd ed. London (UK): BMJ Publication Group; 2001. p. 313-335.

Devereaux PJ, Choi PT, Lacchetti C, Weaver B, Schünemann HJ, Haines T, Lavis JN, Grant BJ, Haslam DR, Bhandari M, Sullivan T, Cook DJ, Walter SD, Meade M, Khan H, Bhatnagar N, Guyatt GH. A systematic review and meta-analysis of studies comparing mortality rates of private for-profit and private not-for-profit hospitals. Canadian Medical Association Journal 2002; 166 : 1399-1406.

Engels EA, Schmid CH, Terrin N, Olkin I, Lau J. Heterogeneity and statistical significance in meta-analysis: an empirical study of 125 meta-analyses. Statistics in Medicine 2000; 19 : 1707-1728.

Furukawa TA, Guyatt GH, Griffith LE. Can we individualize the 'number needed to treat'? An empirical study of summary effect measures in meta-analyses. International Journal of Epidemiology 2002; 31 : 72-76.

Gibson JN, Waddell G. Surgical interventions for lumbar disc prolapse: updated Cochrane Review. Spine 2007; 32 : 1735-1747.

Guyatt G, Oxman A, Vist G, Kunz R, Falck-Ytter Y, Alonso-Coello P, Schünemann H. GRADE: an emerging consensus on rating quality of evidence and strength of recommendations. BMJ 2008; 336 : 3.

Guyatt G, Oxman AD, Akl EA, Kunz R, Vist G, Brozek J, Norris S, Falck-Ytter Y, Glasziou P, DeBeer H, Jaeschke R, Rind D, Meerpohl J, Dahm P, Schünemann HJ. GRADE guidelines: 1. Introduction-GRADE evidence profiles and summary of findings tables. Journal of Clinical Epidemiology 2011a; 64 : 383-394.

Guyatt GH, Oxman AD, Kunz R, Brozek J, Alonso-Coello P, Rind D, Devereaux PJ, Montori VM, Freyschuss B, Vist G, Jaeschke R, Williams JW, Jr., Murad MH, Sinclair D, Falck-Ytter Y, Meerpohl J, Whittington C, Thorlund K, Andrews J, Schünemann HJ. GRADE guidelines 6. Rating the quality of evidence--imprecision. Journal of Clinical Epidemiology 2011b; 64 : 1283-1293.

Iorio A, Spencer FA, Falavigna M, Alba C, Lang E, Burnand B, McGinn T, Hayden J, Williams K, Shea B, Wolff R, Kujpers T, Perel P, Vandvik PO, Glasziou P, Schünemann H, Guyatt G. Use of GRADE for assessment of evidence about prognosis: rating confidence in estimates of event rates in broad categories of patients. BMJ 2015; 350 : h870.

Langendam M, Carrasco-Labra A, Santesso N, Mustafa RA, Brignardello-Petersen R, Ventresca M, Heus P, Lasserson T, Moustgaard R, Brozek J, Schünemann HJ. Improving GRADE evidence tables part 2: a systematic survey of explanatory notes shows more guidance is needed. Journal of Clinical Epidemiology 2016; 74 : 19-27.

Levine MN, Raskob G, Landefeld S, Kearon C, Schulman S. Hemorrhagic complications of anticoagulant treatment: the Seventh ACCP Conference on Antithrombotic and Thrombolytic Therapy. Chest 2004; 126 : 287S-310S.

Orrapin S, Rerkasem K. Carotid endarterectomy for symptomatic carotid stenosis. Cochrane Database of Systematic Reviews 2017; 6 : CD001081.

Salpeter S, Greyber E, Pasternak G, Salpeter E. Risk of fatal and nonfatal lactic acidosis with metformin use in type 2 diabetes mellitus. Cochrane Database of Systematic Reviews 2007; 4 : CD002967.

Santesso N, Carrasco-Labra A, Langendam M, Brignardello-Petersen R, Mustafa RA, Heus P, Lasserson T, Opiyo N, Kunnamo I, Sinclair D, Garner P, Treweek S, Tovey D, Akl EA, Tugwell P, Brozek JL, Guyatt G, Schünemann HJ. Improving GRADE evidence tables part 3: detailed guidance for explanatory footnotes supports creating and understanding GRADE certainty in the evidence judgments. Journal of Clinical Epidemiology 2016; 74 : 28-39.

Schünemann HJ, Best D, Vist G, Oxman AD, Group GW. Letters, numbers, symbols and words: how to communicate grades of evidence and recommendations. Canadian Medical Association Journal 2003; 169 : 677-680.

Schünemann HJ, Jaeschke R, Cook DJ, Bria WF, El-Solh AA, Ernst A, Fahy BF, Gould MK, Horan KL, Krishnan JA, Manthous CA, Maurer JR, McNicholas WT, Oxman AD, Rubenfeld G, Turino GM, Guyatt G. An official ATS statement: grading the quality of evidence and strength of recommendations in ATS guidelines and recommendations. American Journal of Respiratory and Critical Care Medicine 2006; 174 : 605-614.

Schünemann HJ, Oxman AD, Brozek J, Glasziou P, Jaeschke R, Vist GE, Williams JW, Jr., Kunz R, Craig J, Montori VM, Bossuyt P, Guyatt GH. Grading quality of evidence and strength of recommendations for diagnostic tests and strategies. BMJ 2008a; 336 : 1106-1110.

Schünemann HJ, Oxman AD, Brozek J, Glasziou P, Bossuyt P, Chang S, Muti P, Jaeschke R, Guyatt GH. GRADE: assessing the quality of evidence for diagnostic recommendations. ACP Journal Club 2008b; 149 : 2.

Schünemann HJ, Mustafa R, Brozek J. [Diagnostic accuracy and linked evidence--testing the chain]. Zeitschrift für Evidenz, Fortbildung und Qualität im Gesundheitswesen 2012; 106 : 153-160.

Schünemann HJ, Tugwell P, Reeves BC, Akl EA, Santesso N, Spencer FA, Shea B, Wells G, Helfand M. Non-randomized studies as a source of complementary, sequential or replacement evidence for randomized controlled trials in systematic reviews on the effects of interventions. Research Synthesis Methods 2013; 4 : 49-62.

Schünemann HJ. Interpreting GRADE's levels of certainty or quality of the evidence: GRADE for statisticians, considering review information size or less emphasis on imprecision? Journal of Clinical Epidemiology 2016; 75 : 6-15.

Schünemann HJ, Cuello C, Akl EA, Mustafa RA, Meerpohl JJ, Thayer K, Morgan RL, Gartlehner G, Kunz R, Katikireddi SV, Sterne J, Higgins JPT, Guyatt G, Group GW. GRADE guidelines: 18. How ROBINS-I and other tools to assess risk of bias in nonrandomized studies should be used to rate the certainty of a body of evidence. Journal of Clinical Epidemiology 2018.

Spencer-Bonilla G, Quinones AR, Montori VM, International Minimally Disruptive Medicine W. Assessing the Burden of Treatment. Journal of General Internal Medicine 2017; 32 : 1141-1145.

Spencer FA, Iorio A, You J, Murad MH, Schünemann HJ, Vandvik PO, Crowther MA, Pottie K, Lang ES, Meerpohl JJ, Falck-Ytter Y, Alonso-Coello P, Guyatt GH. Uncertainties in baseline risk estimates and confidence in treatment effects. BMJ 2012; 345 : e7401.

Sterne JAC, Hernán MA, Reeves BC, Savović J, Berkman ND, Viswanathan M, Henry D, Altman DG, Ansari MT, Boutron I, Carpenter JR, Chan AW, Churchill R, Deeks JJ, Hróbjartsson A, Kirkham J, Jüni P, Loke YK, Pigott TD, Ramsay CR, Regidor D, Rothstein HR, Sandhu L, Santaguida PL, Schünemann HJ, Shea B, Shrier I, Tugwell P, Turner L, Valentine JC, Waddington H, Waters E, Wells GA, Whiting PF, Higgins JPT. ROBINS-I: a tool for assessing risk of bias in non-randomised studies of interventions. BMJ 2016; 355 : i4919.

Thompson DC, Rivara FP, Thompson R. Helmets for preventing head and facial injuries in bicyclists. Cochrane Database of Systematic Reviews 2000; 2 : CD001855.

Tierney JF, Stewart LA, Ghersi D, Burdett S, Sydes MR. Practical methods for incorporating summary time-to-event data into meta-analysis. Trials 2007; 8 .

van Dalen EC, Tierney JF, Kremer LCM. Tips and tricks for understanding and using SR results. No. 7: time‐to‐event data. Evidence-Based Child Health 2007; 2 : 1089-1090.


  • Volume 24, Issue 2
  • Five tips for developing useful literature summary tables for writing review articles

  • http://orcid.org/0000-0003-0157-5319 Ahtisham Younas 1 , 2 ,
  • http://orcid.org/0000-0002-7839-8130 Parveen Ali 3 , 4
  • 1 Memorial University of Newfoundland , St John's , Newfoundland , Canada
  • 2 Swat College of Nursing , Pakistan
  • 3 School of Nursing and Midwifery , University of Sheffield , Sheffield , South Yorkshire , UK
  • 4 Sheffield University Interpersonal Violence Research Group , Sheffield University , Sheffield , UK
  • Correspondence to Ahtisham Younas, Memorial University of Newfoundland, St John's, NL A1C 5C4, Canada; ay6133{at}mun.ca

https://doi.org/10.1136/ebnurs-2021-103417


Introduction

Literature reviews offer a critical synthesis of empirical and theoretical literature to assess the strength of evidence, develop guidelines for practice and policymaking, and identify areas for future research. 1 A literature review is often essential and usually the first task in any research endeavour, particularly in masters or doctoral level education. For effective data extraction and rigorous synthesis in reviews, the use of literature summary tables is of utmost importance. A literature summary table provides a synopsis of an included article: it succinctly presents its purpose, methods, findings and other relevant information pertinent to the review. The aim of developing these literature summary tables is to give the reader the key information at a glance. Since there are multiple types of reviews (eg, systematic, integrative, scoping, critical and mixed methods) with distinct purposes and techniques, 2 there can be various approaches to developing literature summary tables, making it a complex task, especially for novice researchers or reviewers. Here, we offer five tips for authors of review articles, relevant to all types of reviews, for creating useful and relevant literature summary tables. We also provide examples from our published reviews to illustrate how useful literature summary tables can be developed and what sort of information should be provided.

Tip 1: provide detailed information about frameworks and methods


Figure 1. Tabular literature summaries from a scoping review. Source: Rasheed et al . 3

The provision of information about conceptual and theoretical frameworks and methods is useful for several reasons. First, in quantitative reviews (reviews synthesising the results of quantitative studies) and mixed reviews (reviews synthesising the results of both qualitative and quantitative studies to address a mixed review question), it allows readers to assess the congruence of the core findings and methods with the adopted framework and tested assumptions. In qualitative reviews (reviews synthesising results of qualitative studies), this information helps readers recognise the underlying philosophical and paradigmatic stance of the authors of the included articles. For example, imagine the authors of an article included in a review used phenomenological inquiry for their research. In that case, the review authors and the readers of the review need to know what kind of philosophical stance (transcendental or hermeneutic) guided the inquiry. Review authors should, therefore, include the philosophical stance in their literature summary for that article. Second, information about frameworks and methods enables review authors and readers to judge the quality of the research, which allows for discerning the strengths and limitations of the article. For example, suppose the authors of an included article intended to develop a new scale and test its psychometric properties, and to achieve this aim they used a convenience sample of 150 participants and performed exploratory (EFA) and confirmatory factor analysis (CFA) on the same sample. Such an approach would indicate a flawed methodology, because EFA and CFA should not be conducted on the same sample. The review authors must include this information in their summary table; omitting it could lead to the inclusion of a flawed article in the review, thereby jeopardising the review’s rigour.

Tip 2: include strengths and limitations for each article

Critical appraisal of individual articles included in a review is crucial for increasing the rigour of the review. Despite using various templates for critical appraisal, authors often do not provide detailed information about each reviewed article’s strengths and limitations. Merely noting the quality score based on standardised critical appraisal templates is not adequate, because readers should be able to identify the reasons for assigning a weak or moderate rating. Many recent critical appraisal checklists (eg, the Mixed Methods Appraisal Tool) discourage review authors from assigning a quality score and instead recommend noting the main strengths and limitations of included studies. It is also vital to report the methodological and conceptual limitations and strengths of the articles included in the review, because not all review articles include empirical research papers; rather, some reviews synthesise the theoretical aspects of articles. Providing information about conceptual limitations is also important for readers to judge the quality of the foundations of the research. For example, if you included a mixed-methods study in the review, reporting the methodological and conceptual limitations concerning ‘integration’ is critical for evaluating the study’s strength. Suppose the authors only collected qualitative and quantitative data and did not state the intent and timing of integration. In that case, the study is weak on this count: integration occurred only at the level of data collection, and may not have occurred at the analysis, interpretation and reporting levels.

Tip 3: write conceptual contribution of each reviewed article

While reading and evaluating review papers, we have observed that many review authors only provide core results of the article included in a review and do not explain the conceptual contribution offered by the included article. We refer to conceptual contribution as a description of how the article’s key results contribute towards the development of potential codes, themes or subthemes, or emerging patterns that are reported as the review findings. For example, the authors of a review article noted that one of the research articles included in their review demonstrated the usefulness of case studies and reflective logs as strategies for fostering compassion in nursing students. The conceptual contribution of this research article could be that experiential learning is one way to teach compassion to nursing students, as supported by case studies and reflective logs. This conceptual contribution of the article should be mentioned in the literature summary table. Delineating each reviewed article’s conceptual contribution is particularly beneficial in qualitative reviews, mixed-methods reviews, and critical reviews that often focus on developing models and describing or explaining various phenomena. Figure 2 offers an example of a literature summary table. 4

Figure 2. Tabular literature summaries from a critical review. Source: Younas and Maddigan. 4

Tip 4: compose potential themes from each article during summary writing

While developing literature summary tables, many authors use themes or subthemes reported in the given articles as the key results of their own review. Such an approach prevents the review authors from understanding the article’s conceptual contribution, developing a rigorous synthesis and drawing reasonable interpretations of results from an individual article. Ultimately, it affects the generation of novel review findings. For example, one of the articles about women’s healthcare-seeking behaviours in developing countries reported a theme ‘social-cultural determinants of health as precursors of delays’. Instead of using this theme as one of the review findings, the reviewers should read and interpret beyond the given description in an article, comparing and contrasting the themes and findings of one article with those of another to find similarities and differences, and to understand and explain the bigger picture for their readers. Therefore, while developing literature summary tables, think twice before using predeveloped themes. Including your own themes in the summary tables (see figure 1 ) demonstrates to the readers that a robust method of data extraction and synthesis has been followed.

Tip 5: create your personalised template for literature summaries

Often templates are available for data extraction and the development of literature summary tables. The available templates may be in the form of a table, chart or a structured framework that extracts some essential information about every article. The commonly used fields include authors, purpose, methods, key results and quality scores. While extracting all relevant information is important, such templates should be tailored to meet the needs of the individual review. For example, for a review about the effectiveness of healthcare interventions, a literature summary table must include information about the intervention: its type, content, timing, duration, setting, effectiveness, negative consequences, and receivers’ and implementers’ experiences of its use. Similarly, literature summary tables for articles included in a meta-synthesis must include information about the participants’ characteristics, research context and conceptual contribution of each reviewed article, so as to help the reader make an informed decision about the usefulness of the individual article to the review as a whole.
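As one hypothetical illustration of a tailored template for an intervention review, the fields could be collected in a simple structure like the one below. All field names here are our own suggestions, not a prescribed standard:

```python
# Hypothetical extraction template for an intervention review (Tip 5).
# Field names are illustrative suggestions, not a prescribed standard.
INTERVENTION_REVIEW_TEMPLATE = {
    "authors_year": "",
    "purpose": "",
    "framework_and_methods": "",        # Tip 1
    "intervention": {
        "type": "", "content": "", "timing": "", "duration": "",
        "setting": "", "effectiveness": "", "negative_consequences": "",
        "receiver_and_implementer_experiences": "",
    },
    "strengths": [],                    # Tip 2
    "limitations": [],                  # Tip 2
    "conceptual_contribution": "",      # Tip 3
    "potential_themes": [],             # Tip 4
}
```

One row of the summary table would then be a filled-in copy of this structure, with meta-synthesis reviews swapping the intervention block for participant characteristics and research context.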

In conclusion, narrative or systematic reviews are almost always conducted as part of any educational project (thesis or dissertation) or academic or clinical research. Literature reviews are the foundation of research on a given topic. Robust and high-quality reviews play an instrumental role in guiding research, practice and policymaking. However, the quality of reviews is also contingent on rigorous data extraction and synthesis, which require developing literature summaries. We have outlined five tips that can enhance the quality of the data extraction and synthesis process through the development of useful literature summaries.


Twitter @Ahtisham04, @parveenazamali

Funding The authors have not declared a specific grant for this research from any funding agency in the public, commercial or not-for-profit sectors.

Competing interests None declared.

Patient consent for publication Not required.

Provenance and peer review Not commissioned; externally peer reviewed.


Nursing - Systematic Reviews: Levels of Evidence



"How would I use the 6S Model while taking care of a patient?" The 6S Model is designed to work from the top down, starting with Systems, also referred to as computerized decision support systems (CDSSs). DiCenso et al. (2009) describe that "an evidence-based clinical information system integrates and concisely summarizes all relevant and important research evidence about a clinical problem, is updated as new research evidence becomes available, and automatically links (through an electronic medical record) a specific patient’s circumstances to the relevant information". Systematic reviews feed into this top level of evidence.

Polit–Beck Evidence Hierarchy/Levels of Evidence Scale for Therapy Questions

"Figure 2.2 [in context of book] shows our eight-level evidence hierarchy for Therapy/intervention questions. This hierarchy ranks sources of evidence with respect to the readiness of an intervention to be put to use in practice" (Polit & Beck, 2021, p. 28). Levels are ranked by risk of bias: level one is the least biased, level eight the most. There are several types of levels-of-evidence scales designed for answering different questions. "An evidence hierarchy for Prognosis questions, for example, is different from the hierarchy for Therapy questions" (p. 29).

Advantages of Levels of Evidence Scales

"Through controls imposed by manipulation, comparison, and randomization, alternative explanations can be discredited. It is because of this strength that meta-analyses of RCTs, which integrate evidence from multiple experiments, are at the pinnacle of the evidence hierarchies for Therapy questions" (p. 188).

"Tip: Traditional evidence hierarchies or level of evidence scales (e.g., Figure 2.2), rank evidence sources almost exclusively based on the risk of internal validity threats" (p. 217).

Systematic reviews can show researchers what prior evidence has already established. This can clarify the established efficacy of a treatment and so avoid unnecessary, and thus unethical, further research. Greenhalgh (2019) illustrates this by citing Dean Fergusson and colleagues’ (2005) systematic review on a clinical surgical topic (p. 128).

Limits of Levels of Evidence Scales

Regarding the importance of real-world clinical practice settings, and the conflicting tradeoffs between internal and external validity, Polit and Beck (2021) write, "the first (and most prevalent) approach is to emphasize one and sacrifice another. Most often, it is external validity that is sacrificed. For example, external validity is not even considered in ranking evidence in level of evidence scales" (p. 221). ... From an EBP perspective, it is important to remember that drawing inferences about causal relationships relies not only on how high up on the evidence hierarchy a study is (Figure 2.2), but also, for any given level of the hierarchy, how successful the researcher was in managing study validity and balancing competing validity demands" (p. 222).

Polit and Beck, citing Levin (2014), note that an evidence hierarchy "is not meant to provide a quality rating for evidence retrieved in the search for an answer" (p. 6). The Oxford Centre for Evidence-Based Medicine concurs that evidence scales are "NOT intended to provide you with a definitive judgment about the quality of the evidence. There will inevitably be cases where 'lower-level' evidence...will provide stronger evidence than a 'higher level' study" (Howick et al., 2011, p. 2; cited in Polit & Beck, 2021, p. 30).

Level of evidence (e.g., Figure 2.2) + Quality of evidence = Strength of evidence.

The 6S Model of Levels of Evidence

"The 6S hierarchy does not imply a gradient of evidence in terms of quality , but rather in terms of ease in retrieving relevant evidence to address a clinical question. At all levels, the evidence should be assessed for quality and relevance" (Polit & Beck, 2021, p. 24, Tip box).

The 6S Pyramid proposes a structure of quantitative evidence where articles that include pre-appraised and pre-synthesized studies are located at the top of the hierarchy (McMaster U., n.d.).

It can help to consider the level of evidence that a document represents; for example, a scientific article that summarizes and analyzes many similar articles may provide more insight than the conclusion of a single research article. This is not to say that summaries cannot be flawed, nor that rare case studies should be ignored. The aim of health research is the well-being of all people; therefore it is important to use current evidence in light of patient preferences, negotiated with clinical expertise.

Other Gradings in Levels of Evidence

While it is accepted that the strongest evidence is derived from meta-analyses, various evidence grading systems exist. For example, the Johns Hopkins Nursing Evidence-Based Practice model ranks evidence from level I to level V, as follows (Seben et al., 2010):

  • Level I: Meta-analysis of randomized clinical trials (RCTs); experimental studies; RCTs
  • Level II: Quasi-experimental studies
  • Level III: Non-experimental or qualitative studies
  • Level IV: Opinions of nationally recognized experts based on research evidence or an expert consensus panel
  • Level V: Opinions of individual experts based on non-research evidence (e.g., case studies, literature reviews, organizational experience, and personal experience)

The American Association of Critical-Care Nurses (AACN) evidence level system, updated in 2009, ranks evidence as follows (Armola et al., 2009):

  • Level A: Meta-analysis of multiple controlled studies or meta-synthesis of qualitative studies with results that consistently support a specific action, intervention, or treatment
  • Level B: Well-designed, controlled randomized or non-randomized studies with results that consistently support a specific action, intervention, or treatment
  • Level C: Qualitative, descriptive, or correlational studies, integrative or systematic reviews, or RCTs with inconsistent results
  • Level D: Peer-reviewed professional organizational standards, with clinical studies to support recommendations
  • Level E: Theory-based evidence from expert opinion or multiple case reports
  • Level M: Manufacturers’ recommendations (2017)

EBM Pyramid and EBM Page Generator

Unfiltered resources are primary sources describing original research. Randomized controlled trials, cohort studies, case-control studies, and case series/reports are considered unfiltered information.

Filtered resources are secondary sources that summarize and analyze the available evidence. They evaluate the quality of individual studies and often provide recommendations for practice. Systematic reviews, critically-appraised topics, and critically-appraised individual articles are considered filtered information.

Armola, R. R., Bourgault, A. M., Halm, M. A., Board, R. M., Bucher, L., Harrington, L., ... Medina, J. (2009). AACN levels of evidence: What's new? Critical Care Nurse, 29(4), 70-73. doi:10.4037/ccn2009969

DiCenso, A., Bayley, L., & Haynes, R. B. (2009). Accessing pre-appraised evidence: Fine-tuning the 5S model into a 6S model. BMJ Evidence-Based Nursing, 12(4). https://ebn.bmj.com/content/12/4/99.2.short

Fergusson, D., Glass, K. C., Hutton, B., & Shapiro, S. (2005). Randomized controlled trials of Aprotinin in cardiac surgery: Could clinical equipoise have stopped the bleeding? Clinical Trials, 2(3), 218-232.

Glover, J., Izzo, D., Odato, K., & Wang, L. (2008). Evidence-based mental health resources. EBM Pyramid and EBM Page Generator. Retrieved April 28, 2020, from https://web.archive.org/web/20200219181415/http://www.dartmouth.edu/~biomed/resources.htmld/guides/ebm_psych_resources.html (Note: the document was removed from its host; the original webpage, http://www.dartmouth.edu/~biomed/resources.htmld/guides/ebm_psych_resources.html, was retrieved via the Internet Archive WayBack Machine on 2/10/21.)

Greenhalgh, T. (2019). How to read a paper: The basics of evidence-based medicine and healthcare (6th ed.). Wiley Blackwell.

Haynes, R. B. (2001). Of studies, syntheses, synopses, and systems: The "4S" evolution of services for finding current best evidence. BMJ Evidence-Based Medicine, 6(2), 36-38.

Haynes, R. B. (2006). Of studies, syntheses, synopses, summaries, and systems: The "5S" evolution of information services for evidence-based healthcare decisions. BMJ Evidence-Based Medicine, 11(6), 162-164.

McMaster University (n.d.). 6S search pyramid tool. https://www.nccmt.ca/capacity-development/6s-search-pyramid

Polit, D., & Beck, C. (2019). Nursing research: Generating and assessing evidence for nursing practice. Wolters Kluwer Health.

Schub, E., Walsh, K., & Pravikoff, D. (Eds.). (2017). Evidence-based nursing practice: Implementing [Skill Set]. Nursing Reference Center Plus.

Seben, S., March, K. S., & Pugh, L. C. (2010). Evidence-based practice: The forum approach. American Nurse Today, 5(11), 32-34.

  • Systematic Review from the Encyclopedia of Nursing Research by Cheryl Holly Systematic reviews provide reliable evidential summaries of past research for the busy practitioner. By pooling results from multiple studies, findings are based on multiple populations, conditions, and circumstances. The pooled results of many small and large studies have more precise, powerful, and convincing conclusions (Holly, Salmond, & Saimbert, 2016) [ references in article ]. This scholarly synthesis of research findings and other evidence forms the foundation for evidence-based practice allowing the practitioner to make up-to-date decisions.

Standards & Guides

  • Cochrane Handbook for Systematic Reviews of Interventions The Cochrane Handbook for Systematic Reviews of Interventions is the official guide that describes in detail the process of preparing and maintaining Cochrane systematic reviews on the effects of healthcare interventions.
  • Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) PRISMA is an evidence-based minimum set of items for reporting in systematic reviews and meta-analyses. PRISMA focuses on the reporting of reviews evaluating randomized trials, but can also be used as a basis for reporting systematic reviews of other types of research, particularly evaluations of interventions.
  • Systematic Reviews by The Centre for Reviews and Dissemination "The guidance has been written for those with an understanding of health research but who are new to systematic reviews; those with some experience but who want to learn more; and for commissioners. We hope that experienced systematic reviewers will also find this guidance of value; for example when planning a review in an area that is unfamiliar or with an expanded scope. This guidance might also be useful to those who need to evaluate the quality of systematic reviews, including, for example, anyone with responsibility for implementing systematic review findings" (CRD, 2009, p. vi, "Who should use this guide")

  • Carrying out systematic literature reviews: An introduction by Alan Davies Systematic reviews provide a synthesis of evidence for a specific topic of interest, summarising the results of multiple studies to aid in clinical decisions and resource allocation. They remain among the best forms of evidence, and reduce the bias inherent in other methods. A solid understanding of the systematic review process can be of benefit to nurses that carry out such reviews, and for those who make decisions based on them. An overview of the main steps involved in carrying out a systematic review is presented, including some of the common tools and frameworks utilised in this area. This should provide a good starting point for those that are considering embarking on such work, and to aid readers of such reviews in their understanding of the main review components, in order to appraise the quality of a review that may be used to inform subsequent clinical decision making (Davies, 2019, Abstract)
  • Papers that summarize other papers (systematic reviews and meta-analyses) by Trisha Greenhalgh ... a systematic review is an overview of primary studies that: contains a statement of objectives, sources and methods; has been conducted in a way that is explicit, transparent and reproducible (Figure 9.1) [ Table found in book chapter ]. The most enduring and reliable systematic reviews, notably those undertaken by the Cochrane Collaboration (discussed later in this chapter), are regularly updated to incorporate new evidence (Greenhalgh, 2020, p. 117, Chapter 9).
  • A PRISMA assessment of the reporting quality of systematic reviews of nursing published in the Cochrane Library and paper-based journals by Juxia Zhang et al. The Preferred Reporting Items for Systematic Reviews and Meta-Analysis (PRISMA) was released as a standard of reporting systematic reviewers (SRs). However, not all SRs adhere completely to this standard. This study aimed to evaluate the reporting quality of SRs published in the Cochrane Library and paper-based journals (Zhang et al., 2019, Abstract).

Cochrane [Username]. (2016, Jan 27). What are systematic reviews? YouTube. https://www.youtube.com/watch?v=egJlW4vkb1Y

Davies, A. (2019). Carrying out systematic literature reviews: An introduction. British Journal of Nursing, 28(15), 1008-1014. https://doi-org.ezproxy.simmons.edu/10.12968/bjon.2019.28.15.1008

Greenhalgh, T. (2019). Papers that summarize other papers (systematic reviews and meta-analyses). In How to read a paper: The basics of evidence-based medicine and healthcare (6th ed., pp. 117-136). Wiley Blackwell.

Holly, C. (2017). Systematic review. In J. Fitzpatrick (Ed.), Encyclopedia of nursing research (4th ed.). Springer Publishing Company. Credo Reference.

Zhang, J., Han, L., Shields, L., Tian, J., & Wang, J. (2019). A PRISMA assessment of the reporting quality of systematic reviews of nursing published in the Cochrane Library and paper-based journals. Medicine, 98(49), e18099. https://doi.org/10.1097/MD.0000000000018099

  • Last Updated: Nov 3, 2023 1:19 PM
  • URL: https://simmons.libguides.com/systematic-reviews

LMIC indicates low- and middle-income country; SR, systematic review.

a This review included distinct conclusions about separate conditions and comparators, and so it appears in this map more than once.

eAppendix 1. Search Strategies

eAppendix 2. Excluded Studies

eAppendix 3. Evidence Table

eAppendix 4. Conditions in Previously Published Map in 2018 and Current Map

eReferences.

Data Sharing Statement


Mak S, Allen J, Begashaw M, et al. Use of Massage Therapy for Pain, 2018-2023: A Systematic Review. JAMA Netw Open. 2024;7(7):e2422259. doi:10.1001/jamanetworkopen.2024.22259


Use of Massage Therapy for Pain, 2018-2023: A Systematic Review

  • 1 Veterans Health Administration, Greater Los Angeles Healthcare System, Los Angeles, California
  • 2 UCLA Fielding School of Public Health, University of California, Los Angeles
  • 3 RAND Corporation, Santa Monica, California

Question   What is the certainty or quality of evidence in recent systematic reviews for use of massage therapy for painful adult health conditions?

Findings   This systematic review identified 129 systematic reviews in a search of the literature published since 2018; of these, 41 assessed the certainty or quality of evidence of their conclusions. Overall, 17 systematic reviews regarding 13 health conditions were mapped, and most reviews concluded that the certainty of evidence was low or very low.

Meaning   This study found that despite massage therapy having been the subject of hundreds of randomized clinical trials and dozens of systematic reviews about adult health conditions since 2018, there were few conclusions that had greater than low certainty of evidence.

Importance   Massage therapy is a popular treatment that has been advocated for dozens of painful adult health conditions and has a large evidence base.

Objective   To map systematic reviews, conclusions, and certainty or quality of evidence for outcomes of massage therapy for painful adult health conditions.

Evidence Review   In this systematic review, a computerized search was conducted of PubMed, the Allied and Complementary Medicine Database, the Cumulated Index to Nursing and Allied Health Literature, the Cochrane Database of Systematic Reviews, and Web of Science from 2018 to 2023. Included studies were systematic reviews of massage therapy for pain in adult health conditions that formally rated the certainty, quality, or strength of evidence for conclusions. Studies of sports massage therapy, osteopathy, dry cupping or dry needling, and internal massage therapy (eg, for pelvic floor pain) were ineligible, as were self-administered massage therapy techniques, such as foam rolling. Reviews were categorized as those with at least 1 conclusion rated as high-certainty evidence, at least 1 conclusion rated as moderate-certainty evidence, and all conclusions rated as low- or very low–certainty evidence; a full list of conclusions and certainty of evidence was collected.

Findings   A total of 129 systematic reviews of massage therapy for painful adult health conditions were found; of these, 41 reviews used a formal method to rate the certainty or quality of evidence of their conclusions, and 17 reviews covering 13 health conditions were mapped. Across these reviews, no conclusions were rated as high-certainty evidence. Seven conclusions were rated as moderate-certainty evidence; all remaining conclusions were rated as low- or very low–certainty evidence. All conclusions rated as moderate certainty were that massage therapy had a beneficial association with pain.

Conclusions and Relevance   This study found that despite a large number of randomized clinical trials, systematic reviews of massage therapy for painful adult health conditions rated a minority of conclusions as moderate-certainty evidence and that conclusions with moderate- or high-certainty evidence that massage therapy was superior to other active therapies were rare.

Massage therapy is a popular and widely accepted complementary and integrative health modality for individuals seeking relief from pain. 1 This therapy is the practice of manual assessment and manipulation of the superficial soft tissues of skin, muscle, tendon, ligament, and fascia and the structures that lie within the superficial tissues for therapeutic purpose. 2 Individuals may seek massage therapy to address pain where conventional treatments may not always provide complete relief or may come with potential adverse effects. Massage therapy encompasses a range of techniques, styles, and durations and is intended to be delivered by uniquely trained and credentialed therapists. 3 Original research studies have reported on massage therapy delivered by a wide variety of health care professionals, such as physical therapists, physiotherapists, and nurses. 4 , 5 Despite massage therapy’s popularity and long history in practice, evidence of beneficial outcomes associated with massage therapy remains limited.

The Department of Veterans Affairs (VA) previously produced an evidence map of massage therapy for pain, which included systematic reviews published through 2018. 6 An evidence map is a form of systematic review that assesses a broad field to identify the state of the evidence, gaps in knowledge, and future research needs and that presents results in a user-friendly format, often a visual figure or graph. 7 To categorize this evidence base for use in decision-making by policymakers and practitioners, VA policymakers requested a new evidence map of reviews published since 2018 to answer the question “What is the certainty of evidence in systematic reviews of massage therapy for pain?”

This systematic review is an extension of a study commissioned by the VA. While not a full systematic review, this study nevertheless reports methods and results using the Preferred Reporting Items for Systematic Reviews and Meta-analyses ( PRISMA ) reporting guideline where applicable and filed the a priori protocol with the VA Evidence Synthesis Program Coordinating Center. Requirements for review and informed consent were waived because the study was designated as not human participants research.

Literature searches were based on searches used for the evidence map of massage therapy completed in 2018. 8 We searched 5 databases for relevant records published from July 2018 to April 2023 using the search terms “massage,” “acupressure,” “shiatsu,” “myofascial release therapy,” “systematic*,” “metaanaly*,” and similar terms. The databases were PubMed, the Allied and Complementary Medicine Database, the Cumulated Index to Nursing and Allied Health Literature, the Cochrane Database of Systematic Reviews, and Web of Science. See eAppendix 1 in Supplement 1 for full search strategies.

Each title was screened independently by 2 authors for relevance (S.M., J.A., and P.G.S.). Abstracts were then reviewed in duplicate, with any discrepancies resolved by group discussion. To be included, abstracts or titles needed to be about efficacy or effectiveness of massage therapy for a painful adult health condition and be a systematic review with more than 1 study about massage therapy. A systematic review was defined as a review that had a documented systematic method for identifying and critically appraising evidence. In general, any therapist-delivered modality described as massage therapy by review authors was considered eligible (eg, tuina, acupressure, auricular acupressure, reflexology, and myofascial release). Sports massage therapy, osteopathy, dry cupping or dry needling, and internal massage therapy (eg, for pelvic floor pain) were ineligible, as were self-administered massage therapy techniques, like foam rolling. Reviews had to be about a painful condition for adults, and we excluded publications in low- and middle-income countries because of differences in resources for usual care or other active treatments for included conditions. Publications were required to compare massage therapy with sham or placebo massage, usual care, or other active therapies. Systematic reviews that covered other interventions were eligible if results for massage therapy were reported separately.

We next restricted eligibility to reviews that used formal methods to assess the certainty (sometimes called strength or quality) of the evidence for conclusions. In general, this meant using Grading of Recommendations, Assessment, Development, and Evaluations (GRADE). 9 However, other formal methods were also included, such as the approach used by the US Agency for Healthcare Research and Quality (AHRQ) Evidence-based Practice Center (EPC) program. To be included, a review had to state or cite the method used and report the certainty (or strength or quality) of evidence for each conclusion. After we applied this restriction, most health conditions had only 1 systematic review meeting the eligibility criteria, and we used this review for the map. Among conditions for which we identified more than 1 review meeting the eligibility criteria, we first assessed whether reviews differed in some other feature used to classify reviews on our map (eg, different comparators or type of massage therapy), which we would label with the appropriate designation (such as vs usual care or reflexology ). If there were multiple reviews about the same condition and they did not differ in some other feature, we selected the systematic review we judged as being most informative for readers. In general, this was the most recent review or the review with the greatest number of included studies.

Data on study condition, number of articles in a review, intervention characteristics, comparators, conclusions, and certainty, quality, or strength of evidence were extracted by 1 reviewer and then verified by a second reviewer (S.M., J.A., and P.G.S.). Our evidence mapping process produced a visual depiction of the evidence for massage therapy, as well as an accompanying narrative with an ancillary figure and table.

The visual depiction or evidence map uses a bubble plot format to display information on 4 dimensions: bubble size, bubble label, x-axis, and y-axis. This allowed us to provide the following types of information about each included systematic review:

Number of articles in systematic review (bubble size): The size of each bubble corresponds to the number of relevant primary research studies included in a systematic review.

Condition (bubble label): Each bubble is labeled with the condition discussed by that systematic review.

Shapes and colors: Intervention characteristics for each condition are presented in the form of colors (type of intervention) and shapes (comparators). For type of intervention, we included nonspecified massage therapy, tuina, myofascial release, reflexology, acupressure, and auricular acupressure. For comparators, we included mixed comparators with subgroups, mixed comparators with no subgroups, sham or placebo, and active therapy or usual care. A condition can appear more than once if multiple systematic reviews included different types of massage therapy or different comparators.

Strength of findings (rows): Each condition is plotted on the map based on the ratings of certainty of evidence statement as reported in the systematic reviews: high, moderate, low, or very low.

Outcome associated with massage therapy (columns): Each condition is plotted in potential benefit or no benefit as the outcome associated with massage therapy. Columns are not mutually exclusive. A review could have more than 1 conclusion, and conclusions could differ in the benefit associated with massage therapy. Both conclusions are included on the map.
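As a concrete illustration of the four bubble-plot dimensions described above, each review conclusion can be reduced to a plot coordinate before rendering. This is a sketch with invented example rows, not data extracted from the included reviews, and any plotting library could render the resulting coordinates.

```python
# Map each review conclusion onto the four bubble-plot dimensions:
# x position (outcome column), y position (certainty row),
# bubble size (number of primary studies), and label (condition).

CERTAINTY_ROWS = {"high": 4, "moderate": 3, "low": 2, "very low": 1}  # y-axis
OUTCOME_COLS = {"potential benefit": 0, "no benefit": 1}              # x-axis

def bubble(condition, n_studies, certainty, outcome):
    """Return (x, y, size, label) for one systematic-review conclusion."""
    return (OUTCOME_COLS[outcome], CERTAINTY_ROWS[certainty],
            n_studies, condition)

# Illustrative rows (invented for this sketch):
plot_data = [
    bubble("labor pain", 10, "moderate", "potential benefit"),
    bubble("fibromyalgia", 5, "very low", "no benefit"),
]
for x, y, size, label in plot_data:
    print(f"{label}: column {x}, row {y}, bubble size {size}")
```

Because columns are not mutually exclusive, a review with one beneficial and one non-beneficial conclusion would simply contribute two tuples with different x values.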

Risk of bias assessment is not part of the method of an evidence map. We assessed the quality of included reviews using criteria developed by the US Preventive Services Task Force (USPSTF). Certainty of evidence as determined by the original authors of the systematic review was abstracted for each conclusion in each systematic review and tabulated.

The search identified 1164 potentially relevant citations. Among 129 full-text articles screened, 41 publications were retained for further review. Of these, 24 reviews were excluded from the map for the following reasons: only 1 primary study about interventions of interest (11 studies), outcomes associated with massage therapy could not be distinguished from other included interventions (5 studies), not an intervention of interest (3 studies), not a comparison of interest (2 studies), overlap with a more recent or larger review that was already included on the map (2 studies), and self-delivered therapy (1 study). We included 17 publications in this map covering 13 health conditions. 4 , 10 - 25 The literature flowchart ( Figure 1 ) summarizes results of the study selection process, and eAppendix 2 in Supplement 1 presents citations for all excluded reviews at full-text screening.
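As a quick arithmetic check, the screening counts reported above are internally consistent: tallying the stated exclusion reasons (counts taken directly from the paragraph; the reason labels are paraphrased) reproduces the 17 mapped publications.

```python
# Exclusion reasons and counts as reported in the paragraph above.
exclusions = {
    "only 1 primary study about interventions of interest": 11,
    "massage outcomes not separable from other interventions": 5,
    "not an intervention of interest": 3,
    "not a comparison of interest": 2,
    "overlap with a more recent or larger included review": 2,
    "self-delivered therapy": 1,
}

retained_at_full_text = 41
excluded = sum(exclusions.values())
included_in_map = retained_at_full_text - excluded
print(excluded, included_in_map)  # prints: 24 17
```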

The total number of primary studies about massage therapy for pain in the included reviews ranged from 2 studies to 23 studies. There were 12 reviews that included fewer than 10 primary studies 4 , 11 - 17 , 20 - 23 and 5 reviews that included 10 to 25 studies about massage therapy for pain. 10 , 18 , 19 , 24 , 25 Of included reviews, 3 reviews were completed by the Cochrane Collaboration 4 , 19 , 23 and 2 reviews were completed by the AHRQ EPC program. 11 , 18

We categorized the included 17 reviews by health condition. These categories were cancer-related pain, 15 , 24 back pain (including chronic back pain, 25 chronic low back pain, 18 , 22 and low back pain 17 ), chronic neck pain, 18 fibromyalgia, 21 labor pain, 4 , 19 mechanical neck pain, 13 myofascial pain, 14 palliative care needs, 10 plantar fasciitis, 12 post–breast cancer surgery pain, 16 postcesarean pain, 23 postpartum pain, 20 and postoperative pain. 11

Of 17 included reviews, 3 reviews included more than 1 type of massage therapy and 14 reviews included 1 type of massage therapy. Reviews by Chou et al 11 and Smith et al 16 included acupressure and nonspecified massage therapy as interventions. The review by Candy et al 7 included reflexology and nonspecified massage therapy as interventions. Of the 14 reviews with 1 type of massage therapy, there were 5 reviews describing nonspecified massage therapy, 10 , 14 , 17 , 20 1 review about tuina, 22 5 reviews about myofascial release, 8 , 9 , 12 , 18 , 19 and 3 reviews about acupressure. 13 , 15 , 21

A variety of comparators were included in reviews. Of 9 reviews that included more than 1 comparator in analyses, 4 , 11 , 13 , 14 , 18 - 22 2 reviews did not conduct separate analyses by comparator (labeled mixed with no subgroups ) 13 , 14 and 3 reviews conducted separate analyses by comparator (labeled mixed with subgroups ). 4 , 21 , 22 The other 4 reviews included a mix of comparators with separate conclusions: sham or placebo and active therapy or usual care, 11 mixed with no subgroups and active therapy or usual care, 18 mixed with subgroups and active therapy or usual care, 20 and mixed with no subgroups, sham, and active therapy or usual care. 19 There were 8 reviews that included 1 comparator only in their analyses, 10 , 12 , 15 - 17 , 23 - 25 with 7 reviews that described interventions compared with active therapy or usual care only, 10 , 12 , 15 , 17 , 23 - 25 while 1 review limited inclusion to primary studies with a sham or placebo comparator. 16

There was substantial variation in the reporting of other details from primary studies in included reviews. Any study that did not specify the mode of delivery was included; studies that explicitly stated that massage therapy was self-delivered were excluded. Of the 17 included reviews, 5 reviews provided details of personnel who administered the therapy, including massage therapist, nurse, aromatherapist, physiotherapist, and reflexologist. 4 , 10 , 19 - 21 A total of 7 reviews presented length of sessions (eg, 5-minute or 90-minute sessions for massage therapy studies and 30-second or 5-minute sessions for acupressure studies). 10 , 16 , 18 , 20 - 23 With the exception of the review by He et al, 15 all reviews reported details about frequency, duration, or both when available. A total of 9 reviews included information about frequency of sessions (eg, 1 session or once every 3 weeks for massage therapy studies and 4 times per day or daily for acupressure studies), 10 , 12 , 16 - 18 , 20 - 23 and 9 reviews reported duration of sessions (eg, single session or 3 months). 10 - 12 , 16 - 18 , 20 , 22 , 23 There were 7 reviews that included details about follow-up (eg, 1 week or 12 months). 10 , 13 , 17 , 18 , 21 , 23 , 25

Using USPSTF criteria to rate the quality of included reviews, 10 reviews were rated good 4 , 10 , 11 , 14 - 16 , 18 , 19 , 21 , 23 and 7 reviews were rated fair. 12 , 13 , 17 , 20 , 22 , 24 , 25 See eAppendix 3 in Supplement 1 for each review’s rating.

Figure 2 is a visual depiction of the following types of information about each included systematic review: condition, types of comparison treatments (shapes), types of massage therapy (color), number of articles included for each conclusion (bubble size), outcomes associated with massage therapy for pain (columns), and certainty of evidence rating (rows). There were 6 reviews mapped more than once, reflecting primary studies describing more than 1 health condition, 18 more than 1 type of massage therapy, 10 , 20 or outcomes associated with massage therapy compared with different comparators. 11 , 17 - 19 There were 7 conditions from reviews 14 , 16 - 19 , 21 , 22 that reported 1 conclusion rated as moderate-certainty evidence, all of which concluded that massage therapy was associated with beneficial outcomes for pain ( Table 1 ). However, most other conditions had conclusions rated as low- or very low–certainty evidence (12 reviews about 10 conditions 4 , 10 - 13 , 15 , 17 - 20 , 23 - 25 ). This rating means “Our confidence in the effect estimate is limited. The true effect may be substantially different from the estimate of effect,” or “We have very little confidence in the effect estimate.” See eAppendix 3 in Supplement 1 for conclusions in all reviews. This map included 4 conditions that did not appear in the 2018 map, 12 , 16 , 20 , 23 and there were 8 conditions in the 2018 map that did not have new reviews meeting eligibility criteria (mainly a formal grading of the certainty of evidence); 7 health conditions 10 , 11 , 13 - 15 , 17 , 18 , 21 , 22 , 24 , 25 were included in the 2018 map and the new map (see details in eAppendix 4 in Supplement 1 ).
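
As a rough sketch of how Figure 2 encodes each conclusion, the snippet below maps a single conclusion onto the grid: the certainty rating selects the row, the outcome selects the column, and the number of included articles scales the bubble (intervention type and comparator would set the marker colour and shape). The example conclusion is invented for illustration and is not taken from any included review.

```python
# Minimal sketch of the evidence-map encoding used in Figure 2.
ROWS = ["high", "moderate", "low", "very low"]  # certainty-of-evidence ratings
COLS = ["potential benefit", "no benefit"]      # outcome columns

def map_position(conclusion):
    """Return (row, col, bubble_area) grid coordinates for one conclusion."""
    return (
        ROWS.index(conclusion["certainty"]),
        COLS.index(conclusion["outcome"]),
        conclusion["n_articles"] * 10,  # scale article count to bubble area
    )

# Hypothetical conclusion (illustrative values only):
example = {
    "condition": "chronic low back pain",
    "intervention": "myofascial release",          # would set marker colour
    "comparator": "active therapy or usual care",  # would set marker shape
    "certainty": "moderate",
    "outcome": "potential benefit",
    "n_articles": 8,
}
print(map_position(example))  # (1, 0, 80)
```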

Evidence about adverse events was collected by approximately half of included reviews, and no serious adverse events were reported. While 11 of 17 reviews 10 , 11 , 13 , 15 , 17 - 19 , 22 - 25 described adverse events, only 2 reviews 18 , 23 included certainty-of-evidence conclusions for adverse events, covering 3 health conditions ( Table 2 ).

There is a large literature of original randomized clinical trials, and of systematic reviews of those trials, evaluating massage therapy as a treatment for pain. Our systematic review found that, despite this literature, there were only a few conditions for which authors of systematic reviews concluded that there was at least moderate-certainty evidence regarding health outcomes associated with massage therapy for pain. Most reviews reported low- or very low-certainty evidence. Although adverse events associated with massage therapy for pain were rare, the evidence was limited. For reviews that reached conclusions about adverse events, the authors either were uncertain whether there was a difference between groups or did not find a difference, and rated the evidence as low to very low certainty.

Massage therapy is a broad term that is inclusive of many styles and techniques. We applied exclusion criteria determined a priori to help identify publications for inclusion in the evidence map. Despite that procedure, there was still a lack of clarity in determining what massage therapy is. For instance, acupressure was sometimes considered acupuncture and other times considered massage therapy, depending on author definition. In this case, we reviewed and included only publications that were explicitly labeled acupressure and did not review publications about acupuncture only. This highlights a fundamental issue with examining the evidence base of massage therapy for pain when there is ambiguity in defining what is considered massage therapy.

Unlike a pharmaceutical placebo, sham massage therapy may not be truly inactive. It is conceivable that even the light touch or touch with no clear criterion 26 used in sham massage therapy may be associated with some positive outcomes, meaning that patients who receive the massage therapy intervention and those who receive a sham massage therapy could both demonstrate some degree of symptom improvement. Limitations of sham comparators raise the question of whether sham or placebo treatment is an appropriate comparison group in massage therapy trials. It may be more informative to compare massage therapy with other treatments that are accessible and whose benefits are known so that any added beneficial outcomes associated with massage therapy could be better isolated and understood.

Compared with the 2018 map, our map included 4 new conditions not on the 2018 map, while 8 conditions from the 2018 map had no new reviews meeting eligibility criteria and 7 health conditions appeared in both maps. Despite identifying new conditions and conclusions with higher certainty of evidence in several reviews in our updated search, most included reviews reported low or very low certainty of evidence, suggesting that the most critical research need is for better evidence to increase certainty of evidence for massage therapy for pain. This is a challenge given that massage, like other complementary and integrative health interventions, does not have the historical research infrastructure that most health professions have. 27 Nevertheless, it is only when systematic reviews and meta-analyses are conducted with high-quality primary studies that the association or lack of association of massage therapy with pain will reach higher certainties of evidence. Studies comparing massage therapy with placebo or sham are probably not the priority; rather, the priority should be studies comparing massage therapy with other recommended, accepted, and active therapies for pain. Studies comparing massage therapy with other recommended therapies should also have a sufficiently long follow-up to allow any nonspecific outcomes (eg, those associated with receiving some new treatment) to dissipate. For example, this period has been proposed to be at least 6 months for studies of chronic pain.

There are 2 main limitations to this systematic review’s evidence map. The first, common to all systematic reviews, is that we may not have identified all potentially eligible evidence. If a systematic review was published in a journal not indexed in any of the 5 databases we searched and we did not identify it as part of our search of references of included publications, then we would have missed it. Nevertheless, our search strategy identified more than 200 publications about massage therapy for pain published since July 2018, so we did not lack potential reviews to evaluate. The second limitation of evidence maps is that we did not independently evaluate the source evidence; in other words, we took the conclusions of the authors of each systematic review at face value. That is the nature of an evidence map. Particular to this application of the mapping process, we mapped the review we deemed most informative for the 2 health conditions that had more than 1 eligible review (back pain and labor pain). This necessarily requires judgment, and others could disagree with that judgment. We included the citations for reviews excluded from the map for this overlap reason in supplemental material, and interested readers can review them for additional information. As in all evidence-based products, and particularly in one such as this covering a large and complex evidence base, it is possible that there are errors of data extraction and compilation. We used dual review to minimize the chance of such errors, but if we are notified of errors, we will correct them.

Although this systematic review found that the number of conclusions about the effectiveness of massage therapy that were judged to have at least moderate certainty of evidence was greater now than in 2018, it was still small relative to the need. More high-quality randomized clinical trials are needed to provide a stronger evidence base to assess the effect of massage therapy on pain. For painful conditions that do not have at least moderate-certainty evidence supporting use of massage therapy, new studies that address limitations of existing research are needed. The field of massage therapy would be best advanced by educating the wider research community with clearer definitions of massage therapy and whether it is appropriate to include multiple modalities in the same systematic review.

Accepted for Publication: May 15, 2024.

Published: July 15, 2024. doi:10.1001/jamanetworkopen.2024.22259

Open Access: This is an open access article distributed under the terms of the CC-BY License. © 2024 Mak S et al. JAMA Network Open.

Corresponding Author: Selene Mak, PhD, MPH, Veterans Health Administration, Greater Los Angeles Healthcare System, 11301 Wilshire Blvd, Los Angeles, CA 90073 ([email protected]).

Author Contributions: Drs Mak and Shekelle had full access to all of the data in the study and take responsibility for the integrity of the data and the accuracy of the data analysis.

Concept and design: Mak, Miake-Lye, Shekelle.

Acquisition, analysis, or interpretation of data: Mak, Allen, Begashaw, Beroes-Severin, De Vries, Lawson, Shekelle.

Drafting of the manuscript: Mak, Allen, Begashaw, Beroes-Severin, De Vries, Lawson, Shekelle.

Critical review of the manuscript for important intellectual content: Mak, Miake-Lye, Shekelle.

Statistical analysis: Allen.

Obtained funding: Shekelle.

Administrative, technical, or material support: Begashaw, Miake-Lye, Beroes-Severin, De Vries, Lawson.

Supervision: Mak, Shekelle.

Conflict of Interest Disclosures: None reported.

Funding/Support: Funding was provided by the Department of Veterans Affairs Health Services Research and Development.

Role of the Funder/Sponsor: The funders had no role in the collection, management, analysis, and interpretation of the data and preparation of the manuscript. The funders participated in the design and conduct of the study, the review and approval of the manuscript, and the decision to submit the manuscript for publication.

Data Sharing Statement: See Supplement 2 .


Systematic Literature Review of AI-Enabled Spectrum Management in 6G and Future Networks

12 Jun 2024 · Bushra Sabir, Shuiqiao Yang, David Nguyen, Nan Wu, Alsharif Abuadbba, Hajime Suzuki, Shangqi Lai, Wei Ni, Ding Ming, Surya Nepal

Artificial Intelligence (AI) has advanced significantly in various domains like healthcare, finance, and cybersecurity, with successes such as DeepMind's medical imaging and Tesla's autonomous vehicles. As telecommunications transition from 5G to 6G, integrating AI is crucial for complex demands like data processing, network optimization, and security. Despite ongoing research, there's a gap in consolidating AI-enabled Spectrum Management (AISM) advancements. Traditional spectrum management methods are inadequate for 6G due to its dynamic and complex demands, making AI essential for spectrum optimization, security, and network efficiency. This study aims to address this gap by: (i) Conducting a systematic review of AISM methodologies, focusing on learning models, data handling techniques, and performance metrics. (ii) Examining security and privacy concerns related to AI and traditional network threats within AISM contexts. Using the Systematic Literature Review (SLR) methodology, we meticulously analyzed 110 primary studies to: (a) Identify AI's utility in spectrum management. (b) Develop a taxonomy of AI approaches. (c) Classify datasets and performance metrics used. (d) Detail security and privacy threats and countermeasures. Our findings reveal challenges such as under-explored AI usage in critical AISM systems, computational resource demands, transparency issues, the need for real-world datasets, imbalances in security and privacy research, and the absence of testbeds, benchmarks, and security analysis tools. Addressing these challenges is vital for maximizing AI's potential in advancing 6G technology.



J Mark Access Health Policy, v.11(1); 2023 (PMC10443963)

Systematic literature reviews over the years

Beata Smela,^a Mondher Toumi,^b Karolina Świerk, Konrad Gawlik, Emilie Clay,^c Laurent Boyer

^a Assignity, Krakow, Poland
^b Public Health Department, Aix-Marseille University, Marseille, France
^c Clever-Access, Paris, France

Associated data

The data supporting the findings of this study are available within the article and its supplementary materials.

Purpose: Nowadays, systematic literature reviews (SLRs) and meta-analyses are often placed at the top of the hierarchy of evidence. The main objective of this paper is to evaluate the trends in SLRs of randomized controlled trials (RCTs) throughout the years.

Methods: The Medline database was searched using a highly focused search strategy. Each paper was coded according to a specific ICD-10 code, and the number of RCTs included in each evaluated SLR was retrieved. All SLRs analyzing RCTs were included; protocols, commentaries, and errata were excluded. No restrictions were applied.

Results: A total of 7,465 titles and abstracts were analyzed, from which 6,892 were included for further analyses. There was a gradual increase in the number of annually published SLRs, with a marked acceleration over the last several years. Overall, the most frequently analyzed areas were diseases of the circulatory system (n = 750) and endocrine, nutritional, and metabolic diseases (n = 734). The majority of SLRs included between 11 and 50 RCTs each.

Conclusions: The recognition of SLRs’ usefulness is growing at an increasing speed, which is reflected by the growing number of published studies. The most frequently evaluated diseases are in alignment with leading causes of death and disability worldwide.

Introduction

Presenting background information about a subject or documenting the growth of knowledge over time can be achieved with narrative reviews of the literature. However, they tend to be subjective, as they rely on the author’s expertise on the discussed topic, and offer a condensed presentation of a subject rather than an extensive one. Furthermore, they are frequently based on articles chosen selectively from the available material, which puts them at risk of systematic bias [ 1 ]. Typically, narrative reviews do not describe how the review process was carried out [ 2 ]. As a result, they usually do not provide a thorough foundation for theory development and testing [ 3 ]. In 1979, the British epidemiologist Archie Cochrane wrote: ‘It is surely a great criticism of our profession that we have not organised a critical summary, by speciality or subspecialty, updated periodically, of all relevant randomised controlled trials’ [ 4 ]. That is why researchers in the field of healthcare have been working on a program of systematic reviews on the efficacy of therapies since the 1980s. In order to collect, assess, and promote research information, the Cochrane Collaboration was established in 1993. Since then, an extensive set of guidelines for conducting systematic reviews has been produced [ 5 ]. Other organizations have joined this effort to convert the knowledge gained by health experts into practice, the main aim being to assist evidence-based medicine (EBM) practitioners in decision-making [ 6 ]. Nowadays, systematic literature reviews (SLRs) and meta-analyses are often placed at the top of the evidence hierarchy, usually depicted as a pyramid ordered by the design and risk of bias of included studies [ 7 ]. In contrast to narrative reviews, systematic reviews address a specific research question [ 8 ]. This includes collecting all primary research applicable to the established review question and critically evaluating and synthesizing the data [ 9 ].
There are a few stages to conducting an SLR. Defining the review question, establishing hypotheses, and coming up with a review title are all part of the first stage. Titles should ideally be succinct and descriptive, e.g. intervention for the population with a given condition. One should always define inclusion and exclusion criteria a priori (according to PICO: P – population, I – intervention, C – comparison, O – outcomes), as well as study type (e.g. RCTs). The development of a search strategy is another key step in performing a good-quality SLR. Searching typically involves using several electronic databases (such as MEDLINE, EMBASE, or Cochrane CENTRAL), but it can also include consulting article reference lists, manually scanning important journals (hand-searching), or speaking directly with experts and scholars [ 10 ]. Once all abstracts are found, the following step is their screening – the process of identifying articles for inclusion and removing duplicates [ 8 ]. Then, appropriate full-text articles are gathered and data from the selected studies are extracted. Data analysis should be carried out after quality assessment. Alternatively, some of these steps may be streamlined or omitted to produce evidence in a resource-efficient manner, in the form of a rapid review, which is less comprehensive than a traditional SLR [ 11 ]. The first phase of the analysis is a straightforward descriptive review of each study, usually referred to as qualitative analysis. If it is possible to combine results from different studies, the second phase – quantitative analysis, or meta-analysis – can be performed [ 12 ]. If used appropriately, meta-analysis increases the accuracy of estimates of treatment outcomes, reducing the likelihood of false-positive or false-negative findings and possibly allowing for the earlier implementation of successful therapies [ 13 ].
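
The a-priori PICO criteria described above can be thought of as structured data against which each candidate study is screened. A minimal sketch (all field values are hypothetical, not drawn from any real protocol):

```python
# Hypothetical PICO eligibility criteria encoded as data.
pico = {
    "population": "adults with chronic low back pain",
    "intervention": "massage therapy",
    "comparison": "usual care",
    "outcomes": {"pain intensity", "function"},
    "study_type": "RCT",
}

def screen(study):
    """Return True if a candidate study record matches the PICO criteria."""
    return (
        study["study_type"] == pico["study_type"]
        and study["intervention"] == pico["intervention"]
        and bool(pico["outcomes"] & set(study["outcomes"]))
    )

candidate = {"study_type": "RCT", "intervention": "massage therapy",
             "outcomes": ["pain intensity"]}
print(screen(candidate))  # True
```

In practice, screening involves human judgment on titles, abstracts, and full texts; encoding the criteria up front simply makes the inclusion decisions auditable.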
The number of SLRs appears to have been exploding over the years, yet, to the authors’ knowledge, no quantification of this phenomenon has been described. The main objective of this paper was to evaluate the volume trends in SLRs of RCTs throughout the years.

To analyse the overall increase in the number of SLRs over the years, a broad search of PubMed was performed in May 2023 using the following search string: ‘randomised controlled’ OR ‘randomised clinical’ OR ‘randomized controlled’ OR ‘randomized clinical’ OR RCT*. An appropriate filter was then applied to retrieve only studies with an SLR design. The total numbers of retrieved studies, stratified by publication year, were exported into an Excel file.

To run a detailed analysis of the trends in RCT SLRs over the years, a rapid review was conducted in the Medline database on a smaller, representative sample of references, and the results were compared with those from PubMed to check consistency between them, i.e. whether the same trend in the number of published SLRs would be observed. The Medline database was searched via Ovid in May 2023 using a highly focused search strategy: (systematic review* or systematic literature review*).ti AND randomi?ed controlled trial*.ti. The search results were then imported into the EndNote 20 (Clarivate) program and analysed using the Eppi-Reviewer Web software [ 14 , 15 ]. A single screening of titles and abstracts was performed. Additionally, based on information provided in titles and abstracts, each paper was coded according to a specific ICD-10 code (depending on the analysed disease area or procedure, Table 1 ) or, if no ICD-10 code was applicable (for example, when the SLR analysed healthy subjects), to a ‘Treatments’ code, consisting of ‘Pain treatment’, ‘Anaesthesia’, ‘Supplements/diet’ and ‘Other treatments/interventions’ subcategories. When appropriate, two or more codes were selected. Data regarding the number of RCTs included in each analysed SLR were also retrieved and divided into six categories: 1–10, 11–50, 51–100, 101–200, >200, and not reported in the abstract/title (NR).

Disease area according to ICD-10 classification.

Code     Title
A00–B99  Certain infectious and parasitic diseases
C00–D48  Neoplasms
D50–D89  Diseases of the blood and blood-forming organs and certain disorders involving the immune mechanism
E00–E90  Endocrine, nutritional and metabolic diseases
F00–F99  Mental and behavioral disorders
G00–G99  Diseases of the nervous system
H00–H59  Diseases of the eye and adnexa
H60–H95  Diseases of the ear and mastoid process
I00–I99  Diseases of the circulatory system
J00–J99  Diseases of the respiratory system
K00–K93  Diseases of the digestive system
L00–L99  Diseases of the skin and subcutaneous tissue
M00–M99  Diseases of the musculoskeletal system and connective tissue
N00–N99  Diseases of the genitourinary system
O00–O99  Pregnancy, childbirth and the puerperium
P00–P96  Certain conditions originating in the perinatal period
Q00–Q99  Congenital malformations, deformations and chromosomal abnormalities
R00–R99  Symptoms, signs and abnormal clinical and laboratory findings, not elsewhere classified
S00–T98  Injury, poisoning and certain other consequences of external causes
V01–Y98  External causes of morbidity and mortality
Z00–Z99  Factors influencing health status and contact with health services
U00–U99  Codes for special purposes

All SLRs analysing RCTs were included, without any restrictions on population, interventions, or outcomes. Protocols, commentaries, or errata were excluded. No restrictions on date or language were applied.
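
The six RCT-count categories used in this rapid review reduce to a simple binning rule. A small illustrative helper (category labels follow the text above):

```python
def rct_count_category(n):
    """Assign an SLR's number of included RCTs to one of the six categories.

    n is an int, or None when the abstract/title did not report it (NR).
    """
    if n is None:
        return "NR"
    if n <= 10:
        return "1-10"
    if n <= 50:
        return "11-50"
    if n <= 100:
        return "51-100"
    if n <= 200:
        return "101-200"
    return ">200"

print([rct_count_category(n) for n in (7, 23, 150, 250, None)])
# ['1-10', '11-50', '101-200', '>200', 'NR']
```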

Database analysis

The PubMed search for SLRs of RCTs yielded 86,765 results ( Figure 1 ). The first identified article was issued in 1990 and compared the effects of corticosteroid administration with no corticosteroid treatment before preterm delivery, based on data from 12 RCTs [ 16 ]. Another 10 articles were published in 1994, and since then, new articles have been released annually. The number of newly published SLRs gradually increased each year: 100 articles were published in 1999, and 1,000 in 2005. In recent years, several thousand new RCT SLRs have been published annually; what is more, 37% of all identified records were published since 2020 (n = 32,174). We observe exponential growth in published RCT SLRs. The largest number of new records was observed in 2022, with more than 10,000 publications released. As presented in Figure 2 , the number of published RCTs was also growing; however, it reached a maximum in 2014 and has stayed at a similar level since.

Figure 1. The number of RCT SLRs published over the years. Source: PubMed (search run in May 2023).

Figure 2. The number of RCTs published over the years. Source: PubMed (search run in May 2023).

Rapid review

The highly targeted search conducted in Medline yielded 7,534 records. After deduplication, 7,465 titles and abstracts were analysed, from which 6,892 were included for further analyses ( Figure 3 ). The oldest retrieved publications date back to 1994. The gradual increase in the number of published RCT SLRs was consistent with the one observed in the PubMed analysis: more than 200 articles were released in 2013, and more than 600 in 2019; there was an intensification of the increase during the last several years: from over 800 publications in 2020 to over 1,400 published in 2022. Therefore, it was assumed that the identified records provided a representative sample for further analysis.

Figure 3. Distribution of RCT SLRs over the years (search run in May 2023).

The distribution of all identified RCT SLRs by evaluated disease area and by the number of included RCTs is presented in Figure 4 . Overall, the most frequently analysed area in the identified articles was diseases of the circulatory system (I00–I99, n = 750; 10.9% of included articles), such as heart failure or stroke, closely followed by endocrine, nutritional, and metabolic diseases (E00–E90, n = 734; 10.7%), mainly diabetes. Additionally, 7.8% of SLRs focused on assessing the impact of various supplementations or diets (n = 535). Relatively few SLRs were identified for the following disease areas: congenital malformations, deformations and chromosomal abnormalities (Q00–Q99, n = 10), diseases of the ear and mastoid process (H60–H95, n = 21) and external causes of morbidity and mortality (V01–Y98, n = 26).

Figure 4. Number of RCT SLRs stratified by disease area and by the number of included RCTs.

The majority of SLRs summarised data from a moderate number of RCTs (between 11 and 50, n = 3,194; 46.4%); furthermore, one-third of the analysed reviews included a lower number of studies (between 1 and 10). Larger SLRs, including more than 50 trials, were fewer in number: 5.2% included between 51 and 100 trials (n = 361), and 1.9% between 101 and 200 (n = 131); only 87 out of 6,892 identified reviews analysed more than 200 trials. Furthermore, in 6.9% of articles, the number of incorporated trials was not reported in the abstract (n = 474). This proportion was consistent throughout the years and across disease areas.
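
The reported shares can be re-derived from the stated counts (6,892 included SLRs in total); a quick illustrative check:

```python
# Re-derive the category shares from the counts stated in the text.
total = 6892
counts = {"11-50": 3194, "51-100": 361, "101-200": 131, ">200": 87, "NR": 474}
shares = {k: round(100 * v / total, 1) for k, v in counts.items()}
# 3,194/6,892 computes to about 46.3% (the text reports 46.4%);
# the remaining shares match the reported 5.2%, 1.9%, and 6.9%.
print(shares)
```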

Figure 5 depicts the distribution of studies according to disease area in three distinct periods: from 1994 to 2015 (A), from 2016 to 2019 (B) and from 2020 onward (C).

Figure 5. The number of RCT SLRs stratified by disease area, published in 3 distinct periods: from 1994 to 2015, from 2016 to 2019 and from 2020 to March of 2023.

A total of 1,624 identified RCT SLRs were published between 1994 and 2015. Diseases affecting the circulatory system (I00–I99; n = 176; 10.8%) and the musculoskeletal system and connective tissue (M00–M99, n = 138; 8.5%) were the main areas of focus. With 7.5% and 7.4% of studies, respectively, neoplasms (C00–D48, n = 121) and mental and behavioural disorders (F00–F99, n = 120) were also among the most commonly studied topics. Interestingly, the largest single group of studies (n = 223; 13.7%) could not be assigned to any specific ICD-10 code.

In the publishing period between 2016 and 2019, 1,880 RCT SLRs were identified. One significant change from the 1994–2015 period is that reviews on endocrine, nutritional, and metabolic diseases (E00-E90, n = 232; 12.3%) outnumbered the SLRs focused on circulatory system diseases (I00-I99, n  = 212; 11.3%). Additionally, the number of studies analysing the impact of supplementations or diets increased twofold.

Half of all analysed RCT SLRs were published in recent years (2020 to March 2023; n = 3,416). Similar to the previous interval, the most frequently analysed disease area was endocrine, nutritional, and metabolic diseases (n = 394; 11.5%), closely followed by diseases of the circulatory system (n = 363; 10.6%) and of the musculoskeletal system and connective tissue (n = 305; 8.9%). The number of newly published studies analysing the impact of supplementations or diets again doubled in comparison with the 2016–2019 period. In recent years, a threefold increase in the number of studies concerning nervous system diseases (G00–G99), such as Alzheimer’s disease, was observed. The first appearance of codes for special purposes (U00–U99; n = 80), related to the COVID-19 pandemic, is also worth noting.

In recent years, evidence synthesis has become more influential than individual studies. It helps to compare similar studies, combine their findings, and make evidence more accessible, and it supports the identification of the most cost-effective treatments and the design of better future research. Back in 1995, the Cochrane Group released the Cochrane Database of Systematic Reviews (CDSR), which consisted of 50 reviews [ 17 , 18 ]; by 2015, the number of reviews exceeded 6,000. In 1998, the CDSR was made accessible on the internet. Of the roughly 2,500 reviews released each year, 20% are written by Cochrane [ 17 ]. Overall, in 2010, approximately 75 trials and 11 systematic literature reviews were published every day [ 19 ]. However, the number of published SLRs has not exceeded the number of narrative, non-systematic reviews, whose growth is even faster and which are published by far more journals [ 19 ]. Furthermore, fewer SLRs and trials are published than case reports [ 19 ]. Interestingly, the data show that 95% of all articles and 98% of those in core clinical journals were produced by just 30 nations globally. By 2018, all publication types had increased; the most significant increase concerned meta-analyses from China, which led the chart (n = 4,659), while the United States of America led in systematic reviews (n = 3,654), clinical trials (n = 11,095), and RCTs (n = 7,953) [ 20 ].

With so many SLRs published nowadays, it is crucial to rely on trustworthy data. The quality of a trial has been defined in the literature as ‘the likelihood of the trial design to generate unbiased results’ [ 21 ]. The quality of the individual included studies affects the quality of the entire SLR; a proper bias assessment is therefore a crucial step and key component, especially when the evidence of medical treatment effectiveness is inconclusive. Many tools help with performing a quality assessment of RCTs, such as the Jadad scale [ 8 ] or the Risk of Bias tool for randomized trials 2.0 (RoB 2.0), which is the suggested method for evaluating bias in studies that are part of Cochrane Reviews [ 22 ]. RoB 2.0 comprises five domains: the randomization process, deviations from intended interventions, missing outcome data, measurement of the outcome, and selection of the reported results [ 22 ]. Just as with RCTs, the quality of SLRs varies. A Measurement Tool To Assess Systematic Reviews (AMSTAR) was published in 2007 to allow health professionals and policymakers to quickly evaluate the quality of SLRs of interventional RCTs. However, in response to criticisms, such as focusing mainly on RCTs and treating articles written in languages other than English as ‘grey literature’, the second, current version was developed in 2017. AMSTAR-2 additionally takes non-RCTs into account, with the goal of determining whether the most crucial information is reported in SLRs [ 23–25 ]. The CASP Systematic Review Checklist is another commonly used instrument, recommended by the World Health Organization and Cochrane as an approachable alternative for novice qualitative researchers [ 26 ]. It consists of ten questions, divided into three sections, that help to determine whether the results of the study are valid (Section A), what the results are (Section B), and whether the results will help locally (Section C) [ 27 ].
Lastly, while not a quality assessment instrument, the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) could be used to improve the reporting of SLRs and meta-analyses; it is also a useful tool for critical appraisal of published SLRs [ 28 ].

The COVID-19 pandemic shook the world at the beginning of 2020. The need for more information about this disease was high, as there were many uncertainties. Already in April 2020, a quick search yielded 6,831 articles about COVID-19 from a total of 1,430 journals [ 29 ]. The most frequent study designs were review articles (n = 202) and SLRs (n = 43), and an average of almost 59 articles were published every day [ 29 ]. The so-called ‘covidization’ emerged: until August of 2021, COVID-19 received up to 79.3% of citations in general and internal medicine publications and was mentioned in 98 of the top 100 most-cited articles [ 30 ]. That trend is also noticeable in the number of SLRs containing keywords for COVID-19 published in PubMed since 2020 ( Figure 6 , search run in May 2023 - Appendix ). However, according to a systematic analysis of SLRs on COVID-19 published in 2021, the methodological quality of the reviews was poor: of 243 assessed with AMSTAR-2, 12.3% had moderate quality, 25.9% had low quality, and 61.7% had critically low quality [ 31 ]. These conclusions were confirmed by other authors. Abbott et al. analysed early published COVID-19 SLRs and found that 88 of the 280 reviews assessed met SLR criteria, and only 3 of them had moderate or high quality according to AMSTAR-2. Fifty-two of those SLRs had been completed within 3 weeks, and the submission and publication process took 3 weeks in 50% of cases. The publications received high attention despite being of low quality [ 32 ]. This shows that studies reported as SLRs should not be assumed to be of high quality; each reader should analyse the methodology undertaken and consider its impact on the findings.

Figure 6. The number of SLRs on COVID-19 published since 2020. Source: PubMed (search run in May 2023).

The number of publications is increasing not only for RCT SLRs; similar tendencies may be observed for other study designs. According to the performed analysis of PubMed data, the overall trend in publishing SLRs of epidemiological studies is similar to that of SLRs on RCTs: the number of publications of both study designs grows consistently, although RCT SLRs are more numerous (in 2022, RCTs: n = 10,061; epidemiological: n = 8,947) ( Figure 7 , search run in May 2023 - Appendix ). The development of epidemiological studies may be a result of recent interest in real-world evidence (RWE) and its increasing role in health-care decisions. The increasing digitisation of health records, which facilitates data analysis, as well as a growing focus on the importance of patient-reported outcomes, supports this trend [ 33 ].

Figure 7. The number of epidemiological SLRs published over the years. Source: PubMed (search run in May 2023).

According to the Global Health Estimates published by the World Health Organization (WHO), covering the period between 2000 and 2019, non-communicable diseases (e.g. chronic diseases such as heart disease, chronic respiratory disease, cancer, or diabetes) made up 7 of the world’s top 10 causes of death [ 34 , 35 ]. Ischaemic heart diseases were in the lead, accounting for 8.9 million deaths in 2019, while stroke was in second place, causing 11% of deaths [ 35 ]. This is consistent with the number of published RCT SLRs in the cardiovascular area and explains the interest that physicians and trialists across the world take in cardiovascular diseases. With such a high death rate caused by these illnesses, it is essential to study all possible treatments or interventions that can ease the suffering of many patients and, hopefully, extend their life expectancy and improve their quality of life. In recent years, deaths from diabetes increased by 70% globally, representing the greatest percentage increase of all WHO regions [ 34 , 35 ], which was also noticeable in the SLR trends. Another change from previous years was the appearance of Alzheimer’s disease and other forms of dementia among the top 10 causes of death worldwide [ 34 ], also discernible in our analysis.

The diseases in question also affected quality of life and disability: heart diseases, diabetes, stroke, lung cancer, and chronic obstructive pulmonary disease were collectively responsible for nearly 100 million additional healthy life-years lost in 2019 compared to 2000 [ 34 ]. According to both the WHO [ 35 ] and the Global Burden of Disease study published by The Institute for Health Metrics and Evaluation (IHME) [ 36 ], neonatal diseases were among the leading causes of disability-adjusted life years (DALYs) in 2019; however, this was not captured in the current analysis, since there is no uniform ICD-10 code for this area ( Figure 8 ).

Figure 8. The burden of disease by cause, measured in DALYs. Source: Institute for Health Metrics and Evaluation (IHME): GBD results [ 36 ].

The analysis identified several disease areas with a relatively low number of SLRs: congenital malformations, deformations and chromosomal abnormalities (Q00-Q99, n = 10), diseases of the ear and mastoid process (H60-H95, n = 21), and external causes of morbidity and mortality (V01-Y98, n = 26). For congenital and chromosomal diseases, it could be assumed that few RCTs are conducted because of low patient numbers, as these are often rare diseases; additionally, treatment options are usually symptomatic, with limited options, such as gene therapies, to treat the underlying cause. Treatment for diseases occurring due to external causes also tends to be mostly symptomatic; therefore, very few RCTs focused specifically on the cause itself (such as injury or poisoning) would be available. The scarcity of SLRs of RCTs on diseases of the ear and mastoid process is in accordance with the WHO report on hearing from 2021, which underlines that there is a significant gap in ear and hearing care services worldwide - for instance, there is an 83% gap between the need for and access to hearing aid use; the authors state that the reasons could include the lack of accurate information and stigmatizing mindsets surrounding ear diseases [ 37 ].

It is also worth mentioning the increasing number of SLRs reporting on food supplements and diets. Between 2010 and 2020, over 70,000 new articles on nutraceuticals became available in PubMed. The COVID-19 pandemic led to even higher interest in dietary supplements in early 2020. Consumers were looking for additional protection from disease, which resulted in a 44% increase in sales in the US during the first wave of the pandemic, relative to the same period in the previous year. Supplement sales in March 2020 increased by 63% in the UK and by about 40–60% in France versus the same period in 2019 [ 38 ]. In some authors’ opinion, supplement market growth will not continue and should normalize to pre-pandemic values over the following years [ 39 ]. According to other sources, the global supplement market is expected to grow, driven mainly by a focus on well-being and preventive healthcare, a shift from standard pharmaceuticals to supplements and diets, and the growing geriatric population [ 40 ].

Conclusions

The recognition of the usefulness of RCT SLRs for providing synthesized, unbiased information has led to an increased volume of SLRs: RCT SLR publications are growing at an exponential rate. The rapid increase in the number of published RCT SLRs in the last 3 years is partly driven by the emergence of the COVID-19 pandemic. While the SLR is considered the gold standard for unequivocally addressing evidence, in the case of COVID-19 it was a source of controversy and of outcome divergence between studies. The diseases most frequently evaluated through RCT SLRs are aligned with the leading causes of death and disability worldwide indicated in the reports published by the WHO and IHME [ 35 , 36 ]. The emergence of food supplements and diets illustrates the increasing interest in interventions that may be considered at the frontier of lifestyle and medicine. Although the SLR is recognized as the most rigorous way to perform a review, narrative reviews still outnumber SLRs. It is interesting to note that epidemiological SLRs are growing fast and are about to catch up with the number of RCT SLRs; the development of real-world evidence to assess interventions and broader access to historical databases may have played a role in the development of epidemiological studies. SLRs will continue to grow as the numbers of RCTs and epidemiological studies grow, making the need for unbiased summaries increasingly important for supporting EBM.

Supplementary Material

COVID-19 search strategy (“systematic reviews” filter applied), run on 15.05.2023 in PubMed

(‘COVID-19’[Mesh] OR ‘SARS-CoV-2’[Mesh] OR ‘COVID-19 Vaccines’[Mesh] OR ‘COVID-19 Serological Testing’[Mesh] OR ‘COVID-19 Nucleic Acid Testing’[Mesh] OR ‘SARS-CoV-2 variants’ [Supplementary Concept] OR ‘COVID-19 drug treatment’ [Supplementary Concept] OR ‘COVID-19 serotherapy’ [Supplementary Concept] OR ‘2019-nCoV’ OR ‘2019nCoV’ OR ‘cov 2’ OR ‘COVID-19’ OR ‘sars coronavirus 2’ OR ‘sars cov 2’ OR ‘SARS-CoV-2’ OR ‘severe acute respiratory syndrome coronavirus 2’ OR ‘coronavirus 2’ OR ‘COVID 19’ OR ‘COVID-19’ OR ‘2019 ncov’ OR ‘2019nCoV’ OR ‘corona virus disease 2019’ OR ‘cov2’ OR ‘COVID-19’ OR ‘COVID19’ OR ‘nCov 2019’ OR ‘nCoV’ OR ‘new corona virus’ OR ‘new coronaviruses’ OR ‘novel corona virus’ OR ‘novel coronaviruses’ OR ‘sars coronavirus 2’ OR ‘SARS2’ OR ‘SARS-CoV-2’ OR ‘severe acute respiratory syndrome coronavirus 2’)
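The yearly counts behind Figures 6 and 7 were obtained through PubMed's web interface. For readers who want to reproduce such counts programmatically, the same kind of query can be sent to NCBI's E-utilities `esearch` endpoint. The sketch below only builds the request URL for one year's count; the shortened query term and the helper name are illustrative, not part of the original search strategy:

```python
from urllib.parse import urlencode

EUTILS = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"

def build_esearch_url(term: str, year: int) -> str:
    """Build an NCBI E-utilities esearch URL that counts PubMed records
    matching `term`, restricted to systematic reviews published in `year`."""
    params = {
        "db": "pubmed",
        # Combine the query with PubMed's systematic-review filter.
        "term": f"({term}) AND systematic review[Filter]",
        "datetype": "pdat",   # restrict by publication date
        "mindate": str(year),
        "maxdate": str(year),
        "retmode": "json",
        "rettype": "count",   # only the hit count is needed for a trend line
    }
    return f"{EUTILS}?{urlencode(params)}"

# One point of a Figure-6-style trend line (abbreviated COVID-19 query).
url_2021 = build_esearch_url('"COVID-19" OR "SARS-CoV-2"', 2021)
```

Fetching each yearly URL and reading the returned count would reproduce the trend data; mind NCBI's rate limits when looping over many years.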

Epidemiologic studies search strategy (“Systematic reviews” filter applied) run on 15.05.2023 in PubMed

‘epidemiologic stud*’ OR ‘epidemiology’ OR ‘epidemiologic’ OR ‘epidemiological’ OR ‘epidemiol*’

Disclosure statement

No potential conflict of interest was reported by the author(s).

Data availability statement

Supplemental data for this article can be accessed online at https://doi.org/10.1080/20016689.2023.2244305


Analysing Near-Miss Incidents in Construction: A Systematic Literature Review


1. Introduction

  • Q 1 —Are near-miss events in the construction industry the subject of scientific research?
  • Q 2 —What methods have been employed thus far to obtain information on near misses and systems for recording incidents in construction companies?
  • Q 3 —What methods have been used to analyse the information and figures obtained?
  • Q 4 —What are the key aspects of near misses in the construction industry that have been of interest to the researchers?

2. Definition of Near-Miss Events

3. Research Methodology

4.1. A Statistical Analysis of Publications

4.2. Methods Used to Obtain Information about Near Misses

4.2.1. Traditional Methods

  • Traditional registration forms
  • Computerized systems for the recording of events
  • Surveys and interviews

4.2.2. Real-Time Monitoring Systems

  • Employee-tracking systems
  • Video surveillance systems
  • Wearable technology
  • Motion sensors

4.3. Methods Used to Analyse the Information and Figures That Have Been Obtained

4.3.1. Quantitative and Qualitative Statistical Methods

4.3.2. Analysis Using Artificial Intelligence (AI)

4.3.3. Building Information Modelling

4.4. Key Aspects of Near-Miss Investigations in the Construction Industry

4.4.1. Occupational Risk Assessment

4.4.2. Causes of Hazards in Construction

4.4.3. Time Series of Near Misses

4.4.4. Material Factors of Construction Processes

4.5. A Comprehensive Overview of the Research Questions and References on Near Misses in the Construction Industry

5. Discussion

5.1. Interest of Researchers in Near Misses in Construction (Question 1)

5.2. Methods Used to Obtain Near-Miss Information (Question 2)

5.3. Methods Used to Analyse the Information and Data Sets (Question 3)

5.4. Key Aspects of Near-Miss Investigations in the Construction Industry (Question 4)

6. Conclusions

  • A quantitative analysis of the Q 1 question has revealed a positive trend: there is growing interest among researchers in studying near misses in construction. The greatest interest in NM topics is observed in the United States of America, China, the United Kingdom, Australia, Hong Kong, and Germany. Additionally, there has been a recent emergence of interest in Poland. Articles are mainly published in journals such as Safety Science (10), Journal of Construction Engineering and Management (8), and Automation in Construction (5);
  • The analysis of question Q 2 illustrates that traditional paper-based event registration systems are currently being superseded by advanced IT systems. However, both traditional and advanced systems are subject to the disadvantage of relying on employee-reported data, which introduces a significant degree of uncertainty regarding the quality of the information provided. A substantial proportion of the data and findings presented in the studies was obtained through surveys and interviews. The implementation of real-time monitoring systems is becoming increasingly prevalent in construction sites. The objective of such systems is to provide immediate alerts in the event of potential hazards, thereby preventing a significant number of near misses. Real-time monitoring systems employ a range of technologies, including ultrasonic technology, radio frequency identification (RFID), inertial measurement units (IMUs), real-time location systems (RTLSs), industrial cameras, wearable technology, motion sensors, and advanced IT technologies, among others;
  • The analysis of acquired near-miss data is primarily conducted through the utilisation of quantitative and qualitative statistical methods, as evidenced by the examination of the Q 3 question. In recent years, research utilising artificial intelligence (AI) has made significant advances. The most commonly employed artificial intelligence techniques include text mining, machine learning, and artificial neural networks. The growing deployment of Building Information Modelling (BIM) technology has precipitated a profound transformation in the safety management of construction sites, with the advent of sophisticated tools for the identification and management of hazardous occurrences;
  • In response to question Q 4 , the study of near misses in the construction industry has identified several key aspects that have attracted the attention of researchers. These include the utilisation of both quantitative and qualitative methodologies for risk assessment, the analysis of the causes of hazards, the identification of accident precursors through the creation of time series, and the examination of material factors pertaining to construction processes. Researchers are focusing on the utilisation of both databases and advanced technologies, such as real-time location tracking, for the assessment and analysis of occupational risks. Techniques such as Analytic Hierarchy Process (AHP) and clustering facilitate a comprehensive assessment and categorisation of incidents, thereby enabling the identification of patterns and susceptibility to specific types of accidents. Moreover, the impact of a company’s safety climate and organisational culture on the frequency and characteristics of near misses represents a pivotal area of investigation. The findings of this research indicate that effective safety management requires a holistic approach that integrates technology, risk management and safety culture, with the objective of reducing accidents and enhancing overall working conditions on construction sites.
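As a compact illustration of one technique named in the conclusions, the Analytic Hierarchy Process (AHP) derives priority weights for risk factors from a pairwise comparison matrix. The sketch below uses the common row-geometric-mean approximation of the principal eigenvector; the comparison values and factor names are invented for illustration and do not come from the reviewed studies:

```python
import math

def ahp_weights(matrix):
    """Approximate AHP priority weights via the row geometric mean,
    normalised to sum to 1 (a standard shortcut for the principal
    eigenvector of a pairwise comparison matrix)."""
    n = len(matrix)
    gm = [math.prod(row) ** (1.0 / n) for row in matrix]
    total = sum(gm)
    return [g / total for g in gm]

# Hypothetical pairwise comparisons of three near-miss risk factors
# (e.g. falls vs. struck-by vs. electrical), on Saaty's 1-9 scale:
# matrix[i][j] states how much more important factor i is than factor j.
pairwise = [
    [1.0, 3.0, 5.0],
    [1 / 3, 1.0, 2.0],
    [1 / 5, 1 / 2, 1.0],
]

weights = ahp_weights(pairwise)  # highest weight -> most critical factor
```

In a full AHP application the consistency ratio of the matrix would also be checked before trusting the weights.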

7. Gaps and Future Research Directions, Limitations

  • Given the diversity and variability of construction sites and the changing conditions and circumstances of work, it is essential to create homogeneous clusters of near misses and to analyse the phenomena within these clusters. The formation of such clusters may be contingent upon the direct causes of the events in question;
  • Given the inherently dynamic nature of construction, it is essential to analyse time series of events that indicate trends in development and safety levels. The numerical characteristics of these trends may be used to construct predictive models for future accidents and near misses;
  • The authors have identified potential avenues for future research, which could involve the development of mathematical models using techniques such as linear regression, artificial intelligence, and machine learning. The objective of these models is to predict the probable timing of occupational accidents within defined incident categories, utilising data from near misses. Moreover, efforts are being made to gain access to the hazardous incident recording systems of different construction companies, with a view to facilitating comparison of the resulting data;
  • One significant limitation of near-miss research is the lack of an integrated database that encompasses a diverse range of construction sites and construction work. A data resource of this nature would be of immense value for the purpose of conducting comprehensive analyses and formulating effective risk management strategies. This issue can be attributed to two factors: firstly, the reluctance of company managers to share their databases with researchers specialising in risk assessment, and secondly, the reluctance of employees to report near-miss incidents. Such actions may result in adverse consequences for employees, including disciplinary action or negative perceptions from managers. This consequently results in the recording of only a subset of incidents, thereby distorting the true picture of safety on the site.
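The predictive direction outlined in these bullets (fitting trends to time series of events) can be sketched in a few lines. Below, an ordinary least-squares line is fitted to hypothetical yearly near-miss counts for one incident cluster; the numbers are purely illustrative:

```python
def fit_linear_trend(years, counts):
    """Ordinary least-squares fit of counts ~ a + b * year; returns (a, b)."""
    n = len(years)
    mean_x = sum(years) / n
    mean_y = sum(counts) / n
    sxx = sum((x - mean_x) ** 2 for x in years)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(years, counts))
    b = sxy / sxx          # slope: change in count per year
    a = mean_y - b * mean_x  # intercept
    return a, b

# Hypothetical yearly near-miss counts for one homogeneous cluster.
years = [2019, 2020, 2021, 2022, 2023]
counts = [34, 41, 45, 52, 58]

a, b = fit_linear_trend(years, counts)
forecast_2024 = a + b * 2024  # naive one-year-ahead extrapolation
```

A positive slope signals a worsening trend for that cluster; in practice the models discussed above would use richer features than the year alone.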

Author Contributions

Institutional Review Board Statement

Informed Consent Statement

Data Availability Statement

Conflicts of Interest

Year | Source Title | DOI/ISBN/ISSN | Reference
1999 | Construction Management and Economics | 10.1080/014461999371691 | [ ]
2002 | Structural Engineer | 14665123 | [ ]
2009 | Building a Sustainable Future—Proceedings of the 2009 Construction Research Congress | 10.1061/41020(339)4 | [ ]
2010 | Safety Science | 10.1016/j.ssci.2010.04.009 | [ ]
2010 | Automation in Construction | 10.1016/j.autcon.2009.11.017 | [ ]
2010 | Safety Science | 10.1016/j.ssci.2009.06.006 | [ ]
2012 | Journal of Construction Engineering and Management | 10.1061/(ASCE)CO.1943-7862.0000518 | [ ]
2013 | ISARC 2013—30th International Symposium on Automation and Robotics in Construction and Mining, Held in Conjunction with the 23rd World Mining Congress | 10.22260/isarc2013/0113 | [ ]
2014 | Proceedings of the Institution of Civil Engineers: Civil Engineering | 10.1680/cien.14.00010 | [ ]
2014 | Safety Science | 10.1016/j.ssci.2013.12.012 | [ ]
2014 | Journal of Construction Engineering and Management | 10.1061/(ASCE)CO.1943-7862.0000795 | [ ]
2014 | 31st International Symposium on Automation and Robotics in Construction and Mining, ISARC 2014—Proceedings | 10.22260/isarc2014/0115 | [ ]
2014 | Construction Research Congress 2014: Construction in a Global Network—Proceedings of the 2014 Construction Research Congress | 10.1061/9780784413517.0181 | [ ]
2014 | Construction Research Congress 2014: Construction in a Global Network—Proceedings of the 2014 Construction Research Congress | 10.1061/9780784413517.0235 | [ ]
2014 | Construction Research Congress 2014: Construction in a Global Network—Proceedings of the 2014 Construction Research Congress | 10.1061/9780784413517.0096 | [ ]
2015 | Automation in Construction | 10.1016/j.autcon.2015.09.003 | [ ]
2015 | 32nd International Symposium on Automation and Robotics in Construction and Mining: Connected to the Future, Proceedings | 10.22260/isarc2015/0062 | [ ]
2015 | ASSE Professional Development Conference and Exposition 2015 | - | [ ]
2015 | Congress on Computing in Civil Engineering, Proceedings | 10.1061/9780784479247.019 | [ ]
2016 | Automation in Construction | 10.1016/j.autcon.2016.03.008 | [ ]
2016 | Automation in Construction | 10.1016/j.autcon.2016.04.007 | [ ]
2016 | IEEE IAS Electrical Safety Workshop | 10.1109/ESW.2016.7499701 | [ ]
2016 | Journal of Construction Engineering and Management | 10.1061/(ASCE)CO.1943-7862.0001100 | [ ]
2016 | Safety Science | 10.1016/j.ssci.2015.11.025 | [ ]
2016 | Journal of Construction Engineering and Management | 10.1061/(ASCE)CO.1943-7862.0001049 | [ ]
2016 | IEEE Transactions on Industry Applications | 10.1109/TIA.2015.2461180 | [ ]
2017 | Safety Science | 10.1016/j.ssci.2017.06.012 | [ ]
2017 | ENR (Engineering News-Record) | 8919526 | [ ]
2017 | 6th CSCE-CRC International Construction Specialty Conference 2017—Held as Part of the Canadian Society for Civil Engineering Annual Conference and General Meeting 2017 | 978-151087841-9 | [ ]
2017 | Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) | 10.1007/978-3-319-72323-5_12 | [ ]
2017 | Journal of Construction Engineering and Management | 10.1061/(ASCE)CO.1943-7862.0001209 | [ ]
2017 | Safety Science | 10.1016/j.ssci.2016.08.027 | [ ]
2017 | Safety Science | 10.1016/j.ssci.2016.08.022 | [ ]
2018 | Safety Science | 10.1016/j.ssci.2018.04.004 | [ ]
2018 | International Journal of Construction Management | 10.1080/15623599.2017.1382067 | [ ]
2018 | Journal of Construction Engineering and Management | 10.1061/(ASCE)CO.1943-7862.0001420 | [ ]
2018 | Proceedings of SPIE—The International Society for Optical Engineering | 10.1117/12.2296548 | [ ]
2019 | Automation in Construction | 10.1016/j.autcon.2019.102854 | [ ]
2019 | Physica A: Statistical Mechanics and its Applications | 10.1016/j.physa.2019.121495 | [ ]
2019 | Sustainability (Switzerland) | 10.3390/su11051264 | [ ]
2019 | Computing in Civil Engineering 2019: Data, Sensing, and Analytics—Selected Papers from the ASCE International Conference on Computing in Civil Engineering 2019 | 978-078448243-8 | [ ]
2019 | Journal of Health, Safety and Environment | 18379362 | [ ]
2019 | Computing in Civil Engineering 2019: Data, Sensing, and Analytics—Selected Papers from the ASCE International Conference on Computing in Civil Engineering 2019 | 978-078448243-8 | [ ]
2019 | Computing in Civil Engineering 2019: Smart Cities, Sustainability, and Resilience—Selected Papers from the ASCE International Conference on Computing in Civil Engineering 2019 | 10.1061/9780784482445.026 | [ ]
2019 | Journal of Construction Engineering and Management | 10.1061/(ASCE)CO.1943-7862.0001582 | [ ]
2019 | Advances in Intelligent Systems and Computing | 10.1007/978-3-030-02053-8_107 | [ ]
2020 | Accident Analysis and Prevention | 10.1016/j.aap.2020.105496 | [ ]
2020 | Advanced Engineering Informatics | 10.1016/j.aei.2020.101062 | [ ]
2020 | Advanced Engineering Informatics | 10.1016/j.aei.2020.101060 | [ ]
2020 | ARCOM 2020—Association of Researchers in Construction Management, 36th Annual Conference 2020—Proceedings | 978-099554633-2 | [ ]
2020 | International Journal of Building Pathology and Adaptation | 10.1108/IJBPA-03-2020-0018 | [ ]
2020 | Communications in Computer and Information Science | 10.1007/978-3-030-42852-5_8 | [ ]
2021 | Journal of Architectural Engineering | 10.1061/(ASCE)AE.1943-5568.0000501 | [ ]
2021 | Safety Science | 10.1016/j.ssci.2021.105368 | [ ]
2021 | ACM International Conference Proceeding Series | 10.1145/3482632.3487473 | [ ]
2021 | Reliability Engineering and System Safety | 10.1016/j.ress.2021.107687 | [ ]
2021 | Proceedings of the 37th Annual ARCOM Conference, ARCOM 2021 | - | [ ]
2022 | Buildings | 10.3390/buildings12111855 | [ ]
2022 | Safety Science | 10.1016/j.ssci.2022.105704 | [ ]
2022 | Sensors | 10.3390/s22093482 | [ ]
2022 | Proceedings of International Structural Engineering and Construction | 10.14455/ISEC.2022.9(2).CSA-03 | [ ]
2022 | Journal of Information Technology in Construction | 10.36680/j.itcon.2022.045 | [ ]
2022 | Forensic Engineering 2022: Elevating Forensic Engineering—Selected Papers from the 9th Congress on Forensic Engineering | 10.1061/9780784484555.005 | [ ]
2022 | Computational Intelligence and Neuroscience | 10.1155/2022/4851615 | [ ]
2022 | International Journal of Construction Management | 10.1080/15623599.2020.1839704 | [ ]
2023 | Journal of Construction Engineering and Management | 10.1061/JCEMD4.COENG-13979 | [ ]
2023 | Heliyon | 10.1016/j.heliyon.2023.e21607 | [ ]
2023 | Accident Analysis and Prevention | 10.1016/j.aap.2023.107224 | [ ]
2023 | Safety | 10.3390/safety9030047 | [ ]
2023 | Engineering, Construction and Architectural Management | 10.1108/ECAM-09-2021-0797 | [ ]
2023 | Advanced Engineering Informatics | 10.1016/j.aei.2023.101929 | [ ]
2023 | Engineering, Construction and Architectural Management | 10.1108/ECAM-05-2023-0458 | [ ]
2023 | Intelligent Automation and Soft Computing | 10.32604/iasc.2023.031359 | [ ]
2023 | International Journal of Construction Management | 10.1080/15623599.2020.1847405 | [ ]
2024 | Heliyon | 10.1016/j.heliyon.2024.e26410 | [ ]
  • Occupational Risk|Safety and Health at Work EU-OSHA. Available online: https://osha.europa.eu/en/tools-and-resources/eu-osha-thesaurus/term/70194i (accessed on 28 June 2023).
  • Guo, S.; Zhou, X.; Tang, B.; Gong, P. Exploring the Behavioral Risk Chains of Accidents Using Complex Network Theory in the Construction Industry. Phys. A Stat. Mech. Its Appl. 2020 , 560 , 125012. [ Google Scholar ] [ CrossRef ]
  • Woźniak, Z.; Hoła, B. The Structure of near Misses and Occupational Accidents in the Polish Construction Industry. Heliyon 2024 , 10 , e26410. [ Google Scholar ] [ CrossRef ]
  • Li, X.; Sun, W.; Fu, H.; Bu, Q.; Zhang, Z.; Huang, J.; Zang, D.; Sun, Y.; Ma, Y.; Wang, R.; et al. Schedule Risk Model of Water Intake Tunnel Construction Considering Mood Factors and Its Application. Sci. Rep. 2024 , 14 , 3857. [ Google Scholar ] [ CrossRef ]
  • Li, X.; Huang, J.; Li, C.; Luo, N.; Lei, W.; Fan, H.; Sun, Y.; Chen, W. Study on Construction Resource Optimization and Uncertain Risk of Urban Sewage Pipe Network. Period. Polytech. Civ. Eng. 2022 , 66 , 335–343. [ Google Scholar ] [ CrossRef ]
  • Central Statistical Office Central Statistical Office/Thematic Areas/Labor Market/Working Conditions/Accidents at Work/Accidents at Work in the 1st Quarter of 2024. Available online: https://stat.gov.pl/obszary-tematyczne/rynek-pracy/warunki-pracy-wypadki-przy-pracy/wypadki-przy-pracy-w-1-kwartale-2024-roku,3,55.html (accessed on 17 July 2024).
  • Manzo, J. The $5 Billion Cost of Construction Fatalities in the United States: A 50 State Comparison ; The Midwest Economic Policy Institute (MEPI): Saint Paul, MN, USA, 2017. [ Google Scholar ]
  • Sousa, V.; Almeida, N.M.; Dias, L.A. Risk-Based Management of Occupational Safety and Health in the Construction Industry—Part 1: Background Knowledge. Saf. Sci. 2014 , 66 , 75–86. [ Google Scholar ] [ CrossRef ]
  • Amirah, N.A.; Him, N.F.N.; Rashid, A.; Rasheed, R.; Zaliha, T.N.; Afthanorhan, A. Fostering a Safety Culture in Manufacturing through Safety Behavior: A Structural Equation Modelling Approach. J. Saf. Sustain. 2024; in press . [ Google Scholar ] [ CrossRef ]
  • Heinrich, H.W. Industrial Accident Prevention: A Scientific Approach ; McGraw-Hill: New York, NY, USA, 1931. [ Google Scholar ]
  • Near Miss Definition Per OSHA—What Is a Near Miss? Available online: https://safetystage.com/osha-compliance/near-miss-definition-osha/ (accessed on 17 August 2024).
  • Cambraia, F.B.; Saurin, T.A.; Formoso, C.T. Identification, Analysis and Dissemination of Information on near Misses: A Case Study in the Construction Industry. Saf. Sci. 2010 , 48 , 91–99. [ Google Scholar ] [ CrossRef ]
  • Tan, J.; Li, M. How to Achieve Accurate Accountability under Current Administrative Accountability System for Work Safety Accidents in Chemical Industry in China: A Case Study on Major Work Safety Accidents during 2010–2020. J. Chin. Hum. Resour. Manag. 2022 , 13 , 26–40. [ Google Scholar ] [ CrossRef ]
  • Wu, W.; Gibb, A.G.F.; Li, Q. Accident Precursors and near Misses on Construction Sites: An Investigative Tool to Derive Information from Accident Databases. Saf. Sci. 2010 , 48 , 845–858. [ Google Scholar ] [ CrossRef ]
  • Janicak, C.A. Fall-Related Deaths in the Construction Industry. J. Saf. Res. 1998 , 29 , 35–42. [ Google Scholar ] [ CrossRef ]
  • Li, H.; Yang, X.; Wang, F.; Rose, T.; Chan, G.; Dong, S. Stochastic State Sequence Model to Predict Construction Site Safety States through Real-Time Location Systems. Saf. Sci. 2016 , 84 , 78–87. [ Google Scholar ] [ CrossRef ]
  • Yang, K.; Aria, S.; Ahn, C.R.; Stentz, T.L. Automated Detection of Near-Miss Fall Incidents in Iron Workers Using Inertial Measurement Units. In Proceedings of the Construction Research Congress 2014: Construction in a Global Network, Atlanta, GA, USA, 19–21 May 2014; pp. 935–944. [ Google Scholar ] [ CrossRef ]
  • Raviv, G.; Fishbain, B.; Shapira, A. Analyzing Risk Factors in Crane-Related near-Miss and Accident Reports. Saf. Sci. 2017 , 91 , 192–205. [ Google Scholar ] [ CrossRef ]
  • Zhao, X.; Zhang, M.; Cao, T. A Study of Using Smartphone to Detect and Identify Construction Workers’ near-Miss Falls Based on ANN. In Proceedings of the Nondestructive Characterization and Monitoring of Advanced Materials, Aerospace, Civil Infrastructure, and Transportation XII, Denver, CO, USA, 4–8 March 2018; p. 80. [ Google Scholar ] [ CrossRef ]
  • Santiago, K.; Yang, X.; Ruano-Herreria, E.C.; Chalmers, J.; Cavicchia, P.; Caban-Martinez, A.J. Characterising near Misses and Injuries in the Temporary Agency Construction Workforce: Qualitative Study Approach. Occup. Environ. Med. 2020 , 77 , 94–99. [ Google Scholar ] [ CrossRef ] [ PubMed ]
  • What Is OSHA’s Definition of a Near Miss. Available online: https://www.osha.com/blog/near-miss-definition (accessed on 4 August 2023).
  • Martins, I. Investigation of Occupational Accidents and Diseases a Practical Guide for Labour Inspectors ; International Labour Office: Geneva, Switzerland, 2015. [ Google Scholar ]
  • National Safety Council. Near Miss Reporting Systems ; National Safety Council: Singapore, 2013. [ Google Scholar ]
  • PKN PN-ISO 45001:2018-06 ; Occupational Health and Safety Management Systems—Requirements with Guidance for Use. CRC Press: Boca Raton, FL, USA, 2019.
  • PKN PN-N-18001:2004 ; Occupational Health and Safety Management Systems—Requirements. CRC Press: Boca Raton, FL, USA, 2004.
  • World Health Organisation. WHO Draft GuiDelines for Adverse Event Reporting and Learning Systems ; World Health Organisation: Geneva, Switzerland, 2005. [ Google Scholar ]
  • International Atomic Energy Agency. IAEA Safety Glossary: Terminology Used in Nuclear Safety and Radiation Protection, 2007 Edition ; International Atomic Energy Agency: Vienna, Austria, 2007. [ Google Scholar ]
  • Marks, E.; Teizer, J.; Hinze, J. Near Miss Reporting Program to Enhance Construction Worker Safety Performance. In Proceedings of the Construction Research Congress 2014: Construction in a Global Network, Atlanta, GA, USA, 19 May 2014; pp. 2315–2324. [ Google Scholar ] [ CrossRef ]
  • Gnoni, M.G.; Saleh, J.H. Near-Miss Management Systems and Observability-in-Depth: Handling Safety Incidents and Accident Precursors in Light of Safety Principles. Saf. Sci. 2017 , 91 , 154–167. [ Google Scholar ] [ CrossRef ]
  • Thoroman, B.; Goode, N.; Salmon, P. System Thinking Applied to near Misses: A Review of Industry-Wide near Miss Reporting Systems. Theor. Issues Ergon. Sci. 2018 , 19 , 712–737. [ Google Scholar ] [ CrossRef ]
  • Gnoni, M.G.; Tornese, F.; Guglielmi, A.; Pellicci, M.; Campo, G.; De Merich, D. Near Miss Management Systems in the Industrial Sector: A Literature Review. Saf. Sci. 2022 , 150 , 105704. [ Google Scholar ] [ CrossRef ]
  • Bird, F. Management Guide to Loss Control ; Loss Control Publications: Houston, TX, USA, 1975. [ Google Scholar ]
  • Zimmermann. Bauer International Norms and Identity ; Zimmermann: Sydney, NSW, Australia, 2006; pp. 5–21. [ Google Scholar ]
  • Arslan, M.; Cruz, C.; Ginhac, D. Semantic Trajectory Insights for Worker Safety in Dynamic Environments. Autom. Constr. 2019 , 106 , 102854. [ Google Scholar ] [ CrossRef ]
  • Arslan, M.; Cruz, C.; Ginhac, D. Visualizing Intrusions in Dynamic Building Environments for Worker Safety. Saf. Sci. 2019 , 120 , 428–446. [ Google Scholar ] [ CrossRef ]
  • Zhou, C.; Chen, R.; Jiang, S.; Zhou, Y.; Ding, L.; Skibniewski, M.J.; Lin, X. Human Dynamics in Near-Miss Accidents Resulting from Unsafe Behavior of Construction Workers. Phys. A Stat. Mech. Its Appl. 2019 , 530 , 121495. [ Google Scholar ] [ CrossRef ]
  • Chen, F.; Wang, C.; Wang, J.; Zhi, Y.; Wang, Z. Risk Assessment of Chemical Process Considering Dynamic Probability of near Misses Based on Bayesian Theory and Event Tree Analysis. J. Loss Prev. Process Ind. 2020 , 68 , 104280. [ Google Scholar ] [ CrossRef ]
  • Wright, L.; Van Der Schaaf, T. Accident versus near Miss Causation: A Critical Review of the Literature, an Empirical Test in the UK Railway Domain, and Their Implications for Other Sectors. J. Hazard. Mater. 2004 , 111 , 105–110. [ Google Scholar ] [ CrossRef ] [ PubMed ]
  • Saleh, J.H.; Saltmarsh, E.A.; Favarò, F.M.; Brevault, L. Accident Precursors, near Misses, and Warning Signs: Critical Review and Formal Definitions within the Framework of Discrete Event Systems. Reliab. Eng. Syst. Saf. 2013 , 114 , 148–154. [ Google Scholar ] [ CrossRef ]
  • Manuele, F.A. Reviewing Heinrich. Prof. Saf. 2011 , 56 , 52–61. [ Google Scholar ]
  • Love, P.E.D.; Tenekedjiev, K. Understanding Near-Miss Count Data on Construction Sites Using Greedy D-Vine Copula Marginal Regression: A Comment. Reliab. Eng. Syst. Saf. 2022 , 217 , 108021. [ Google Scholar ] [ CrossRef ]
  • van Eck, N.J.; Waltman, L. VOSviewer Manual ; Universiteit Leiden: Leiden, The Netherlands, 2015. [ Google Scholar ]
  • Scopus. Content Coverage Guide ; Elsevier: Amsterdam, The Netherlands, 2023; pp. 1–24. [ Google Scholar ]
  • Lukic, D.; Littlejohn, A.; Margaryan, A. A Framework for Learning from Incidents in the Workplace. Saf. Sci. 2012 , 50 , 950–957. [ Google Scholar ] [ CrossRef ]
  • Teizer, J.; Cheng, T. Proximity Hazard Indicator for Workers-on-Foot near Miss Interactions with Construction Equipment and Geo-Referenced Hazard Area. Autom. Constr. 2015 , 60 , 58–73. [ Google Scholar ] [ CrossRef ]
  • Zong, L.; Fu, G. A Study on Designing No-Penalty Reporting System about Enterprise Staff’s near Miss. Adv. Mater. Res. 2011 , 255–260 , 3846–3851. [ Google Scholar ] [ CrossRef ]
  • Golovina, O.; Teizer, J.; Pradhananga, N. Heat Map Generation for Predictive Safety Planning: Preventing Struck-by and near Miss Interactions between Workers-on-Foot and Construction Equipment. Autom. Constr. 2016 , 71 , 99–115. [ Google Scholar ] [ CrossRef ]
  • Zou, P.X.W.; Lun, P.; Cipolla, D.; Mohamed, S. Cloud-Based Safety Information and Communication System in Infrastructure Construction. Saf. Sci. 2017 , 98 , 50–69. [ Google Scholar ] [ CrossRef ]
  • Hinze, J.; Godfrey, R. An Evaluation of Safety Performance Measures for Construction Projects. J. Constr. Res. 2011 , 4 , 5–15. [ Google Scholar ] [ CrossRef ]
  • Construction Inspection Software|IAuditor by SafetyCulture. Available online: https://safetyculture.com/construction/ (accessed on 25 August 2023).
  • Incident Reporting Made Easy|Safety Compliance|Mobile EHS Solutions. Available online: https://www.safety-reports.com/lp/safety/incident/ (accessed on 25 August 2023).
  • Wu, F.; Wu, T.; Yuce, M.R. An Internet-of-Things (IoT) Network System for Connected Safety and Health Monitoring Applications. Sensors 2019 , 19 , 21. [ Google Scholar ] [ CrossRef ] [ PubMed ]
  • Fang, W.; Luo, H.; Xu, S.; Love, P.E.D.; Lu, Z.; Ye, C. Automated Text Classification of Near-Misses from Safety Reports: An Improved Deep Learning Approach. Adv. Eng. Inform. 2020 , 44 , 101060. [ Google Scholar ] [ CrossRef ]
  • Gatti, U.C.; Lin, K.-Y.; Caldera, C.; Chiang, R. Exploring the Relationship between Chronic Sleep Deprivation and Safety on Construction Sites. In Proceedings of the Construction Research Congress 2014: Construction in a Global Network, Atlanta, GA, USA, 19–24 May 2014; pp. 1772–1781. [ Google Scholar ] [ CrossRef ]
  • Hon, C.K.H.; Chan, A.P.C.; Yam, M.C.H. Relationships between Safety Climate and Safety Performance of Building Repair, Maintenance, Minor Alteration, and Addition (RMAA) Works. Saf. Sci. 2014 , 65 , 10–19. [ Google Scholar ] [ CrossRef ]
  • Oni, O.; Olanrewaju, A.; Cheen, K.S. Accidents at construction sites and near-misses: A constant problem. Int. Struct. Eng. Constr. 2022 , 9 , 2022. [ Google Scholar ] [ CrossRef ]
  • Wu, W.; Yang, H.; Chew, D.A.S.; Yang, S.-H.; Gibb, A.G.F.; Li, Q. Towards an Autonomous Real-Time Tracking System of near-Miss Accidents on Construction Sites. Autom. Constr. 2010 , 19 , 134–141. [ Google Scholar ] [ CrossRef ]
  • Aria, S.S.; Yang, K.; Ahn, C.R.; Vuran, M.C. Near-Miss Accident Detection for Ironworkers Using Inertial Measurement Unit Sensors. In Proceedings of the International Symposium on Automation and Robotics in Construction, ISARC 2014, Sydney, Australia, 9–11 July 2014; Volume 31, pp. 854–859. [ Google Scholar ] [ CrossRef ]
  • Hasanzadeh, S.; de la Garza, J.M. Productivity-Safety Model: Debunking the Myth of the Productivity-Safety Divide through a Mixed-Reality Residential Roofing Task. J. Constr. Eng. Manag. 2020 , 146 , 04020124. [ Google Scholar ] [ CrossRef ]
  • Teizer, J. Magnetic Field Proximity Detection and Alert Technology for Safe Heavy Construction Equipment Operation. In Proceedings of the 32nd International Symposium on Automation and Robotics in Construction, Oulu, Finland, 15–18 June 2015. [ Google Scholar ] [ CrossRef ]
  • Mohajeri, M.; Ardeshir, A.; Banki, M.T.; Malekitabar, H. Discovering Causality Patterns of Unsafe Behavior Leading to Fall Hazards on Construction Sites. Int. J. Constr. Manag. 2022 , 22 , 3034–3044. [ Google Scholar ] [ CrossRef ]
  • Kisaezehra; Farooq, M.U.; Bhutto, M.A.; Kazi, A.K. Real-Time Safety Helmet Detection Using Yolov5 at Construction Sites. Intell. Autom. Soft Comput. 2023 , 36 , 911–927. [ Google Scholar ] [ CrossRef ]
  • Li, C.; Ding, L. Falling Objects Detection for near Miss Incidents Identification on Construction Site. In Proceedings of the ASCE International Conference on Computing in Civil Engineering, Atlanta, GA, USA, 17–19 June 2019; pp. 138–145. [ Google Scholar ] [ CrossRef ]
  • Jeelani, I.; Ramshankar, H.; Han, K.; Albert, A.; Asadi, K. Real-Time Hazard Proximity Detection—Localization of Workers Using Visual Data. In Proceedings of the ASCE International Conference on Computing in Civil Engineering, Atlanta, GA, USA, 17–19 June 2019; pp. 281–289. [ Google Scholar ] [ CrossRef ]
  • Lim, T.-K.; Park, S.-M.; Lee, H.-C.; Lee, D.-E. Artificial Neural Network–Based Slip-Trip Classifier Using Smart Sensor for Construction Workplace. J. Constr. Eng. Manag. 2015 , 142 , 04015065. [ Google Scholar ] [ CrossRef ]
  • Yang, K.; Jebelli, H.; Ahn, C.R.; Vuran, M.C. Threshold-Based Approach to Detect Near-Miss Falls of Iron Workers Using Inertial Measurement Units. In Proceedings of the 2015 International Workshop on Computing in Civil Engineering, Austin, TX, USA, 21–23 June 2015; pp. 148–155. [ Google Scholar ] [ CrossRef ]
  • Yang, K.; Ahn, C.R.; Vuran, M.C.; Aria, S.S. Semi-Supervised near-Miss Fall Detection for Ironworkers with a Wearable Inertial Measurement Unit. Autom. Constr. 2016 , 68 , 194–202. [ Google Scholar ] [ CrossRef ]
  • Raviv, G.; Shapira, A.; Fishbain, B. AHP-Based Analysis of the Risk Potential of Safety Incidents: Case Study of Cranes in the Construction Industry. Saf. Sci. 2017 , 91 , 298–309. [ Google Scholar ] [ CrossRef ]
  • Saurin, T.A.; Formoso, C.T.; Reck, R.; Beck da Silva Etges, B.M.; Ribeiro, J.L.D. Findings from the Analysis of Incident-Reporting Systems of Construction Companies. J. Constr. Eng. Manag. 2015 , 141 , 05015007. [ Google Scholar ] [ CrossRef ]
  • Williams, E.; Sherratt, F.; Norton, E. Exploring the Value in near Miss Reporting for Construction Safety. In Proceedings of the 37th Annual Conference, Virtual Event, 6–10 December 2021; pp. 319–328. [ Google Scholar ]
  • Baker, H.; Smith, S.; Masterton, G.; Hewlett, B. Data-Led Learning: Using Natural Language Processing (NLP) and Machine Learning to Learn from Construction Site Safety Failures. In Proceedings of the 36th Annual ARCOM Conference, Online, 7–8 September 2020; pp. 356–365. [ Google Scholar ]
  • Jin, R.; Wang, F.; Liu, D. Dynamic Probabilistic Analysis of Accidents in Construction Projects by Combining Precursor Data and Expert Judgments. Adv. Eng. Inform. 2020 , 44 , 101062. [ Google Scholar ] [ CrossRef ]
  • Zhou, Z.; Li, C.; Mi, C.; Qian, L. Exploring the Potential Use of Near-Miss Information to Improve Construction Safety Performance. Sustainability 2019 , 11 , 1264. [ Google Scholar ] [ CrossRef ]
  • Boateng, E.B.; Pillay, M.; Davis, P. Predicting the Level of Safety Performance Using an Artificial Neural Network. Adv. Intell. Syst. Comput. 2019 , 876 , 705–710. [ Google Scholar ] [ CrossRef ]
  • Zhang, M.; Cao, T.; Zhao, X. Using Smartphones to Detect and Identify Construction Workers’ Near-Miss Falls Based on ANN. J. Constr. Eng. Manag. 2018 , 145 , 04018120. [ Google Scholar ] [ CrossRef ]
  • Gadekar, H.; Bugalia, N. Automatic Classification of Construction Safety Reports Using Semi-Supervised YAKE-Guided LDA Approach. Adv. Eng. Inform. 2023 , 56 , 101929. [ Google Scholar ] [ CrossRef ]
  • Zhu, Y.; Liao, H.; Huang, D. Using Text Mining and Multilevel Association Rules to Process and Analyze Incident Reports in China. Accid. Anal. Prev. 2023 , 191 , 107224. [ Google Scholar ] [ CrossRef ] [ PubMed ]
  • Li, M.; Lin, Q.; Jin, H. Research on Near-Miss Incidents Monitoring and Early Warning System for Building Construction Sites Based on Blockchain Technology. J. Constr. Eng. Manag. 2023 , 149 , 04023124. [ Google Scholar ] [ CrossRef ]
  • Chung, W.W.S.; Tariq, S.; Mohandes, S.R.; Zayed, T. IoT-Based Application for Construction Site Safety Monitoring. Int. J. Constr. Manag. 2020 , 23 , 58–74. [ Google Scholar ] [ CrossRef ]
  • Liu, X.; Xu, F.; Zhang, Z.; Sun, K. Fall-Portent Detection for Construction Sites Based on Computer Vision and Machine Learning. Eng. Constr. Archit. Manag. 2023; ahead-of-print . [ Google Scholar ] [ CrossRef ]
  • Abbasi, H.; Guerrieri, A.; Lee, J.; Yang, K. Mobile Device-Based Struck-By Hazard Recognition in Construction Using a High-Frequency Sound. Sensors 2022 , 22 , 3482. [ Google Scholar ] [ CrossRef ] [ PubMed ]
  • Wang, F.; Li, H.; Dong, C. Understanding Near-Miss Count Data on Construction Sites Using Greedy D-Vine Copula Marginal Regression. Reliab. Eng. Syst. Saf. 2021 , 213 , 107687. [ Google Scholar ] [ CrossRef ]
  • Bugalia, N.; Tarani, V.; Student, G.; Kedia, J.; Gadekar, H. Machine Learning-Based Automated Classification Of Worker-Reported Safety Reports In Construction. J. Inf. Technol. Constr. 2022 , 27 , 926–950. [ Google Scholar ] [ CrossRef ]
  • Chen, S.; Xi, J.; Chen, Y.; Zhao, J. Association Mining of Near Misses in Hydropower Engineering Construction Based on Convolutional Neural Network Text Classification. Comput. Intell. Neurosci. 2022 , 2022 , 4851615. [ Google Scholar ] [ CrossRef ] [ PubMed ]
  • Tang, S.; Golparvar-Fard, M.; Naphade, M.; Gopalakrishna, M.M. Video-Based Activity Forecasting for Construction Safety Monitoring Use Cases. In Proceedings of the ASCE International Conference on Computing in Civil Engineering, Atlanta, GA, USA, 17–19 June 2019; pp. 204–210. [ Google Scholar ] [ CrossRef ]
  • Rashid, K.M.; Behzadan, A.H. Risk Behavior-Based Trajectory Prediction for Construction Site Safety Monitoring. J. Constr. Eng. Manag. 2018 , 144 , 04017106. [ Google Scholar ] [ CrossRef ]
  • Shen, X.; Marks, E. Near-Miss Information Visualization Tool in BIM for Construction Safety. J. Constr. Eng. Manag. 2016 , 142 , 04015100. [ Google Scholar ] [ CrossRef ]
  • Erusta, N.E.; Sertyesilisik, B. An Investigation into Improving Occupational Health and Safety Performance of Construction Projects through Usage of BIM for Lean Management. In Communications in Computer and Information Science (CCIS) ; Springer: Berlin/Heidelberg, Germany, 2020; Volume 1188, pp. 91–100. [ Google Scholar ] [ CrossRef ]
  • Coffland, M.M.; Kim, A.; Sadatsafavi, H.; Uber, M.M. Improved Data Storage for Better Safety Analysis and Decision Making in Large Construction Management Firms. Available online: https://www.researchgate.net/publication/320474383_Improved_Data_Storage_for_Better_Safety_Analysis_and_Decision_Making_in_Large_Construction_Management_Firms (accessed on 12 June 2024).
  • Zhou, Z.; Li, Q.; Wu, W. Developing a Versatile Subway Construction Incident Database for Safety Management. J. Constr. Eng. Manag. 2011 , 138 , 1169–1180. [ Google Scholar ] [ CrossRef ]
  • Wu, W.; Yang, H.; Li, Q.; Chew, D. An Integrated Information Management Model for Proactive Prevention of Struck-by-Falling-Object Accidents on Construction Sites. Autom. Constr. 2013 , 34 , 67–74. [ Google Scholar ] [ CrossRef ]
  • Hoła, B. Identification and Evaluation of Processes in a Construction Enterprise. Arch. Civ. Mech. Eng. 2015 , 15 , 419–426. [ Google Scholar ] [ CrossRef ]
  • Zhou, C.; Ding, L.; Skibniewski, M.J.; Luo, H.; Jiang, S. Characterizing Time Series of Near-Miss Accidents in Metro Construction via Complex Network Theory. Saf. Sci. 2017 , 98 , 145–158. [ Google Scholar ] [ CrossRef ]
  • Woźniak, Z.; Hoła, B. Time Series Analysis of Hazardous Events Based on Data Recorded in a Polish Construction Company. Arch. Civ. Eng. 2024; in process . [ Google Scholar ]
  • Drozd, W. Characteristics of Construction Site in Terms of Occupational Safety. J. Civ. Eng. Environ. Archit. 2016 , 63 , 165–172. [ Google Scholar ]
  • Meliá, J.L.; Mearns, K.; Silva, S.A.; Lima, M.L. Safety Climate Responses and the Perceived Risk of Accidents in the Construction Industry. Saf. Sci. 2008 , 46 , 949–958. [ Google Scholar ] [ CrossRef ]
  • Bugalia, N.; Maemura, Y.; Ozawa, K. A System Dynamics Model for Near-Miss Reporting in Complex Systems. Saf. Sci. 2021 , 142 , 105368. [ Google Scholar ] [ CrossRef ]
  • Gyi, D.E.; Gibb, A.G.F.; Haslam, R.A. The Quality of Accident and Health Data in the Construction Industry: Interviews with Senior Managers. Constr. Manag. Econ. 1999 , 17 , 197–204. [ Google Scholar ] [ CrossRef ]
  • Menzies, J. Structural Safety: Learning and Warnings. Struct. Eng. 2002 , 80 , 15–16. [ Google Scholar ]
  • Fullerton, C.E.; Allread, B.S.; Teizer, J. Pro-Active-Real-Time Personnel Warning System. In Proceedings of the Construction Research Congress 2009: Building a Sustainable Future, Seattle, WA, USA, 5–7 April 2009; pp. 31–40. [ Google Scholar ] [ CrossRef ]
  • Marks, E.D.; Wetherford, J.E.; Teizer, J.; Yabuki, N. Potential of Leading Indicator Data Collection and Analysis for Proximity Detection and Alert Technology in Construction. In Proceedings of the 30th ISARC—International Symposium on Automation and Robotics in Construction Conference, Montreal, QC, Canada, 11–15 August 2013; pp. 1029–1036. [ Google Scholar ] [ CrossRef ]
  • Martin, H.; Lewis, T.M. Pinpointing Safety Leadership Factors for Safe Construction Sites in Trinidad and Tobago. J. Constr. Eng. Manag. 2014 , 140 , 04013046. [ Google Scholar ] [ CrossRef ]
  • Hobson, P.; Emery, D.; Brown, L.; Bashford, R.; Gill, J. People–Plant Interface Training: Targeting an Industry Fatal Risk. Proc. Inst. Civ. Eng. Civ. Eng. 2014 , 167 , 138–144. [ Google Scholar ] [ CrossRef ]
  • Marks, E.; Mckay, B.; Awolusi, I. Using near Misses to Enhance Safety Performance in Construction. In Proceedings of the ASSE Professional Development Conference and Exposition, Dallas, TX, USA, 7–10 June 2015. [ Google Scholar ]
  • Popp, J.D.; Scarborough, M.S. Investigations of near Miss Incidents—New Facility Construction and Commissioning Activities. IEEE Trans. Ind. Appl. 2016 , 53 , 615–621. [ Google Scholar ] [ CrossRef ]
  • Nickel, P.; Lungfiel, A.; Trabold, R.J. Reconstruction of near Misses and Accidents for Analyses from Virtual Reality Usability Study. In Lecture Notes in Computer Science ; Springer: Berlin/Heidelberg, Germany, 2017; Volume 10700, pp. 182–191. [ Google Scholar ] [ CrossRef ]
  • Gambatese, J.A.; Pestana, C.; Lee, H.W. Alignment between Lean Principles and Practices and Worker Safety Behavior. J. Constr. Eng. Manag. 2017 , 143 , 04016083. [ Google Scholar ] [ CrossRef ]
  • Van Voorhis, S.; Korman, R. Reading Signs of Trouble. Eng. News-Rec. 2017 , 278 , 14–17. [ Google Scholar ]
  • Doan, D.R. Investigation of a near-miss shock incident. IEEE Trans. Ind. Appl. 2016 , 52 , 560–561. [ Google Scholar ] [ CrossRef ]
  • Oswald, D.; Sherratt, F.; Smith, S. Problems with safety observation reporting: A construction industry case study. Saf. Sci. 2018 , 107 , 35–45. [ Google Scholar ] [ CrossRef ]
  • Raviv, G.; Shapira, A. Systematic approach to crane-related near-miss analysis in the construction industry. Int. J. Constr. Manag. 2018 , 18 , 310–320. [ Google Scholar ] [ CrossRef ]
  • Whiteoak, J.; Appleby, J. Mate, that was bloody close! A case history of a near-miss program in the Australian construction industry. J. Health Saf. Environ. 2019 , 35 , 31–43. [ Google Scholar ]
  • Duryan, M.; Smyth, H.; Roberts, A.; Rowlinson, S.; Sherratt, F. Knowledge transfer for occupational health and safety: Cultivating health and safety learning culture in construction firms. Accid. Anal. Prev. 2020 , 139 , 105496. [ Google Scholar ] [ CrossRef ] [ PubMed ]
  • Shaikh, A.Y.; Osei-Kyei, R.; Hardie, M. A critical analysis of safety performance indicators in construction. Int. J. Build. Pathol. Adapt. 2020 , 39 , 547–580. [ Google Scholar ] [ CrossRef ]
  • Martin, H.; Mohan, N.; Ellis, L.; Dunne, S. Exploring the Role of PPE Knowledge, Attitude, and Correct Practices in Safety Outcomes on Construction Sites. J. Archit. Eng. 2021 , 27 , 05021011. [ Google Scholar ] [ CrossRef ]
  • Qin, Z.; Wu, S. A simulation model of engineering construction near-miss event disclosure strategy based on evolutionary game theory. In Proceedings of the 2021 4th International Conference on Information Systems and Computer Aided Education, Dalian, China, 24–26 September 2021; pp. 2572–2577. [ Google Scholar ] [ CrossRef ]
  • Alamoudi, M. The Integration of NOSACQ-50 with Importance-Performance Analysis Technique to Evaluate and Analyze Safety Climate Dimensions in the Construction Sector in Saudi Arabia. Buildings 2022 , 12 , 1855. [ Google Scholar ] [ CrossRef ]
  • Herrmann, A.W. Development of CROSS in the United States. In Proceedings of the Forensic Engineering 2022: Elevating Forensic Engineering—Selected Papers from the 9th Congress on Forensic Engineering, Denver, Colorado, 4–7 November 2022; Volume 2, pp. 40–43. [ Google Scholar ] [ CrossRef ]
  • Al Shaaili, M.; Al Alawi, M.; Ekyalimpa, R.; Al Mawli, B.; Al-Mamun, A.; Al Shahri, M. Near-miss accidents data analysis and knowledge dissemination in water construction projects in Oman. Heliyon 2023 , 9 , e21607. [ Google Scholar ] [ CrossRef ] [ PubMed ]
  • Agnusdei, G.P.; Gnoni, M.G.; Tornese, F.; De Merich, D.; Guglielmi, A.; Pellicci, M. Application of Near-Miss Management Systems: An Exploratory Field Analysis in the Italian Industrial Sector. Safety 2023 , 9 , 47. [ Google Scholar ] [ CrossRef ]
  • Duan, P.; Zhou, J. A science mapping approach-based review of near-miss research in construction. Eng. Constr. Archit. Manag. 2023 , 30 , 2582–2601. [ Google Scholar ] [ CrossRef ]


No. | Name of Institution/Organization | Definition
1 | Occupational Safety and Health Administration (OSHA) [ ] | "A near-miss is a potential hazard or incident in which no property was damaged and no personal injury was sustained, but where, given a slight shift in time or position, damage or injury easily could have occurred. Near misses also may be referred to as close calls, near accidents, or injury-free events."
2 | International Labour Organization (ILO) [ ] | "An event, not necessarily defined under national laws and regulations, that could have caused harm to persons at work or to the public, e.g., a brick that falls off scaffolding but does not hit anyone"
3 | American National Safety Council (NSC) [ ] | "A Near Miss is an unplanned event that did not result in injury, illness, or damage—but had the potential to do so"
4 | PN-ISO 45001:2018-06 [ ] | A near-miss incident is described as an event that does not result in injury or health issues.
5 | PN-N-18001:2004 [ ] | A near-miss incident is an accident event without injury.
6 | World Health Organization (WHO) [ ] | Near misses have been defined as serious errors that had the potential to cause harm but did not, owing to chance or to interception.
7 | International Atomic Energy Agency (IAEA) [ ] | Near misses have been defined as potentially significant events that could have had consequences but did not, owing to the conditions prevailing at the time.
No. | Journal | Number of Publications
1 | Safety Science | 10
2 | Journal of Construction Engineering and Management | 8
3 | Automation in Construction | 5
4 | Advanced Engineering Informatics | 3
5 | Construction Research Congress 2014: Construction in a Global Network (conference proceedings) | 3
6 | International Journal of Construction Management | 3
7 | Accident Analysis and Prevention | 2
8 | Computing in Civil Engineering 2019: Data Sensing and Analytics (selected papers from the ASCE International Conference) | 2
9 | Engineering, Construction and Architectural Management | 2
10 | Heliyon | 2
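Per-journal tallies like the ones above can be reproduced mechanically from a bibliographic export. A minimal sketch in Python, assuming records parsed from something like a Scopus CSV export (the entries below are illustrative placeholders, not the review's actual dataset):

```python
from collections import Counter

# Hypothetical bibliographic records; only the journal field matters here.
records = [
    {"journal": "Safety Science"},
    {"journal": "Safety Science"},
    {"journal": "Automation in Construction"},
    {"journal": "Heliyon"},
]

def journals_by_count(records):
    """Tally publications per journal, most frequent first."""
    counts = Counter(r["journal"] for r in records)
    return counts.most_common()

# Print a ranked table similar to the one above.
for rank, (journal, n) in enumerate(journals_by_count(records), start=1):
    print(f"{rank} | {journal} | {n}")
```

`Counter.most_common()` sorts by descending count, which matches how such ranking tables are usually presented.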
Cluster Number | Colour | Basic Keywords
1 | blue | construction, construction sites, decision making, machine learning, near misses, neural networks, project management, safety, workers
2 | green | building industry, construction industry, construction projects, construction work, human, near miss, near misses, occupational accident, occupational safety, safety, management, safety performance
3 | red | accident prevention, construction equipment, construction, safety, construction workers, hazards, human resource management, leading indicators, machinery, occupational risks, risk management, safety engineering
4 | yellow | accidents, risk assessment, civil engineering, near miss, surveys
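Keyword clusters of this kind rest on co-occurrence counts: two keywords are linked whenever they appear on the same article, and tools such as VOSviewer group strongly linked keywords into coloured clusters. The counting step can be sketched as follows (the keyword lists are illustrative, not the actual Scopus data):

```python
from collections import Counter
from itertools import combinations

# Illustrative author-keyword lists, one per article.
articles = [
    ["near miss", "construction industry", "safety management"],
    ["near miss", "machine learning", "construction sites"],
    ["construction industry", "safety management", "risk assessment"],
]

def cooccurrence(articles):
    """Count how often each keyword pair appears in the same article.
    Pairs are stored in sorted order so (a, b) and (b, a) collapse."""
    pairs = Counter()
    for keywords in articles:
        for a, b in combinations(sorted(set(keywords)), 2):
            pairs[(a, b)] += 1
    return pairs

links = cooccurrence(articles)
print(links[("construction industry", "safety management")])  # -> 2
```

A clustering algorithm then partitions this weighted keyword graph so that heavily co-occurring terms land in the same cluster.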
No. | Question | References
Q Are near misses in the construction industry studied scientifically?[ , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , ]
Q What methods have been used to obtain information on near misses and systems for recording incidents in construction companies?[ , , , , , , , , , , , , , , , , , , , , ]
Q What methods have been used to analyse the information and figures that have been obtained?[ , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , ]
Q What are the key aspects of near misses in the construction industry that have been of interest to the researchers?[ , , , , , , , , , , , , ]

Share and Cite

Woźniak, Z.; Hoła, B. Analysing Near-Miss Incidents in Construction: A Systematic Literature Review. Appl. Sci. 2024 , 14 , 7260. https://doi.org/10.3390/app14167260


Article Metrics

Article access statistics, further information, mdpi initiatives, follow mdpi.

MDPI

Subscribe to receive issue release notifications and newsletters from MDPI journals

COMMENTS

  1. How to Do a Systematic Review: A Best Practice Guide for Conducting and

    The best reviews synthesize studies to draw broad theoretical conclusions about what a literature means, linking theory to evidence and evidence to theory. This guide describes how to plan, conduct, organize, and present a systematic review of quantitative (meta-analysis) or qualitative (narrative review, meta-synthesis) information.

  2. Guidance on Conducting a Systematic Literature Review

    Literature reviews establish the foundation of academic inquires. However, in the planning field, we lack rigorous systematic reviews. In this article, through a systematic search on the methodology of literature review, we categorize a typology of literature reviews, discuss steps in conducting a systematic literature review, and provide suggestions on how to enhance rigor in literature ...

  3. How-to conduct a systematic literature review: A quick guide for

    Overview. A Systematic Literature Review (SLR) is a research methodology to collect, identify, and critically analyze the available research studies (e.g., articles, conference proceedings, books, dissertations) through a systematic procedure .An SLR updates the reader with current literature about a subject .The goal is to review critical points of current knowledge on a topic about research ...

  4. Ten Steps to Conduct a Systematic Review

    It is usually the initial figure presented in the results section of your systematic review . ... 2016. How to do a systematic literature review in nursing: a step-by-step guide. [Google Scholar] 9. Utilization of the PICO framework to improve searching PubMed for clinical questions. Schardt C, Adams MB, Owens T, Keitz S, Fontelo P. BMC Med ...

  5. Conducting systematic literature reviews and bibliometric analyses

    The rationale for systematic literature reviews has been well established in some fields such as medicine for decades (e.g. Mulrow, 1994); however, there are still few methodological guidelines available in the management sciences on how to assemble and structure such reviews (for exceptions, see Denyer and Tranfield, 2009; Tranfield et al., 2003 and related publications).

  6. (PDF) A guide to systematic literature reviews

    The first stage in conducting a systematic. review is to develop a protocol that clearly defines: 1) the aims. and objectives of the review; 2) the inclusion and exclusion. criteria for studies ...

  7. Systematic reviews: Structure, form and content

    Abstract. This article aims to provide an overview of the structure, form and content of systematic reviews. It focuses in particular on the literature searching component, and covers systematic database searching techniques, searching for grey literature and the importance of librarian involvement in the search.

  8. (PDF) Systematic Literature Reviews: An Introduction

    The number of systematic reviews is increasing exponentially (Figure 1; see also Bastian et al., 2010). However, there are also methodological and practical challenges to systematic reviews. First, the ...

  9. Systematic reviews of the literature: an introduction to current

    Systematic reviews serve different purposes and use a different methodology than other types of evidence synthesis that include narrative reviews, scoping reviews, and overviews of reviews. Systematic reviews can address questions regarding effects of interventions or exposures, diagnostic properties of tests, and prevalence or prognosis of ...

  10. An overview of methodological approaches in systematic reviews

    Evidence synthesis is a prerequisite for knowledge translation. A well-conducted systematic review (SR), often in conjunction with meta-analyses (MA) when appropriate, is considered the "gold standard" of methods for synthesizing evidence related to a topic of interest. The central strength of an SR is the transparency of the methods used to systematically search ...

  11. PRISMA statement

    Here you can access information about the PRISMA reporting guidelines, which are designed to help authors transparently report why their systematic review was done, what methods they used, and what they found. The main PRISMA reporting guideline (the PRISMA 2020 statement) primarily provides guidance for the reporting of systematic reviews ...

  12. Chapter 14: Completing 'Summary of findings' tables and ...

    The highest certainty rating is assigned to a body of evidence when there are no concerns in any of the GRADE factors listed in Figure 14.2.a. Review authors often downgrade evidence to moderate ... Systematic Reviews in Health Care: Meta-analysis in Context. 2nd ed. London (UK): ...

  13. (PDF) Systematic Literature Review: Some Examples

    An example of a systematic literature review: the references include a paper that applies the Systematic Literature Review (SLR) method (Event-Driven Process Chain for Modeling and ...

  14. Five tips for developing useful literature summary tables for writing

    Literature reviews offer a critical synthesis of empirical and theoretical literature to assess the strength of evidence, develop guidelines for practice and policymaking, and identify areas for future research. It is often essential and usually the first task in any research endeavour, particularly in masters or doctoral level education. For effective data extraction and rigorous synthesis ...

  15. Full article: Digitalising the Systematic Literature Review process

    A systematic review can consider only quantitative studies (i.e., meta-analysis), or just qualitative studies (i.e., meta-ethnography; Mays et al., 2005). All in all, the SLR combines the Literature Review's core feature, the use of scientific sources, with the structured, unbiased, and evidence-based Systematic Review (see Figure 2). It is ...

  17. Full article: Systematic literature reviews over the years

    Purpose: Nowadays, systematic literature reviews (SLRs) and meta-analyses are often placed at the top of the study hierarchy of evidence. The main objective of this paper is to evaluate the trends in SLRs of randomized controlled trials (RCTs) throughout the years. ... Figure 5 depicts the distribution of studies according to disease area in ...

  18. LibGuides: Nursing

    Level of evidence (e.g., Figure 2.2) + Quality of evidence = Strength of evidence. Thus, in coming to a conclusion about the quality of the evidence, it is insufficient to simply 'level' the evidence using an LOE scale; it must also be appraised" (p. 36). ... Carrying out systematic literature reviews: An introduction. British Journal of ...

  19. Full article: A systematic literature review of the impact of impaired

    A systematic literature review of the impact of impaired self-awareness on the process of rehabilitation in acquired brain injury. ... and 4 further relevant papers were found; 17 articles were therefore included within this review. Figure 1 shows a PRISMA flow diagram of the screening and selection process.
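
The counts behind a PRISMA flow diagram like the one this entry describes follow simple bookkeeping: records identified, minus duplicates, minus exclusions at the title/abstract and full-text stages. A sketch with invented numbers (chosen only so the arithmetic is easy to follow, not taken from the cited review):

```python
# Hypothetical sketch: the bookkeeping behind a PRISMA flow diagram.
# All counts below are invented for illustration.
identified = {"db_search": 420, "other_sources": 15}
total_identified = sum(identified.values())        # 435 records identified
duplicates_removed = 60
after_dedup = total_identified - duplicates_removed        # 375 screened
excluded_on_screen = 340                                   # title/abstract stage
full_text_assessed = after_dedup - excluded_on_screen      # 35 assessed in full
excluded_full_text = 18
included = full_text_assessed - excluded_full_text         # 17 studies included
print(total_identified, after_dedup, full_text_assessed, included)
# 435 375 35 17
```

Recording each stage's count as it happens (rather than reconstructing them afterwards) is what makes the resulting flow diagram auditable.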

  20. Systematic Literature Reviews and Meta-Analyses

    Systematic literature reviews provide an overview of the state of research on a given topic and enable an assessment of the quality of individual studies. They also allow the results of different studies to be evaluated together when these are inconsistent. ... (Figure 2a), there is a roughly funnel-shaped distribution of the effect estimates ...

  21. Use of Massage Therapy for Pain, 2018-2023 : A Systematic Review

    This systematic review maps the certainty and quality of evidence reported by systematic reviews in 2018 to 2023 of massage therapy for pain in adults. ... Figure 1. Literature Flowchart. LMIC indicates low- and middle-income country; SR, systematic review. Figure 2. Evidence Map.

  22. Rapid reviews methods series: Guidance on literature search

    This paper is part of a series of methodological guidance from the Cochrane Rapid Reviews Methods Group. Rapid reviews (RR) use modified systematic review methods to accelerate the review process while maintaining systematic, transparent and reproducible methods. In this paper, we address considerations for RR searches. We cover the main areas relevant to the search process: preparation and ...

  23. A practical guide to data analysis in general literature reviews

    A general literature review starts with formulating a research question, defining the population, and conducting a systematic search in scientific databases, steps that are well described elsewhere. Once students feel confident that they have thoroughly combed through relevant databases and found the most relevant research on the topic ...

  24. Systematic Literature Review of AI-enabled Spectrum Management in 6G

    Using the Systematic Literature Review (SLR) methodology, we meticulously analyzed 110 primary studies to: (a) Identify AI's utility in spectrum management. (b) Develop a taxonomy of AI approaches. (c) Classify datasets and performance metrics used. (d) Detail security and privacy threats and countermeasures.

  25. Applied Sciences

    This systematic literature review delves into the extensive landscape of emotion recognition, sentiment analysis, and affective computing, analyzing 609 articles. Exploring the intricate relationships among these research domains, and leveraging data from four well-established sources—IEEE, Science Direct, Springer, and MDPI—this systematic review classifies studies in four modalities ...

  26. Sustainability

    This systematic literature review (SLR) examines the integration of circular economy (CE) principles into the agri-food supply chain over the past 20 years. The review aims to consolidate existing knowledge, identify research gaps, and provide actionable insights for future research. A comprehensive search across major databases yielded 1200 articles, which were screened, filtered, and ...

  27. Methodological Investigation: Traditional and Systematic Reviews as

    A Traditional Literature Review (TLR) has been described as a retrospective account of previous research on a certain topic (Li & Wang, 2018). Meanwhile, a Systematic Literature Review (SLR) has been described as a means of evaluating and interpreting all available research relevant to a singular research question, topic area, or phenomenon of ...

  30. Analysing Near-Miss Incidents in Construction: A Systematic Literature

    The construction sector is notorious for its high rate of fatalities globally. Previous research has established that near-miss incidents act as precursors to accidents. This study aims to identify research gaps in the literature on near-miss events in construction and to define potential directions for future research. The Scopus database serves as the knowledge source for this study. To ...