4 GRADE analysis

The GRADE (Grading of Recommendations, Assessment, Development, and Evaluations) approach is a systematic framework for assessing the quality of evidence in healthcare and for informing recommendations. It rates a body of evidence against a fixed set of criteria, so that recommendations rest on a transparent judgment of how reliable and relevant that evidence is. Further information can be found here. The GRADE application described in this section is an interactive tool that applies these criteria to assess and grade evidence. Further refinements are anticipated, particularly in two areas:

- The application of GRADE’s rigorous healthcare-centric rules to social science research may necessitate modifications. Discussions will center on tailoring these rules to accommodate the distinct types of evidence and methodologies prevalent in social science studies.
- Further discussions are anticipated to refine specific decision-making rules within the GRADE framework. These updates will enhance the framework’s applicability to a broader spectrum of research contexts.

This version is an initial draft and will evolve as these discussions progress and additional insights are integrated.

4.1 Application Interface

Summary Statistics: A detailed breakdown of the evidence quality assessment, including:
- Study Info: Details like the number of studies, study design, total sample.
- Estimate: The effect size estimate and its confidence interval.
- GRADE Criteria: This section presents a tabulated overview of the assessed GRADE criteria, including Risk of Bias (ROB), Inconsistency, Imprecision, and Publication Bias, with respective statistics and decisions.
- Decision Table: A summary table that presents a consolidated view of the study’s GRADE assessments across different criteria. This table may include the number of studies, study design, sample size, effect estimates, confidence intervals, and the GRADE criteria assessments.
- Display Sources?: An option to show or hide the sources of the studies in the generated tables.
- Include All Outcomes: This toggle allows users to decide whether to analyze all outcomes together or to focus on a selected outcome.
Choose GRADE Criteria: Users can focus the analysis on a specific GRADE criterion: Risk of Bias, Inconsistency, Indirectness, Imprecision, or other considerations such as Publication Bias and Large Effect. When a specific domain is selected, the relevant data visualization is displayed.
- Plot Analysis: A graphical representation, such as a bar chart, showing the proportion of studies with high risk of bias, aiding in the visual assessment of evidence quality.
Export the Excel Data: Provides users with the functionality to download the analyzed data in Excel format for offline review or further processing.

4.2 Risk of Bias

In the GRADE framework, the term ‘Risk of Bias’ refers to the likelihood that methodological flaws in study design or execution may result in systematic errors, compromising the validity of the research findings. The application employs the 3ie coding system for risk of bias assessment.

4.2.1 Decision Rule

A high prevalence of studies classified with a ‘High Risk of Bias’ negatively impacts the overall quality of evidence. To categorize the severity of bias, the application implements decision rules based on established thresholds:

Proportion of “High Risk of Bias” <= 20% -> “Not serious”
Proportion of “High Risk of Bias” > 20% and < 50% -> “Serious”
Proportion of “High Risk of Bias” >= 50% -> “Very serious”
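
The thresholds above can be expressed as a small function. This is a minimal sketch, not the application's actual code: the function name and the per-study rating label "High" are hypothetical, since the labels produced by the 3ie coding system are not listed here.

```python
def grade_risk_of_bias(ratings):
    """Map per-study risk-of-bias ratings to a severity label.

    `ratings` is a list with one label per study. The label "High" is an
    assumption made for this sketch, not a confirmed 3ie code.
    """
    if not ratings:
        raise ValueError("at least one study rating is required")
    # Share of studies rated as high risk of bias.
    proportion_high = sum(1 for r in ratings if r == "High") / len(ratings)
    if proportion_high <= 0.20:
        return "Not serious"
    if proportion_high < 0.50:
        return "Serious"
    return "Very serious"
```

For example, two high-risk studies out of five give a proportion of 40%, which falls in the “Serious” band.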

4.3 Inconsistency

Inconsistency within the GRADE framework addresses the variability in results across different studies. This variability might stem from actual differences in study designs, populations, interventions, and outcomes, or might indicate underlying heterogeneity that affects the reliability of the evidence. The application measures inconsistency using the I² statistic, which quantifies the proportion of total variation across studies that is attributable to heterogeneity rather than chance.
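
To make the I² statistic concrete, the sketch below computes it from Cochran's Q using inverse-variance weights, which is one common definition. It is an illustration only: the application may compute I² differently (for example, from an estimated between-study variance), and the function name is hypothetical.

```python
def i_squared(effects, variances):
    """Return I² in percent from study effect sizes and sampling variances.

    Uses I² = max(0, (Q - df) / Q) * 100, where Q is Cochran's Q and
    df is the number of studies minus one.
    """
    if len(effects) != len(variances) or len(effects) < 2:
        raise ValueError("need at least two studies with matching variances")
    weights = [1.0 / v for v in variances]  # inverse-variance weights
    pooled = sum(w * y for w, y in zip(weights, effects)) / sum(weights)
    # Cochran's Q: weighted squared deviations from the pooled estimate.
    q = sum(w * (y - pooled) ** 2 for w, y in zip(weights, effects))
    df = len(effects) - 1
    if q <= 0:
        return 0.0
    return max(0.0, (q - df) / q) * 100.0
```

When Q is no larger than its degrees of freedom, the observed variation is no more than chance would produce and I² is reported as 0%.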

4.3.1 Decision Rule

The application’s decision rules for grading the severity of inconsistency take into account both the extent of heterogeneity (as measured by I²) and the rate at which confidence intervals cross predefined thresholds for effect size (rate_down). These rules help determine the impact of inconsistency on the overall quality of evidence:

I² < 50% and No Thresholds Crossed: Viewed as ‘Not serious’. This suggests low heterogeneity and consistent evidence across studies without significant variability.
Either I² ≥ 50% or Thresholds Crossed, but not both: Rated as ‘Serious’. This scenario reflects either notable heterogeneity or evidence variability, suggesting moderate concern regarding inconsistency.
I² ≥ 50% and Thresholds Crossed: Considered ‘Very serious’. This indicates substantial heterogeneity and significant inconsistency in the evidence, warranting potential downgrading of evidence quality.
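
The three cases above reduce to counting how many of the two conditions hold. The following is a minimal sketch under stated assumptions: the function name is hypothetical, and `thresholds_crossed` is taken as an already-computed yes/no flag, because how the application derives that flag from rate_down is not specified here.

```python
def grade_inconsistency(i2_percent, thresholds_crossed):
    """Combine I² (in percent) and a thresholds-crossed flag into a label.

    `thresholds_crossed` is assumed to be a boolean supplied by the caller.
    """
    high_heterogeneity = i2_percent >= 50.0
    if high_heterogeneity and thresholds_crossed:
        return "Very serious"
    if high_heterogeneity or thresholds_crossed:
        return "Serious"
    return "Not serious"
```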

4.4 Indirectness (Not developed)

4.5 Imprecision

Within the GRADE framework, imprecision is evaluated by scrutinizing the confidence intervals around the effect estimates to gauge the evidence’s certainty. This assessment hinges on the width of the confidence intervals and their overlap with predefined thresholds for effect sizes. The approach to imprecision is informed by the GRADE guidance, which emphasizes the significance of contextual factors in determining the certainty of evidence (Schünemann et al., 2022).

4.5.1 Decision Rule

The decision rules for imprecision are based on how many standardized mean difference (SMD) thresholds the confidence interval crosses and on the size of the study populations. The application uses the following SMD thresholds to delineate small, moderate, and large effects:

Thresholds:
- Small Effect: SMD = 0.2
- Moderate Effect: SMD = 0.5
- Large Effect: SMD = 0.8

Decision rules:
No Thresholds Crossed and All Studies > 400 Participants => ‘Not serious’
One Threshold Crossed or Any Study ≤ 400 Participants => ‘Serious’
Two Thresholds Crossed and Any Study ≤ 400 Participants => ‘Very serious’
Three or More Thresholds Crossed => ‘Extremely serious’
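
The rules above can be sketched as a threshold count followed by a check from most to least severe. This is an illustrative sketch, not the application's implementation, and it rests on three assumptions that the rules do not state: a threshold counts as crossed when it lies strictly inside the confidence interval, only the three positive thresholds listed are checked, and two thresholds crossed with every study above 400 participants falls back to ‘Serious’. All names are hypothetical.

```python
SMD_THRESHOLDS = (0.2, 0.5, 0.8)  # small, moderate, large effect
MIN_PARTICIPANTS = 400            # studies at or below this count as small


def count_thresholds_crossed(ci_low, ci_high, thresholds=SMD_THRESHOLDS):
    """Count thresholds lying strictly inside the confidence interval."""
    return sum(1 for t in thresholds if ci_low < t < ci_high)


def grade_imprecision(ci_low, ci_high, sample_sizes):
    """Grade imprecision from a confidence interval and study sample sizes."""
    if ci_low > ci_high:
        raise ValueError("ci_low must not exceed ci_high")
    crossed = count_thresholds_crossed(ci_low, ci_high)
    any_small = any(n <= MIN_PARTICIPANTS for n in sample_sizes)
    if crossed >= 3:
        return "Extremely serious"
    if crossed == 2 and any_small:
        return "Very serious"
    # Assumption: two thresholds crossed with no small study lands here.
    if crossed >= 1 or any_small:
        return "Serious"
    return "Not serious"
```

For instance, a confidence interval of 0.1 to 0.6 contains both 0.2 and 0.5, so two thresholds are crossed; with at least one study of 400 or fewer participants the result is ‘Very serious’.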

4.6 Other Considerations

4.6.1 Publication Bias

GRADE considers the potential for publication bias, which the application addresses through two statistical tests: Egger’s regression test and the rank correlation test. Both analyses help identify asymmetry in the funnel plot, which can indicate publication bias.

Egger’s Test: This test is used to detect funnel plot asymmetry, which can indicate the presence of publication bias. The presence of bias might suggest that studies with significant or favorable results are more likely to be published.
Rank Correlation Test: This test evaluates the correlation between the treatment effects and their variances. A significant result may imply publication bias, suggesting small studies with positive outcomes are more likely to be published.

4.6.2 Decision Rule

Significant Bias Detected (p-value < 0.05 in either test): Implies a strong suspicion of publication bias, which could skew the evidence base by over-representing studies with positive results.
No Significant Bias Detected (p-value ≥ 0.05 in both tests): Publication bias is rated as undetected. Neither test gives evidence of funnel plot asymmetry, which is consistent with a balanced representation of studies irrespective of their findings.
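
The rule above can be sketched as a check on the two p-values. This is a minimal illustration that assumes the p-values from Egger’s test and the rank correlation test have already been computed elsewhere; the function name and returned labels are hypothetical.

```python
def grade_publication_bias(egger_p, rank_p, alpha=0.05):
    """Flag publication bias when either test is significant at `alpha`."""
    for p in (egger_p, rank_p):
        if not 0.0 <= p <= 1.0:
            raise ValueError("p-values must lie between 0 and 1")
    if egger_p < alpha or rank_p < alpha:
        return "Strongly suspected"
    return "Undetected"
```

Because a single significant test is enough to raise the flag, the rule is deliberately sensitive: a p-value of exactly 0.05 in both tests still yields "Undetected".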

4.6.3 Large Effect (Not developed)

4.7 File Export

The ‘Export the Excel Data’ feature lets users download a dataset that includes all measured outcomes when ‘Include all outcomes?’ is set to ‘Yes’. The dataset is formatted for use with the GRADE website and supports further analysis or dissemination.